WO2020062861A1 - Voice playback control method and device for bluetooth speaker - Google Patents

Voice playback control method and device for bluetooth speaker

Info

Publication number
WO2020062861A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
mobile terminal
bluetooth speaker
server
voice data
Prior art date
Application number
PCT/CN2019/084833
Other languages
French (fr)
Chinese (zh)
Inventor
祁学文
吴海全
迟欣
张恩勤
曹磊
师瑞文
Original Assignee
深圳市冠旭电子股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市冠旭电子股份有限公司
Publication of WO2020062861A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/72409 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality by interfacing with external accessories
    • H04M1/72415 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality by interfacing with external accessories for remote control of appliances
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/725 Cordless telephones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/04 Circuits for transducers, loudspeakers or microphones for correcting frequency response
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Definitions

  • the invention belongs to the technical field of Bluetooth speaker control, and particularly relates to a method and a device for controlling voice playback of a Bluetooth speaker.
  • Bluetooth speakers with voice wake-up function which can support both recording and playback, are widely used.
  • the mobile phone establishes a connection with the Bluetooth speaker, the voice data recorded by the Bluetooth speaker is transmitted to the mobile phone, and the mobile phone application (App) interacts with the server.
  • the server performs voice recognition and returns the result to the mobile phone, and the App then transmits the result to the Bluetooth speaker for playback.
  • A2DP: Advanced Audio Distribution Profile (Bluetooth audio transmission protocol) connection
  • embodiments of the present invention provide a method and a device for controlling voice playback of a Bluetooth speaker, so as to solve the problems of connection delay and slow response of the speaker during the voice interaction in the prior art.
  • a first aspect of the embodiments of the present invention provides a method for controlling voice playback of a Bluetooth speaker, including:
  • a second aspect of the embodiments of the present invention provides a method for controlling voice playback of a Bluetooth speaker, including:
  • a third aspect of the embodiments of the present invention provides a method for controlling voice playback of a Bluetooth speaker, including:
  • the Bluetooth speaker sends voice data to the mobile terminal
  • the mobile terminal uploads the voice data to a server
  • the Bluetooth speaker receives a message from the mobile terminal that the upload of the voice data is completed
  • the Bluetooth speaker establishes a first voice path with the mobile terminal, while the server performs voice recognition;
  • the mobile terminal receives the speech recognition result
  • the mobile terminal sends the voice recognition result to a Bluetooth speaker through a first voice path, and the Bluetooth speaker performs voice playback.
  • a fourth aspect of the embodiments of the present invention provides a Bluetooth speaker voice playback control device, including:
  • a first voice data processing module configured to collect voice data and send the voice data to a mobile terminal, where the voice data is uploaded to a server via the mobile terminal for voice recognition;
  • a first connection establishing module configured to receive a voice data upload end message sent by a mobile terminal, and establish a first voice path with the mobile terminal before the server feeds back a speech recognition result to the mobile terminal;
  • the voice playback module is configured to receive a voice recognition result sent by the mobile terminal via the first voice path, and play the voice recognition result; wherein the voice recognition result is a result fed back from the server to the mobile terminal.
  • a fifth aspect of the embodiments of the present invention provides a mobile terminal, including:
  • a second voice data processing module configured to receive voice data sent by the Bluetooth speaker end, and upload the voice data to a server for voice recognition
  • a second connection establishing module configured to send a message that voice data uploading is completed to the Bluetooth speaker, and establish a first voice path with the Bluetooth speaker before receiving the voice recognition result fed back by the server;
  • the voice recognition result processing module is configured to receive a voice recognition result fed back by the server, and send the voice recognition result to a Bluetooth speaker via the first voice path for voice playback.
  • a sixth aspect of the embodiments of the present invention provides a Bluetooth speaker voice playback control system, including a Bluetooth speaker, a mobile terminal, and a server.
  • the Bluetooth speaker is used to collect voice data and send the voice data to the mobile terminal through the second voice path;
  • the mobile terminal is configured to receive the voice data and upload the voice data to the server, and to feed back to the Bluetooth speaker a message that the voice data upload is completed;
  • the Bluetooth speaker and the mobile terminal are respectively used to establish a first voice path before the server feeds back the voice recognition result;
  • the mobile terminal is further configured to receive a voice recognition result fed back by the server, and send the voice recognition result to a Bluetooth speaker via the first voice path.
  • the Bluetooth speaker is also used to receive the voice recognition results sent by the mobile terminal and perform voice playback.
  • a seventh aspect of the embodiments of the present invention provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, implements the steps of the foregoing method.
  • the beneficial effect of the embodiments of the present invention is that a voice playback path can be established between the Bluetooth speaker and the mobile terminal before the server feeds back the voice recognition result. When the voice recognition result fed back from the server to the mobile terminal is received, the voice can be played directly without first connecting the channel, which reduces the delay of the voice interaction of the Bluetooth speaker and improves the response speed of the voice interaction.
  • FIG. 1 is a schematic diagram of a system scenario applicable to a method for controlling voice playback of a Bluetooth speaker according to Embodiment 1 of the present invention
  • FIG. 2 is a schematic flowchart of implementing a method for controlling voice playback of a Bluetooth speaker according to a second embodiment of the present invention
  • FIG. 3 is a schematic flowchart of a method for controlling voice playback of a Bluetooth speaker, as performed on the mobile terminal side, according to a third embodiment of the present invention
  • FIG. 4 is a schematic diagram of an interaction flow of a method for controlling voice playback of a Bluetooth speaker according to a fourth embodiment of the present invention.
  • FIG. 5 is an exemplary diagram of a Bluetooth speaker voice playback control device provided by Embodiment 5 of the present invention.
  • FIG. 1 is a schematic diagram of a system scenario applicable to a method for controlling voice playback of a Bluetooth speaker according to an embodiment of the present invention. For convenience of explanation, only parts related to this embodiment are shown.
  • the system collects voice data from the Bluetooth speaker 11 and transmits it to the mobile terminal 12.
  • the mobile terminal 12 uploads the voice data to the server 13, which performs voice recognition.
  • the server 13 feeds back the voice recognition result to the mobile terminal 12.
  • before the server 13 feeds back the voice recognition result, the Bluetooth speaker 11 and the mobile terminal 12 establish a voice playback channel.
  • when the voice recognition result fed back from the server 13 to the mobile terminal 12 reaches the Bluetooth speaker 11, the speaker does not need to connect the channel first and plays the voice directly.
  • the delay of voice interaction of the Bluetooth speaker is reduced, and the response speed of the voice interaction is improved.
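The latency benefit described for FIG. 1 can be illustrated with a simple timing model. This is a minimal sketch and not part of the patent text: the function names and the example durations (in milliseconds) are hypothetical, and real Bluetooth connection and recognition times vary widely.

```python
def sequential_latency(upload_ms, recognition_ms, connect_ms):
    """Prior-art ordering: the playback path is connected only after the result arrives."""
    return upload_ms + recognition_ms + connect_ms


def overlapped_latency(upload_ms, recognition_ms, connect_ms):
    """Ordering described here: path setup starts when the upload ends and runs in
    parallel with recognition, so only the longer of the two is on the critical path."""
    return upload_ms + max(recognition_ms, connect_ms)


if __name__ == "__main__":
    # Hypothetical durations chosen only for illustration.
    print(sequential_latency(200, 800, 300))  # 1300
    print(overlapped_latency(200, 800, 300))  # 1000
```

Under these assumed numbers the connection time disappears from the critical path entirely; it only matters again if connecting takes longer than recognition.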
  • FIG. 2 is a schematic flowchart of a method for controlling a voice playback of a Bluetooth speaker according to an embodiment of the present invention.
  • the execution subject of this process is the Bluetooth speaker 11 shown in FIG. 1, which is detailed as follows:
  • Step S201 Collect voice data and send the voice data to a mobile terminal, and the voice data is uploaded to the server via the mobile terminal for voice recognition.
  • the Bluetooth speaker has a built-in microphone array and can also be used for long-distance pickup.
  • the Bluetooth speakers include, but are not limited to: ordinary single-tube Bluetooth speakers, outdoor mono Bluetooth speakers, home dual-tube Bluetooth speakers, outdoor sports Bluetooth speakers, and large multi-tube home Bluetooth speakers. They can collect voice data and transmit the voice data to the mobile terminal over the established Bluetooth protocol connection.
  • the voice data is transmitted by the mobile terminal over the network to a server or the cloud for voice recognition. Voice recognition transforms unstructured voice data into a structured index so that audio or recorded data can be mined and retrieved; it includes signal processing and feature extraction of the speech information, decoding with acoustic models and language models, and finally generating the speech recognition result.
  • the step of collecting voice data and transmitting the voice data to a mobile terminal, where the voice data is used for uploading to the server for voice recognition via the mobile terminal includes:
  • A1 Generate a wake-up event and send the wake-up event to the mobile terminal.
  • the wake-up event may be a voice wake-up event
  • the Bluetooth speaker has a built-in microphone array that can collect voice data in real time; the voice data can be matched against the wake-up keyword or serve as the voice data source for voice recognition.
  • the microphone array of the Bluetooth speaker stays in a low-power operating state in which it only collects data and matches the wake-up word, so voice data can be recorded continuously. When the wake-up algorithm matches the recorded voice data to the wake-up keyword, the Bluetooth speaker triggers an interrupt and notifies the mobile terminal of the voice wake-up event through the protocol stack.
  • A2 After the wake-up event has been sent, establish a second voice path with the mobile terminal.
  • the second voice path may be a voice data path, specifically a synchronous connection-oriented (SCO) connection path; after the speaker sends a wake-up event to the mobile terminal, an SCO connection is established with the mobile terminal. Once the SCO connection with the mobile terminal is in place, the microphone array on the Bluetooth speaker side is turned on to receive voice data.
  • the voice data is sent to the mobile terminal via the second voice path.
  • the Bluetooth speaker receives the input voice data, and sends the voice data to the mobile terminal through the second voice path, thereby transmitting the voice data during the voice interaction process.
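The wake-up and pickup sub-steps above can be sketched as a small simulation. This is an illustrative sketch under stated assumptions: `FakeTerminal`, `run_speaker`, and the use of text tokens in place of audio frames are all hypothetical, and the wake-word check is a toy equality test rather than a real keyword-spotting algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class FakeTerminal:
    """Hypothetical stand-in for the mobile terminal; records what it receives."""
    events: list = field(default_factory=list)
    voice: list = field(default_factory=list)
    second_path_open: bool = False

    def notify(self, event):
        self.events.append(event)

    def open_second_path(self):
        self.second_path_open = True

    def receive_voice(self, frame):
        if not self.second_path_open:
            raise RuntimeError("second voice path not established")
        self.voice.append(frame)


def run_speaker(frames, terminal, wake_word="hello"):
    """Match the wake word, send the wake-up event, open the second path, forward voice."""
    awake = False
    for frame in frames:
        if not awake:
            if frame == wake_word:            # toy wake-word match
                terminal.notify("wake")       # A1: send the wake-up event
                terminal.open_second_path()   # A2: set up the path after the event is sent
                awake = True
            continue                          # nothing is forwarded before wake-up
        terminal.receive_voice(frame)         # voice data goes over the second path
```

For example, feeding `["noise", "hello", "what", "time"]` into `run_speaker` leaves the terminal with one wake event and the two frames that followed the wake word.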
  • Step S202 Receive a voice data upload end message sent by the mobile terminal, and establish a first voice path with the mobile terminal before the server feeds back the voice recognition result to the mobile terminal.
  • the first voice path is a voice playback path, which may be a Bluetooth audio transmission protocol connection established between the Bluetooth speaker and the mobile terminal. After the voice data has been uploaded to the server and before the server returns the voice recognition result, the Bluetooth speaker, on receiving the upload-complete message from the mobile terminal, establishes the voice playback channel with the mobile terminal. Because voice recognition takes time and the voice feedback must travel over the network to reach the mobile terminal, the voice playback channel is established while waiting for the voice feedback, once the voice data has been uploaded to the server; establishing the voice playback channel and performing voice recognition proceed simultaneously in different child threads.
  • establishing a first voice path with the mobile terminal includes:
  • the Bluetooth audio transmission protocol connection established between the Bluetooth speaker and the mobile terminal may be an A2DP connection or a synchronous connection-oriented (SCO) connection. The SCO connection is bidirectional, so it can both collect and play voice data; the A2DP connection supports mono or stereo high-quality audio data transmission and has a higher sampling rate.
  • Step S203 Receive a voice recognition result sent by the mobile terminal via the first voice path, and play the voice recognition result; wherein the voice recognition result is a result fed back from the server to the mobile terminal.
  • the first voice path is a voice playback path. Since the voice playback path between the Bluetooth speaker and the mobile terminal has already been established, the voice recognition result is sent directly as air packets once the mobile terminal has received it from the server. The Bluetooth speaker receives the voice recognition result sent by the mobile terminal and plays the voice as soon as the air packet data arrives, which gives a rapid response in the voice interaction process and reduces the delay of the voice interaction of the Bluetooth speaker.
  • when Bluetooth speaker voice interaction is performed, establishment of the voice playback path with the mobile terminal starts once the voice data has been entered and uploaded to the server, so that path establishment runs in parallel with voice recognition and voice feedback in different sub-threads. When the voice feedback arrives, the voice playback channel has already been established, so the voice is played directly, which improves the response speed and reduces the interaction delay.
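The speaker-side behaviour of steps S202 and S203 can be sketched with a background thread and an event flag. This is a minimal sketch, not the patented implementation: the `Speaker` class, its method names, and the sleep-based stand-ins for Bluetooth connection setup and server recognition are all assumptions made for illustration.

```python
import threading
import time


class Speaker:
    """Hypothetical speaker-side model of steps S202 and S203."""

    def __init__(self, connect_seconds=0.05):
        self.connect_seconds = connect_seconds
        self.path_ready = threading.Event()
        self.played = []

    def on_upload_end(self):
        # S202: start establishing the first (playback) path right away,
        # without waiting for the recognition result.
        threading.Thread(target=self._connect, daemon=True).start()

    def _connect(self):
        time.sleep(self.connect_seconds)  # stands in for Bluetooth connection setup
        self.path_ready.set()

    def on_result(self, audio):
        # S203: play as soon as the result arrives; this blocks only if the
        # path is still being set up.
        self.path_ready.wait()
        self.played.append(audio)


def demo(recognition_seconds=0.1):
    speaker = Speaker()
    start = time.monotonic()
    speaker.on_upload_end()            # upload-end message from the terminal
    time.sleep(recognition_seconds)    # stands in for server-side recognition
    speaker.on_result("reply audio")
    return speaker.played, time.monotonic() - start
```

The event flag makes the sketch safe in both orderings: if recognition finishes before the path is ready, playback simply waits for the remaining connection time instead of failing.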
  • FIG. 3 is a schematic flowchart of a method for controlling voice playback of a Bluetooth speaker according to an embodiment of the present invention.
  • the execution subject of this process is the mobile terminal 12 shown in FIG. 1.
  • the mobile terminal may be a mobile phone, a computer, or a tablet with a Bluetooth connection function, which is not specifically limited herein. The process is detailed as follows:
  • Step S301 Receive voice data sent by a Bluetooth speaker, and upload the voice data to a server for voice recognition.
  • the mobile terminal picks up voice through the Bluetooth speaker; after receiving the input voice data, it establishes a connection with an independent server or the cloud and uploads the received voice data, and the independent server or cloud performs voice recognition on the voice data.
  • the step of receiving voice data transmitted by the Bluetooth speaker end and uploading the voice data to a server for voice recognition includes:
  • the wake-up event may be a voice wake-up event. When the wake-up algorithm matches the voice data entered on the Bluetooth speaker side to the wake-up keyword, the Bluetooth speaker triggers an interrupt. After the mobile terminal receives the voice wake-up event from the Bluetooth speaker through the protocol stack, it responds to the wake-up event and starts picking up voice from the Bluetooth speaker side.
  • the second voice path may be a voice data path for transmitting voice data, and may specifically be a synchronous connection-oriented (SCO) connection path. After the mobile terminal receives a voice wake-up event, it immediately establishes the voice data transmission path with the Bluetooth speaker, specifically an SCO connection.
  • the SCO connection is bidirectional; it is mainly used for synchronous voice transmission and uses reserved time slots to transmit data packets, which can carry voice or data.
  • B3. Receive voice data of the Bluetooth speaker through the second voice path; wherein the first voice path is established after the second voice path is established.
  • the second voice channel may be a voice data channel, and specifically may be an SCO connection channel. Because the mobile terminal and the Bluetooth speaker maintain an SCO connection, the microphone array on the Bluetooth speaker side is opened first when the voice data channel is established; the mobile terminal then picks up the voice from the Bluetooth speaker side and obtains the voice data through the voice data channel.
  • Step S302 Send a message that voice data uploading is completed to the Bluetooth speaker, and establish a first voice path with the Bluetooth speaker before receiving the voice recognition result fed back by the server.
  • the first voice path may be a voice playback path; it may be a Bluetooth audio transmission protocol connection established between the Bluetooth speaker and the mobile terminal, specifically an A2DP connection or an SCO connection.
  • after the mobile terminal uploads the voice data to the cloud or an independent server, it sends an upload-complete message to the Bluetooth speaker and, after sending that message and before receiving the voice recognition result fed back by the server, establishes a voice playback channel with the Bluetooth speaker.
  • meanwhile, the cloud or independent server performs voice recognition on the voice data and feeds the voice recognition result back to the mobile terminal; that is, establishment of the voice playback channel runs simultaneously with voice recognition and voice feedback, in different threads.
  • by the time the mobile terminal receives the speech recognition result, the speech playback path has been established.
  • Step S303 Receive a voice recognition result fed back by the server, and send the voice recognition result to a Bluetooth speaker for voice playback via the first voice path.
  • the first voice path may be a voice playback path. Since a voice playback path has already been established with the Bluetooth speaker, the mobile terminal, after receiving the voice recognition result fed back by the server, sends it directly to the Bluetooth speaker in the form of air packets; the speaker plays the voice data in the air packets, which gives a rapid response in the voice interaction process and reduces the delay of the voice interaction of the Bluetooth speaker.
  • the mobile terminal picks up voice through the Bluetooth speaker and uploads the voice data to the server for voice recognition, and it completes the establishment of the voice playback channel with the Bluetooth speaker after the upload is completed and before the voice recognition result is received.
  • the voice recognition result is then transmitted directly to the Bluetooth speaker through the established voice playback channel for voice playback, which reduces the delay of the Bluetooth connection and improves the response rate of voice interaction.
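The mobile-terminal flow of steps S301 to S303 can be sketched with a thread pool that runs path establishment and recognition side by side. This is a hedged illustration only: the function names, the log strings, and the sleep durations are hypothetical, and the network upload and Bluetooth setup are replaced by placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def upload(voice_data, log):
    log.append("upload done")             # S301: voice data reaches the server


def establish_playback_path(log):
    time.sleep(0.02)                      # stands in for connection setup
    log.append("path ready")
    return True


def recognize(voice_data, log):
    time.sleep(0.05)                      # stands in for server-side recognition
    log.append("result received")
    return f"reply to {voice_data!r}"


def handle_utterance(voice_data):
    log = []
    upload(voice_data, log)
    log.append("upload-end message sent")                   # S302: notify the speaker
    with ThreadPoolExecutor(max_workers=2) as pool:
        path = pool.submit(establish_playback_path, log)    # S302: runs in parallel
        result = pool.submit(recognize, voice_data, log)    # with recognition
        path.result()                                       # path must be ready first
        reply = result.result()
    log.append("result sent over first path")               # S303
    return reply, log
```

Calling `handle_utterance("turn on the light")` returns the placeholder reply together with a log whose first two and last entries are fixed; the relative order of the two parallel entries depends on thread timing.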
  • FIG. 4 shows a schematic diagram of an interaction process of a method for controlling voice playback of a Bluetooth speaker according to an embodiment of the present invention.
  • the execution subject participating in the interaction process includes a Bluetooth speaker and a mobile terminal.
  • the implementation principle of the interaction process is as described for FIGS. 2 and 3; since the implementation principle on each execution subject's side is the same, the interaction process is only briefly described here without repeating the details:
  • the Bluetooth speaker sends voice data to the mobile terminal
  • the mobile terminal uploads the voice data to the server
  • the Bluetooth speaker receives a message, sent by the mobile terminal, that the upload of the voice data is completed
  • the Bluetooth speaker establishes the first voice path with the mobile terminal, and the server performs voice recognition at the same time;
  • the mobile terminal receives the voice recognition result
  • the mobile terminal sends the voice recognition result to the Bluetooth speaker via the first voice path, and the Bluetooth speaker performs voice playback.
  • the method for controlling voice playback of a Bluetooth speaker further includes:
  • the Bluetooth speaker sends a wake-up event to the mobile terminal
  • the Bluetooth speaker establishes a second voice path with the mobile terminal
  • the Bluetooth speaker sends voice data to the mobile terminal via a second voice path; wherein the first voice path is established after the second voice path is established.
  • the step in which the Bluetooth speaker establishes a first voice path with the mobile terminal while the server performs voice recognition includes:
  • the Bluetooth speaker establishes a Bluetooth audio transmission protocol connection with the mobile terminal.
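The ordering constraints of the interaction flow can be written down as a small trace checker. This is an illustrative sketch: the event names and the `violations` helper are hypothetical labels for the steps listed above, not identifiers from the patent.

```python
# Each pair (earlier, later) says that `earlier` must occur before `later`.
REQUIRED_ORDER = [
    ("wake_event", "second_path_established"),
    ("second_path_established", "voice_data_sent"),
    ("voice_data_sent", "upload_end_message"),
    ("upload_end_message", "first_path_established"),
    ("first_path_established", "result_received_by_terminal"),
    ("result_received_by_terminal", "result_sent_to_speaker"),
    ("result_sent_to_speaker", "playback"),
]


def violations(trace):
    """Return the ordering constraints that a recorded event trace breaks."""
    position = {event: i for i, event in enumerate(trace)}
    broken = []
    for earlier, later in REQUIRED_ORDER:
        if earlier not in position or later not in position:
            broken.append((earlier, later))   # a required event is missing
        elif position[earlier] >= position[later]:
            broken.append((earlier, later))   # events occurred in the wrong order
    return broken
```

A trace in which the first voice path is established only after the terminal has the recognition result, as in the prior-art flow, breaks exactly the constraint between those two events.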
  • FIG. 5 shows an example diagram of a Bluetooth speaker voice playback control device according to an embodiment of the present invention. For ease of description, only parts related to the embodiment of the present invention are shown.
  • the Bluetooth speaker voice playback control device includes:
  • a first voice data processing module 51 configured to collect voice data and send the voice data to a mobile terminal, where the voice data is used for uploading to the server for voice recognition via the mobile terminal;
  • a first connection establishing module 52 configured to receive a voice data upload end message sent by the mobile terminal, and establish a voice playback channel with the mobile terminal before the server feeds back the voice recognition result to the mobile terminal;
  • the voice playback module 53 is configured to receive a voice recognition result sent by a mobile terminal via the first voice path, and play the voice recognition result; wherein the voice recognition result is a result fed back from the server to the mobile terminal.
  • the Bluetooth speaker voice playback control device further includes:
  • a wake-up module for generating a wake-up event and sending the wake-up event to a mobile terminal
  • a second voice path establishment module is configured to establish a second voice path with the mobile terminal after the sending of the wake-up event is completed.
  • an embodiment of the present invention further provides a mobile terminal, including:
  • a second voice data processing module configured to receive voice data sent by the Bluetooth speaker end, and upload the voice data to a server for voice recognition
  • a second connection establishing module configured to send a message that voice data uploading is completed to the Bluetooth speaker end, and establish a first voice path with the Bluetooth speaker before receiving the voice recognition result fed back by the server;
  • the voice recognition result processing module is configured to receive the voice recognition result fed back by the server, and send the voice recognition result to the Bluetooth speaker end via the first voice path for voice playback.
  • an embodiment of the present invention further provides a Bluetooth speaker voice playback control system, including a Bluetooth speaker, a mobile terminal, and a server;
  • the Bluetooth speaker is used to collect voice data and send the voice data to the mobile terminal through the second voice path;
  • the mobile terminal is configured to receive the voice data and upload the voice data to the server, and to feed back to the Bluetooth speaker a message that the voice data upload is completed;
  • the Bluetooth speaker and the mobile terminal are respectively used to establish a first voice path before the server feeds back the voice recognition result;
  • the mobile terminal is further configured to receive a voice recognition result fed back by the server, and send the voice recognition result to a Bluetooth speaker via the first voice path;
  • the Bluetooth speaker is also used to receive the voice recognition results sent by the mobile terminal and perform voice playback.
  • An embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, implements steps of a Bluetooth speaker voice playback control method.
  • the disclosed apparatus / terminal device and method may be implemented in other ways.
  • the device / terminal device embodiments described above are only schematic.
  • the division of the modules or units is only a logical function division.
  • components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, which may be electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit.
  • when the integrated module / unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on such an understanding, all or part of the processes in the methods of the above embodiments of the present invention may also be completed by a computer program instructing related hardware.
  • the computer program may be stored in a computer-readable storage medium.
  • when the computer program is executed by a processor, the steps of the foregoing method embodiments can be implemented.
  • the computer program includes computer program code, and the computer program code may be in a source code form, an object code form, an executable file, or some intermediate form.
  • the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a mobile hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), electrical carrier signals, telecommunication signals, and software distribution media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Telephone Function (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present application relates to the technical field of Bluetooth speaker control and provides a playback control method and device for a Bluetooth speaker. The method comprises: collecting voice data and transmitting it to a mobile terminal, which uploads the voice data to a server for voice recognition; receiving a message from the mobile terminal indicating that the upload of the voice data has been completed, and establishing a first voice channel with the mobile terminal before the server feeds the voice recognition result back to the mobile terminal; and receiving the voice recognition result transmitted by the mobile terminal over the first voice channel and playing it, wherein the voice recognition result is the result fed back by the server to the mobile terminal. The present invention can establish a voice playback channel with the mobile terminal while voice recognition is in progress, and can play the voice directly, without needing to establish a connection, after the voice recognition result is fed back, thus reducing the delay during voice interaction with a Bluetooth speaker and increasing the response speed.

Description

Method and device for controlling voice playback of a Bluetooth speaker
Technical field
The present invention belongs to the technical field of Bluetooth speaker control, and in particular relates to a method and device for controlling voice playback of a Bluetooth speaker.
Background art
Wireless speakers are becoming increasingly popular, and Bluetooth speakers that have a voice wake-up function and support both recording and playback are widely used. A mobile phone establishes a connection with the Bluetooth speaker, the voice data recorded by the Bluetooth speaker is transmitted to the phone, and the phone's application (App) interacts with a server. The server performs voice recognition and returns the result to the phone, which forwards it through the App to the Bluetooth speaker for playback. Playback requires an A2DP (Advanced Audio Distribution Profile, the Bluetooth audio transmission protocol) connection, and in the prior art that connection is established only once the phone has received the server's feedback result and needs to play it on the Bluetooth speaker over A2DP. Consequently, between obtaining the server's feedback result and the speaker playing the voice there is a delay for establishing the Bluetooth A2DP connection; the speaker responds slowly during voice interaction, which degrades the user experience.
Technical problem
In view of this, embodiments of the present invention provide a method and device for controlling voice playback of a Bluetooth speaker, so as to solve the prior-art problems of connection delay and slow speaker response during voice interaction.
Technical solution
A first aspect of the embodiments of the present invention provides a method for controlling voice playback of a Bluetooth speaker, including:
collecting voice data and sending the voice data to a mobile terminal, the voice data being uploaded to a server via the mobile terminal for voice recognition;
receiving a message, sent by the mobile terminal, that uploading of the voice data has finished, and establishing a first voice path with the mobile terminal before the server feeds back a voice recognition result to the mobile terminal;
receiving the voice recognition result sent by the mobile terminal via the first voice path, and playing the voice recognition result, wherein the voice recognition result is the result fed back by the server to the mobile terminal.
A second aspect of the embodiments of the present invention provides a method for controlling voice playback of a Bluetooth speaker, including:
receiving voice data sent by a Bluetooth speaker, and uploading the voice data to a server for voice recognition;
sending a message that uploading of the voice data has finished to the Bluetooth speaker, and establishing a first voice path with the Bluetooth speaker before receiving the voice recognition result fed back by the server;
receiving the voice recognition result fed back by the server, and sending the voice recognition result via the first voice path to the Bluetooth speaker for voice playback.
A third aspect of the embodiments of the present invention provides a method for controlling voice playback of a Bluetooth speaker, including:
the Bluetooth speaker sends voice data to the mobile terminal;
the mobile terminal uploads the voice data to a server;
the Bluetooth speaker receives a message, sent by the mobile terminal, that uploading of the voice data has finished;
the Bluetooth speaker establishes a first voice path with the mobile terminal while the server performs voice recognition;
the mobile terminal receives the voice recognition result;
the mobile terminal sends the voice recognition result via the first voice path to the Bluetooth speaker, and the Bluetooth speaker plays it.
A fourth aspect of the embodiments of the present invention provides a Bluetooth speaker voice playback control device, including:
a first voice data processing module, configured to collect voice data and send the voice data to a mobile terminal, the voice data being uploaded to a server via the mobile terminal for voice recognition;
a first connection establishing module, configured to receive a message, sent by the mobile terminal, that uploading of the voice data has finished, and to establish a first voice path with the mobile terminal before the server feeds back a voice recognition result to the mobile terminal;
a voice playback module, configured to receive the voice recognition result sent by the mobile terminal via the first voice path and play the voice recognition result, wherein the voice recognition result is the result fed back by the server to the mobile terminal.
A fifth aspect of the embodiments of the present invention provides a mobile terminal, including:
a second voice data processing module, configured to receive voice data sent by the Bluetooth speaker and upload the voice data to a server for voice recognition;
a second connection establishing module, configured to send a message that uploading of the voice data has finished to the Bluetooth speaker, and to establish a first voice path with the Bluetooth speaker before receiving the voice recognition result fed back by the server;
a voice recognition result processing module, configured to receive the voice recognition result fed back by the server and send the voice recognition result via the first voice path to the Bluetooth speaker for voice playback.
A sixth aspect of the embodiments of the present invention provides a Bluetooth speaker voice playback control system, including a Bluetooth speaker, a mobile terminal, and a server, wherein:
the Bluetooth speaker is configured to collect voice data and send the voice data to the mobile terminal through a second voice path;
the mobile terminal is configured to receive the voice data and upload the voice data to the server, and to feed back to the Bluetooth speaker a message that uploading of the voice data has finished;
the server is configured to receive and recognize the voice data and feed back a voice recognition result corresponding to the voice data;
the Bluetooth speaker and the mobile terminal are each further configured to establish a first voice path before the server feeds back the voice recognition result;
the mobile terminal is further configured to receive the voice recognition result fed back by the server and send the voice recognition result to the Bluetooth speaker via the first voice path;
the Bluetooth speaker is further configured to receive the voice recognition result sent by the mobile terminal and play it.
A seventh aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the foregoing method.
Beneficial effects
Compared with the prior art, the embodiments of the present invention have the following beneficial effect: the Bluetooth speaker and the mobile terminal establish the voice playback path before the server feeds back the voice recognition result, so that when the voice recognition result fed back by the server to the mobile terminal is received, no path connection needs to be set up and the voice is played directly. This reduces the delay of voice interaction with the Bluetooth speaker and improves the response speed of the voice interaction.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a system scenario to which the Bluetooth speaker voice playback control method provided by Embodiment 1 of the present invention applies;
FIG. 2 is a schematic flowchart of the Bluetooth speaker voice playback control method provided by Embodiment 2 of the present invention;
FIG. 3 is a schematic flowchart of the method, provided by Embodiment 3 of the present invention, by which a mobile terminal controls voice playback of a Bluetooth speaker;
FIG. 4 is a schematic diagram of the interaction flow of the Bluetooth speaker voice playback control method provided by Embodiment 4 of the present invention;
FIG. 5 is an example diagram of the Bluetooth speaker voice playback control device provided by Embodiment 5 of the present invention.
Embodiments of the invention
In the following description, specific details such as particular system structures and techniques are set forth for the purpose of illustration rather than limitation, so that the embodiments of the present invention can be thoroughly understood. However, it should be clear to a person skilled in the art that the present invention can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the present invention.
It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or collections thereof.
It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in this specification and the appended claims refers to any combination, and all possible combinations, of one or more of the associated listed items, and includes these combinations.
In order to explain the technical solutions of the present invention, specific embodiments are described below.
Embodiment 1
FIG. 1 is a schematic diagram of a system scenario to which the Bluetooth speaker voice playback control method provided by the embodiments of the present invention applies. For ease of explanation, only the parts related to this embodiment are shown.
Referring to FIG. 1, in this system the Bluetooth speaker 11 collects voice data and transmits it to the mobile terminal 12, the mobile terminal 12 uploads the voice data to the server 13, and the server 13 performs voice recognition. Before the server 13 feeds back the voice recognition result to the mobile terminal 12, the Bluetooth speaker 11 and the mobile terminal 12 establish the voice playback path. When the Bluetooth speaker 11 receives the voice recognition result fed back by the server 13 to the mobile terminal 12, no path connection needs to be set up and the voice is played directly, which reduces the delay of voice interaction with the Bluetooth speaker and improves the response speed of the voice interaction.
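The latency benefit described above can be illustrated with a small timing model. This is only a sketch under stated assumptions: the function names and the timing values are hypothetical and do not come from the application, and the model ignores upload and playback time, which are the same in both orderings.

```python
def sequential_latency(recognition_s: float, connect_s: float) -> float:
    """Prior-art ordering: the playback path is set up only after the result arrives."""
    return recognition_s + connect_s


def overlapped_latency(recognition_s: float, connect_s: float) -> float:
    """Ordering described above: path setup runs while the server is recognizing."""
    return max(recognition_s, connect_s)


# Hypothetical timings in seconds, chosen only for illustration.
recognition_s = 0.8
connect_s = 0.5

saved_s = sequential_latency(recognition_s, connect_s) - overlapped_latency(
    recognition_s, connect_s
)
```

Under this model the time saved equals the shorter of the two durations, so the full connection delay is hidden whenever recognition takes at least as long as path setup.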
The Bluetooth speaker voice playback control method in the system scenario shown in FIG. 1 is described in detail below.
Embodiment 2
FIG. 2 is a schematic flowchart of the Bluetooth speaker voice playback control method provided by an embodiment of the present invention. In this embodiment, the flow is executed by the Bluetooth speaker 11 shown in FIG. 1, as detailed below:
Step S201: collect voice data and send the voice data to the mobile terminal, the voice data being uploaded to the server via the mobile terminal for voice recognition.
In this embodiment of the present invention, the Bluetooth speaker has a built-in microphone array and can also pick up sound from a distance. The Bluetooth speaker includes, but is not limited to, an ordinary single-unit Bluetooth speaker, an outdoor single-unit Bluetooth speaker, a home dual-unit Bluetooth speaker, an outdoor sports Bluetooth speaker, or a large multi-unit home Bluetooth speaker; any of these can collect voice data and transmit it to the mobile terminal over the established Bluetooth protocol.
The mobile terminal transmits the voice data over the network to a server or the cloud for voice recognition. Voice recognition converts unstructured voice data into a structured index, enabling information mining and retrieval over audio or recorded data. It includes signal processing and feature extraction on the voice information, decoding with an acoustic model and a language model, and finally generating the voice recognition result.
Further, the step of collecting voice data and transmitting the voice data to the mobile terminal, the voice data being uploaded to the server via the mobile terminal for voice recognition, includes:
A1. Generate a wake-up event and send the wake-up event to the mobile terminal.
In this embodiment, the wake-up event may be a voice wake-up event. The Bluetooth speaker has a built-in microphone array that can collect voice data in real time, and the voice data can serve both for matching against the wake keyword and as the source of voice data for voice recognition. The microphone array of the Bluetooth speaker stays in a low-power operating state in which it only captures data and matches it against the wake word, so it can record voice data continuously. Once the recorded voice data is matched to the wake keyword by the wake-up algorithm, the Bluetooth speaker triggers an interrupt and notifies the mobile terminal of the voice wake-up event through the protocol stack.
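Step A1 can be sketched as a loop that discards input until the wake keyword is matched, then notifies the mobile terminal once. This is a minimal illustration, not the application's wake-up algorithm: text strings stand in for audio frames, and `WAKE_KEYWORD`, `run_wake_loop`, and the `"WAKE_EVENT"` message are hypothetical names.

```python
from typing import Callable, Iterable, List

WAKE_KEYWORD = "hello speaker"  # hypothetical wake keyword


def run_wake_loop(
    frames: Iterable[str], notify_terminal: Callable[[str], None]
) -> List[str]:
    """Match frames against the wake keyword; after a match, notify the
    terminal and return the remaining frames as voice data to recognize.

    Strings stand in for audio frames (an assumption of this sketch).
    """
    iterator = iter(frames)
    for frame in iterator:
        if WAKE_KEYWORD in frame.lower():
            notify_terminal("WAKE_EVENT")  # corresponds to the interrupt + notification
            return list(iterator)          # frames captured after the wake word
    return []                              # never woken: nothing is sent
```

For example, calling `run_wake_loop(["noise", "Hello Speaker", "what is the weather"], events.append)` with an empty list `events` records a single wake event and returns only the frame that follows the wake keyword.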
A2. After sending of the wake-up event is complete, establish a second voice path with the mobile terminal.
In this embodiment, the second voice path may be a voice data path, and may also be a synchronous connection-oriented (SCO) connection path. After the speaker finishes sending the wake-up event to the mobile terminal, it establishes an SCO connection with the mobile terminal. Because the SCO connection with the mobile terminal is maintained, the microphone array on the Bluetooth speaker side is opened first to receive voice data.
A3. The voice data is sent to the mobile terminal via the second voice path.
In this embodiment, after the second voice path connection is established, the Bluetooth speaker receives the recorded voice data and sends it to the mobile terminal over the second voice path, thereby transmitting the voice data for the voice interaction.
Step S202: receive the message, sent by the mobile terminal, that uploading of the voice data has finished, and establish a first voice path with the mobile terminal before the server feeds back the voice recognition result to the mobile terminal.
In this embodiment of the present invention, the first voice path is a voice playback path, which may be a Bluetooth audio transmission protocol connection established between the Bluetooth speaker and the mobile terminal. Before the server returns the voice recognition result, or once uploading of the voice data to the server has finished and the upload-complete message transmitted by the mobile terminal has been received, the Bluetooth speaker establishes the voice playback path with the mobile terminal. Because voice recognition takes time and the voice feedback reaches the mobile terminal over the network, the voice playback path is established after the upload finishes, while voice recognition is in progress and before the voice feedback arrives; establishing the voice playback path and performing voice recognition proceed simultaneously in different sub-threads.
Further, receiving the message, transmitted by the mobile terminal, that uploading of the voice data has finished, and establishing the first voice path with the mobile terminal before the server feeds back the voice recognition result to the mobile terminal, includes:
when the server starts voice recognition, establishing a Bluetooth audio transmission protocol connection with the mobile terminal.
In this embodiment, the Bluetooth audio transmission protocol connection established between the Bluetooth speaker and the mobile terminal may be an A2DP connection or an SCO connection. The SCO connection is bidirectional and can both capture and play voice data; the A2DP connection supports transmission of high-quality mono or stereo audio data and has a higher sampling rate.
Step S203: receive the voice recognition result sent by the mobile terminal via the first voice path, and play the voice recognition result, wherein the voice recognition result is the result fed back by the server to the mobile terminal.
In this embodiment of the present invention, the first voice path is a voice playback path. Because the voice playback path between the Bluetooth speaker and the mobile terminal has already been established, once the voice recognition result fed back by the server is received, the Bluetooth speaker directly receives the voice recognition result sent by the mobile terminal in the form of over-the-air packets and plays the voice upon receiving the packet data. This gives a fast response during voice interaction and reduces the delay of voice interaction with the Bluetooth speaker.
With this embodiment of the present invention, during voice interaction with the Bluetooth speaker, establishment of the voice playback path with the mobile terminal starts once the voice data has been recorded and uploaded to the server, so that establishing the voice playback path proceeds concurrently, in a different sub-thread, with voice recognition and voice feedback. When the voice feedback arrives, the voice playback path is already established and the voice is played directly, which improves the response speed and reduces the interaction delay.
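The speaker-side flow of steps S201 to S203 can be viewed as a small state machine. The sketch below is an illustration only: the state and event names are hypothetical, and the return to the listening state after playback is an assumption that the text does not state.

```python
from enum import Enum, auto


class SpeakerState(Enum):
    LISTENING = auto()       # low-power wake-word matching
    RECORDING = auto()       # second voice path open, voice data being sent (S201)
    PLAYBACK_READY = auto()  # first voice path established, result pending (S202)
    PLAYING = auto()         # recognition result being played (S203)


# Allowed transitions; event names are hypothetical labels for this sketch.
TRANSITIONS = {
    (SpeakerState.LISTENING, "wake_word_matched"): SpeakerState.RECORDING,
    (SpeakerState.RECORDING, "upload_complete_received"): SpeakerState.PLAYBACK_READY,
    (SpeakerState.PLAYBACK_READY, "result_received"): SpeakerState.PLAYING,
    # Assumption: the speaker resumes listening once playback ends.
    (SpeakerState.PLAYING, "playback_finished"): SpeakerState.LISTENING,
}


def step(state: SpeakerState, event: str) -> SpeakerState:
    """Return the next state, or raise ValueError if the event is out of order."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state.name}") from None
```

In this model a recognition result can only be played from `PLAYBACK_READY`, which encodes the requirement that the first voice path exists before the result arrives.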
Embodiment 3
FIG. 3 is a schematic flowchart of the Bluetooth speaker voice playback control method provided by an embodiment of the present invention. In this embodiment, the flow is executed by the mobile terminal 12 shown in FIG. 1. The mobile terminal may be a mobile phone, computer, tablet, or similar device with a Bluetooth connection function, and is not specifically limited here. Details are as follows:
Step S301: receive the voice data sent by the Bluetooth speaker, and upload the voice data to the server for voice recognition.
In this embodiment of the present invention, the mobile terminal picks up voice through the Bluetooth speaker. After receiving the input voice data, it establishes a connection with a standalone server or the cloud and uploads the received voice data, and the standalone server or the cloud performs voice recognition on the voice data.
Further, the step of receiving the voice data transmitted by the Bluetooth speaker and uploading the voice data to the server for voice recognition includes:
B1. Receive the wake-up event sent by the Bluetooth speaker.
In this embodiment, the wake-up event may be a voice wake-up event. When the voice data recorded on the Bluetooth speaker side is matched to the wake keyword by the wake-up algorithm, the Bluetooth speaker triggers an interrupt and the mobile terminal receives the speaker's voice wake-up event through the protocol stack. After receiving the voice wake-up event, the mobile terminal responds to it and begins picking up voice from the Bluetooth speaker.
B2. According to the wake-up event, establish a second voice path with the Bluetooth speaker.
In this embodiment, the second voice path may be a voice data path used to transmit voice data; the voice data path may also be a synchronous connection-oriented (SCO) connection path. After receiving the voice wake-up event, the mobile terminal immediately establishes the voice data transmission path with the Bluetooth speaker, specifically an SCO connection. The SCO connection is bidirectional, is mainly used for transmitting synchronous voice, transmits data packets in reserved time slots, and can carry both voice and data.
B3. Receive the voice data of the Bluetooth speaker through the second voice path, wherein the first voice path is established after establishment of the second voice path is complete.
In this embodiment, the second voice path may be a voice data path, and specifically may be an SCO connection path. Because the mobile terminal and the Bluetooth speaker maintain the SCO connection, the microphone array on the speaker side is opened first when the voice data path is established; the mobile terminal picks up voice from the Bluetooth speaker through the voice data path and obtains the voice data through it.
Step S302: send the message that uploading of the voice data has finished to the Bluetooth speaker, and establish a first voice path with the Bluetooth speaker before receiving the voice recognition result fed back by the server.
In this embodiment of the present invention, the first voice path may be a voice playback path. It may be a Bluetooth audio transmission protocol connection established between the Bluetooth speaker and the mobile terminal, specifically an A2DP connection or an SCO connection. After uploading the voice data to the cloud or the standalone server, the mobile terminal sends the upload-complete message to the Bluetooth speaker and establishes the voice playback path with the Bluetooth speaker before receiving the voice recognition result fed back by the server, or after sending the upload-complete message.
It should be noted that, while the voice playback path with the Bluetooth speaker is being established, the cloud or the standalone server performs voice recognition on the voice data and feeds the voice recognition result back to the mobile terminal. In other words, establishing the voice playback path runs in a different thread at the same time as voice recognition and voice feedback, and by the time the mobile terminal receives the voice recognition result the voice playback path is already established.
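The overlap of path setup and recognition on the mobile terminal can be sketched with two threads. This is an illustrative sketch only: `recognize`, `establish_playback_path`, and `send_over_path` are hypothetical callables standing in for the server request, the Bluetooth connection setup, and the transmission to the speaker; no real Bluetooth or network API is used.

```python
import threading
from typing import Callable


def handle_upload_complete(
    recognize: Callable[[], str],
    establish_playback_path: Callable[[], None],
    send_over_path: Callable[[str], None],
) -> str:
    """Start playback-path setup in a worker thread while waiting for the
    recognition result, then forward the result over the ready path."""
    path_ready = threading.Event()

    def setup_path() -> None:
        establish_playback_path()
        path_ready.set()

    worker = threading.Thread(target=setup_path)
    worker.start()        # path setup begins right after the upload completes
    result = recognize()  # blocks until the server feeds back the result
    path_ready.wait()     # normally already set when the result arrives
    worker.join()
    send_over_path(result)
    return result
```

The `path_ready.wait()` call keeps the sketch correct even in the unusual case where recognition finishes before the path is established; in that case the result is simply held until the path is ready.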
Step S303: receive the voice recognition result fed back by the server, and send the voice recognition result via the first voice path to the Bluetooth speaker for voice playback.
In this embodiment of the present invention, the first voice path may be a voice playback path. Because the voice playback path with the Bluetooth speaker has already been established, once the voice recognition result fed back by the server is received, it is sent directly to the Bluetooth speaker in the form of over-the-air packets and the packet data is played as voice. This gives a fast response during voice interaction and reduces the delay of voice interaction with the Bluetooth speaker.
With this embodiment of the present invention, the mobile terminal picks up voice through the Bluetooth speaker and uploads the voice data to the server for voice recognition. After the upload finishes and before the voice recognition result is returned, it completes establishment of the voice playback path with the Bluetooth speaker. After receiving the voice recognition result fed back by the server, it transmits the result directly to the Bluetooth speaker over the already established voice playback path for playback, which reduces the Bluetooth connection delay and improves the response rate of the voice interaction.
Embodiment 4
FIG. 4 is a schematic diagram of the interaction flow of the Bluetooth speaker voice playback control method provided by an embodiment of the present invention. The entities participating in the interaction flow include the Bluetooth speaker and the mobile terminal. The implementation principle of the interaction flow is consistent with that of each executing entity described with reference to FIG. 2 and FIG. 3, so the flow is only described briefly here:
1. The Bluetooth speaker sends voice data to the mobile terminal;
2. The mobile terminal uploads the voice data to the server;
3. The Bluetooth speaker receives the message, sent by the mobile terminal, that uploading of the voice data has finished;
4. The Bluetooth speaker establishes the first voice path with the mobile terminal while the server performs voice recognition;
5. The mobile terminal receives the voice recognition result;
6. The mobile terminal sends the voice recognition result via the first voice path to the Bluetooth speaker, and the Bluetooth speaker plays it.
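The six steps above can be written down as an ordered message list with a check for the one ordering constraint that matters for latency. This is a sketch for illustration: the tuple layout and message names are hypothetical labels, not identifiers from the application.

```python
from typing import List, Tuple

Message = Tuple[str, str, str]  # (sender, receiver, message name)

# Steps 1 to 6 of the interaction flow, with hypothetical message names.
FLOW: List[Message] = [
    ("speaker", "terminal", "voice_data"),
    ("terminal", "server", "voice_data_upload"),
    ("terminal", "speaker", "upload_complete"),
    ("speaker", "terminal", "establish_first_voice_path"),
    ("server", "terminal", "recognition_result"),
    ("terminal", "speaker", "recognition_result_playback"),
]


def path_ready_before_result(flow: List[Message]) -> bool:
    """True if the first voice path is set up before the server's result arrives."""
    names = [name for _, _, name in flow]
    return names.index("establish_first_voice_path") < names.index("recognition_result")
```

Swapping the fourth and fifth entries models the prior-art ordering, for which the check returns `False`.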
Further, the method for controlling voice playback of a Bluetooth speaker also includes:
The Bluetooth speaker sends a wake-up event to the mobile terminal;
In response to the wake-up event, the Bluetooth speaker and the mobile terminal establish a second voice path;
The Bluetooth speaker sends the voice data to the mobile terminal via the second voice path, wherein the first voice path is established after establishment of the second voice path is complete.
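The ordering constraint in these steps (wake-up event, then second voice path, then voice data, then first voice path) can be expressed as a small guard object. The class and method names below are hypothetical and only illustrate the ordering; they do not correspond to any real Bluetooth API.

```python
class VoicePaths:
    """Tracks the order in which the two voice paths may be opened (hypothetical)."""

    def __init__(self) -> None:
        self.wake_sent = False
        self.second_path_open = False
        self.first_path_open = False

    def send_wake_event(self) -> None:
        # The speaker notifies the mobile terminal that it has been woken up.
        self.wake_sent = True

    def open_second_path(self) -> None:
        # The second (pickup) path is established in response to the wake-up event.
        if not self.wake_sent:
            raise RuntimeError("second voice path requires a wake-up event first")
        self.second_path_open = True

    def send_voice_data(self, data: bytes) -> int:
        # Voice data travels over the second path; returns the bytes sent.
        if not self.second_path_open:
            raise RuntimeError("voice data requires the second voice path")
        return len(data)

    def open_first_path(self) -> None:
        # The first (playback) path is established only after the second one.
        if not self.second_path_open:
            raise RuntimeError("first voice path requires the second voice path")
        self.first_path_open = True


if __name__ == "__main__":
    paths = VoicePaths()
    paths.send_wake_event()
    paths.open_second_path()
    paths.send_voice_data(b"\x01\x02")
    paths.open_first_path()
    print(paths.first_path_open)
```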
Further, the step in which the Bluetooth speaker and the mobile terminal establish the first voice path while the server performs voice recognition includes:
While the server performs voice recognition, the Bluetooth speaker establishes a Bluetooth audio transmission protocol connection with the mobile terminal.
It should be noted that other ordering schemes that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention also fall within the protection scope of the present invention and are not enumerated here.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process is determined by its function and internal logic, and the numbering does not limit the implementation of the embodiments of the present invention in any way.
Embodiment 5
FIG. 5 shows an example diagram of a Bluetooth speaker voice playback control device provided by an embodiment of the present invention. For ease of description, only the parts related to this embodiment are shown.
The Bluetooth speaker voice playback control device includes:
A first voice data processing module 51, configured to collect voice data and send the voice data to a mobile terminal, the voice data being uploaded by the mobile terminal to a server for voice recognition;
A first connection establishing module 52, configured to receive, from the mobile terminal, a message indicating that the voice data upload has finished, and to establish a voice playback path (the first voice path) with the mobile terminal before the server feeds back the voice recognition result to the mobile terminal;
A voice playback module 53, configured to receive the voice recognition result sent by the mobile terminal via the first voice path and to play it, wherein the voice recognition result is the result fed back by the server to the mobile terminal.
Further, the Bluetooth speaker voice playback control device also includes:
A wake-up module, configured to generate a wake-up event and send it to the mobile terminal;
A second voice path establishing module, configured to establish a second voice path with the mobile terminal after the wake-up event has been sent.
Further, an embodiment of the present invention also provides a mobile terminal, including:
A second voice data processing module, configured to receive voice data sent by the Bluetooth speaker and upload the voice data to a server for voice recognition;
A second connection establishing module, configured to send the Bluetooth speaker a message indicating that the voice data upload has finished, and to establish a first voice path with the Bluetooth speaker before the voice recognition result fed back by the server is received;
A voice recognition result processing module, configured to receive the voice recognition result fed back by the server and send it to the Bluetooth speaker via the first voice path for playback.
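The three terminal-side modules can be pictured as handlers on one event-driven object. The sketch below is hypothetical throughout: the class, the callback parameters, and the `"UPLOAD_FINISHED"` message string are invented for illustration. It also adds one assumption that the source does not describe, namely buffering the result if it happens to arrive before the first voice path is ready.

```python
from typing import Callable, Optional


class MobileTerminal:
    """Hypothetical event-driven sketch of the three terminal-side modules."""

    def __init__(
        self,
        upload: Callable[[bytes], None],
        notify_speaker: Callable[[str], None],
        send_over_first_path: Callable[[str], None],
    ) -> None:
        self._upload = upload
        self._notify_speaker = notify_speaker
        self._send = send_over_first_path
        self.first_path_ready = False
        self._pending_result: Optional[str] = None

    def on_voice_data(self, chunk: bytes, last: bool) -> None:
        # Second voice data processing module: forward speaker audio to the server.
        self._upload(chunk)
        if last:
            # Second connection establishing module: tell the speaker the upload
            # has finished so that path setup can start during recognition.
            self._notify_speaker("UPLOAD_FINISHED")

    def on_first_path_established(self) -> None:
        # Second connection establishing module: the playback path is now up.
        self.first_path_ready = True
        if self._pending_result is not None:
            self._send(self._pending_result)
            self._pending_result = None

    def on_recognition_result(self, result: str) -> None:
        # Voice recognition result processing module.
        if self.first_path_ready:
            self._send(result)
        else:
            # Assumption, not in the source: hold the result until the path is ready.
            self._pending_result = result


if __name__ == "__main__":
    uploaded, notices, played = [], [], []
    terminal = MobileTerminal(uploaded.append, notices.append, played.append)
    terminal.on_voice_data(b"hello", last=True)
    terminal.on_first_path_established()
    terminal.on_recognition_result("weather: sunny")
    print(notices, played)
```

In the expected case the path is ready before the result arrives, so the result is forwarded immediately; the buffering branch only covers the opposite ordering.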
Further, an embodiment of the present invention also provides a Bluetooth speaker voice playback control system, including a Bluetooth speaker, a mobile terminal, and a server, wherein:
The Bluetooth speaker is configured to collect voice data and send the voice data to the mobile terminal through a second voice path;
The mobile terminal is configured to receive the voice data and upload it to the server, and to send the Bluetooth speaker a message indicating that the voice data upload has finished;
The server is configured to receive and recognize the voice data and to feed back the voice recognition result corresponding to the voice data;
The Bluetooth speaker and the mobile terminal are each further configured to establish a first voice path before the server feeds back the voice recognition result;
The mobile terminal is further configured to receive the voice recognition result fed back by the server and send it to the Bluetooth speaker via the first voice path;
The Bluetooth speaker is further configured to receive the voice recognition result sent by the mobile terminal and play it.
An embodiment of the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the Bluetooth speaker voice playback control method.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the above division into functional units and modules is only an example. In practice, the above functions may be assigned to different functional units or modules as required; that is, the internal structure of the device may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit. In addition, the specific names of the functional units and modules serve only to distinguish them from one another and do not limit the protection scope of the present invention. For the specific working processes of the units and modules in the above system, refer to the corresponding processes in the foregoing method embodiments; they are not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not detailed or recorded in one embodiment, refer to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed device/terminal equipment and method may be implemented in other ways. For example, the device/terminal equipment embodiments described above are merely illustrative. The division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated module/unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, all or part of the processes in the methods of the above embodiments may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the foregoing method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content covered by the computer-readable medium may be expanded or reduced as required by legislation and patent practice in a given jurisdiction. For example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and they shall all fall within the protection scope of the present invention.

Claims (13)

  1. A method for controlling voice playback of a Bluetooth speaker, comprising:
    collecting voice data and sending the voice data to a mobile terminal, the voice data being uploaded by the mobile terminal to a server for voice recognition;
    receiving, from the mobile terminal, a message indicating that the voice data upload has finished, and establishing a first voice path with the mobile terminal before the server feeds back a voice recognition result to the mobile terminal;
    receiving the voice recognition result sent by the mobile terminal via the first voice path, and playing the voice recognition result, wherein the voice recognition result is the result fed back by the server to the mobile terminal.
  2. The method for controlling voice playback of a Bluetooth speaker according to claim 1, wherein, before collecting the voice data and sending the voice data to the mobile terminal for upload to the server for voice recognition, the method comprises:
    generating a wake-up event and sending the wake-up event to the mobile terminal;
    establishing a second voice path with the mobile terminal after the wake-up event has been sent;
    wherein the voice data is sent to the mobile terminal via the second voice path.
  3. The method for controlling voice playback of a Bluetooth speaker according to claim 1, wherein establishing the first voice path with the mobile terminal before the server feeds back the voice recognition result to the mobile terminal comprises:
    establishing a Bluetooth audio transmission protocol connection with the mobile terminal while the server performs voice recognition.
  4. A method for controlling voice playback of a Bluetooth speaker, comprising:
    receiving voice data sent by a Bluetooth speaker, and uploading the voice data to a server for voice recognition;
    sending the Bluetooth speaker a message indicating that the voice data upload has finished, and establishing a first voice path with the Bluetooth speaker before receiving a voice recognition result fed back by the server;
    receiving the voice recognition result fed back by the server, and sending the voice recognition result to the Bluetooth speaker via the first voice path for playback.
  5. The method for controlling voice playback of a Bluetooth speaker according to claim 4, wherein, before receiving the voice data sent by the Bluetooth speaker and uploading the voice data to the server for voice recognition, the method comprises:
    receiving a wake-up event sent by the Bluetooth speaker;
    establishing a second voice path with the Bluetooth speaker in response to the wake-up event;
    receiving the voice data of the Bluetooth speaker through the second voice path, wherein the first voice path is established after establishment of the second voice path is complete.
  6. A method for controlling voice playback of a Bluetooth speaker, comprising:
    a Bluetooth speaker sending voice data to a mobile terminal;
    the mobile terminal uploading the voice data to a server;
    the Bluetooth speaker receiving, from the mobile terminal, a message indicating that the voice data upload has finished;
    the Bluetooth speaker and the mobile terminal establishing a first voice path while the server performs voice recognition;
    the mobile terminal receiving a voice recognition result;
    the mobile terminal sending the voice recognition result to the Bluetooth speaker via the first voice path, and the Bluetooth speaker playing the voice recognition result.
  7. The method for controlling voice playback of a Bluetooth speaker according to claim 6, further comprising:
    the Bluetooth speaker sending a wake-up event to the mobile terminal;
    the Bluetooth speaker and the mobile terminal establishing a second voice path in response to the wake-up event;
    the Bluetooth speaker sending the voice data to the mobile terminal via the second voice path, wherein the first voice path is established after establishment of the second voice path is complete.
  8. The method for controlling voice playback of a Bluetooth speaker according to claim 6, wherein the Bluetooth speaker and the mobile terminal establishing the first voice path while the server performs voice recognition comprises:
    the Bluetooth speaker establishing a Bluetooth audio transmission protocol connection with the mobile terminal while the server performs voice recognition.
  9. A Bluetooth speaker voice playback control device, comprising:
    a first voice data processing module, configured to collect voice data and send the voice data to a mobile terminal, the voice data being uploaded by the mobile terminal to a server for voice recognition;
    a first connection establishing module, configured to receive, from the mobile terminal, a message indicating that the voice data upload has finished, and to establish a first voice path with the mobile terminal before the server feeds back a voice recognition result to the mobile terminal;
    a voice playback module, configured to receive the voice recognition result sent by the mobile terminal via the first voice path and to play the voice recognition result, wherein the voice recognition result is the result fed back by the server to the mobile terminal.
  10. The Bluetooth speaker voice playback control device according to claim 9, further comprising:
    a wake-up module, configured to generate a wake-up event and send the wake-up event to the mobile terminal;
    a second voice path establishing module, configured to establish a second voice path with the mobile terminal after the wake-up event has been sent.
  11. A mobile terminal, comprising:
    a second voice data processing module, configured to receive voice data sent by a Bluetooth speaker and upload the voice data to a server for voice recognition;
    a second connection establishing module, configured to send the Bluetooth speaker a message indicating that the voice data upload has finished, and to establish a first voice path with the Bluetooth speaker before a voice recognition result fed back by the server is received;
    a voice recognition result processing module, configured to receive the voice recognition result fed back by the server and send the voice recognition result to the Bluetooth speaker via the first voice path for playback.
  12. A Bluetooth speaker voice playback control system, comprising a Bluetooth speaker, a mobile terminal, and a server, wherein:
    the Bluetooth speaker is configured to collect voice data and send the voice data to the mobile terminal through a second voice path;
    the mobile terminal is configured to receive the voice data and upload the voice data to the server, and to send the Bluetooth speaker a message indicating that the voice data upload has finished;
    the server is configured to receive and recognize the voice data and to feed back a voice recognition result corresponding to the voice data;
    the Bluetooth speaker and the mobile terminal are each further configured to establish a first voice path before the server feeds back the voice recognition result;
    the mobile terminal is further configured to receive the voice recognition result fed back by the server and send the voice recognition result to the Bluetooth speaker via the first voice path;
    the Bluetooth speaker is further configured to receive the voice recognition result sent by the mobile terminal and play it.
  13. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
PCT/CN2019/084833 2018-09-28 2019-04-28 Voice playback control method and device for bluetooth speaker WO2020062861A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811141089.8 2018-09-28
CN201811141089.8A CN110971744B (en) 2018-09-28 2018-09-28 Method and device for controlling voice playing of Bluetooth sound box

Publications (1)

Publication Number Publication Date
WO2020062861A1 true WO2020062861A1 (en) 2020-04-02

Family

ID=69952820

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/084833 WO2020062861A1 (en) 2018-09-28 2019-04-28 Voice playback control method and device for bluetooth speaker

Country Status (2)

Country Link
CN (1) CN110971744B (en)
WO (1) WO2020062861A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114253148A (en) * 2021-12-09 2022-03-29 英华达(上海)科技有限公司 Intelligent device control method, gateway device and intelligent device control system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105163236A (en) * 2015-09-11 2015-12-16 青岛歌尔声学科技有限公司 Intelligent sound system with gateway control function
US20160098996A1 (en) * 2014-10-02 2016-04-07 International Business Machines Corporation Management of voice commands for devices in a cloud computing environment
CN206865727U (en) * 2017-06-29 2018-01-09 北京纽曼腾飞科技有限公司 A kind of group Baffle Box of Bluetooth extension system based on mobile terminal
CN108074566A (en) * 2017-12-07 2018-05-25 珠海横琴万智联科技有限公司 Financial management Intelligent voice broadcasting system and broadcasting method
CN108159687A (en) * 2017-12-19 2018-06-15 芋头科技(杭州)有限公司 A kind of automated induction systems and intelligent sound box equipment based on more people's interactive processes
CN207638865U (en) * 2017-12-14 2018-07-20 桂林广岳科技有限公司 A kind of bluetooth playing device
CN108551629A (en) * 2018-06-22 2018-09-18 四川斐讯信息技术有限公司 A kind of control method and system of Split intelligent speaker

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100598622B1 (en) * 2003-12-12 2006-07-07 주식회사 현대오토넷 Digital car audio of a has voice recognition function
JP2013068532A (en) * 2011-09-22 2013-04-18 Clarion Co Ltd Information terminal, server device, search system, and search method
EP2787790B1 (en) * 2012-11-16 2017-07-26 Huawei Device Co., Ltd. Method, mobile terminal and system for establishing bluetooth connection
CN105050034B (en) * 2015-08-25 2017-04-05 百度在线网络技术(北京)有限公司 Voice service implementation method and apparatus and system based on bluetooth connection
CN105161111B (en) * 2015-08-25 2017-09-26 百度在线网络技术(北京)有限公司 Audio recognition method and device based on bluetooth connection
CN106372246A (en) * 2016-09-20 2017-02-01 深圳市同行者科技有限公司 Audio playing method and device
CN107277754B (en) * 2017-07-12 2020-02-28 深圳市冠旭电子股份有限公司 Bluetooth connection method and Bluetooth peripheral equipment
CN107277272A (en) * 2017-07-25 2017-10-20 深圳市芯中芯科技有限公司 A kind of bluetooth equipment voice interactive method and system based on software APP
CN108172242B (en) * 2018-01-08 2021-06-01 深圳市芯中芯科技有限公司 Improved Bluetooth intelligent cloud sound box voice interaction endpoint detection method
CN108563468B (en) * 2018-03-30 2021-09-21 深圳市冠旭电子股份有限公司 Bluetooth sound box data processing method and device and Bluetooth sound box
CN108566634B (en) * 2018-03-30 2021-06-25 深圳市冠旭电子股份有限公司 Method and device for reducing continuous awakening delay of Bluetooth sound box and Bluetooth sound box

Also Published As

Publication number Publication date
CN110971744A (en) 2020-04-07
CN110971744B (en) 2022-09-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19865471

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19865471

Country of ref document: EP

Kind code of ref document: A1