WO2020114181A1 - Network voice recognition method, network service interaction method, and smart headset - Google Patents

Network voice recognition method, network service interaction method, and smart headset

Info

Publication number
WO2020114181A1
WO2020114181A1 (application PCT/CN2019/115873; related: CN2019115873W)
Authority
WO
WIPO (PCT)
Prior art keywords
headset
charging box
voice command
cloud server
audio data
Prior art date
Application number
PCT/CN2019/115873
Other languages
English (en)
French (fr)
Inventor
龚树强
仇存收
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2020114181A1 publication Critical patent/WO2020114181A1/zh

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H04L67/025 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP], for remote control or remote monitoring of applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/10 Earpieces; Attachments therefor; Earphones; Monophonic headphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W88/00 Devices specially adapted for wireless communication networks, e.g. terminals, base stations or access point devices
    • H04W88/02 Terminal devices
    • H04W88/06 Terminal devices adapted for operation in multiple networks or having at least two operational modes, e.g. multi-mode terminals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Definitions

  • The present application relates to the technical field of smart terminals, and in particular to a network voice recognition method, a network service interaction method, and a smart headset.
  • The existing wireless smart headset 100 includes two components: a wireless headset 11 and a charging box 12.
  • The wireless headset 11 includes an audio collection/playback/codec unit 111 and a wireless audio unit 112 (the source text writes "122", which conflicts with the charging box's unit numbering and is read here as 112).
  • The audio collection/playback/codec unit 111 is used to collect audio data and to play received audio data.
  • The wireless audio unit 112 is used to implement wireless communication with a smart terminal and to establish an audio data transmission channel that carries both the audio data collected by the wireless headset and the audio data to be played.
  • The charging box 12 includes a charging/button control unit 121 and an energy storage/power supply unit 122.
  • The energy storage/power supply unit 122 implements battery energy storage and supplies power to the headset.
  • Network speech recognition and network service interaction are popular with consumers, and intelligent terminals of all kinds are seeking solutions that support them.
  • Smart speakers and mobile phones already support network voice recognition and service interaction.
  • However, smart speakers play audio externally and therefore offer poor privacy.
  • A smartphone's size is constrained by its screen, so it cannot meet the needs of scenarios that require portability, such as sports.
  • Smart headsets offer good privacy and are easy to carry; however, no current smart headset can achieve network voice recognition or network service interaction.
  • The present application therefore provides a network voice recognition method, a network service interaction method, and a smart headset, so that a smart headset can implement network voice recognition or network service interaction.
  • An aspect of the present application provides a network voice recognition method, applied to a smart headset that includes a headset and a charging box.
  • The method includes: the charging box receives a voice command from the headset; the charging box sends the voice command to the cloud server; the charging box receives a voice command recognition result from the cloud server; the charging box sends the voice command recognition result to the headset, or the charging box executes the voice command and sends a voice command execution result to the headset; and the headset plays the voice command recognition result or the voice command execution result.
  • In this way, the charging box relays the voice command from the headset to the cloud server and receives the recognition result from the cloud server, which enables the smart headset to realize network voice recognition; the charging box can also execute the recognized voice command.
  • the method further includes: the headset collecting the voice command; and the headset sending the collected voice command to the charging box.
  • the method further includes: the headset establishing a communication connection with the charging box; and the charging box establishing a communication connection with the cloud server.
  • the method further includes the earphone executing the voice command.
  • the recognized voice command is executed by the headset.
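The relay pattern described in this aspect can be sketched in a few lines of Python. This is an illustrative model only: the class and method names, and the stand-in "recognition" (echoing the command text), are assumptions and not part of the application.

```python
# Hypothetical sketch: the headset cannot reach the cloud itself, so the
# charging box relays the voice command and returns either the recognition
# result or an execution result for the headset to play.

class CloudServer:
    def recognize(self, voice_command: bytes) -> str:
        # Stand-in for server-side speech recognition.
        return voice_command.decode()

class ChargingBox:
    def __init__(self, server: CloudServer):
        self.server = server

    def handle(self, voice_command: bytes, execute_locally: bool) -> str:
        recognition = self.server.recognize(voice_command)  # relay to cloud
        if execute_locally:
            return f"executed: {recognition}"  # box executes, returns result
        return recognition                     # box forwards recognition

class Headset:
    def __init__(self, box: ChargingBox):
        self.box = box

    def speak(self, command: str, execute_on_box: bool = False) -> str:
        result = self.box.handle(command.encode(), execute_on_box)
        return f"playing: {result}"            # headset plays the result

headset = Headset(ChargingBox(CloudServer()))
print(headset.speak("play the song"))        # → playing: play the song
print(headset.speak("play the song", True))  # → playing: executed: play the song
```

The two calls correspond to the two branches of the claim: the box forwarding the recognition result, or the box executing the command and forwarding the execution result.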
  • Another aspect of the present application provides a network service interaction method, applied to a smart headset that includes a headset and a charging box.
  • The method includes: the charging box receives a voice command from the headset, where the voice command instructs obtaining audio data; the charging box sends the voice command to a cloud server; the charging box receives the audio data from the cloud server; the charging box sends the audio data to the headset; and the headset decodes and plays the audio data.
  • In this way, the charging box receives the voice command for acquiring audio data from the headset and forwards it to the cloud server; the cloud server recognizes and executes the command and sends the audio data to the charging box; the charging box sends the audio data to the headset, which decodes and plays it, so that the smart headset can realize network service interaction.
  • Optionally, the method further includes: the charging box decodes the audio data received from the cloud server; and sending the audio data to the headset includes: the charging box sends the decoded audio data to the headset.
  • That is, when the audio format used between the charging box and the headset differs from the format received from the cloud server, the charging box also decodes the audio data received from the cloud server and sends the decoded data to the headset for further decoding.
  • Optionally, the method further includes: the charging box receives a voice command recognition result from the cloud server; and the charging box sends the voice command recognition result to the headset.
  • When voice recognition and the network service are not integrated in one cloud server, the charging box also receives the voice command recognition result sent by the cloud server and forwards it to the headset.
  • the method further includes: the headset collecting the voice command; and the headset sending the collected voice command to the charging box.
  • the method further includes: the headset establishing a communication connection with the charging box; and the charging box establishing a communication connection with the cloud server.
  • Yet another aspect of the present application provides a network service interaction method, applied to a smart headset that includes a headset and a charging box.
  • The method includes: the charging box receives a voice command from the headset; the charging box sends the voice command to the cloud server; the charging box receives a voice command execution result from the cloud server; the charging box sends the voice command execution result to the headset; and the headset plays the voice command execution result.
  • the charging box receives the voice command sent by the headset and sends it to the cloud server.
  • the cloud server recognizes and executes the voice command, and returns the voice command execution result to the charging box, so that the smart headset realizes network service interaction.
  • Yet another aspect of the present application provides a smart headset including a headset and a charging box. The charging box is used to receive a voice command from the headset; the charging box is also used to send the voice command to the cloud server; the charging box is also used to receive a voice command recognition result from the cloud server; the charging box is also used to send the voice command recognition result to the headset, or the charging box is also used to execute the voice command and send a voice command execution result to the headset; and the headset is used to play the voice command recognition result or the voice command execution result.
  • the headset is also used to collect the voice command; and the headset is also used to send the collected voice command to the charging box.
  • the headset is also used to establish a communication connection with the charging box; and the charging box is also used to establish a communication connection with the cloud server.
  • the headset is also used to execute the voice command.
  • Yet another aspect of the present application provides a smart headset including a headset and a charging box. The charging box is used to receive a voice command from the headset, where the voice command instructs obtaining audio data;
  • the charging box is also used to send the voice command to the cloud server;
  • the charging box is also used to receive audio data from the cloud server;
  • the charging box is also used to send the audio data to the headset; and
  • the headset is used to decode and play the audio data.
  • the charging box is also used to decode audio data received from the cloud server; and the charging box is also used to send the decoded audio data to the headset.
  • the charging box is also used to receive the voice command recognition result from the cloud server; and the charging box is also used to send the voice command recognition result to the headset.
  • the headset is also used to collect the voice command; and the headset is also used to send the collected voice command to the charging box.
  • the headset is also used to establish a communication connection with the charging box; and the charging box is also used to establish a communication connection with the cloud server.
  • Yet another aspect of the present application provides a smart headset including a headset and a charging box. The charging box is used to receive a voice command from the headset; the charging box is also used to send the voice command to the cloud server; the charging box is also used to receive a voice command execution result from the cloud server; the charging box is also used to send the voice command execution result to the headset; and the headset is used to play the voice command execution result.
  • Yet another aspect of the present application provides a computer-readable storage medium having instructions stored therein which, when executed on a computer, cause the computer to perform the methods described in the above aspects.
  • Yet another aspect of the present application provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the methods described in the above aspects.
  • FIG. 1 is a schematic structural diagram of an existing smart earphone;
  • FIG. 2 is a schematic diagram of a general structure of a smart headset provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a network speech recognition method provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a network service interaction provided by an embodiment of this application.
  • FIG. 5 is a schematic flowchart of another network service interaction provided by an embodiment of this application.
  • Network voice recognition refers to a function in which the terminal collects the user's voice command and transmits it over the network to a cloud server; after the cloud server recognizes the voice command, the result is returned to the terminal and executed by the terminal.
  • A network interactive service refers to the terminal sending a service request to a cloud server, and the cloud server responding to the request and sending the execution result to the terminal.
  • the network voice recognition service and the network interaction service may be integrated in one server, or may be executed by different servers.
  • FIG. 2 is a general structural diagram of a smart earphone provided by an embodiment of the present application.
  • The smart earphone includes an earphone 21 and a charging box 22.
  • The earphone 21 is wirelessly connected to the charging box 22.
  • the structure of the earphone 21 is the same as the structure shown in FIG. 1, and includes an audio acquisition/playback/codec unit 211 and a wireless audio unit 212.
  • the audio collection/playback/codec unit 211 specifically includes functions such as voice wake-up, local audio playback, audio playback, audio collection, and audio codec. It should be noted that, for different network functions, the functions included in the audio collection/playback/codec unit 211 may be different.
  • For example, when the smart headset is used to implement network voice recognition, the functions of the audio collection/playback/codec unit 211 may include voice wake-up, audio collection, and audio playback; when the smart headset is used to implement network interactive services, such as acquiring audio data, the functions include voice wake-up, audio playback, audio collection, and audio codec.
  • the audio collection/playback/codec unit 211 may include all the above functions.
  • the wireless audio unit 212 may perform Bluetooth (BT) communication with the wireless audio unit 223 of the charging case 22 and the like.
  • the charging box 22 includes an Internet service unit 221, an audio codec unit 222, a wireless audio unit 223, a wireless network unit 224, a charging/button control unit 225, an energy storage/power supply 226, and the like. Since the charging case 22 has a function of connecting to a cloud server, it can also be called a networked charging case.
  • the Internet service unit 221 includes a streaming media service software development kit (software development kit, SDK) and a voice engine cloud SDK.
  • the streaming media service SDK is streaming media software for connecting to cloud servers
  • The voice engine cloud SDK is voice software for connecting to cloud servers. Depending on the functions the smart earphone implements, the Internet service unit 221 may include both SDKs or only one of them.
  • The audio codec unit 222 includes functions such as local playback, streaming media playback, and audio codec. Depending on the functions the smart earphone implements, the audio codec unit 222 is optional.
  • the wireless network unit 224 is used to realize a communication connection with a cloud server, and may use 4G/3G/2G, Wi-Fi, TCP/IP and other communication connection methods to establish a communication connection with the server.
  • the functions of the charging/button control unit 225 and the energy storage/power supply unit 226 are the same as the functions of the charging/button control unit 121 and the energy storage/power supply unit 122 of the embodiment shown in FIG. 1, respectively.
  • FIG. 3 is a schematic flowchart of a network voice recognition method according to an embodiment of the present application, which is applied to the smart headset shown in FIG. 2.
  • the method includes the following steps:
  • S101: The headset sends a voice command to the charging box.
  • the user wakes up the headset by voice or manually to issue a voice command.
  • The headset collects the user's voice command through a microphone. Optionally, before S101, the method further includes: the headset collects the voice command; and the headset sends the collected voice command to the charging box.
  • the headset can also obtain local voice commands.
  • the headset itself cannot recognize voice commands. Therefore, after the voice command is collected by the earphone, it is converted into voice command data and sent to the wireless audio unit 223 of the charging case through the wireless audio unit 212.
  • the voice command may be a control command or other commands.
  • the method further includes:
  • the headset establishes a communication connection with the charging box
  • the charging box establishes a communication connection with the cloud server.
  • the headset establishes a communication connection with the wireless audio unit 223 of the charging case through the wireless audio unit 212, for example, it may be a Bluetooth connection.
  • The charging box establishes a communication connection with the cloud server through the wireless network unit 224, for example, over 4G/3G/2G or Wi-Fi using TCP/IP.
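The network hop between the charging box and the cloud server can be illustrated with a minimal loopback TCP exchange (the short-range Bluetooth hop to the headset cannot be shown with the standard library). The port, payloads, and the echo-style "recognition" reply here are assumptions for demonstration only.

```python
import socket
import threading

def cloud_server(listener: socket.socket) -> None:
    # Accept one connection and reply with a stand-in recognition result.
    conn, _ = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"recognized: " + data)

listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # any free local port
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=cloud_server, args=(listener,), daemon=True).start()

# Charging box side: establish the TCP connection and exchange one message.
with socket.create_connection(("127.0.0.1", port)) as box_link:
    box_link.sendall(b"voice command")
    reply = box_link.recv(1024)

print(reply.decode())  # → recognized: voice command
```

In a real device the same request/response pattern would run over the cellular or Wi-Fi modem of the wireless network unit 224 rather than a loopback socket.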
  • S102: After receiving the voice command, the charging box sends the voice command to the cloud server.
  • After the wireless audio unit 223 of the charging box receives the voice command sent by the wireless audio unit 212 of the headset, the charging box, which cannot recognize the voice command itself, sends it to the cloud server through the wireless network unit 224.
  • Optionally, the charging box may also have local voice command recognition capability. After receiving the voice command from the headset, the charging box first attempts local recognition; if local recognition fails, it forwards the voice command to the cloud server for network recognition.
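The local-first strategy above can be sketched as a simple dispatch: try the box's own (small) command vocabulary, and fall back to the cloud on a miss. The vocabulary, the return-value format, and the cloud stub are illustrative assumptions.

```python
# Sketch of optional local recognition on the charging box with cloud
# fallback. A real implementation would run an embedded recognizer here.

LOCAL_COMMANDS = {"pause", "resume", "next track"}

def cloud_recognize(command: str) -> str:
    return f"cloud:{command}"          # stand-in for network recognition

def box_recognize(command: str) -> str:
    if command in LOCAL_COMMANDS:      # local recognition succeeded
        return f"local:{command}"
    return cloud_recognize(command)    # fall back to the cloud server

print(box_recognize("pause"))                  # → local:pause
print(box_recognize("play Promise of Love"))   # → cloud:play Promise of Love
```

Keeping frequent control commands local reduces latency and network traffic, while unconstrained commands still get full network recognition.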
  • S103: After receiving the voice command, the cloud server performs voice command recognition.
  • The cloud server has a voice recognition function; after receiving the voice command sent by the charging box, it recognizes the command to obtain a voice command recognition result.
  • the speech recognition process can refer to the existing speech recognition technology, which will not be repeated here.
  • S104: The cloud server sends the voice command recognition result to the charging box.
  • the cloud server sends the voice command recognition result to the charging box through the wireless network.
  • When the voice command is a control command, it may be executed by the charging box. In that case, after S104, the method proceeds to S105: after receiving the voice command recognition result, the charging box executes the voice command.
  • For example, if the voice command is "play the song 'Promise of Love'", the charging box retrieves the song "Promise of Love" from memory or from the server.
  • the charging box sends a voice command execution result to the headset.
  • the voice command execution result may be the audio data of the song.
  • After receiving the voice command execution result, the headset plays it.
  • the headset plays the song after receiving the audio data of the song.
  • the method further includes: the headset executes the voice command.
  • Alternatively, the headset may execute the voice command. In that case, after S104, the method proceeds to S106: after receiving the voice command recognition result, the charging box sends the voice command recognition result to the headset.
  • For example, if the voice command is "play the song 'Promise of Love'", the charging box sends the recognition result "play the song 'Promise of Love'" to the headset.
  • After receiving the voice command recognition result, the headset plays the voice command recognition result.
  • For example, after receiving the recognition result "play the song 'Promise of Love'", the headset obtains the song's audio data from the charging box through the wireless audio unit 212, or reads the audio data from local storage, and plays the song.
  • An existing headset alone cannot cooperate with a cloud server to realize the network voice recognition function.
  • By integrating the above units into the charging box, the network voice recognition function is realized together with the cloud server; combined with the original headset, a smart headset with integrated network voice recognition is obtained.
  • In this embodiment, the charging box sends the voice command received from the headset to the cloud server and then receives the voice command recognition result sent by the cloud server, so that the smart headset realizes network voice recognition; the charging box can also execute the recognized voice command.
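The S101-S106 flow above can be condensed into an event trace. The step labels follow the references in the text (S105 for box-side execution, S106 for forwarding the recognition result); the trace representation itself is an illustrative assumption.

```python
# Event-trace sketch of the FIG. 3 network voice recognition flow.

def network_voice_recognition(command: str, box_executes: bool) -> list:
    trace = [
        ("S101", "headset -> charging box", command),
        ("S102", "charging box -> cloud server", command),
        ("S103", "cloud server recognizes", command),
        ("S104", "cloud server -> charging box", "recognition result"),
    ]
    if box_executes:
        # S105 branch: the box executes and returns the execution result.
        trace += [("S105", "charging box executes", command),
                  ("", "charging box -> headset", "execution result"),
                  ("", "headset plays", "execution result")]
    else:
        # S106 branch: the box forwards the recognition result.
        trace += [("S106", "charging box -> headset", "recognition result"),
                  ("", "headset plays", "recognition result")]
    return trace

for step in network_voice_recognition("play 'Promise of Love'", box_executes=False):
    print(step)
```

Tracing both branches from one function makes the claim structure explicit: the flows share S101-S104 and diverge only in who executes the recognized command.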
  • FIG. 4 is a schematic flowchart of a network service interaction method according to an embodiment of the present application, which is applied to the smart headset shown in FIG. 2. The method includes the following steps:
  • S201: The headset sends a voice command to the charging box, where the voice command instructs obtaining audio data.
  • the user wakes up the headset by voice or manually to issue a voice command.
  • The headset collects the user's voice command through a microphone. Optionally, before S201, the method further includes: the headset collects the voice command; and the headset sends the collected voice command to the charging box.
  • the headset can also obtain local voice commands.
  • the headset itself cannot recognize voice commands. Therefore, after the voice command is collected by the earphone, it is converted into voice command data and sent to the wireless audio unit 223 of the charging case through the wireless audio unit 212.
  • The voice command indicates the audio data to obtain, for example, the audio data of the song "Promise of Love".
  • the method further includes:
  • the headset establishes a communication connection with the charging box
  • the charging box establishes a communication connection with the cloud server.
  • the earphone establishes a communication connection with the wireless audio unit 223 of the charging case through the wireless audio unit 212, for example, it may be a Bluetooth connection.
  • The charging box establishes a communication connection with the cloud server through the wireless network unit 224, for example, over 4G/3G/2G or Wi-Fi using TCP/IP.
  • S202: After receiving the voice command, the charging box sends the voice command to the cloud server.
  • After the wireless audio unit 223 of the charging box receives the voice command sent by the wireless audio unit 212 of the headset, the charging box, which cannot recognize the voice command itself, sends it to the cloud server through the wireless network unit 224.
  • Optionally, the charging box can store local media resource files. When a command to play a local media resource is recognized (locally or over the network), the local resource file is read, decoded, and sent to the headset for playback.
  • S203: The cloud server recognizes the voice command and obtains the audio data indicated by the voice command.
  • the cloud server has voice recognition and network audio service functions.
  • the voice recognition function and the network audio service function may be integrated in one server, or may be implemented by different servers.
  • the cloud server recognizes the voice command after receiving the voice command sent by the charging box, and obtains the audio data indicated by the voice command.
  • When the functions are implemented by different servers, the cloud server recognizes the voice command, sends a request for the audio data to another server that has the network audio service function, and receives the audio data returned by that server.
  • Optionally, the method further includes: the cloud server sends a voice command recognition result to the charging box; and after receiving the voice command recognition result from the cloud server, the charging box sends the voice command recognition result to the headset.
  • S204: The cloud server sends the audio data to the charging box.
  • S205: After receiving the audio data sent by the cloud server, the charging box sends the audio data to the headset.
  • the wireless network unit 224 of the charging case receives the audio data sent by the cloud server, and sends the audio data to the wireless audio unit 212 of the headset through the wireless audio unit 223.
  • S206: The headset decodes and plays the audio data.
  • the charging box can directly send the audio data sent by the cloud server to the headset, and the headset decodes and plays the audio data.
  • the method further includes: the charging box decodes the audio data received from the cloud server.
  • S205 is specifically: the charging box sends the decoded audio data to the earphone.
  • the charging box needs to decode the audio data received from the cloud server and send the decoded audio data to the headset.
  • After receiving the audio data decoded by the charging box, the headset further decodes it according to its own playback format and plays the result.
  • In this embodiment, the charging box receives a voice command for acquiring audio data from the headset and sends it to the cloud server; the cloud server recognizes and executes the voice command and sends the audio data to the charging box; and the charging box sends the audio data to the headset for decoding and playback, so that the smart headset realizes network service interaction.
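The S201-S206 audio-retrieval flow, including the optional box-side decode, can be sketched as follows. The "encoded:"/"decoded:" payload tags and the function names are illustrative assumptions standing in for real codecs and network calls.

```python
# Sketch of the FIG. 4 flow: the cloud both recognizes the command and
# returns the requested audio; the box may pre-decode before forwarding.

def cloud(voice_command: str) -> bytes:
    # Recognize the command and return the requested audio (stubbed).
    assert voice_command.startswith("get ")
    return b"encoded:" + voice_command[4:].encode()

def charging_box(voice_command: str, box_decodes: bool) -> bytes:
    audio = cloud(voice_command)                  # S202-S204
    if box_decodes:                               # optional pre-decode step
        audio = audio.replace(b"encoded:", b"decoded:")
    return audio                                  # S205: box -> headset

def headset(voice_command: str, box_decodes: bool = False) -> str:
    audio = charging_box(voice_command, box_decodes)  # S201 + receive
    return "playing " + audio.decode()                # S206: decode and play

print(headset("get Promise of Love"))
print(headset("get Promise of Love", box_decodes=True))
```

The `box_decodes` flag corresponds to the optional step in which the box transcodes cloud audio into the format used on the box-to-headset link.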
  • FIG. 5 is a schematic flowchart of another network service interaction method according to an embodiment of the present application, which is applied to the smart headset shown in FIG. 2. The method includes the following steps:
  • the headset sends a voice command to the charging box.
  • The voice command may be a network service command to be executed by the cloud server.
  • For example, the user's daily exercise step counts are uploaded to the cloud server.
  • Suppose the user is walking briskly and wants the cloud server to prompt when today's step count exceeds the average of the previous daily step counts.
  • The user issues the voice command "Do I have more steps today than my daily average?"
  • After the charging box receives the voice command from the headset, the charging box sends the voice command to the cloud server.
  • After the wireless audio unit 223 of the charging box receives the voice command sent by the wireless audio unit 212 of the headset, the charging box, which cannot recognize the voice command itself, sends it to the cloud server through the wireless network unit 224.
  • After receiving the voice command, the cloud server recognizes the voice command and executes it.
  • The cloud server has a voice recognition function and recognizes the user's voice command "Is my step count today more than my daily average?" To execute the command, the cloud server compares the user's current step count with the user's daily average step count. When the current step count exceeds the daily average, the voice command execution result is "You have exceeded the daily average step count!"
  • the cloud server sends a voice command execution result to the charging box.
  • After the charging box receives the voice command execution result from the cloud server, the charging box sends the voice command execution result to the headset.
  • the wireless network unit 224 of the charging box receives the voice command execution result sent by the cloud server, and sends the voice command execution result to the wireless audio unit 212 of the headset through the wireless audio unit 223.
  • After receiving the voice command execution result, the headset plays it.
  • The wireless audio unit 212 of the headset receives the voice command execution result, and the headset plays it.
  • Optionally, the command execution result may also be indicated by other prompting methods; for example, after the charging box receives the execution result, it may give a vibration prompt.
  • In this embodiment, the charging box receives a voice command sent by the headset and forwards it to the cloud server; the cloud server recognizes and executes the voice command and returns the execution result to the charging box, so that the smart headset realizes network service interaction.
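The step-count example can be made concrete with a short calculation. The step-history numbers and function names are hypothetical; the point is that the cloud both recognizes and executes the command, returning only the spoken result.

```python
# Sketch of the FIG. 5 example: the cloud compares today's step count with
# the historical daily average and returns a prompt for the headset to play.

def cloud_execute(today_steps: int, history: list) -> str:
    average = sum(history) / len(history)
    if today_steps > average:
        return "You have exceeded the daily average step count!"
    return "Not yet above your daily average."

def headset_play(result: str) -> str:
    return f"playing: {result}"        # relayed to the headset via the box

history = [6000, 8000, 7000]           # hypothetical daily step counts
print(headset_play(cloud_execute(7500, history)))  # average is 7000
```

Because the average of 6000, 8000, and 7000 is 7000, a current count of 7500 produces the "exceeded" prompt; a count of, say, 5000 would produce the other branch.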
  • An embodiment of the present application also provides a smart earphone.
  • The structure of the smart earphone is shown in FIG. 2. Specifically:
  • the charging box is used to receive voice commands from the headset
  • the charging box is also used to send the voice command to the cloud server;
  • the charging box is also used to receive the voice command recognition result from the cloud server;
  • the charging box is also used to send the voice command recognition result to the headset, or the charging box is also used to execute the voice command and send the voice command execution result to the headset;
  • the headset is used to play the voice command recognition result or the voice command execution result.
  • the headset is also used to collect the voice command; and the headset is also used to send the collected voice command to the charging box.
  • the headset is also used to establish a communication connection with the charging box; and the charging box is also used to establish a communication connection with the cloud server.
  • the headset is also used to execute the voice command.
  • the charging box sends the voice command received from the headset to the cloud server, and the charging box then receives the voice command recognition result sent by the cloud server, which enables the smart headset to implement network voice recognition and also enables the charging box to execute the recognized voice command.
  • An embodiment of the present application further provides another smart headset.
  • the structure of the smart headset is shown in FIG. 2. Specifically:
  • the charging box is used to receive a voice command from the headset, and the voice command is used to instruct to obtain audio data;
  • the charging box is also used to send the voice command to the cloud server;
  • the charging box is also used to receive audio data from the cloud server;
  • the charging box is also used to send the audio data to the headset;
  • the headset is used to decode and play the audio data.
  • the charging box is also used to decode audio data received from the cloud server; and the charging box is also used to send the decoded audio data to the headset.
  • the charging box is also used to receive a voice command recognition result from the cloud server; and the charging box is also used to send the voice command recognition result to the headset.
  • the headset is also used to collect the voice command; and the headset is also used to send the collected voice command to the charging box.
  • the headset is also used to establish a communication connection with the charging box; and the charging box is also used to establish a communication connection with the cloud server.
  • a charging box receives a voice command for acquiring audio data sent by a headset and sends it to a cloud server; the cloud server recognizes and executes the voice command and sends audio data to the charging box; and the charging box sends the audio data to the headset for decoding and playback, so that the smart headset can implement network service interaction.
  • An embodiment of the present application further provides yet another smart headset.
  • the structure of the smart headset is shown in FIG. 2. Specifically:
  • the charging box is used to receive voice commands from the headset;
  • the charging box is also used to send the voice command to the cloud server;
  • the charging box is also used to receive the voice command execution result from the cloud server;
  • the charging box is also used to send the voice command execution result to the headset;
  • the headset is used to play the execution result of the voice command.
  • the charging box receives the voice command sent by the headset and sends it to the cloud server; the cloud server recognizes and executes the voice command and returns the voice command execution result to the charging box, so that the smart headset implements network service interaction.
  • An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the methods performed by the smart headset in the embodiments shown in FIGS. 3 to 5 are implemented.
  • the disclosed system, device, and method may be implemented in other ways.
  • the division of the unit is only a logical function division, and there may be other divisions in actual implementation.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored, or not carried out.
  • the displayed or discussed mutual coupling, direct coupling, or communication connection may be indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a dedicated computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium.
  • the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (for example, coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (for example, infrared, radio, microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device including a server, a data center, and the like integrated with one or more available media.
  • the available media may be a read-only memory (ROM), a random access memory (RAM), a magnetic medium such as a floppy disk, hard disk, magnetic tape, or magnetic disk, an optical medium such as a digital versatile disc (DVD), or a semiconductor medium such as a solid state disk (SSD).

Abstract

A network voice recognition method, a network service interaction method, and a smart headset (200). A charging box (22) in the smart headset (200) establishes a communication connection with a cloud server; the charging box (22) sends voice commands from the headset (21) to the cloud server, the cloud server performs voice command recognition, and network service interaction is carried out with the cloud server, so that the smart headset (200) can implement network voice recognition and network service interaction.

Description

Network voice recognition method, network service interaction method, and smart headset
Technical Field
This application relates to the field of smart terminal technologies, and in particular, to a network voice recognition method, a network service interaction method, and a smart headset.
Background
As shown in the schematic structural diagram of a wireless smart headset in FIG. 1, the wireless smart headset 100 includes two components: a wireless headset 11 and a charging box 12. The wireless headset 11 includes an audio capture/playback/codec unit 111 and a wireless audio unit 122. The audio capture/playback/codec unit 111 is used to capture audio data and play received audio data; the wireless audio unit 122 is used to implement wireless communication with a smart terminal and to establish an audio data transmission channel that carries the audio data captured by the wireless headset and the audio data to be played. The charging box 12 includes a charging/key control unit 121 and an energy-storage/power-supply unit 122, where the energy-storage/power-supply unit 122 is used to store energy in the battery and supply power to the headset.
Network voice recognition and network service interaction are currently popular with consumers, and solutions are being sought to support them on various smart terminals. At present, smart speakers and mobile phones can already support network voice recognition and service interaction. However, a smart speaker plays audio out loud and therefore offers poor privacy, while a smartphone's size is constrained by its screen and cannot meet the experience requirements of scenarios that demand portability, such as sports scenarios. Smart headsets offer good privacy and are easy to carry; however, no smart headset currently implements network voice recognition or network service interaction.
Therefore, how a smart headset can implement network voice recognition or network service interaction has become an urgent problem to be solved.
Summary
This application provides a network voice recognition method, a network service interaction method, and a smart headset, so that a smart headset can implement network voice recognition or network service interaction.
One aspect of this application provides a network voice recognition method applied to a smart headset, where the smart headset includes a headset and a charging box. The method includes: the charging box receives a voice command from the headset; the charging box sends the voice command to a cloud server; the charging box receives a voice command recognition result from the cloud server; the charging box sends the voice command recognition result to the headset, or the charging box executes the voice command and sends a voice command execution result to the headset; and the headset plays the voice command recognition result or the voice command execution result.
In this aspect, the charging box sends the voice command received from the headset to the cloud server, and the charging box then receives the voice command recognition result sent by the cloud server, which enables the smart headset to implement network voice recognition and also enables the charging box to execute the recognized voice command.
In one implementation, the method further includes: the headset collects the voice command; and the headset sends the collected voice command to the charging box.
In another implementation, the method further includes: the headset establishes a communication connection with the charging box; and the charging box establishes a communication connection with the cloud server.
In yet another implementation, after the charging box sends the voice command recognition result to the headset, the method further includes: the headset executes the voice command.
In this implementation, the recognized voice command is executed by the headset.
Another aspect of this application provides a network service interaction method applied to a smart headset, where the smart headset includes a headset and a charging box. The method includes: the charging box receives a voice command from the headset, where the voice command is used to instruct acquisition of audio data; the charging box sends the voice command to a cloud server; the charging box receives audio data from the cloud server; the charging box sends the audio data to the headset; and the headset decodes and plays the audio data.
In this aspect, the charging box receives the voice command for acquiring audio data sent by the headset and sends it to the cloud server; the cloud server recognizes and executes the voice command and sends audio data to the charging box; and the charging box sends the audio data to the headset for decoding and playback, so that the smart headset can implement network service interaction.
In one implementation, the method further includes: the charging box decodes the audio data received from the cloud server; and the charging box sending the audio data to the headset includes: the charging box sends the decoded audio data to the headset.
In this implementation, when the audio data formats used by the charging box and the headset are inconsistent, the charging box also decodes the audio data received from the cloud server and then sends it to the headset for further decoding.
In another implementation, after the charging box sends the voice command to the cloud server, the method further includes: the charging box receives a voice command recognition result from the cloud server; and the charging box sends the voice command recognition result to the headset.
In this implementation, when voice recognition and network service interaction are not integrated in the same cloud server, the charging box also receives the voice command recognition result sent by the cloud server and sends it to the headset.
In yet another implementation, the method further includes: the headset collects the voice command; and the headset sends the collected voice command to the charging box.
In yet another implementation, the method further includes: the headset establishes a communication connection with the charging box; and the charging box establishes a communication connection with the cloud server.
Yet another aspect of this application provides a network service interaction method applied to a smart headset, where the smart headset includes a headset and a charging box. The method includes: the charging box receives a voice command from the headset; the charging box sends the voice command to a cloud server; the charging box receives a voice command execution result from the cloud server; the charging box sends the voice command execution result to the headset; and the headset plays the voice command execution result.
In this aspect, the charging box receives the voice command sent by the headset and sends it to the cloud server; the cloud server recognizes and executes the voice command and returns the voice command execution result to the charging box, so that the smart headset implements network service interaction.
Yet another aspect of this application provides a smart headset, including a headset and a charging box. The charging box is used to receive a voice command from the headset; the charging box is further used to send the voice command to a cloud server; the charging box is further used to receive a voice command recognition result from the cloud server; the charging box is further used to send the voice command recognition result to the headset, or the charging box is further used to execute the voice command and send a voice command execution result to the headset; and the headset is used to play the voice command recognition result or the voice command execution result.
In one implementation, the headset is further used to collect the voice command; and the headset is further used to send the collected voice command to the charging box.
In another implementation, the headset is further used to establish a communication connection with the charging box; and the charging box is further used to establish a communication connection with the cloud server.
In yet another implementation, the headset is further used to execute the voice command.
Yet another aspect of this application provides a smart headset, including a headset and a charging box. The charging box is used to receive a voice command from the headset, where the voice command is used to instruct acquisition of audio data; the charging box is further used to send the voice command to a cloud server; the charging box is further used to receive audio data from the cloud server; the charging box is further used to send the audio data to the headset; and the headset is used to decode and play the audio data.
In one implementation, the charging box is further used to decode the audio data received from the cloud server; and the charging box is further used to send the decoded audio data to the headset.
In another implementation, the charging box is further used to receive a voice command recognition result from the cloud server; and the charging box is further used to send the voice command recognition result to the headset.
In yet another implementation, the headset is further used to collect the voice command; and the headset is further used to send the collected voice command to the charging box.
In yet another implementation, the headset is further used to establish a communication connection with the charging box; and the charging box is further used to establish a communication connection with the cloud server.
Yet another aspect of this application provides a smart headset, including a headset and a charging box. The charging box is used to receive a voice command from the headset; the charging box is further used to send the voice command to a cloud server; the charging box is further used to receive a voice command execution result from the cloud server; the charging box is further used to send the voice command execution result to the headset; and the headset is used to play the voice command execution result.
Yet another aspect of this application provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to perform the methods described in the above aspects.
Yet another aspect of this application provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the methods described in the above aspects.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an existing smart headset;
FIG. 2 is a schematic diagram of the general structure of a smart headset provided by an embodiment of this application;
FIG. 3 is a schematic flowchart of a network voice recognition method provided by an embodiment of this application;
FIG. 4 is a schematic flowchart of network service interaction provided by an embodiment of this application;
FIG. 5 is a schematic flowchart of another network service interaction provided by an embodiment of this application.
Detailed Description
The embodiments of this application are described below with reference to the accompanying drawings.
Network voice recognition refers to a function in which a terminal collects a user's voice command and transmits it over a network to a cloud server; the cloud server recognizes the voice command and returns the result to the terminal, where it is executed.
A network interaction service refers to a function in which a terminal sends a service request to a cloud server, and the cloud server responds to the request and sends the execution result to the terminal.
In this application, the network voice recognition service and the network interaction service may be integrated in one server or performed by different servers.
Referring to FIG. 2, FIG. 2 is a schematic diagram of the general structure of a smart headset provided by an embodiment of this application. The smart headset includes a headset 21 and a charging box 22. Generally, the headset 21 is wirelessly connected to the charging box 22. The structure of the headset 21 is the same as that shown in FIG. 1 and includes an audio capture/playback/codec unit 211 and a wireless audio unit 212. The audio capture/playback/codec unit 211 specifically includes functions such as voice wake-up, local audio playback, audio playback, audio capture, and audio encoding/decoding. It should be noted that the functions included in the audio capture/playback/codec unit 211 may differ depending on the network functions to be implemented. For example, when the smart headset is used to implement network voice recognition, the audio capture/playback/codec unit 211 may include functions such as voice wake-up, audio capture, and audio playback; when the smart headset is used to implement a network interaction service, for example acquiring audio data, the audio capture/playback/codec unit 211 includes functions such as voice wake-up, audio playback, audio capture, and audio encoding/decoding. When the smart headset is used both for network voice recognition and for network interaction services, the audio capture/playback/codec unit 211 may include all of the above functions. The wireless audio unit 212 can perform Bluetooth (BT) communication and the like with the wireless audio unit 223 of the charging box 22.
The charging box 22 includes an Internet service unit 221, an audio codec unit 222, a wireless audio unit 223, a wireless network unit 224, a charging/key control unit 225, and an energy-storage/power-supply unit 226. Since the charging box 22 has the function of connecting to a cloud server, it may also be called a networked charging box. The Internet service unit 221 includes a streaming media service software development kit (SDK) and a voice engine cloud SDK. The streaming media service SDK is the streaming media software that interfaces with the cloud server, and the voice engine cloud SDK is the voice software that interfaces with the cloud server. Depending on the functions implemented by the smart headset, both of these SDKs or only one of them may be included. The audio codec unit 222 includes functions such as local playback, streaming media playback, and audio encoding/decoding. Depending on the functions implemented by the smart headset, the audio codec unit 222 is optional. The wireless network unit 224 is used to establish a communication connection with the cloud server, using communication methods such as 4G/3G/2G, Wi-Fi, and TCP/IP. The functions of the charging/key control unit 225 and the energy-storage/power-supply unit 226 are the same as those of the charging/key control unit 121 and the energy-storage/power-supply unit 122 of the embodiment shown in FIG. 1, respectively.
Referring to FIG. 3, FIG. 3 is a schematic flowchart of a network voice recognition method provided by an embodiment of this application, applied to the smart headset shown in FIG. 2. The method includes the following steps:
S101. The headset sends a voice command to the charging box.
In this step, after the user puts on the headset, the user wakes the headset up by voice or manually and issues a voice command. The headset collects the user's voice command through a microphone. Optionally, before S101, the method further includes: the headset collects the voice command; and the headset sends the collected voice command to the charging box.
The headset may also obtain a local voice command.
The headset itself cannot recognize voice commands. Therefore, after collecting a voice command, the headset converts it into voice command data and sends it through the wireless audio unit 212 to the wireless audio unit 223 of the charging box. The voice command may be a control command or another type of command.
Optionally, before S101, the method further includes:
the headset establishes a communication connection with the charging box; and
the charging box establishes a communication connection with the cloud server.
Specifically, the headset establishes a communication connection with the wireless audio unit 223 of the charging box through its wireless audio unit 212; the connection may be, for example, a Bluetooth connection.
The charging box establishes a communication connection with the cloud server through the wireless network unit 224, for example using 4G/3G/2G, Wi-Fi, or TCP/IP.
S102. After receiving the voice command, the charging box sends the voice command to the cloud server.
After the wireless audio unit 223 of the charging box receives the voice command sent by the wireless audio unit 212 of the headset, the charging box, which likewise cannot recognize the voice command itself, sends the voice command received from the headset to the cloud server through the wireless network unit 224.
In another embodiment, the charging box may also have a local voice command recognition capability. After receiving the voice command sent by the headset, the charging box first performs local recognition; if local recognition fails, it forwards the voice command to the cloud server for network recognition.
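The local-first recognition with cloud fallback described above can be sketched as follows. This is an illustrative sketch only: the vocabulary table and the cloud callback are invented stand-ins, not part of the patent.

```python
# Hypothetical sketch: the charging box first tries local recognition of a
# voice command and falls back to the cloud server only when that fails.
def recognize_command(command, local_vocab, cloud_recognize):
    """Return a recognition result, preferring the box's local capability."""
    if command in local_vocab:        # local recognition succeeds, no network needed
        return local_vocab[command]
    return cloud_recognize(command)   # otherwise forward to the cloud server (S102)

local = {"pause": "CMD_PAUSE", "next track": "CMD_NEXT"}
cloud = lambda cmd: "CLOUD_RESULT(" + cmd + ")"  # stand-in for network recognition

print(recognize_command("pause", local, cloud))        # local hit
print(recognize_command("play a song", local, cloud))  # cloud fallback
```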
S103. After receiving the voice command, the cloud server performs voice command recognition.
The cloud server has a voice recognition function. After receiving the voice command sent by the charging box, it recognizes the voice command and obtains a voice command recognition result. For the voice recognition process, reference may be made to existing voice recognition technology, which is not described in detail here.
S104. The cloud server sends the voice command recognition result to the charging box.
The cloud server sends the voice command recognition result to the charging box over the wireless network.
In one implementation, when the voice command is a control command, the charging box may execute the voice command. In this case, after S104, the method proceeds to S105: after receiving the voice command recognition result, the charging box executes the voice command.
For example, if the voice command is "Play the song 《爱的诺言》", the charging box, after receiving the voice command recognition result sent by the cloud server, retrieves the song 《爱的诺言》 from its memory or obtains it from a server.
S106. The charging box sends a voice command execution result to the headset.
As in the example above, the voice command execution result may be the audio data of the song.
S107. After receiving the voice command execution result, the headset plays the voice command execution result.
As in the example above, after receiving the audio data of the song, the headset plays the song.
Optionally, after S107, the method further includes: the headset executes the voice command.
In another implementation, when the voice command is a control command, the voice command may also be executed by the headset. In this case, after S104, the method proceeds to S106: after receiving the voice command recognition result, the charging box sends the voice command recognition result to the headset.
For example, if the voice command is "Play the song 《爱的诺言》", the charging box sends the voice command recognition result, "Play the song 《爱的诺言》", to the headset.
S107. After receiving the voice command recognition result, the headset plays the voice command recognition result.
As in the example above, after receiving the voice command recognition result "Play the song 《爱的诺言》", the headset obtains the audio data of the song from the charging box through the wireless audio unit 212, or obtains it from local memory, and plays the song.
Due to the size and power-supply limitations of a headset, the headset alone cannot implement the network voice recognition function with the cloud server. In this embodiment, by integrating the above units into the charging box, the network voice recognition function with the cloud server can be implemented. Combined with the existing headset, a smart headset integrating the network voice recognition function can be realized.
According to the network voice recognition method provided by this embodiment of this application, the charging box sends the voice command received from the headset to the cloud server, and the charging box then receives the voice command recognition result sent by the cloud server, which enables the smart headset to implement network voice recognition and also enables the charging box to execute the recognized voice command.
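The FIG. 3 relay flow (collect on the headset, forward through the charging box, recognize in the cloud, route the result back) can be summarized in a short sketch. This is a hedged illustration: the class and method names are assumptions made for the example, and the "recognition" step simply echoes the text; none of this is the patent's implementation.

```python
class CloudServer:
    def recognize(self, voice_data):
        # S103: a real server would run speech recognition; here we just echo.
        return voice_data.decode("utf-8")

class ChargingBox:
    def __init__(self, cloud):
        self.cloud = cloud  # link held by the wireless network unit 224

    def relay(self, voice_data):
        # S102: forward the headset's command; S104: receive the result.
        return self.cloud.recognize(voice_data)

class Headset:
    def __init__(self, box):
        self.box = box      # Bluetooth link between units 212 and 223

    def issue_command(self, text):
        # S101: send the collected command; S107: play what comes back.
        return self.box.relay(text.encode("utf-8"))

headset = Headset(ChargingBox(CloudServer()))
result = headset.issue_command("play the song")
```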
Referring to FIG. 4, FIG. 4 is a schematic flowchart of a network service interaction method provided by an embodiment of this application, applied to the smart headset shown in FIG. 2. The method includes the following steps:
S201. The headset sends a voice command to the charging box, where the voice command is used to instruct acquisition of audio data.
In this step, after the user puts on the headset, the user wakes the headset up by voice or manually and issues a voice command. The headset collects the user's voice command through a microphone. Optionally, before S201, the method further includes: the headset collects the voice command; and the headset sends the collected voice command to the charging box.
The headset may also obtain a local voice command.
The headset itself cannot recognize voice commands. Therefore, after collecting a voice command, the headset converts it into voice command data and sends it through the wireless audio unit 212 to the wireless audio unit 223 of the charging box. The voice command is used to instruct acquisition of audio data, for example, acquiring the audio data of the song 《爱的诺言》.
Optionally, before S201, the method further includes:
the headset establishes a communication connection with the charging box; and
the charging box establishes a communication connection with the cloud server.
Specifically, the headset establishes a communication connection with the wireless audio unit 223 of the charging box through its wireless audio unit 212; the connection may be, for example, a Bluetooth connection.
The charging box establishes a communication connection with the cloud server through the wireless network unit 224, for example using 4G/3G/2G, Wi-Fi, or TCP/IP.
S202. After receiving the voice command, the charging box sends the voice command to the cloud server.
After the wireless audio unit 223 of the charging box receives the voice command sent by the wireless audio unit 212 of the headset, the charging box, which likewise cannot recognize the voice command itself, sends the voice command received from the headset to the cloud server through the wireless network unit 224.
In another embodiment, the charging box has the function of storing local media resource files. After a command to play a local media resource is recognized (locally or via the network), the charging box reads the local resource file, decodes it, and sends it to the headset for playback.
S203. The cloud server recognizes the voice command and obtains the audio data indicated by the voice command.
The cloud server has a voice recognition function and a network audio service function. Optionally, the voice recognition function and the network audio service function may be integrated in one server or implemented by different servers.
When the voice recognition function and the network audio service function are integrated in one server, the cloud server, after receiving the voice command sent by the charging box, recognizes the voice command and obtains the audio data indicated by the voice command.
When the voice recognition function and the network audio service function are implemented by different servers, the cloud server, after recognizing the voice command, sends a request for the audio data to another server that has the network audio service function and receives the audio data sent by that server. Optionally, after S203, the method further includes: the cloud server sends a voice command recognition result to the charging box; and the charging box, after receiving the voice command recognition result from the cloud server, sends the voice command recognition result to the headset.
S204. The cloud server sends the audio data to the charging box.
S205. After receiving the audio data sent by the cloud server, the charging box sends the audio data to the headset.
Specifically, the wireless network unit 224 of the charging box receives the audio data sent by the cloud server and sends the audio data through the wireless audio unit 223 to the wireless audio unit 212 of the headset.
S206. The headset decodes and plays the audio data.
When the format of the audio data sent by the charging box is consistent with the format in which the headset plays audio data, the charging box can, after receiving the audio data sent by the cloud server, send it directly to the headset, which decodes and plays the audio data.
Optionally, when the format of the audio data sent by the charging box is inconsistent with the format in which the headset plays audio data, after S204 the method further includes: the charging box decodes the audio data received from the cloud server.
In this case, S205 is specifically: the charging box sends the decoded audio data to the headset.
Specifically, since the format of the audio data sent by the charging box is inconsistent with the format in which the headset plays audio data, the charging box needs to decode the audio data received from the cloud server and send the decoded audio data to the headset. After receiving the audio data decoded by the charging box, the headset decodes the audio data again according to its own playback format and plays the result.
According to the network service interaction method provided by this embodiment of this application, the charging box receives the voice command for acquiring audio data sent by the headset and sends it to the cloud server; the cloud server recognizes and executes the voice command and sends the audio data to the charging box; and the charging box sends the audio data to the headset for decoding and playback, so that the smart headset can implement network service interaction.
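The format decision in S204 to S206 (pass the audio through when formats match, decode in the charging box when they do not) can be sketched as below. The format tags and the `decode` stand-in are illustrative assumptions, not codecs named by the patent.

```python
def forward_audio(cloud_audio, cloud_format, headset_format, decode):
    """Return the (payload, format) pair the charging box sends to the headset."""
    if cloud_format == headset_format:
        # Formats match: forward the cloud audio unchanged (plain S205).
        return cloud_audio, cloud_format
    # Formats differ: the box decodes first; the headset re-decodes on its side.
    return decode(cloud_audio), headset_format

decode = lambda data: b"PCM:" + data  # illustrative decoder stand-in
passthrough = forward_audio(b"sbc-frames", "SBC", "SBC", decode)
transcoded = forward_audio(b"aac-frames", "AAC", "SBC", decode)
```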
Referring to FIG. 5, FIG. 5 is a schematic flowchart of a network service interaction method provided by an embodiment of this application, applied to the smart headset shown in FIG. 2. The method includes the following steps:
S301. The headset sends a voice command to the charging box.
Unlike in the embodiment shown in FIG. 4, this voice command may be a network service command to be carried out by the cloud server. For example, the user's daily step counts are uploaded to the cloud server. The user is currently walking briskly and issues a voice command to the headset, instructing the cloud server to give a prompt when the user's step count exceeds the average of the previous daily step counts. For example, the user issues the voice command "Is my step count today more than the average daily step count?"
S302. After the charging box receives the voice command from the headset, the charging box sends the voice command to the cloud server.
After the wireless audio unit 223 of the charging box receives the voice command sent by the wireless audio unit 212 of the headset, the charging box, which likewise cannot recognize the voice command itself, sends the voice command received from the headset to the cloud server through the wireless network unit 224.
S303. After receiving the voice command, the cloud server recognizes the voice command and executes the voice command.
The cloud server has a voice recognition function and recognizes the user's voice command "Is my step count today more than the average daily step count?". The cloud server executes the voice command by comparing the user's current step count, uploaded by the headset or the charging box, with the user's daily average step count. When the user's current step count exceeds the daily average, the voice command execution result is "You have exceeded the daily average step count!"
S304. The cloud server sends the voice command execution result to the charging box.
S305. After the charging box receives the voice command execution result from the cloud server, the charging box sends the voice command execution result to the headset.
Specifically, the wireless network unit 224 of the charging box receives the voice command execution result sent by the cloud server and sends the voice command execution result through the wireless audio unit 223 to the wireless audio unit 212 of the headset.
S306. After receiving the voice command execution result, the headset plays the voice command execution result.
After receiving the voice command execution result, the wireless audio unit 212 of the headset plays the voice command execution result.
Of course, after the cloud server executes the voice command, the command execution result may also take other prompting forms; for example, the charging box may give a vibration prompt after receiving the command execution result.
According to the network service interaction method provided by this embodiment of this application, the charging box receives the voice command sent by the headset and sends it to the cloud server; the cloud server recognizes and executes the voice command and returns the voice command execution result to the charging box, so that the smart headset implements network service interaction.
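The server-side execution in S303 reduces to comparing the current step count with a daily average. A minimal sketch, with the function name and fallback message wording invented for the example:

```python
def execute_step_query(current_steps, daily_counts):
    """Compare today's steps against the stored daily average (S303)."""
    daily_average = sum(daily_counts) / len(daily_counts)
    if current_steps > daily_average:
        return "You have exceeded the daily average step count!"
    return "Not yet - your daily average is %.0f steps." % daily_average

over = execute_step_query(9000, [6000, 7000, 8000])   # average is 7000
under = execute_step_query(5000, [6000, 7000, 8000])
```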
An embodiment of this application also provides a smart headset, the structure of which is shown in FIG. 2. Specifically:
the charging box is used to receive a voice command from the headset;
the charging box is further used to send the voice command to a cloud server;
the charging box is further used to receive a voice command recognition result from the cloud server;
the charging box is further used to send the voice command recognition result to the headset, or the charging box is further used to execute the voice command and send a voice command execution result to the headset; and
the headset is used to play the voice command recognition result or the voice command execution result.
Optionally, the headset is further used to collect the voice command; and the headset is further used to send the collected voice command to the charging box.
Optionally, the headset is further used to establish a communication connection with the charging box; and the charging box is further used to establish a communication connection with the cloud server.
Optionally, the headset is further used to execute the voice command.
For the specific function implementation, reference may be made to the embodiment shown in FIG. 3.
According to the smart headset provided by this embodiment of this application, the charging box sends the voice command received from the headset to the cloud server, and the charging box then receives the voice command recognition result sent by the cloud server, which enables the smart headset to implement network voice recognition and also enables the charging box to execute the recognized voice command.
An embodiment of this application further provides another smart headset, the structure of which is shown in FIG. 2. Specifically:
the charging box is used to receive a voice command from the headset, where the voice command is used to instruct acquisition of audio data;
the charging box is further used to send the voice command to a cloud server;
the charging box is further used to receive audio data from the cloud server;
the charging box is further used to send the audio data to the headset; and
the headset is used to decode and play the audio data.
Optionally, the charging box is further used to decode the audio data received from the cloud server; and the charging box is further used to send the decoded audio data to the headset.
Optionally, the charging box is further used to receive a voice command recognition result from the cloud server; and the charging box is further used to send the voice command recognition result to the headset.
Optionally, the headset is further used to collect the voice command; and the headset is further used to send the collected voice command to the charging box.
Optionally, the headset is further used to establish a communication connection with the charging box; and the charging box is further used to establish a communication connection with the cloud server.
For the specific function implementation, reference may be made to the embodiment shown in FIG. 4.
According to the smart headset provided by this embodiment of this application, the charging box receives the voice command for acquiring audio data sent by the headset and sends it to the cloud server; the cloud server recognizes and executes the voice command and sends audio data to the charging box; and the charging box sends the audio data to the headset for decoding and playback, so that the smart headset can implement network service interaction.
An embodiment of this application further provides yet another smart headset, the structure of which is shown in FIG. 2. Specifically:
the charging box is used to receive a voice command from the headset;
the charging box is further used to send the voice command to a cloud server;
the charging box is further used to receive a voice command execution result from the cloud server;
the charging box is further used to send the voice command execution result to the headset; and
the headset is used to play the voice command execution result.
For the specific function implementation, reference may be made to the embodiment shown in FIG. 5.
According to the smart headset provided by this embodiment of this application, the charging box receives the voice command sent by the headset and sends it to the cloud server; the cloud server recognizes and executes the voice command and returns the voice command execution result to the charging box, so that the smart headset implements network service interaction.
An embodiment of this application further provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the methods performed by the smart headset in the embodiments shown in FIG. 3 to FIG. 5 are implemented.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the division into units is only a division by logical function, and there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. The mutual coupling, direct coupling, or communication connections shown or discussed may be indirect coupling or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purposes of the solutions of the embodiments.
In the above embodiments, implementation may be entirely or partially by software, hardware, firmware, or any combination thereof. When software is used, implementation may be entirely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of this application are generated in whole or in part. The computer may be a general-purpose computer, a dedicated computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted through a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) means. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or a data center that integrates one or more available media. The available medium may be a read-only memory (ROM), a random access memory (RAM), a magnetic medium such as a floppy disk, a hard disk, a magnetic tape, or a magnetic disk, an optical medium such as a digital versatile disc (DVD), or a semiconductor medium such as a solid state disk (SSD).

Claims (21)

  1. A network voice recognition method, applied to a smart headset, the smart headset comprising a headset and a charging box, wherein the method comprises:
    the charging box receiving a voice command from the headset;
    the charging box sending the voice command to a cloud server;
    the charging box receiving a voice command recognition result from the cloud server;
    the charging box sending the voice command recognition result to the headset, or the charging box executing the voice command and sending a voice command execution result to the headset; and
    the headset playing the voice command recognition result or the voice command execution result.
  2. The method according to claim 1, wherein the method further comprises:
    the headset collecting the voice command; and
    the headset sending the collected voice command to the charging box.
  3. The method according to claim 1 or 2, wherein the method further comprises:
    the headset establishing a communication connection with the charging box; and
    the charging box establishing a communication connection with the cloud server.
  4. The method according to claim 1, wherein after the charging box sends the voice command recognition result to the headset, the method further comprises:
    the headset executing the voice command.
  5. A network service interaction method, applied to a smart headset, the smart headset comprising a headset and a charging box, wherein the method comprises:
    the charging box receiving a voice command from the headset, the voice command being used to instruct acquisition of audio data;
    the charging box sending the voice command to a cloud server;
    the charging box receiving audio data from the cloud server;
    the charging box sending the audio data to the headset; and
    the headset decoding and playing the audio data.
  6. The method according to claim 5, wherein the method further comprises:
    the charging box decoding the audio data received from the cloud server; and
    the charging box sending the audio data to the headset comprises:
    the charging box sending the decoded audio data to the headset.
  7. The method according to claim 5, wherein after the charging box sends the voice command to the cloud server, the method further comprises:
    the charging box receiving a voice command recognition result from the cloud server; and
    the charging box sending the voice command recognition result to the headset.
  8. The method according to any one of claims 5 to 7, wherein the method further comprises:
    the headset collecting the voice command; and
    the headset sending the collected voice command to the charging box.
  9. The method according to any one of claims 5 to 8, wherein the method further comprises:
    the headset establishing a communication connection with the charging box; and
    the charging box establishing a communication connection with the cloud server.
  10. A network service interaction method, applied to a smart headset, the smart headset comprising a headset and a charging box, wherein the method comprises:
    the charging box receiving a voice command from the headset;
    the charging box sending the voice command to a cloud server;
    the charging box receiving a voice command execution result from the cloud server;
    the charging box sending the voice command execution result to the headset; and
    the headset playing the voice command execution result.
  11. A smart headset, comprising a headset and a charging box, wherein:
    the charging box is configured to receive a voice command from the headset;
    the charging box is further configured to send the voice command to a cloud server;
    the charging box is further configured to receive a voice command recognition result from the cloud server;
    the charging box is further configured to send the voice command recognition result to the headset, or the charging box is further configured to execute the voice command and send a voice command execution result to the headset; and
    the headset is configured to play the voice command recognition result or the voice command execution result.
  12. The smart headset according to claim 11, wherein:
    the headset is further configured to collect the voice command; and
    the headset is further configured to send the collected voice command to the charging box.
  13. The smart headset according to claim 11 or 12, wherein:
    the headset is further configured to establish a communication connection with the charging box; and
    the charging box is further configured to establish a communication connection with the cloud server.
  14. The smart headset according to claim 11, wherein:
    the headset is further configured to execute the voice command.
  15. A smart headset, comprising a headset and a charging box, wherein:
    the charging box is configured to receive a voice command from the headset, the voice command being used to instruct acquisition of audio data;
    the charging box is further configured to send the voice command to a cloud server;
    the charging box is further configured to receive audio data from the cloud server;
    the charging box is further configured to send the audio data to the headset; and
    the headset is configured to decode and play the audio data.
  16. The smart headset according to claim 15, wherein:
    the charging box is further configured to decode the audio data received from the cloud server; and
    the charging box is further configured to send the decoded audio data to the headset.
  17. The smart headset according to claim 15, wherein:
    the charging box is further configured to receive a voice command recognition result from the cloud server; and
    the charging box is further configured to send the voice command recognition result to the headset.
  18. The smart headset according to any one of claims 15 to 17, wherein:
    the headset is further configured to collect the voice command; and
    the headset is further configured to send the collected voice command to the charging box.
  19. The smart headset according to any one of claims 15 to 18, wherein:
    the headset is further configured to establish a communication connection with the charging box; and
    the charging box is further configured to establish a communication connection with the cloud server.
  20. A smart headset, comprising a headset and a charging box, wherein:
    the charging box is configured to receive a voice command from the headset;
    the charging box is further configured to send the voice command to a cloud server;
    the charging box is further configured to receive a voice command execution result from the cloud server;
    the charging box is further configured to send the voice command execution result to the headset; and
    the headset is configured to play the voice command execution result.
  21. A computer-readable storage medium on which a computer program is stored, wherein when the program is executed by a processor, the method according to any one of claims 1 to 4, the method according to any one of claims 5 to 9, or the method according to claim 10 is implemented.
PCT/CN2019/115873 2018-12-03 2019-11-06 网络语音识别方法、网络业务交互方法及智能耳机 WO2020114181A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811465464.4 2018-12-03
CN201811465464.4A CN111276135B (zh) 2018-12-03 2018-12-03 网络语音识别方法、网络业务交互方法及智能耳机

Publications (1)

Publication Number Publication Date
WO2020114181A1 true WO2020114181A1 (zh) 2020-06-11

Family

ID=70974054

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/115873 WO2020114181A1 (zh) 2018-12-03 2019-11-06 网络语音识别方法、网络业务交互方法及智能耳机

Country Status (2)

Country Link
CN (1) CN111276135B (zh)
WO (1) WO2020114181A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113286212B (zh) * 2021-05-20 2022-07-12 北京明略软件系统有限公司 一种佩戴式音频采集组件
CN113421570A (zh) * 2021-06-21 2021-09-21 紫优科技(深圳)有限公司 一种智能耳机身份认证方法及装置
CN113380251A (zh) * 2021-06-22 2021-09-10 紫优科技(深圳)有限公司 一种基于智能耳机的移动语音交互方法及装置
CN113411709A (zh) * 2021-06-28 2021-09-17 紫优科技(深圳)有限公司 一种云智能耳机系统的设计方法及系统

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160073188A1 (en) * 2014-09-05 2016-03-10 Epickal AB Wireless earbuds
CN107333201A (zh) * 2017-07-24 2017-11-07 歌尔科技有限公司 一种翻译耳机收纳盒、无线翻译耳机和无线翻译系统
CN108509428A (zh) * 2018-02-26 2018-09-07 深圳市百泰实业股份有限公司 耳机翻译方法和系统
CN108550367A (zh) * 2018-05-18 2018-09-18 深圳傲智天下信息科技有限公司 一种便携式智能语音交互控制设备、方法及系统
CN108549206A (zh) * 2018-07-12 2018-09-18 深圳傲智天下信息科技有限公司 一种带具有语音交互功能耳机的智能手表
CN108564949A (zh) * 2018-05-18 2018-09-21 深圳傲智天下信息科技有限公司 一种tws耳机、腕带式ai语音交互装置及系统

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3119248U (ja) * 2005-12-06 2006-02-16 ▲シウ▼瑩企業有限公司 無線イヤホンデバイスおよび充電ベースのアセンブリー
CN102594988A (zh) * 2012-02-10 2012-07-18 深圳市中兴移动通信有限公司 一种实现蓝牙耳机语音识别自动配对连接的方法及系统
US20140119554A1 (en) * 2012-10-25 2014-05-01 Elwha Llc Methods and systems for non-volatile memory in wireless headsets
EP3594828A1 (en) * 2013-06-07 2020-01-15 Apple Inc. Intelligent automated assistant
US10453450B2 (en) * 2015-10-20 2019-10-22 Bragi GmbH Wearable earpiece voice command control system and method
CN106850847A (zh) * 2017-03-10 2017-06-13 上海斐讯数据通信技术有限公司 基于云平台的语音信息共享方法及其智能耳机
CN206977651U (zh) * 2017-06-05 2018-02-06 广东朝阳电子科技股份有限公司 带WiFi功能之TWS蓝牙耳机装置的电路结构
CN207518810U (zh) * 2017-11-20 2018-06-19 深圳市胜兴旺精密科技有限公司 充电盒
CN108900945A (zh) * 2018-09-29 2018-11-27 上海与德科技有限公司 蓝牙耳机盒和语音识别方法、服务器和存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160073188A1 (en) * 2014-09-05 2016-03-10 Epickal AB Wireless earbuds
CN107333201A (zh) * 2017-07-24 2017-11-07 歌尔科技有限公司 一种翻译耳机收纳盒、无线翻译耳机和无线翻译系统
CN108509428A (zh) * 2018-02-26 2018-09-07 深圳市百泰实业股份有限公司 耳机翻译方法和系统
CN108550367A (zh) * 2018-05-18 2018-09-18 深圳傲智天下信息科技有限公司 一种便携式智能语音交互控制设备、方法及系统
CN108564949A (zh) * 2018-05-18 2018-09-21 深圳傲智天下信息科技有限公司 一种tws耳机、腕带式ai语音交互装置及系统
CN108549206A (zh) * 2018-07-12 2018-09-18 深圳傲智天下信息科技有限公司 一种带具有语音交互功能耳机的智能手表

Also Published As

Publication number Publication date
CN111276135A (zh) 2020-06-12
CN111276135B (zh) 2023-06-20

Similar Documents

Publication Publication Date Title
WO2020114181A1 (zh) 网络语音识别方法、网络业务交互方法及智能耳机
US10809964B2 (en) Portable intelligent voice interactive control device, method and system
CN208689384U (zh) 一种带具有语音交互功能耳机的智能手表
WO2019218368A1 (zh) 一种tws耳机、腕带式ai语音交互装置及系统
CN108886647B (zh) 耳机降噪方法及装置、主耳机、从耳机及耳机降噪系统
US11501779B2 (en) Bluetooth speaker base, method and system for controlling thereof
WO2014090040A1 (en) Method of using a mobile device as a microphone, method of audio playback, and related device and system
WO2020132818A1 (zh) 无线短距离音频共享方法及电子设备
WO2018152679A1 (zh) 音频文件的传输、接收方法及装置、设备及其系统
CN104378710A (zh) 一种无线音箱
CN110958298B (zh) 无线音频播放设备及其无线互联网音频播放方法
WO2015165415A1 (en) Method and apparatus for playing audio data
CN103686540A (zh) 一种主动式无线网络音响设备及其使用方法
WO2018103483A1 (zh) 具有遥控器的智能头盔及其遥控方法
CN104732993A (zh) 无线路由音乐播放器
CN114258003A (zh) 音频播放控制方法、系统、设备及存储介质
US20140163971A1 (en) Method of using a mobile device as a microphone, method of audio playback, and related device and system
WO2020082710A1 (zh) 一种蓝牙音箱语音交互控制方法、装置及系统
CN111556406B (zh) 音频处理方法、音频处理装置及耳机
CN109660914A (zh) 一种分布式蓝牙音响控制系统
CN203912165U (zh) 蓝牙音箱系统
WO2017049497A1 (zh) 一种实现对讲的方法及智能手环
WO2015109844A1 (zh) 便携式宽带无线装置
CN105679344B (zh) 音频播放方法及装置
JP2007104184A (ja) 情報記憶装置、情報処理方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19893996

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19893996

Country of ref document: EP

Kind code of ref document: A1