CN116524919A - Equipment awakening method, related device and communication system - Google Patents



Publication number
CN116524919A
Authority
CN
China
Prior art keywords
wake
voice
word
electronic device
audio energy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210075546.8A
Other languages
Chinese (zh)
Inventor
孙渊
李树为
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202210075546.8A
Publication of CN116524919A


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223: Execution procedure of a spoken command
    • G10L2015/225: Feedback of the input speech
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Telephone Function (AREA)

Abstract

The application provides a device wake-up method, a related apparatus, and a communication system. Multiple voice wake-up devices may each detect whether the collected sound contains a pre-wake-up word, which is part of the wake-up word. When the pre-wake-up word is detected, the devices may negotiate to determine a responding device according to the audio energy that each device obtains from the pre-wake-up voice. The responding device may enter a wake-up state and respond to the user after detecting the wake-up word. Negotiation can therefore start before the user has finished speaking the wake-up word, and the responding device enters the wake-up state once the wake-up word is detected. In scenarios with multiple voice wake-up devices, the method can improve the response speed after the wake-up word is detected without reducing the wake-up rate.

Description

Device wake-up method, related apparatus and communication system

Technical Field

The present application relates to the field of terminal technologies, and in particular to a device wake-up method, a related apparatus, and a communication system.

Background

With the development of electronic devices such as mobile phones, tablet computers, and smart home devices, more and more electronic devices have a voice wake-up capability. After detecting a wake-up voice, an electronic device with this capability can enter a wake-up state, recognize the user's voice command, and perform the operation corresponding to that command. A wake-up voice is a voice that contains the wake-up word.

However, in a scenario where there are multiple electronic devices with a voice wake-up capability and the same wake-up word is used to wake all of them, the devices need to negotiate after detecting the wake-up voice and determine one electronic device as the responding device, which responds to the user's voice command. This negotiation takes time. As a result, when multiple voice wake-up devices are present in the environment, the devices respond slowly and the user experience is poor.

Summary

The present application provides a device wake-up method, a related apparatus, and a communication system. In the method, after detecting a pre-wake-up voice containing the pre-wake-up word, an electronic device can determine which device is to respond, based on the pre-wake-up audio energy it determines itself and the audio energies that other electronic devices determine after detecting a pre-wake-up voice containing the pre-wake-up word. The pre-wake-up word may be a part of the wake-up word. When a wake-up voice containing the wake-up word is detected, the device determined in this way may respond to the user. The method moves the negotiation among multiple electronic devices that share a wake-up word to an earlier point in time, which reduces the time the user waits for a response after speaking the wake-up word. This can improve the response speed of the electronic device after the wake-up voice is detected and improve the user experience.

In a first aspect, the present application provides a device wake-up method. A first electronic device may detect a first pre-wake-up voice containing the pre-wake-up word and obtain a first audio energy from the first pre-wake-up voice. The first electronic device may receive M audio energies sent by M electronic devices, where each of the M audio energies is obtained by one of the M electronic devices from a detected pre-wake-up voice containing the pre-wake-up word, and M is a positive integer. The first electronic device may determine, according to the first audio energy and the M audio energies, that the first electronic device is the responding device. When a first wake-up voice containing the wake-up word is detected, a first application in the first electronic device may enter a wake-up state. The pre-wake-up word is a part of the wake-up word, and in the wake-up state the first application is used to detect and respond to voice commands so as to perform the operations corresponding to those commands.

The first electronic device and the M electronic devices have the same wake-up word. The first pre-wake-up voice used to determine the first audio energy and the pre-wake-up voices used to determine the M audio energies may be detected by the first electronic device and the M electronic devices from the same utterance of the pre-wake-up word by the user.

The first audio energy may be the largest among the first audio energy and the M audio energies. It can be understood that the larger the audio energy an electronic device determines from the detected pre-wake-up voice, the closer that device may be to the user. In the negotiation that determines the responding device, the electronic device closest to the user may therefore be selected to respond to the wake-up word and/or voice command spoken by the user.

In some embodiments, the determination is not limited to the audio energy obtained from the pre-wake-up voice. The first electronic device may also combine the first audio energy, the M audio energies, and device information of each electronic device (such as device type, frequency of use, and device capability) to determine the responding device. For example, when several of the first audio energy and the M audio energies are tied for the largest value, the first electronic device may compare the device capabilities of the electronic devices corresponding to those largest audio energies. The device capability may be, for example, the audio quality of the microphone. The first electronic device may select, from the devices tied for the largest audio energy, the one with the best audio quality as the responding device.
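The selection rule described above can be sketched as follows. This is a minimal illustration rather than the application's implementation: the `Report` type, the numeric `mic_quality` score, and the final tie-break on `device_id` are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    device_id: str        # hypothetical device identifier
    energy: float         # audio energy obtained from the pre-wake-up voice
    mic_quality: int = 0  # hypothetical capability score; higher is better


def choose_responder(reports):
    """Pick the device with the largest energy.

    Ties on energy are broken by mic_quality, then by device_id so that
    the result is deterministic (the last step is an assumption).
    """
    if not reports:
        raise ValueError("at least one report is required")
    best = max(reports, key=lambda r: (r.energy, r.mic_quality, r.device_id))
    return best.device_id


reports = [
    Report("phone", 0.62, mic_quality=2),
    Report("speaker", 0.80, mic_quality=3),
    Report("tv", 0.80, mic_quality=1),
]
print(choose_responder(reports))
```

Here "speaker" and "tv" report the same energy, so the capability score decides between them.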

In some embodiments, the wake-up word may contain multiple syllables, and the syllables of the pre-wake-up word may be taken from the wake-up word. For example, the pre-wake-up word may consist of the first few syllables of the wake-up word.
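As a rough illustration of a pre-wake-up word formed from the first syllables of the wake-up word, consider the sketch below. The syllable sequence, the function names, and the requirement that the pre-wake-up word be strictly shorter than the wake-up word are assumptions made for this example.

```python
def make_pre_wake_word(wake_syllables, n):
    """Return the first n syllables of the wake-up word as the pre-wake-up word."""
    if not 0 < n < len(wake_syllables):
        raise ValueError("n must be positive and smaller than the wake-up word length")
    return tuple(wake_syllables[:n])


def starts_with(heard, word):
    """True if the heard syllable sequence begins with the given word."""
    return tuple(heard[:len(word)]) == tuple(word)


# Hypothetical wake-up word split into syllables.
WAKE_WORD = ("hel", "lo", "as", "sis", "tant")
PRE_WAKE_WORD = make_pre_wake_word(WAKE_WORD, 2)

heard = ("hel", "lo", "there")
print(starts_with(heard, PRE_WAKE_WORD), starts_with(heard, WAKE_WORD))
```

An utterance can match the pre-wake-up word without matching the full wake-up word, which is why the later wake-up check is still needed.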

It can be seen that the method uses detection of the pre-wake-up voice, that is, a voice containing the pre-wake-up word, to decide when the negotiation starts. The first electronic device and the other voice wake-up devices can begin negotiating to determine the responding device before the user has finished speaking the wake-up word. Then, when the wake-up voice is detected, that is, after the user has finished speaking the wake-up word, the responding device can enter the wake-up state. The method improves the response speed of a voice wake-up device after the wake-up voice is detected. In addition, because the responding device responds only after it has confirmed that the wake-up voice was detected, the wake-up rate is not affected. This can effectively improve the voice wake-up experience in scenarios with multiple voice wake-up devices that share a wake-up word.
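The timing benefit can be illustrated with simple arithmetic. The sketch below is not taken from the application: the function name and all durations are hypothetical, and it ignores the time needed to detect the wake-up word itself.

```python
def wait_after_wake_word(negotiation_s: float, head_start_s: float = 0.0) -> float:
    """Seconds the user waits after finishing the wake-up word.

    negotiation_s: assumed duration of the negotiation between devices.
    head_start_s:  assumed time between the end of the pre-wake-up word and
                   the end of the wake-up word (0 if negotiation starts only
                   after the full wake-up word has been spoken).
    """
    if negotiation_s < 0 or head_start_s < 0:
        raise ValueError("durations must be non-negative")
    # Negotiation that overlaps with the rest of the wake-up word costs nothing.
    return max(0.0, negotiation_s - head_start_s)


# Illustrative numbers only.
baseline = wait_after_wake_word(0.30)                  # negotiate after the wake-up word
early = wait_after_wake_word(0.30, head_start_s=0.40)  # negotiate from the pre-wake-up word
print(baseline, early)
```

Under these assumed numbers the negotiation finishes before the user stops speaking, so it adds no waiting time.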

With reference to the first aspect, in some embodiments, after the first electronic device detects the first pre-wake-up voice containing the pre-wake-up word, if it detects that the collected sound does not contain the wake-up word, the first application in the first electronic device may not enter the wake-up state. In other words, the first electronic device may start negotiating with the M electronic devices to determine the responding device after detecting the pre-wake-up voice. If no wake-up voice containing the wake-up word is detected, the first application in the first electronic device may not respond to the user even if the first electronic device has been determined to be the responding device.

This can reduce the probability of false wake-ups and improve the user's experience of controlling devices by voice.

The first application in the first electronic device entering the wake-up state may mean that the first electronic device invokes the first application. When the first application is in the wake-up state, the programs running on the first electronic device include a process of the first application. In the wake-up state, the first electronic device can detect and recognize voice commands through the first application and perform the operations corresponding to those commands.

The first application in the first electronic device not entering the wake-up state may mean that the programs running on the first electronic device do not include a process of the first application. Alternatively, it may mean that the programs running on the first electronic device do include a process of the first application, but the first application does not respond to the user. In other words, when the first application has not entered the wake-up state, the first electronic device does not respond to the wake-up word and/or voice command spoken by the user. It can be understood that, for other electronic devices in this application that contain the first application, the case in which the first application enters the wake-up state and the case in which it does not enter the wake-up state are the same as described above for the first electronic device.

With reference to the first aspect, in some embodiments, the first electronic device may detect a second pre-wake-up voice containing the pre-wake-up word and obtain a second audio energy from the second pre-wake-up voice. The first electronic device may receive K audio energies sent by K electronic devices, where each of the K audio energies is obtained by one of the K electronic devices from a detected pre-wake-up voice containing the pre-wake-up word, and K is a positive integer. The first electronic device may determine, according to the second audio energy and the K audio energies, that a second electronic device among the K electronic devices is the responding device. When the second electronic device is determined to be the responding device, the first application in the first electronic device may not enter the wake-up state.

The first electronic device and the K electronic devices have the same wake-up word. The second pre-wake-up voice used to determine the second audio energy and the pre-wake-up voices used to determine the K audio energies may be detected by the first electronic device and the K electronic devices from the same utterance of the pre-wake-up word by the user.

It can be seen that the method uses detection of the pre-wake-up voice, that is, a voice containing the pre-wake-up word, to decide when the negotiation starts. The first electronic device and the other voice wake-up devices can begin negotiating to determine the responding device before the user has finished speaking the wake-up word. When the responding device is determined not to be the first electronic device, the first application in the first electronic device may not enter the wake-up state. This ensures that only the first application of the responding device enters the wake-up state after the wake-up voice is detected. The method can reduce the probability of false wake-ups and prevents multiple electronic devices from responding to the user after detecting the wake-up voice, which would disturb the user.
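The two conditions described so far (the device must be the elected responder, and the full wake-up word must actually be detected) can be combined in a small model. The class, its method names, and the reset of the election after each utterance are assumptions for illustration only.

```python
class VoiceDevice:
    """Toy model of one device's wake-up decision (all names are hypothetical)."""

    def __init__(self, device_id: str):
        self.device_id = device_id
        self.is_responder = False
        self.awake = False

    def on_pre_wake(self, own_energy: float, peer_energies: dict) -> None:
        """Negotiate as soon as the pre-wake-up word is heard."""
        energies = dict(peer_energies)
        energies[self.device_id] = own_energy
        # Largest energy wins; the device id breaks ties deterministically.
        winner = max(energies, key=lambda d: (energies[d], d))
        self.is_responder = winner == self.device_id

    def on_wake_check(self, wake_word_detected: bool) -> None:
        """Enter the wake-up state only if elected AND the wake-up word was heard."""
        self.awake = self.is_responder and wake_word_detected
        self.is_responder = False  # assumed: an election covers one utterance only


device = VoiceDevice("speaker")
device.on_pre_wake(0.9, {"phone": 0.5})
device.on_wake_check(wake_word_detected=True)
print(device.awake)
```

A device that wins the negotiation but never hears the full wake-up word stays asleep, as does a device that hears the wake-up word but lost the negotiation.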

With reference to the first aspect, in some embodiments, when the second electronic device is determined to be the responding device, the first electronic device may send a first message to the second electronic device. The first message contains a first result, which indicates that the second electronic device is the responding device. The first message instructs the second electronic device to have the first application in the second electronic device enter the wake-up state after the second electronic device detects a wake-up voice containing the wake-up word.
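The application does not specify a format for the first message. The sketch below assumes a JSON encoding with hypothetical field names, purely to show how a receiver might combine the first result with its own wake-up word detection.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FirstMessage:
    sender_id: str     # hypothetical: device that ran the determination
    responder_id: str  # the "first result": which device is to respond


def encode(msg: FirstMessage) -> bytes:
    """Serialize the message (JSON is an assumption, not from the source)."""
    return json.dumps(asdict(msg), sort_keys=True).encode("utf-8")


def decode(data: bytes) -> FirstMessage:
    return FirstMessage(**json.loads(data.decode("utf-8")))


def should_wake(msg: FirstMessage, my_id: str, wake_word_detected: bool) -> bool:
    """The receiver wakes only if it is named as responder and heard the wake-up word."""
    return msg.responder_id == my_id and wake_word_detected


wire = encode(FirstMessage(sender_id="phone", responder_id="speaker"))
print(should_wake(decode(wire), "speaker", wake_word_detected=True))
```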

With reference to the first aspect, in some embodiments, after obtaining the first audio energy from the first pre-wake-up voice, the first electronic device may also send the first audio energy to the M electronic devices. The M electronic devices may likewise notify one another of the audio energies they each determine from the detected pre-wake-up voice.

It can be seen that, in a scenario with multiple electronic devices that share a wake-up word, the negotiation after the pre-wake-up voice is detected may proceed in either of two ways. In the first, one electronic device (such as the first electronic device) determines the responding device: the other electronic devices send the audio energies they determine from the detected pre-wake-up voice to the first electronic device, and the first electronic device sends the determination result to the other electronic devices. In the second, the electronic devices notify one another of the audio energies they determine from the detected pre-wake-up voice, and every device determines the responding device itself. A device that determines it is the responding device may enter the wake-up state after detecting the wake-up voice. A device that determines it is not the responding device may not enter the wake-up state after detecting the wake-up voice.
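In the variant where every device determines the responding device itself, the devices reach the same answer as long as they all apply the same deterministic rule to the same set of energies. The following sketch demonstrates this under assumed names; the tie-break on device id is an assumption, and message loss is not modelled.

```python
def elect(energies: dict) -> str:
    """Deterministic rule: largest energy wins, device id breaks ties."""
    return max(energies, key=lambda d: (energies[d], d))


def peer_to_peer_round(measured: dict) -> dict:
    """Simulate each device receiving its peers' energies and deciding locally.

    measured maps device id -> audio energy from the pre-wake-up voice.
    Returns a map of device id -> the responder that device elected.
    """
    decisions = {}
    for me in measured:
        received = {d: e for d, e in measured.items() if d != me}
        view = {me: measured[me], **received}  # own value plus peers' values
        decisions[me] = elect(view)
    return decisions


measured = {"phone": 0.41, "speaker": 0.77, "tv": 0.59}
print(peer_to_peer_round(measured))
```

Because each device ends up with the same view, no extra message announcing the result is needed in this variant.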

In some embodiments, the negotiation after the pre-wake-up voice is detected may proceed as follows: the electronic devices each send the audio energy they determine from the detected pre-wake-up voice to a master device. The master device need not be one of the electronic devices that share the wake-up word; it may, for example, be a cloud server. The master device may determine the responding device and send the determination result to all of the electronic devices, or it may send the result only to the responding device.
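The master-device variant can be sketched as a single function that collects the reported energies and returns the notifications to send. The function name, the `notify_all` flag, and the dictionary-based message representation are assumptions for illustration.

```python
def arbitrate(reports: dict, notify_all: bool = True) -> dict:
    """Decide the responder on a master device.

    reports maps device id -> audio energy from the pre-wake-up voice.
    Returns {recipient id: responder id}, i.e. who is told what.
    """
    if not reports:
        raise ValueError("no reports received")
    # Largest energy wins; the device id breaks ties (an assumption).
    responder = max(reports, key=lambda d: (reports[d], d))
    recipients = list(reports) if notify_all else [responder]
    return {device: responder for device in recipients}


reports = {"phone": 0.2, "speaker": 0.6}
print(arbitrate(reports))                    # result sent to every device
print(arbitrate(reports, notify_all=False))  # result sent to the responder only
```

Notifying only the responder saves messages, but it presumes that devices which receive no result treat themselves as non-responders.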

With reference to the first aspect, in some embodiments, the first electronic device may collect a first sound that does not contain the pre-wake-up word and obtain a third audio energy from the first sound. The first electronic device obtains a fourth audio energy from the first pre-wake-up voice, and may subtract the third audio energy from the fourth audio energy to obtain the first audio energy. The first sound may be collected by the first electronic device within a preset time range of detecting the first pre-wake-up voice. It can be understood that the first pre-wake-up voice usually contains environmental noise, so the fourth audio energy includes energy produced by that noise. The first sound, collected within the preset time range of detecting the pre-wake-up voice, is close to the environmental noise contained in the first pre-wake-up voice. The first electronic device can therefore use the third audio energy to reduce the energy produced by environmental noise in the first audio energy.

The M electronic devices may also reduce the energy produced by environmental noise in the audio energies they each obtain from the detected pre-wake-up voice. For the specific method, refer to the way the first electronic device reduces the energy produced by environmental noise in the first audio energy.
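The subtraction of the third audio energy from the fourth can be sketched as follows. The application does not state how audio energy is computed, so the mean-square definition, the function names, and the clamp at zero are assumptions made for this example.

```python
def mean_square_energy(samples):
    """Assumed energy measure: mean of the squared sample values."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(s * s for s in samples) / len(samples)


def denoised_energy(pre_wake_samples, noise_samples):
    """First audio energy = fourth audio energy - third audio energy.

    pre_wake_samples: the detected pre-wake-up voice (voice plus noise).
    noise_samples:    sound collected nearby in time without the pre-wake-up word.
    """
    fourth = mean_square_energy(pre_wake_samples)
    third = mean_square_energy(noise_samples)
    # Clamping at zero is an assumption, to avoid negative energies.
    return max(0.0, fourth - third)


voice_plus_noise = [0.5, -0.5, 0.5, -0.5]
noise_only = [0.1, -0.1, 0.1, -0.1]
print(denoised_energy(voice_plus_noise, noise_only))
```

Comparing energies after this correction keeps a device in a noisy spot from winning the negotiation merely because its background level is high.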

It can be seen that, in the negotiation that determines the responding device, the first electronic device and the M electronic devices can remove the energy produced by environmental noise from the audio energy determined from the pre-wake-up voice. This reduces the influence of environmental noise on the determination and improves the accuracy of the result. Using audio energies with reduced noise influence, the first electronic device and the M electronic devices can begin negotiating to determine the responding device before the user has finished speaking the wake-up word. Then, when the wake-up voice is detected, that is, after the user has finished speaking the wake-up word, the responding device can enter the wake-up state. The method improves the response speed of the responding device after the wake-up voice is detected. In addition, because the responding device responds only after it has confirmed that the wake-up voice was detected, the wake-up rate is not affected. This can effectively improve the experience of using the voice wake-up function in scenarios with multiple electronic devices that share a wake-up word.

In a second aspect, the present application provides a device wake-up method. The method can be applied to a voice wake-up system that includes H electronic devices, where the H electronic devices include a first electronic device and H is a positive integer greater than 1. The first electronic device detects a first pre-wake-up voice containing the pre-wake-up word and obtains a first audio energy from the first pre-wake-up voice. H1 electronic devices among the H electronic devices send H1 audio energies to the first electronic device. The H1 electronic devices do not include the first electronic device, each of the H1 audio energies is obtained by one of the H1 electronic devices from a detected pre-wake-up voice containing the pre-wake-up word, and H1 is a positive integer smaller than H. The first electronic device determines, according to the first audio energy and the H1 audio energies, that the first electronic device is the responding device. When a first wake-up voice containing the wake-up word is detected, a first application in the first electronic device enters the wake-up state. The pre-wake-up word is a part of the wake-up word, and in the wake-up state the first application is used to detect and respond to voice commands so as to perform the operations corresponding to those commands.

It can be seen that the method uses detection of the pre-wake-up voice, that is, a voice containing the pre-wake-up word, to decide when the negotiation starts. The first electronic device and the other voice wake-up devices can begin negotiating to determine the responding device before the user has finished speaking the wake-up word. Then, when the wake-up voice is detected, that is, after the user has finished speaking the wake-up word, the responding device can enter the wake-up state. The method improves the response speed of a voice wake-up device after the wake-up voice is detected. In addition, because the responding device responds only after it has confirmed that the wake-up voice was detected, the wake-up rate is not affected. This can effectively improve the voice wake-up experience in scenarios with multiple voice wake-up devices that share a wake-up word.

With reference to the second aspect, in some embodiments, the first application in each of the H1 electronic devices does not enter the wake-up state.

It can be seen that, when the first electronic device is determined to be the responding device, the first application in every other electronic device in the voice wake-up system may not enter the wake-up state. This can reduce the probability of false wake-ups and prevents multiple electronic devices from responding to the user after detecting the wake-up voice, which would disturb the user.

With reference to the second aspect, in some embodiments, after the first electronic device detects the first pre-wake-up voice containing the pre-wake-up word, if it detects that the collected sound does not contain the wake-up word, the first application in the first electronic device does not enter the wake-up state.

It can be seen that, if no wake-up voice containing the wake-up word is detected, the first application in the first electronic device may not respond to the user even if the first electronic device has been determined to be the responding device. This can reduce the probability of false wake-ups and improve the user's experience of controlling devices by voice.

With reference to the second aspect, in some embodiments, the first electronic device detects a second pre-wake-up voice containing the pre-wake-up word and obtains a second audio energy from the second pre-wake-up voice. H2 electronic devices among the H electronic devices send H2 audio energies to the first electronic device. The H2 electronic devices do not include the first electronic device, each of the H2 audio energies is obtained by one of the H2 electronic devices from a detected pre-wake-up voice containing the pre-wake-up word, and H2 is a positive integer smaller than H. The first electronic device determines, according to the second audio energy and the H2 audio energies, that a second electronic device among the H2 electronic devices is the responding device. When a second wake-up voice containing the wake-up word is detected, the first application in the second electronic device enters the wake-up state, while the first application in the first electronic device and the first application in each of (H2-1) electronic devices do not enter the wake-up state. The (H2-1) electronic devices are the H2 electronic devices other than the second electronic device.

With reference to the second aspect, in some embodiments, after determining that the second electronic device is the answering device, the first electronic device may send a first message to the second electronic device. The first message contains a first result, and the first result indicates that the second electronic device is the answering device. Based on the first message, when the second wake-up voice is detected, the first application in the second electronic device enters the wake-up state.
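The gating described above (the answering device is chosen early, but its first application wakes only once the full wake-up word has been detected) can be sketched as a small state holder. This is a minimal illustration under stated assumptions, not the implementation of the present application: the class `WakeGate`, its method names, and the device identifiers are hypothetical.

```python
class WakeGate:
    """Hypothetical helper: remembers whether this device was chosen to answer."""

    def __init__(self):
        self.is_answering_device = False  # set from the first result
        self.awake = False                # wake-up state of the first application

    def on_first_message(self, chosen_device_id, own_device_id):
        # The first result names the answering device; record whether it is us.
        self.is_answering_device = (chosen_device_id == own_device_id)

    def on_wake_voice_detected(self):
        # Enter the wake-up state only if this device was chosen AND the
        # complete wake-up word has now been detected.
        if self.is_answering_device:
            self.awake = True
        return self.awake
```

A device that was chosen but never hears the full wake-up word never calls `on_wake_voice_detected`, so it stays silent, which matches the false wake-up protection described earlier.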

With reference to the second aspect, in some embodiments, the first electronic device sends the second audio energy to the second electronic device. The (H2-1) electronic devices send (H2-1) audio energies to the second electronic device, the (H2-1) audio energies being those of the H2 audio energies that were obtained by the (H2-1) electronic devices. The second electronic device determines that it is itself the answering device according to the second audio energy, the (H2-1) audio energies, and a fifth audio energy that the second electronic device obtained from a detected third pre-wake-up voice containing the pre-wake-up word; the fifth audio energy is one of the H2 audio energies.

With reference to the second aspect, in some embodiments, the first electronic device collects a first sound, and the first sound does not contain the pre-wake-up word. The first electronic device obtains a third audio energy from the first sound. The first electronic device obtains a fourth audio energy from the first pre-wake-up voice. The first electronic device subtracts the third audio energy from the fourth audio energy to obtain the first audio energy.
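The subtraction above can be sketched as follows. This is an illustration only: the application does not fix how audio energy is computed, so the mean-square measure, the function names, and the clamping of negative results to zero are all assumptions made for the example.

```python
def mean_square_energy(samples):
    """Mean of squared sample values; one possible energy measure (assumption)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def noise_compensated_energy(pre_wake_samples, ambient_samples):
    """First audio energy = fourth audio energy - third audio energy."""
    fourth = mean_square_energy(pre_wake_samples)  # pre-wake-up voice plus noise
    third = mean_square_energy(ambient_samples)    # sound without the pre-wake-up word
    # Clamping at zero is an assumption; the source only states the subtraction.
    return max(fourth - third, 0.0)
```

For instance, `noise_compensated_energy([3, 3, 3, 3], [1, 1])` subtracts an ambient energy of 1.0 from a measured energy of 9.0.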

It can be seen that, in the above process of negotiating to determine the answering device, the electronic devices in the voice wake-up system can remove, from the audio energy determined from the pre-wake-up voice, the portion contributed by environmental noise. This reduces the influence of environmental noise on the determination of the answering device and improves the accuracy of the result. Using the audio energy with the influence of environmental noise reduced, the electronic devices in the voice wake-up system can start negotiating to determine the answering device before the user has finished speaking the wake-up word. In this way, when the wake-up voice is detected, that is, after the user has finished speaking the wake-up word, the answering device can enter the wake-up state. The above method improves the response speed of the answering device after the wake-up voice is detected. Moreover, because the answering device responds only once it has determined that the wake-up voice was detected, the wake-up rate is not affected. In a scenario with multiple electronic devices that share the same wake-up word, this can effectively improve the user's experience of the voice wake-up function.

In a third aspect, the present application provides an electronic device. The electronic device may include a microphone, a communication apparatus, a memory, and a processor, where the microphone may be used to collect sound, the memory may be used to store a computer program, and the processor may be used to invoke the computer program so that the electronic device performs any possible implementation method of the first aspect.

In a fourth aspect, the present application provides a computer-readable storage medium including instructions. When the instructions are run on an electronic device, the electronic device performs any possible implementation method of the first aspect.

In a fifth aspect, the present application provides a computer program product. The computer program product may include computer instructions. When the computer instructions are run on an electronic device, the electronic device performs any possible implementation method of the first aspect.

In a sixth aspect, the present application provides a chip. The chip is applied to an electronic device and includes one or more processors, and the processors are used to invoke computer instructions so that the electronic device performs any possible implementation method of the first aspect.

It can be understood that the electronic device provided in the third aspect, the computer-readable storage medium provided in the fourth aspect, the computer program product provided in the fifth aspect, and the chip provided in the sixth aspect are all used to perform the method provided in the embodiments of the present application. Therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding method; details are not repeated here.

Description of drawings

FIG. 1 is a schematic diagram of the distribution of voice wake-up devices provided in an embodiment of the present application;

FIG. 2 is a schematic diagram of the time distribution from the user speaking the wake-up word to the voice wake-up devices determining the answering device, provided in an embodiment of the present application;

FIG. 3A is a schematic structural diagram of an electronic device 100 provided in an embodiment of the present application;

FIG. 3B is a block diagram of the software structure of the electronic device 100 provided in an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a voice wake-up device 10 provided in an embodiment of the present application;

FIG. 5 is a flowchart of a device wake-up method provided in an embodiment of the present application;

FIG. 6 is another schematic diagram of the time distribution from the user speaking the wake-up word to the voice wake-up devices determining the answering device, provided in an embodiment of the present application;

FIG. 7 is a flowchart of another device wake-up method provided in an embodiment of the present application;

FIG. 8 is a schematic structural diagram of a master device 200 provided in an embodiment of the present application;

FIG. 9 is a flowchart of another device wake-up method provided in an embodiment of the present application;

FIG. 10 is a schematic diagram of a voice wake-up system provided in an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application are described below with reference to the drawings in the embodiments of the present application. In the description of the embodiments of the present application, the terms used in the following embodiments are only for the purpose of describing specific embodiments and are not intended to limit the present application. As used in the specification and the appended claims of the present application, the singular forms "a", "an", "said", "the above", "the", and "this" are intended to also include expressions such as "one or more", unless the context clearly indicates otherwise. It should also be understood that, in the following embodiments of the present application, "at least one" and "one or more" mean one, two, or more than two. The term "and/or" describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean that A exists alone, that A and B both exist, or that B exists alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it.

Reference in this specification to "one embodiment", "some embodiments", or the like means that a particular feature, structure, or characteristic described in connection with that embodiment is included in one or more embodiments of the present application. Thus, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", and the like appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless specifically emphasized otherwise. The terms "include", "comprise", "have", and their variants all mean "including but not limited to", unless specifically emphasized otherwise. The term "connection" includes both direct and indirect connection, unless otherwise stated. "First" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features.

In the embodiments of the present application, words such as "exemplarily" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present application should not be construed as more preferred or more advantageous than other embodiments or designs. Rather, the use of words such as "exemplarily" or "for example" is intended to present the related concepts in a concrete manner.

At present, many electronic devices, such as mobile phones, tablet computers, speakers, and televisions, have a voice wake-up capability. Such an electronic device may be referred to as a voice wake-up device. An application (APP) for speech recognition, for example a voice assistant APP, may be installed in the voice wake-up device. When the voice wake-up function is turned on, the voice wake-up device can collect sound from the environment in real time and detect whether the sound contains the wake-up word (that is, detect whether a wake-up voice is present). When the wake-up word is detected, the voice wake-up device can be woken up and enter the wake-up state. In one possible implementation, waking up the voice wake-up device more specifically means waking up the voice assistant APP in the voice wake-up device. That is, the wake-up state may denote the state in which the voice assistant APP in the voice wake-up device has been woken up.

In some embodiments, after the voice assistant APP is woken up, it can respond to the wake-up word spoken by the user. The response may be a voice response. For example, after the user speaks the wake-up word, the voice wake-up device may answer by voice, "I'm here." In some embodiments, after the voice assistant APP is woken up, it can recognize a voice command in the sound collected by the voice wake-up device and perform the operation corresponding to the voice command.

It can be seen that the wake-up word can be used to wake up the voice assistant APP in the voice wake-up device. To ensure the wake-up rate of the voice wake-up device, the wake-up word usually has multiple syllables. For example, the wake-up word "Xiaoyi Xiaoyi" (小艺小艺) has four syllables, and the wake-up word "Hey Siri" has three syllables. Compared with a single-syllable wake-up word, a multi-syllable wake-up word can effectively improve the wake-up rate of the voice wake-up device. For example, if the wake-up word were "hey", it would have only one syllable. Users may often say "hey" in daily life without intending to wake up a voice wake-up device, which could cause the voice wake-up device to be woken up by mistake. On the other hand, so that the user can say it easily, the wake-up word should not be too complicated or contain too many syllables. Exemplarily, the wake-up word may contain 4 to 6 syllables. This both ensures the wake-up rate of the voice wake-up device and makes the wake-up word convenient to say. The embodiments of the present application do not limit the number of syllables in the wake-up word. The subsequent embodiments use the multi-syllable wake-up word "Xiaoyi Xiaoyi" as an example.

The wake-up word is not limited to waking up the voice assistant APP; it can also be used to wake up other modules or other APPs in the voice wake-up device. The subsequent embodiments of the present application use waking up the voice assistant APP as an example.

With the development of electronic devices, a home (or another environment, such as an office) may be equipped with multiple voice wake-up devices.

Refer to FIG. 1, which exemplarily shows the distribution of voice wake-up devices in a user's home.

As shown in FIG. 1, a voice wake-up device 10, a voice wake-up device 11, and a voice wake-up device 12 are arranged in the living room. A voice wake-up device 13 and a voice wake-up device 14 are arranged in the master bedroom. A voice wake-up device 15 is arranged in the second bedroom. The wake-up words used to wake up the voice wake-up devices 10 to 15 may be the same. For example, the wake-up word is "Xiaoyi Xiaoyi".

In one possible implementation, multiple voice wake-up devices in a home that have the same wake-up word can form a voice wake-up system. A voice wake-up device in the voice wake-up system can detect the other voice wake-up devices in the system and communicate with them. The voice wake-up devices in the system may communicate with one another by, for example, Bluetooth communication or wireless fidelity (Wi-Fi) communication. The embodiments of the present application do not limit this.

For example, the home shown in FIG. 1 has a voice wake-up system A. The voice wake-up system A includes the voice wake-up devices 10 to 15. The voice wake-up devices 10 to 15 can communicate with one another and perceive one another's state (such as whether the voice wake-up function is turned on, the working state, and so on). The voice wake-up system A is not limited to the voice wake-up devices 10 to 15 and may include more or fewer voice wake-up devices.

The embodiments of the present application do not limit how the voice wake-up system A is formed. In one possible implementation, the voice wake-up system A may consist of voice wake-up devices that are located in the same local area network and have the same wake-up word. Specifically, the voice wake-up device 10 accesses the network through a router. Electronic devices connected to the router can be in the same local area network. Among the electronic devices in its own local area network, the voice wake-up device 10 can detect which are voice wake-up devices and whether their wake-up word is the same as its own. When it detects that the voice wake-up devices 11 to 15 are present in the local area network and that their wake-up word is the same as its own, the voice wake-up device 10 can establish communication connections with the voice wake-up devices 11 to 15. Likewise, any one of the voice wake-up devices 11 to 15 can discover the other voice wake-up devices in the voice wake-up system A and establish communication connections with them. In this way, the voice wake-up devices 10 to 15 can form one voice wake-up system, namely the voice wake-up system A. Optionally, the voice wake-up system A may instead consist of voice wake-up devices that are logged in to the same account and have the same wake-up word.
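The membership rule in the LAN-based implementation above (same local area network, is a voice wake-up device, same wake-up word) amounts to a filter over discovered peers. The sketch below assumes discovery has already produced a list of peers; the `Peer` record, its fields, and the function name are hypothetical, and the actual discovery protocol is not specified by the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    """Hypothetical record for a device found in the same local area network."""
    device_id: str
    is_voice_wake_device: bool
    wake_word: str


def select_system_members(own_wake_word, lan_peers):
    """Return ids of peers that qualify for the same voice wake-up system."""
    return [
        p.device_id
        for p in lan_peers
        if p.is_voice_wake_device and p.wake_word == own_wake_word
    ]
```

A device would then establish a communication connection with each returned peer.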

In some embodiments, each voice wake-up device in the voice wake-up system A can check, at intervals of a preset time period, which voice wake-up devices are present in the voice wake-up system A. For example, the voice wake-up devices 10 to 15 are all in the same local area network. In response to a user operation on the voice wake-up device 15 that triggers it to leave the local area network, the voice wake-up device 15 leaves the local area network. The voice wake-up device 15 is then removed from the voice wake-up system A, and the voice wake-up devices 10 to 14 can determine that the voice wake-up system A no longer includes the voice wake-up device 15.

In some embodiments, each voice wake-up device in the voice wake-up system A can check, at intervals of a preset time period, whether the voice wake-up function of the voice wake-up devices in the voice wake-up system A is turned on. For example, in response to a user operation on the voice wake-up device 15 that triggers it to turn off the voice wake-up function, the voice wake-up device 15 turns off the voice wake-up function. When it receives a message from another voice wake-up device asking whether the voice wake-up function is turned on, the voice wake-up device 15 can send a reply message indicating that the function is turned off. In this way, the voice wake-up devices 10 to 14 can determine that the voice wake-up function in the voice wake-up device 15 is turned off.

The embodiments of the present application do not limit the preset time period. Each voice wake-up device in the voice wake-up system A may check, at regular or irregular intervals, which voice wake-up devices are present in the voice wake-up system A, whether their voice wake-up function is turned on, and so on.

The following describes a method for waking up a voice wake-up device when multiple voice wake-up devices with the same wake-up word are present.

In one possible implementation, when multiple voice wake-up devices with the same wake-up word are present, they can negotiate after detecting the wake-up voice and select the voice wake-up device closest to the user as the answering device. The answering device can enter the wake-up state, and the voice assistant APP in the answering device can be woken up to respond to the user. The voice wake-up devices other than the answering device do not enter the wake-up state. In this way, a single voice wake-up device among the multiple devices executes the user's voice command, which avoids the confusing situation in which several voice wake-up devices all respond after the user speaks the wake-up word.

Specifically, when the user speaks the wake-up word, the voice wake-up devices 10 to 15 in the voice wake-up system A shown in FIG. 1 all detect the wake-up voice. Detecting the wake-up voice means that a voice wake-up device detects the wake-up word in the sound it has collected. When the voice wake-up device 10 detects the wake-up voice, it can determine, from the audio obtained from the sound it collected, the wake-up word audio energy corresponding to the audio that contains the wake-up word. The voice wake-up device 10 can send the wake-up word audio energy it determined to the other voice wake-up devices in the voice wake-up system A and receive the wake-up word audio energies determined by those devices. That is, after detecting the wake-up voice, the voice wake-up devices 10 to 15 can announce to one another the wake-up word audio energy each has determined. Optionally, when it has determined that the voice wake-up system A contains a voice wake-up device whose voice wake-up function is turned off, the voice wake-up device 10 need not send its wake-up word audio energy to that device.

The wake-up word audio energy may be a parameter such as the sound intensity or the sound pressure of the audio containing the wake-up word. The embodiments of the present application do not limit how the wake-up word audio energy is calculated. It can be understood that the closer a voice wake-up device is to the user, the less time the sound of the user speaking the wake-up word takes to reach it, and the sooner it can detect the wake-up voice. In addition, because sound gradually attenuates as it propagates, the closer a voice wake-up device is to the user, the greater the wake-up word audio energy it determines. The voice wake-up devices in the voice wake-up system A can compare the wake-up word audio energies determined by the individual devices to determine which voice wake-up device answers.

Because the answering device is determined from the wake-up word audio energies determined by the individual voice wake-up devices in the voice wake-up system A, the voice wake-up device 10 usually needs to wait for the wake-up word audio energies sent by the other voice wake-up devices in the system. This reduces the cases in which a voice wake-up device is left out of the comparison and improves the accuracy of the determination of the answering device.

In some embodiments, the voice wake-up device 10 can estimate, from the number of voice wake-up devices that are present in the voice wake-up system A and have the voice wake-up function turned on, which voice wake-up devices it should wait for to send wake-up word audio energy. Once the voice wake-up device 10 has received the wake-up word audio energies sent by all voice wake-up devices in the voice wake-up system A, other than itself, whose voice wake-up function is turned on (for example, the wake-up word audio energies sent by the voice wake-up devices 11 to 14), it can determine the answering voice wake-up device according to the wake-up word audio energies determined by itself and by the other devices. Exemplarily, the voice wake-up device 10 can compare these wake-up word audio energies and determine the voice wake-up device with the largest wake-up word audio energy to be the answering device.
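The comparison step above reduces to picking the device with the largest reported energy. A minimal sketch follows; the function name, the dictionary layout, and the tie-breaking rule (lowest device id wins, so that every device reaches the same decision) are assumptions, since the source does not say how ties are resolved.

```python
def select_answering_device(energies):
    """Pick the device id with the largest wake-up word audio energy.

    `energies` maps device id -> energy and includes the local device.
    Ties go to the lexicographically smallest id (an assumption).
    """
    if not energies:
        raise ValueError("no audio energies to compare")
    # Iterating ids in sorted order makes max() return the smallest id on ties.
    return max(sorted(energies), key=lambda device_id: energies[device_id])
```

If every device runs the same deterministic rule on the same set of energies, all devices agree on the answering device without a further round of messages.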

In some other embodiments, the voice wake-up device 10 waits, within a preset waiting period, for the wake-up word audio energies sent by the other voice wake-up devices in the voice wake-up system A. If the waiting time reaches the preset waiting period and the voice wake-up device 10 still has not received the wake-up word audio energies from all voice wake-up devices in the voice wake-up system A, other than itself, whose voice wake-up function is turned on, the voice wake-up device 10 can stop waiting and determine the answering voice wake-up device according to its own wake-up word audio energy and the wake-up word audio energies it has already received from other devices. The voice wake-up devices 11 to 15 can also determine the answering voice wake-up device according to the wake-up word audio energies; for the specific method, refer to the implementation described above for the voice wake-up device 10.
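Waiting for peer reports with an upper bound can be sketched as a collection loop with a deadline. The example assumes that received reports arrive on a `queue.Queue` of `(device_id, energy)` tuples; that transport, the function name, and the parameter names are hypothetical.

```python
import queue
import time


def collect_energies(inbox, expected_ids, wait_s):
    """Gather energies from expected peers until all arrive or wait_s elapses."""
    received = {}
    deadline = time.monotonic() + wait_s
    while set(received) != set(expected_ids):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # preset waiting period reached: stop waiting
        try:
            device_id, energy = inbox.get(timeout=remaining)
        except queue.Empty:
            break  # nothing more arrived before the deadline
        if device_id in expected_ids:
            received[device_id] = energy
    return received
```

The caller would then add its own energy to the returned mapping and run the comparison on whatever has arrived.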

Refer to FIG. 2, which exemplarily shows the time distribution from the user speaking the wake-up word to the voice wake-up devices determining the answering device.

As shown in FIG. 2, the user starts to say the wake-up word "Xiaoyi Xiaoyi" at time t11. It can be understood that the wake-up word has multiple syllables, so the user needs a certain amount of time to say it. The user finishes saying the wake-up word at time t12; that is, the period from t11 to t12 is the period in which the user speaks the wake-up word. After the user speaks the wake-up word, the voice wake-up device 10 in the voice wake-up system A needs a certain amount of time to collect the sound containing the wake-up word and detect the wake-up word in it. The voice wake-up device 10 therefore detects the wake-up voice at time t13, which is later than t12. Having detected the wake-up voice, the voice wake-up device 10 can negotiate with the other voice wake-up devices in the voice wake-up system A. The period from t13 to t14 is the period in which this negotiation determines the answering device. That is, the voice wake-up devices in the voice wake-up system A may not determine the answering device until time t14. The voice assistant APP in the answering device is then woken up and responds to the user.

As the above embodiments show, the voice wake-up devices in the voice wake-up system A start negotiating to determine the answering device only after detecting the wake-up voice. During the negotiation, a voice wake-up device needs to wait for the wake-up word audio energies sent by the other voice wake-up devices. Consequently, the more voice wake-up devices the voice wake-up system A contains, the more complicated the negotiation becomes and the longer a voice wake-up device may need to wait for the wake-up word audio energies determined by the other devices. As a result, the more voice wake-up devices there are, the more slowly a voice wake-up device responds after the user speaks the wake-up word.

In one possible implementation, the voice wake-up device may use a wake-up word recognition model with lower complexity and a smaller computational load. The voice wake-up device can then detect the wake-up word in the collected sound more quickly, and so begin negotiating with the other voice wake-up devices to determine the responding device sooner. This improves the response speed of the voice wake-up device after the wake-up voice is detected. However, such a lower-complexity, lower-computation model has lower recognition accuracy, so using it to detect the wake-up voice leads to a higher probability of wake-up errors. For example, the user speaks the wake-up word but the voice wake-up device fails to detect it with this model. Alternatively, the user does not speak the wake-up word, but the voice wake-up device detects one with this model and enters the wake-up state. Although this approach improves the response speed after the wake-up voice is detected, it lowers the wake-up rate, and the user experience remains poor.

This application provides a device wake-up method. Each voice wake-up device in a voice wake-up system can detect whether the collected sound contains a pre-wake-up word. The pre-wake-up word may be a part of the wake-up word. Detecting the pre-wake-up voice means that the voice wake-up device has detected the pre-wake-up word in the collected sound; that is, the pre-wake-up voice is voice that contains the pre-wake-up word. When the pre-wake-up voice is detected, the voice wake-up device can use the audio obtained from the sound it collected to determine the pre-wake-up word audio energy, which is the audio energy corresponding to the audio containing the pre-wake-up word. The voice wake-up devices in the voice wake-up system can announce their pre-wake-up word audio energy to one another, and a responding device is determined in the voice wake-up system according to the pre-wake-up word audio energy. After the responding device detects the wake-up voice, it can enter the wake-up state and respond to the user.

As can be seen, the above method uses detection of the pre-wake-up voice to determine when the negotiation starts. The voice wake-up devices can begin negotiating to determine the responding device before the user has finished speaking the wake-up word. Then, when the wake-up voice is detected, that is, after the user has finished speaking the wake-up word, the responding device can enter the wake-up state. The method improves the response speed after the wake-up voice is detected, and because the responding device responds only once it has confirmed that the wake-up voice was detected, the wake-up rate is not affected. In scenarios with multiple voice wake-up devices that share the same wake-up word, this can effectively improve the user's voice wake-up experience.
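The timing benefit can be made concrete with a small worked sketch. The two functions below compare the delay between the end of the wake-up word (t12) and the moment a device can respond, with and without pre-wake-up negotiation. The function names, the shared detection latency for pre-wake-up word and wake-up word, and all millisecond values are illustrative assumptions, not figures from the application.

```python
def baseline_delay_ms(detect_latency_ms: int, negotiation_ms: int) -> int:
    """Delay after t12 when negotiation starts only at t13 (wake-up voice detected)."""
    # t13 = t12 + detect_latency, t14 = t13 + negotiation
    return detect_latency_ms + negotiation_ms


def pre_wake_delay_ms(t12_ms: int, pre_word_end_ms: int,
                      detect_latency_ms: int, negotiation_ms: int) -> int:
    """Delay after t12 when negotiation starts once the pre-wake-up voice is detected."""
    negotiation_done = pre_word_end_ms + detect_latency_ms + negotiation_ms
    wake_voice_detected = t12_ms + detect_latency_ms
    # The responder still waits for the full wake-up voice before responding.
    return max(negotiation_done, wake_voice_detected) - t12_ms


# Hypothetical timeline: pre-wake-up word ends at 600 ms, wake-up word at 1200 ms.
baseline = baseline_delay_ms(detect_latency_ms=100, negotiation_ms=300)
with_pre_wake = pre_wake_delay_ms(1200, 600, detect_latency_ms=100, negotiation_ms=300)
```

With these assumed numbers the negotiation is already complete before the wake-up voice is detected, so only the detection latency remains on the critical path.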

For ease of understanding, some concepts involved in the embodiments of this application are introduced below.

1. Pre-wake-up word

The pre-wake-up word may be a part of the wake-up word. As noted in the foregoing embodiments, a wake-up word usually has multiple syllables. The voice wake-up device can take a portion of the wake-up word as the pre-wake-up word, typically its first few syllables. For example, if the wake-up word is "Xiaoyi Xiaoyi", the voice wake-up device may use "Xiaoyi" or "Xiaoyi Xiao" as the pre-wake-up word. As another example, if the wake-up word is "hey Siri", the voice wake-up device may use "hey" as the pre-wake-up word.
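Taking the first few syllables as the pre-wake-up word can be sketched as a simple prefix operation. The helper `pre_wake_word` and the romanized syllable lists are hypothetical illustrations; the application does not prescribe how syllables are represented.

```python
def pre_wake_word(syllables: list[str], count: int) -> list[str]:
    """Return the first `count` syllables of a wake-up word as the pre-wake-up word.

    The prefix must be non-empty and strictly shorter than the wake-up word,
    otherwise it would not leave time for negotiation before the word ends.
    """
    if not 0 < count < len(syllables):
        raise ValueError("count must be between 1 and len(syllables) - 1")
    return syllables[:count]


wake_word = ["xiao", "yi", "xiao", "yi"]   # "Xiaoyi Xiaoyi"
short_prefix = pre_wake_word(wake_word, 2)  # "Xiaoyi"
long_prefix = pre_wake_word(wake_word, 3)   # "Xiaoyi Xiao"
```

A shorter prefix starts negotiation earlier, while a longer prefix gives the detector more audio to work with; which trade-off is preferable is left open here.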

It can be understood that, when the wake-up voice is detected, the voice wake-up device can determine that the user wants voice interaction and needs to wake up one of the multiple voice wake-up devices. When the pre-wake-up voice is detected, the voice wake-up device can infer that the user may need to wake up one of the multiple voice wake-up devices and may be about to finish speaking the complete wake-up word. At that point, the multiple voice wake-up devices can start negotiating to select a responding device. After the wake-up voice is detected, the responding device can directly enter the wake-up state and respond to the user. This effectively improves the response speed after the wake-up voice is detected.

2. Pre-wake-up state

When the pre-wake-up voice is detected, the voice wake-up device can enter the pre-wake-up state. In the pre-wake-up state, the voice wake-up device can use the audio obtained from the sound it collected to determine the pre-wake-up word audio energy corresponding to the audio containing the pre-wake-up word. The voice wake-up device can also send its own pre-wake-up word audio energy to the other voice wake-up devices in the voice wake-up system and receive the pre-wake-up word audio energy determined by them. In other words, in the pre-wake-up state the voice wake-up devices announce their pre-wake-up word audio energy to one another, and a responding device is determined in the voice wake-up system according to the pre-wake-up word audio energy.
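The exchange described above can be sketched as follows. This passage does not define how audio energy is computed or which comparison rule selects the responding device, so the sketch assumes mean squared amplitude as the energy measure and "highest energy wins, ties broken by device identifier" as the rule. The names `audio_energy`, `elect_responder`, and the device identifiers are hypothetical.

```python
def audio_energy(samples: list[float]) -> float:
    """Mean squared amplitude of the audio segment containing the pre-wake-up word.

    Assumption: this is one common energy measure, not necessarily the one
    used in the application.
    """
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def elect_responder(energies: dict[str, float]) -> str:
    """Pick the device with the highest reported energy.

    Every device runs the same deterministic rule on the same set of
    announcements, so all devices reach the same result without a coordinator.
    Ties are broken by the lexicographically smaller device identifier.
    """
    if not energies:
        raise ValueError("no energy reports received")
    return min(energies, key=lambda device: (-energies[device], device))


local_energy = audio_energy([1.0, -1.0, 2.0, -2.0])
responder = elect_responder({"speaker": local_energy, "phone": 0.4, "tv": 2.5})
```

The deterministic tie-break is a design choice made for this sketch so that two devices reporting equal energy do not both conclude they are the responding device.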

3. Wake-up state

In the wake-up state, the voice wake-up device can respond to the user. Specifically, the voice wake-up device being in the wake-up state may mean that the voice assistant APP in the voice wake-up device has been woken up. That is, the voice wake-up device responding to the user may be the voice assistant APP in the voice wake-up device responding to the user.

When the wake-up voice has been detected and no voice command has yet been detected, the voice assistant APP can call the audio output module (such as the speaker) of the voice wake-up device and answer by voice, "I'm here". That is, the voice assistant APP can respond to the wake-up word spoken by the user. The embodiments of this application do not limit how the voice assistant APP responds to the wake-up word spoken by the user.

The voice assistant APP can perform voice command recognition on the portion of the collected sound that follows the wake-up word, in order to recognize the voice command spoken by the user. A voice command is voice that instructs the voice wake-up device to complete a specified task. When a voice command is recognized, the voice assistant APP can perform the corresponding operation; that is, the voice assistant APP can respond to the voice command spoken by the user. For example, if the voice command is "turn on the air conditioner", the voice wake-up device can, on recognizing the command, send a power-on instruction to the air conditioner. As another example, if the voice command is "play music" and the voice wake-up device is capable of playing music, the voice wake-up device can play music on recognizing the command.

The voice wake-up device that responds to the user may be the responding device determined through negotiation in the voice wake-up system.

It should be noted that a voice wake-up device can invoke the voice assistant APP when it has detected the wake-up voice and determined that it is the responding device. For example, the voice wake-up device may run the voice assistant APP's application program. In other words, the voice assistant APP in that voice wake-up device can enter the wake-up state. While the voice assistant APP is in the wake-up state, the programs running on the voice wake-up device include the voice assistant APP's process, and the voice assistant APP can detect and recognize voice commands and perform the corresponding operations.

When a voice wake-up device detects the wake-up voice but determines that it is not the responding device, its voice assistant APP may stay out of the wake-up state. For example, the programs running on the voice wake-up device do not include the voice assistant APP's process. Alternatively, the programs running on the voice wake-up device include the voice assistant APP's process, but the voice assistant APP does not respond to the user. That is, a voice assistant APP that has not entered the wake-up state does not respond to the user. In addition, when a voice wake-up device has determined that it is the responding device (for example, based on the detected pre-wake-up voice) but has not detected the wake-up voice, its voice assistant APP may likewise stay out of the wake-up state.

It can be understood that, in a scenario with multiple voice wake-up devices that share the same wake-up word, when the user speaks the wake-up word, these devices detect the pre-wake-up voice and, based on it, determine one responding device from among themselves. Then, when the wake-up voice is detected, the voice assistant APP in the responding device can enter the wake-up state, detect and recognize voice commands, and perform the corresponding operations. The voice assistant APPs in all the other voice wake-up devices may stay out of the wake-up state. This reduces the confusion caused when the voice assistant APPs in several voice wake-up devices all respond after the user speaks the wake-up word, and improves the user's experience of controlling devices by voice.
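The rules for entering the wake-up state can be summarized as a small state machine. This is a minimal sketch: the state names, the `next_state` function, and the choice to return a non-responding device to the idle state are illustrative assumptions. The application states only that such a device's voice assistant APP may stay out of the wake-up state.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    PRE_WAKE = auto()   # pre-wake-up voice detected, negotiation in progress or done
    AWAKE = auto()      # voice assistant APP woken up and responding


def next_state(state: State, pre_wake_detected: bool,
               wake_detected: bool, is_responder: bool) -> State:
    """Advance one step; only a responder that also hears the full wake-up voice wakes."""
    if state is State.IDLE and pre_wake_detected:
        return State.PRE_WAKE
    if state is State.PRE_WAKE and wake_detected:
        # Assumption: a non-responder simply drops back to idle.
        return State.AWAKE if is_responder else State.IDLE
    return state


# A responder that hears the pre-wake-up word and then the full wake-up word wakes up.
s = next_state(State.IDLE, True, False, False)
s = next_state(s, False, True, True)
```

Being the responding device without hearing the full wake-up word, or hearing the wake-up word without being the responding device, never leads to `State.AWAKE` in this sketch.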

To improve the response speed of a voice wake-up device without affecting its wake-up rate, an embodiment of this application provides an electronic device 100. The electronic device 100 may be a voice wake-up device in the foregoing embodiments (such as voice wake-up devices 10 to 15).

The electronic device 100 may be an electronic device running any of various operating systems, for example a mobile phone, tablet computer, smart watch, smart band, speaker, or television. The embodiments of this application do not limit the specific type of the electronic device 100.

The structure of the electronic device 100 is described below.

As shown in FIG. 3A, the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, keys 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and so on.

It can be understood that the structure illustrated in this embodiment does not constitute a specific limitation on the electronic device 100. In other embodiments of this application, the electronic device 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.

The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated into one or more processors.

The controller may be the nerve center and command center of the electronic device 100. It can generate operation control signals from instruction opcodes and timing signals to control instruction fetching and execution.

In some embodiments, the processor 110 may include a voice wake-up module and a voice command recognition module. The two modules may be integrated in different processor chips and executed by different chips. For example, the voice wake-up module may be integrated in a low-power coprocessor or DSP chip, and the voice command recognition module may be integrated in the AP, the NPU, or another chip. In this arrangement, the chip hosting the voice command recognition module is started to trigger voice command recognition only after the voice wake-up module recognizes the preset wake-up word, which saves power. Alternatively, the voice wake-up module and the voice command recognition module may be integrated in the same processor chip, which then performs both functions. For example, both modules may be integrated in the AP chip, the NPU, or another chip.
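The power-saving arrangement above, in which command recognition is started only after the wake-up word is recognized, can be sketched as a two-stage gate. The class `TwoStageVoicePipeline` is hypothetical, and text strings stand in for audio so the example stays self-contained; a real device would run acoustic models on separate chips.

```python
from typing import Optional


class TwoStageVoicePipeline:
    """Stage 1 (always on, low power) gates stage 2 (command recognition)."""

    def __init__(self, wake_word: str) -> None:
        self.wake_word = wake_word
        self.command_stage_active = False  # stage 2 is off until the wake-up word is heard

    def feed(self, utterance: str) -> Optional[str]:
        """Process one utterance; return a recognized command or None."""
        if not self.command_stage_active:
            # Only the cheap wake-up check runs while stage 2 is powered down.
            if utterance == self.wake_word:
                self.command_stage_active = True
            return None
        # Stage 2 handles exactly one command, then powers down again
        # (an assumption made to keep the sketch small).
        self.command_stage_active = False
        return utterance


pipeline = TwoStageVoicePipeline("Xiaoyi Xiaoyi")
ignored = pipeline.feed("play music")        # stage 2 is off, nothing is recognized
pipeline.feed("Xiaoyi Xiaoyi")               # wake-up word starts stage 2
command = pipeline.feed("play music")        # now recognized as a command
```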

The processor 110 may also include a voice command execution module. After the voice command recognition module recognizes a voice command, the voice command execution module can perform the corresponding operation, for example playing music, making a call, or sending a text message.

It can be understood that an electronic device including the voice wake-up module, the voice command recognition module, and the voice command execution module is an electronic device with voice interaction capability. Having voice interaction capability means that the electronic device 100 can respond to a user's voice command and perform the operation corresponding to that command.

A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache. It can hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs those instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving system efficiency.

The USB interface 130 is an interface that conforms to the USB standard specification; it may be, for example, a Mini USB interface, a Micro USB interface, or a USB Type-C interface. The USB interface 130 can be used to connect a charger to charge the electronic device 100, to transfer data between the electronic device 100 and peripheral devices, or to connect earphones and play audio through them.

The charging management module 140 is configured to receive charging input from a charger, which may be a wireless charger or a wired charger. While charging the battery 142, the charging management module 140 can also supply power to the electronic device through the power management module 141.

The power management module 141 connects the battery 142, the charging management module 140, and the processor 110. It receives input from the battery 142 and/or the charging management module 140 and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, the wireless communication module 160, and so on.

The wireless communication function of the electronic device 100 can be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and so on.

The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the electronic device 100 can cover one or more communication frequency bands. Different antennas can also be multiplexed to improve antenna utilization. For example, the antenna 1 can be multiplexed as a diversity antenna for the wireless local area network. In other embodiments, an antenna may be used together with a tuning switch.

The mobile communication module 150 can provide wireless communication solutions, including 2G/3G/4G/5G, for the electronic device 100. The mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), and the like. It can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and pass them to the modem processor for demodulation. It can also amplify the signal modulated by the modem processor and radiate it as electromagnetic waves through the antenna 1.

The wireless communication module 160 can provide wireless communication solutions for the electronic device 100, including wireless local area networks (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. It receives electromagnetic waves through the antenna 2, frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110. It can also receive a signal to be sent from the processor 110, frequency-modulate and amplify it, and radiate it as electromagnetic waves through the antenna 2.

The electronic device 100 implements the display function through the GPU, the display screen 194, the application processor, and so on. The GPU is a microprocessor for image processing and is connected to the display screen 194 and the application processor. The GPU performs mathematical and geometric calculations for graphics rendering.

The display screen 194 is used to display images, videos, and the like. In some embodiments, the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.

The electronic device 100 can implement the shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor, and so on.

The ISP processes the data fed back by the camera 193. For example, when a photo is taken, the shutter opens, light passes through the lens onto the camera's photosensitive element, the optical signal is converted into an electrical signal, and the photosensitive element passes the electrical signal to the ISP, which converts it into an image visible to the naked eye.

The camera 193 is used to capture still images or video. In some embodiments, the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.

The digital signal processor processes digital signals; in addition to digital image signals, it can process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor performs a Fourier transform or similar operation on the frequency point energy.

The NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example the way signals are passed between neurons in the human brain, it processes input information quickly and can also learn continuously. The NPU enables intelligent cognition applications on the electronic device 100, such as image recognition, face recognition, speech recognition, and text understanding.

The external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example saving music, video, and other files on the external memory card.

The internal memory 121 can be used to store computer-executable program code, which includes instructions. The processor 110 runs the instructions stored in the internal memory 121 to execute the various functional applications and data processing of the electronic device 100. The internal memory 121 may include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function). The data storage area can store data created during use of the electronic device 100 (such as audio data and the phone book). In addition, the internal memory 121 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or universal flash storage (UFS).

The electronic device 100 can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, the application processor, and so on.

The audio module 170 converts digital audio information into an analog audio signal for output, and converts analog audio input into a digital audio signal. The audio module 170 can also encode and decode audio signals. In some embodiments, the audio module 170 may be provided in the processor 110, or some functional modules of the audio module 170 may be provided in the processor 110.

The speaker 170A, also called a "loudspeaker", converts audio electrical signals into sound signals.

The receiver 170B, also called an "earpiece", converts audio electrical signals into sound signals.

The microphone 170C, also called a "mic" or "sound transducer", converts sound signals into electrical signals. In some embodiments, the electronic device 100 may be provided with two microphones 170C, which can implement noise reduction in addition to collecting sound signals. In other embodiments, the electronic device 100 may be provided with three, four, or more microphones 170C to collect sound signals, reduce noise, identify the sound source, implement directional recording, and so on.

The earphone interface 170D is used to connect wired earphones.

The sensor module 180 may include a pressure sensor, a gyroscope sensor, a barometric pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity light sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, and the like.

The keys 190 include a power key, volume keys, and the like. The motor 191 can generate vibration alerts. The indicator 192 may be an indicator light, which can indicate the charging status and battery level changes, and can also indicate messages, missed calls, notifications, and the like.

The SIM card interface 195 is used to connect a SIM card. A SIM card can be brought into contact with or separated from the electronic device 100 by inserting it into or removing it from the SIM card interface 195. The electronic device 100 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1. The electronic device 100 interacts with the network through the SIM card to implement functions such as calls and data communication. In some embodiments, the electronic device 100 uses an eSIM, that is, an embedded SIM card. The eSIM card can be embedded in the electronic device 100 and cannot be separated from it.

The software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiments of the present application take the Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100.

FIG. 3B is a block diagram of the software structure of the electronic device 100 according to an embodiment of the present application.

The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system is divided into four layers, which are, from top to bottom, the application layer, the application framework layer, the Android runtime and system libraries, and the kernel layer.

The application layer may include a series of application packages.

As shown in FIG. 3B, the application packages may include applications such as camera, gallery, calendar, phone, maps, navigation, WLAN, Bluetooth, music, messaging, and voice assistant.

The application framework layer provides an application programming interface (API) and a programming framework for the applications in the application layer. The application framework layer includes some predefined functions.

As shown in FIG. 3B, the application framework layer may include a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, an activity manager, and so on.

The window manager is used to manage window programs. The window manager can obtain the display size, determine whether there is a status bar, lock the screen, take screenshots, and so on.

The content provider is used to store and retrieve data and make the data accessible to applications. The data may include videos, images, audio, calls made and received, browsing history and bookmarks, the phone book, and so on.

The view system includes visual controls, such as controls for displaying text and controls for displaying pictures. The view system can be used to build applications. A display interface may consist of one or more views. For example, a display interface that includes a text message notification icon may include a view for displaying text and a view for displaying pictures.

The phone manager is used to provide the communication functions of the electronic device 100, for example, management of call status (including connecting, hanging up, and so on).

The resource manager provides applications with various resources, such as localized strings, icons, pictures, layout files, and video files.

The notification manager enables applications to display notification information in the status bar. It can be used to convey informational messages that disappear automatically after a short time without user interaction. For example, the notification manager is used to announce that a download has completed or to deliver message reminders. A notification may also appear in the status bar at the top of the system as a chart or scrolling text, such as a notification from an application running in the background, or appear on the screen as a dialog window. Examples include showing text in the status bar, playing a prompt sound, vibrating the electronic device, and flashing the indicator light.

The activity manager is responsible for managing activities, for starting, switching, and scheduling the components in the system, and for managing and scheduling applications. Upper-layer applications can call the activity manager to open the corresponding activity.

The Android runtime includes the core libraries and a virtual machine. The Android runtime is responsible for the scheduling and management of the Android system.

The core libraries consist of two parts: one part is the functions that the Java language needs to call, and the other part is the Android core libraries.

The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine performs functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.

The system libraries may include multiple functional modules, for example: a surface manager, media libraries, a 3D graphics processing library (for example, OpenGL ES), and a 2D graphics engine (for example, SGL).

The surface manager is used to manage the display subsystem and provides compositing of 2D and 3D layers for multiple applications.

The media libraries support playback and recording of a variety of commonly used audio and video formats, as well as still image files. The media libraries can support multiple audio, video, and image encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG.

The 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, layer processing, and so on.

The 2D graphics engine is a drawing engine for 2D drawing.

The kernel layer is the layer between hardware and software. The kernel layer includes at least a display driver, a camera driver, an audio driver, and a sensor driver.

Please refer to FIG. 4, which exemplarily shows a schematic structural diagram of a voice wake-up device 10 provided by an embodiment of the present application.

As shown in FIG. 4, the voice wake-up device 10 may include the following modules, coupled to each other via a bus: a pre-wake-up module 410, an arbitration module 420, a wake-up module 430, a voice assistant APP 440, a communication module 450, an audio input module 460, an audio output module 470, and a storage module 480. Where:

The storage module 480 may store a pre-wake-up model 481 and a wake-up model 482. The pre-wake-up model 481 can be used to detect whether a sound contains the pre-wake-up word. The wake-up model 482 can be used to detect whether a sound contains the wake-up word. The embodiments of the present application do not limit how the pre-wake-up model 481 and the wake-up model 482 are implemented.

Exemplarily, the voice wake-up device 10 may collect ambient sound through a microphone. When the user speaks the pre-wake-up word (such as "Xiaoyi") near the voice wake-up device 10, the ambient sound may contain the pre-wake-up voice. After collecting the ambient sound, the voice wake-up device 10 may use the pre-wake-up model 481 to separate the user's voice from the ambient sound and decode a phoneme sequence from the user's voice. Once the phoneme sequence is obtained, the voice wake-up device 10 may use the pre-wake-up model 481 to determine whether the decoded phoneme sequence matches the stored phoneme sequence of the pre-wake-up word. If so, the voice wake-up device 10 may determine that the user's voice contains the pre-wake-up voice.
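The matching step described above can be illustrated with a minimal sketch. It assumes the decoder has already produced a list of phoneme labels; the pinyin-style labels chosen for "Xiaoyi" and the function name `contains_phonemes` are hypothetical, since the application does not specify a phoneme inventory or a matching algorithm.

```python
from typing import Sequence

# Hypothetical pinyin-style phoneme labels for the pre-wake-up word "Xiaoyi".
PRE_WAKE_PHONEMES = ("x", "iao", "y", "i")


def contains_phonemes(decoded: Sequence[str], target: Sequence[str]) -> bool:
    """Return True if `target` occurs as a contiguous run inside `decoded`."""
    n, m = len(decoded), len(target)
    if m == 0 or m > n:
        return False
    wanted = tuple(target)
    # Slide a window of length m over the decoded sequence.
    return any(tuple(decoded[i:i + m]) == wanted for i in range(n - m + 1))
```

An exact match is the simplest possible criterion; a practical detector would more likely score the match probabilistically.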

Optionally, the pre-wake-up model 481 may also be a neural-network-based model. The wake-up model 482 may be implemented in the same way as the pre-wake-up model 481 or in a different way.

In some embodiments, the pre-wake-up model 481 and the wake-up model 482 may be the same model. It can be understood that the pre-wake-up word is a part of the wake-up word. The voice wake-up device 10 may use a single model to identify whether the collected sound contains the pre-wake-up word. When the sound is determined to contain the pre-wake-up word, the model may output a result indicating that the pre-wake-up word is present. The voice wake-up device 10 may then use the same model to continue identifying whether the collected sound contains the wake-up word. When the sound is determined to contain the wake-up word, the model may output a result indicating that the wake-up word is present.
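As a rough illustration of a single model that reports the pre-wake-up word first and the full wake-up word later, the sketch below consumes one phoneme label at a time. The class name `PrefixWakeDetector`, the phoneme labels, and the `reset` hook are assumptions made for this example; the application leaves the model implementation open.

```python
from typing import List, Optional, Sequence


class PrefixWakeDetector:
    """Streaming detector: emits "pre_wake" at the prefix, "wake" at the full word."""

    def __init__(self, wake: Sequence[str], pre_len: int) -> None:
        self.wake = tuple(wake)
        self.pre = self.wake[:pre_len]  # the pre-wake-up word is a prefix of the wake-up word
        self.buffer: List[str] = []
        self.pre_reported = False

    def reset(self) -> None:
        # Assumption: the caller invokes this on a timeout if the word is never completed.
        self.buffer.clear()
        self.pre_reported = False

    def feed(self, phoneme: str) -> Optional[str]:
        self.buffer.append(phoneme)
        self.buffer = self.buffer[-len(self.wake):]  # keep only the most recent phonemes
        if tuple(self.buffer) == self.wake:
            self.reset()
            return "wake"
        if not self.pre_reported and tuple(self.buffer[-len(self.pre):]) == self.pre:
            self.pre_reported = True
            return "pre_wake"
        return None


# Hypothetical phoneme labels for the wake-up word "Xiaoyi Xiaoyi".
WAKE = ("x", "iao", "y", "i", "x", "iao", "y", "i")
```

Feeding the eight phonemes of `WAKE` in order yields one `"pre_wake"` event after the fourth phoneme and one `"wake"` event after the eighth.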

The storage module 480 is not limited to storing the pre-wake-up model 481 and the wake-up model 482; it may also store other content. The storage module 480 may be equivalent to the internal memory 121 shown in FIG. 3A.

The audio input module 460 can be used to collect sound. The audio input module 460 may include a microphone. When the voice wake-up function of the voice wake-up device 10 is turned on, the audio input module 460 may convert the collected sound into an audio electrical signal and send it to the pre-wake-up module 410 and the wake-up module 430.

The audio output module 470 can be used to convert an audio electrical signal into sound and play that sound. The audio output module 470 may include one or more of the following: a speaker, a receiver. For example, when the wake-up voice is detected, the voice assistant APP 440 may call the audio output module 470 to answer "I'm here" by voice.

The pre-wake-up module 410 can be used, upon receiving the audio electrical signal from the audio input module 460, to obtain the pre-wake-up model 481 from the storage module 480 and detect whether the pre-wake-up word is present. When it detects that the sound collected by the audio input module 460 contains the pre-wake-up word, the pre-wake-up module 410 may instruct the arbitration module 420 to determine the answering device.

In one possible implementation, the pre-wake-up module 410 may determine, from the audio obtained from the sound collected by the audio input module 460, the pre-wake-up word audio energy corresponding to the audio that contains the pre-wake-up word. The pre-wake-up module 410 may send this pre-wake-up word audio energy to the arbitration module 420. In addition, the pre-wake-up module 410 may send the pre-wake-up word audio energy through the communication module 450 to the other voice wake-up devices in the voice wake-up system (such as voice wake-up devices 11 to 15).

In another possible implementation, the pre-wake-up module 410 may instead send the audio containing the pre-wake-up word to the arbitration module 420. The arbitration module 420 may determine the pre-wake-up word audio energy and send it to the other voice wake-up devices in the voice wake-up system.

It should be noted that when the voice wake-up devices in the voice wake-up system notify each other of the pre-wake-up word audio energy each has determined, the values sent to other devices may be normalized. That is, the pre-wake-up word audio energies from different voice wake-up devices that are used to determine the answering device are computed on the same scale. On this common scale, the magnitude of the pre-wake-up word audio energy reflects the distance between the voice wake-up device and the user: a larger pre-wake-up word audio energy indicates that the corresponding voice wake-up device is closer to the user. The voice wake-up device corresponding to a pre-wake-up word audio energy is the device that determined that energy. In this way, each voice wake-up device in the voice wake-up system can determine the answering device according to the magnitude of the pre-wake-up word audio energy.
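The application does not say how the energy is computed or normalized. Purely as an assumed example, one common scale is the mean-square level relative to full scale, expressed in decibels and corrected by a per-device microphone gain. The function name, the 16-bit full-scale value, and the `mic_gain_db` calibration parameter below are all hypothetical.

```python
import math
from typing import Sequence


def normalized_energy_db(samples: Sequence[float],
                         full_scale: float = 32768.0,
                         mic_gain_db: float = 0.0) -> float:
    """Mean-square energy in dB relative to full scale, minus a device-specific gain.

    Assumption: subtracting a calibrated gain puts devices with different
    microphones on a comparable scale.
    """
    if not samples:
        raise ValueError("no samples for the pre-wake-up word segment")
    mean_square = sum((s / full_scale) ** 2 for s in samples) / len(samples)
    level_db = 10.0 * math.log10(max(mean_square, 1e-12))  # floor avoids log(0)
    return level_db - mic_gain_db
```

With this convention a louder capture gives a larger (less negative) value, consistent with the rule that larger energy indicates a device closer to the user.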

The embodiments of the present application do not limit how the pre-wake-up word audio energy is normalized.

The arbitration module 420 can be used, after obtaining the pre-wake-up word audio energy determined by the voice wake-up device 10 and by the other voice wake-up devices in the voice wake-up system, to determine which voice wake-up device has the largest pre-wake-up word audio energy. The arbitration module 420 may determine the voice wake-up device corresponding to the largest pre-wake-up word audio energy as the answering device.

The arbitration module 420 is optional.

In some embodiments, every voice wake-up device in the voice wake-up system includes the arbitration module 420. That is, the process by which the voice wake-up devices negotiate to determine the answering device may include: the devices notify each other of the pre-wake-up word audio energy each has determined, and each device uses its own arbitration module to determine the answering device from these energies.

In some embodiments, the voice wake-up device 10 may not include the arbitration module 420. For example, one voice wake-up device in the voice wake-up system (such as the voice wake-up device 11) is the master device. After the voice wake-up device 10 detects the pre-wake-up word through the pre-wake-up module 410, it may send the pre-wake-up word audio energy to the master device. The master device may include the arbitration module 420. The master device may obtain the pre-wake-up word audio energies determined by multiple voice wake-up devices in the voice wake-up system and use them, through the arbitration module 420, to determine the answering device. Optionally, the master device may also be an electronic device other than the voice wake-up devices. Alternatively, each voice wake-up device in the voice wake-up system may send the pre-wake-up word audio energy it has determined to a server (such as a cloud server). The cloud server may determine the answering device and instruct it to respond to the user after the wake-up voice is detected.

The wake-up module 430 can be used, upon receiving the audio electrical signal from the audio input module 460, to obtain the wake-up model 482 from the storage module 480 and detect whether the wake-up word is present. When it detects that the sound collected by the audio input module 460 contains the wake-up word, the wake-up module 430 may obtain the answering-device determination result from the arbitration module 420. When the wake-up voice is detected and the answering device is the voice wake-up device 10, the wake-up module 430 may wake up the voice assistant APP 440. The wake-up module 430 may include the voice wake-up module of the embodiment described above with reference to FIG. 3A.

In some embodiments, the pre-wake-up module 410 and the wake-up module 430 may be the same module.

The voice assistant APP 440 can be used to respond to the user after being woken up. For example, the voice assistant APP 440 may call the audio input module 460 to respond to the wake-up word spoken by the user. The voice assistant APP 440 can recognize the user's voice command and perform the operation corresponding to that command. The voice assistant APP 440 may include the voice command recognition module and the voice command execution module of the embodiment described above with reference to FIG. 3A.

The communication module 450 can be used by the voice wake-up device 10 to communicate with other electronic devices. For example, through the communication module 450 the voice wake-up device 10 can discover voice wake-up devices that have the same wake-up word as itself and determine their status (such as whether the voice wake-up function is enabled, the working status, and so on). When the pre-wake-up voice is detected, the voice wake-up device 10 can also use the communication module 450 to send the pre-wake-up word audio energy it has determined to the other voice wake-up devices in the voice wake-up system and to receive the pre-wake-up word audio energies determined by those devices. The communication module 450 may forward the pre-wake-up word audio energies determined by the other voice wake-up devices to the arbitration module 420.

The voice wake-up device 10 is not limited to the modules shown in FIG. 4 and may include more or fewer modules. It can be understood that the structure of the other voice wake-up devices in the embodiments of the present application may follow the schematic structural diagram of the voice wake-up device 10 shown in FIG. 4. Details are not repeated here.

The following describes a device wake-up method provided by an embodiment of the present application, based on the voice wake-up device 10 shown in FIG. 4.

The description takes as an example a voice wake-up system composed of the voice wake-up device 10 and the voice wake-up device 11. It can be understood that when the voice wake-up system includes more voice wake-up devices (for example, voice wake-up system A includes voice wake-up devices 10 to 15), the voice wake-up device 10 and the voice wake-up device 11 can each also communicate with the other voice wake-up devices in voice wake-up system A, so as to negotiate and select the voice wake-up device in system A that is closest to the user. The pairwise communication between the voice wake-up devices in voice wake-up system A may follow the communication process between the voice wake-up device 10 and the voice wake-up device 11, and is not elaborated here.

Please refer to FIG. 5, which exemplarily shows a flowchart of a device wake-up method provided by an embodiment of the present application.

The method may include steps S510 to S560. Where:

S510. The voice wake-up device 10 detects the pre-wake-up voice, enters the pre-wake-up state, and determines the pre-wake-up word audio energy corresponding to the pre-wake-up word it detected. The pre-wake-up word is a part of the wake-up word.

S520. The voice wake-up device 11 detects the pre-wake-up voice, enters the pre-wake-up state, and determines the pre-wake-up word audio energy corresponding to the pre-wake-up word it detected. The pre-wake-up word is a part of the wake-up word.

While the user is speaking the wake-up word (such as "Xiaoyi Xiaoyi"), the voice wake-up device 10 and the voice wake-up device 11 can enter the pre-wake-up state as soon as they detect the pre-wake-up voice, and begin negotiating to determine the answering device. It can be understood that, because the pre-wake-up word is a part of the wake-up word, the voice wake-up device 10 and the voice wake-up device 11 may start this negotiation before the user has finished speaking the wake-up word.

For how the pre-wake-up voice is detected and how the pre-wake-up word audio energy is determined, refer to the description of the foregoing embodiments. Details are not repeated here.

S530. The voice wake-up device 10 and the voice wake-up device 11 notify each other of their respective pre-wake-up word audio energies.

The voice wake-up device 10 may determine that the voice wake-up devices having the same wake-up word as itself include the voice wake-up device 11. The voice wake-up device 10 may send the pre-wake-up word audio energy it has determined to the voice wake-up device 11 and wait for the pre-wake-up word audio energy from the voice wake-up device 11.

The voice wake-up device 11 may determine that the voice wake-up devices having the same wake-up word as itself include the voice wake-up device 10. The voice wake-up device 11 may send the pre-wake-up word audio energy it has determined to the voice wake-up device 10 and wait for the pre-wake-up word audio energy from the voice wake-up device 10.
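The "send, then wait for the peer" behaviour of step S530 can be sketched from one device's point of view as a small collector that knows which peers it expects to hear from. The class name `EnergyCollector`, its methods, and the device identifiers are hypothetical; the transport and any timeout handling are outside this sketch.

```python
from typing import Dict, Iterable, Optional


class EnergyCollector:
    """Tracks pre-wake-up word energies received from peers with the same wake-up word."""

    def __init__(self, own_id: str, own_energy: float, peers: Iterable[str]) -> None:
        self.pending = set(peers)  # peers that have not reported yet
        self.energies: Dict[str, float] = {own_id: own_energy}

    def on_report(self, device_id: str, energy: float) -> None:
        # Reports from unknown or already-counted devices are ignored.
        if device_id in self.pending:
            self.pending.discard(device_id)
            self.energies[device_id] = energy

    def result(self) -> Optional[Dict[str, float]]:
        """Return all energies once every peer has reported, otherwise None."""
        return dict(self.energies) if not self.pending else None
```

Device 10 would create `EnergyCollector("device10", own_energy, ["device11"])`, call `on_report` when the message from device 11 arrives, and pass the completed table to its arbitration step.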

S540. Based on its own pre-wake-up word audio energy and that of the voice wake-up device 11, the voice wake-up device 10 determines that its own pre-wake-up word audio energy is the largest and determines itself as the answering device.

Upon receiving the pre-wake-up word audio energy from the voice wake-up device 11, the voice wake-up device 10 can compare its own pre-wake-up word audio energy with that of the voice wake-up device 11 to determine which device's energy is the largest. When it determines that its own pre-wake-up word audio energy is the largest, the voice wake-up device 10 may determine itself as the answering device.

S550. Based on its own pre-wake-up word audio energy and that of the voice wake-up device 10, the voice wake-up device 11 determines that the pre-wake-up word audio energy of the voice wake-up device 10 is the largest and determines the voice wake-up device 10 as the answering device.

Upon receiving the pre-wake-up word audio energy from the voice wake-up device 10, the voice wake-up device 11 can compare its own pre-wake-up word audio energy with that of the voice wake-up device 10 to determine which device's energy is the largest. When it determines that the pre-wake-up word audio energy of the voice wake-up device 10 is the largest, the voice wake-up device 11 may determine the voice wake-up device 10 as the answering device.

In some embodiments, after each has determined the answering device, the voice wake-up device 10 and the voice wake-up device 11 may also notify each other of their respective determination results.

In some embodiments, in addition to the pre-wake-up word audio energy determined by each voice wake-up device, the voice wake-up device 10 and the voice wake-up device 11 may also take into account the device information of each voice wake-up device (such as device type, frequency of use, device capability, and so on) when determining the answering device. For example, the voice wake-up device 10 may first compare its own pre-wake-up word audio energy with that of the voice wake-up device 11 to determine which device has the largest pre-wake-up word audio energy. If multiple voice wake-up devices have the same pre-wake-up word audio energy, this may indicate that they are at the same distance from the user. When it determines that the voice wake-up device 10 and the voice wake-up device 11 have the same pre-wake-up word audio energy, the voice wake-up device 10 may determine the answering device according to device capability. If the capability of the voice wake-up device 10 is higher than that of the voice wake-up device 11 (for example, the voice wake-up device 10 has better sound quality), the voice wake-up device 10 may determine itself as the answering device. Likewise, the voice wake-up device 11 may determine the voice wake-up device 10 as the answering device using the same method. The embodiments of the present application do not limit how a voice wake-up device combines the pre-wake-up word audio energies and the device information of the voice wake-up devices to determine the answering device.
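One way to combine energy with device information is to use capability only as a tie-breaker, as in the example above. In the sketch below, the `Report` record, the integer `capability` score, and the optional `tolerance_db` window for treating energies as equal are all assumptions; the application leaves the combination rule open.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Report:
    device_id: str
    energy_db: float
    capability: int  # hypothetical score, e.g. higher means better sound quality


def choose_with_capability(reports: Iterable[Report], tolerance_db: float = 0.0) -> str:
    """Pick the highest-energy device; break ties by capability, then by device id."""
    items = list(reports)
    if not items:
        raise ValueError("no reports available")
    best = max(r.energy_db for r in items)
    # Devices whose energy is within the tolerance of the best are treated as tied.
    tied = [r for r in items if best - r.energy_db <= tolerance_db]
    return min(tied, key=lambda r: (-r.capability, r.device_id)).device_id
```

Because every device applies the same rule to the same reports, device 10 and device 11 reach the same conclusion independently.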

S560. When the voice wake-up device 10 detects the wake-up voice, it enters the wake-up state according to the answering-device determination result of step S540; the voice assistant APP is woken up and responds to the user.
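Seen from a single device, steps S510 to S560 can be sketched as a small state machine that enters the wake-up state only when both conditions hold: the wake-up word has been detected and this device is the answering device. The class and method names are hypothetical, and returning a non-answering device to the idle state is an assumption made for this sketch.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    IDLE = "idle"
    PRE_WAKE = "pre_wake"
    AWAKE = "awake"


class WakeController:
    def __init__(self, own_id: str) -> None:
        self.own_id = own_id
        self.state = State.IDLE
        self.answering_device: Optional[str] = None
        self.wake_word_seen = False

    def on_pre_wake_word(self) -> None:
        # S510/S520: enter the pre-wake-up state and start negotiating.
        if self.state is State.IDLE:
            self.state = State.PRE_WAKE

    def on_arbitration_result(self, device_id: str) -> None:
        # S540/S550: the answering device has been determined.
        self.answering_device = device_id
        self._maybe_wake()

    def on_wake_word(self) -> None:
        # S560: the full wake-up word has been detected.
        self.wake_word_seen = True
        self._maybe_wake()

    def _maybe_wake(self) -> None:
        if (self.state is not State.PRE_WAKE or not self.wake_word_seen
                or self.answering_device is None):
            return
        if self.answering_device == self.own_id:
            self.state = State.AWAKE
        else:
            # Assumption: a device that is not the answering device goes back to idle.
            self.state = State.IDLE
            self.answering_device = None
            self.wake_word_seen = False
```

The two events `on_arbitration_result` and `on_wake_word` may arrive in either order; the controller wakes only after both have been seen.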

Based on the method shown in FIG. 5, the following describes the timeline from when the user speaks the wake-up word to when the voice wake-up device determines the answering device.

Please refer to FIG. 6, which exemplarily shows the timeline from when the user speaks the wake-up word to when the voice wake-up device determines the answering device.

As shown in FIG. 6, the user starts to say the wake-up word "Xiaoyi Xiaoyi" at time t1. It can be understood that the wake-up word has multiple syllables, so the user needs a certain amount of time to say it. The user finishes speaking the wake-up word at time t3; that is, the period from t1 to t3 is the period during which the user speaks the wake-up word. The pre-wake-up word is a part of the wake-up word, and the user may finish speaking the pre-wake-up word (such as "Xiaoyi") within the period from t1 to t2. The voice wake-up device 10 detects the pre-wake-up voice at time t2 and enters the pre-wake-up state. In the pre-wake-up state, the voice wake-up device 10 can wait to receive the pre-wake-up word audio energies of the other voice wake-up devices. The voice wake-up device 10 determines the answering device at time t4. That is, the period from t2 to t4 is the time the voice wake-up device 10 takes from detecting the pre-wake-up voice to determining the answering device.

Different voice wake-up devices have different computing capabilities. If the computing capability of a voice wake-up device is strong, the time it needs to determine the answering device according to the pre-wake-up word audio energy is short. If its computing capability is weak, that time is long. Therefore, time t4 may be before time t3. That is, the voice wake-up device 10 has already obtained the answering-device determination result before the user finishes speaking the wake-up word. Time t4 may also be after time t3. That is, the voice wake-up device 10 obtains the answering-device determination result only after the user finishes speaking the wake-up word.

The voice wake-up device 10 detects the wake-up voice at time t5. When the wake-up voice is detected and the voice wake-up device 10 is the answering device, the voice wake-up device 10 may enter the wake-up state. The voice wake-up device 10 may determine the answering device and detect whether the collected sound contains the wake-up word in parallel. Both determining the answering device and detecting whether the collected sound contains the wake-up word require a certain processing time. Time t5 may be after time t4. That is, the voice wake-up device 10 detects the wake-up voice only after determining the answering device. In that case, if the voice wake-up device 10 is the answering device, it may enter the wake-up state as soon as it detects the wake-up voice. Time t5 may also be before time t4. That is, when the voice wake-up device 10 detects the wake-up voice, the answering device has not yet been determined. When the voice wake-up device 10 determines, after detecting the wake-up voice, that it is itself the answering device, it may enter the wake-up state. It can be seen that, regardless of whether time t5 is before or after time t4, compared with negotiating with other voice wake-up devices to determine the answering device only after the wake-up voice is detected, the method shown in FIG. 5 can reduce the time from when the user finishes speaking the wake-up word to when the answering device responds to the user.
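The order-independent behavior described above (t4 before t5, or t5 before t4) can be illustrated with a minimal Python sketch. The class and method names are hypothetical and are not part of the application; the sketch only assumes that a device records the two parallel results and enters the wake-up state once both hold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WakeDecision:
    """Hypothetical per-device record of the two parallel results."""
    wake_voice_detected: bool = False           # becomes True at time t5
    is_answering_device: Optional[bool] = None  # known at time t4; None = undecided
    awake: bool = False

    def on_answering_device_determined(self, is_self: bool) -> bool:
        """Record the determination result (time t4) and re-evaluate."""
        self.is_answering_device = is_self
        return self._update()

    def on_wake_voice_detected(self) -> bool:
        """Record detection of the full wake-up word (time t5) and re-evaluate."""
        self.wake_voice_detected = True
        return self._update()

    def _update(self) -> bool:
        # Enter the wake-up state only when both conditions hold,
        # whichever of the two events arrived first.
        if self.wake_voice_detected and self.is_answering_device:
            self.awake = True
        return self.awake
```

In this sketch a device that is not the answering device never wakes, and a device that is the answering device wakes at the later of t4 and t5.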

It can be seen from the above implementation that each voice wake-up device in the voice wake-up system can determine when to start the above negotiation process by detecting the pre-wake-up voice. The voice wake-up devices can start negotiating to determine the answering device before the user has finished speaking the wake-up word. In this way, when the wake-up voice is detected, that is, after the user finishes speaking the wake-up word, the answering device can enter the wake-up state. The above method not only improves the response speed of the voice wake-up device after the wake-up voice is detected; because the answering device responds only once the wake-up voice has actually been detected, the wake-up rate is also not affected. In a scenario with multiple voice wake-up devices that share the same wake-up word, this can effectively improve the user's experience of the voice wake-up function.

In some embodiments, when the user speaks the wake-up word, the sound collected by a voice wake-up device usually contains environmental noise in addition to the user's voice. The environmental noise may refer to any sound other than the voice corresponding to the wake-up word, for example, the sound of the user walking, the sound of a TV playing video, or the sound of a speaker playing music. The noise sources near different voice wake-up devices may differ. For example, the voice wake-up device 10 is very close to an air conditioner, so the environmental noise produced by the running air conditioner is loud in the sound collected by the voice wake-up device 10. The voice wake-up device 11 is far from the air conditioner, so that noise is quiet in the sound collected by the voice wake-up device 11. In this case, judging the distance between a voice wake-up device and the user from a pre-wake-up word audio energy that is determined from audio containing both the pre-wake-up voice and environmental noise may produce a large error.

To improve the accuracy of determining the answering device, so that the voice wake-up device closest to the user enters the wake-up state after the user speaks the wake-up word, a voice wake-up device may reduce the energy contributed by environmental noise in the pre-wake-up word audio energy. Still taking the voice wake-up system composed of the voice wake-up device 10 and the voice wake-up device 11 as an example, the following describes another device wake-up method provided by an embodiment of the present application.

Please refer to FIG. 7, which exemplarily shows a flowchart of another device wake-up method provided by an embodiment of the present application.

The method may include steps S710 to S780, as follows:

S710. The voice wake-up device 10 collects ambient sound and determines the ambient audio energy corresponding to the ambient sound.

S720. The voice wake-up device 11 collects ambient sound and determines the ambient audio energy corresponding to the ambient sound.

In a possible implementation, the voice wake-up device 10 and the voice wake-up device 11 may process the collected ambient sound periodically or aperiodically to determine the ambient audio energy. The ambient sound may denote the sound that the audio input apparatus of a voice wake-up device can collect. The ambient audio energy may include parameters such as the sound intensity and sound pressure of the ambient sound. The embodiments of the present application do not limit the method for calculating the ambient audio energy.

S730. The voice wake-up device 10 detects the pre-wake-up voice, enters the pre-wake-up state, determines the pre-wake-up word audio energy corresponding to the pre-wake-up word it detected, and obtains the denoised pre-wake-up word audio energy according to the pre-wake-up word audio energy and the ambient audio energy, where the pre-wake-up word is a part of the wake-up word.

S740. The voice wake-up device 11 detects the pre-wake-up voice, enters the pre-wake-up state, determines the pre-wake-up word audio energy corresponding to the pre-wake-up word it detected, and obtains the denoised pre-wake-up word audio energy according to the pre-wake-up word audio energy and the ambient audio energy, where the pre-wake-up word is a part of the wake-up word.

The voice wake-up device 10 and the voice wake-up device 11 may also perform speech recognition on the collected ambient sound to judge whether it contains the pre-wake-up word. When the pre-wake-up voice is detected, the voice wake-up device 10 and the voice wake-up device 11 may enter the pre-wake-up state.

The voice wake-up device 10 may determine the pre-wake-up word audio energy corresponding to the pre-wake-up word from the audio obtained from the sound containing the pre-wake-up word. Because that sound contains environmental noise, the pre-wake-up word audio energy includes the audio energy of the environmental noise. The voice wake-up device 10 may obtain the ambient audio energy determined from the most recently collected ambient sound that does not contain the pre-wake-up word. The voice wake-up device 10 may then subtract the ambient audio energy from the pre-wake-up word audio energy to obtain the denoised pre-wake-up word audio energy. In other words, the voice wake-up device 10 may determine the denoised pre-wake-up word audio energy according to the following formula:

Denoised pre-wake-up word audio energy = pre-wake-up word audio energy - ambient audio energy

It can be understood that, because the ambient sound that does not contain the pre-wake-up word is the one most recently collected by the voice wake-up device 10 before it detects the pre-wake-up voice, the sound in the environment, apart from the user's pre-wake-up voice, can be regarded as almost unchanged between the time the voice wake-up device 10 collects that ambient sound and the time the user speaks the pre-wake-up word. The sound containing the pre-wake-up word can therefore be regarded as consisting of the user's pre-wake-up voice and the ambient sound that does not contain the pre-wake-up word. In this way, the denoised pre-wake-up word audio energy is approximately the audio energy produced by the pre-wake-up voice alone. Compared with the pre-wake-up word audio energy, the denoised pre-wake-up word audio energy can more accurately reflect the distance between the voice wake-up device and the user.
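The subtraction above can be illustrated with a minimal Python sketch. The application does not limit how audio energy is calculated, so the mean-square measure, the clamping of negative results to zero, and all function names below are assumptions made only for illustration.

```python
from typing import Sequence


def mean_square_energy(samples: Sequence[float]) -> float:
    """Mean of squared sample amplitudes (an assumed energy measure)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def denoised_pre_wake_energy(pre_wake_samples: Sequence[float],
                             ambient_samples: Sequence[float]) -> float:
    """Pre-wake-up word audio energy minus ambient audio energy.

    ambient_samples stands for the most recently collected ambient sound
    that does not contain the pre-wake-up word.
    """
    pre_wake = mean_square_energy(pre_wake_samples)
    ambient = mean_square_energy(ambient_samples)
    # Clamping at zero is an assumption: it avoids a negative energy when
    # the ambient estimate happens to exceed the pre-wake-up measurement.
    return max(pre_wake - ambient, 0.0)
```

For instance, a device next to an air conditioner would report a high raw energy, but subtracting its own high ambient energy leaves roughly the part contributed by the user's voice.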

In a possible implementation, before determining the pre-wake-up word audio energy, the voice wake-up device 10 may perform noise cancellation on the audio obtained from the sound containing the pre-wake-up word. Before determining the ambient audio energy, the voice wake-up device 10 may perform noise cancellation on the ambient sound that does not contain the pre-wake-up word. The noise cancellation may be used to remove part of the noise signal in the audio. In other words, both the pre-wake-up word audio energy and the ambient audio energy may be obtained after noise reduction. The noise cancellation that the voice wake-up device 10 applies to the ambient sound that does not contain the pre-wake-up word and to the audio obtained from the sound containing the pre-wake-up word may be the same. In this way, the noise signals common to both can be removed. The embodiments of the present application do not limit how the noise cancellation is implemented.

For the method by which the voice wake-up device 11 determines the denoised pre-wake-up word audio energy, refer to the method by which the voice wake-up device 10 determines the denoised pre-wake-up word audio energy, described above.

S750. The voice wake-up device 10 and the voice wake-up device 11 notify each other of their respective denoised pre-wake-up word audio energies.

S760. According to its own denoised pre-wake-up word audio energy and that of the voice wake-up device 11, the voice wake-up device 10 determines that its own denoised pre-wake-up word audio energy is the largest and determines itself as the answering device.

S770. According to its own denoised pre-wake-up word audio energy and that of the voice wake-up device 10, the voice wake-up device 11 determines that the denoised pre-wake-up word audio energy of the voice wake-up device 10 is the largest and determines the voice wake-up device 10 as the answering device.

For the process by which the voice wake-up device 10 and the voice wake-up device 11 determine the answering device using the denoised pre-wake-up word audio energy, refer to the process in the method shown in FIG. 5 by which they determine the answering device using the pre-wake-up word audio energy. Details are not repeated here.
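Steps S750 to S770 can be sketched as every device running the same selection over the same set of exchanged energies, so that all devices reach the same result. The following Python sketch is illustrative only; the device identifiers and the tie-break rule (smallest identifier wins when energies are equal) are assumptions that the application does not specify.

```python
from typing import Dict


def choose_answering_device(energies: Dict[str, float]) -> str:
    """Return the device whose (denoised) pre-wake-up word energy is largest.

    energies maps a hypothetical device identifier to the energy that the
    device announced. Iterating over sorted identifiers makes ties resolve
    to the smallest identifier on every device.
    """
    if not energies:
        raise ValueError("no pre-wake-up word audio energy was exchanged")
    return max(sorted(energies), key=lambda device_id: energies[device_id])


def is_self_answering(own_id: str, energies: Dict[str, float]) -> bool:
    """What each device evaluates locally in steps S760 and S770."""
    return choose_answering_device(energies) == own_id
```

With the energies of FIG. 7 exchanged in step S750, device 10 evaluates `is_self_answering` to true and device 11 to false, which matches the outcome of steps S760 and S770.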

S780. When the voice wake-up device 10 detects the wake-up voice, it enters the wake-up state according to the answering-device determination result of step S760; the voice assistant APP is woken up and responds to the user.

It can be seen from the method shown in FIG. 7 that, in the process of negotiating to determine the answering device using the pre-wake-up word audio energy, each voice wake-up device in the voice wake-up system can remove the audio energy produced by environmental noise from the pre-wake-up word audio energy. This can reduce the influence of environmental noise on determining the answering device and improve the accuracy of the determination result. With the denoised pre-wake-up word audio energy, the voice wake-up devices can start negotiating to determine the answering device before the user has finished speaking the wake-up word. In this way, when the wake-up voice is detected, that is, after the user finishes speaking the wake-up word, the answering device can enter the wake-up state. The above method not only improves the response speed of the voice wake-up device after the wake-up voice is detected; because the answering device responds only once the wake-up voice has actually been detected, the wake-up rate is also not affected. In a scenario with multiple voice wake-up devices that share the same wake-up word, this can effectively improve the user's experience of the voice wake-up function.

In some embodiments, the voice wake-up system may include a master device. The master device may be one of the multiple voice wake-up devices in the voice wake-up system that share the same wake-up word. Alternatively, the master device may be an electronic device other than those voice wake-up devices. The master device may be used to receive the pre-wake-up word audio energy determined by each voice wake-up device in the voice wake-up system and to determine one answering device from the voice wake-up system according to the pre-wake-up word audio energy.

Please refer to FIG. 8, which exemplarily shows a schematic structural diagram of a master device 200 provided by an embodiment of the present application.

As shown in FIG. 8, the master device 200 may include an arbitration module 810, a communication module 820, and a storage module 830 coupled through a bus, where:

The communication module 820 may be used to establish a communication connection with the voice wake-up devices in the voice wake-up system. The communication connection may include a wired communication connection or a wireless communication connection (such as a Bluetooth communication connection or a Wi-Fi communication connection). The embodiments of the present application do not limit the specific manner of the communication connection. Through the communication module 820, the master device 200 may receive the pre-wake-up word audio energy determined by the voice wake-up devices and send the answering-device determination result to each voice wake-up device.

The communication module 820 may send the received pre-wake-up word audio energy to the arbitration module 810.

The arbitration module 810 may judge, according to the pre-wake-up word audio energies determined by the different voice wake-up devices, which voice wake-up device has the largest pre-wake-up word audio energy. The arbitration module 810 may determine the voice wake-up device corresponding to the largest pre-wake-up word audio energy as the answering device.

For the arbitration module 810, refer to the arbitration module 420 shown in FIG. 4 above.

The storage module 830 may be used to store computer programs, for example, a computer program for determining the answering device using the pre-wake-up word audio energy.

The master device 200 is not limited to the modules shown in FIG. 8 and may include more or fewer modules. For example, the master device 200 is one of the voice wake-up devices in the voice wake-up system. In that case, the master device 200 may further include a pre-wake-up module, a wake-up module, a voice assistant APP, a voice input module, and an audio output module. The storage module 830 may further store a pre-wake-up model and a wake-up model. When the master device 200 determines, according to the pre-wake-up word audio energy it determined itself and the pre-wake-up word audio energy determined by the other voice wake-up devices, that it is itself the answering device, the master device 200 may enter the wake-up state after detecting the wake-up voice.

In some embodiments, in addition to multiple voice wake-up devices that share the same wake-up word, the voice wake-up system may also include a cloud server. The cloud server may be used to receive the pre-wake-up word audio energy determined by each voice wake-up device in the voice wake-up system and to determine one answering device from the voice wake-up system according to the pre-wake-up word audio energy. The cloud server may send the answering-device determination result to each voice wake-up device in the voice wake-up system.

Based on a voice wake-up system that includes multiple voice wake-up devices sharing the same wake-up word (such as the voice wake-up device 10 and the voice wake-up device 11) and the master device 200, the following describes another device wake-up method provided by an embodiment of the present application.

FIG. 9 exemplarily shows a flowchart of another device wake-up method provided by an embodiment of the present application.

As shown in FIG. 9, the method may include steps S910 to S980, as follows:

S910. The voice wake-up device 10 detects the pre-wake-up voice, enters the pre-wake-up state, and determines the pre-wake-up word audio energy corresponding to the pre-wake-up word it detected, where the pre-wake-up word is a part of the wake-up word.

S920. The voice wake-up device 11 detects the pre-wake-up voice, enters the pre-wake-up state, and determines the pre-wake-up word audio energy corresponding to the pre-wake-up word it detected, where the pre-wake-up word is a part of the wake-up word.

For steps S910 and S920, refer to steps S510 and S520 shown in FIG. 5 above. The voice wake-up system is not limited to the voice wake-up device 10 and the voice wake-up device 11 and may include more voice wake-up devices. While the user is speaking the wake-up word, multiple voice wake-up devices in the voice wake-up system detect the pre-wake-up voice and each determine the pre-wake-up word audio energy corresponding to the pre-wake-up word they detected.

S930. The voice wake-up device 10 sends its pre-wake-up word audio energy to the master device 200.

S940. The voice wake-up device 11 sends its pre-wake-up word audio energy to the master device 200.

The multiple voice wake-up devices in the voice wake-up system may each send the pre-wake-up word audio energy they determined to the master device 200.

S950. According to the pre-wake-up word audio energies of the multiple voice wake-up devices, the master device 200 determines that the pre-wake-up word audio energy of the voice wake-up device 10 is the largest and determines the voice wake-up device 10 as the answering device.

In a possible implementation, the master device 200 may determine the number of voice wake-up devices included in the voice wake-up system. The master device 200 may start determining the answering device according to the pre-wake-up word audio energy only after receiving the pre-wake-up word audio energy sent by all voice wake-up devices in the voice wake-up system. This can reduce cases in which a voice wake-up device is left out and the answering-device determination result is therefore inaccurate.

Alternatively, the master device 200 may wait, within a preset waiting period, for the voice wake-up devices to send their pre-wake-up word audio energy. If, within the preset waiting period, the master device 200 receives the pre-wake-up word audio energy sent by only some of the voice wake-up devices in the voice wake-up system, the master device 200 may use the pre-wake-up word audio energy received within that period to select one answering device from the voice wake-up devices that sent it. It can be understood that a voice wake-up device that does not send its pre-wake-up word audio energy to the master device 200 within the preset waiting period can be regarded as a device that has not detected the pre-wake-up voice. For example, while the user is speaking the wake-up word, a voice wake-up device far from the user may have difficulty collecting the sound containing the pre-wake-up word. The master device 200 therefore does not have to wait for the pre-wake-up word audio energy from all voice wake-up devices in the voice wake-up system before starting to determine the answering device. This can improve the efficiency of determining the answering device and thus the response speed of the voice wake-up devices in the voice wake-up system after the wake-up voice is detected.
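The two waiting strategies of the master device 200 (wait for all devices, or wait for a preset period) can be sketched as a single function over timestamped reports. This is a minimal Python illustration, not the application's implementation: the report layout, the parameter names, and the tie-break by smallest identifier are all assumptions.

```python
from typing import Dict, Iterable, Optional, Tuple

# (arrival time in seconds, hypothetical device id, pre-wake-up word energy)
Report = Tuple[float, str, float]


def arbitrate(reports: Iterable[Report],
              window_start: float,
              wait_seconds: float,
              expected_devices: Optional[int] = None) -> Optional[str]:
    """Pick the answering device from reports received in the waiting period.

    Reports that arrive after the deadline are ignored, which treats their
    senders as devices that did not detect the pre-wake-up voice. If
    expected_devices is given, arbitration stops early once that many
    devices have reported.
    """
    deadline = window_start + wait_seconds
    received: Dict[str, float] = {}
    for arrival, device_id, energy in sorted(reports):
        if arrival < window_start:
            continue  # belongs to an earlier utterance
        if arrival > deadline:
            break     # too late for this arbitration round
        received[device_id] = energy
        if expected_devices is not None and len(received) >= expected_devices:
            break     # every known device has reported
    if not received:
        return None
    return max(sorted(received), key=lambda device_id: received[device_id])
```

A shorter `wait_seconds` favors response speed, while a longer one or a known `expected_devices` count reduces the chance of leaving out a device.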

For the specific method by which the master device 200 determines the answering device, refer to the foregoing embodiments. Details are not repeated here.

S960. The master device 200 sends the answering-device determination result to the voice wake-up device 11.

S970. The master device 200 sends the answering-device determination result to the voice wake-up device 10.

When the answering device is determined, the master device 200 may send the answering-device determination result to multiple voice wake-up devices in the voice wake-up system. For example, the determination result indicates that the voice wake-up device 10 is the answering device. The master device 200 may send the determination result to the voice wake-up devices that have sent their pre-wake-up word audio energy to the master device 200.

S980. When the voice wake-up device 10 detects the wake-up voice, it may enter the wake-up state according to the received answering-device determination result; the voice assistant APP of the voice wake-up device 10 is woken up and responds to the user.

When the answering-device determination result indicates that the voice wake-up device 10 is the answering device, the voice wake-up device 10 may enter the wake-up state after detecting the wake-up voice.

Step S960 above is optional. In some embodiments, the master device 200 may send the answering-device determination result only to the answering device. For example, when the answering device is determined to be the voice wake-up device 10, the master device 200 may send the determination result to the voice wake-up device 10. After detecting the wake-up voice, the voice wake-up device 10 may enter the wake-up state according to the determination result. After detecting the wake-up voice, the voice wake-up device 11 may wait for the determination result. When the wait times out, the voice wake-up device 11 may stop waiting. In other words, a voice wake-up device that has not received the answering-device determination result from the master device 200 does not enter the wake-up state even if it detects the wake-up voice.
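The device-side wait with timeout described above can be sketched in Python with the standard `queue` module. The queue standing in for messages from the master device 200, the function name, and the timeout value are assumptions for illustration only.

```python
import queue


def should_enter_wake_state(results: "queue.Queue[str]",
                            own_id: str,
                            timeout_seconds: float) -> bool:
    """Decide, after the wake-up voice is detected, whether to wake.

    results is a hypothetical inbox that receives the identifier of the
    answering device if the master device sends a determination result.
    """
    try:
        answering_id = results.get(timeout=timeout_seconds)
    except queue.Empty:
        # No result arrived before the timeout: stop waiting and stay
        # out of the wake-up state, as for device 11 in the example above.
        return False
    return answering_id == own_id
```

A device that received a result naming itself wakes; a device that received nothing, or a result naming another device, does not.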

In some embodiments, the multiple voice wake-up devices in the voice wake-up system may determine the denoised pre-wake-up word audio energy based on the pre-wake-up word audio energy and send the denoised pre-wake-up word audio energy to the master device 200. That is, the master device 200 may use the denoised pre-wake-up word audio energy to determine the answering device.

In some embodiments, the master device 200 shown in FIG. 9 may also be replaced by a cloud server. That is, the cloud server may determine the answering device according to the pre-wake-up word audio energy and send the answering-device determination result to the multiple voice wake-up devices in the voice wake-up system.

It can be seen from the method shown in FIG. 9 that each voice wake-up device in the voice wake-up system can send its pre-wake-up word audio energy to the master device 200 upon detecting the pre-wake-up voice. The master device 200 can use the pre-wake-up word audio energy to determine the answering device. Compared with determining the answering device only after the wake-up voice is detected, the above method moves the process of determining the answering device earlier. Each voice wake-up device may therefore obtain the answering-device determination result before it detects the wake-up voice. The voice wake-up device indicated by the determination result can enter the wake-up state immediately after detecting the wake-up voice. The above method not only improves the response speed of the voice wake-up device after the wake-up voice is detected; because the answering device responds only once the wake-up voice has actually been detected, the wake-up rate is also not affected. In a scenario with multiple voice wake-up devices that share the same wake-up word, this can effectively improve the user's experience of the voice wake-up function.

Please refer to FIG. 10, which exemplarily shows a schematic diagram of a voice wake-up system 1000 provided by an embodiment of the present application.

As shown in FIG. 10, in some embodiments the voice wake-up system 1000 may include one or more voice wake-up devices (for example, voice wake-up device 10, voice wake-up device 11, and so on). In other embodiments, in addition to the one or more voice wake-up devices, the voice wake-up system 1000 may further include the master device 200. In still other embodiments, in addition to the one or more voice wake-up devices, the voice wake-up system 1000 may further include a cloud server 201. In other words, the master device 200 and the cloud server 201 are optional.

For example, the voice wake-up system 1000 may include one or more voice wake-up devices. According to the method shown in FIG. 5 above, the one or more voice wake-up devices may determine one voice wake-up device to respond to the user once the wake-up voice is detected.

Optionally, the voice wake-up system 1000 may include one or more voice wake-up devices and the master device 200. According to the method shown in FIG. 9 above, the voice wake-up system 1000 may determine one voice wake-up device to respond to the user once the wake-up voice is detected. The master device 200 may itself also be a voice wake-up device.

Optionally, the voice wake-up system 1000 may include one or more voice wake-up devices and the cloud server 201. According to the method shown in FIG. 9 above, the voice wake-up system 1000 may determine one voice wake-up device to respond to the user once the wake-up voice is detected.

It should be noted that, provided no contradiction or conflict arises, any feature in any embodiment of the present application, or any part of any feature, may be combined, and the resulting technical solutions also fall within the scope of the embodiments of the present application.

The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (16)

1. A device wake-up method, the method comprising:
a first electronic device detects a first pre-wake-up voice containing a pre-wake-up word, and obtains first audio energy according to the first pre-wake-up voice;
the first electronic device receives M audio energies sent by M electronic devices, wherein one audio energy of the M audio energies is obtained by one electronic device of the M electronic devices according to a detected pre-wake-up voice containing the pre-wake-up word, and M is a positive integer;
the first electronic device determines, according to the first audio energy and the M audio energies, that the first electronic device is the device for responding;
when a first wake-up voice containing a wake-up word is detected, a first application in the first electronic device enters a wake-up state;
wherein the pre-wake-up word is a part of the wake-up word, and the first application is configured to detect a voice instruction in the wake-up state and, in response, execute an operation corresponding to the voice instruction.
2. The method of claim 1, wherein after the first electronic device detects the first pre-wake-up voice containing the pre-wake-up word, the method further comprises:
when it is detected that the collected sound does not contain the wake-up word, the first application in the first electronic device does not enter the wake-up state.
3. The method according to claim 1 or 2, wherein the method further comprises:
the first electronic device detects a second pre-wake-up voice containing the pre-wake-up word, and obtains second audio energy according to the second pre-wake-up voice;
the first electronic device receives K audio energies sent by K electronic devices, wherein one audio energy of the K audio energies is obtained by one electronic device of the K electronic devices according to a detected pre-wake-up voice containing the pre-wake-up word, and K is a positive integer;
the first electronic device determines, according to the second audio energy and the K audio energies, that a second electronic device of the K electronic devices is the device for responding;
when the second electronic device is determined to be the device for responding, the first application in the first electronic device does not enter the wake-up state.
4. The method according to claim 3, wherein the method further comprises:
when the second electronic device is determined to be the device for responding, the first electronic device sends a first message to the second electronic device, wherein the first message comprises a first result, the first result indicates that the second electronic device is the device for responding, and the first message instructs the second electronic device to cause the first application in the second electronic device to enter the wake-up state after the second electronic device detects a wake-up voice containing the wake-up word.
5. The method according to any one of claims 1-3, wherein after the first audio energy is obtained according to the first pre-wake-up voice, the method further comprises:
the first electronic device sends the first audio energy to the M electronic devices.
6. The method according to any one of claims 1-5, further comprising:
the first electronic device collects a first sound, wherein the first sound does not contain the pre-wake-up word;
the first electronic device obtains third audio energy according to the first sound;
the first electronic device obtains fourth audio energy according to the first pre-wake-up voice;
wherein obtaining the first audio energy according to the first pre-wake-up voice specifically comprises:
the first electronic device subtracts the third audio energy from the fourth audio energy to obtain the first audio energy.
7. A device wake-up method, wherein the method is applied to a voice wake-up system, the voice wake-up system comprises H electronic devices, the H electronic devices comprise a first electronic device, and H is a positive integer greater than 1, the method comprising:
the first electronic device detects a first pre-wake-up voice containing a pre-wake-up word, and obtains first audio energy according to the first pre-wake-up voice;
H1 electronic devices of the H electronic devices send H1 audio energies to the first electronic device, wherein the H1 electronic devices do not include the first electronic device, one audio energy of the H1 audio energies is obtained by one electronic device of the H1 electronic devices according to a detected pre-wake-up voice containing the pre-wake-up word, and H1 is a positive integer smaller than H;
the first electronic device determines, according to the first audio energy and the H1 audio energies, that the first electronic device is the device for responding;
when a first wake-up voice containing a wake-up word is detected, a first application in the first electronic device enters a wake-up state;
wherein the pre-wake-up word is a part of the wake-up word, and the first application is configured to detect a voice instruction in the wake-up state and, in response, execute an operation corresponding to the voice instruction.
8. The method of claim 7, wherein the method further comprises:
the first application in each of the H1 electronic devices does not enter the wake-up state.
9. The method of claim 7 or 8, wherein after the first electronic device detects the first pre-wake-up voice containing the pre-wake-up word, the method further comprises:
when it is detected that the collected sound does not contain the wake-up word, the first application in the first electronic device does not enter the wake-up state.
10. The method according to any one of claims 7-9, further comprising:
the first electronic device detects a second pre-wake-up voice containing the pre-wake-up word, and obtains second audio energy according to the second pre-wake-up voice;
H2 electronic devices of the H electronic devices send H2 audio energies to the first electronic device, wherein the H2 electronic devices do not include the first electronic device, one audio energy of the H2 audio energies is obtained by one electronic device of the H2 electronic devices according to a detected pre-wake-up voice containing the pre-wake-up word, and H2 is a positive integer smaller than H;
the first electronic device determines, according to the second audio energy and the H2 audio energies, that a second electronic device of the H2 electronic devices is the device for responding;
when a second wake-up voice containing the wake-up word is detected, the first application in the second electronic device enters the wake-up state, and neither the first application in the first electronic device nor the first application in each of (H2-1) electronic devices enters the wake-up state, wherein the (H2-1) electronic devices are the devices of the H2 electronic devices other than the second electronic device.
11. The method according to claim 10, wherein the first application in the second electronic device entering the wake-up state when the second wake-up voice containing the wake-up word is detected specifically comprises:
the first electronic device sends a first message to the second electronic device, wherein the first message comprises a first result, and the first result indicates that the second electronic device is the device for responding;
based on the first message, when the second wake-up voice is detected, the first application in the second electronic device enters the wake-up state.
12. The method according to claim 10, wherein the method further comprises:
the first electronic device sends the second audio energy to the second electronic device;
the (H2-1) electronic devices send (H2-1) audio energies to the second electronic device, wherein the (H2-1) audio energies are the audio energies, among the H2 audio energies, obtained by the (H2-1) electronic devices;
the second electronic device determines that the second electronic device is the device for responding according to the second audio energy, the (H2-1) audio energies, and fifth audio energy, wherein the fifth audio energy is obtained by the second electronic device according to a detected third pre-wake-up voice containing the pre-wake-up word, and the fifth audio energy is included in the H2 audio energies.
13. The method according to any one of claims 7-12, further comprising:
the first electronic device collects a first sound, wherein the first sound does not contain the pre-wake-up word;
the first electronic device obtains third audio energy according to the first sound;
the first electronic device obtains fourth audio energy according to the first pre-wake-up voice;
wherein obtaining the first audio energy according to the first pre-wake-up voice specifically comprises:
the first electronic device subtracts the third audio energy from the fourth audio energy to obtain the first audio energy.
14. An electronic device, comprising a microphone, a communication apparatus, a memory, and a processor, wherein the microphone is configured to collect sound, the memory is configured to store a computer program, and the processor is configured to invoke the computer program to cause the electronic device to perform the method of any one of claims 1-6.
15. A computer-readable storage medium comprising instructions which, when run on an electronic device, cause the electronic device to perform the method of any one of claims 1-6.
16. A computer program product comprising computer instructions which, when run on an electronic device, cause the electronic device to perform the method of any one of claims 1-6.
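As an illustration of the subtraction recited in claims 6 and 13, the following minimal sketch computes the first audio energy as the fourth audio energy (from the pre-wake-up voice) minus the third audio energy (from a sound that does not contain the pre-wake-up word). The claims do not define how audio energy is computed; mean squared amplitude is an assumption made here, and the function names `mean_energy` and `compensated_energy` are hypothetical.

```python
from typing import Sequence


def mean_energy(samples: Sequence[float]) -> float:
    """Mean squared amplitude; an assumed definition of 'audio energy'."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def compensated_energy(pre_wake: Sequence[float],
                       ambient: Sequence[float]) -> float:
    """First audio energy = fourth audio energy - third audio energy."""
    fourth = mean_energy(pre_wake)  # energy of the pre-wake-up voice
    third = mean_energy(ambient)    # energy of sound without the pre-wake-up word
    return fourth - third


# Usage with made-up samples.
first_audio_energy = compensated_energy([2.0, 2.0], [1.0, 1.0])
```

Subtracting the ambient term presumably makes energies reported by devices in differently noisy surroundings more comparable, though the claims themselves state only the subtraction.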
CN202210075546.8A 2022-01-22 2022-01-22 Equipment awakening method, related device and communication system Pending CN116524919A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210075546.8A CN116524919A (en) 2022-01-22 2022-01-22 Equipment awakening method, related device and communication system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210075546.8A CN116524919A (en) 2022-01-22 2022-01-22 Equipment awakening method, related device and communication system

Publications (1)

Publication Number Publication Date
CN116524919A true CN116524919A (en) 2023-08-01

Family

ID=87392678

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210075546.8A Pending CN116524919A (en) 2022-01-22 2022-01-22 Equipment awakening method, related device and communication system

Country Status (1)

Country Link
CN (1) CN116524919A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117116263A (en) * 2023-09-15 2023-11-24 广州易云信息技术有限公司 Intelligent robot awakening method and device based on voice recognition and storage medium
CN117116263B (en) * 2023-09-15 2024-04-12 广州易云信息技术有限公司 Intelligent robot awakening method and device based on voice recognition and storage medium

Similar Documents

Publication Publication Date Title
CN110784830B (en) Data processing method, Bluetooth module, electronic device and readable storage medium
CN110138959B (en) Method for displaying prompt of human-computer interaction instruction and electronic equipment
WO2021052263A1 (en) Voice assistant display method and device
CN114255745A (en) Man-machine interaction method, electronic equipment and system
WO2020073288A1 (en) Method for triggering electronic device to execute function and electronic device
WO2022143258A1 (en) Voice interaction processing method and related apparatus
CN115312068B (en) Voice control method, device and storage medium
WO2022088964A1 (en) Control method and apparatus for electronic device
CN115083401A (en) Voice control method and device
WO2022161077A1 (en) Speech control method, and electronic device
EP4293664A1 (en) Voiceprint recognition method, graphical interface, and electronic device
CN115206308A (en) Man-machine interaction method and electronic equipment
CN113488042A (en) Voice control method and electronic equipment
CN114765026A (en) Voice control method, device and system
CN113380240B (en) Voice interaction method and electronic device
CN116524919A (en) Equipment awakening method, related device and communication system
CN115981454A (en) Non-contact gesture control method and electronic equipment
WO2020253694A1 (en) Method, chip and terminal for music recognition
WO2023006001A1 (en) Video processing method and electronic device
CN114299923B (en) Audio identification method, device, electronic equipment and storage medium
CN117119102B (en) Awakening method and electronic device for voice interaction function
EP4425483A1 (en) Voice interaction method and related apparatus
WO2024139974A1 (en) Interaction method, electronic device, and medium
WO2025055617A1 (en) Voice interaction method and related device
CN119232837A (en) Voice control method, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination