WO2020215736A1 - Voice recognition device and wake-up response method therefor, and computer storage medium - Google Patents

Info

Publication number
WO2020215736A1
WO2020215736A1 PCT/CN2019/123811 CN2019123811W WO2020215736A1 WO 2020215736 A1 WO2020215736 A1 WO 2020215736A1 CN 2019123811 W CN2019123811 W CN 2019123811W WO 2020215736 A1 WO2020215736 A1 WO 2020215736A1
Authority
WO
WIPO (PCT)
Prior art keywords
response factor
response
voice recognition
wake
central device
Prior art date
Application number
PCT/CN2019/123811
Other languages
English (en)
French (fr)
Inventor
何瑞澄
Original Assignee
广东美的白色家电技术创新中心有限公司
美的集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广东美的白色家电技术创新中心有限公司, 美的集团股份有限公司 filed Critical 广东美的白色家电技术创新中心有限公司
Priority to EP19926004.3A priority Critical patent/EP3944231A4/en
Priority to JP2021562155A priority patent/JP7279992B2/ja
Priority to KR1020217033362A priority patent/KR20210141581A/ko
Publication of WO2020215736A1 publication Critical patent/WO2020215736A1/zh
Priority to US17/452,223 priority patent/US20220044685A1/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G10L17/24 Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/088 Word spotting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/84 Detection of presence or absence of voice signals for discriminating voice from noise
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Definitions

  • This application relates to the field of voice wake-up, and in particular to a wake-up response method of a voice recognition device, a voice recognition device and a computer storage medium.
  • multiple voice recognition devices may be awakened by the same voice signal from the user and respond at the same time.
  • the user obviously intends to wake up only one voice recognition device.
  • the simultaneous wake-up and response of multiple voice recognition devices causes mutual interference between them.
  • the sound broadcast by one voice recognition device in response to the voice signal will be received and responded to by another voice recognition device, and vice versa; this is the mutual interference problem.
  • the present application provides a wake-up response method for a voice recognition device, a voice recognition device, and a computer storage medium, so as to solve the mutual interference problem caused by multiple voice recognition devices responding to the wake-up voice at the same time in the prior art.
  • this application provides a wake-up response method for a voice recognition device.
  • Multiple voice recognition devices form a regional network.
  • the multiple voice recognition devices are divided into a central device and at least one non-central device;
  • the wake-up response method includes: the central device analyzes the collected voice signal to obtain the response factor of the central device; receives the response factor of the non-central device, which the non-central device obtains by analyzing the voice signal it collected; and compares the response factor of the central device with the response factor of the non-central device to determine the voice recognition device to respond, which is the voice recognition device in the regional network that responds to the voice signal.
  • this application provides a wake-up response method for a voice recognition device.
  • Multiple voice recognition devices form a regional network.
  • the multiple voice recognition devices are divided into a central device and at least one non-central device; the wake-up response method includes:
  • the non-central device analyzes the collected voice signal to obtain the response factor of the non-central device, and sends that response factor to the central device, so that the central device compares the response factor of the non-central device with the response factor of the central device to determine the voice recognition device to respond, which is the voice recognition device in the regional network that responds to the voice signal.
  • the present application provides a voice recognition device, which includes a processor and a memory, a computer program is stored in the memory, and the processor is used to execute the computer program to implement the steps of the wake-up response method.
  • this application provides a computer storage medium in which a computer program is stored, and when the computer program is executed, the steps of the above wake-up response method are realized.
  • multiple voice recognition devices form a regional network, where the voice recognition devices all collect voice signals, and analyze the collected voice signals to obtain response factors.
  • the multiple voice recognition devices are divided into a central device and at least one non-central device.
  • the central device obtains its own response factor and receives the response factor of the non-central device; it then compares its own response factor with the response factor of the non-central device to determine the voice recognition device to respond, which is the voice recognition device in the regional network that responds to the voice signal.
  • the voice recognition devices that form the regional network do not respond immediately after being awakened by the voice signal.
  • the central device first determines which one should respond, so as to avoid the mutual interference caused by multiple voice recognition devices responding.
  • Figure 1 is a schematic diagram of the structure of a network formed by interconnecting voice recognition devices of the present application;
  • Figure 2 is a schematic flow diagram of the application of the wake-up response method of the voice recognition device of the present application in a single area network;
  • Figure 3 is a schematic flow diagram of the application of the wake-up response method of the voice recognition device of the present application in a multi-area network;
  • Figure 4 is a schematic diagram of the work flow on the hub device side of the wake-up response method of the voice recognition device of the present application;
  • Figure 5 is a schematic diagram of the work flow on the non-central device side of the wake-up response method of the voice recognition device of the present application;
  • Figure 6 is a schematic structural diagram of an embodiment of a voice recognition device of the present application;
  • Figure 7 is a schematic structural diagram of an embodiment of a computer storage medium of the present application.
  • the wake-up response method of the present application is applied to the situation where multiple voice recognition devices can respond to the same voice signal.
  • there are voice recognition devices such as televisions, air conditioners, and refrigerators in the living room area, and voice recognition devices such as refrigerators, microwave ovens, kettles, and rice cookers in the kitchen area.
  • the response sound of household appliance A may be received and responded to by household appliance B, which may cause mutual interference between the household appliances so that they fail to respond to the user's needs normally.
  • both areas can receive the voice signal and respond to the voice signal, and the problem of mutual interference may also occur.
  • the voice recognition device of the present application works in a wake-first, respond-later mode; that is, it is first awakened by a voice signal sent by the user and then responds to the voice signal.
  • this application introduces a selection determination mechanism between wake-up and response, that is, after being awakened by a voice signal, it does not respond temporarily, and then responds when it is determined that a response is needed.
  • multiple voice recognition devices are connected to each other to form a regional network.
  • One voice recognition device is used as the hub device in the regional network.
  • the hub device determines which voice recognition device in the regional network responds to the voice signal.
  • the hub device of each area network first determines the voice recognition device to respond to the voice signal in its area network. After that, a first hub device among all the hub devices determines which area network's candidate voice recognition device actually responds, thereby solving the problem of mutual interference caused by multiple voice recognition devices responding to the voice signal.
  • In household appliance applications, since the central device must be able to react to the user's voice signal at any time in order to determine the device that responds, a household appliance that stays connected to the power source for long periods and is rarely powered off is generally selected; a household appliance with an interactive screen is preferred as the network hub device, because related settings can be made through the interactive screen.
  • the refrigerator serves as a central device.
  • the home appliances in each area, such as the living room area or the kitchen area, can form an area network.
  • the area network corresponds to the division of areas. In terms of network connection, it does not necessarily form a separate network; that is, the home appliances in all areas of a household may be connected to each other to form a single whole-home appliance network.
  • the network constituted in this application includes, but is not limited to, local area networks built on a Wi-Fi wireless network, a wired network, Bluetooth mesh, Zigbee, RS485, LoRa, 1394, CAN, and so on.
  • the communication mechanism of the formed network includes but is not limited to UDP, TCP/IP, HTTP, MQTT, CoAP, etc., to ensure that each voice recognition device on the same network can quickly and reliably exchange information.
  • the following describes the wake-up response method starting from the network formed by the voice recognition device.
  • FIG. 1 is a schematic diagram of the structure of a network formed by interconnecting voice recognition devices of this application.
  • the area in Figure 1 is divided into living room area A, kitchen area B, and bedroom area C. In living room area A, the voice recognition devices include refrigerator A1, TV A2, and air purifier A3; in kitchen area B, range hood B1, rice cooker B2, and wall breaker B3; in bedroom area C, air conditioner C1 and humidifier C2. All voice recognition devices are connected to form a network, and the voice recognition devices in each area also form a regional network.
  • the voice devices in each regional network are divided into a central device and at least one non-central device, and the central device determines the voice recognition device to respond to the voice signal in the local network.
  • the hub devices of all regional networks are further divided into a first hub device and at least one second hub device. The first hub device determines which regional network's voice recognition device responds to the voice signal.
  • voice devices in the local area network are not only divided into hub devices and non-central devices, but also have a wake-up priority.
  • the wake-up priority can be set by the manufacturer when the voice recognition device is shipped from the factory; after the network is formed, the voice recognition device with the highest wake-up priority automatically serves as the central device of the regional network. The wake-up priority can also be set when the network is constructed, either by the user or by the service provider who builds the network; the voice recognition device with the highest set wake-up priority is then the central device of the network.
  • the priority of living room area A is A1>A2>A3
  • the priority of kitchen area B is B1>B2>B3
  • the priority of bedroom area C is C1>C2; A1, B1, and C1 respectively serve as the central devices of their respective regional networks.
  • A1 is the first hub device
  • B1 and C1 are the second hub devices.
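  • The priority-based choice of central devices in the Figure 1 example can be sketched as follows. This is a minimal illustration, not the application's implementation: the `REGIONS` table, the `online` set used to model a failed device, and the rule that the first hub is taken from a fixed region order are all assumptions introduced here.

```python
# Hypothetical data model: region -> devices listed from highest to lowest
# wake-up priority, mirroring the Figure 1 example (A1>A2>A3, B1>B2>B3, C1>C2).
REGIONS = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2", "B3"],
    "C": ["C1", "C2"],
}


def elect_hubs(regions, online=None):
    """Return {region: hub}, picking the highest-priority reachable device."""
    hubs = {}
    for region, ordered in regions.items():
        candidates = [d for d in ordered if online is None or d in online]
        if candidates:
            hubs[region] = candidates[0]
    return hubs


def elect_first_hub(hubs, region_order):
    """Pick the first hub among the regional hubs.

    Assumption: regions are ranked by a fixed order; the source only states
    that A1 is the first hub in its example.
    """
    for region in region_order:
        if region in hubs:
            return hubs[region]
    return None


hubs = elect_hubs(REGIONS)
first_hub = elect_first_hub(hubs, ["A", "B", "C"])
print(hubs, first_hub)
```

  • Passing `online={"A2", "A3", "B1", "C1"}` models the case where A1 has failed, in which case A2 takes over as the hub of area A.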
  • Figure 1 can realize wake-up response in a single area and wake-up response in multiple areas.
  • Figure 2 is a schematic flow diagram of the application of the wake-up response method of the voice recognition device of this application on a single area network;
  • Figure 3 is a schematic flow diagram of the application of the wake-up response method of the voice recognition device of this application on a multi-area network.
  • the implementation of the wake-up response method in a single area network includes the following steps.
  • S201: The voice recognition device analyzes the collected voice signal to obtain a response factor.
  • the voice recognition device mainly performs two actions, collection and analysis. After the user, the signal source, sends out the voice signal, the voice recognition device can collect the voice signal. Because each voice recognition device has a different relative position with the user, the voice signal it collects is also different. Among them, the voice recognition equipment far away from the user may not be able to collect voice signals even in the local area network.
  • the voice recognition devices analyze the voice signals collected by each.
  • all voice recognition devices in each regional network have the same voice signal analysis mechanism to facilitate subsequent comparison calculations.
  • the voice signal is analyzed and calculated to obtain a response factor.
  • the response factor indicates how well the voice signal corresponds to the voice recognition device, that is, how likely it is that the voice signal was addressed to that voice recognition device.
  • the response factor includes the identification of the voice recognition device and the energy value used for judgment.
  • the energy value of the response factor can be calculated from the voice characteristics of the voice signal and from the matching degree between the voice signal and the wake-up template in the voice recognition device.
  • the voice feature can be the volume of the voice signal: the larger the volume, the closer the user is to the voice recognition device. The higher the matching degree with the wake-up template in the voice recognition device, the more likely it is that the user's voice signal is directed at that voice recognition device.
  • the calculation method of the response factor energy value can be as follows:
  • the wake-up energy E1 is calculated based on the voice characteristics of the voice signal;
  • the noise floor energy E2 is calculated based on the voice characteristics of the environmental noise in the environment where the voice recognition device is located;
  • the confidence P represents the matching degree between the voice signal and the wake-up template.
  • When the voice recognition device is awakened by the voice signal, it judges the matching degree between the voice signal and the wake-up template; for example, a perfect match is recorded as 100%, and partial matches can be recorded as 90%, 80%, or 70%.
  • when the degree of matching exceeds a certain threshold, it is determined that the voice recognition device can be awakened.
  • the confidence P used when calculating the response factor energy value corresponds to the matching degree between the voice signal and the wake-up template at wake-up; for example, P can be 1, 0.9, 0.8, 0.7, etc.
  • K = xE + yP, where K is the energy value of the response factor, x is the weight coefficient of the effective energy E, and y is the weight coefficient of the confidence P.
  • the weight coefficients x and y can be fixed values, or can be changed among multiple sets of fixed values, and can also be changed and adjusted according to the final accuracy of the speech recognition device responding to the speech signal.
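  • The weighted sum can be illustrated with a short sketch. The excerpt does not spell out how the effective energy E is obtained from E1 and E2; the code assumes E = E1 - E2 (wake-up energy minus noise floor). The default weights x = y = 0.5, the normalized input scale, and the `ResponseFactor` type are likewise hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseFactor:
    device_id: str  # identification of the voice recognition device
    energy: float   # energy value K used for comparison


def energy_value(e1, e2, p, x=0.5, y=0.5):
    """Compute K = x*E + y*P.

    Assumption: the effective energy is E = E1 - E2, i.e. the wake-up
    energy with the noise floor removed. x and y are illustrative weights.
    """
    e = e1 - e2
    return x * e + y * p


def response_factor(device_id, e1, e2, p, x=0.5, y=0.5):
    """Bundle the device identification with its energy value."""
    return ResponseFactor(device_id, energy_value(e1, e2, p, x, y))


# Example with made-up, normalized inputs.
k2 = response_factor("A2", e1=1.0, e2=0.25, p=0.5)
print(k2)
```

  • Because x and y are plain parameters, the same function covers fixed weights, switching between several preset weight sets, or weights adjusted over time.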
  • the energy value of the response factor obtained by the device A1 is recorded as K1
  • the energy value of the response factor obtained by the device A2 is recorded as K2
  • the energy value of the response factor obtained by the device A3 is recorded as K3.
  • the central device analyzes the voice signal it collected to obtain the response factor of the central device; the non-central device analyzes the voice signal it collected to obtain the response factor of the non-central device.
  • S202: The central device receives the response factor of the non-central device.
  • After a voice recognition device calculates its response factor, each non-central device sends the response factor it obtained to the central device.
  • the central device A1 receives the response factor sent by the non-central device.
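  • The source lists UDP, TCP/IP, HTTP, MQTT, and CoAP as possible transports but does not define a message format. As one possible sketch, a response factor (device identification plus energy value) could be serialized as JSON before being handed to whichever transport is used; the field names `device_id` and `energy` are assumptions introduced here.

```python
import json


def encode_response_factor(device_id, energy):
    """Serialize a response factor into bytes for sending to the central device."""
    message = {"device_id": device_id, "energy": energy}
    return json.dumps(message).encode("utf-8")


def decode_response_factor(payload):
    """Parse the bytes received by the central device back into (id, energy)."""
    message = json.loads(payload.decode("utf-8"))
    return message["device_id"], float(message["energy"])


payload = encode_response_factor("A2", 0.85)
device_id, energy = decode_response_factor(payload)
print(device_id, energy)
```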
  • S203 The central device compares the response factor of the central device with the response factor of the non-central device, and determines the voice recognition device to be responded to.
  • the central device compares the response factor of the central device with the response factor of the non-central device, so as to determine the voice recognition device in the area network that responds to the voice signal.
  • the central device uses a sorting algorithm to compare the energy values of the response factors and obtains the ordering of all energy values, thereby obtaining the response factor with the largest energy value.
  • Sorting algorithms include, but are not limited to, insertion sort, Shell sort, selection sort, heap sort, bubble sort, quick sort, merge sort, counting sort, bucket sort, radix sort, etc.
  • the order of the response factor energy value is K2>K1>K3.
  • the voice recognition device to respond can then be determined. There are several ways to make this determination.
  • once the response factor with the largest energy value is obtained, the corresponding voice recognition device can be determined to be the voice recognition device to respond.
  • if the response factor with the largest energy value is the response factor of the central device, the central device is determined to be the voice recognition device to respond.
  • two or more response factors may share the largest energy value.
  • in that case, the device that responds to the voice signal is determined based on the wake-up priority of the voice recognition devices; that is, among the voice recognition devices corresponding to the response factors with the largest energy value, the one with the highest priority is determined as the voice recognition device to respond.
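  • The comparison in S203, including the priority tie-break, might look like the sketch below. It uses Python's built-in `max` instead of a full sort, which yields the same largest energy value; treating a tie as exact equality of energy values is an assumption, and the function and variable names are illustrative.

```python
def choose_responder(factors, priority):
    """Pick the device to respond within one regional network.

    factors:  {device_id: energy value K} for every awakened device.
    priority: device ids ordered from highest to lowest wake-up priority.
    """
    if not factors:
        return None
    top = max(factors.values())
    tied = [device for device, k in factors.items() if k == top]
    # Tie-break by wake-up priority: a lower index means a higher priority.
    return min(tied, key=priority.index)


priority = ["A1", "A2", "A3"]
winner = choose_responder({"A1": 0.6, "A2": 0.8, "A3": 0.3}, priority)
print(winner)
```

  • With the energy values above (K2 > K1 > K3), A2 is selected; if A1 and A2 had the same largest energy value, A1 would be selected because of its higher wake-up priority.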
  • S204: The hub device sends the non-central device a notification of whether to respond to the voice signal.
  • the hub device After the hub device determines the voice recognition device to respond to the voice signal, it can send a notification of whether to respond to the voice signal to the non-central device, that is, to all voice recognition devices that have been awakened but have not responded to the voice signal through the network.
  • the notification may be an explicit instruction to respond or not to respond, or it may be the device information of the voice recognition device determined to respond to the voice signal. It is also possible to send a notification only to the voice recognition device to respond; the device that receives the notification responds, and the other voice recognition devices, which receive no notification, do not.
  • S205 The voice recognition device to be responded responds to the voice signal.
  • the identified voice recognition device can respond to the voice signal, while other voice recognition devices do not. It is ensured that only one voice recognition device responds to the voice signal without causing mutual interference.
  • the method shown in Figure 2 above is applied to voice wake-up recognition in a single area network. After the voice recognition devices in the single area network are awakened by the voice signal, they do not respond immediately; a device responds only after the central device of the single area network has determined the responding device.
  • a multi-area network is a plurality of interconnected area networks.
  • the hub devices of the area networks are connected to each other and are divided into a first hub device and at least one second hub device. After each area network determines its voice recognition device to respond, the first hub device further confirms which voice recognition device responds to the voice signal.
  • the steps for implementing the wake-up response method for each regional network in the multi-regional network will not be repeated. Please also refer to Fig. 3.
  • the wake-up response method of the multi-regional network further includes the following steps.
  • S301: The second central device sends a second response factor to the first central device, and the first central device receives the second response factor.
  • the first hub device needs to compare the response factors of the voice recognition devices to be responded in all regional networks to determine the voice recognition device that responds to the voice signal.
  • in a single area network, the voice recognition device to respond, once determined, is the device that responds to the voice signal. In a multi-area network, the voice recognition device to respond that a single area network determines does not respond immediately; instead, the first central device receives the response factors of the multiple candidate voice recognition devices and confirms which one responds to the voice signal, that is, it determines the final voice recognition device that responds. Therefore, in step S301, the second central device sends its second response factor to the first central device.
  • the second response factor is the response factor of the voice recognition device to be responded in the area where the second central device is located.
  • in area A, A1 compares KA1, KA2, and KA3 and determines that the voice recognition device to respond is A2; in area B, B1 compares KB1, KB2, and KB3 and determines that the voice recognition device to respond is B3; in area C, C1 compares KC1 and KC2 and determines that the voice recognition device to respond is C1.
  • B1 sends the response factor KB3 of the voice recognition device B3 to be responded in its local area network to A1, and C1 also sends the response factor KC1 to A1, and the response factor of the voice recognition device A2 determined by A1 itself is KA2.
  • S302: The first central device compares the second response factor with the first response factor, and determines a voice recognition device that responds to the voice signal.
  • the first hub device compares the response factor of each voice recognition device to be responded, that is, the first response factor and the second response factor, and the first response factor is the response factor of the voice recognition device to be responded in the local network where the first hub device is located.
  • the energy value of the first response factor and the energy value of the second response factor may be compared to obtain the response factor with the largest energy value; it is determined that the voice recognition device corresponding to the response factor with the largest energy value responds to the voice signal.
  • the first central device compares the energy value of the first response factor with the energy value of the second response factor to obtain the response factor with the largest energy value. If the response factor with the largest energy value is the first response factor, the first central device responds to the voice signal. If the response factor with the largest energy value is the second response factor, the energy difference between the response factor with the largest energy value and the first response factor is calculated and compared with the wake-up threshold: if the energy difference is greater than the wake-up threshold, the voice recognition device corresponding to the response factor with the largest energy value responds to the voice signal; if the energy difference is less than or equal to the wake-up threshold, the first central device responds to the voice signal.
  • A1 compares KA2, KB3, and KC1, thereby determining the voice recognition device that responds to the voice signal, for example, B3.
  • two or more response factors may share the largest energy value.
  • in that case, the device that responds to the voice signal is further determined according to the wake-up priority of the voice recognition devices; that is, among the voice recognition devices corresponding to the response factors with the largest energy value, the one with the highest priority is determined as the voice recognition device to respond.
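  • The threshold rule used by the first central device can be sketched as below. Two points are interpretive assumptions: where the text says the first central device responds, the sketch returns the device associated with the first response factor; and ties among second response factors are resolved by list order rather than by wake-up priority. The function name and the threshold values are illustrative.

```python
def choose_final_responder(first, seconds, wake_threshold):
    """Decide which regional candidate finally responds.

    first:   (device_id, energy) for the first hub's own regional candidate.
    seconds: list of (device_id, energy) sent by the second hub devices.
    """
    first_id, first_k = first
    # max() keeps the earliest maximum, so the first factor wins exact ties.
    best_id, best_k = max([first] + list(seconds), key=lambda f: f[1])
    if best_id == first_id:
        return first_id
    # A candidate from another region must beat the first response factor
    # by more than the wake-up threshold to take over the response.
    if best_k - first_k > wake_threshold:
        return best_id
    return first_id


final = choose_final_responder(
    first=("A2", 0.70),
    seconds=[("B3", 0.90), ("C1", 0.40)],
    wake_threshold=0.1,
)
print(final)
```

  • With a larger threshold such as 0.3, the same inputs keep the response in the first hub's region, which biases the decision toward the first central device when the energy values are close.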
  • S303: The first hub device sends a notification of whether to respond to the voice signal to the other voice recognition devices in the multi-area network.
  • After the first hub device determines the voice recognition device that responds to the voice signal, it can send notifications directly to the entire network, that is, to all the regional networks; or it can first send notifications to the hub device of each regional network, and each hub device then sends notifications to its non-hub devices. Similarly, the notification can be sent only to the voice recognition device that responds to the voice signal, and the other devices, which receive no notification, do not respond.
  • S304 The determined voice recognition device responds to the voice signal.
  • This step S304 is similar to the above step S205, and will not be described again.
  • the method shown in Figure 3 is applied to multi-area voice wake-up recognition. After each area determines the voice recognition device that should respond in that area, the first central device further determines which area's device responds, so as to ensure that only one voice recognition device responds to the voice signal.
  • the voice recognition devices have a wake-up priority sequence, so when the voice recognition device with the highest priority fails, the voice recognition device with the next wake-up priority can be determined according to the sequence and serves as the hub device or the first hub device.
  • a voice recognition device can periodically check whether it has the highest wake-up priority in the regional network, or check this whenever the regional network changes; in response to detecting that it has the highest wake-up priority in the current regional network, it operates as a hub device.
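  • A device's self-check could be as simple as the following sketch. How a device learns which peers are currently reachable is not described in the source, so the `online` set is a hypothetical input; the check would be run on a timer or whenever the regional network changes.

```python
def should_act_as_hub(self_id, priority, online):
    """Return True if self_id has the highest wake-up priority among online devices.

    priority: device ids ordered from highest to lowest wake-up priority.
    online:   set of device ids currently reachable (assumed to be known).
    """
    for device in priority:
        if device in online:
            # The first reachable device in priority order is the hub.
            return device == self_id
    return False


priority = ["A1", "A2", "A3"]
print(should_act_as_hub("A2", priority, {"A2", "A3"}))        # A1 has failed
print(should_act_as_hub("A2", priority, {"A1", "A2", "A3"}))  # A1 is present
```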
  • the wake-up response method implemented in the network of this embodiment is based on the fact that the voice recognition device in the network has a wake-up priority order, and the voice recognition device as a network hub device can compare response factors. Therefore, the voice recognition device newly added to the network also needs to comply with the wake-up mechanism of this embodiment, which can be set by the hub device.
  • The hub device can obtain the device information of a voice recognition device that joins the network and analyze it according to preset rules in order to re-rank the wake-up priorities of the voice recognition devices in the network.
  • Each voice recognition device is equipped with a voice recognition system, which determines the wake-up priority, voice recognition algorithm, wake-up template, and so on. If a newly added voice recognition device has a different voice recognition system, that is, a different wake-up priority setting, the network hub device can re-rank the devices according to that device's own wake-up priority setting. For example, in the network A1-A2-A3, if newly added device A4 has a wake-up priority set higher than that of A3, the wake-up priority can be re-ranked as A1>A2>A4>A3.
  • If the newly added voice recognition device has the same voice recognition system, that is, the same wake-up priority setting, the device that joined the network first is given the higher wake-up priority.
  • For example, if newly added device A3 has the same voice recognition system as the existing A3, the existing A3 is treated as A31 and the new one as A32, and the wake-up priority is re-ranked as A1>A2>A31>A32.
  • A voice recognition device can play two roles: it can operate as a hub device, or it can operate as a non-hub device.
  • Each voice recognition device can serve as a hub device, with more numerous and more powerful functions, or only as a non-hub device, with lightweight functions.
  • For large household appliances, such as refrigerators and televisions, a more powerful voice recognition system can be loaded so that the appliance can serve as a hub device; for small household appliances, such as rice cookers and electric kettles, a lightweight voice recognition system can be loaded so that the appliance serves only as a non-hub device.
  • FIG. 4 is a schematic diagram of the hub device side workflow of the wake-up response method of the voice recognition device of the present application.
  • its wake-up response method includes the following steps.
  • S401: Analyze the collected voice signal to obtain the response factor of the hub device.
  • For the hub device in each regional network, step S401 is carried out within step S201 above, and the details are not repeated.
  • S402: Receive the response factor of the non-hub device.
  • Step S402 corresponds to step S202 above, and the details are not repeated here.
  • S403: Compare the response factor of the hub device with the response factor of the non-hub device, and determine the to-be-responding voice recognition device in the regional network.
  • Step S403 is similar to step S203 above, and the details are not repeated here.
  • The above steps take the voice recognition device in the role of hub device to illustrate the steps of the single-region wake-up response method.
  • The details of each step and of the hub device's operation have been described above, so they are not repeated.
  • the voice recognition device of this embodiment can determine a voice recognition device that responds to the voice signal from multiple voice recognition devices, thereby avoiding the problem of mutual interference due to all responses.
  • the hub device is further divided into a first hub device and a second hub device.
  • The first hub device further performs the following steps.
  • S404: The first hub device receives the second response factor.
  • Step S404 is carried out within step S301 above, and the details are not repeated here.
  • S406: Compare the first response factor and the second response factor to determine the voice recognition device that responds to the voice signal.
  • Step S406 is similar to step S302 above, and the details are not repeated here.
  • The second hub device performs the following step.
  • S405: The second hub device sends a second response factor to the first hub device, so that the first hub device compares the first response factor and the second response factor to determine the voice recognition device that responds to the voice signal.
  • This step S405 is completed in the above steps S301-S302, and the details are not repeated here.
  • the first hub device further determines which area network's to-be-responsive voice recognition device responds to the voice signal.
  • FIG. 5 is a schematic diagram of the non-central device side work flow of the voice recognition device wake-up response method of the present application.
  • Here the voice recognition device acts as a non-hub device, and the wake-up response method of this embodiment includes the following steps.
  • S501: Analyze the collected voice signal to obtain the response factor of the non-hub device.
  • Step S501 is similar to step S201 above in that both obtain a response factor, and the specific process is not repeated.
  • S502: Send the response factor of the non-hub device to the hub device, so that the hub device compares the response factor of the non-hub device with the response factor of the hub device to determine the to-be-responding voice recognition device.
  • After collecting the voice signal, a non-hub device does not respond immediately. It first computes its response factor and sends it to the hub device for analysis and comparison, and the hub device confirms which voice recognition device responds to the voice signal.
  • This embodiment takes the voice recognition device in the role of non-hub device to illustrate the steps of the wake-up response method.
  • The details of each step and of the non-hub device's operation have been described above, so they are not repeated.
  • the voice recognition device of this embodiment does not respond immediately after receiving the voice signal, but decides whether to respond after receiving the notification, which avoids the problem of mutual interference caused by simultaneous response with other voice recognition devices.
  • FIG. 6 is a schematic structural diagram of an embodiment of the voice recognition device of this application.
  • the voice recognition device 100 in this embodiment may be a household appliance. It includes a voice collector 11, a processor 12, and a memory 13 connected to each other.
  • the voice recognition device 100 in this embodiment can implement the above-mentioned wake-up response method.
  • the voice collector 11 is used to collect voice signals
  • a computer program is stored in the memory 13
  • the processor 12 is used to execute the computer program to implement the above wake-up response method.
  • Specifically, the voice collector 11 is used to collect voice signals; the processor 12 is used to analyze the collected voice signal to obtain a response factor, compare all response factors according to a preset algorithm to determine the voice recognition device that responds to the voice signal, and send the other voice recognition devices a notification of whether to respond to the voice signal.
  • Alternatively, the voice collector 11 is used to collect voice signals; the processor 12 is used to analyze the collected voice signal to obtain a response factor, send the response factor to the hub device, and decide whether to respond according to the notification received from the hub device.
  • the processor 12 may be an integrated circuit chip with signal processing capability.
  • The processor 12 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • FIG. 7 is a schematic structural diagram of an embodiment of the computer storage medium of the present application.
  • the computer storage medium 200 of this embodiment stores a computer program 21, which can be executed to implement the method in the foregoing embodiment.
  • The computer storage medium 200 of this embodiment may be a medium that can store program instructions, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. It may also be a server that stores the program instructions; the server may send the stored program instructions to other devices to run, or may run them itself.
  • the disclosed method and device may be implemented in other ways.
  • the device implementation described above is only illustrative.
  • For example, the division into modules or units is only a division by logical function.
  • In an actual implementation there may be other ways of dividing them; for example, multiple units or components can be combined or integrated into another system, or some features can be omitted or not executed.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of this embodiment.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • On this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods in the embodiments of this application.
  • The aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical discs, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)

Abstract

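A wake-up response method for voice recognition devices, a voice recognition device, and a computer storage medium. Multiple voice recognition devices form a regional network and are divided into one hub device and at least one non-hub device. The wake-up response method includes: the hub device analyzes a collected voice signal to obtain the response factor of the hub device (S201); receives the response factor of the non-hub device, which the non-hub device obtains by analyzing the voice signal it collected (S202); compares the response factor of the hub device with the response factor of the non-hub device; and determines a to-be-responding voice recognition device (S203), which is the voice recognition device in the regional network that responds to the voice signal. The method can select a single device to respond from among multiple voice recognition devices that are all able to respond to the voice signal.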

Description

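Voice recognition device, wake-up response method thereof, and computer storage medium
This application claims priority to Chinese patent application No. 2019103430678, filed on April 26, 2019 and entitled "Voice recognition device, wake-up response method thereof, and computer storage medium", which is incorporated into this application by reference in its entirety.
[Technical Field]
This application relates to the field of voice wake-up, and in particular to a wake-up response method for voice recognition devices, a voice recognition device, and a computer storage medium.
[Background]
Technologies such as voice recognition and voice interaction are used in many fields. A device equipped with a voice recognition system is generally woken up when it receives a voice signal and then responds to that signal.
Multiple voice recognition devices in the same area, or in several adjacent areas, may be woken by the same voice signal and respond at the same time. In a typical scenario the user clearly intends to wake only one device, and simultaneous wake-up and response causes the devices to interfere with one another: for example, the sound that one device plays in response to the voice signal may be picked up and responded to by another device, and vice versa.
[Summary]
This application provides a wake-up response method for voice recognition devices, a voice recognition device, and a computer storage medium, to solve the prior-art problem of mutual interference caused by multiple voice recognition devices responding to a wake-up utterance at the same time.
To solve the above technical problem, this application provides a wake-up response method for voice recognition devices, in which multiple voice recognition devices form a regional network and are divided into one hub device and at least one non-hub device. The method includes: the hub device analyzes a collected voice signal to obtain the response factor of the hub device; receives the response factor of the non-hub device, which the non-hub device obtains by analyzing the voice signal it collected; compares the response factor of the hub device with the response factor of the non-hub device; and determines a to-be-responding voice recognition device, which is the voice recognition device in the regional network that responds to the voice signal.
To solve the above technical problem, this application provides a wake-up response method for voice recognition devices, in which multiple voice recognition devices form a regional network and are divided into one hub device and at least one non-hub device. The method includes: the non-hub device analyzes a collected voice signal to obtain the response factor of the non-hub device; and sends that response factor to the hub device, so that the hub device compares the response factor of the non-hub device with the response factor of the hub device to determine a to-be-responding voice recognition device, which is the voice recognition device in the regional network that responds to the voice signal.
To solve the above technical problem, this application provides a voice recognition device that includes a processor and a memory; the memory stores a computer program, and the processor executes the computer program to implement the steps of the wake-up response method.
To solve the above technical problem, this application provides a computer storage medium storing a computer program that, when executed, implements the steps of the wake-up response method above.
In the wake-up response method of this application, multiple voice recognition devices form a regional network. Every voice recognition device collects the voice signal and analyzes what it collected to obtain a response factor. The devices are divided into one hub device and at least one non-hub device. The hub device obtains its own response factor and receives the response factors of the non-hub devices; it then compares its own response factor with those of the non-hub devices to determine the to-be-responding voice recognition device, i.e. the device in this regional network that responds to the voice signal. In this application, the voice recognition devices that form a regional network do not respond immediately after being woken by the voice signal; the hub device first determines which of them should respond, which avoids the mutual interference caused by multiple voice recognition devices all responding.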
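[Brief Description of the Drawings]
FIG. 1 is a schematic structural diagram of the network formed by interconnected voice recognition devices of this application;
FIG. 2 is a schematic flowchart of the wake-up response method of this application applied to a single-region network;
FIG. 3 is a schematic flowchart of the wake-up response method of this application applied to a multi-region network;
FIG. 4 is a schematic diagram of the hub-device-side workflow of the wake-up response method of this application;
FIG. 5 is a schematic diagram of the non-hub-device-side workflow of the wake-up response method of this application;
FIG. 6 is a schematic structural diagram of an embodiment of the voice recognition device of this application;
FIG. 7 is a schematic structural diagram of an embodiment of the computer storage medium of this application.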
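[Detailed Description]
To help those skilled in the art better understand the technical solution of the present invention, the wake-up response method for voice recognition devices, the voice recognition device, and the computer storage medium provided by this application are described in further detail below with reference to the drawings and specific embodiments.
The wake-up response method of this application applies where multiple voice recognition devices can all respond to the same voice signal. Taking home appliances as an example, multiple appliances are located in the same area or in several adjacent areas, and each has a voice recognition function, i.e. acts as a voice recognition device. For example, the living room contains voice recognition devices such as a television, an air conditioner, and a refrigerator; the kitchen contains devices such as a refrigerator, a microwave oven, a kettle, and a rice cooker. When the user utters a voice signal in the living room, sound propagation means that several appliances in the living room may receive the signal and respond to it, so several appliances reply at once. In that case the sound with which appliance A replies may in turn be received and responded to by appliance B, so the appliances interfere with one another and cannot properly serve the user's request. Likewise, when the user utters a voice signal between the living room and the kitchen, devices in both areas can receive and respond to it, and mutual interference again arises.
The voice recognition devices of this application operate in a wake-then-respond mode: a device is first woken by the voice signal uttered by the user and then responds to it. This application introduces a selection mechanism between wake-up and response: after being woken by the voice signal, a device withholds its response and replies only once it has been determined that it should respond.
Specifically, for a single region, multiple voice recognition devices are interconnected to form a regional network, and one of them serves as the hub device of that regional network. The hub device determines which voice recognition device in the regional network responds to the voice signal.
For multiple regions, the hub device of each regional network first determines the to-be-responding voice recognition device in its own regional network; then one first hub device among all the hub devices determines which regional network's to-be-responding voice recognition device responds, thereby solving the problem of mutual interference caused by multiple voice recognition devices all responding to the voice signal.
In home appliance applications, the hub device must be ready at any time to handle the user's voice signal and determine the device that responds. It is therefore generally an appliance that stays connected to power for long periods and is rarely switched off; an appliance with an interactive screen is preferred as the network hub device, since the screen makes configuration convenient. A refrigerator, for example, can serve as the hub device.
Generally, the home appliances in each area, such as the living room or the kitchen, can each form a regional network. A regional network corresponds to the division into areas; in terms of network connectivity it need not be a separate network, i.e. the appliances in all areas of a home may be interconnected to form a single overall appliance network.
The network formed in this application includes, but is not limited to, a local area network based on Wi-Fi, wired networking, Bluetooth mesh, Zigbee, RS485, LoRa, 1394, CAN, and so on. The communication mechanisms used include, but are not limited to, UDP, TCP/IP, HTTP, MQTT, CoAP, and so on, ensuring that every voice recognition device in the same network can exchange information quickly and reliably.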
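The wake-up response method of this application is described below, starting from the network formed by the voice recognition devices.
Refer to FIG. 1, a schematic structural diagram of the network formed by interconnected voice recognition devices of this application. In FIG. 1 the space is divided into living room region A, kitchen region B, and bedroom region C. In living room region A the voice recognition devices are refrigerator A1, television A2, and air purifier A3; in kitchen region B they are range hood B1, rice cooker B2, and high-speed blender B3; in bedroom region C they are air conditioner C1 and humidifier C2. All the voice recognition devices are connected to form a network, and the devices in each region also form a regional network.
The voice recognition devices in each regional network are divided into one hub device and at least one non-hub device, and the hub device determines the to-be-responding voice recognition device in its regional network. The hub devices of all regional networks are in turn divided into one first hub device and at least one second hub device, and the first hub device determines which regional network's to-be-responding voice recognition device responds to the voice signal.
In some embodiments of this application, the voice recognition devices in a regional network are not only divided into hub and non-hub devices but also have wake-up priorities. The wake-up priority may be set by the manufacturer when the device leaves the factory, in which case the device with the highest wake-up priority automatically becomes the hub device of the regional network once the network is formed. The wake-up priority may also be set when the network is built, either by the user or by the service provider that sets up the network; the device with the highest configured wake-up priority then serves as the hub device of the network.
In the network shown in FIG. 1, the priority order in living room region A is A1>A2>A3, in kitchen region B it is B1>B2>B3, and in bedroom region C it is C1>C2; A1, B1, and C1 serve as the hub devices of their respective regional networks. The hub devices of the regional networks are also ordered by priority, A1>B1>C1; in this application A1 serves as the first hub device, and B1 and C1 serve as second hub devices.
The network shown in FIG. 1 supports wake-up response within a single region as well as across multiple regions. See FIG. 2 and FIG. 3: FIG. 2 is a schematic flowchart of the wake-up response method applied to a single-region network, and FIG. 3 is a schematic flowchart of the method applied to a multi-region network.
As shown in FIG. 2, the wake-up response method in a single-region network includes the following steps.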
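S201: Each voice recognition device analyzes the collected voice signal to obtain a response factor.
In this step each voice recognition device performs two actions: collection and analysis. After the user, i.e. the signal source, utters a voice signal, every voice recognition device can collect it; because each device is positioned differently relative to the user, the signal each one collects differs. A device far from the user may fail to collect the voice signal at all, even though it belongs to the regional network.
Each voice recognition device analyzes the voice signal it collected. In this embodiment all voice recognition devices in a regional network use the same analysis mechanism, so that the results can be compared later. Analyzing the voice signal yields a response factor, which expresses how well the voice signal corresponds to the device, i.e. how likely it is that the voice signal was addressed to that device.
Because the to-be-responding voice recognition device is determined from the response factors, a response factor includes the identifier of the voice recognition device and an energy value used for the decision. The energy value can be calculated from the voice features of the voice signal and the degree to which the voice signal matches the wake-up template in the device. The voice feature may be the volume of the voice signal: the louder it is, the closer the user is to the device. The better the match with the device's wake-up template, the more likely it is that the user addressed the voice signal to that device.
Further, the energy value of the response factor may be calculated as follows:
A wake-up energy E1 is calculated from the voice features of the voice signal, and a noise-floor energy E2 is calculated from the voice features of the ambient noise in the device's environment; the difference between the two is taken as the effective energy, E=E1-E2;
A confidence P is calculated from the degree of match between the voice signal and the wake-up template. When a device is woken by a voice signal it evaluates how well the signal matches the wake-up template: a complete match is recorded as 100%, a largely matching signal as 90%, 80%, 70%, and so on, and the device is judged wakeable once the match exceeds a certain threshold. The confidence P used in calculating the response factor energy likewise corresponds to the degree of match at wake-up; for example, P may be 1, 0.9, 0.8, 0.7, and so on.
The effective energy E and the confidence P are combined in a weighted sum to obtain the energy value K of the response factor;
K=xE+yP, where x is the weight coefficient of the effective energy E and y is the weight coefficient of the confidence P.
The weight coefficients x and y may be fixed values, may switch among several sets of fixed values, or may be adjusted according to how accurately the responding voice recognition device was ultimately determined.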
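The calculation K = xE + yP with E = E1 - E2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ResponseFactor` class, the function name, the units of the energies, and the sample weights are all hypothetical and are not specified by the application.

```python
from dataclasses import dataclass


@dataclass
class ResponseFactor:
    device_id: str  # identifier of the voice recognition device
    energy: float   # energy value K used for comparison


def response_factor(device_id, wake_energy, noise_floor, confidence, x=1.0, y=1.0):
    """Compute K = x*E + y*P, where E = E1 - E2 (hypothetical helper)."""
    effective = wake_energy - noise_floor   # E = E1 - E2
    k = x * effective + y * confidence      # weighted sum of E and P
    return ResponseFactor(device_id, k)


# Assumed sample values: E1 = 60, E2 = 20, P = 0.9, x = 0.01, y = 1.0
rf = response_factor("A2", wake_energy=60.0, noise_floor=20.0,
                     confidence=0.9, x=0.01, y=1.0)
print(rf)
```

Because E and P are on different scales, the weights x and y effectively decide how much loudness counts relative to template match; the sample weights above are chosen only to bring both terms to a similar magnitude.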
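For region A in this embodiment, the energy value of the response factor obtained by device A1 is denoted K1, that obtained by device A2 is denoted K2, and that obtained by device A3 is denoted K3.
In step S201, the hub device analyzes the collected voice signal to obtain the response factor of the hub device, while each non-hub device analyzes the collected voice signal to obtain the response factor of that non-hub device.
S202: The hub device receives the response factors of the non-hub devices.
After the voice recognition devices have calculated their response factors, each non-hub device sends its response factor to the hub device. In this embodiment, hub device A1 receives the response factors sent by the non-hub devices.
S203: The hub device compares the response factor of the hub device with the response factors of the non-hub devices and determines the to-be-responding voice recognition device.
In this step the hub device compares its own response factor with those of the non-hub devices to determine the to-be-responding voice recognition device in the regional network. Specifically, the hub device uses a sorting algorithm to compare the energy values, obtains the ordering of all response factor energy values, and thus finds the response factor with the largest energy value. Sorting algorithms include, but are not limited to, insertion sort, Shell sort, selection sort, heap sort, bubble sort, quicksort, merge sort, counting sort, bucket sort, radix sort, and so on. In this embodiment the energy values are ordered K2>K1>K3.
The to-be-responding voice recognition device can be determined from the comparison of energy values, and this can be done in several ways.
For example, once the response factor with the largest energy value is found, the corresponding voice recognition device can be determined as the to-be-responding voice recognition device.
As another example, once the response factor with the largest energy value is found, if it is the response factor of the hub device, the hub device is determined as the to-be-responding voice recognition device.
If the response factor with the largest energy value belongs to a non-hub device (in this embodiment the largest is K2), the energy difference between the largest response factor and the response factor of the hub device is calculated, i.e. δ=K2-K1.
The energy difference δ is compared with the wake-up threshold δd. If δ is greater than δd, the voice recognition device corresponding to the largest response factor is determined as the to-be-responding voice recognition device; if δ is less than or equal to δd, the hub device is determined as the to-be-responding voice recognition device.
When the response factors are compared, two or more of them may share the largest energy value. In that case the device that responds is determined according to the wake-up priority order: among the voice recognition devices whose response factors have the largest energy value, the one with the highest priority is determined as the to-be-responding voice recognition device.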
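The decision logic of step S203 can be sketched as follows. The sketch combines the threshold variant with the priority tie-break, which is one possible reading of the text rather than the only one; the function name, the dictionary layout, and the sample energy values are assumptions, and the hub device is assumed to have captured the voice signal itself.

```python
def select_responder(factors, hub_id, priority, wake_threshold):
    """Pick the responding device in one regional network.

    factors:        dict mapping device id -> energy value K
    hub_id:         id of the hub device (assumed to be present in factors)
    priority:       list of device ids, highest wake-up priority first
    wake_threshold: the wake-up threshold (delta_d in the text)
    """
    max_energy = max(factors.values())
    # Several devices may share the maximum; the highest priority one wins.
    tied = [dev for dev, k in factors.items() if k == max_energy]
    best = min(tied, key=priority.index)
    if best == hub_id:
        return hub_id
    # A non-hub device wins only if it beats the hub by more than the threshold.
    delta = max_energy - factors[hub_id]
    return best if delta > wake_threshold else hub_id


priority = ["A1", "A2", "A3"]
factors = {"A1": 1.0, "A2": 1.5, "A3": 0.4}            # K2 > K1 > K3
print(select_responder(factors, "A1", priority, 0.3))  # A2: delta 0.5 exceeds 0.3
print(select_responder(factors, "A1", priority, 0.6))  # A1: delta 0.5 is within 0.6
```

The threshold biases the decision toward the hub device, so a non-hub device takes over only when its response factor is clearly larger.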
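S204: The hub device sends the non-hub devices a notification of whether to respond to the voice signal.
Once the hub device has determined the to-be-responding voice recognition device, it can send over the network a notification of whether to respond to the voice signal to the non-hub devices, i.e. to all voice recognition devices in the regional network that have been woken but have not yet responded. The notification may state explicitly whether to respond or not, or may carry the device information of the voice recognition device chosen to respond. Alternatively, the notification may be sent only to the to-be-responding voice recognition device: devices that receive no notification do not respond, and the one that receives it responds.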
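The two notification strategies described for S204 can be illustrated with a small helper that builds the messages the hub device would send. The message fields `respond` and `responder` are hypothetical names chosen for this sketch; the application does not define a message format, and the transport (UDP, MQTT, and so on) is left out.

```python
def build_notifications(awake_devices, responder, only_responder=False):
    """Return a dict mapping recipient id -> notification payload.

    only_responder=False: every awake device is told whether to respond,
                          and which device was chosen.
    only_responder=True:  only the chosen device is notified; devices that
                          receive nothing simply stay silent.
    """
    if only_responder:
        return {responder: {"respond": True}}
    return {
        dev: {"respond": dev == responder, "responder": responder}
        for dev in awake_devices
    }


print(build_notifications(["A1", "A2", "A3"], "A2"))
print(build_notifications(["A1", "A2", "A3"], "A2", only_responder=True))
```

Notifying only the chosen device reduces traffic, but it relies on the other devices treating the absence of a notification as an instruction not to respond.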
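S205: The to-be-responding voice recognition device responds to the voice signal.
The determined voice recognition device then responds to the voice signal, and the other voice recognition devices do not. This guarantees that only one voice recognition device responds to the voice signal and avoids mutual interference.
The method shown in FIG. 2 applies to voice wake-up in a single-region network: after being woken by the voice signal, the voice recognition devices do not respond immediately but respond only after the hub device of the single-region network has determined which device is to respond.
The wake-up response method for a multi-region network builds on the determination of the to-be-responding voice recognition device in a single-region network shown in FIG. 2. A multi-region network consists of multiple interconnected regional networks whose hub devices are connected to one another and are divided into one first hub device and at least one second hub device. After each regional network has determined its to-be-responding voice recognition device, the first hub device further determines the voice recognition device that responds to the voice signal.
The steps by which each regional network in a multi-region network implements the wake-up response method are not repeated here. Referring also to FIG. 3, the wake-up response method for a multi-region network further includes the following steps.
S301: The second hub device sends a second response factor to the first hub device, and the first hub device receives the second response factor.
In a multi-region network the first hub device compares the response factors of the to-be-responding voice recognition devices of all regional networks to determine the device that responds to the voice signal. A to-be-responding voice recognition device is the device selected within a single regional network to respond to the voice signal; in a multi-region application it does not respond immediately. Instead, the first hub device decides which of the to-be-responding voice recognition devices responds, i.e. determines the final responding device. In step S301 the second hub device therefore sends its second response factor, i.e. the response factor of the to-be-responding voice recognition device in the second hub device's region, to the first hub device.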
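For example, in region A, A1 compares KA1, KA2, and KA3 and determines that the to-be-responding voice recognition device is A2; in region B, B1 compares KB1, KB2, and KB3 and determines that it is B3; in region C, C1 compares KC1 and KC2 and determines that it is C1.
B1 sends the response factor KB3 of B3, the to-be-responding voice recognition device of its regional network, to A1, and C1 likewise sends the response factor KC1 to A1; the response factor of A2, the to-be-responding voice recognition device determined by A1 itself, is KA2.
S302: The first hub device compares the second response factor with the first response factor and determines the voice recognition device that responds to the voice signal.
The first hub device compares the response factor of every to-be-responding voice recognition device, i.e. the first response factor and the second response factors; the first response factor is the response factor of the to-be-responding voice recognition device in the first hub device's own regional network.
The comparison in step S302 is similar to that in step S203 above and is not repeated here.
For example, the energy values of the first response factor and the second response factor can be compared to find the response factor with the largest energy value, and the voice recognition device corresponding to it is determined to respond to the voice signal.
As another example, the energy values of the first response factor and the second response factor are compared to find the response factor with the largest energy value. If the largest is the first response factor, the first hub device responds to the voice signal. If the largest is a second response factor, the energy difference between it and the first response factor is calculated and compared with the wake-up threshold: if the energy difference is greater than the wake-up threshold, the voice recognition device corresponding to the largest response factor responds to the voice signal; if it is less than or equal to the wake-up threshold, the first hub device responds to the voice signal.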
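In this embodiment A1 compares KA2, KB3, and KC1 to determine the voice recognition device that responds to the voice signal, for example B3.
Likewise, two or more response factors may share the largest energy value. In that case the device that responds is determined according to the wake-up priority order of the voice recognition devices: among the voice recognition devices whose response factors have the largest energy value, the one with the highest priority is determined as the to-be-responding voice recognition device.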
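The two-stage arbitration across regions can be sketched as below, using the simpler variant in which the largest energy value wins and ties go to the higher wake-up priority. The energy values are invented for illustration, and the single global priority list is an assumption: the text gives a priority order within each region and among the hub devices, but not one across all devices.

```python
def pick_max(factors, priority):
    """Return the device with the largest energy; ties go to higher priority."""
    max_energy = max(factors.values())
    tied = [dev for dev, k in factors.items() if k == max_energy]
    return min(tied, key=priority.index)


def arbitrate(regions, priority):
    """Stage 1: one candidate per region. Stage 2: the first hub picks one."""
    candidates = {}
    for factors in regions.values():
        device = pick_max(factors, priority)   # done by each region's hub
        candidates[device] = factors[device]   # second hubs forward this factor
    return pick_max(candidates, priority)      # done by the first hub device


regions = {
    "A": {"A1": 1.0, "A2": 1.4, "A3": 0.3},
    "B": {"B1": 0.8, "B2": 0.9, "B3": 1.6},
    "C": {"C1": 0.7, "C2": 0.2},
}
priority = ["A1", "B1", "C1", "A2", "B2", "C2", "A3", "B3"]  # assumed global order
final = arbitrate(regions, priority)
print(final)  # B3 with these sample values
```

With these sample values the regional candidates are A2, B3, and C1, and only their three response factors reach the first hub device, so the second stage compares far fewer values than a flat comparison over every device would.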
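S303: The first hub device sends the other voice recognition devices in the multi-region network a notification of whether to respond to the voice signal.
After determining the voice recognition device that responds to the voice signal, the first hub device can send the notification directly to the whole network, i.e. to the multiple regional networks, or it can first notify the hub device of each regional network, and each hub device then notifies its non-hub devices. Likewise, the notification can be sent only to the voice recognition device that responds to the voice signal, and devices that receive no notification do not respond.
S304: The determined voice recognition device responds to the voice signal.
Step S304 is similar to step S205 above and is not described again.
The method shown in FIG. 3 applies to voice wake-up across multiple regions: after each region determines the voice recognition device that should respond in that region, the first hub device further determines which region's device responds, ensuring that only one voice recognition device responds to the voice signal.
In the networks to which FIG. 2 and FIG. 3 apply, the voice recognition devices have a wake-up priority order, so when the highest-priority voice recognition device fails, the device with the next wake-up priority can be identified from that order to serve as the hub device or the first hub device.
A voice recognition device can check periodically whether it has the highest wake-up priority in the regional network, or check this when the regional network changes; if it detects that it has the highest wake-up priority in the current regional network, it operates as the hub device.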
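The self-check described here amounts to each device asking whether it ranks first among the devices currently reachable. A minimal sketch follows, assuming each device already knows the priority order and has some way of learning which devices are online; how that is discovered is not specified in the text, and the function name is hypothetical.

```python
def should_act_as_hub(self_id, online_devices, priority):
    """True if self_id has the highest wake-up priority among online devices.

    online_devices: set of device ids currently reachable in the regional network
    priority:       list of device ids, highest wake-up priority first
    """
    ranked = [dev for dev in priority if dev in online_devices]
    return bool(ranked) and ranked[0] == self_id


priority = ["A1", "A2", "A3"]
print(should_act_as_hub("A2", {"A1", "A2", "A3"}, priority))  # False: A1 is online
print(should_act_as_hub("A2", {"A2", "A3"}, priority))        # True: A1 has failed
```

Running this check periodically, or whenever the set of online devices changes, gives the failover behaviour described above without a separate election protocol.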
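The wake-up response method in the network of this embodiment relies on the voice recognition devices having a wake-up priority order and on the device acting as the network hub being able to compare response factors. A voice recognition device newly added to the network must therefore also conform to the wake-up mechanism of this embodiment, and the hub device can configure it accordingly.
The hub device can obtain the device information of a voice recognition device that joins the network and analyze it according to preset rules in order to re-rank the wake-up priorities of the voice recognition devices in the network.
Each voice recognition device is equipped with a voice recognition system, which determines the wake-up priority, voice recognition algorithm, wake-up template, and so on. If a newly added voice recognition device has a different voice recognition system, that is, a different wake-up priority setting, the network hub device can re-rank the devices according to that device's own wake-up priority setting. For example, in the network A1-A2-A3, if newly added device A4 has a wake-up priority set higher than that of A3, the wake-up priority can be re-ranked as A1>A2>A4>A3.
If the newly added voice recognition device has the same voice recognition system, that is, the same wake-up priority setting, the device that joined the network first is given the higher wake-up priority. For example, if newly added device A3 has the same voice recognition system as the existing A3, the existing A3 is treated as A31 and the new one as A32, and the wake-up priority is re-ranked as A1>A2>A31>A32.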
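Both re-ranking rules can be expressed as a single sort key: a higher priority setting ranks first, and among equal settings the device that joined earlier ranks first. The numeric priority levels and join sequence numbers below are assumptions made for this sketch; the text only says that the voice recognition system determines the priority setting.

```python
def rerank(devices):
    """Order devices by wake-up priority.

    devices: list of (device_id, priority_level, join_seq) tuples, where a
             larger priority_level means higher priority and a smaller
             join_seq means the device joined the network earlier.
    """
    ordered = sorted(devices, key=lambda d: (-d[1], d[2]))
    return [device_id for device_id, _, _ in ordered]


network = [("A1", 30, 0), ("A2", 20, 1), ("A3", 10, 2)]

# A4 joins with a setting higher than A3's: A1 > A2 > A4 > A3
print(rerank(network + [("A4", 15, 3)]))

# A second device with A3's setting joins: A1 > A2 > A31 > A32
same_system = [("A1", 30, 0), ("A2", 20, 1), ("A31", 10, 2), ("A32", 10, 3)]
print(rerank(same_system))
```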
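All steps of the wake-up response method can be completed inside the network of this embodiment, so the voice recognition devices of this embodiment can operate offline.
In the single-region network formed by the interconnected voice recognition devices above, a voice recognition device can play two roles: it can operate as a hub device, or it can operate as a non-hub device. Each voice recognition device can serve as a hub device, with more numerous and more powerful functions, or only as a non-hub device, with lightweight functions.
In the home appliance field, a more powerful voice recognition system can be loaded into large appliances, such as refrigerators and televisions, so that they can serve as hub devices; a lightweight voice recognition system can be loaded into small appliances, such as rice cookers and electric kettles, so that they serve only as non-hub devices.
For a voice recognition device that can act as the network hub device, the steps of the wake-up response method are shown in FIG. 4, a schematic diagram of the hub-device-side workflow of the wake-up response method of this application. As the network hub device, it implements the wake-up response method through the following steps.
S401: Analyze the collected voice signal to obtain the response factor of the hub device.
For the hub device in each regional network, step S401 is carried out within step S201 above and is not detailed again.
S402: Receive the response factor of the non-hub device.
Step S402 corresponds to step S202 above and is not detailed again.
S403: Compare the response factor of the hub device with the response factor of the non-hub device and determine the to-be-responding voice recognition device in the regional network.
Step S403 is similar to step S203 above and is not detailed again.
The above steps take the voice recognition device in the role of hub device to illustrate the steps of the single-region wake-up response method. The details of each step and of the hub device's operation have been described above and are not repeated. The voice recognition device of this embodiment can determine, from among multiple voice recognition devices, a single device that responds to the voice signal, avoiding the mutual interference caused by all of them responding.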
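Further, for a multi-region network, the hub devices are also divided into a first hub device and second hub devices. The first hub device further performs the following steps.
S404: The first hub device receives the second response factor.
Step S404 is carried out within step S301 above and is not detailed again.
S406: Compare the first response factor and the second response factor to determine the voice recognition device that responds to the voice signal.
Step S406 is similar to step S302 above and is not detailed again.
The second hub device performs the following step.
S405: The second hub device sends the second response factor to the first hub device, so that the first hub device compares the first response factor and the second response factor to determine the voice recognition device that responds to the voice signal.
Step S405 is carried out within steps S301-S302 above and is not detailed again.
Further, in a multi-region network, the first hub device determines which regional network's to-be-responding voice recognition device responds to the voice signal.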
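From the perspective of a non-hub device, the steps of the wake-up response method are shown in FIG. 5, a schematic diagram of the non-hub-device-side workflow of the wake-up response method of this application. With the voice recognition device acting as a non-hub device, the wake-up response method of this embodiment includes the following steps.
S501: Analyze the collected voice signal to obtain the response factor of the non-hub device.
Step S501 is similar to step S201 above in that both obtain a response factor, and the specific process is not repeated.
S502: Send the response factor of the non-hub device to the hub device, so that the hub device compares the response factor of the non-hub device with the response factor of the hub device to determine the to-be-responding voice recognition device.
After collecting the voice signal, a non-hub device does not respond to it immediately. It first computes its response factor and sends it to the hub device for analysis and comparison, and the hub device confirms which voice recognition device responds to the voice signal.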
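On the non-hub side, the flow reduces to two operations: serialize the response factor for the hub device, and later interpret the hub's notification. The JSON layout and field names below are hypothetical, since the application names possible transports (UDP, TCP/IP, HTTP, MQTT, CoAP) but no message format; the sketch uses only Python's standard `json` module and leaves the transport out.

```python
import json


def encode_response_factor(device_id, energy):
    """Serialize a response factor for sending to the hub (assumed format)."""
    message = {"type": "response_factor", "device_id": device_id, "energy": energy}
    return json.dumps(message).encode("utf-8")


def should_respond(self_id, notification_bytes):
    """Decide whether to respond, given the hub's notification.

    The notification may carry an explicit yes/no ("respond") or only the
    identity of the chosen device ("responder"); both forms are handled.
    """
    message = json.loads(notification_bytes.decode("utf-8"))
    if "respond" in message:
        return bool(message["respond"])
    return message.get("responder") == self_id


payload = encode_response_factor("A2", 1.3)
print(payload)
print(should_respond("A2", b'{"responder": "A2"}'))
print(should_respond("A3", b'{"respond": false}'))
```

A device that never receives a notification simply never calls `should_respond` and stays silent, which matches the option of notifying only the chosen device.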
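This embodiment takes the voice recognition device in the role of non-hub device to illustrate the steps of the wake-up response method. The details of each step and of the non-hub device's operation have been described above and are not repeated. The voice recognition device of this embodiment does not respond immediately after receiving the voice signal but decides whether to respond after receiving the notification, which avoids the mutual interference caused by responding at the same time as other voice recognition devices.
The wake-up response method above is implemented by a voice recognition device, so this application also proposes a voice recognition device. Refer to FIG. 6, a schematic structural diagram of an embodiment of the voice recognition device of this application. The voice recognition device 100 of this embodiment may be a household appliance and includes a voice collector 11, a processor 12, and a memory 13 connected to one another; it can implement the embodiments of the wake-up response method above. The voice collector 11 is used to collect voice signals, the memory 13 stores a computer program, and the processor 12 executes the computer program to implement the wake-up response method.
Specifically, the voice collector 11 is used to collect voice signals; the processor 12 is used to analyze the collected voice signal to obtain a response factor, compare all response factors according to a preset algorithm to determine the voice recognition device that responds to the voice signal, and send the other voice recognition devices a notification of whether to respond to the voice signal.
Alternatively, the voice collector 11 is used to collect voice signals; the processor 12 is used to analyze the collected voice signal to obtain a response factor, send the response factor to the hub device, and decide whether to respond according to the notification received from the hub device.
The processor 12 may be an integrated circuit chip with signal processing capability. The processor 12 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor.
The methods of the above embodiments can exist in the form of a computer program, so this application also proposes a computer storage medium. Refer to FIG. 7, a schematic structural diagram of an embodiment of the computer storage medium of this application. The computer storage medium 200 of this embodiment stores a computer program 21 that can be executed to implement the methods of the above embodiments.
The computer storage medium 200 of this embodiment may be a medium that can store program instructions, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. It may also be a server that stores the program instructions; the server may send the stored program instructions to other devices to run, or may run them itself.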
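In the several embodiments provided in this application, it should be understood that the disclosed methods and devices may be implemented in other ways. For example, the device implementations described above are only illustrative: the division into modules or units is only a division by logical function, and in an actual implementation there may be other ways of dividing them; for example, multiple units or components can be combined or integrated into another system, or some features can be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this implementation.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. On this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods in the implementations of this application. The aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical discs, and other media that can store program code.
The above are only implementations of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.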

Claims (23)

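  1. A wake-up response method for voice recognition devices, characterized in that a plurality of voice recognition devices form a regional network, and the plurality of voice recognition devices are divided into one hub device and at least one non-hub device; the wake-up response method comprises:
    the hub device analyzing a collected voice signal to obtain a response factor of the hub device;
    receiving a response factor of the non-hub device, the response factor of the non-hub device being obtained by the non-hub device analyzing the collected voice signal;
    comparing the response factor of the hub device with the response factor of the non-hub device;
    determining a to-be-responding voice recognition device, the to-be-responding voice recognition device being the voice recognition device in the regional network that responds to the voice signal.
  2. The wake-up response method according to claim 1, characterized in that comparing the response factor of the hub device with the response factor of the non-hub device and determining the to-be-responding voice recognition device comprises:
    comparing the energy value of the response factor of the hub device with the energy value of the response factor of the non-hub device to obtain the response factor with the largest energy value;
    determining the voice recognition device corresponding to the response factor with the largest energy value as the to-be-responding voice recognition device.
  3. The wake-up response method according to claim 1, characterized in that comparing the response factor of the hub device with the response factor of the non-hub device and determining the to-be-responding voice recognition device comprises:
    comparing the energy value of the response factor of the hub device with the energy value of the response factor of the non-hub device to obtain the response factor with the largest energy value;
    judging whether the response factor with the largest energy value is the response factor of the hub device;
    in response to the response factor with the largest energy value being the response factor of the hub device, determining the hub device as the to-be-responding voice recognition device;
    in response to the response factor with the largest energy value not being the response factor of the hub device, calculating the energy difference between the response factor with the largest energy value and the response factor of the hub device;
    comparing the energy difference with a wake-up threshold;
    in response to the energy difference being greater than the wake-up threshold, determining the voice recognition device corresponding to the response factor with the largest energy value as the to-be-responding voice recognition device;
    in response to the energy difference being less than or equal to the wake-up threshold, determining the hub device as the to-be-responding voice recognition device.
  4. The wake-up response method according to claim 2 or 3, characterized in that the plurality of voice recognition devices have wake-up priorities; determining the voice recognition device corresponding to the response factor with the largest energy value as the to-be-responding voice recognition device comprises:
    among the voice recognition devices corresponding to the response factor with the largest energy value, determining the one with the highest wake-up priority as the to-be-responding voice recognition device.
  5. The wake-up response method according to claim 1, characterized in that the wake-up response method comprises:
    the hub device sending the non-hub device a notification of whether to respond to the voice signal.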
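  6. The wake-up response method according to claim 1, characterized in that a plurality of the regional networks are interconnected, and the plurality of hub devices in all the regional networks are divided into one first hub device and at least one second hub device; the wake-up response method further comprises:
    the second hub device sending a second response factor to the first hub device, so that the first hub device compares the second response factor with a first response factor to determine the voice recognition device that responds to the voice signal;
    the first response factor being the response factor of the to-be-responding voice recognition device of the regional network in which the first hub device is located, and the second response factor being the response factor of the to-be-responding voice recognition device in the regional network in which the second hub device is located.
  7. The wake-up response method according to claim 1, characterized in that a plurality of the regional networks are interconnected, and the plurality of hub devices in all the regional networks are divided into one first hub device and at least one second hub device; the wake-up response method further comprises:
    the first hub device receiving a second response factor, the second response factor being the response factor of the to-be-responding voice recognition device of the regional network in which the second hub device is located;
    comparing the second response factor with a first response factor to determine the voice recognition device that responds to the voice signal, the first response factor being the response factor of the to-be-responding voice recognition device in the regional network in which the first hub device is located.
  8. The wake-up response method according to claim 6 or 7, characterized in that comparing the second response factor with the first response factor to determine the voice recognition device that responds to the voice signal comprises:
    comparing the energy value of the first response factor with the energy value of the second response factor to obtain the response factor with the largest energy value;
    determining that the voice recognition device corresponding to the response factor with the largest energy value responds to the voice signal.
  9. The wake-up response method according to claim 8, characterized in that the plurality of voice recognition devices have wake-up priorities; determining the voice recognition device corresponding to the response factor with the largest energy value as the to-be-responding voice recognition device comprises:
    among the voice recognition devices corresponding to the response factor with the largest energy value, determining that the voice recognition device with the highest wake-up priority responds to the voice signal.
  10. The wake-up response method according to claim 6 or 7, characterized in that comparing the second response factor with the first response factor of the first hub device to determine the voice recognition device that responds to the voice signal comprises:
    comparing the energy value of the first response factor with the energy value of the second response factor to obtain the response factor with the largest energy value;
    judging whether the response factor with the largest energy value is the first response factor;
    in response to the response factor with the largest energy value being the first response factor, determining that the first hub device responds to the voice signal;
    in response to the response factor with the largest energy value not being the first response factor, calculating the energy difference between the response factor with the largest energy value and the first response factor;
    comparing the energy difference with the wake-up threshold;
    in response to the energy difference being greater than the wake-up threshold, determining that the voice recognition device corresponding to the response factor with the largest energy value responds to the voice signal;
    in response to the energy difference being less than or equal to the wake-up threshold, determining that the first hub device responds to the voice signal.
  11. The wake-up response method according to claim 10, characterized in that the plurality of voice recognition devices have wake-up priorities; determining that the voice recognition device corresponding to the response factor with the largest energy value responds to the voice signal comprises:
    among the voice recognition devices corresponding to the response factor with the largest energy value, determining that the voice recognition device with the highest wake-up priority responds to the voice signal.
  12. The wake-up response method according to claim 6 or 7, characterized in that the wake-up response method further comprises:
    the first hub device sending the other voice recognition devices in the plurality of regional networks a notification of whether to respond to the voice signal.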
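  13. The wake-up response method according to any one of claims 1, 6, and 7, characterized in that the response factor of the hub device and the response factor of the non-hub device are collectively referred to as response factors; analyzing the collected voice signal to obtain a response factor comprises:
    calculating the energy value of the response factor according to the voice features of the voice signal and the degree of match between the voice signal and the wake-up template of the voice recognition device.
  14. The wake-up response method according to claim 13, characterized in that calculating the energy value of the response factor according to the voice features of the voice signal and the degree of match between the voice signal and the wake-up template of the voice recognition device comprises:
    calculating a wake-up energy from the voice features of the voice signal, calculating a noise-floor energy from the voice features of the ambient noise in the environment of the voice recognition device, and taking the difference between the wake-up energy and the noise-floor energy as an effective energy;
    calculating a confidence from the degree of match between the voice signal and the wake-up template;
    performing a weighted sum of the effective energy and the confidence to obtain the energy value of the response factor.
  15. A wake-up response method for voice recognition devices, characterized in that a plurality of voice recognition devices form a regional network, and the plurality of voice recognition devices are divided into one hub device and at least one non-hub device; the wake-up response method comprises:
    the non-hub device analyzing a collected voice signal to obtain a response factor of the non-hub device;
    sending the response factor of the non-hub device to the hub device, so that the hub device compares the response factor of the non-hub device with the response factor of the hub device to determine a to-be-responding voice recognition device, the to-be-responding voice recognition device being the voice recognition device in the regional network that responds to the voice signal.
  16. The wake-up response method according to claim 15, characterized in that the hub device comparing the response factor of the non-hub device with the response factor of the hub device to determine the to-be-responding voice recognition device comprises:
    the hub device comparing the energy value of the response factor of the hub device with the energy value of the response factor of the non-hub device to obtain the response factor with the largest energy value;
    determining the voice recognition device corresponding to the response factor with the largest energy value as the to-be-responding voice recognition device.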
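  17. The wake-up response method according to claim 15, characterized in that
    the hub device compares the energy value of the response factor of the hub device with the energy value of the response factor of the non-hub device to obtain the response factor with the largest energy value;
    judges whether the response factor with the largest energy value is the response factor of the hub device;
    in response to the response factor with the largest energy value being the response factor of the hub device, determines the hub device as the to-be-responding voice recognition device;
    in response to the response factor with the largest energy value not being the response factor of the hub device, calculates the energy difference between the response factor with the largest energy value and the response factor of the hub device;
    compares the energy difference with a wake-up threshold;
    in response to the energy difference being greater than the wake-up threshold, determines the voice recognition device corresponding to the response factor with the largest energy value as the to-be-responding voice recognition device;
    in response to the energy difference being less than or equal to the wake-up threshold, determines the hub device as the to-be-responding voice recognition device.
  18. The wake-up response method according to claim 16 or 17, characterized in that the plurality of voice recognition devices have wake-up priorities; determining the voice recognition device corresponding to the response factor with the largest energy value as the to-be-responding voice recognition device comprises:
    among the voice recognition devices corresponding to the response factor with the largest energy value, determining the one with the highest wake-up priority as the to-be-responding voice recognition device.
  19. The wake-up method according to claim 15, characterized in that the wake-up response method further comprises:
    receiving the notification, sent by the hub device, of whether to respond to the voice signal.
  20. The wake-up response method according to claim 15, characterized in that the response factor of the hub device and the response factor of the non-hub device are collectively referred to as response factors; analyzing the collected voice signal to obtain a response factor comprises:
    calculating the energy value of the response factor according to the voice features of the voice signal and the degree of match between the voice signal and the wake-up template of the voice recognition device.
  21. The wake-up response method according to claim 20, characterized in that calculating the energy value of the response factor according to the voice features of the voice signal and the degree of match between the voice signal and the wake-up template of the voice recognition device comprises:
    calculating a wake-up energy from the voice features of the voice signal, calculating a noise-floor energy from the voice features of the ambient noise in the environment of the voice recognition device, and taking the difference between the wake-up energy and the noise-floor energy as an effective energy;
    calculating a confidence from the degree of match between the voice signal and the wake-up template;
    performing a weighted sum of the effective energy and the confidence to obtain the energy value of the response factor.
  22. A voice recognition device, characterized in that the voice recognition device comprises a processor and a memory, the memory stores a computer program, and the processor is configured to execute the computer program to implement the steps of the method according to any one of claims 1-21.
  23. A computer storage medium, characterized in that the computer storage medium stores a computer program, and the computer program is executed to implement the steps of the method according to any one of claims 1-21.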
PCT/CN2019/123811 2019-04-26 2019-12-06 语音识别设备及其唤醒响应方法、计算机存储介质 WO2020215736A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP19926004.3A EP3944231A4 (en) 2019-04-26 2019-12-06 VOICE RECOGNITION DEVICES AND METHOD OF WAKE-UP RESPONSE THEREOF, AND COMPUTER STORAGE MEDIA
JP2021562155A JP7279992B2 (ja) 2019-04-26 2019-12-06 音声認識デバイス及びそのウェイクアップ応答方法、コンピュータ記憶媒体
KR1020217033362A KR20210141581A (ko) 2019-04-26 2019-12-06 음성 식별 장치 및 그 웨이크업 응답 방법, 컴퓨터 저장 매체
US17/452,223 US20220044685A1 (en) 2019-04-26 2021-10-25 Voice Recognition Device, Waking-Up and Responding Method of the Same, and Computer Storage Medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910343067.8A CN111862988B (zh) 2019-04-26 2019-04-26 语音识别设备及其唤醒响应方法、计算机存储介质
CN201910343067.8 2019-04-26

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/452,223 Continuation US20220044685A1 (en) 2019-04-26 2021-10-25 Voice Recognition Device, Waking-Up and Responding Method of the Same, and Computer Storage Medium

Publications (1)

Publication Number Publication Date
WO2020215736A1 true WO2020215736A1 (zh) 2020-10-29

Family

ID=72941506

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/123811 WO2020215736A1 (zh) 2019-04-26 2019-12-06 语音识别设备及其唤醒响应方法、计算机存储介质

Country Status (6)

Country Link
US (1) US20220044685A1 (zh)
EP (1) EP3944231A4 (zh)
JP (1) JP7279992B2 (zh)
KR (1) KR20210141581A (zh)
CN (1) CN111862988B (zh)
WO (1) WO2020215736A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112164405B (zh) * 2020-11-05 2024-04-23 佛山市顺德区美的电子科技有限公司 语音设备及其唤醒方法、装置以及存储介质
CN114582337A (zh) * 2020-12-01 2022-06-03 华为技术有限公司 一种设备响应方法和装置
WO2023240649A1 (zh) * 2022-06-17 2023-12-21 北京小米移动软件有限公司 一种唤醒优先级的更新方法及装置

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107452386A (zh) * 2017-08-16 2017-12-08 联想(北京)有限公司 一种语音数据处理方法和系统
CN107622767A (zh) * 2016-07-15 2018-01-23 青岛海尔智能技术研发有限公司 家电系统的语音控制方法与家电控制系统
CN108766422A (zh) * 2018-04-02 2018-11-06 青岛海尔科技有限公司 语音设备的响应方法、装置、存储介质及计算机设备
CN109215663A (zh) * 2018-10-11 2019-01-15 北京小米移动软件有限公司 设备唤醒方法及装置
CN109377987A (zh) * 2018-08-31 2019-02-22 百度在线网络技术(北京)有限公司 智能语音设备间的交互方法、装置、设备及存储介质
CN109391528A (zh) * 2018-08-31 2019-02-26 百度在线网络技术(北京)有限公司 语音智能设备的唤醒方法、装置、设备及存储介质
CN109658927A (zh) * 2018-11-30 2019-04-19 北京小米移动软件有限公司 智能设备的唤醒处理方法、装置及管理设备

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10026399B2 (en) * 2015-09-11 2018-07-17 Amazon Technologies, Inc. Arbitration between voice-enabled devices
JP2017107333A (ja) 2015-12-08 2017-06-15 キヤノン株式会社 通信機器及び通信機器の制御方法
US10354653B1 (en) * 2016-01-19 2019-07-16 United Services Automobile Association (Usaa) Cooperative delegation for digital assistants
US10133612B2 (en) * 2016-03-17 2018-11-20 Nuance Communications, Inc. Session processing interaction between two or more virtual assistants
DK179415B1 (en) * 2016-06-11 2018-06-14 Apple Inc Intelligent device arbitration and control
US10181323B2 (en) * 2016-10-19 2019-01-15 Sonos, Inc. Arbitration-based voice recognition
US11183181B2 (en) * 2017-03-27 2021-11-23 Sonos, Inc. Systems and methods of multiple voice services
US10573171B2 (en) * 2017-05-23 2020-02-25 Lenovo (Singapore) Pte. Ltd. Method of associating user input with a device
CN107919119A (zh) * 2017-11-16 2018-04-17 百度在线网络技术(北京)有限公司 多设备交互协同的方法、装置、设备及计算机可读介质
US11631017B2 (en) * 2018-01-09 2023-04-18 Microsoft Technology Licensing, Llc Federated intelligent assistance

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3944231A4

Also Published As

Publication number Publication date
EP3944231A4 (en) 2022-05-25
JP2022529708A (ja) 2022-06-23
US20220044685A1 (en) 2022-02-10
CN111862988A (zh) 2020-10-30
KR20210141581A (ko) 2021-11-23
CN111862988B (zh) 2023-03-03
JP7279992B2 (ja) 2023-05-23
EP3944231A1 (en) 2022-01-26

Similar Documents

Publication Publication Date Title
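WO2020215736A1 (zh) Voice recognition device and wake-up response method thereof, and computer storage medium
JP7041323B2 (ja) System and method for smart plug operation
CN110085233B (zh) Voice control method and apparatus thereof, electronic device, and computer-readable storage medium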
US20150032456A1 (en) Intelligent placement of appliance response to voice command
WO2014048317A1 (zh) Method and system for controlling an air conditioner
CN113251585A (zh) Air conditioner noise control method, system, device, and storage medium
US9100207B2 (en) Systems, devices, and methods for mapping devices to realize building automation and energy management
CN110568771B (zh) System and method for intelligent linked control of smart home devices
US10979962B2 (en) Wireless system configuration of master zone devices based on signal strength analysis
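WO2021012581A1 (zh) Voice recognition device and wake-up response method thereof, and computer storage medium
WO2020224265A1 (zh) Voice control method and apparatus
WO2020215741A1 (zh) Voice recognition device and wake-up response method thereof, and computer storage medium
JP6233732B2 (ja) Air quality monitoring device, air quality monitoring system, air quality monitoring method, and program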
US20210176086A1 (en) Notification control device, notification control system, and notification control method
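WO2023193411A1 (zh) Device network configuration method and apparatus, computer device, and storage medium
WO2022268136A1 (zh) Terminal device and server for voice control
CN115019793A (zh) Wake-up method, apparatus, and system based on collaborative error correction, medium, and device
CN110160219B (zh) Air conditioning system and control method thereof, air conditioner, and control method for smart home appliance system
CN115312048A (zh) Device wake-up method and apparatus, storage medium, and electronic apparatus
CN110824277B (zh) Method and apparatus for partitioning smart home appliances, and smart home appliance
CN117542356A (zh) Voice wake-up method for intelligent device, storage medium, and electronic apparatus
CN114879527A (zh) Smart home appliance control method and apparatus based on intelligent grouping and skill matching
KR20240063131A (ko) Hierarchical mobile application launching
CN115731928A (zh) Method and apparatus for determining a responding device
CN114999484A (zh) Election method and system for interactive voice devices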

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19926004; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 20217033362; Country of ref document: KR; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2021562155; Country of ref document: JP; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2019926004; Country of ref document: EP; Effective date: 20211018)
NENP Non-entry into the national phase (Ref country code: DE)