CN112420051A - Equipment determination method, device and storage medium - Google Patents
- Publication number
- CN112420051A (application CN202011296411.1A)
- Authority
- CN
- China
- Prior art keywords
- decision
- target
- voice signal
- wake
- signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G10L15/26—Speech to text systems
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02082—Noise filtering the noise being echo, reverberation of the speech
- G10L2021/02166—Microphone arrays; Beamforming
Abstract
Embodiments of the invention provide a device determination method, apparatus, and storage medium. The method comprises: determining a wake-up decision parameter of a received first voice signal; and, in a case that a first device is allowed to be woken up by the first voice signal, sending current wake-up decision information to a target decision device, so that the target decision device determines, based on the wake-up decision information received from each device, a target device that responds to the first voice signal, wherein the current wake-up decision information comprises the wake-up decision parameter and a wake-up state parameter of the first device, and the wake-up state parameter indicates that the first device is allowed to be woken up by the first voice signal. The invention addresses the long processing time and slow wake-up response of the related art, thereby saving processing time and improving response speed.
Description
Technical Field
The embodiment of the invention relates to the field of communication, in particular to a method and a device for determining equipment and a storage medium.
Background
As intelligent voice interaction devices become more numerous, competitive voice-interaction response among multiple types of devices becomes increasingly important. In practice, multiple intelligent devices may be online simultaneously in the same area, and the same wake-up word may wake up several of them. In this scenario, to avoid all of the devices responding when a user wakes up one intelligent device, a single device must be selected from among them according to some policy to respond.
The related art adopts a distributed competitive response technique in which the magnitude of the frequency-domain energy average peak of the wake-up audio segment serves as the decision quantity of the distributed voice wake-up decision, which determines the intelligent device that responds. The processing flow is shown in fig. 1. After the microphones of each device collect the voice signal, conventional front-end signal processing is performed, including echo cancellation, noise reduction, beam forming, gain control, etc., and the result is passed to the wake-up module to detect the wake-up word. After wake-up succeeds, the original signal of that segment, as collected by the microphones, is copied and transmitted to the distributed decision module. The distributed decision module obtains the audio segment of each device, performs signal processing such as echo cancellation again, calculates the corresponding decision quantity, and selects an optimal device from among the devices according to a decision mechanism. This flow therefore requires front-end signal processing twice, and all of the steps run serially, which lengthens the processing time of the whole flow, increases the wake-up response time, and degrades the user experience.
In view of the above problems in the related art, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides a method and a device for determining equipment and a storage medium, which are used for at least solving the problems of long processing time and slow awakening response in the related art.
According to an embodiment of the present invention, there is provided a method for determining a device, applied to a first device, including: determining a wake-up decision parameter of a received first voice signal; and under the condition that the first device is allowed to be woken up by the first voice signal, sending current wake-up decision information to a target decision device, so that the target decision device determines a target device answering the first voice signal based on the received wake-up decision information sent by each device, wherein the current wake-up decision information comprises a wake-up decision parameter and a wake-up state parameter of the first device, and the wake-up state parameter is used for indicating that the first device is allowed to be woken up by the first voice signal.
Optionally, before determining the wake-up decision parameter of the received first voice signal, the method further comprises: performing front-end signal processing for eliminating interference signals on the first voice signal to obtain a target signal; performing first processing on the target signal to obtain the wake-up decision parameter, and performing second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal; wherein the first processing includes: performing energy detection on the received target signal to determine an energy value of the received target signal, wherein the wake-up decision parameter comprises the energy value; the second processing includes: acquiring a voice keyword included in the target signal; judging whether the voice keyword includes a wake-up word for waking up the first device; determining that the first device is allowed to be woken up by the first voice signal in a case that the wake-up word is included; and determining that the first device is not allowed to be woken up by the first voice signal in a case that the wake-up word is not included.
Optionally, performing the first processing on the target signal to obtain the wake-up decision parameter, and performing the second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal, includes: acquiring the target signal; and either performing the first processing on the target signal to obtain the wake-up decision parameter and, after obtaining the wake-up decision parameter, performing the second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal; or, in parallel, performing the first processing on the target signal to obtain the wake-up decision parameter and performing the second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal.
Optionally, in a case that the first device is allowed to be woken up by the first voice signal, after sending the current decision information to a target decision device, the method further includes: and when a control instruction which is fed back by the target decision equipment and is used for responding to the first voice signal is received, executing the operation of responding to the first voice signal based on the control instruction.
Optionally, before performing an operation of answering the first voice signal based on the control instruction, the method further includes: receiving the control instruction from a local first decision device, wherein the target decision device comprises the first decision device.
Optionally, after the operation of answering the first voice signal based on the control instruction, the method further includes: and when a termination instruction for terminating and answering the first voice signal fed back by the target decision equipment is received, terminating and executing the operation responding to the first voice signal based on the termination instruction.
Optionally, before terminating the execution of the operation in response to the first voice signal based on the termination instruction, the method further includes: receiving the termination instruction from a second decision device in a cloud, wherein the target decision device comprises the second decision device.
Optionally, in a case that the first device is allowed to be woken up by the first voice signal, sending current wake-up decision information to a target decision device includes: in the case that the first device is determined to be allowed to be awakened by the first voice signal, performing the following operations: sending the current awakening decision information to local first decision equipment to indicate the first decision equipment to determine target equipment responding to the first voice signal based on the received awakening decision information sent by each equipment; and sending the current wake-up decision information to a second decision device at the cloud end to instruct the second decision device to determine whether the target device decided by the first decision device is reasonable or not based on the received wake-up decision information sent by each device, and adjusting the target device for responding to the first voice signal if the determination is not reasonable, wherein the target decision device comprises the first decision device and the second decision device.
According to another embodiment of the present invention, there is also provided an apparatus for determining a device, applied to a first device, including: the determining module is used for determining a wakeup decision parameter of the received first voice signal; a sending module, configured to send current wake-up decision information to a target decision device when the first device is allowed to be woken up by the first voice signal, so that the target decision device determines a target device that responds to the first voice signal based on the received wake-up decision information sent by each device, where the current wake-up decision information includes the wake-up decision parameter and a wake-up state parameter of the first device, and the wake-up state parameter is used to indicate that the first device is allowed to be woken up by the first voice signal.
According to a further embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory in which a computer program is stored and a processor configured to execute the computer program to perform the steps in any of the above method embodiments.
According to the invention, both the wake-up decision parameter and whether the device is allowed to be woken up are determined before any information is transmitted to the target decision device, so front-end signal processing needs to be performed on the received first voice signal only once. This effectively saves processing time, improves response speed, and solves the problems of long processing time and slow wake-up response in the related art.
Drawings
Fig. 1 is a flow chart of distributed signal processing in the related art;
Fig. 2 is a block diagram of a hardware configuration of a mobile terminal for a device determination method according to an embodiment of the present invention;
Fig. 3 is a flow chart of a device determination method according to an embodiment of the invention;
Fig. 4 is a flow chart of distributed voice wake-up signal processing according to an embodiment of the invention;
Fig. 5 is a structural block diagram of a device determination apparatus according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the embodiments of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking the operation on the mobile terminal as an example, fig. 2 is a hardware structure block diagram of the mobile terminal of a method for determining a device according to an embodiment of the present invention. As shown in fig. 2, the mobile terminal may comprise one or more processors 202 (only one is shown in fig. 2) (the processor 202 may comprise, but is not limited to, a processing means such as a microprocessor MCU or a programmable logic device FPGA), and a memory 204 for storing data, wherein the mobile terminal may further comprise a transmission device 206 for communication functions and an input-output device 208. It will be understood by those skilled in the art that the structure shown in fig. 2 is only an illustration, and does not limit the structure of the mobile terminal. For example, the mobile terminal may also include more or fewer components than shown in FIG. 2, or have a different configuration than shown in FIG. 2.
The memory 204 can be used for storing computer programs, for example, software programs and modules of application software, such as computer programs corresponding to the determination method of the device in the embodiment of the present invention, and the processor 202 executes various functional applications and data processing by running the computer programs stored in the memory 204, so as to implement the method described above. Memory 204 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 204 may further include memory located remotely from the processor 202, which may be connected to the mobile terminal through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 206 is used for receiving or transmitting data via a network. Specific examples of the network may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device 206 includes a network interface controller (NIC), which can be connected to other network devices through a base station so as to communicate with the internet. In one example, the transmission device 206 can be a radio frequency (RF) module, which communicates with the internet wirelessly.
In this embodiment, a method for determining a device is provided, and fig. 3 is a flowchart of a method for determining a device according to an embodiment of the present invention, as shown in fig. 3, the flowchart includes the following steps:
step S302, determining a wakeup decision parameter of the received first voice signal;
step S304, in a case that the first device is allowed to be woken up by the first voice signal, sending current wake-up decision information to a target decision device, so that the target decision device determines a target device that responds to the first voice signal based on the received wake-up decision information sent by each device, where the current wake-up decision information includes the wake-up decision parameter and a wake-up state parameter of the first device, and the wake-up state parameter is used to indicate that the first device is allowed to be woken up by the first voice signal.
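Steps S302 and S304 can be illustrated with a minimal sketch, under stated assumptions. The patent does not specify a message format or transport, so the `WakeDecisionInfo` fields, the `handle_voice_signal` function, and the `send` callback are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class WakeDecisionInfo:
    """Hypothetical layout of the current wake-up decision information."""
    device_id: str
    decision_parameter: float  # wake-up decision parameter, e.g. an energy value
    wake_allowed: bool         # wake-up state parameter


def handle_voice_signal(
    device_id: str,
    decision_parameter: float,
    wake_allowed: bool,
    send: Callable[[WakeDecisionInfo], None],
) -> Optional[WakeDecisionInfo]:
    """Step S304: report to the target decision device only when wake-up is allowed."""
    if not wake_allowed:
        return None  # nothing is sent, so this device does not compete
    info = WakeDecisionInfo(device_id, decision_parameter, wake_allowed=True)
    send(info)
    return info


# Usage: the list stands in for the link to the target decision device.
outbox = []
handle_voice_signal("speaker", 0.82, True, outbox.append)
handle_voice_signal("tv", 0.40, False, outbox.append)
print([message.device_id for message in outbox])
```

In this sketch a device that is not allowed to be woken up sends nothing, which matches the condition stated in step S304; whether such a device reports a negative state instead is not addressed by the text.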
The above operations may be executed by the first device (also referred to below as the first terminal), which may be a smart home device with a sound pickup function, a background processor, or another device with similar processing capabilities. The above steps may be applied in a home environment, an office environment, a factory environment, etc.
In the above embodiment, the first terminal receives the first voice signal. At the same time, other terminals in the same environment may also receive the first voice signal. Because the terminals may be located at different positions, the wake-up decision parameters they detect differ: for example, the energy value of the first voice signal, or the angle at which it is received, differs from terminal to terminal. That is, the wake-up decision parameter may include an energy value or a reception angle of the first voice signal, and the like, and serves as the basis for deciding whether a terminal that received the first voice signal is the final response terminal (for example, the terminal that detects the largest energy value is determined to be the final response terminal, or the terminal facing the emission direction of the first voice signal, as determined from the reception angle, is determined to be the final response terminal). The wake-up decision parameter may therefore also be referred to as a decision quantity.
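As an illustration of the largest-energy rule mentioned above, the following sketch shows how a target decision device might pick the final response terminal. The function name `select_responder` and the dictionary input are assumptions; the patent leaves the decision mechanism open and also allows other parameters, such as the reception angle.

```python
from typing import Dict, Optional


def select_responder(energy_by_device: Dict[str, float]) -> Optional[str]:
    """Return the device that reported the largest energy value.

    Only devices that were allowed to be woken up are expected to appear
    in the input. Ties go to the device inserted first, which is an
    arbitrary choice made for this sketch.
    """
    if not energy_by_device:
        return None
    return max(energy_by_device, key=energy_by_device.get)


reports = {"speaker": 0.82, "tv": 0.40, "fridge": 0.61}
print(select_responder(reports))
```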
In the above embodiment, both the wake-up decision parameter and whether the device is allowed to be woken up are determined before any information is transmitted to the target decision device, so front-end signal processing needs to be performed on the received first voice signal only once. This effectively saves processing time, improves response speed, and solves the problems of long processing time and slow wake-up response in the related art.
In an optional embodiment, before determining the wake-up decision parameter of the received first voice signal, the method further comprises: performing front-end signal processing for eliminating interference signals on the first voice signal to obtain a target signal; performing first processing on the target signal to obtain the wake-up decision parameter, and performing second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal; wherein the first processing includes: performing energy detection on the received target signal to determine an energy value of the received target signal, wherein the wake-up decision parameter comprises the energy value; the second processing includes: acquiring a voice keyword included in the target signal; judging whether the voice keyword includes a wake-up word for waking up the first device; determining that the first device is allowed to be woken up by the first voice signal in a case that the wake-up word is included; and determining that the first device is not allowed to be woken up by the first voice signal in a case that the wake-up word is not included. In this embodiment, the front-end signal processing may include at least echo cancellation, noise reduction, beam forming, gain control, etc. to eliminate the interference signal. The second processing, in effect, detects whether the target signal includes a wake-up word capable of waking up the first device: if it does, the first device is allowed to be woken up by the first voice signal; if it does not, the first device is not allowed to be woken up by the first voice signal.
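The first and second processing can be sketched as two small functions that both operate on the already-processed target signal. This is a minimal illustration, not the patent's implementation: the energy is taken here as the mean squared amplitude (one common definition, assumed for this sketch), and keyword recognition is assumed to have already produced a list of keywords.

```python
from typing import Iterable, Sequence


def energy_value(target_signal: Sequence[float]) -> float:
    """First processing: mean squared amplitude of the target signal."""
    if not target_signal:
        return 0.0
    return sum(sample * sample for sample in target_signal) / len(target_signal)


def wake_allowed(keywords: Iterable[str], wake_word: str) -> bool:
    """Second processing: True when the recognised keywords include the wake-up word."""
    return wake_word in set(keywords)


# Both functions consume the output of a single front-end pass.
target_signal = [0.5, -0.5, 0.5, -0.5]
print(energy_value(target_signal))
print(wake_allowed(["hello", "device"], "device"))
```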
In an optional embodiment, performing the first processing on the target signal to obtain the wake-up decision parameter and performing the second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal may be carried out in parallel or in series. In the serial case, the first processing is performed first and the second processing afterwards, and the two may be performed by different modules in the first device. For example, the operations may specifically include: acquiring the target signal by using a first module included in the first device, and performing the first processing on the target signal to obtain the wake-up decision parameter; transmitting, by the first module, the wake-up decision parameter and the target signal to a second module included in the first device; and performing, by the second module, the second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal, wherein the second module is further configured to generate the wake-up decision information if it is determined that the first device is allowed to be woken up by the first voice signal. In this embodiment, the first module first obtains the wake-up decision parameter and then sends the wake-up decision parameter and the target signal to the second module, which performs the second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal.
Therefore, in this embodiment, the wake-up detection operation does not need to perform front-end signal processing on the first voice signal again; it operates directly on the target signal already obtained. A second round of front-end signal processing is thus avoided, which effectively saves processing time and improves response speed.
Optionally, the flow of executing the first processing and the second processing may further include: acquiring the target signal in parallel by using a first module included in the first device and a second module included in the first device; the first module is used for performing the first processing on the target signal to obtain the wake-up decision parameter, and the second module is used for performing the second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal, wherein the first module is further used for sending the wake-up decision parameter to the second module, and the second module is further used for generating the current wake-up decision information under the condition that the first device is allowed to be woken up by the first voice signal. In this embodiment, after the target signal is obtained, the first processing and the second processing may be performed in parallel based on the obtained target signal, thereby effectively saving processing time.
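The parallel variant can be sketched with two workers that share the same target signal. The names `first_processing`, `second_processing`, and `process_in_parallel` are hypothetical, and the two workers stand in for the first and second modules. Threads are used only to show the structure; in standard CPython they do not by themselves guarantee a speed-up for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional, Sequence


def first_processing(target_signal: Sequence[float]) -> float:
    """First module: energy detection (mean squared amplitude, assumed)."""
    return sum(s * s for s in target_signal) / len(target_signal)


def second_processing(keywords: List[str], wake_word: str) -> bool:
    """Second module: wake-up word check on already-recognised keywords."""
    return wake_word in keywords


def process_in_parallel(
    target_signal: Sequence[float], keywords: List[str], wake_word: str
) -> Optional[dict]:
    """Run both processings concurrently, then build the decision information."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        energy_future = pool.submit(first_processing, target_signal)
        wake_future = pool.submit(second_processing, keywords, wake_word)
        energy = energy_future.result()
        allowed = wake_future.result()
    if not allowed:
        return None  # no wake-up decision information is generated
    return {"decision_parameter": energy, "wake_allowed": True}


print(process_in_parallel([1.0, -1.0], ["hello", "device"], "device"))
```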
In an optional embodiment, after the current wake-up decision information is sent to the target decision device in a case where the first device is allowed to be woken up by the first voice signal, the method further includes: when a control instruction fed back by the target decision device for responding to the first voice signal is received, executing the operation of responding to the first voice signal based on the control instruction. In this embodiment, when the target decision device determines that the first device is the final response device, it may send a control instruction to the first device to instruct the first device to respond to the first voice signal. In practice, the final response device determined by the target decision device may instead be another terminal device, in which case the target decision device sends the control instruction to that terminal device to instruct it to respond to the first voice signal.
In an optional embodiment, before performing the operation of answering the first voice signal based on the control instruction, the method further comprises: receiving the control instruction from a local first decision device located in the first device, wherein the target decision device comprises the first decision device. In this embodiment, the target decision device may include a first decision device located in the first device. It should be noted that the target decision device may be distributed locally, for example with one instance located in each of a plurality of devices capable of receiving the first voice signal. Each local target decision device may receive the reference information (such as the wake-up decision information) sent by the other devices, perform the above comprehensive decision based on the information received, and thereby determine whether the device in which it is located is the final response device. If so, it controls that device to respond to the first voice signal; if not, that device does not respond.
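The locally distributed decision can be sketched as every device running the same rule over its own value and the values received from its peers, then checking whether it is itself the winner. The function `should_respond` and the tie-break on device identifier are assumptions made so that all devices reach the same answer; the patent does not specify them.

```python
from typing import Dict


def should_respond(own_id: str, own_energy: float, peer_energy: Dict[str, float]) -> bool:
    """Decide locally whether this device is the final response device.

    Every device applies the same deterministic rule: largest energy wins,
    and equal energies are broken by the larger device identifier
    (a hypothetical tie-break chosen for this sketch).
    """
    candidates = dict(peer_energy)
    candidates[own_id] = own_energy
    winner = max(candidates, key=lambda device: (candidates[device], device))
    return winner == own_id


# Each device evaluates the same reports from its own point of view.
print(should_respond("speaker", 0.82, {"tv": 0.40, "fridge": 0.61}))
print(should_respond("tv", 0.40, {"speaker": 0.82, "fridge": 0.61}))
```

Because the rule is deterministic, exactly one device answers as long as every device sees the same set of reports; a device that misses a report may decide wrongly, which is the case the cloud-side check described later is meant to correct.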
In an optional embodiment, after the operation of answering the first voice signal based on the control instruction, the method further comprises: and when a termination instruction for terminating and answering the first voice signal fed back by the target decision equipment is received, terminating and executing the operation responding to the first voice signal based on the termination instruction. In this embodiment, the target decision device located locally may misjudge the final responding device, and in this case, the determined responding device needs to be timely controlled to terminate responding to the first voice signal, so that the actual responding device can continue to respond to the first voice signal.
In an optional embodiment, before terminating execution of the operation in response to the first voice signal based on the termination instruction, the method further includes: receiving the termination instruction from a second decision device in a cloud, wherein the target decision device comprises the second decision device. In this embodiment, the target decision device may further include a second decision device located at the cloud, where the second decision device may further determine a decision result of the local first decision device, so as to determine whether the response device determined by the local first decision device is an actual response device, and if not, the response device needs to be corrected in time.
In an optional embodiment, sending the current wake-up decision information to the target decision device in the case that the first device is allowed to be woken up by the first voice signal includes: in the case that the first device is determined to be allowed to be woken up by the first voice signal, performing the following operations: sending the current wake-up decision information to a local first decision device, to instruct the first decision device to determine the target device for responding to the first voice signal based on the received wake-up decision information sent by each device; and sending the current wake-up decision information to a second decision device in the cloud, to instruct the second decision device to determine, based on the received wake-up decision information sent by each device, whether the target device decided by the first decision device is reasonable, and to adjust the target device for responding to the first voice signal if it is unreasonable; wherein the target decision device comprises the first decision device and the second decision device. In this embodiment, the second decision device uses the current wake-up decision information together with the wake-up decision information from the other devices to judge whether the target device decided by the first decision device is reasonable, and thus whether the response device determined by the first decision device is the actual response device.
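The review performed by the second decision device can be sketched as follows. This is an illustrative sketch under stated assumptions: the source does not define what "reasonable" means, so the rule used here (highest energy value wins, ties broken by device ID) and the names `pick_by_energy` and `review_local_decision` are hypothetical.

```python
from typing import Dict, List, Tuple


def pick_by_energy(reports: Dict[str, float]) -> str:
    """Assumed rule: highest energy wins; ties go to the smallest device ID."""
    return min(reports, key=lambda dev: (-reports[dev], dev))


def review_local_decision(local_choice: str,
                          reports: Dict[str, float]) -> List[Tuple[str, str]]:
    """Return the (device_id, instruction) pairs the cloud would send, if any."""
    cloud_choice = pick_by_energy(reports)
    if cloud_choice == local_choice:
        return []  # local decision judged reasonable; nothing to adjust
    # Local decision judged unreasonable: stop the wrong device, start the right one.
    return [(local_choice, "terminate"), (cloud_choice, "control")]


# Energy values reported by each woken device (made-up numbers for illustration).
reports = {"tv": 41.0, "speaker": 55.5, "fridge": 47.2}
corrections = review_local_decision("fridge", reports)
```

Here `corrections` contains a termination instruction for the device chosen locally and a control instruction for the device the cloud considers the actual responder; when both decisions agree the list is empty.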
The invention will now be described with reference to a specific embodiment:
As shown in Fig. 4, the main process of the signal processing method for improving the distributed voice wake-up response speed according to the embodiment of the present invention includes the following steps:
S402, the microphone array collects sound. The microphone array in the smart device collects a voice signal (corresponding to the first voice signal); each device with a sound collecting function in a specific area collects sound using its built-in microphone array.
S404, voice signal processing. In this step, front-end signal processing such as echo cancellation, noise cancellation, and beamforming is performed on the voice signal collected by the microphone array, to remove interference that does not come from the sound source, such as echoes and noise.
S406, decision quantity calculation. In this step, the voice signal segment obtained from the voice signal processing is used to calculate a decision quantity (i.e., the operation of obtaining the above-mentioned wake-up decision parameter is performed), yielding a decision quantity that can be used for distributed decision-making. The decision quantity may be calculated differently depending on the decision mechanism.
S408, wake-up. In this step, the signal segment obtained from the voice signal processing may be transmitted to the wake-up processing module (corresponding to the second module) for wake-up word detection. If a wake-up word is detected, the wake-up state and the decision quantity calculated in step S406 are transmitted together to the subsequent distributed decision processing; if no wake-up word is detected, the decision quantity is not transmitted to the distributed decision module, and no subsequent processing is performed.
S410, distributed decision and response. In this step, the decision quantities transmitted by the wake-up processing modules are obtained, the quantitative relation between the decision quantities of the devices is analyzed and judged according to a predefined decision rule, and a single device is selected according to the decision rule to respond, while the other devices remain silent.
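Step S410 can be sketched as a small arbitration function. The source leaves the predefined decision rule open, so the rule below (largest decision quantity wins, ties resolved by the smallest device ID) and the function name `arbitrate` are assumptions made only for illustration.

```python
from typing import Dict


def arbitrate(decision_quantities: Dict[str, float]) -> Dict[str, str]:
    """Map each woken device to "respond" or "silent" under the assumed rule."""
    if not decision_quantities:
        return {}
    # max() returns the first maximal element, so iterating over sorted IDs
    # makes ties resolve to the smallest device ID.
    winner = max(sorted(decision_quantities), key=decision_quantities.get)
    return {dev: ("respond" if dev == winner else "silent")
            for dev in decision_quantities}


outcome = arbitrate({"tv": 41.0, "speaker": 55.5, "fridge": 47.2})
```

A deterministic tie-break matters when several local decision devices evaluate the same reports independently, because it lets them reach the same conclusion without further coordination.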
As can be seen from the foregoing embodiment, in the present invention the calculation module for the distributed wake-up decision quantity (corresponding to the first module) is placed before the wake-up processing module and combined with the front-end voice signal processing. The calculated decision quantity and the processed audio are transmitted to the wake-up processing module, and whether the decision quantity continues on to the distributed decision and response module (corresponding to the target decision device) is determined according to the wake-up state. Because the decision quantity is calculated together with the front-end voice signal processing, the original voice signal no longer needs to be transmitted to the distributed decision module for signal processing again after a single device is woken up, which shortens the response time of distributed voice wake-up and improves the user experience.
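The ordering described above, with the decision quantity computed before wake-up processing and forwarded only when the device is woken, can be sketched as follows. This is a hedged illustration: the source only says the decision parameter includes an energy value, so the mean-square energy in dB, the function names `energy_db` and `handle_segment`, and the `detect_wake_word` callable standing in for a real wake-word detector are all assumptions.

```python
import math
from typing import Callable, Optional, Sequence, Tuple


def energy_db(samples: Sequence[float]) -> float:
    """Mean-square energy in dB, floored so silence does not hit log10(0)."""
    if not samples:
        raise ValueError("empty segment")
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-12))


def handle_segment(processed: Sequence[float],
                   detect_wake_word: Callable[[Sequence[float]], bool]
                   ) -> Optional[Tuple[bool, float]]:
    """Return (wake-up state, decision quantity) to forward, or None."""
    quantity = energy_db(processed)        # decision quantity, computed first
    if not detect_wake_word(processed):    # wake-up processing
        return None                        # nothing is forwarded
    return True, quantity                  # forwarded to the distributed decision


segment = [1.0, -1.0, 1.0, -1.0]           # already front-end processed (toy data)
forwarded = handle_segment(segment, lambda seg: True)
dropped = handle_segment(segment, lambda seg: False)
```

Because the quantity is already available when the wake-up word is detected, the distributed decision stage in this sketch never needs the raw audio again.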
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
In this embodiment, a device determination apparatus is further provided. The apparatus is used to implement the foregoing embodiments and preferred embodiments, and what has already been described is not repeated. As used below, the term "module" may refer to a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 5 is a structural block diagram of a device determination apparatus according to an embodiment of the present invention. As shown in Fig. 5, the apparatus includes:
a determining module 52, configured to determine a wake-up decision parameter of the received first voice signal;
a sending module 54, configured to send current wake-up decision information to a target decision device when the first device is allowed to be woken up by the first voice signal, so that the target decision device determines a target device that responds to the first voice signal based on the received wake-up decision information sent by each device, where the current wake-up decision information includes the wake-up decision parameter and a wake-up state parameter of the first device, and the wake-up state parameter is used to indicate that the first device is allowed to be woken up by the first voice signal.
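The wake-up decision information handled by the sending module can be pictured as a small message carrying the two parameters named above. The sketch below is only one possible encoding: the source does not specify a wire format, so the use of JSON, the `device_id` field, and the names `WakeDecisionInfo`, `encode`, and `decode` are assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class WakeDecisionInfo:
    device_id: str             # hypothetical field identifying the sending device
    decision_parameter: float  # wake-up decision parameter, e.g. an energy value
    wake_allowed: bool         # wake-up state parameter


def encode(info: WakeDecisionInfo) -> str:
    """Serialize the information for transmission to a decision device."""
    return json.dumps(asdict(info), sort_keys=True)


def decode(payload: str) -> WakeDecisionInfo:
    """Rebuild the information on the decision device side."""
    return WakeDecisionInfo(**json.loads(payload))


message = encode(WakeDecisionInfo("speaker-1", 55.5, True))
received = decode(message)
```

Per the embodiment, such a message would only be sent when the device is allowed to be woken up, so `wake_allowed` is true in every message that is actually transmitted.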
In an optional embodiment, the apparatus is further configured to, before determining a wake-up decision parameter of the received first voice signal, perform front-end signal processing for eliminating interference signals on the first voice signal to obtain a target signal; and perform first processing on the target signal to obtain the wake-up decision parameter, and second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal; wherein the first processing includes: performing energy detection on the received target signal to determine an energy value of the received target signal, wherein the wake-up decision parameter comprises the energy value; and the second processing includes: acquiring a voice keyword included in the target signal; judging whether the voice keyword comprises a wake-up word for waking up the first device; determining that the first device is allowed to be woken up by the first voice signal if the voice keyword comprises the wake-up word; and determining that the first device is not allowed to be woken up by the first voice signal if the voice keyword does not comprise the wake-up word.
In an optional embodiment, the apparatus may perform the first processing on the target signal to obtain the wake-up decision parameter, and the second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal, in the following manner: acquiring the target signal; and either performing the first processing on the target signal to obtain the wake-up decision parameter and, after the wake-up decision parameter is obtained, performing the second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal; or performing, in parallel, the first processing on the target signal to obtain the wake-up decision parameter and the second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal.
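The two orderings described in this embodiment, sequential and parallel, can be contrasted in a short sketch. The processing functions here are toy placeholders (a sum of squares for energy detection and a simple amplitude check standing in for wake-word detection), and the names `run_sequential` and `run_parallel` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence, Tuple

Signal = Sequence[float]


def run_sequential(signal: Signal,
                   first: Callable[[Signal], float],
                   second: Callable[[Signal], bool]) -> Tuple[float, bool]:
    """First processing, then second processing on the same target signal."""
    parameter = first(signal)
    allowed = second(signal)
    return parameter, allowed


def run_parallel(signal: Signal,
                 first: Callable[[Signal], float],
                 second: Callable[[Signal], bool]) -> Tuple[float, bool]:
    """Submit both processings at once and wait for both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        parameter_future = pool.submit(first, signal)
        allowed_future = pool.submit(second, signal)
        return parameter_future.result(), allowed_future.result()


def toy_energy(signal: Signal) -> float:
    return sum(s * s for s in signal)             # placeholder energy detection


def toy_wake_check(signal: Signal) -> bool:
    return max(abs(s) for s in signal) > 0.5      # placeholder, not a real detector


target_signal = [0.1, 0.9, -0.3]
sequential_result = run_sequential(target_signal, toy_energy, toy_wake_check)
parallel_result = run_parallel(target_signal, toy_energy, toy_wake_check)
```

Both orderings yield the same pair of results because the two processings only read the target signal. Note that CPython threads mainly illustrate the control flow here; real concurrency gains on an embedded device would depend on the platform.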
In an optional embodiment, the apparatus is further configured to, after the current wake-up decision information has been sent to the target decision device in the case that the first device is allowed to be woken up by the first voice signal, perform the operation of answering the first voice signal based on a control instruction when the control instruction for answering the first voice signal, fed back by the target decision device, is received.
In an alternative embodiment, the apparatus is configured to receive the control instruction from a local first decision device before performing an operation to reply to the first speech signal based on the control instruction, wherein the target decision device comprises the first decision device.
In an optional embodiment, the apparatus is further configured to, after performing the operation of answering the first voice signal based on the control instruction, terminate the operation of responding to the first voice signal based on a termination instruction when the termination instruction for terminating the answering of the first voice signal, fed back by the target decision device, is received.
In an optional embodiment, the apparatus is further configured to receive the termination instruction from a second decision device in a cloud before terminating execution of the operation in response to the first voice signal based on the termination instruction, where the target decision device includes the second decision device.
In an optional embodiment, the sending module is configured to, if it is determined that the first device is allowed to be woken up by the first voice signal, perform the following operations: sending the current wake-up decision information to a local first decision device, to instruct the first decision device to determine the target device for responding to the first voice signal based on the received wake-up decision information sent by each device; and sending the current wake-up decision information to a second decision device in the cloud, to instruct the second decision device to determine, based on the received wake-up decision information sent by each device, whether the target device decided by the first decision device is reasonable, and to adjust the target device for responding to the first voice signal if it is unreasonable; wherein the target decision device comprises the first decision device and the second decision device.
It should be noted that the above modules may be implemented in software or hardware. For the latter, this may be achieved in, but is not limited to, the following ways: the modules are all located in the same processor; or the modules are located in different processors in any combination.
Embodiments of the present invention also provide a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above-mentioned method embodiments when executed.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to, various media capable of storing computer programs, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
In an exemplary embodiment, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
For specific examples in this embodiment, reference may be made to the examples described in the above embodiments and exemplary embodiments, and details of this embodiment are not repeated herein.
The method for improving the distributed voice awakening response speed provided by the embodiment of the invention can achieve the following beneficial effects:
The calculation of the decision quantity is performed synchronously with the front-end signal processing, which simplifies the signal processing flow after wake-up.
Redundant processing modules are reduced, which shortens the signal processing time and therefore the distributed wake-up response time.
The shorter wake-up response time makes the transition from voice wake-up to interaction smoother and improves the user experience.
It will be apparent to those skilled in the art that the modules or steps of the invention described above may be implemented using a general-purpose computing device. They may be centralized on a single computing device or distributed across a network of computing devices, and they may be implemented as program code executable by a computing device, so that they can be stored in a memory device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that described herein; alternatively, the modules or steps may each be fabricated as a separate integrated circuit module, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A method for determining a device, applied to a first device, includes:
determining a wake-up decision parameter of a received first voice signal;
in the case that the first device is allowed to be woken up by the first voice signal, sending current wake-up decision information to a target decision device, so that the target decision device determines a target device for answering the first voice signal based on the received wake-up decision information sent by each device, wherein the current wake-up decision information comprises the wake-up decision parameter and a wake-up state parameter of the first device, and the wake-up state parameter is used for indicating that the first device is allowed to be woken up by the first voice signal.
2. The method of claim 1, wherein prior to determining the wake decision parameter for the received first voice signal, the method further comprises:
performing front-end signal processing for eliminating interference signals on the first voice signal to obtain a target signal;
performing first processing on the target signal to obtain the wake-up decision parameter, and performing second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal;
wherein the first processing includes: performing energy detection on the received target signal to determine an energy value of the received target signal, wherein the wake-up decision parameter comprises the energy value;
the second processing includes: acquiring a voice keyword included in the target signal; judging whether the voice keyword comprises a wake-up word for waking up the first device; determining that the first device is allowed to be woken up by the first voice signal in the case that the voice keyword comprises the wake-up word; and determining that the first device is not allowed to be woken up by the first voice signal in the case that the voice keyword does not comprise the wake-up word.
3. The method of claim 2, wherein first processing the target signal to obtain the wake-up decision parameter, and second processing the target signal to determine whether the first device is allowed to be woken up by the first voice signal comprises:
acquiring the target signal;
performing first processing on the target signal to obtain the wake-up decision parameter, and, after obtaining the wake-up decision parameter, performing second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal; or,
performing, in parallel, first processing on the target signal to obtain the wake-up decision parameter and second processing on the target signal to determine whether the first device is allowed to be woken up by the first voice signal.
4. The method of claim 1, wherein after sending the current wake-up decision information to the target decision device in the case that the first device is allowed to be woken up by the first voice signal, the method further comprises:
when a control instruction for answering the first voice signal, fed back by the target decision device, is received, performing the operation of answering the first voice signal based on the control instruction.
5. The method of claim 4, wherein prior to performing an operation to answer the first speech signal based on the control instruction, the method further comprises:
receiving the control instruction from a local first decision device, wherein the target decision device comprises the first decision device.
6. The method of claim 4, wherein after performing the operation of answering the first voice signal based on the control instruction, the method further comprises:
when a termination instruction for terminating the answering of the first voice signal, fed back by the target decision device, is received, terminating the operation of responding to the first voice signal based on the termination instruction.
7. The method according to claim 6, wherein before terminating execution of the operation responsive to the first speech signal based on the termination instruction, the method further comprises:
receiving the termination instruction from a second decision device in a cloud, wherein the target decision device comprises the second decision device.
8. The method of claim 1, wherein sending current wake-up decision information to a target decision device in the case that the first device is allowed to be woken up by the first voice signal comprises:
in the case that the first device is determined to be allowed to be woken up by the first voice signal, performing the following operations:
sending the current wake-up decision information to a local first decision device, to instruct the first decision device to determine the target device for responding to the first voice signal based on the received wake-up decision information sent by each device; and
sending the current wake-up decision information to a second decision device in the cloud, to instruct the second decision device to determine, based on the received wake-up decision information sent by each device, whether the target device decided by the first decision device is reasonable, and to adjust the target device for responding to the first voice signal if it is unreasonable;
wherein the target decision device comprises the first decision device and the second decision device.
9. A computer-readable storage medium, in which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method as claimed in any of claims 1 to 9 are implemented when the computer program is executed by the processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011296411.1A CN112420051A (en) | 2020-11-18 | 2020-11-18 | Equipment determination method, device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011296411.1A CN112420051A (en) | 2020-11-18 | 2020-11-18 | Equipment determination method, device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112420051A true CN112420051A (en) | 2021-02-26 |
Family
ID=74774087
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011296411.1A Pending CN112420051A (en) | 2020-11-18 | 2020-11-18 | Equipment determination method, device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112420051A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113593548A (en) * | 2021-06-29 | 2021-11-02 | 青岛海尔科技有限公司 | Awakening method and device of intelligent equipment, storage medium and electronic device |
CN114168208A (en) * | 2021-12-07 | 2022-03-11 | 思必驰科技股份有限公司 | Wake-up decision method, electronic device and storage medium |
CN114678016A (en) * | 2021-04-23 | 2022-06-28 | 美的集团(上海)有限公司 | Device wake-up method and system, electronic device and storage medium |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012073149A (en) * | 2010-09-29 | 2012-04-12 | Mitsubishi Electric Corp | Wake display device |
CN102999161A (en) * | 2012-11-13 | 2013-03-27 | 安徽科大讯飞信息科技股份有限公司 | Implementation method and application of voice awakening module |
CN107450722A (en) * | 2017-07-12 | 2017-12-08 | 深圳纬目信息技术有限公司 | A kind of dual interactive controlling head-mounted display apparatus and exchange method with noise filter function |
CN109887505A (en) * | 2019-03-11 | 2019-06-14 | 百度在线网络技术(北京)有限公司 | Method and apparatus for wake-up device |
CN110288997A (en) * | 2019-07-22 | 2019-09-27 | 苏州思必驰信息科技有限公司 | Equipment awakening method and system for acoustics networking |
US10460749B1 (en) * | 2018-06-28 | 2019-10-29 | Nuvoton Technology Corporation | Voice activity detection using vocal tract area information |
CN111223497A (en) * | 2020-01-06 | 2020-06-02 | 苏州思必驰信息科技有限公司 | Nearby wake-up method and device for terminal, computing equipment and storage medium |
CN111640431A (en) * | 2020-04-30 | 2020-09-08 | 海尔优家智能科技(北京)有限公司 | Equipment response processing method and device |
CN111696562A (en) * | 2020-04-29 | 2020-09-22 | 华为技术有限公司 | Voice wake-up method, device and storage medium |
Non-Patent Citations (1)
Title |
---|
Wang Xiaofei et al.: "Speech pickup technology with selective attention capability", Scientia Sinica Informationis (《中国科学:信息科学》) * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114678016A (en) * | 2021-04-23 | 2022-06-28 | 美的集团(上海)有限公司 | Device wake-up method and system, electronic device and storage medium |
CN113593548A (en) * | 2021-06-29 | 2021-11-02 | 青岛海尔科技有限公司 | Awakening method and device of intelligent equipment, storage medium and electronic device |
CN113593548B (en) * | 2021-06-29 | 2023-12-19 | 青岛海尔科技有限公司 | Method and device for waking up intelligent equipment, storage medium and electronic device |
CN114168208A (en) * | 2021-12-07 | 2022-03-11 | 思必驰科技股份有限公司 | Wake-up decision method, electronic device and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112420051A (en) | Equipment determination method, device and storage medium | |
CN111223497B (en) | Nearby wake-up method and device for terminal, computing equipment and storage medium | |
CN106910500B (en) | Method and device for voice control of device with microphone array | |
CN112037789A (en) | Equipment awakening method and device, storage medium and electronic device | |
CN110265052B (en) | Signal-to-noise ratio determining method and device for radio equipment, storage medium and electronic device | |
CN111599371B (en) | Voice adding method, system, device and storage medium | |
CN205508398U (en) | Intelligent robot with high in clouds interactive function | |
CN110875045A (en) | Voice recognition method, intelligent device and intelligent television | |
CN109450747B (en) | Method and device for awakening smart home equipment and computer storage medium | |
CN113593548B (en) | Method and device for waking up intelligent equipment, storage medium and electronic device | |
CN111667843B (en) | Voice wake-up method and system for terminal equipment, electronic equipment and storage medium | |
CN109841214A (en) | Voice wakes up processing method, device and storage medium | |
CN110097884B (en) | Voice interaction method and device | |
CN112837686A (en) | Wake-up response operation execution method and device, storage medium and electronic device | |
CN111640431A (en) | Equipment response processing method and device | |
CN111354336B (en) | Distributed voice interaction method, device, system and household appliance | |
CN112837694B (en) | Equipment awakening method and device, storage medium and electronic device | |
CN112423176A (en) | Earphone noise reduction method and device, storage medium and noise reduction earphone | |
CN112992170B (en) | Model training method and device, storage medium and electronic device | |
CN110309284B (en) | Automatic answer method and device based on Bayesian network reasoning | |
CN112466305B (en) | Voice control method and device of water dispenser | |
CN112201239B (en) | Determination method and device of target equipment, storage medium and electronic device | |
CN113889116A (en) | Voice information processing method and device, storage medium and electronic device | |
CN114550719A (en) | Method and device for recognizing voice control instruction and storage medium | |
CN113870879A (en) | Sharing method of microphone of intelligent household appliance, intelligent household appliance and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210226 |