CN111128169A - Voice wake-up method and device - Google Patents

Publication number
CN111128169A
Authority
CN
China
Prior art keywords
awakened
equipment
devices
noise data
energy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911402716.3A
Other languages
Chinese (zh)
Inventor
丁少为
关海欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Intelligent Technology Co Ltd
Xiamen Yunzhixin Intelligent Technology Co Ltd
Original Assignee
Unisound Intelligent Technology Co Ltd
Xiamen Yunzhixin Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unisound Intelligent Technology Co Ltd, Xiamen Yunzhixin Intelligent Technology Co Ltd filed Critical Unisound Intelligent Technology Co Ltd
Priority to CN201911402716.3A priority Critical patent/CN111128169A/en
Publication of CN111128169A publication Critical patent/CN111128169A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique

Abstract

The invention relates to a voice wake-up method and a voice wake-up apparatus. The method includes: determining the awakening voice of an awakening word received by each of a plurality of devices to be awakened; determining the noise data obtained by each device to be awakened; selecting a target awakened device from the devices to be awakened according to the awakening voice received by each device to be awakened and the noise data obtained by each device to be awakened; and responding to the awakening voice through the target awakened device. This technical solution improves the accuracy with which the awakened device is determined, and therefore the wake-up accuracy, and prevents the other devices to be awakened from being mistakenly awakened as the device that actually needs to be awakened.

Description

Voice wake-up method and device
Technical Field
The present invention relates to the field of voice technologies, and in particular, to a voice wake-up method and apparatus.
Background
At present, with the popularization of intelligent voice devices, a home environment may contain several different devices that use the same awakening word (for example, a television, a refrigerator, an air conditioner, and a washing machine that are all awakened by the same awakening word). In such a scene, a single call is likely to trigger responses from many devices at once. The simplest way to solve this problem is to select the closest device according to the signal energy of the awakening word received by each device: the farther sound propagates, the more its energy attenuates, so the device closest to the user receives the awakening word with the greatest energy. The device with the greatest energy is therefore taken to be the closest device and is the one awakened to respond to the awakening voice, which avoids mistakenly awakening every device associated with the awakening word.
However, this method does not separate the components of the signal energy. It relies blindly on the total signal energy received by the device during the awakening-word period, so the accuracy of the wake-up response drops sharply in a noisy environment. For example, if a device is close to a noise source and far from the user, it receives high-energy noise together with the awakening word. Its total received energy may then exceed that of the device nearest to the user, so the farther device is misjudged as the nearest device and responds to the awakening voice.
Disclosure of Invention
Embodiments of the present invention provide a voice wake-up method and apparatus. The technical solution is as follows:
According to a first aspect of the embodiments of the present invention, there is provided a voice wake-up method, including:
determining the awakening voice of an awakening word received by each of a plurality of devices to be awakened;
determining the noise data obtained by each device to be awakened;
selecting a target awakened device from the devices to be awakened according to the awakening voice received by each device to be awakened and the noise data obtained by each device to be awakened;
responding to the awakening voice through the target awakened device.
In an embodiment, selecting a target awakened device from the devices to be awakened according to the awakening voice received by each device to be awakened and the noise data obtained by each device to be awakened includes:
performing framing, windowing, and short-time Fourier transform on the awakening voice received by each device to be awakened to obtain the time-frequency representation Y_k(f, n) of the awakening voice;
performing framing, windowing, and short-time Fourier transform on the noise data obtained by each device to be awakened to obtain the time-frequency representation X_k(f, n) of the noise data;
selecting a target awakened device from the devices to be awakened according to the time-frequency representation Y_k(f, n) of the awakening voice received by each device to be awakened and the time-frequency representation X_k(f, n) of the noise data obtained by each device to be awakened.
In an embodiment, selecting a target awakened device from the devices to be awakened according to the time-frequency representation Y_k(f, n) of the awakening voice received by each device to be awakened and the time-frequency representation X_k(f, n) of the noise data obtained by each device to be awakened includes:
calculating, according to the time-frequency representation Y_k(f, n) of the awakening voice received by each device to be awakened, the first average frame energy of the awakening voice received by each device to be awakened;
calculating, according to the time-frequency representation X_k(f, n) of the noise data obtained by each device to be awakened, the second average frame energy of the noise data obtained by each device to be awakened;
selecting a target awakened device from the devices to be awakened according to the first average frame energy of each device to be awakened and the second average frame energy of each device to be awakened.
In an embodiment, selecting a target awakened device from the devices to be awakened according to the first average frame energy of the devices to be awakened and the second average frame energy of the devices to be awakened includes:
calculating the energy difference of each device to be awakened according to the first average frame energy of each device to be awakened and the second average frame energy of each device to be awakened; the energy difference of each device to be awakened is the difference value of the first average frame energy and the second average frame energy of each device to be awakened;
and selecting the target awakened equipment from the equipment to be awakened according to the energy difference of the equipment to be awakened.
In an embodiment, selecting the target awakened device from the devices to be awakened according to the energy difference of each device to be awakened includes:
according to the energy difference of each device to be awakened, determining the device to be awakened with the largest energy difference from the devices to be awakened;
and determining the device to be awakened with the largest energy difference as the target awakened device.
According to a second aspect of the embodiments of the present invention, there is provided a voice wake-up apparatus, including:
a first determining module, configured to determine the awakening voice of an awakening word received by each of a plurality of devices to be awakened;
a second determining module, configured to determine the noise data obtained by each device to be awakened;
a selection module, configured to select a target awakened device from the devices to be awakened according to the awakening voice received by each device to be awakened and the noise data obtained by each device to be awakened;
a response module, configured to respond to the awakening voice through the target awakened device.
In one embodiment, the selection module comprises:
a first processing submodule, configured to perform framing, windowing, and short-time Fourier transform on the awakening voice received by each device to be awakened to obtain the time-frequency representation Y_k(f, n) of the awakening voice;
a second processing submodule, configured to perform framing, windowing, and short-time Fourier transform on the noise data obtained by each device to be awakened to obtain the time-frequency representation X_k(f, n) of the noise data;
a selection submodule, configured to select a target awakened device from the devices to be awakened according to the time-frequency representation Y_k(f, n) of the awakening voice received by each device to be awakened and the time-frequency representation X_k(f, n) of the noise data obtained by each device to be awakened.
In one embodiment, the selection submodule includes:
a first computing unit, configured to calculate, according to the time-frequency representation Y_k(f, n) of the awakening voice received by each device to be awakened, the first average frame energy of the awakening voice received by each device to be awakened;
a second computing unit, configured to calculate, according to the time-frequency representation X_k(f, n) of the noise data obtained by each device to be awakened, the second average frame energy of the noise data obtained by each device to be awakened;
and the selection unit is used for selecting target awakened equipment from the equipment to be awakened according to the first average frame energy of the equipment to be awakened and the second average frame energy of the equipment to be awakened.
In one embodiment, the selection unit includes:
a calculating subunit, configured to calculate the energy difference of each device to be awakened according to the first average frame energy of each device to be awakened and the second average frame energy of each device to be awakened; the energy difference of each device to be awakened is the difference between the first average frame energy and the second average frame energy of that device;
a selecting subunit, configured to select the target awakened device from the devices to be awakened according to the energy difference of each device to be awakened.
In one embodiment, the selection subunit is specifically configured to:
according to the energy difference of each device to be awakened, determining the device to be awakened with the largest energy difference from the devices to be awakened;
and determining the device to be awakened with the largest energy difference as the target awakened device.
The technical scheme provided by the embodiment of the invention can have the following beneficial effects:
After the awakening voice received by each device to be awakened and the noise data obtained by each device before it received the awakening voice are determined, the device that should respond to the awakening voice (namely, the target awakened device) is automatically selected from the devices to be awakened according to both the received awakening voice and the obtained noise data. Combining the awakening voice with the noise data of each device improves the accuracy with which the awakened device is determined, and therefore the wake-up accuracy, and prevents the other devices to be awakened from being mistakenly awakened as the device that actually needs to be awakened.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a flow chart illustrating a voice wake-up method according to an example embodiment.
Fig. 2 is a flow chart illustrating another voice wake-up method according to an example embodiment.
Fig. 3 is a block diagram illustrating a voice wake-up apparatus in accordance with an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
In order to solve the above technical problem, an embodiment of the present invention provides a voice wake-up method. The method may be used in a voice wake-up program, system, or device, and the corresponding execution subject may be a terminal or a server. As shown in Fig. 1, the method includes steps S101 to S104:
In step S101, the awakening voice of an awakening word received by each of a plurality of devices to be awakened is determined;
In step S102, the noise data obtained by each device to be awakened is determined;
The noise data obtained by each device to be awakened is the noise data captured by that device during a period of time (for example, the preceding 1 second) before the awakening voice of the awakening word is received.
In step S103, a target awakened device is selected from the devices to be awakened according to the awakening voice received by each device to be awakened and the noise data obtained by each device to be awakened;
In step S104, the target awakened device responds to the awakening voice.
After the awakening voice received by each device to be awakened and the noise data obtained by each device before it received the awakening voice are determined, the device that should respond to the awakening voice (namely, the target awakened device) is automatically selected from the devices to be awakened according to both the received awakening voice and the obtained noise data. Combining the awakening voice with the noise data of each device improves the accuracy with which the awakened device is determined, and therefore the wake-up accuracy, and prevents the other devices to be awakened from being mistakenly awakened as the device that actually needs to be awakened.
In addition, when the target awakened device is selected, the noise data from a period of time before each device received the awakening voice of the awakening word is also taken into account, rather than relying only on the total signal energy received by each device during the awakening-word period. Compared with the prior art, this can significantly improve the accuracy with which the device to be awakened is determined, and therefore the wake-up accuracy.
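Because step S102 needs audio from before the awakening word was spoken, each device has to keep recent audio around. The following is a minimal sketch of one way to do that with a bounded buffer. The class name `PreWakeRecorder`, the 16 kHz sample rate, and the 2-second cap on the awakening-word length are illustrative assumptions and do not come from the patent; only the 1-second noise period follows the example in the text.

```python
from collections import deque


class PreWakeRecorder:
    """Hypothetical helper: keeps recent audio so the noise segment that
    immediately precedes a detected awakening word can be recovered."""

    def __init__(self, sample_rate=16000, noise_seconds=1.0, max_wake_seconds=2.0):
        self.noise_len = int(sample_rate * noise_seconds)
        if self.noise_len <= 0:
            raise ValueError("noise period must cover at least one sample")
        # Room for the noise period plus the longest awakening word we expect.
        capacity = self.noise_len + int(sample_rate * max_wake_seconds)
        self._buf = deque(maxlen=capacity)  # oldest samples fall off automatically

    def push(self, samples):
        """Append newly captured samples."""
        self._buf.extend(samples)

    def split(self, wake_len):
        """Return (noise, wake), assuming the last wake_len samples hold the awakening word."""
        data = list(self._buf)
        if wake_len <= 0 or wake_len >= len(data):
            raise ValueError("wake_len must leave room for noise samples")
        wake = data[-wake_len:]
        noise = data[:-wake_len][-self.noise_len:]
        return noise, wake


# Tiny demonstration with 10 samples per "second".
rec = PreWakeRecorder(sample_rate=10, noise_seconds=1.0, max_wake_seconds=1.0)
rec.push(range(30))          # only the newest 20 samples are retained
noise, wake = rec.split(5)   # the last 5 samples are the awakening word
```

A bounded deque keeps memory constant on an always-listening device; how the end of the awakening word is detected is outside this sketch.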
In an embodiment, selecting a target awakened device from the devices to be awakened according to the awakening voice received by each device to be awakened and the noise data obtained by each device to be awakened includes:
performing framing, windowing, and short-time Fourier transform on the awakening voice received by each device to be awakened to obtain the time-frequency representation Y_k(f, n) of the awakening voice; Y_k(f, n) is the time-frequency spectrum of the awakening voice.
performing framing, windowing, and short-time Fourier transform on the noise data obtained by each device to be awakened to obtain the time-frequency representation X_k(f, n) of the noise data; X_k(f, n) is the time-frequency spectrum of the noise data.
selecting a target awakened device from the devices to be awakened according to the time-frequency representation Y_k(f, n) of the awakening voice received by each device to be awakened and the time-frequency representation X_k(f, n) of the noise data obtained by each device to be awakened.
According to the time-frequency representation Y_k(f, n) of the awakening voice received by each device to be awakened and the time-frequency representation X_k(f, n) of the noise data obtained by each device to be awakened, the target awakened device can be automatically and accurately selected from the devices to be awakened, which improves the accuracy with which the awakened device is determined and therefore the wake-up accuracy.
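The framing, windowing, and short-time Fourier transform step can be sketched as follows. This is an illustration only: the patent does not name a window function, frame length, or hop size, so the Hann window, `frame_len=8`, and `hop=4` are assumptions, and the naive DFT is used purely to keep the example free of third-party dependencies.

```python
import cmath
import math


def stft(signal, frame_len=8, hop=4):
    """Return spec[n][f]: complex coefficient of frame n at frequency bin f
    (bins 0 .. frame_len // 2), using a periodic Hann window."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    spec = []
    # Framing: slide a window of frame_len samples forward by hop samples.
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Windowing: taper the frame to limit spectral leakage.
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        # Transform: naive DFT of the windowed frame (one-sided spectrum).
        bins = []
        for f in range(frame_len // 2 + 1):
            acc = 0j
            for i, v in enumerate(frame):
                acc += v * cmath.exp(-2j * math.pi * f * i / frame_len)
            bins.append(acc)
        spec.append(bins)
    return spec


# A cosine that completes exactly 2 cycles per 8-sample frame.
signal = [math.cos(2 * math.pi * 2 * t / 8) for t in range(16)]
spec = stft(signal)
peak_bin = max(range(len(spec[0])), key=lambda f: abs(spec[0][f]))
```

The same function would be applied both to the awakening voice, giving Y_k(f, n), and to the noise data, giving X_k(f, n). A production system would normally use an FFT routine instead of the quadratic-time loop shown here.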
In an embodiment, selecting a target awakened device from the devices to be awakened according to the time-frequency representation Y_k(f, n) of the awakening voice received by each device to be awakened and the time-frequency representation X_k(f, n) of the noise data obtained by each device to be awakened includes:
calculating, according to the time-frequency representation Y_k(f, n) of the awakening voice received by each device to be awakened, the first average frame energy of the awakening voice received by each device to be awakened; f represents the frequency, n represents the frame index of the awakening voice or noise data of each device to be awakened, and k represents the kth device to be awakened.
calculating, according to the time-frequency representation X_k(f, n) of the noise data obtained by each device to be awakened, the second average frame energy of the noise data obtained by each device to be awakened over the same frequency range;
The first average frame energy is the average obtained from the sum of the per-frame energies of the awakening voice, and the second average frame energy is the average obtained from the sum of the per-frame energies of the noise data.
selecting a target awakened device from the devices to be awakened according to the first average frame energy of each device to be awakened and the second average frame energy of each device to be awakened.
According to the time-frequency representation Y_k(f, n) of the awakening voice received by each device to be awakened, the first average frame energy of that awakening voice can be accurately calculated. Likewise, according to the time-frequency representation X_k(f, n) of the noise data obtained by each device to be awakened, the second average frame energy of that noise data can be accurately calculated. The target awakened device can then be accurately selected according to the two average frame energies of each device, which improves the wake-up accuracy and prevents the other devices to be awakened from being mistakenly awakened as the device that actually needs to be awakened. In addition, this embodiment separates the signal energy into the energy of the awakening voice and the energy of the noise data, so compared with the prior art it further improves the accuracy with which the target awakened device is selected, and therefore the wake-up accuracy.
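An average frame energy of this kind can be sketched as below. Taking the energy of a time-frequency coefficient to be its squared magnitude is an assumption made for this illustration, as are the function name and the use of bin indices `f1` and `f2` to bound the frequency range.

```python
def average_frame_energy(spec, f1, f2):
    """Average, over all frames, of the energy summed across bins f1..f2 inclusive.

    spec[n][f] is the complex time-frequency coefficient of frame n at bin f.
    Energy is taken as squared magnitude (an assumption for this sketch).
    """
    if not spec:
        raise ValueError("at least one frame is required")
    total = 0.0
    for frame in spec:
        total += sum(abs(frame[f]) ** 2 for f in range(f1, f2 + 1))
    return total / len(spec)


# Two frames, three bins each; only bins 1 and 2 are inside the band.
example = [
    [1 + 0j, 2j, 3 + 0j],   # band energy: 4 + 9 = 13
    [0j, 1 + 0j, 1 + 0j],   # band energy: 1 + 1 = 2
]
band_energy = average_frame_energy(example, 1, 2)   # (13 + 2) / 2
```

Dividing by the number of frames matters because the awakening voice and the noise segment generally have different lengths, so raw energy sums would not be comparable.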
In an embodiment, selecting a target awakened device from the devices to be awakened according to the first average frame energy of the devices to be awakened and the second average frame energy of the devices to be awakened includes:
calculating the energy difference of each device to be awakened according to the first average frame energy of each device to be awakened and the second average frame energy of each device to be awakened; the energy difference of each device to be awakened is the difference value of the first average frame energy and the second average frame energy of each device to be awakened;
and selecting the target awakened equipment from the equipment to be awakened according to the energy difference of the equipment to be awakened.
According to the first average frame energy and the second average frame energy of each device to be awakened, the energy difference of each device to be awakened can be calculated, and the target awakened device can then be automatically selected from the devices to be awakened according to the energy differences. This improves the accuracy with which the target awakened device is selected, and therefore the wake-up accuracy, and prevents the other devices to be awakened from being mistakenly awakened as the device that actually needs to be awakened.
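The per-device energy difference is a plain subtraction, sketched here with hypothetical device names and made-up energy values. The check that both measurements exist for every device is an added safeguard, not something the patent describes.

```python
def energy_differences(first_energy, second_energy):
    """Return {device: first average frame energy - second average frame energy}.

    first_energy:  average frame energy of the awakening voice, per device
    second_energy: average frame energy of the pre-wake noise data, per device
    """
    mismatched = set(first_energy) ^ set(second_energy)
    if mismatched:
        raise ValueError(f"devices missing a measurement: {sorted(mismatched)}")
    return {dev: first_energy[dev] - second_energy[dev] for dev in first_energy}


# Illustrative numbers only: the fridge sits next to a noise source.
first = {"tv": 24.24, "fridge": 102.0}
second = {"tv": 0.24, "fridge": 96.0}
diffs = energy_differences(first, second)
```

With these numbers the fridge has the larger raw energy, but the television has the larger difference once its own noise level is subtracted.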
In an embodiment, selecting the target awakened device from the devices to be awakened according to the energy difference of each device to be awakened includes:
determining, according to the energy difference of each device to be awakened, the device to be awakened with the largest energy difference among the devices to be awakened;
determining the device to be awakened with the largest energy difference as the target awakened device.
According to the energy difference of each device to be awakened, the energy differences can be sorted in descending order to find the maximum energy difference and thus the device to be awakened that corresponds to it. That device is automatically determined as the target awakened device. This improves the accuracy with which the target awakened device is selected, and therefore the wake-up accuracy, and prevents the other devices to be awakened from being mistakenly awakened as the device that actually needs to be awakened.
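Selecting the device with the largest energy difference might look like the sketch below. A single pass over the values gives the same result as sorting in descending order and taking the first entry. The tie-breaking rule (smallest device identifier wins) is an assumption added so the result is deterministic; the patent does not say how ties are handled.

```python
def select_target(diffs):
    """Return the device whose energy difference is largest.

    Ties are broken by the smallest device identifier (an assumption).
    """
    if not diffs:
        raise ValueError("no candidate devices")
    # Minimising (-difference, name) picks the largest difference first,
    # then the alphabetically smallest name among equal differences.
    return min(diffs, key=lambda dev: (-diffs[dev], dev))


target = select_target({"tv": 24.0, "fridge": 6.0, "aircon": 11.5})
tied = select_target({"b": 1.0, "a": 1.0})
```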
The technical solution of the present invention will be further described in detail with reference to fig. 2:
The prior art, which relies solely on the energy of the awakening word, is not robust in noisy environments. To address this problem, this patent proposes a nearest-device response method that improves the robustness of nearest-device response in noisy environments.
Step 1: Suppose that K different smart devices may be awakened at the same time. Each device inputs the voice data of the awakening-word period into the distributed engine and, at the same time, inputs the noise data from a period before the awakening voice of the awakening word was received. The noise data is denoted x_k(t) and the awakening word data is denoted y_k(t), where t represents the sampling time point and k represents the kth device;
Step 2: Perform framing, windowing, and short-time Fourier transform on the noise data of each device to obtain the time-frequency domain form of the noise data, denoted X_k(f, n), where f represents the frequency and n represents the frame index;
Step 3: Select a frequency range and calculate the average frame energy of the noise data, denoted E_k^x:
$$E_k^x = \frac{1}{N_k^x} \sum_{n=1}^{N_k^x} \sum_{f=f_1}^{f_2} |X_k(f, n)|^2$$
where N_k^x represents the total number of frames of the noise data of the kth device, and f_1 and f_2 represent the lower and upper limits of the frequency range under consideration;
Step 4: Perform framing, windowing, and short-time Fourier transform on the awakening word data received by each device to obtain the time-frequency domain form of the awakening word data, denoted Y_k(f, n);
Step 5: Calculate the average frame energy of the awakening word data over the same frequency range as the noise data, denoted E_k^y:
$$E_k^y = \frac{1}{N_k^y} \sum_{n=1}^{N_k^y} \sum_{f=f_1}^{f_2} |Y_k(f, n)|^2$$
where N_k^y represents the total number of frames of the awakening word data of the kth device;
Step 6: Subtract the average frame energy of the noise data from the average frame energy of the awakening word data to obtain the reliable nearest-device decision energy E_k of each device, namely
$$E_k = E_k^y - E_k^x$$
Step 7: Select the device with the largest reliable nearest-device decision energy as the nearest device to respond to the awakening word, namely
$$K_F = \arg\max_{k \in \{1, \dots, K\}} E_k$$
where K_F is the number of the device that finally responds.
In this technical solution, the noise data received before the awakening word data serves as a penalty term. When a device is close to a noise source, the average frame energy of its awakening word segment is high, but the average frame energy of its noise segment is also high, so the penalty applied to the average frame energy of that device's awakening word segment is large. This improves the robustness of the distributed engine in noisy scenes.
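The effect of the penalty term can be checked end to end on synthetic signals. The sketch below is a toy under stated assumptions, not the patented implementation: speech and noise are stand-in cosine tones at different frequency bins, energy is taken as squared magnitude, and the Hann window, frame length, hop, band limits, device names, and amplitudes are all invented for illustration.

```python
import cmath
import math


def band_frame_energy(signal, frame_len, hop, f1, f2):
    """Average over frames of the energy in bins f1..f2 (Hann window, naive DFT)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        energy = 0.0
        for f in range(f1, f2 + 1):
            coeff = sum(v * cmath.exp(-2j * math.pi * f * i / frame_len)
                        for i, v in enumerate(frame))
            energy += abs(coeff) ** 2
        energies.append(energy)
    if not energies:
        raise ValueError("signal shorter than one frame")
    return sum(energies) / len(energies)


def tone(amplitude, bin_index, length, frame_len):
    """Cosine that falls exactly on one DFT bin (synthetic stand-in signal)."""
    return [amplitude * math.cos(2 * math.pi * bin_index * t / frame_len)
            for t in range(length)]


FRAME, HOP, LENGTH = 16, 8, 32
F1, F2 = 1, 7                      # band that contains both tones
SPEECH_BIN, NOISE_BIN = 2, 6

# device -> (speech amplitude at the device, noise amplitude at the device)
devices = {"near_user": (1.0, 0.1), "near_noise": (0.5, 2.0)}

wake_energy, decision_energy = {}, {}
for name, (speech_amp, noise_amp) in devices.items():
    x = tone(noise_amp, NOISE_BIN, LENGTH, FRAME)               # pre-wake noise
    speech = tone(speech_amp, SPEECH_BIN, LENGTH, FRAME)
    y = [s + n for s, n in zip(speech, x)]                      # wake segment
    e_x = band_frame_energy(x, FRAME, HOP, F1, F2)
    e_y = band_frame_energy(y, FRAME, HOP, F1, F2)
    wake_energy[name] = e_y
    decision_energy[name] = e_y - e_x                           # noise as penalty

naive_choice = max(wake_energy, key=wake_energy.get)
penalized_choice = max(decision_energy, key=decision_energy.get)
print(naive_choice, penalized_choice)
```

In this toy setup the total-energy rule picks the device beside the noise source, while subtracting the pre-wake noise energy picks the device beside the user. The subtraction removes the noise contribution exactly here only because the noise is stationary and does not overlap the speech in frequency; with real audio the cancellation is approximate.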
Finally, it should be noted that those skilled in the art may freely combine the above embodiments according to actual needs.
Corresponding to the voice wake-up method provided in the embodiment of the present invention, an embodiment of the present invention further provides a voice wake-up apparatus, as shown in fig. 3, the apparatus includes:
a first determining module 301, configured to determine a wake-up voice of a wake-up word received by each device to be wakened in a plurality of devices to be wakened;
a second determining module 302, configured to determine noise data obtained by each device to be wakened;
a selecting module 303, configured to select a target device to be awakened from the devices to be awakened according to the awakening voice received by each device to be awakened and the noise data obtained by each device to be awakened;
a response module 304, configured to respond to the wake-up voice through the target awakened device.
In one embodiment, the selection module comprises:
a first processing submodule, configured to perform framing, windowing, and short-time Fourier transform on the awakening voice received by each device to be awakened to obtain the time-frequency representation Y_k(f, n) of the awakening voice;
a second processing submodule, configured to perform framing, windowing, and short-time Fourier transform on the noise data obtained by each device to be awakened to obtain the time-frequency representation X_k(f, n) of the noise data;
a selection submodule, configured to select a target awakened device from the devices to be awakened according to the time-frequency representation Y_k(f, n) of the awakening voice received by each device to be awakened and the time-frequency representation X_k(f, n) of the noise data obtained by each device to be awakened.
In one embodiment, the selection submodule includes:
a first computing unit, configured to calculate, according to the time-frequency representation Y_k(f, n) of the awakening voice received by each device to be awakened, the first average frame energy of the awakening voice received by each device to be awakened;
a second computing unit, configured to calculate, according to the time-frequency representation X_k(f, n) of the noise data obtained by each device to be awakened, the second average frame energy of the noise data obtained by each device to be awakened;
and the selection unit is used for selecting target awakened equipment from the equipment to be awakened according to the first average frame energy of the equipment to be awakened and the second average frame energy of the equipment to be awakened.
In one embodiment, the selection unit includes:
a calculating subunit, configured to calculate the energy difference of each device to be awakened according to the first average frame energy of each device to be awakened and the second average frame energy of each device to be awakened; the energy difference of each device to be awakened is the difference between the first average frame energy and the second average frame energy of that device;
a selecting subunit, configured to select the target awakened device from the devices to be awakened according to the energy difference of each device to be awakened.
In one embodiment, the selection subunit is specifically configured to:
according to the energy difference of each device to be awakened, determining the device to be awakened with the largest energy difference from the devices to be awakened;
and determining the device to be awakened with the largest energy difference as the target awakened device.
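The module structure of the apparatus can be pictured as a small class in which each module is one collaborator. This is only a sketch of the division of responsibilities: the class and field names are hypothetical, and the time-domain mean-square energy used by default is a simplified stand-in for the band-limited average frame energy described in the text.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Samples = Sequence[float]


def mean_square(samples: Samples) -> float:
    """Stand-in energy measure (assumption): mean of squared samples."""
    return sum(s * s for s in samples) / len(samples)


@dataclass
class VoiceWakeupApparatus:
    get_wake_voice: Callable[[str], Samples]    # first determining module (301)
    get_noise_data: Callable[[str], Samples]    # second determining module (302)
    respond: Callable[[str], None]              # response module (304)
    energy: Callable[[Samples], float] = mean_square

    def select(self, device_ids: Sequence[str]) -> str:
        """Selection module (303): largest wake-minus-noise energy wins."""
        diffs = {
            dev: self.energy(self.get_wake_voice(dev))
            - self.energy(self.get_noise_data(dev))
            for dev in device_ids
        }
        return max(diffs, key=diffs.get)

    def handle_wake_event(self, device_ids: Sequence[str]) -> str:
        target = self.select(device_ids)
        self.respond(target)
        return target


# Made-up captures: the fridge is loud mostly because of background noise.
wake = {"tv": [1.0, 1.0], "fridge": [3.0, 3.0]}
noise = {"tv": [0.0, 0.0], "fridge": [3.0, 2.9]}
responded = []
apparatus = VoiceWakeupApparatus(wake.__getitem__, noise.__getitem__, responded.append)
chosen = apparatus.handle_wake_event(["tv", "fridge"])
```

Passing the energy measure in as a callable leaves room to substitute the short-time-Fourier-transform-based computation without changing the module boundaries.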
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (10)

1. A voice wake-up method, comprising:
determining the wake-up voice of a wake-up word received by each of a plurality of devices to be awakened;
determining the noise data obtained by each device to be awakened;
selecting a target awakened device from the devices to be awakened according to the wake-up voice received by each device to be awakened and the noise data obtained by each device to be awakened;
responding to the wake-up voice through the target awakened device.
2. The method according to claim 1, wherein selecting the target awakened device from the devices to be awakened according to the wake-up voice received by each device to be awakened and the noise data obtained by each device to be awakened comprises:
performing framing, windowing, and short-time Fourier transform on the wake-up voice received by each device to be awakened to obtain the time-frequency representation $Y_k(f,n)$ of the wake-up voice;
performing framing, windowing, and short-time Fourier transform on the noise data obtained by each device to be awakened to obtain the time-frequency representation $X_k(f,n)$ of the noise data;
selecting the target awakened device from the devices to be awakened according to the time-frequency representation $Y_k(f,n)$ of the wake-up voice received by each device to be awakened and the time-frequency representation $X_k(f,n)$ of the noise data obtained by each device to be awakened.
3. The method of claim 2,
wherein selecting the target awakened device from the devices to be awakened according to the time-frequency representation $Y_k(f,n)$ of the wake-up voice received by each device to be awakened and the time-frequency representation $X_k(f,n)$ of the noise data obtained by each device to be awakened comprises:
calculating, according to the time-frequency representation $Y_k(f,n)$ of the wake-up voice received by each device to be awakened, a first average frame energy of the wake-up voice received by that device;
calculating, according to the time-frequency representation $X_k(f,n)$ of the noise data obtained by each device to be awakened, a second average frame energy of the noise data obtained by that device;
and selecting the target awakened device from the devices to be awakened according to the first average frame energy and the second average frame energy of each device to be awakened.
4. The method of claim 3,
wherein selecting the target awakened device from the devices to be awakened according to the first average frame energy and the second average frame energy of each device to be awakened comprises:
calculating the energy difference of each device to be awakened according to the first average frame energy and the second average frame energy of that device; the energy difference of each device to be awakened is the difference between its first average frame energy and its second average frame energy;
and selecting the target awakened device from the devices to be awakened according to the energy difference of each device to be awakened.
5. The method of claim 4,
wherein selecting the target awakened device from the devices to be awakened according to the energy difference of each device to be awakened comprises:
according to the energy difference of each device to be awakened, determining the device to be awakened with the largest energy difference from the devices to be awakened;
and determining the device to be awakened with the largest energy difference as the target awakened device.
6. A voice wake-up apparatus, comprising:
a first determining module, configured to determine the wake-up voice of a wake-up word received by each of a plurality of devices to be awakened;
a second determining module, configured to determine the noise data obtained by each device to be awakened;
a selection module, configured to select a target awakened device from the devices to be awakened according to the wake-up voice received by each device to be awakened and the noise data obtained by each device to be awakened;
and a response module, configured to respond to the wake-up voice through the target awakened device.
7. The apparatus of claim 6, wherein the selection module comprises:
a first processing submodule, configured to perform framing, windowing, and short-time Fourier transform on the wake-up voice received by each device to be awakened to obtain the time-frequency representation $Y_k(f,n)$ of the wake-up voice;
a second processing submodule, configured to perform framing, windowing, and short-time Fourier transform on the noise data obtained by each device to be awakened to obtain the time-frequency representation $X_k(f,n)$ of the noise data;
a selection submodule, configured to select the target awakened device from the devices to be awakened according to the time-frequency representation $Y_k(f,n)$ of the wake-up voice received by each device to be awakened and the time-frequency representation $X_k(f,n)$ of the noise data obtained by each device to be awakened.
8. The apparatus of claim 7,
the selection submodule includes:
a first computing unit, configured to calculate, according to the time-frequency representation $Y_k(f,n)$ of the wake-up voice received by each device to be awakened, a first average frame energy of the wake-up voice received by that device;
a second computing unit, configured to calculate, according to the time-frequency representation $X_k(f,n)$ of the noise data obtained by each device to be awakened, a second average frame energy of the noise data obtained by that device;
and a selection unit, configured to select the target awakened device from the devices to be awakened according to the first average frame energy and the second average frame energy of each device to be awakened.
9. The apparatus of claim 8,
the selection unit includes:
a calculating subunit, configured to calculate the energy difference of each device to be awakened according to the first average frame energy and the second average frame energy of that device; the energy difference of each device to be awakened is the difference between its first average frame energy and its second average frame energy;
and a selecting subunit, configured to select the target awakened device from the devices to be awakened according to the energy difference of each device to be awakened.
10. The apparatus of claim 9,
the selection subunit is specifically configured to:
according to the energy difference of each device to be awakened, determining the device to be awakened with the largest energy difference from the devices to be awakened;
and determining the device to be awakened with the largest energy difference as the target awakened device.
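
As a rough illustration of the framing, windowing, short-time Fourier transform, and average frame energy steps recited in the claims, the following NumPy sketch computes an energy difference for a single device. The claims in this section do not give the exact energy formula, so the definition used here (squared magnitude summed over frequency bins, then averaged over frames) is an assumption, as are the Hann window, the frame length, the hop size, the helper names `stft` and `average_frame_energy`, and the synthetic signals.

```python
import numpy as np


def stft(signal, frame_len=256, hop=128):
    """Frame the signal, apply a Hann window, and take the FFT of each frame.

    Returns an array of shape (n_frames, frame_len // 2 + 1):
    one row per frame n, one column per frequency bin f.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal shorter than one frame")
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop:i * hop + frame_len] for i in range(n_frames)]
    )
    return np.fft.rfft(frames * window, axis=1)


def average_frame_energy(spec):
    """Assumed definition: mean over frames of the summed squared magnitude."""
    return float(np.mean(np.sum(np.abs(spec) ** 2, axis=1)))


# Synthetic stand-ins for one device's recordings (illustrative only).
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
noise = 0.01 * rng.standard_normal(fs)       # stand-in for the noise data
wake = np.sin(2 * np.pi * 440 * t) + noise   # stand-in for the wake-up voice

first_energy = average_frame_energy(stft(wake))    # from Y_k(f, n)
second_energy = average_frame_energy(stft(noise))  # from X_k(f, n)
energy_difference = first_energy - second_energy
```

Repeating this per device and picking the largest `energy_difference` would correspond to the selection step of claims 4 and 5.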
CN201911402716.3A 2019-12-30 2019-12-30 Voice wake-up method and device Pending CN111128169A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911402716.3A CN111128169A (en) 2019-12-30 2019-12-30 Voice wake-up method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911402716.3A CN111128169A (en) 2019-12-30 2019-12-30 Voice wake-up method and device

Publications (1)

Publication Number Publication Date
CN111128169A true CN111128169A (en) 2020-05-08

Family

ID=70505802

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911402716.3A Pending CN111128169A (en) 2019-12-30 2019-12-30 Voice wake-up method and device

Country Status (1)

Country Link
CN (1) CN111128169A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022188560A1 (en) * 2021-03-10 2022-09-15 Oppo广东移动通信有限公司 Methods for distance relationship determination, device control and model training, and related apparatuses

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107146614A (en) * 2017-04-10 2017-09-08 北京猎户星空科技有限公司 A kind of audio signal processing method, device and electronic equipment
CN107316651A (en) * 2017-07-04 2017-11-03 北京中瑞智科技有限公司 Audio-frequency processing method and device based on microphone
CN107919119A (en) * 2017-11-16 2018-04-17 百度在线网络技术(北京)有限公司 Method, apparatus, equipment and the computer-readable medium of more equipment interaction collaborations
CN109377987A (en) * 2018-08-31 2019-02-22 百度在线网络技术(北京)有限公司 Exchange method, device, equipment and the storage medium of intelligent sound equipment room
CN109994112A (en) * 2019-03-12 2019-07-09 广东美的制冷设备有限公司 Control method, server, speech recognition apparatus and the medium of speech recognition apparatus
CN110211580A (en) * 2019-05-15 2019-09-06 海尔优家智能科技(北京)有限公司 More smart machine answer methods, device, system and storage medium
CN110223684A (en) * 2019-05-16 2019-09-10 华为技术有限公司 A kind of voice awakening method and equipment

Similar Documents

Publication Publication Date Title
EP3703052B1 (en) Echo cancellation method and apparatus based on time delay estimation
CN111192589A (en) Voice wake-up method and device
CN108899044B (en) Voice signal processing method and device
CN105654949B (en) A kind of voice awakening method and device
CN108352818B (en) Sound signal processing apparatus and method for enhancing sound signal
CN108922553B (en) Direction-of-arrival estimation method and system for sound box equipment
CN109286875B (en) Method, apparatus, electronic device and storage medium for directional sound pickup
US10242677B2 (en) Speaker dependent voiced sound pattern detection thresholds
CN109509465B (en) Voice signal processing method, assembly, equipment and medium
CN110265020B (en) Voice wake-up method and device, electronic equipment and storage medium
CN110491403A (en) Processing method, device, medium and the speech enabled equipment of audio signal
CN111402883B (en) Nearby response system and method in distributed voice interaction system under complex environment
CN109346062B (en) Voice endpoint detection method and device
CN104992713B (en) A kind of quick broadcast audio comparison method
CN103617801A (en) Voice detection method and device and electronic equipment
US9754606B2 (en) Processing apparatus, processing method, program, computer readable information recording medium and processing system
CN112599127A (en) Voice instruction processing method, device, equipment and storage medium
CN106323454B (en) The recognition methods of air conditioner indoor unit abnormal sound and device
CN111128169A (en) Voice wake-up method and device
US9941977B2 (en) Data transmission between devices over audible sound
CN111081251B (en) Voice wake-up method and device
CN106340310A (en) Speech detection method and device
CN111462757B (en) Voice signal-based data processing method, device, terminal and storage medium
CN111128227B (en) Sound detection method and device
CN111192569B (en) Double-microphone voice feature extraction method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200508