CN111105798B - Equipment control method based on voice recognition - Google Patents
- Publication number
- CN111105798B (application CN201811264461.4A)
- Authority
- CN
- China
- Prior art keywords: voice, signal, user, external, noise
- Prior art date
- Legal status: Active (the status listed is an assumption, not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G10L15/26 — Speech recognition; speech-to-text systems
- G10L15/22 — Speech recognition; procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L17/02 — Speaker identification or verification; preprocessing operations, e.g. segment selection; pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; feature selection or extraction
- G10L21/0208 — Speech enhancement, e.g. noise reduction or echo cancellation; noise filtering
- G10L25/18 — Speech or voice analysis; extracted parameters being spectral information of each sub-band
- G10L25/51 — Speech or voice analysis specially adapted for comparison or discrimination
- Y02P90/02 — Climate change mitigation in the production or processing of goods; total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]
Abstract
The invention relates to a device control method based on voice recognition. A collected external voice signal is matched against a database of preset standard voice control instructions stored in the device; once the external voice signal matches any preset standard voice control instruction, the device executes the action corresponding to that instruction. The device also performs self-learning on the user's voice to obtain a voice characteristic parameter set that reflects the user's individual speech, avoiding poor matching accuracy between the user's dialectal pronunciation and the preset standard voice control instructions. Once the device again captures speech with the same voice characteristic parameter set, it can accurately recognize the user's personalized voice, improving both the device's recognition accuracy for the user's voice and the responsiveness of interactive voice control.
Description
Technical Field
The invention relates to the field of equipment control, in particular to a voice recognition-based equipment control method.
Background
With the continuing trend toward device intelligence, smart devices with a variety of control functions keep appearing on the market. For example, beyond the traditional key-press control mode, existing smart devices offer touch control and gesture control based on user actions.
However, these operation modes still have a drawback: while a user operates equipment such as a range hood, steam box, or oven, key-press, touch, and gesture operation each occupy one or both of the user's hands, so the user cannot easily free their hands for other tasks, which degrades the user's operating experience.
Disclosure of Invention
The technical problem to be solved by the invention is to provide, in view of the prior art described above, a device control method based on voice recognition.
The technical scheme adopted for solving the technical problems is as follows: the equipment control method based on voice recognition is characterized by comprising the following steps 1 to 4:
step 1, a preset standard voice control instruction database for controlling equipment to execute actions is constructed; wherein, the preset standard voice control instruction database stores preset standard voice corresponding to the action executed by the equipment;
step 2, detecting and acquiring an external voice signal outside the equipment, and preprocessing the external voice signal;
step 3, performing matching judgment processing on the preprocessed external voice signal and the preset standard voice control command database:
when the preprocessed external voice signal is matched and consistent with any preset standard voice in the preset standard voice control instruction database, taking the preset standard voice as an external voice control instruction, and turning to step 4; otherwise, turning to step 2;
step 4, commanding the equipment to execute the action corresponding to the external voice control instruction.
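Steps 1 to 4 above can be sketched as a matching routine. The database layout, the feature-vector representation, and the cosine-similarity matcher below are illustrative assumptions, not the patent's implementation:

```python
import math

def similarity(a, b):
    """Cosine similarity between two feature vectors (an illustrative matcher)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def match_instruction(features, instruction_db, threshold=0.8):
    """Step 3: return the action whose preset standard voice best matches the
    preprocessed features, or None when no match reaches the threshold (in which
    case the method loops back to step 2 and keeps listening)."""
    best_action, best_score = None, 0.0
    for action, standard_features in instruction_db.items():
        score = similarity(features, standard_features)
        if score > best_score:
            best_action, best_score = action, score
    # step 4 would execute the returned action
    return best_action if best_score >= threshold else None
```

A device loop would call `match_instruction` on every captured signal and execute the returned action, e.g. `match_instruction([0.9, 0.1], {"start": [1.0, 0.0]})`.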
Further, in the voice recognition-based device control method, in step 2, the preprocessing process for the external voice signal includes steps 2-1 to 2-4 as follows:
step 2-1, performing endpoint detection on the external voice signal to obtain a user voice signal and an external noise signal in the external voice signal;
step 2-2, eliminating the external noise signal in the external voice signal to obtain a user voice signal after noise elimination processing;
step 2-3, extracting a voice characteristic parameter set in the user voice signal according to preset voice characteristic parameters;
and 2-4, taking the extracted voice characteristic parameter set as a preprocessing result aiming at the external voice signal.
Still further, in step 2-1, the process of acquiring the user voice signal and the external noise signal from the external voice signal is as follows in steps a1 to a3:
step a1, constructing a voice model fused with a user voice signal and an external noise signal; wherein the speech model is as follows:
where x_k is the sub-band energy of the selected signal; z = 0 denotes that the selected signal is an external noise signal, and z = 1 that it is a user voice signal; r_k is the parameter set comprising the parameters μ_z and σ_z²; μ_z denotes the mean amplitude of signal z, and σ_z² denotes the energy of signal z; p(x_k | z, r_k) denotes the probability that the selected signal is of type z;
step a2, calculating the probability that the signals in the external voice signals are the user voice signals and the probability of the external noise signals respectively according to the constructed voice model;
step a3, determining the signal type of the external voice signal by using a hypothesis testing method according to the probability result obtained in the step a 2; wherein the signal type is a user speech signal or an external noise signal.
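The model equation of step a1 appears only as an image in the original patent and is not reproduced in this text. Given the stated parameters (a mean μ_z and an energy σ_z² per signal class), a plausible form, assuming Gaussian-distributed sub-band energies, would be:

```latex
p(x_k \mid z, r_k) = \frac{1}{\sqrt{2\pi\sigma_z^2}}
  \exp\!\left(-\frac{(x_k - \mu_z)^2}{2\sigma_z^2}\right),
\qquad r_k = \{\mu_z,\ \sigma_z^2\},\quad z \in \{0, 1\}
```

Here z selects the noise (z = 0) or user-speech (z = 1) class, consistent with the definitions above; the exact density used in the patent may differ.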
Still further, in step a3, the signal type determining process of the user voice signal and the external noise signal in the external voice signal includes the following steps b1 to b7:
step b1, constructing a noise model by using signal characteristic data of a first preset frame number before starting the step 3;
step b2, calculating a normalized spectrum difference value by using the signal intensity of the second preset frame number before starting the step 3;
step b3, calculating the signal-to-noise ratio in each frame of signal by adopting a probability density function according to the constructed noise model and the obtained normalized spectrum difference value, and distinguishing a user voice signal and an external noise signal;
step b4, according to the signal-to-noise ratio in each frame of signals, using a wiener filter to eliminate external noise signals in the external voice signals in a frequency domain;
step b5, calculating the energy ratio of the external voice signal before and after the noise elimination, and the signal-to-noise likelihood ratio of the external voice signal before and after the noise elimination;
step b6, repairing and adjusting the external voice signal after noise cancellation by utilizing the energy ratio before and after noise cancellation and the signal-to-noise likelihood ratio before and after noise cancellation;
and b7, outputting the repaired and adjusted external voice signal as a user voice signal.
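Steps b3 and b4 amount to estimating a per-bin SNR and applying the classic Wiener gain SNR/(1+SNR) in the frequency domain. The sketch below shows that core operation under assumed inputs (a complex FFT frame plus a per-bin noise-power estimate from the noise model of step b1); the repair and adjustment of steps b5 and b6 are omitted:

```python
def wiener_gain(snr):
    """Classic Wiener filter gain for a given SNR estimate (step b4)."""
    return snr / (1.0 + snr)

def denoise_frame(spectrum, noise_power):
    """Suppress noise in one frame: `spectrum` is a list of complex FFT bins,
    `noise_power` the per-bin noise power estimated from the noise model."""
    out = []
    for bin_val, n_pow in zip(spectrum, noise_power):
        if n_pow <= 0.0:
            out.append(bin_val)  # no noise estimated in this bin; pass through
            continue
        sig_pow = abs(bin_val) ** 2
        # SNR estimate by power subtraction, clamped so the gain stays in [0, 1)
        snr = max(sig_pow - n_pow, 0.0) / n_pow
        out.append(bin_val * wiener_gain(snr))
    return out
```

The inverse FFT of the gained spectrum would then yield the denoised time-domain frame.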
In the device control method based on voice recognition, after the successful execution of step 4, the method further comprises: and 5, the device executes self-learning aiming at the voice of the user and executes corresponding actions again according to the self-learning result.
Further, the process of the device performing self-learning for the user's voice and performing corresponding actions again according to the self-learning result includes the following steps c1 to c5:
step c1, taking the external voice signal matched and consistent with any preset standard voice data as a voice command to be learned of the equipment;
step c2, obtaining repeated user voice control instructions which are sent by the user again and have the same content as the voice instructions to be learned;
step c3, respectively extracting the voice characteristic parameter set to be learned of the voice instruction to be learned and the user voice characteristic parameter sets of the voice control instructions of each time according to the same voice characteristic parameter extraction method;
step c4, performing matching judgment according to the extracted voice characteristic parameter set to be learned and the voice characteristic parameter sets of each user:
when the matching times of the user voice characteristic parameter set and the voice characteristic parameter set to be learned reach preset times, the voice characteristic parameter set to be learned is used as a user control voice characteristic parameter set for representing the user control equipment; turning to step c5; otherwise, feeding back prompt information of failure of learning the voice command to the user;
and c5, when the user control voice matched with the user control voice characteristic parameter set is obtained again, executing the action corresponding to the voice instruction to be learned by the equipment.
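Steps c1 to c5 reduce to counting how many repeated user instructions match the to-be-learned feature set. The tolerance metric below is an assumption; the patent says only that parameters must fall within an allowable matching range:

```python
def within_range(a, b, tol=0.1):
    """One characteristic parameter 'matches' when it lies within an allowable
    range of its counterpart; a relative tolerance is assumed here."""
    return abs(a - b) <= tol * max(abs(a), abs(b), 1e-12)

def learn_user_instruction(to_learn, repeated_sets, required_matches=2):
    """Steps c3-c4: compare each repeated user instruction's feature set against
    the to-be-learned set; learning succeeds once enough repeats match."""
    matches = sum(
        1 for feature_set in repeated_sets
        if all(within_range(p, q) for p, q in zip(to_learn, feature_set))
    )
    if matches >= required_matches:
        return to_learn   # becomes the user-control feature set used in step c5
    return None           # learning failed: prompt the user and retry
```

On success the returned set becomes the recognition standard for that instruction; on failure the device would feed back the learning-failure prompt described in step c4.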
Optionally, in the device control method based on voice recognition, the preset standard voice in the preset standard voice control command database is a voice control command of the device system or a voice command recorded by a user.
Further, in the voice recognition-based device control method, the device is a home appliance device.
Compared with the prior art, the invention has the following advantages:
According to the equipment control method, the collected external voice signal is matched against the database of preset standard voice control instructions stored in the device. Once the collected external voice signal matches any preset standard voice control instruction, the device executes the action corresponding to that instruction, so the user can control the device by voice alone. Manual operation is avoided, the user's hands are effectively freed, and the user's control experience is improved;
Moreover, the invention also has the device perform self-learning on the user's voice to obtain a voice characteristic parameter set that reflects the user's individual speech, avoiding the problem that a user's dialectal pronunciation is difficult to match accurately against the preset standard voice control instructions. Once the device again captures speech with the same voice characteristic parameter set, it can accurately recognize the user's personalized voice, improving both the device's recognition accuracy for the user's voice and the responsiveness of interactive voice control.
Drawings
Fig. 1 is a schematic flow chart of a device control method based on voice recognition in an embodiment of the invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and embodiments.
In this embodiment, a kitchen electrical appliance is taken as an example to describe the device control method of the invention. Referring to fig. 1, the device control method based on voice recognition in this embodiment includes the following steps 1 to 4:
step 1, a preset standard voice control instruction database for controlling kitchen electric equipment to execute actions is constructed; wherein, the preset standard voice corresponding to the action executed by the kitchen electric equipment is stored in the preset standard voice control instruction database;
for example, for the kitchen appliance, the preset standard voice control instruction database stores standard voices recorded in standard Mandarin that correspond to the appliance's functions, such as a start voice control instruction S1, an end voice control instruction S2, an increase-gear voice control instruction S3, a decrease-gear voice control instruction S4, and so on. As long as the degree of matching between the speech uttered by the user and a stored standard voice reaches the set threshold, the user is considered to have uttered that standard voice;
of course, the preset standard voice in the preset standard voice control command database can be a voice control command of the kitchen electric equipment when leaving the factory, or can be a voice control command recorded by a user after the user purchases the kitchen electric equipment;
step 2, detecting and acquiring an external voice signal outside the kitchen electric equipment, and preprocessing the external voice signal;
specifically, in step 2 of the present embodiment, the preprocessing process for the external voice signal here includes the following steps 2-1 to 2-4:
step 2-1, performing endpoint detection on an external voice signal to obtain a user voice signal and an external noise signal in the external voice signal;
assuming that an external voice signal collected by the device is marked as X, after endpoint detection is executed, a user voice signal in the external voice signal X is Sound, and an external Noise signal in the external voice signal X is Noise; the endpoint detection in this embodiment belongs to the prior art, and is not described here again;
it should be noted that, in the external voice signal described in this embodiment, the process of acquiring the user voice signal and the external noise signal is as follows:
step a1, constructing a voice model fused with a user voice signal and an external noise signal; wherein, the speech model is as follows:
where x_k is the sub-band energy of the selected signal; z = 0 denotes that the selected signal is an external noise signal, and z = 1 that it is a user voice signal; r_k is the parameter set comprising the parameters μ_z and σ_z²; μ_z denotes the mean amplitude of signal z, and σ_z² denotes the energy of signal z; p(x_k | z, r_k) denotes the probability that the selected signal is of type z;
step a2, calculating the probability that the signals in the external voice signals are the user voice signals and the probability of the external noise signals respectively according to the constructed voice model;
step a3, determining the signal type of the external voice signal by using a hypothesis testing method as the prior art according to the probability result obtained in the step a 2; the signal type is a user voice signal or an external noise signal;
specifically, in step a3, the signal type determination process of the user voice signal and the external noise signal in the external voice signal includes the following steps b1 to b7:
step b1, constructing a noise model by using signal characteristic data of a first preset frame number before starting the step 3;
step b2, calculating a normalized spectrum difference value by using the signal intensity of the second preset frame number before starting the step 3;
step b3, calculating the signal-to-noise ratio in each frame of signal by adopting a probability density function according to the constructed noise model and the obtained normalized spectrum difference value, and distinguishing a user voice signal and an external noise signal;
step b4, according to the signal-to-noise ratio in each frame of signals, using a wiener filter to eliminate external noise signals in external voice signals in a frequency domain;
step b5, calculating the energy ratio of the external voice signal before and after the noise elimination and the signal-to-noise likelihood ratio of the external voice signal before and after the noise elimination;
step b6, repairing and adjusting the external voice signal after noise elimination by utilizing the energy ratio before and after noise elimination and the signal-to-noise likelihood ratio before and after noise elimination;
and b7, outputting the repaired and adjusted external voice signal as a user voice signal. That is, after the repairing and adjusting process of the step b6, only the user voice signal remains in the external voice signal after the noise is eliminated, so that the purpose of determining the user voice signal and the external noise signal in the external voice signal is achieved;
in this embodiment, by adopting steps b1 to b7, noise in the external voice signal collected by the kitchen appliance can be eliminated so that only the voice instruction uttered by the user remains. This avoids the adverse effect of noise on recognition of the user's voice instructions, improves the recognition rate of voice control instructions for the kitchen appliance, and makes the appliance's voice response to the user more timely;
step 2-2, eliminating an external Noise signal Noise in the external voice signal X to obtain a Noise-eliminated user voice signal Sound; that is, after the step 2-2 is performed, only the voice signal Sound of the user remains in the so-called external voice signal X here; as for the elimination of the external Noise signal Noise here, a conventional wavelet Noise filtering method may be adopted, or the external Noise signal Noise here may be eliminated in the manner of the above-described steps b1 to b 4;
step 2-3, extracting a voice characteristic parameter set in a user voice signal according to preset voice characteristic parameters;
for example, the preset voice characteristic parameters may be characteristic parameters obtained based on parameters such as amplitude, frequency or frequency spectrum of the voice signal, and the voice characteristic parameters set includes characteristic parameters required for recognizing voice; the number or the type of the characteristic parameters in the voice characteristic parameter set can be selectively set according to actual requirements;
step 2-4, taking the extracted voice characteristic parameter set as a preprocessing result aiming at the external voice signal;
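Step 2-3 leaves the concrete characteristic parameters open; purely as an illustration, the sketch below derives one amplitude feature (RMS energy), one frequency feature (zero-crossing rate), and one spectrum feature (spectral centroid) from a raw sample list. The naive DFT is O(n²) and suits only short frames:

```python
import math

def voice_feature_set(samples, sample_rate=16000):
    """Extract an illustrative voice characteristic parameter set (step 2-3)."""
    n = len(samples)
    # amplitude feature: root-mean-square energy
    rms = math.sqrt(sum(s * s for s in samples) / n)
    # frequency feature: zero-crossing rate, a crude voicing/pitch proxy
    zcr = sum(1 for i in range(1, n) if samples[i - 1] * samples[i] < 0) / n
    # spectrum feature: spectral centroid of a naive DFT magnitude spectrum
    mags = []
    for k in range(n // 2):
        re = sum(s * math.cos(2 * math.pi * k * t / n) for t, s in enumerate(samples))
        im = -sum(s * math.sin(2 * math.pi * k * t / n) for t, s in enumerate(samples))
        mags.append(math.hypot(re, im))
    total = sum(mags) or 1.0
    centroid = sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total
    return [rms, zcr, centroid]
```

A real device would compute such features per frame and aggregate them into the parameter set compared in step 3.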
step 3, carrying out matching judgment processing on the preprocessed external voice signal and a preset standard voice control command database:
when the preprocessed external voice signal matches any preset standard voice in the preset standard voice control instruction database, the content of the preprocessed external voice signal corresponds to that preset standard voice; that preset standard voice is then taken as the external voice control instruction, and the method proceeds to step 4; otherwise, return to step 2;
and 4, commanding the equipment to execute the action corresponding to the external voice control instruction. For example, once it is judged that the preprocessed external voice signal (specifically, the user voice signal Sound after noise cancellation) matches with the preset standard voice "increase gear" voice control command S3, it is indicated that the user sends a control command of "increase gear" to the kitchen electric equipment at this time, so that the kitchen electric equipment at this time increases the gear on the basis of the current gear of the user, and the control requirement of the user on the kitchen electric equipment is met.
Of course, to accommodate the individual characteristics of the user's voice and avoid poor matching accuracy between the user's dialectal pronunciation and the preset standard voice control instructions, the device control method of this embodiment further comprises, after step 4 has executed successfully: step 5, the kitchen appliance performs self-learning on the user's voice and executes the corresponding action again according to the self-learning result. The process by which the kitchen appliance performs this self-learning and executes the corresponding action again specifically includes the following steps c1 to c5:
step c1, taking an external voice signal matched with any preset standard voice data as a voice command to be learned of the kitchen electric equipment;
since this embodiment assumes that the preprocessed external voice signal (specifically, the noise-removed user voice signal Sound) matches the preset standard "increase gear" voice control instruction S3, in step c1 the external voice signal matching the "increase gear" voice control instruction S3 is taken as the to-be-learned voice instruction of the kitchen appliance;
step c2, obtaining repeated user voice control instructions which are sent by the user again and have the same content as the voice instructions to be learned;
step c3, respectively extracting a to-be-learned voice characteristic parameter set of the to-be-learned voice instruction and a user voice characteristic parameter set of each user voice control instruction according to the same voice characteristic parameter extraction method;
step c4, performing matching judgment according to the extracted voice characteristic parameter set to be learned and the voice characteristic parameter sets of each user:
when the matching times of the user voice characteristic parameter set and the voice characteristic parameter set to be learned reach the preset times, the voice characteristic parameter set to be learned is used as a user control voice characteristic parameter set for representing the user control equipment; turning to step c5; otherwise, feeding back prompt information of failure of learning the voice command to the user;
for example, suppose the kitchen appliance is required to acquire three user voice control instructions with the same content as the to-be-learned voice instruction S3; denote the first extracted user voice instruction K1, the second K2, and the third K3. Assume the voice characteristic parameter set used for the to-be-learned instruction and the three user voice control instructions comprises voice characteristic parameter 1, voice characteristic parameter 2, and voice characteristic parameter 3;
the matching judgment process for step c4 is additionally described as follows:
when the preset voice characteristic parameter set (comprising voice characteristic parameters 1, 2, and 3) is used to match the to-be-learned voice instruction S3 against the first extracted user voice instruction K1, and all three corresponding voice characteristic parameters of the two instructions fall within the allowable matching range, the user voice instruction K1 is considered to match the to-be-learned voice instruction S3;
similarly, the matching judgment is performed for user voice instruction K2 against S3 and for user voice instruction K3 against S3. Once the number of successful matches among the three comparisons reaches the preset number (for example, two), the voice characteristic parameter set used for matching (comprising voice characteristic parameters 1, 2, and 3) is taken as the user-control voice characteristic parameter set characterizing the user's control of the device; that is, subsequent voice control of the kitchen appliance uses this user-control voice characteristic parameter set as the recognition and matching standard;
step c5, when user control voice matching the user-control voice characteristic parameter set is captured again, the device executes the action corresponding to the to-be-learned voice instruction. In this way, the kitchen appliance performs self-learning on the user's voice and obtains a voice characteristic parameter set that reflects the user's individual speech, avoiding the problem that the user's dialectal pronunciation is difficult to match accurately against the preset standard voice control instructions. Once the kitchen appliance again captures speech with the same voice characteristic parameter set, it can accurately recognize the user's personalized voice, improving both the appliance's recognition accuracy for the user's voice and the responsiveness of interactive voice control.
It should be noted that the device control method based on voice recognition in the present embodiment may also be applied to home electric appliances such as an air conditioner and a television or other devices in a factory.
Claims (6)
1. The equipment control method based on voice recognition is characterized by comprising the following steps 1 to 4:
step 1, a preset standard voice control instruction database for controlling equipment to execute actions is constructed; wherein, the preset standard voice control instruction database stores preset standard voice corresponding to the action executed by the equipment;
step 2, detecting and acquiring an external voice signal outside the equipment, and preprocessing the external voice signal;
step 3, performing matching judgment processing on the preprocessed external voice signal and the preset standard voice control command database:
when the preprocessed external voice signal is matched and consistent with any preset standard voice in the preset standard voice control instruction database, taking the preset standard voice as an external voice control instruction, and turning to step 4; otherwise, turning to step 2;
step 4, commanding the equipment to execute the action corresponding to the external voice control instruction;
in step 2, the preprocessing process for the external voice signal includes step 2-1: performing endpoint detection on the external voice signal to acquire a user voice signal and an external noise signal in the external voice signal; in this step 2-1, the user voice signal and the external noise signal acquisition process in the external voice signal are as follows steps a1 to a3:
step a1, constructing a voice model fused with the user voice signal and the external noise signal; wherein the voice model gives the conditional probability p(x_k | z, r_k),
in which x_k is the sub-band energy of the selected signal; z = 0 denotes that the selected signal is an external noise signal, and z = 1 denotes that the selected signal is a user voice signal; r_k is the parameter set comprising the parameters μ_z and σ_z²; μ_z denotes the mean amplitude of signal z, and σ_z² denotes the energy of signal z; and p(x_k | z, r_k) is the probability that the selected signal is of type z;
step a2, calculating, according to the constructed voice model, the probability that a signal in the external voice signal is the user voice signal and the probability that it is the external noise signal;
step a3, determining the signal type of the external voice signal by hypothesis testing according to the probability results obtained in step a2, the signal type being either user voice signal or external noise signal; wherein, in step a3, the determination distinguishing the user voice signal from the external noise signal comprises the following steps b1 to b7:
step b1, constructing a noise model using the signal characteristic data of a first preset number of frames preceding the start of step 3;
step b2, calculating a normalized spectrum difference value using the signal intensity of a second preset number of frames preceding the start of step 3;
step b3, calculating the signal-to-noise ratio of each signal frame and distinguishing the user voice signal from the external noise signal using a probability density function, according to the constructed noise model and the obtained normalized spectrum difference value;
step b4, eliminating, with a Wiener filter in the frequency domain, the external noise signal from the external voice signal according to the per-frame signal-to-noise ratio;
step b5, calculating the energy ratio of the external voice signal before and after noise elimination, and the signal-to-noise likelihood ratio of the external voice signal before and after noise elimination;
step b6, repairing and adjusting the noise-eliminated external voice signal using the energy ratio and the signal-to-noise likelihood ratio before and after noise elimination;
and b7, outputting the repaired and adjusted external voice signal as the user voice signal.
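Steps a1 to a3 and the Wiener-filter step b4 can be sketched as follows. The claim does not print the density used by the voice model, so this sketch assumes a Gaussian form of p(x_k | z, r_k) parameterized by the stated mean μ_z and energy σ_z², decides the frame type by a likelihood comparison (a simple instance of hypothesis testing), and uses the textbook Wiener gain SNR / (1 + SNR); all of these concrete forms are assumptions, not the patent's exact formulas.

```python
import math

def gaussian_likelihood(x, mu, var):
    # Assumed form of the voice model p(x_k | z, r_k): the claim only
    # states that r_k comprises the mean mu_z and the energy sigma_z^2.
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def classify_frame(energy, noise_mu, noise_var, speech_mu, speech_var):
    """Steps a2-a3 (sketch): compare the likelihoods that a frame's
    sub-band energy came from noise (z = 0) or user speech (z = 1);
    the larger likelihood wins the hypothesis test."""
    p_noise = gaussian_likelihood(energy, noise_mu, noise_var)
    p_speech = gaussian_likelihood(energy, speech_mu, speech_var)
    return "user_voice" if p_speech > p_noise else "external_noise"

def wiener_gain(snr):
    # Step b4 (textbook form): per-frame frequency-domain Wiener
    # attenuation G = SNR / (1 + SNR) applied to suppress noise.
    return snr / (1.0 + snr)
```

A frame whose energy sits near the speech mean is labeled `user_voice`; frames near the noise statistics are labeled `external_noise` and attenuated.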
2. The voice recognition-based device control method according to claim 1, wherein in step 2, the preprocessing of the external voice signal further includes the following steps 2-2 to 2-4:
step 2-2, eliminating the external noise signal in the external voice signal to obtain a user voice signal after noise elimination processing;
step 2-3, extracting a voice characteristic parameter set in the user voice signal according to preset voice characteristic parameters;
and 2-4, taking the extracted voice characteristic parameter set as a preprocessing result aiming at the external voice signal.
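The feature extraction of steps 2-2 to 2-4 can be sketched as below. The patent does not name the preset voice characteristic parameters, so this sketch uses per-frame mean-square energy as a stand-in feature; the frame length and the feature itself are illustrative assumptions.

```python
def extract_feature_set(samples, frame_len=4):
    """Steps 2-3 to 2-4 (sketch): after noise elimination, split the
    user voice signal into fixed-length frames and compute one feature
    per frame (here, mean-square energy); the resulting list is the
    voice characteristic parameter set returned as the preprocessing
    result for the external voice signal."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        features.append(sum(s * s for s in frame) / frame_len)
    return features
```

Any real implementation would substitute the patent's preset parameters (e.g. spectral features) for this energy placeholder.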
3. The voice recognition-based device control method according to claim 1, further comprising, after step 4 is executed successfully: step 5, the device performs self-learning on the user's voice and executes the corresponding action again according to the self-learning result.
4. The voice recognition-based device control method according to claim 3, wherein the process of the device performing self-learning for the user's voice and performing corresponding actions again according to the self-learning result comprises the following steps c1 to c5:
step c1, taking the external voice signal matched and consistent with any preset standard voice data as a voice command to be learned of the equipment;
step c2, obtaining repeated user voice control instructions, uttered again by the user, whose content is the same as that of the voice instruction to be learned;
step c3, extracting, with the same voice characteristic parameter extraction method, the to-be-learned voice characteristic parameter set of the voice instruction to be learned and the user voice characteristic parameter set of each repeated voice control instruction;
step c4, performing matching judgment between the extracted to-be-learned voice characteristic parameter set and each user voice characteristic parameter set:
when the number of matches between the user voice characteristic parameter sets and the to-be-learned voice characteristic parameter set reaches a preset number of times, taking the to-be-learned voice characteristic parameter set as the user control voice characteristic parameter set characterizing the user's control of the device, and turning to step c5; otherwise, feeding back to the user a prompt that learning the voice instruction has failed;
and c5, when the user control voice matched with the user control voice characteristic parameter set is obtained again, executing the action corresponding to the voice instruction to be learned by the equipment.
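The c1 to c5 self-learning loop amounts to counting matches over repeated utterances. In this sketch the similarity rule (same length, per-feature tolerance) and the preset repetition count are illustrative assumptions, not the patent's matching judgment.

```python
def learn_user_command(target_features, repeated_features,
                       preset_times=3, tol=0.5):
    """Steps c3-c5 (sketch): count how many of the user's repeated
    utterances match the to-be-learned feature set; once preset_times
    matches are reached, promote it to the user control voice
    characteristic parameter set, otherwise report failure (the
    device would prompt the user that learning failed)."""
    def matches(a, b):
        # Hypothetical similarity rule: same length, features close.
        return len(a) == len(b) and all(abs(x - y) <= tol
                                        for x, y in zip(a, b))

    hits = sum(1 for f in repeated_features if matches(target_features, f))
    if hits >= preset_times:
        return {"learned": True, "user_control_features": target_features}
    return {"learned": False, "user_control_features": None}
```

Once `learned` is True, any later utterance matching the stored feature set triggers the learned action directly (step c5).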
5. The voice recognition-based device control method of any one of claims 1-4, wherein the preset standard voice in the preset standard voice control command database is a voice control command of a device system or a voice command entered by a user.
6. The device control method based on voice recognition according to any one of claims 1 to 4, wherein the device is a household appliance.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811264461.4A CN111105798B (en) | 2018-10-29 | 2018-10-29 | Equipment control method based on voice recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111105798A CN111105798A (en) | 2020-05-05 |
CN111105798B true CN111105798B (en) | 2023-08-18 |
Family
ID=70420301
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811264461.4A Active CN111105798B (en) | 2018-10-29 | 2018-10-29 | Equipment control method based on voice recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111105798B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7536667B2 (en) * | 2021-01-21 | 2024-08-20 | Tvs Regza株式会社 | Voice command processing circuit, receiving device, remote control and system |
CN113413613B (en) * | 2021-06-17 | 2024-06-25 | 网易(杭州)网络有限公司 | Method and device for optimizing voice chat in game, electronic equipment and medium |
CN113763935A (en) * | 2021-08-20 | 2021-12-07 | 重庆长安汽车股份有限公司 | Method and system for controlling electric appliance of vehicle body through voice outside vehicle, vehicle and storage medium |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1236928A (en) * | 1998-05-25 | 1999-12-01 | 郭巧 | Computer aided Chinese intelligent education system and its implementation method |
KR20030095474A (en) * | 2002-06-10 | 2003-12-24 | 휴먼씽크(주) | Method and apparatus for analysing a pitch, method and system for discriminating a corporal punishment, and computer readable medium storing a program thereof |
CN103456303A (en) * | 2013-08-08 | 2013-12-18 | 四川长虹电器股份有限公司 | Method for controlling voice and intelligent air-conditioner system |
KR20140060187A (en) * | 2012-11-09 | 2014-05-19 | 현대자동차주식회사 | Apparatus for controlling amplifier gain in voice recognition system and method thereof |
CN104715752A (en) * | 2015-04-09 | 2015-06-17 | 刘文军 | Voice recognition method, voice recognition device and voice recognition system |
CN104952447A (en) * | 2015-04-30 | 2015-09-30 | 深圳市全球锁安防系统工程有限公司 | Intelligent wearing equipment for safety and health service for old people and voice recognition method |
CN105202721A (en) * | 2015-07-31 | 2015-12-30 | 广东美的制冷设备有限公司 | Air conditioner and control method thereof |
CN105791931A (en) * | 2016-02-26 | 2016-07-20 | 深圳Tcl数字技术有限公司 | Smart television and voice control method of the smart television |
CN106057194A (en) * | 2016-06-25 | 2016-10-26 | 浙江合众新能源汽车有限公司 | Voice interaction system |
WO2018107874A1 (en) * | 2016-12-16 | 2018-06-21 | 广州视源电子科技股份有限公司 | Method and apparatus for automatically controlling gain of audio data |
CN108231063A (en) * | 2016-12-13 | 2018-06-29 | 中国移动通信有限公司研究院 | A kind of recognition methods of phonetic control command and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB201617016D0 (en) * | 2016-09-09 | 2016-11-23 | Continental automotive systems inc | Robust noise estimation for speech enhancement in variable noise conditions |
Non-Patent Citations (1)
Title |
---|
Sohn J et al. A Statistical Model-based Voice Activity Detection. IEEE Signal Processing Letters, 1999, Vol. 6, No. 1, pp. 1-3. *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108597496B (en) | Voice generation method and device based on generation type countermeasure network | |
CN111105798B (en) | Equipment control method based on voice recognition | |
CN109215665A (en) | A kind of method for recognizing sound-groove based on 3D convolutional neural networks | |
CN109448711A (en) | Voice recognition method and device and computer storage medium | |
CN102005070A (en) | Voice identification gate control system | |
CN111081223B (en) | Voice recognition method, device, equipment and storage medium | |
CN106558306A (en) | Method for voice recognition, device and equipment | |
CN111583936A (en) | Intelligent voice elevator control method and device | |
CN109256139A (en) | A kind of method for distinguishing speek person based on Triplet-Loss | |
CN102945673A (en) | Continuous speech recognition method with speech command range changed dynamically | |
CN110211609A (en) | A method of promoting speech recognition accuracy | |
CN109215634A (en) | A kind of method and its system of more word voice control on-off systems | |
CN109087646B (en) | Method for leading-in artificial intelligence ultra-deep learning for voice image recognition | |
CN110931018A (en) | Intelligent voice interaction method and device and computer readable storage medium | |
CN111192573B (en) | Intelligent control method for equipment based on voice recognition | |
CN112017658A (en) | Operation control system based on intelligent human-computer interaction | |
CN108172220A (en) | A kind of novel voice denoising method | |
CN116343797A (en) | Voice awakening method and corresponding device | |
CN113077812B (en) | Voice signal generation model training method, echo cancellation method, device and equipment | |
CN106887226A (en) | Speech recognition algorithm based on artificial intelligence recognition | |
CN117316164A (en) | Voice interaction processing method and device, storage medium and electronic equipment | |
CN108492821B (en) | Method for weakening influence of speaker in voice recognition | |
CN116612754A (en) | Voice instruction recognition method and device applied to vehicle | |
CN115331670B (en) | Off-line voice remote controller for household appliances | |
CN107993666B (en) | Speech recognition method, speech recognition device, computer equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||