CN111105798B - Equipment control method based on voice recognition - Google Patents


Info

Publication number
CN111105798B
Authority
CN
China
Prior art keywords
voice
signal
user
external
noise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811264461.4A
Other languages
Chinese (zh)
Other versions
CN111105798A (en)
Inventor
姚长标
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Fotile Kitchen Ware Co Ltd
Original Assignee
Ningbo Fotile Kitchen Ware Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo Fotile Kitchen Ware Co Ltd filed Critical Ningbo Fotile Kitchen Ware Co Ltd
Priority to CN201811264461.4A
Publication of CN111105798A
Application granted
Publication of CN111105798B
Legal status: Active


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/26 Speech to text systems
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques in which the extracted parameters are spectral information of each sub-band
    • G10L25/48 Speech or voice analysis techniques specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques specially adapted for comparison or discrimination
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Selective Calling Equipment (AREA)

Abstract

The invention relates to a device control method based on voice recognition. An acquired external voice signal is matched against a preset standard voice control instruction database stored in the device; once the acquired signal matches any preset standard voice control instruction, the device executes the action corresponding to that instruction. The device also performs self-learning on the user's voice to obtain a voice feature parameter set that reflects the user's individual characteristics, avoiding the problem of poor matching accuracy between the user's dialectal pronunciation and the preset standard voice control instructions. Once the device again acquires voice with the same voice feature parameter set, it can accurately recognize the user's personalized voice, improving both the accuracy with which the device recognizes the user's voice and the responsiveness of the device to the user's voice control.

Description

Equipment control method based on voice recognition
Technical Field
The invention relates to the field of equipment control, in particular to a voice recognition-based equipment control method.
Background
With the continuous development of device intelligence, intelligent devices with various control functions keep emerging on the market. For example, compared with the traditional key-press control of a device, existing intelligent devices offer touch control and gesture control based on user actions.
However, the operation modes of existing intelligent devices have a drawback: when a user operates a device such as a range hood, steam box, or oven, key-press, touch, and gesture operation all still occupy one or both of the user's hands, so the user cannot free a hand for other tasks, which degrades the user's operating experience.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a device control method based on voice recognition, in view of the prior art described above.
The technical scheme adopted for solving the technical problems is as follows: the equipment control method based on voice recognition is characterized by comprising the following steps 1 to 4:
step 1, a preset standard voice control instruction database for controlling equipment to execute actions is constructed; wherein, the preset standard voice control instruction database stores preset standard voice corresponding to the action executed by the equipment;
step 2, detecting and acquiring an external voice signal outside the equipment, and preprocessing the external voice signal;
step 3, performing matching judgment processing on the preprocessed external voice signal and the preset standard voice control command database:
when the preprocessed external voice signal is matched and consistent with any preset standard voice in the preset standard voice control instruction database, taking the preset standard voice as an external voice control instruction, and turning to step 4; otherwise, turning to step 2;
and 4, commanding the equipment to execute the action corresponding to the external voice control instruction.
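As a concrete illustration of steps 1 to 4, the matching loop below sketches the method in Python; the command database contents, the similarity threshold, the tolerance inside `match_score`, and the action names are all illustrative assumptions, not values taken from the patent.

```python
# Minimal sketch of steps 1-4: match a preprocessed voice signal against a
# database of preset standard voice commands and dispatch the device action.
# All names and numbers here (COMMAND_DB, match_score, thresholds) are
# illustrative placeholders, not the patent's implementation.

MATCH_THRESHOLD = 0.8  # assumed overall similarity threshold

COMMAND_DB = {          # step 1: preset standard voice control instructions
    "start": "turn_on",
    "increase gear": "gear_up",
}

def match_score(features_a, features_b):
    # Placeholder similarity: fraction of feature parameters within a
    # per-parameter tolerance of 0.1 (an assumed "allowable range").
    hits = sum(abs(a - b) <= 0.1 for a, b in zip(features_a, features_b))
    return hits / len(features_a)

def control_loop(signal_source, extract_features, std_features, run_action):
    for signal in signal_source:                      # step 2: acquire
        features = extract_features(signal)           # step 2: preprocess
        for command, action in COMMAND_DB.items():    # step 3: match
            if match_score(features, std_features[command]) >= MATCH_THRESHOLD:
                run_action(action)                    # step 4: execute
                break
```

In use, `signal_source` would be a stream of detected utterances and `std_features` the feature sets precomputed for each stored standard voice.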
Further, in the voice recognition-based device control method, in step 2, the preprocessing process for the external voice signal includes steps 2-1 to 2-4 as follows:
step 2-1, performing endpoint detection on the external voice signal to obtain a user voice signal and an external noise signal in the external voice signal;
step 2-2, eliminating the external noise signal in the external voice signal to obtain a user voice signal after noise elimination processing;
step 2-3, extracting a voice characteristic parameter set in the user voice signal according to preset voice characteristic parameters;
and 2-4, taking the extracted voice characteristic parameter set as a preprocessing result aiming at the external voice signal.
Still further, in step 2-1, the process of acquiring the user voice signal and the external noise signal from the external voice signal is as follows in steps a1 to a3:
step a1, constructing a voice model fused with a user voice signal and an external noise signal; wherein the speech model is as follows:
where x_k is the subband energy of the selected signal; z = 0 indicates that the selected signal is an external noise signal, and z = 1 that it is a user voice signal; r_k is the parameter set comprising the parameters μ_z and σ_z²; μ_z denotes the mean amplitude of signal z, and σ_z² the energy of signal z; p(x_k | z, r_k) denotes the probability that the selected signal is z;
step a2, calculating the probability that the signals in the external voice signals are the user voice signals and the probability of the external noise signals respectively according to the constructed voice model;
step a3, determining the signal type of the external voice signal by using a hypothesis testing method according to the probability result obtained in the step a 2; wherein the signal type is a user speech signal or an external noise signal.
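Given the definitions above (a per-class mean μ_z and energy σ_z² for each subband), one plausible reading of steps a1 to a3 is a Gaussian likelihood per subband combined with a likelihood-ratio hypothesis test. The sketch below assumes that form; the Gaussian shape and the decision threshold are assumptions, not the patent's stated formula.

```python
import math

def gaussian_likelihood(x_k, mu_z, sigma2_z):
    """p(x_k | z, r_k) under an ASSUMED Gaussian model with mean mu_z and
    variance sigma2_z for signal class z (0 = noise, 1 = user speech)."""
    return math.exp(-(x_k - mu_z) ** 2 / (2 * sigma2_z)) / math.sqrt(2 * math.pi * sigma2_z)

def classify_subband(x_k, noise_params, speech_params, threshold=1.0):
    """Steps a2/a3: compute both class likelihoods for a subband energy and
    decide by their ratio (a simple hypothesis test).
    Returns 1 for user speech, 0 for external noise."""
    p_noise = gaussian_likelihood(x_k, *noise_params)
    p_speech = gaussian_likelihood(x_k, *speech_params)
    return 1 if p_speech / p_noise >= threshold else 0
```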
Still further, in step a3, the signal type determining process of the user voice signal and the external noise signal in the external voice signal includes the following steps b1 to b7:
step b1, constructing a noise model by using signal characteristic data of a first preset frame number before starting the step 3;
step b2, calculating a normalized spectrum difference value by using the signal intensity of the second preset frame number before starting the step 3;
step b3, calculating the signal-to-noise ratio in each frame of signal by adopting a probability density function according to the constructed noise model and the obtained normalized spectrum difference value, and distinguishing a user voice signal and an external noise signal;
step b4, according to the signal-to-noise ratio in each frame of signals, using a wiener filter to eliminate external noise signals in the external voice signals in a frequency domain;
step b5, calculating the energy ratio of the external voice signal before and after the noise elimination, and the signal-to-noise likelihood ratio of the external voice signal before and after the noise elimination;
step b6, repairing and adjusting the external voice signal after noise cancellation by utilizing the energy ratio before and after noise cancellation and the signal-to-noise likelihood ratio before and after noise cancellation;
and b7, outputting the repaired and adjusted external voice signal as a user voice signal.
In the device control method based on voice recognition, after step 4 is successfully executed, the method further comprises step 5: the device performs self-learning on the user's voice and subsequently executes the corresponding action according to the self-learning result.
Further, the process of the device performing self-learning for the user's voice and performing corresponding actions again according to the self-learning result includes the following steps c1 to c5:
step c1, taking the external voice signal matched and consistent with any preset standard voice data as a voice command to be learned of the equipment;
step c2, obtaining repeated user voice control instructions which are sent by the user again and have the same content as the voice instructions to be learned;
step c3, respectively extracting the voice characteristic parameter set to be learned of the voice instruction to be learned and the user voice characteristic parameter sets of the voice control instructions of each time according to the same voice characteristic parameter extraction method;
step c4, performing matching judgment according to the extracted voice characteristic parameter set to be learned and the voice characteristic parameter sets of each user:
when the matching times of the user voice characteristic parameter set and the voice characteristic parameter set to be learned reach preset times, the voice characteristic parameter set to be learned is used as a user control voice characteristic parameter set for representing the user control equipment; turning to step c5; otherwise, feeding back prompt information of failure of learning the voice command to the user;
and c5, when the user control voice matched with the user control voice characteristic parameter set is obtained again, executing the action corresponding to the voice instruction to be learned by the equipment.
Optionally, in the device control method based on voice recognition, the preset standard voice in the preset standard voice control command database is a voice control command of the device system or a voice command recorded by a user.
Further, in the voice recognition-based device control method, the device is a home appliance device.
Compared with the prior art, the invention has the advantages that:
according to the equipment control method, the collected external voice signals are matched with the preset standard voice control instruction database stored in the equipment, once the collected external voice signals are matched with any preset standard voice control instruction, the equipment executes the action corresponding to the any preset standard voice control instruction, so that a user can control the equipment through voice, manual equipment operation is avoided, the hands of the user are effectively liberated, and the control experience effect of the user on the equipment is improved;
moreover, the invention also enables the equipment to execute self-learning aiming at the user voice so as to obtain the voice characteristic parameter set which accords with the personalized characteristics of the user, thereby avoiding the problem that the dialect of the equipment is difficult to match with the preset standard voice control instruction when the user pronounces, so that the equipment can accurately recognize the personalized voice of the user once the equipment acquires the voice which also has the voice characteristic parameter set again, and improving the recognition accuracy of the equipment to the user voice and the interactive response efficiency of the equipment to the user voice control.
Drawings
Fig. 1 is a schematic flow chart of a device control method based on voice recognition in an embodiment of the invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings and embodiments.
In this embodiment, a kitchen electric appliance is taken as an example to describe the device control method of the present invention. Referring to fig. 1, the device control method based on voice recognition in this embodiment includes the following steps 1 to 4:
step 1, a preset standard voice control instruction database for controlling kitchen electric equipment to execute actions is constructed; wherein, the preset standard voice corresponding to the action executed by the kitchen electric equipment is stored in the preset standard voice control instruction database;
for example, for the kitchen electric equipment, the preset standard voice control command database stores standard voices recorded in standard Mandarin that correspond to the various functions of the equipment, such as a start voice control command S1, an end voice control command S2, an "increase gear" voice control command S3, a "decrease gear" voice control command S4, and so on. In other words, as long as the degree of matching between the voice uttered by the user and a stored standard voice reaches a set level, the user is considered to have uttered that standard voice;
of course, the preset standard voice in the preset standard voice control command database can be a voice control command of the kitchen electric equipment when leaving the factory, or can be a voice control command recorded by a user after the user purchases the kitchen electric equipment;
step 2, detecting and acquiring an external voice signal outside the kitchen electric equipment, and preprocessing the external voice signal;
specifically, in step 2 of the present embodiment, the preprocessing process for the external voice signal here includes the following steps 2-1 to 2-4:
step 2-1, performing endpoint detection on an external voice signal to obtain a user voice signal and an external noise signal in the external voice signal;
assuming that an external voice signal collected by the device is marked as X, after endpoint detection is executed, a user voice signal in the external voice signal X is Sound, and an external Noise signal in the external voice signal X is Noise; the endpoint detection in this embodiment belongs to the prior art, and is not described here again;
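Since endpoint detection is treated here as prior art, a common short-time-energy variant can stand in as an illustration; the frame length and energy threshold below are assumed values, not taken from the patent.

```python
def detect_endpoints(samples, frame_len=160, energy_threshold=0.01):
    """Simple energy-based endpoint detection: mark each frame as speech
    (True) or noise/silence (False) by its mean squared amplitude.
    frame_len and energy_threshold are illustrative assumptions."""
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        flags.append(energy >= energy_threshold)
    return flags
```

Contiguous True frames would correspond to the user voice signal Sound, and the remaining frames to the external Noise signal.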
it should be noted that, in the external voice signal described in this embodiment, the process of acquiring the user voice signal and the external noise signal is as follows:
step a1, constructing a voice model fused with a user voice signal and an external noise signal; wherein, the speech model is as follows:
where x_k is the subband energy of the selected signal; z = 0 indicates that the selected signal is an external noise signal, and z = 1 that it is a user voice signal; r_k is the parameter set comprising the parameters μ_z and σ_z²; μ_z denotes the mean amplitude of signal z, and σ_z² the energy of signal z; p(x_k | z, r_k) denotes the probability that the selected signal is z;
step a2, calculating the probability that the signals in the external voice signals are the user voice signals and the probability of the external noise signals respectively according to the constructed voice model;
step a3, determining the signal type of the external voice signal by using a hypothesis testing method as the prior art according to the probability result obtained in the step a 2; the signal type is a user voice signal or an external noise signal;
specifically, in step a3, the signal type determination process of the user voice signal and the external noise signal in the external voice signal includes the following steps b1 to b7:
step b1, constructing a noise model by using signal characteristic data of a first preset frame number before starting the step 3;
step b2, calculating a normalized spectrum difference value by using the signal intensity of the second preset frame number before starting the step 3;
step b3, calculating the signal-to-noise ratio in each frame of signal by adopting a probability density function according to the constructed noise model and the obtained normalized spectrum difference value, and distinguishing a user voice signal and an external noise signal;
step b4, according to the signal-to-noise ratio in each frame of signals, using a wiener filter to eliminate external noise signals in external voice signals in a frequency domain;
step b5, calculating the energy ratio of the external voice signal before and after the noise elimination and the signal-to-noise likelihood ratio of the external voice signal before and after the noise elimination;
step b6, repairing and adjusting the external voice signal after noise elimination by utilizing the energy ratio before and after noise elimination and the signal-to-noise likelihood ratio before and after noise elimination;
and b7, outputting the repaired and adjusted external voice signal as a user voice signal. That is, after the repairing and adjusting process of the step b6, only the user voice signal remains in the external voice signal after the noise is eliminated, so that the purpose of determining the user voice signal and the external noise signal in the external voice signal is achieved;
in this embodiment, steps b1 to b7 eliminate the noise in the external voice signal collected by the kitchen electric equipment, leaving only the voice commands uttered by the user. This avoids adverse effects of noise on the recognition of the user's voice commands, improves the recognition rate of voice control commands for the kitchen electric equipment, and makes the equipment's voice response to the user more timely;
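Steps b3 and b4 drive a frequency-domain Wiener filter from an estimated signal-to-noise ratio. A minimal sketch using the classic suppression gain H = SNR / (1 + SNR) applied per frequency bin is shown below; this standard gain form and the power-spectrum representation are assumptions, not necessarily the exact filter of the patent.

```python
def wiener_gain(snr):
    """Classic Wiener suppression gain H = SNR / (1 + SNR) for one bin."""
    return snr / (1.0 + snr)

def suppress_noise(spectrum, noise_power):
    """Apply per-bin Wiener gains to a magnitude-squared spectrum, as in
    step b4. 'spectrum' and 'noise_power' are lists of per-bin powers for
    one frame; the SNR estimate (spectral subtraction) is an assumption."""
    cleaned = []
    for sig, noise in zip(spectrum, noise_power):
        snr = max(sig - noise, 0.0) / noise if noise > 0 else float("inf")
        gain = wiener_gain(snr) if snr != float("inf") else 1.0
        cleaned.append(sig * gain)
    return cleaned
```

The energy ratio and likelihood ratio of steps b5 to b6 would then compare the per-frame power sums before and after this suppression.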
step 2-2, eliminating the external Noise signal Noise from the external voice signal X to obtain the noise-cancelled user voice signal Sound; that is, after step 2-2 only the user voice signal Sound remains in the external voice signal X. The external Noise signal Noise may be eliminated with a conventional wavelet noise filtering method, or in the manner of steps b1 to b4 above;
step 2-3, extracting a voice characteristic parameter set in a user voice signal according to preset voice characteristic parameters;
for example, the preset voice feature parameters may be feature parameters derived from the amplitude, frequency, or spectrum of the voice signal, and the voice feature parameter set includes the feature parameters required to recognize the voice; the number and type of feature parameters in the set can be chosen according to actual requirements;
step 2-4, taking the extracted voice characteristic parameter set as a preprocessing result aiming at the external voice signal;
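To make step 2-3 concrete, the sketch below computes a small illustrative feature parameter set (mean absolute amplitude, energy, and zero-crossing rate); these three parameters are stand-ins chosen for illustration, since the method leaves the number and type of parameters configurable.

```python
def extract_feature_set(samples):
    """Illustrative voice feature parameter set: mean absolute amplitude,
    energy, and zero-crossing rate over one utterance. These particular
    parameters are assumptions, not the patent's fixed choice."""
    n = len(samples)
    mean_amp = sum(abs(s) for s in samples) / n
    energy = sum(s * s for s in samples) / n
    zero_crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return (mean_amp, energy, zero_crossings / (n - 1))
```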
step 3, carrying out matching judgment processing on the preprocessed external voice signal and a preset standard voice control command database:
when the preprocessed external voice signal matches any preset standard voice in the preset standard voice control command database, the content of the preprocessed external voice signal is taken to be the content corresponding to that preset standard voice; the preset standard voice is then used as the external voice control command, and the method proceeds to step 4; otherwise, it returns to step 2;
and step 4, commanding the equipment to execute the action corresponding to the external voice control command. For example, once the preprocessed external voice signal (specifically, the noise-cancelled user voice signal Sound) is judged to match the preset standard voice "increase gear" voice control command S3, the user is deemed to have issued an "increase gear" command to the kitchen electric equipment, so the equipment raises its gear from the current setting, meeting the user's control requirement.
Of course, to accommodate the individual characteristics of the user's voice and avoid the problem that the user's dialectal pronunciation is difficult to match accurately against the preset standard voice control instructions, the device control method of this embodiment further comprises, after step 4 is successfully executed: step 5, the kitchen electric equipment performs self-learning on the user's voice and subsequently executes the corresponding action according to the self-learning result. This self-learning process specifically comprises the following steps c1 to c5:
step c1, taking an external voice signal matched with any preset standard voice data as a voice command to be learned of the kitchen electric equipment;
since this embodiment assumes that the preprocessed external voice signal (specifically, the noise-cancelled user voice signal Sound) matches the preset standard voice "increase gear" voice control command S3, step c1 takes the external voice signal matching the "increase gear" voice control command S3 as the voice command to be learned by the kitchen electric equipment;
step c2, obtaining repeated user voice control instructions which are sent by the user again and have the same content as the voice instructions to be learned;
step c3, respectively extracting a to-be-learned voice characteristic parameter set of the to-be-learned voice instruction and a user voice characteristic parameter set of each user voice control instruction according to the same voice characteristic parameter extraction method;
step c4, performing matching judgment according to the extracted voice characteristic parameter set to be learned and the voice characteristic parameter sets of each user:
when the matching times of the user voice characteristic parameter set and the voice characteristic parameter set to be learned reach the preset times, the voice characteristic parameter set to be learned is used as a user control voice characteristic parameter set for representing the user control equipment; turning to step c5; otherwise, feeding back prompt information of failure of learning the voice command to the user;
for example, the kitchen electric equipment may be required to acquire three user voice control commands with the same content as the voice command S3 to be learned; the first extracted user voice command is marked K1, the second K2, and the third K3. It is assumed that the voice feature parameter set used for the voice command to be learned and for the three user voice control commands comprises voice feature parameter 1, voice feature parameter 2, and voice feature parameter 3;
the matching judgment process for step c4 is additionally described as follows:
when a preset voice characteristic parameter set (comprising a voice characteristic parameter 1, a voice characteristic parameter 2 and a voice characteristic parameter 3) is utilized to match a voice instruction S3 to be learned and a user voice instruction K1 extracted for the first time, if three voice characteristic parameters corresponding to two voice instructions are all in an allowable matching range, the user voice instruction K1 is considered to be matched with the voice instruction S3 to be learned;
similarly, matching judgments are performed for user voice command K2 against the voice command S3 to be learned, and for user voice command K3 against S3. Once the number of successful matches among the three matching judgments reaches a preset number (for example, two), the voice feature parameter set used for matching (voice feature parameters 1, 2, and 3) is adopted as the user control voice feature parameter set characterizing the user's control of the equipment; that is, subsequent voice control of the kitchen electric equipment uses this user control voice feature parameter set as the recognition matching standard;
and step c5, when user control voice matching the user control voice feature parameter set is acquired again, the equipment executes the action corresponding to the voice command to be learned. In this way the kitchen electric equipment self-learns the user's voice and obtains a voice feature parameter set reflecting the user's individual characteristics, avoiding the problem that the user's dialectal pronunciation is difficult to match accurately against the preset standard voice control instructions. Once the equipment again acquires voice with the same voice feature parameter set, it can accurately recognize the user's personalized voice, improving both the recognition accuracy and the responsiveness of the equipment to the user's voice control.
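The self-learning comparison of steps c1 to c5 can be sketched as a tolerance-based match count; the per-parameter tolerance and the required number of matching repetitions below are assumed values (the text only calls them the "allowable matching range" and the "preset times").

```python
TOLERANCE = 0.1       # assumed per-parameter allowable matching range
REQUIRED_MATCHES = 2  # assumed preset number of consistent repetitions

def params_match(set_a, set_b, tol=TOLERANCE):
    """True when every corresponding feature parameter (e.g. parameters
    1, 2, 3 in the example) lies within the allowable matching range."""
    return all(abs(a - b) <= tol for a, b in zip(set_a, set_b))

def self_learn(to_learn, repeats, required=REQUIRED_MATCHES):
    """Steps c2-c4: count how many repeated user utterances (K1, K2, K3
    in the example) match the to-be-learned parameter set; adopt it as
    the user control voice feature parameter set only if the count
    suffices. Returns the learned set, or None on failure (in which case
    a learning-failure prompt would be fed back to the user)."""
    matches = sum(params_match(to_learn, rep) for rep in repeats)
    return to_learn if matches >= required else None
```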
It should be noted that the device control method based on voice recognition in the present embodiment may also be applied to home electric appliances such as an air conditioner and a television or other devices in a factory.

Claims (6)

1. The equipment control method based on voice recognition is characterized by comprising the following steps 1 to 4:
step 1, a preset standard voice control instruction database for controlling equipment to execute actions is constructed; wherein, the preset standard voice control instruction database stores preset standard voice corresponding to the action executed by the equipment;
step 2, detecting and acquiring an external voice signal outside the equipment, and preprocessing the external voice signal;
step 3, performing matching judgment processing on the preprocessed external voice signal and the preset standard voice control command database:
when the preprocessed external voice signal is matched and consistent with any preset standard voice in the preset standard voice control instruction database, taking the preset standard voice as an external voice control instruction, and turning to step 4; otherwise, turning to step 2;
step 4, commanding the equipment to execute the action corresponding to the external voice control instruction;
in step 2, the preprocessing process for the external voice signal includes step 2-1: performing endpoint detection on the external voice signal to acquire a user voice signal and an external noise signal in the external voice signal; in this step 2-1, the user voice signal and the external noise signal acquisition process in the external voice signal are as follows steps a1 to a3:
step a1, constructing a voice model that fuses the user voice signal and the external noise signal; wherein the voice model is as follows:

p(x_k | z, r_k) = (1 / √(2π σ_z²)) · exp(−(x_k − μ_z)² / (2 σ_z²))

wherein x_k is the sub-band energy of the selected signal; z = 0 denotes that the selected signal is an external noise signal; z = 1 denotes that the selected signal is a user voice signal; r_k is a parameter set comprising the parameter μ_z and the parameter σ_z²; μ_z denotes the mean amplitude of signal z; σ_z² denotes the energy of signal z; and p(x_k | z, r_k) denotes the probability that the selected signal is of type z;
step a2, calculating, according to the constructed voice model, the probability that each signal in the external voice signal is a user voice signal and the probability that it is an external noise signal;
step a3, determining the signal type of the external voice signal by a hypothesis testing method according to the probability results obtained in step a2; wherein the signal type is user voice signal or external noise signal; and in step a3, the process of determining the signal types of the user voice signal and the external noise signal in the external voice signal comprises the following steps b1 to b7:
step b1, constructing a noise model using the signal characteristic data of a first preset number of frames before step 3 is started;
step b2, calculating a normalized spectral difference value using the signal intensity of a second preset number of frames before step 3 is started;
step b3, calculating the signal-to-noise ratio in each frame of the signal and distinguishing the user voice signal from the external noise signal by a probability density function, according to the constructed noise model and the obtained normalized spectral difference value;
step b4, eliminating the external noise signal from the external voice signal in the frequency domain with a Wiener filter, according to the signal-to-noise ratio in each frame of the signal;
step b5, calculating the energy ratio of the external voice signal before and after noise elimination, and the signal-to-noise likelihood ratio of the external voice signal before and after noise elimination;
step b6, repairing and adjusting the noise-eliminated external voice signal using the energy ratio and the signal-to-noise likelihood ratio before and after noise elimination;
and step b7, outputting the repaired and adjusted external voice signal as the user voice signal.
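The statistical classification in steps a1 to a3 can be sketched as follows. This is a non-authoritative illustration: a Gaussian form for p(x_k | z, r_k) and the example parameter values are assumptions (the claim only fixes the parameter set r_k = {μ_z, σ_z²}), and the hypothesis test is rendered as a simple likelihood-ratio threshold.

```python
import math

# Sketch of steps a1-a3: classify a sub-band energy x_k as external noise
# (z = 0) or user speech (z = 1) using per-class Gaussian likelihoods and
# a likelihood-ratio hypothesis test. Parameter values are illustrative.

def likelihood(x_k, mu_z, var_z):
    """Step a1 model: p(x_k | z, r_k) as a Gaussian with mean mu_z, variance var_z."""
    return math.exp(-(x_k - mu_z) ** 2 / (2 * var_z)) / math.sqrt(2 * math.pi * var_z)

def classify_frame(x_k, noise_params, speech_params, threshold=1.0):
    """Step a3: hypothesis test on the ratio p(x_k | z=1) / p(x_k | z=0)."""
    p_noise = likelihood(x_k, *noise_params)    # step a2, z = 0
    p_speech = likelihood(x_k, *speech_params)  # step a2, z = 1
    return "speech" if p_speech / p_noise > threshold else "noise"

# Illustrative parameters: a quiet noise floor versus louder speech energy.
noise = (0.1, 0.05)    # (mu_0, sigma_0^2)
speech = (1.0, 0.30)   # (mu_1, sigma_1^2)
print(classify_frame(1.1, noise, speech))   # energy near the speech mean
print(classify_frame(0.05, noise, speech))  # energy near the noise floor
```

In practice the noise parameters would come from the noise model of step b1 and the decision would feed the per-frame SNR estimate used by the Wiener filter in step b4.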
2. The voice recognition-based device control method according to claim 1, wherein in step 2 the preprocessing of the external voice signal further comprises the following steps 2-2 to 2-4:
step 2-2, eliminating the external noise signal from the external voice signal to obtain a noise-eliminated user voice signal;
step 2-3, extracting a voice characteristic parameter set from the user voice signal according to preset voice characteristic parameters;
and step 2-4, taking the extracted voice characteristic parameter set as the preprocessing result for the external voice signal.
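Steps 2-3 and 2-4 can be sketched as follows. The claim does not name the voice characteristic parameters; short-time energy and zero-crossing rate are assumed here purely for illustration (MFCCs would be a common alternative in practice).

```python
# Sketch of steps 2-3/2-4: extract a voice characteristic parameter set
# from a (denoised) user voice signal. The chosen features (short-time
# energy, zero-crossing rate) are assumptions for illustration only.

def frame_features(frame):
    energy = sum(s * s for s in frame) / len(frame)    # short-time energy
    zcr = sum(1 for a, b in zip(frame, frame[1:])      # zero-crossing rate
              if (a >= 0) != (b >= 0)) / (len(frame) - 1)
    return (energy, zcr)

def extract_parameter_set(signal, frame_len=4):
    """Step 2-4: the per-frame feature tuples form the preprocessing result."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    return [frame_features(f) for f in frames]

params = extract_parameter_set([0.1, -0.2, 0.3, -0.1, 0.0, 0.0, 0.0, 0.0])
print(params)
```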
3. The voice recognition-based device control method according to claim 1, further comprising, after step 4 is successfully executed: step 5, the device performs self-learning on the user's voice and executes the corresponding action again according to the self-learning result.
4. The voice recognition-based device control method according to claim 3, wherein the process by which the device performs self-learning on the user's voice and executes the corresponding action again according to the self-learning result comprises the following steps c1 to c5:
step c1, taking the external voice signal that matches any preset standard voice data as the voice instruction to be learned by the device;
step c2, acquiring repeated user voice control instructions, sent again by the user, with the same content as the voice instruction to be learned;
step c3, extracting, by the same voice characteristic parameter extraction method, the to-be-learned voice characteristic parameter set of the voice instruction to be learned and the user voice characteristic parameter set of each repeated voice control instruction;
step c4, performing matching judgment between the extracted to-be-learned voice characteristic parameter set and each user voice characteristic parameter set:
when the number of matches between the user voice characteristic parameter sets and the to-be-learned voice characteristic parameter set reaches a preset number, taking the to-be-learned voice characteristic parameter set as the user control voice characteristic parameter set representing the user's control of the device, and turning to step c5; otherwise, feeding back to the user a prompt that learning the voice instruction has failed;
and step c5, when user control voice matching the user control voice characteristic parameter set is acquired again, the device executes the action corresponding to the voice instruction to be learned.
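The match-counting logic of steps c3 to c5 can be sketched as follows. The feature sets are stand-ins (plain strings compared by equality), and the threshold value is an assumption; the claim only requires that matches reach "a preset number".

```python
# Sketch of steps c1-c5: the device accepts a personalized (e.g. dialect)
# pronunciation as a learned command once it has matched the instruction
# to be learned a preset number of times. Feature sets and the threshold
# are illustrative assumptions.

PRESET_TIMES = 3  # assumed value for the "preset number" of matches

def self_learn(to_learn_features, repeated_user_features):
    """Steps c3/c4: count matches between the feature set of the voice
    instruction to be learned and each repeated user utterance."""
    matches = sum(1 for f in repeated_user_features if f == to_learn_features)
    if matches >= PRESET_TIMES:
        return to_learn_features  # step c4: becomes the user control feature set
    return None                   # otherwise: feed back that learning failed

learned = self_learn("kai you-yan-ji", ["kai you-yan-ji"] * 3)
print("learned" if learned else "failed")
```

A real implementation would compare feature vectors under a distance threshold rather than by exact equality.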
5. The voice recognition-based device control method according to any one of claims 1 to 4, wherein the preset standard voice in the preset standard voice control instruction database is a voice control instruction of the device system or a voice instruction entered by a user.
6. The voice recognition-based device control method according to any one of claims 1 to 4, wherein the device is a household appliance.
CN201811264461.4A 2018-10-29 2018-10-29 Equipment control method based on voice recognition Active CN111105798B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811264461.4A CN111105798B (en) 2018-10-29 2018-10-29 Equipment control method based on voice recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811264461.4A CN111105798B (en) 2018-10-29 2018-10-29 Equipment control method based on voice recognition

Publications (2)

Publication Number Publication Date
CN111105798A CN111105798A (en) 2020-05-05
CN111105798B true CN111105798B (en) 2023-08-18

Family

ID=70420301

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811264461.4A Active CN111105798B (en) 2018-10-29 2018-10-29 Equipment control method based on voice recognition

Country Status (1)

Country Link
CN (1) CN111105798B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7536667B2 (en) * 2021-01-21 2024-08-20 Tvs Regza株式会社 Voice command processing circuit, receiving device, remote control and system
CN113413613B (en) * 2021-06-17 2024-06-25 网易(杭州)网络有限公司 Method and device for optimizing voice chat in game, electronic equipment and medium
CN113763935A (en) * 2021-08-20 2021-12-07 重庆长安汽车股份有限公司 Method and system for controlling electric appliance of vehicle body through voice outside vehicle, vehicle and storage medium

Citations (11)

Publication number Priority date Publication date Assignee Title
CN1236928A (en) * 1998-05-25 1999-12-01 郭巧 Computer aided Chinese intelligent education system and its implementation method
KR20030095474A (en) * 2002-06-10 2003-12-24 휴먼씽크(주) Method and apparatus for analysing a pitch, method and system for discriminating a corporal punishment, and computer readable medium storing a program thereof
CN103456303A (en) * 2013-08-08 2013-12-18 四川长虹电器股份有限公司 Method for controlling voice and intelligent air-conditionier system
KR20140060187A (en) * 2012-11-09 2014-05-19 현대자동차주식회사 Apparatus for controlling amplifier gain in voice recognition system and method thereof
CN104715752A (en) * 2015-04-09 2015-06-17 刘文军 Voice recognition method, voice recognition device and voice recognition system
CN104952447A (en) * 2015-04-30 2015-09-30 深圳市全球锁安防系统工程有限公司 Intelligent wearing equipment for safety and health service for old people and voice recognition method
CN105202721A (en) * 2015-07-31 2015-12-30 广东美的制冷设备有限公司 Air conditioner and control method thereof
CN105791931A (en) * 2016-02-26 2016-07-20 深圳Tcl数字技术有限公司 Smart television and voice control method of the smart television
CN106057194A (en) * 2016-06-25 2016-10-26 浙江合众新能源汽车有限公司 Voice interaction system
WO2018107874A1 (en) * 2016-12-16 2018-06-21 广州视源电子科技股份有限公司 Method and apparatus for automatically controlling gain of audio data
CN108231063A (en) * 2016-12-13 2018-06-29 中国移动通信有限公司研究院 A kind of recognition methods of phonetic control command and device

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
GB201617016D0 (en) * 2016-09-09 2016-11-23 Continental automotive systems inc Robust noise estimation for speech enhancement in variable noise conditions


Non-Patent Citations (1)

Title
Sohn J et al. A Statistical Model-based Voice Activity Detection. IEEE Signal Processing Letters, 1999, vol. 6, no. 1, pp. 1-3. *

Also Published As

Publication number Publication date
CN111105798A (en) 2020-05-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant