CN111105798B - Equipment control method based on voice recognition - Google Patents
- Publication number
- CN111105798B (application CN201811264461.4A)
- Authority
- CN
- China
- Prior art keywords: voice, signal, user, external, noise
- Prior art date
- Legal status: Active (the status listed is an assumption, not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G10L15/26 — Speech recognition; speech-to-text systems
- G10L15/22 — Speech recognition; procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L17/02 — Speaker identification or verification; preprocessing operations, e.g. segment selection; pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; feature selection or extraction
- G10L21/0208 — Speech enhancement, e.g. noise reduction or echo cancellation; noise filtering
- G10L25/18 — Speech or voice analysis; extracted parameters being spectral information of each sub-band
- G10L25/51 — Speech or voice analysis specially adapted for comparison or discrimination
- Y02P90/02 — Climate change mitigation in the production or processing of goods; total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]
Abstract
The invention relates to a device control method based on voice recognition. A collected external voice signal is matched against a database of preset standard voice control instructions stored in the device; once the external voice signal matches any preset standard voice control instruction, the device executes the action corresponding to that instruction. The device also performs self-learning on the user's voice to obtain a voice characteristic parameter set that reflects the user's individual speech, avoiding poor matching accuracy between the user's dialectal pronunciation and the preset standard voice control instructions. Once the device again captures speech with the same voice characteristic parameter set, it can accurately recognize the user's personalized voice, improving both the device's recognition accuracy for the user's voice and the responsiveness of interactive voice control.
Description
Technical Field
The invention relates to the field of equipment control, in particular to a voice recognition-based equipment control method.
Background
With the continuing trend toward device intelligence, smart devices with a variety of control functions keep appearing on the market. For example, beyond the traditional key-press control mode, existing smart devices offer touch control and gesture control based on user actions.
However, these operation modes still have a drawback: while a user operates equipment such as a range hood, steam box, or oven, key-press, touch, and gesture operation each occupy one or both of the user's hands, so the user cannot easily free their hands for other tasks, which degrades the user's operating experience.
Disclosure of Invention
The technical problem to be solved by the invention is to provide, in view of the prior art described above, a device control method based on voice recognition.
The technical scheme adopted for solving the technical problems is as follows: the equipment control method based on voice recognition is characterized by comprising the following steps 1 to 4:
step 1, a preset standard voice control instruction database for controlling equipment to execute actions is constructed; wherein, the preset standard voice control instruction database stores preset standard voice corresponding to the action executed by the equipment;
step 2, detecting and acquiring an external voice signal outside the equipment, and preprocessing the external voice signal;
step 3, performing matching judgment processing on the preprocessed external voice signal and the preset standard voice control command database:
when the preprocessed external voice signal is matched and consistent with any preset standard voice in the preset standard voice control instruction database, taking the preset standard voice as an external voice control instruction, and turning to step 4; otherwise, turning to step 2;
step 4, commanding the equipment to execute the action corresponding to the external voice control instruction.
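Steps 1 to 4 above can be sketched as a matching routine. The database layout, the feature-vector representation, and the cosine-similarity matcher below are illustrative assumptions, not the patent's implementation:

```python
import math

def similarity(a, b):
    """Cosine similarity between two feature vectors (an illustrative matcher)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def match_instruction(features, instruction_db, threshold=0.8):
    """Step 3: return the action whose preset standard voice best matches the
    preprocessed features, or None when no match reaches the threshold (in which
    case the method loops back to step 2 and keeps listening)."""
    best_action, best_score = None, 0.0
    for action, standard_features in instruction_db.items():
        score = similarity(features, standard_features)
        if score > best_score:
            best_action, best_score = action, score
    # step 4 would execute the returned action
    return best_action if best_score >= threshold else None
```

A device loop would call `match_instruction` on every captured signal and execute the returned action, e.g. `match_instruction([0.9, 0.1], {"start": [1.0, 0.0]})`.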
Further, in the voice recognition-based device control method, in step 2, the preprocessing process for the external voice signal includes steps 2-1 to 2-4 as follows:
step 2-1, performing endpoint detection on the external voice signal to obtain a user voice signal and an external noise signal in the external voice signal;
step 2-2, eliminating the external noise signal in the external voice signal to obtain a user voice signal after noise elimination processing;
step 2-3, extracting a voice characteristic parameter set in the user voice signal according to preset voice characteristic parameters;
and 2-4, taking the extracted voice characteristic parameter set as a preprocessing result aiming at the external voice signal.
Still further, in step 2-1, the process of acquiring the user voice signal and the external noise signal from the external voice signal is as follows in steps a1 to a3:
step a1, constructing a voice model fused with a user voice signal and an external noise signal; wherein the speech model is as follows:
where x_k is the sub-band energy of the selected signal; z = 0 denotes that the selected signal is an external noise signal, and z = 1 that it is a user voice signal; r_k is the parameter set comprising the parameters μ_z and σ_z²; μ_z denotes the mean amplitude of signal z, and σ_z² denotes the energy of signal z; p(x_k | z, r_k) denotes the probability that the selected signal is of type z;
step a2, calculating the probability that the signals in the external voice signals are the user voice signals and the probability of the external noise signals respectively according to the constructed voice model;
step a3, determining the signal type of the external voice signal by using a hypothesis testing method according to the probability result obtained in the step a 2; wherein the signal type is a user speech signal or an external noise signal.
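The model equation of step a1 appears only as an image in the original patent and is not reproduced in this text. Given the stated parameters (a mean μ_z and an energy σ_z² per signal class), a plausible form, assuming Gaussian-distributed sub-band energies, would be:

```latex
p(x_k \mid z, r_k) = \frac{1}{\sqrt{2\pi\sigma_z^2}}
  \exp\!\left(-\frac{(x_k - \mu_z)^2}{2\sigma_z^2}\right),
\qquad r_k = \{\mu_z,\ \sigma_z^2\},\quad z \in \{0, 1\}
```

Here z selects the noise (z = 0) or user-speech (z = 1) class, consistent with the definitions above; the exact density used in the patent may differ.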
Still further, in step a3, the signal type determining process of the user voice signal and the external noise signal in the external voice signal includes the following steps b1 to b7:
step b1, constructing a noise model by using signal characteristic data of a first preset frame number before starting the step 3;
step b2, calculating a normalized spectrum difference value by using the signal intensity of the second preset frame number before starting the step 3;
step b3, calculating the signal-to-noise ratio in each frame of signal by adopting a probability density function according to the constructed noise model and the obtained normalized spectrum difference value, and distinguishing a user voice signal and an external noise signal;
step b4, according to the signal-to-noise ratio in each frame of signals, using a wiener filter to eliminate external noise signals in the external voice signals in a frequency domain;
step b5, calculating the energy ratio of the external voice signal before and after the noise elimination, and the signal-to-noise likelihood ratio of the external voice signal before and after the noise elimination;
step b6, repairing and adjusting the external voice signal after noise cancellation by utilizing the energy ratio before and after noise cancellation and the signal-to-noise likelihood ratio before and after noise cancellation;
and b7, outputting the repaired and adjusted external voice signal as a user voice signal.
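Steps b3 and b4 amount to estimating a per-bin SNR and applying the classic Wiener gain SNR/(1+SNR) in the frequency domain. The sketch below shows that core operation under assumed inputs (a complex FFT frame plus a per-bin noise-power estimate from the noise model of step b1); the repair and adjustment of steps b5 and b6 are omitted:

```python
def wiener_gain(snr):
    """Classic Wiener filter gain for a given SNR estimate (step b4)."""
    return snr / (1.0 + snr)

def denoise_frame(spectrum, noise_power):
    """Suppress noise in one frame: `spectrum` is a list of complex FFT bins,
    `noise_power` the per-bin noise power estimated from the noise model."""
    out = []
    for bin_val, n_pow in zip(spectrum, noise_power):
        if n_pow <= 0.0:
            out.append(bin_val)  # no noise estimated in this bin; pass through
            continue
        sig_pow = abs(bin_val) ** 2
        # SNR estimate by power subtraction, clamped so the gain stays in [0, 1)
        snr = max(sig_pow - n_pow, 0.0) / n_pow
        out.append(bin_val * wiener_gain(snr))
    return out
```

The inverse FFT of the gained spectrum would then yield the denoised time-domain frame.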
In the device control method based on voice recognition, after the successful execution of step 4, the method further comprises: and 5, the device executes self-learning aiming at the voice of the user and executes corresponding actions again according to the self-learning result.
Further, the process of the device performing self-learning for the user's voice and performing corresponding actions again according to the self-learning result includes the following steps c1 to c5:
step c1, taking the external voice signal matched and consistent with any preset standard voice data as a voice command to be learned of the equipment;
step c2, obtaining repeated user voice control instructions which are sent by the user again and have the same content as the voice instructions to be learned;
step c3, respectively extracting the voice characteristic parameter set to be learned of the voice instruction to be learned and the user voice characteristic parameter sets of the voice control instructions of each time according to the same voice characteristic parameter extraction method;
step c4, performing matching judgment according to the extracted voice characteristic parameter set to be learned and the voice characteristic parameter sets of each user:
when the matching times of the user voice characteristic parameter set and the voice characteristic parameter set to be learned reach preset times, the voice characteristic parameter set to be learned is used as a user control voice characteristic parameter set for representing the user control equipment; turning to step c5; otherwise, feeding back prompt information of failure of learning the voice command to the user;
and c5, when the user control voice matched with the user control voice characteristic parameter set is obtained again, executing the action corresponding to the voice instruction to be learned by the equipment.
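Steps c1 to c5 reduce to counting how many repeated user instructions match the to-be-learned feature set. The tolerance metric below is an assumption; the patent says only that parameters must fall within an allowable matching range:

```python
def within_range(a, b, tol=0.1):
    """One characteristic parameter 'matches' when it lies within an allowable
    range of its counterpart; a relative tolerance is assumed here."""
    return abs(a - b) <= tol * max(abs(a), abs(b), 1e-12)

def learn_user_instruction(to_learn, repeated_sets, required_matches=2):
    """Steps c3-c4: compare each repeated user instruction's feature set against
    the to-be-learned set; learning succeeds once enough repeats match."""
    matches = sum(
        1 for feature_set in repeated_sets
        if all(within_range(p, q) for p, q in zip(to_learn, feature_set))
    )
    if matches >= required_matches:
        return to_learn   # becomes the user-control feature set used in step c5
    return None           # learning failed: prompt the user and retry
```

On success the returned set becomes the recognition standard for that instruction; on failure the device would feed back the learning-failure prompt described in step c4.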
Optionally, in the device control method based on voice recognition, the preset standard voice in the preset standard voice control command database is a voice control command of the device system or a voice command recorded by a user.
Further, in the voice recognition-based device control method, the device is a home appliance device.
Compared with the prior art, the invention has the following advantages:
According to the equipment control method, the collected external voice signal is matched against the database of preset standard voice control instructions stored in the device. Once the collected external voice signal matches any preset standard voice control instruction, the device executes the action corresponding to that instruction, so the user can control the device by voice alone. Manual operation is avoided, the user's hands are effectively freed, and the user's control experience is improved;
Moreover, the invention also has the device perform self-learning on the user's voice to obtain a voice characteristic parameter set that reflects the user's individual speech, avoiding the problem that a user's dialectal pronunciation is difficult to match accurately against the preset standard voice control instructions. Once the device again captures speech with the same voice characteristic parameter set, it can accurately recognize the user's personalized voice, improving both the device's recognition accuracy for the user's voice and the responsiveness of interactive voice control.
Drawings
Fig. 1 is a schematic flow chart of a device control method based on voice recognition in an embodiment of the invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and embodiments.
In this embodiment, a kitchen electrical appliance is taken as an example to describe the device control method of the invention. Referring to fig. 1, the device control method based on voice recognition in this embodiment includes the following steps 1 to 4:
step 1, a preset standard voice control instruction database for controlling kitchen electric equipment to execute actions is constructed; wherein, the preset standard voice corresponding to the action executed by the kitchen electric equipment is stored in the preset standard voice control instruction database;
for example, for the kitchen appliance, the preset standard voice control instruction database stores standard voices recorded in standard Mandarin that correspond to the appliance's functions, such as a start voice control instruction S1, an end voice control instruction S2, an increase-gear voice control instruction S3, a decrease-gear voice control instruction S4, and so on. As long as the degree of matching between the speech uttered by the user and a stored standard voice reaches the set threshold, the user is considered to have uttered that standard voice;
of course, the preset standard voice in the preset standard voice control command database can be a voice control command of the kitchen electric equipment when leaving the factory, or can be a voice control command recorded by a user after the user purchases the kitchen electric equipment;
step 2, detecting and acquiring an external voice signal outside the kitchen electric equipment, and preprocessing the external voice signal;
specifically, in step 2 of the present embodiment, the preprocessing process for the external voice signal here includes the following steps 2-1 to 2-4:
step 2-1, performing endpoint detection on an external voice signal to obtain a user voice signal and an external noise signal in the external voice signal;
assuming that an external voice signal collected by the device is marked as X, after endpoint detection is executed, a user voice signal in the external voice signal X is Sound, and an external Noise signal in the external voice signal X is Noise; the endpoint detection in this embodiment belongs to the prior art, and is not described here again;
it should be noted that, in the external voice signal described in this embodiment, the process of acquiring the user voice signal and the external noise signal is as follows:
step a1, constructing a voice model fused with a user voice signal and an external noise signal; wherein, the speech model is as follows:
where x_k is the sub-band energy of the selected signal; z = 0 denotes that the selected signal is an external noise signal, and z = 1 that it is a user voice signal; r_k is the parameter set comprising the parameters μ_z and σ_z²; μ_z denotes the mean amplitude of signal z, and σ_z² denotes the energy of signal z; p(x_k | z, r_k) denotes the probability that the selected signal is of type z;
step a2, calculating the probability that the signals in the external voice signals are the user voice signals and the probability of the external noise signals respectively according to the constructed voice model;
step a3, determining the signal type of the external voice signal by using a hypothesis testing method as the prior art according to the probability result obtained in the step a 2; the signal type is a user voice signal or an external noise signal;
specifically, in step a3, the signal type determination process of the user voice signal and the external noise signal in the external voice signal includes the following steps b1 to b7:
step b1, constructing a noise model by using signal characteristic data of a first preset frame number before starting the step 3;
step b2, calculating a normalized spectrum difference value by using the signal intensity of the second preset frame number before starting the step 3;
step b3, calculating the signal-to-noise ratio in each frame of signal by adopting a probability density function according to the constructed noise model and the obtained normalized spectrum difference value, and distinguishing a user voice signal and an external noise signal;
step b4, according to the signal-to-noise ratio in each frame of signals, using a wiener filter to eliminate external noise signals in external voice signals in a frequency domain;
step b5, calculating the energy ratio of the external voice signal before and after the noise elimination and the signal-to-noise likelihood ratio of the external voice signal before and after the noise elimination;
step b6, repairing and adjusting the external voice signal after noise elimination by utilizing the energy ratio before and after noise elimination and the signal-to-noise likelihood ratio before and after noise elimination;
and b7, outputting the repaired and adjusted external voice signal as a user voice signal. That is, after the repairing and adjusting process of the step b6, only the user voice signal remains in the external voice signal after the noise is eliminated, so that the purpose of determining the user voice signal and the external noise signal in the external voice signal is achieved;
in this embodiment, by adopting steps b1 to b7, noise in the external voice signal collected by the kitchen appliance can be eliminated so that only the voice instruction uttered by the user remains. This avoids the adverse effect of noise on recognition of the user's voice instructions, improves the recognition rate of voice control instructions for the kitchen appliance, and makes the appliance's voice response to the user more timely;
step 2-2, eliminating an external Noise signal Noise in the external voice signal X to obtain a Noise-eliminated user voice signal Sound; that is, after the step 2-2 is performed, only the voice signal Sound of the user remains in the so-called external voice signal X here; as for the elimination of the external Noise signal Noise here, a conventional wavelet Noise filtering method may be adopted, or the external Noise signal Noise here may be eliminated in the manner of the above-described steps b1 to b 4;
step 2-3, extracting a voice characteristic parameter set in a user voice signal according to preset voice characteristic parameters;
for example, the preset voice characteristic parameters may be characteristic parameters obtained based on parameters such as amplitude, frequency or frequency spectrum of the voice signal, and the voice characteristic parameters set includes characteristic parameters required for recognizing voice; the number or the type of the characteristic parameters in the voice characteristic parameter set can be selectively set according to actual requirements;
step 2-4, taking the extracted voice characteristic parameter set as a preprocessing result aiming at the external voice signal;
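Step 2-3 leaves the concrete characteristic parameters open; purely as an illustration, the sketch below derives one amplitude feature (RMS energy), one frequency feature (zero-crossing rate), and one spectrum feature (spectral centroid) from a raw sample list. The naive DFT is O(n²) and suits only short frames:

```python
import math

def voice_feature_set(samples, sample_rate=16000):
    """Extract an illustrative voice characteristic parameter set (step 2-3)."""
    n = len(samples)
    # amplitude feature: root-mean-square energy
    rms = math.sqrt(sum(s * s for s in samples) / n)
    # frequency feature: zero-crossing rate, a crude voicing/pitch proxy
    zcr = sum(1 for i in range(1, n) if samples[i - 1] * samples[i] < 0) / n
    # spectrum feature: spectral centroid of a naive DFT magnitude spectrum
    mags = []
    for k in range(n // 2):
        re = sum(s * math.cos(2 * math.pi * k * t / n) for t, s in enumerate(samples))
        im = -sum(s * math.sin(2 * math.pi * k * t / n) for t, s in enumerate(samples))
        mags.append(math.hypot(re, im))
    total = sum(mags) or 1.0
    centroid = sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total
    return [rms, zcr, centroid]
```

A real device would compute such features per frame and aggregate them into the parameter set compared in step 3.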
step 3, carrying out matching judgment processing on the preprocessed external voice signal and a preset standard voice control command database:
when the preprocessed external voice signal matches any preset standard voice in the preset standard voice control instruction database, the content of the preprocessed external voice signal corresponds to that preset standard voice; that preset standard voice is then taken as the external voice control instruction, and the method proceeds to step 4; otherwise, return to step 2;
and 4, commanding the equipment to execute the action corresponding to the external voice control instruction. For example, once it is judged that the preprocessed external voice signal (specifically, the user voice signal Sound after noise cancellation) matches with the preset standard voice "increase gear" voice control command S3, it is indicated that the user sends a control command of "increase gear" to the kitchen electric equipment at this time, so that the kitchen electric equipment at this time increases the gear on the basis of the current gear of the user, and the control requirement of the user on the kitchen electric equipment is met.
Of course, to accommodate the individual characteristics of the user's voice and avoid poor matching accuracy between the user's dialectal pronunciation and the preset standard voice control instructions, the device control method of this embodiment further comprises, after step 4 has executed successfully: step 5, the kitchen appliance performs self-learning on the user's voice and executes the corresponding action again according to the self-learning result. The process by which the kitchen appliance performs this self-learning and executes the corresponding action again specifically includes the following steps c1 to c5:
step c1, taking an external voice signal matched with any preset standard voice data as a voice command to be learned of the kitchen electric equipment;
since this embodiment assumes that the preprocessed external voice signal (specifically, the noise-removed user voice signal Sound) matches the preset standard "increase gear" voice control instruction S3, in step c1 the external voice signal matching the "increase gear" voice control instruction S3 is taken as the to-be-learned voice instruction of the kitchen appliance;
step c2, obtaining repeated user voice control instructions which are sent by the user again and have the same content as the voice instructions to be learned;
step c3, respectively extracting a to-be-learned voice characteristic parameter set of the to-be-learned voice instruction and a user voice characteristic parameter set of each user voice control instruction according to the same voice characteristic parameter extraction method;
step c4, performing matching judgment according to the extracted voice characteristic parameter set to be learned and the voice characteristic parameter sets of each user:
when the matching times of the user voice characteristic parameter set and the voice characteristic parameter set to be learned reach the preset times, the voice characteristic parameter set to be learned is used as a user control voice characteristic parameter set for representing the user control equipment; turning to step c5; otherwise, feeding back prompt information of failure of learning the voice command to the user;
for example, suppose the kitchen appliance is required to acquire three user voice control instructions with the same content as the to-be-learned voice instruction S3; denote the first extracted user voice instruction K1, the second K2, and the third K3. Assume the voice characteristic parameter set used for the to-be-learned instruction and the three user voice control instructions comprises voice characteristic parameter 1, voice characteristic parameter 2, and voice characteristic parameter 3;
the matching judgment process for step c4 is additionally described as follows:
when the preset voice characteristic parameter set (comprising voice characteristic parameters 1, 2, and 3) is used to match the to-be-learned voice instruction S3 against the first extracted user voice instruction K1, and all three corresponding voice characteristic parameters of the two instructions fall within the allowable matching range, the user voice instruction K1 is considered to match the to-be-learned voice instruction S3;
similarly, the matching judgment is performed for user voice instruction K2 against S3 and for user voice instruction K3 against S3. Once the number of successful matches among the three comparisons reaches the preset number (for example, two), the voice characteristic parameter set used for matching (comprising voice characteristic parameters 1, 2, and 3) is taken as the user-control voice characteristic parameter set characterizing the user's control of the device; that is, subsequent voice control of the kitchen appliance uses this user-control voice characteristic parameter set as the recognition and matching standard;
step c5, when user control voice matching the user-control voice characteristic parameter set is captured again, the device executes the action corresponding to the to-be-learned voice instruction. In this way, the kitchen appliance performs self-learning on the user's voice and obtains a voice characteristic parameter set that reflects the user's individual speech, avoiding the problem that the user's dialectal pronunciation is difficult to match accurately against the preset standard voice control instructions. Once the kitchen appliance again captures speech with the same voice characteristic parameter set, it can accurately recognize the user's personalized voice, improving both the appliance's recognition accuracy for the user's voice and the responsiveness of interactive voice control.
It should be noted that the device control method based on voice recognition in the present embodiment may also be applied to home electric appliances such as an air conditioner and a television or other devices in a factory.
Claims (6)
1. The equipment control method based on voice recognition is characterized by comprising the following steps 1 to 4:
step 1, a preset standard voice control instruction database for controlling equipment to execute actions is constructed; wherein, the preset standard voice control instruction database stores preset standard voice corresponding to the action executed by the equipment;
step 2, detecting and acquiring an external voice signal outside the equipment, and preprocessing the external voice signal;
step 3, performing matching judgment processing on the preprocessed external voice signal and the preset standard voice control command database:
when the preprocessed external voice signal is matched and consistent with any preset standard voice in the preset standard voice control instruction database, taking the preset standard voice as an external voice control instruction, and turning to step 4; otherwise, turning to step 2;
step 4, commanding the equipment to execute the action corresponding to the external voice control instruction;
in step 2, the preprocessing process for the external voice signal includes step 2-1: performing endpoint detection on the external voice signal to acquire a user voice signal and an external noise signal in the external voice signal; in this step 2-1, the user voice signal and the external noise signal acquisition process in the external voice signal are as follows steps a1 to a3:
step a1, constructing a voice model fused with the user voice signal and the external noise signal; wherein the voice model gives the conditional probability p(x_k | z, r_k),
in which x_k is the sub-band energy of the selected signal; z = 0 denotes that the selected signal is an external noise signal, and z = 1 denotes that the selected signal is a user voice signal; r_k is the parameter set comprising the parameters μ_z and σ_z²; μ_z denotes the mean amplitude of signal z, and σ_z² denotes the energy of signal z; and p(x_k | z, r_k) is the probability that the selected signal is of type z;
step a2, calculating, according to the constructed voice model, the probability that a signal in the external voice signal is the user voice signal and the probability that it is the external noise signal;
step a3, determining the signal type of the external voice signal by hypothesis testing according to the probability results obtained in step a2, the signal type being either user voice signal or external noise signal; wherein, in step a3, the determination distinguishing the user voice signal from the external noise signal comprises the following steps b1 to b7:
step b1, constructing a noise model using the signal characteristic data of a first preset number of frames preceding the start of step 3;
step b2, calculating a normalized spectrum difference value using the signal intensity of a second preset number of frames preceding the start of step 3;
step b3, calculating the signal-to-noise ratio of each signal frame and distinguishing the user voice signal from the external noise signal using a probability density function, according to the constructed noise model and the obtained normalized spectrum difference value;
step b4, eliminating, with a Wiener filter in the frequency domain, the external noise signal from the external voice signal according to the per-frame signal-to-noise ratio;
step b5, calculating the energy ratio of the external voice signal before and after noise elimination, and the signal-to-noise likelihood ratio of the external voice signal before and after noise elimination;
step b6, repairing and adjusting the noise-eliminated external voice signal using the energy ratio and the signal-to-noise likelihood ratio before and after noise elimination;
and b7, outputting the repaired and adjusted external voice signal as the user voice signal.
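Steps a1 to a3 and the Wiener-filter step b4 can be sketched as follows. The claim does not print the density used by the voice model, so this sketch assumes a Gaussian form of p(x_k | z, r_k) parameterized by the stated mean μ_z and energy σ_z², decides the frame type by a likelihood comparison (a simple instance of hypothesis testing), and uses the textbook Wiener gain SNR / (1 + SNR); all of these concrete forms are assumptions, not the patent's exact formulas.

```python
import math

def gaussian_likelihood(x, mu, var):
    # Assumed form of the voice model p(x_k | z, r_k): the claim only
    # states that r_k comprises the mean mu_z and the energy sigma_z^2.
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def classify_frame(energy, noise_mu, noise_var, speech_mu, speech_var):
    """Steps a2-a3 (sketch): compare the likelihoods that a frame's
    sub-band energy came from noise (z = 0) or user speech (z = 1);
    the larger likelihood wins the hypothesis test."""
    p_noise = gaussian_likelihood(energy, noise_mu, noise_var)
    p_speech = gaussian_likelihood(energy, speech_mu, speech_var)
    return "user_voice" if p_speech > p_noise else "external_noise"

def wiener_gain(snr):
    # Step b4 (textbook form): per-frame frequency-domain Wiener
    # attenuation G = SNR / (1 + SNR) applied to suppress noise.
    return snr / (1.0 + snr)
```

A frame whose energy sits near the speech mean is labeled `user_voice`; frames near the noise statistics are labeled `external_noise` and attenuated.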
2. The voice recognition-based device control method according to claim 1, wherein in step 2, the preprocessing of the external voice signal further includes the following steps 2-2 to 2-4:
step 2-2, eliminating the external noise signal in the external voice signal to obtain a user voice signal after noise elimination processing;
step 2-3, extracting a voice characteristic parameter set in the user voice signal according to preset voice characteristic parameters;
and 2-4, taking the extracted voice characteristic parameter set as a preprocessing result aiming at the external voice signal.
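The feature extraction of steps 2-2 to 2-4 can be sketched as below. The patent does not name the preset voice characteristic parameters, so this sketch uses per-frame mean-square energy as a stand-in feature; the frame length and the feature itself are illustrative assumptions.

```python
def extract_feature_set(samples, frame_len=4):
    """Steps 2-3 to 2-4 (sketch): after noise elimination, split the
    user voice signal into fixed-length frames and compute one feature
    per frame (here, mean-square energy); the resulting list is the
    voice characteristic parameter set returned as the preprocessing
    result for the external voice signal."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        features.append(sum(s * s for s in frame) / frame_len)
    return features
```

Any real implementation would substitute the patent's preset parameters (e.g. spectral features) for this energy placeholder.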
3. The voice recognition-based device control method according to claim 1, further comprising, after step 4 is executed successfully: step 5, the device performs self-learning on the user's voice and executes the corresponding action again according to the self-learning result.
4. The voice recognition-based device control method according to claim 3, wherein the process of the device performing self-learning for the user's voice and performing corresponding actions again according to the self-learning result comprises the following steps c1 to c5:
step c1, taking the external voice signal matched and consistent with any preset standard voice data as a voice command to be learned of the equipment;
step c2, obtaining repeated user voice control instructions, uttered again by the user, whose content is the same as that of the voice instruction to be learned;
step c3, extracting, with the same voice characteristic parameter extraction method, the to-be-learned voice characteristic parameter set of the voice instruction to be learned and the user voice characteristic parameter set of each repeated voice control instruction;
step c4, performing matching judgment between the extracted to-be-learned voice characteristic parameter set and each user voice characteristic parameter set:
when the number of matches between the user voice characteristic parameter sets and the to-be-learned voice characteristic parameter set reaches a preset number of times, taking the to-be-learned voice characteristic parameter set as the user control voice characteristic parameter set characterizing the user's control of the device, and turning to step c5; otherwise, feeding back to the user a prompt that learning the voice instruction has failed;
and c5, when the user control voice matched with the user control voice characteristic parameter set is obtained again, executing the action corresponding to the voice instruction to be learned by the equipment.
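The c1 to c5 self-learning loop amounts to counting matches over repeated utterances. In this sketch the similarity rule (same length, per-feature tolerance) and the preset repetition count are illustrative assumptions, not the patent's matching judgment.

```python
def learn_user_command(target_features, repeated_features,
                       preset_times=3, tol=0.5):
    """Steps c3-c5 (sketch): count how many of the user's repeated
    utterances match the to-be-learned feature set; once preset_times
    matches are reached, promote it to the user control voice
    characteristic parameter set, otherwise report failure (the
    device would prompt the user that learning failed)."""
    def matches(a, b):
        # Hypothetical similarity rule: same length, features close.
        return len(a) == len(b) and all(abs(x - y) <= tol
                                        for x, y in zip(a, b))

    hits = sum(1 for f in repeated_features if matches(target_features, f))
    if hits >= preset_times:
        return {"learned": True, "user_control_features": target_features}
    return {"learned": False, "user_control_features": None}
```

Once `learned` is True, any later utterance matching the stored feature set triggers the learned action directly (step c5).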
5. The voice recognition-based device control method of any one of claims 1-4, wherein the preset standard voice in the preset standard voice control command database is a voice control command of a device system or a voice command entered by a user.
6. The device control method based on voice recognition according to any one of claims 1 to 4, wherein the device is a household appliance.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811264461.4A CN111105798B (en) | 2018-10-29 | 2018-10-29 | Equipment control method based on voice recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111105798A CN111105798A (en) | 2020-05-05 |
CN111105798B true CN111105798B (en) | 2023-08-18 |
Family
ID=70420301
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811264461.4A Active CN111105798B (en) | 2018-10-29 | 2018-10-29 | Equipment control method based on voice recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111105798B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7536667B2 (en) * | 2021-01-21 | 2024-08-20 | Tvs Regza株式会社 | Voice command processing circuit, receiving device, remote control and system |
CN113413613B (en) * | 2021-06-17 | 2024-06-25 | 网易(杭州)网络有限公司 | Method and device for optimizing voice chat in game, electronic equipment and medium |
CN113763935A (en) * | 2021-08-20 | 2021-12-07 | 重庆长安汽车股份有限公司 | Method and system for controlling electric appliance of vehicle body through voice outside vehicle, vehicle and storage medium |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1236928A (en) * | 1998-05-25 | 1999-12-01 | 郭巧 | Computer aided Chinese intelligent education system and its implementation method |
KR20030095474A (en) * | 2002-06-10 | 2003-12-24 | 휴먼씽크(주) | Method and apparatus for analysing a pitch, method and system for discriminating a corporal punishment, and computer readable medium storing a program thereof |
CN103456303A (en) * | 2013-08-08 | 2013-12-18 | 四川长虹电器股份有限公司 | Method for controlling voice and intelligent air-conditioner system |
KR20140060187A (en) * | 2012-11-09 | 2014-05-19 | 현대자동차주식회사 | Apparatus for controlling amplifier gain in voice recognition system and method thereof |
CN104715752A (en) * | 2015-04-09 | 2015-06-17 | 刘文军 | Voice recognition method, voice recognition device and voice recognition system |
CN104952447A (en) * | 2015-04-30 | 2015-09-30 | 深圳市全球锁安防系统工程有限公司 | Intelligent wearing equipment for safety and health service for old people and voice recognition method |
CN105202721A (en) * | 2015-07-31 | 2015-12-30 | 广东美的制冷设备有限公司 | Air conditioner and control method thereof |
CN105791931A (en) * | 2016-02-26 | 2016-07-20 | 深圳Tcl数字技术有限公司 | Smart television and voice control method of the smart television |
CN106057194A (en) * | 2016-06-25 | 2016-10-26 | 浙江合众新能源汽车有限公司 | Voice interaction system |
WO2018107874A1 (en) * | 2016-12-16 | 2018-06-21 | 广州视源电子科技股份有限公司 | Method and apparatus for automatically controlling gain of audio data |
CN108231063A (en) * | 2016-12-13 | 2018-06-29 | 中国移动通信有限公司研究院 | A kind of recognition methods of phonetic control command and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB201617016D0 (en) * | 2016-09-09 | 2016-11-23 | Continental automotive systems inc | Robust noise estimation for speech enhancement in variable noise conditions |
Non-Patent Citations (1)
Title |
---|
Sohn J et al. A Statistical Model-based Voice Activity Detection. IEEE Signal Processing Letters, 1999, Vol. 6, No. 1, pp. 1-3. *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108597496B (en) | Voice generation method and device based on generation type countermeasure network | |
CN111105798B (en) | Equipment control method based on voice recognition | |
CN109215665A (en) | A kind of method for recognizing sound-groove based on 3D convolutional neural networks | |
CN109448711A (en) | Voice recognition method and device and computer storage medium | |
CN102005070A (en) | Voice identification gate control system | |
CN111081223B (en) | Voice recognition method, device, equipment and storage medium | |
CN106558306A (en) | Method for voice recognition, device and equipment | |
CN111583936A (en) | Intelligent voice elevator control method and device | |
CN109256139A (en) | A kind of method for distinguishing speek person based on Triplet-Loss | |
CN102945673A (en) | Continuous speech recognition method with speech command range changed dynamically | |
CN110211609A (en) | A method of promoting speech recognition accuracy | |
CN109215634A (en) | A kind of method and its system of more word voice control on-off systems | |
CN109087646B (en) | Method for leading-in artificial intelligence ultra-deep learning for voice image recognition | |
CN110931018A (en) | Intelligent voice interaction method and device and computer readable storage medium | |
CN111192573B (en) | Intelligent control method for equipment based on voice recognition | |
CN112017658A (en) | Operation control system based on intelligent human-computer interaction | |
CN108172220A (en) | A kind of novel voice denoising method | |
CN116343797A (en) | Voice awakening method and corresponding device | |
CN113077812B (en) | Voice signal generation model training method, echo cancellation method, device and equipment | |
CN106887226A (en) | Speech recognition algorithm based on artificial intelligence recognition | |
CN117316164A (en) | Voice interaction processing method and device, storage medium and electronic equipment | |
CN108492821B (en) | Method for weakening influence of speaker in voice recognition | |
CN116612754A (en) | Voice instruction recognition method and device applied to vehicle | |
CN115331670B (en) | Off-line voice remote controller for household appliances | |
CN107993666B (en) | Speech recognition method, speech recognition device, computer equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||