WO2019242312A1

WO2019242312A1 - Wakeup word training method and device of household appliance, and household appliance

Info

Publication number: WO2019242312A1
Application number: PCT/CN2019/074317
Authority: WO
Inventors: 孙裕文; 谭博钊
Original assignee: 广东美的厨房电器制造有限公司
Priority date: 2018-06-19
Filing date: 2019-02-01
Publication date: 2019-12-26

Abstract

A wakeup word training method and device of a household appliance, and a household appliance. The method comprises: collecting a voice data sample of a wakeup word (S101); extracting the feature information of the voice data sample (S102); normalizing the feature information, and detecting and determining whether the normalized feature information meets a preset condition; and saving the voice data sample of the wakeup word to a user-defined wakeup word library (S104). By collecting the voice data sample of the wakeup word, extracting the feature information of the voice data sample, normalizing the feature information, detecting and determining whether the normalized feature information meets a preset condition, and saving the voice data sample of the wakeup word to a user-defined wakeup word library, the wakeup word is defined by a user and the personalized demands of the user are satisfied.

Description

Method and device for training wake-up words of home appliances and home appliances

Cross-reference to related applications

This disclosure claims priority to Chinese Patent Application No. 201810628693.7, entitled "Wake Word Training Method and Apparatus for Home Appliances and Home Appliances", filed by Guangdong Midea Kitchen Appliance Manufacturing Co., Ltd. on June 19, 2018, and to Chinese Patent Application No. 201810885079.9, entitled "Wake Word Training Method, Apparatus, and Home Appliances for Home Appliances", filed on August 6, 2018.

Technical field

The present disclosure relates to the technical field of household appliances, and in particular, to a wake word training method and device for household appliances and household appliances.

Background technique

With the continuous advancement of science and technology, products built on speech recognition technology are used in an increasingly wide range of fields, including in-vehicle systems, robots, home services, banking services, medical services, and industrial control. At present, speech recognition technology falls mainly into two categories. The first is cloud-based semantic recognition: voice signals are transmitted over the network to a server for semantic analysis and understanding, and the results are returned over the network. Typical examples are Apple's Siri voice assistant, Amazon's Echo speaker, and Microsoft's Xiaoice (Xiaobing). However, this approach works only when a network connection is available, which limits its usage scenarios. The second is local entry recognition, which does not require a network and can process voice control command words in real time on an embedded high-performance processor. However, it can recognize only preset voice control command entries, and it must recognize a complete command entry before responding. It cannot achieve free semantic understanding, and the user experience is poor.

Summary of the Invention

The present disclosure provides a wake word training method and device for a home appliance, and a home appliance to implement a user-defined wake word to meet a user's personalized needs.

An embodiment of the first aspect of the present disclosure provides a wakeup word training method for a home appliance, including:

Collecting voice data samples of a wake-up word;

Extracting feature information of the voice data samples;

Normalizing the feature information, and detecting and determining that the normalized feature information meets a preset condition; and

Saving the voice data samples of the wake-up word in a custom wake-up word library.

As a first possible implementation manner of the embodiment of the first aspect of the present disclosure, the method further includes:

Detecting and determining that the normalized feature information does not satisfy the preset condition, and re-collecting voice data samples of the wake-up word.

As a second possible implementation manner of the embodiment of the first aspect of the present disclosure, the method further includes:

After collecting the voice data samples of the wake word, the voice data samples are denoised.

As a third possible implementation manner of the embodiment of the first aspect of the present disclosure, the method further includes:

Before collecting voice data samples of the wake word, control the home appliance to enter a custom wake word mode.

As a fourth possible implementation manner of the embodiment of the first aspect of the present disclosure, the method further includes:

Receiving input voice information;

Detecting and determining, based on the custom wake-up word library, that the voice information is a custom wake-up word; and

Generating a wake-up instruction, and waking up the home appliance according to the wake-up instruction.

As a fifth possible implementation manner of the embodiment of the first aspect of the present disclosure, detecting and determining, based on the custom wake-up word library, that the voice information is a custom wake-up word includes:

Extracting feature information of the voice information;

Normalizing the feature information of the voice information, and using a dynamic time warping (DTW) algorithm to compare the feature information of the voice information with the feature information of the wake-up words in the custom wake-up word library;

Obtaining the comparison result with the highest similarity; and

Detecting and determining that the comparison result with the highest similarity satisfies a set value, and determining that the voice information is a custom wake-up word.

As a sixth possible implementation manner of the embodiment of the first aspect of the present disclosure, the method further includes:

Detect and determine that the voice information is not a custom wake-up word, and prompt to re-enter the voice information.

The wake-up word training method for a home appliance according to an embodiment of the present disclosure collects voice data samples of a wake-up word, extracts feature information of the voice data samples, normalizes the feature information, detects and determines that the normalized feature information satisfies a preset condition, and saves the voice data samples of the wake-up word in a custom wake-up word library, thereby realizing a user-defined wake-up word and satisfying the personalized needs of the user.

An embodiment of the second aspect of the present disclosure provides another wake word training method for a home appliance, including:

Controlling the home appliance to enter a custom wake-up word mode;

Collecting an input wake-up word;

Training the wake-up word, and detecting and determining that the training of the wake-up word is successful; and

Performing the next collection and training of the wake-up word until the N-th training of the wake-up word is successful, where N is a positive integer.

As a first possible implementation manner of the embodiment of the second aspect of the present disclosure, the method further includes:

After the N-th training of the wake-up word succeeds, determining that the wake-up word is valid, and saving the valid wake-up word locally.

As a second possible implementation manner of the embodiment of the second aspect of the present disclosure, the method further includes:

After determining that the wake-up word is valid, receiving an input of the valid wake-up word; and

Waking up the home appliance according to the valid wake-up word.

As a third possible implementation manner of the embodiment of the second aspect of the present disclosure, training the wake-up word, and detecting and determining that the training of the wake-up word is successful, includes:

Extracting feature information of the wake-up word; and

Detecting and determining that the feature information of the wake-up word meets a preset standard, and determining that the training of the wake-up word is successful.

As a fourth possible implementation manner of the embodiment of the second aspect of the present disclosure, performing the next collection and training of the wake-up word until the N-th training of the wake-up word is successful includes:

Extracting feature information of the wake-up word input for the M-th time;

Detecting and determining that the feature information of the wake-up word input for the M-th time meets the preset standard, and performing a similarity calculation between the feature information of the wake-up word input for the M-th time and the feature information of the wake-up words input the previous M-1 times; and

Detecting and determining that the similarity between the feature information of the wake-up word input for the M-th time and the feature information of the wake-up words input the previous M-1 times is greater than a preset similarity, and determining that the training of the wake-up word is successful.

The wake-up word training method for a home appliance according to an embodiment of the present disclosure controls the home appliance to enter a custom wake-up word mode, collects the input wake-up word, trains the wake-up word, detects and determines that the training of the wake-up word is successful, and then performs the next collection and training until the N-th training of the wake-up word is successful, so that the user can customize the wake-up word to meet personalized needs, and the trained wake-up word is highly accurate.

An embodiment of the third aspect of the present disclosure provides a wake word training apparatus for a home appliance, including:

A first acquisition module, configured to collect speech data samples of wake words;

An extraction module, configured to extract feature information of the voice data samples;

A judging module for normalizing the feature information, detecting and determining that the normalized feature information meets a preset condition;

The first saving module is configured to save the speech data samples of the wake-up word to a custom wake-up word bank.

As a first possible implementation manner of the embodiment of the third aspect of the present disclosure, the first acquisition module is further configured to:

Detect and determine that the normalized feature information does not satisfy the preset condition, and re-collect voice data samples of the wake-up word.

As a second possible implementation manner of the embodiment of the third aspect of the present disclosure, the apparatus further includes:

The preprocessing module is configured to perform denoising processing on the voice data samples after collecting the voice data samples of the wake word.

As a third possible implementation manner of the embodiment of the third aspect of the present disclosure, the apparatus further includes:

The first control module is configured to control the home appliance to enter a custom wake-up word mode before collecting voice data samples of the wake-up word.

As a fourth possible implementation manner of the embodiment of the third aspect of the present disclosure, the apparatus further includes:

A first receiving module, configured to receive input voice information;

A recognition module, configured to detect and determine that the voice information is a custom wake-up word based on the custom wake-up thesaurus;

The first wake-up module is configured to generate a wake-up instruction and wake up the home appliance according to the wake-up instruction.

As a fifth possible implementation manner of the embodiment of the third aspect of the present disclosure, the identification module is configured to:

Extract feature information of the voice information;

Obtain the comparison result with the highest similarity.

As a sixth possible implementation manner of the embodiment of the third aspect of the present disclosure, the apparatus further includes:

A prompting module is configured to detect and determine that the voice information is not a custom wake-up word, and prompt to re-enter the voice information.

The wake-up word training device for a home appliance in the embodiment of the present disclosure collects voice data samples of a wake-up word, extracts feature information of the voice data samples, normalizes the feature information, detects and determines that the normalized feature information satisfies a preset condition, and saves the voice data samples of the wake-up word in a custom wake-up word library, thereby realizing a user-defined wake-up word and satisfying the personalized needs of the user.

An embodiment of the fourth aspect of the present disclosure provides another wake word training device for a home appliance, including:

A second control module, for controlling a home appliance to enter a custom wake-up word mode;

A second acquisition module, configured to collect an input wake-up word;

A training module, configured to train the wake-up word and to detect and determine that the training of the wake-up word is successful, whereupon the second acquisition module performs the next collection and the training module performs the next training, until the N-th training of the wake-up word is successful, where N is a positive integer.

As a first possible implementation manner of the embodiment of the fourth aspect of the present disclosure, the apparatus further includes:

The second saving module is configured to determine that the wakeup word is effective after the Nth training wakeup word is successful, and save the valid wakeup word locally.

As a second possible implementation manner of the embodiment of the fourth aspect of the present disclosure, the apparatus further includes:

A second receiving module, configured to receive an input effective wake-up word after determining that the wake-up word is valid;

The second wake-up module is configured to wake up the home appliance according to the valid wake-up word.

As a third possible implementation manner of the embodiment of the fourth aspect of the present disclosure, the training module is configured to:

Extracting feature information of the wake word;

As a fourth possible implementation manner of the embodiment of the fourth aspect of the present disclosure, the training module is further configured to:

Extract feature information of the awake words inputted for the Mth time;

The wake-up word training device for a home appliance in the embodiment of the present disclosure controls the home appliance to enter a custom wake-up word mode, collects the input wake-up word, trains the wake-up word, detects and determines that the training of the wake-up word is successful, and then performs the next collection and training until the N-th training of the wake-up word is successful, so that the user can customize the wake-up word to meet personalized needs, and the trained wake-up word is highly accurate.

An embodiment of the fifth aspect of the present disclosure provides a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the wake-up word training method for a home appliance as described in the embodiment of the first aspect, or the wake-up word training method for a home appliance as described in the embodiment of the second aspect.

An embodiment of the sixth aspect of the present disclosure provides a home appliance including a processor, a memory, and a computer program stored on the memory and executable on the processor, where the processor is configured to execute the wake-up word training method for a home appliance according to the embodiment of the first aspect, or the wake-up word training method for a home appliance according to the embodiment of the second aspect.

Additional aspects and advantages of the present disclosure will be given in part in the following description, and part of them will become apparent from the following description, or be learned through the practice of the present disclosure.

Brief description of the drawings

In order to explain the technical solutions in the embodiments of the present disclosure more clearly, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings in the following description are some embodiments of the present disclosure. Those of ordinary skill in the art can obtain other drawings based on these drawings without paying creative labor.

FIG. 1 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 1 of the present disclosure;

FIG. 2 is a schematic flowchart of training the same wake-up word multiple times according to an embodiment of the present disclosure;

FIG. 3 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 2 of the present disclosure;

FIG. 4 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 3 of the present disclosure;

FIG. 5 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 4 of the present disclosure;

FIG. 6 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 5 of the present disclosure;

FIG. 7 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 6 of the present disclosure;

FIG. 8 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 7 of the present disclosure;

FIG. 9 is a schematic flowchart of wake-up word training according to a specific example of the present disclosure;

FIG. 10 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 8 of the present disclosure;

FIG. 11 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 9 of the present disclosure;

FIG. 12 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 10 of the present disclosure;

FIG. 13 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 11 of the present disclosure;

FIG. 14 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 12 of the present disclosure;

FIG. 15 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 13 of the present disclosure;

FIG. 16 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 14 of the present disclosure;

FIG. 17 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 15 of the present disclosure.

Detailed description

Hereinafter, embodiments of the present disclosure will be described in detail. Examples of the embodiments are shown in the accompanying drawings, wherein the same or similar reference numerals represent the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present disclosure, and should not be construed as limiting the present disclosure.

At present, speech recognition technology mainly includes cloud semantic recognition and local entry recognition. Cloud semantic recognition must rely on the network to be used, and the use scenarios are limited. Local entry recognition can only recognize pre-set voice control command entries, and cannot achieve free semantic understanding. To this end, the present disclosure proposes a wake-up word training method for home appliances, which can customize local wake-up words to meet personalized needs, does not need to rely on the network, has fast response speed, and is not limited by scenarios.

The following describes a wake word training method and apparatus for a home appliance according to an embodiment of the present disclosure, and a home appliance with reference to the accompanying drawings.

FIG. 1 is a flowchart of a wake-up word training method for a home appliance according to a first embodiment of the present disclosure.

As shown in FIG. 1, a wake-up word training method for a home appliance includes:

S101. Collect voice data samples of a wake-up word.

In the field of intelligent voice interaction, users can wake up a device that is in a dormant state through a wake word. The wake word is usually predefined by the manufacturer, and cannot be changed, which cannot meet the personalized needs of users. Therefore, in this embodiment, a custom wake-up word mode is set for the home appliance device, so that the user can train a custom wake-up word that meets his own needs.

In an embodiment of the present disclosure, before training a custom wake-up word, the user may first control the home appliance to enter the custom wake-up word mode, for example by pressing a physical button or issuing a voice command. After the home appliance enters the custom wake-up word mode, the user can be prompted to speak the wake-up word that they want to set. The user speaks the wake-up word within a predetermined period of time, such as 5 seconds, during which the home appliance collects voice data samples of the wake-up word in a preset audio format through a voice input device such as a microphone. For example, the sound signal is collected at a sampling frequency of 16 kHz with 16-bit samples. If the user does not say the wake-up word within 5 seconds, the user may be prompted to re-enter it.

S102. Extract feature information of a voice data sample.

Feature information can be extracted using MFCC (Mel-frequency cepstral coefficients) or another feature extraction algorithm. Speech can be divided by frequency into low-frequency, intermediate-frequency, and high-frequency ranges, so feature information can be extracted separately in each of the three ranges, and the feature information of each range is assigned a different weight. The feature information may include features such as sound intensity, or whether the voice is male or female.
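As a minimal illustrative sketch of per-range extraction with different weights, the following Python code computes weighted band energies for one frame of audio. It is a simplified stand-in for MFCC, not the feature extractor of the disclosure. The band edges, the weights, and the names `BANDS`, `WEIGHTS`, `dft_power`, and `band_features` are hypothetical; only the 16 kHz sampling frequency comes from the example above.

```python
import cmath
import math

SAMPLE_RATE = 16000  # Hz, from the example audio format in the text

# Hypothetical band edges (Hz) and weights; the source specifies neither.
BANDS = {"low": (0.0, 500.0), "mid": (500.0, 2000.0), "high": (2000.0, 8000.0)}
WEIGHTS = {"low": 0.5, "mid": 0.3, "high": 0.2}


def dft_power(frame):
    """Power spectrum of a real frame for bins 0..n/2, via a naive DFT."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(acc) ** 2)
    return power


def band_features(frame, sample_rate=SAMPLE_RATE):
    """Weighted energy per frequency range for one frame.

    A bin belongs to a band when lo <= bin frequency < hi, so the Nyquist
    bin itself is ignored for simplicity.
    """
    n = len(frame)
    power = dft_power(frame)
    features = {}
    for name, (lo, hi) in BANDS.items():
        energy = sum(p for k, p in enumerate(power) if lo <= k * sample_rate / n < hi)
        features[name] = WEIGHTS[name] * energy
    return features
```

For a 200 Hz tone, almost all of the weighted energy falls in the low range, which is the kind of per-range separation the paragraph describes.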

S103: Normalize the feature information, and determine whether the normalized feature information meets a preset condition. If yes, perform step S104, and if not, perform step S105.

Normalization limits the feature information to a certain range after processing by a suitable algorithm, so that the normalized feature information can be compared against preset conditions. For example, it can be determined whether the length feature is too short or too long relative to a preset length range, or whether the strength feature is too large or too small relative to a preset strength range.
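The check in step S103 might be sketched as follows. This is an illustration under stated assumptions: the scaling constants, the preset ranges, and the names `normalize_features` and `check_sample` are hypothetical, since the disclosure does not fix a normalization algorithm or concrete limits.

```python
# Hypothetical scaling constants and preset ranges; the source gives no values.
MAX_DURATION_S = 5.0       # capture window used in the example above
FULL_SCALE = 32768.0       # magnitude limit of a signed 16-bit sample
LENGTH_RANGE = (0.1, 0.6)  # allowed normalized length
STRENGTH_RANGE = (0.05, 0.9)  # allowed normalized strength


def normalize_features(duration_s, peak_amplitude):
    """Scale raw features so they can be compared with the preset ranges."""
    return {
        "length": duration_s / MAX_DURATION_S,
        "strength": peak_amplitude / FULL_SCALE,
    }


def check_sample(duration_s, peak_amplitude):
    """Return a list of problems; an empty list means the preset condition is met."""
    feats = normalize_features(duration_s, peak_amplitude)
    problems = []
    if feats["length"] < LENGTH_RANGE[0]:
        problems.append("length too short")
    elif feats["length"] > LENGTH_RANGE[1]:
        problems.append("length too long")
    if feats["strength"] < STRENGTH_RANGE[0]:
        problems.append("strength too small")
    elif feats["strength"] > STRENGTH_RANGE[1]:
        problems.append("strength too large")
    return problems
```

An empty result corresponds to proceeding to step S104; a non-empty result corresponds to re-collecting the sample in step S105.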

S104. Save the voice data samples of the wake-up words in a custom wake-up word library.

In one embodiment of the present disclosure, if the normalized feature information satisfies the preset condition, the training of the wake-up word is successful, and the voice data sample of the wake-up word is saved in the custom wake-up word library.

In addition, the wake-up word training method for home appliances may further include:

S105. Collect voice data samples of the wake word again.

In an embodiment of the present disclosure, if the normalized feature information does not satisfy a preset condition, the home appliance may remind the user to re-enter the wake-up word, and thereby re-collect voice data samples of the wake-up word.

To improve accuracy, the same wake-up word can be trained multiple times. As shown in FIG. 2, the same wake-up word spoken by the user is collected three times; feature information is extracted and normalized each time, and the normalized feature information is then filtered to detect the feature information that meets the conditions (that is, the successfully trained wake-up word). Finally, the trained wake-up word is stored in the local custom wake-up word library.

The wake-up word training method for a home appliance according to the embodiment of the present disclosure collects voice data samples of a wake-up word, extracts feature information of the voice data samples, normalizes the feature information, detects and determines that the normalized feature information satisfies a preset condition, and saves the voice data samples of the wake-up word in the custom wake-up word library, thereby realizing a user-defined wake-up word and satisfying the personalized needs of the user.

In another embodiment of the present disclosure, as shown in FIG. 3, the wake-up word training method for a home appliance may further include:

S106. After collecting the voice data samples of the wake-up word, perform denoising processing on the voice data samples.

Because environmental noise and other interfering sounds are picked up when the wake-up word spoken by the user is collected, the collected voice data samples are denoised first to avoid the effects of noise and improve accuracy.

In another embodiment of the present disclosure, as shown in FIG. 4, the wake-up word training method for a home appliance may further include:

S401. Receive input voice information.

Once the custom wake-up word has been trained successfully, it can be used to wake up the home appliance.

In one embodiment of the present disclosure, voice information input by a user may be received.

S402. Identify, based on the custom wake-up word library, whether the voice information is a custom wake-up word. If yes, perform step S403; if no, perform step S404.

The custom wake-up word library is then used to identify whether the voice information is a custom wake-up word.

Specifically, the feature information of the voice information can be extracted and normalized, and a dynamic time warping (DTW) algorithm is then used to compare it with the feature information of every wake-up word in the custom wake-up word library. For example, the similarity between the feature information B of the voice information and the feature information of each of the wake-up words A1, A2, and A3 in the custom wake-up word library is calculated.

Then, the comparison result with the highest similarity is obtained. If the comparison result with the highest similarity satisfies the set value, it is determined that the voice information is a custom wake-up word; if the comparison result with the highest similarity does not satisfy the set value, it is determined that the voice information is not a custom wake-up word.
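The comparison in step S402 can be illustrated with a short Python sketch that uses classic dynamic time warping, which is one common way to align two feature sequences of different lengths. The disclosure does not define how a distance is mapped to a similarity or what the set value is, so the mapping `1 / (1 + distance)`, the default `min_similarity`, and the names `dtw_distance` and `best_match` are assumptions made for illustration only.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def best_match(query, library, min_similarity=0.5):
    """Return (name, similarity) of the closest template.

    name is None when the highest similarity does not reach min_similarity
    (a hypothetical stand-in for the "set value" in the text).
    """
    best_name, best_sim = None, 0.0
    for name, template in library.items():
        sim = 1.0 / (1.0 + dtw_distance(query, template))  # assumed mapping
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_sim < min_similarity:
        return None, best_sim
    return best_name, best_sim
```

A `None` name corresponds to step S404 (prompt the user to speak again); any other name corresponds to step S403 (generate the wake-up instruction).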

S403: Generate a wake-up instruction, and wake up the home appliance according to the wake-up instruction.

In one embodiment of the present disclosure, if the voice information is a custom wake-up word, a wake-up instruction is generated, and the home appliance is woken up according to the wake-up instruction.

S404: Prompt to input voice information again.

In an embodiment of the present disclosure, if the voice information is not a custom wake-up word, the user is prompted to re-enter the voice information, thereby improving the success rate of waking up the home appliance.

In this embodiment, the local custom wake-up word library is used to identify whether the voice information input by the user is a custom wake-up word. Compared with conventional network-based recognition, the response is faster, recognition does not depend on the network, and the method can be used in more scenarios.

The disclosure also proposes another wake-up word training method for home appliances. FIG. 5 is a flowchart of a wake-up word training method for home appliances provided in Embodiment 4 of the present disclosure.

As shown in FIG. 5, a wake-up word training method for a home appliance includes:

S501. Control a home appliance to enter a custom wake-up word mode.

In an embodiment of the present disclosure, before training the user-defined wake-up word, the user may first control the home appliance to enter the user-defined wake-up word mode. The way of entering may be to trigger a physical button or issue a voice command.

S502: Collect the input wake-up word.

After the home appliance enters the custom wake-up word mode, the user can be prompted to speak the wake-up word that they want to set. The user speaks the wake-up word within a predetermined period of time, such as 5 seconds, during which the home appliance collects voice data samples of the wake-up word in a preset audio format through a voice input device such as a microphone. For example, the sound signal is collected at a sampling frequency of 16 kHz with 16-bit samples. If the user does not say the wake-up word within 5 seconds, the user may be prompted to re-enter it.

S503. Train the wake-up words and determine whether the training of wake-up words is successful. If yes, go to step S504; if no, go to step S502.

In one embodiment of the present disclosure, the feature information of the wake-up word may be extracted first and then compared with a preset standard to determine whether it meets the preset standard.

For example, in combination with user usage habits, a suitable maximum time length can be set for the wake-up word.

During the training process, the quality and consistency of the training corpus (the wake-up words) must be strictly ensured. Therefore, throughout the training process, whether the wake-up word meets the preset standard is judged in terms of the loudness of the voice, the length of the voice, the similarity of the voice, the complexity of the voice, and the environmental noise.

The collected wake-up word is a time-domain signal; it can be converted into a frequency-domain signal (that is, feature information is extracted) and then compared and analyzed.

Judgment of voice loudness: first, four thresholds are predefined according to experimental results: the maximum volume vh, the minimum volume vl, the maximum permitted count above the maximum volume vhm, and the maximum permitted count below the minimum volume vlm. Then, the count of the training corpus above the maximum volume, vhr, and the count below the minimum volume, vlr, are determined. If vhr > vhm, the voice is too loud; if vlr > vlm, the voice is too quiet. If vhr < vhm and vlr < vlm, the voice loudness meets the standard.
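The loudness judgment can be sketched in a few lines of Python. The variable names vh, vl, vhm, vlm, vhr, and vlr follow the text; treating the corpus as a list of per-frame volume levels, the function name `judge_loudness`, and accepting the boundary case where a count equals its limit are assumptions, since the source does not specify them.

```python
def judge_loudness(levels, vh, vl, vhm, vlm):
    """Classify a recording from its per-frame volume levels.

    vh / vl   : maximum and minimum volume thresholds
    vhm / vlm : how many frames may exceed vh / fall below vl
    """
    vhr = sum(1 for v in levels if v > vh)  # frames above the maximum volume
    vlr = sum(1 for v in levels if v < vl)  # frames below the minimum volume
    if vhr > vhm:
        return "too loud"
    if vlr > vlm:
        return "too quiet"
    return "ok"  # assumption: counts equal to the limits are also accepted
```

The four thresholds would be tuned experimentally, as the paragraph above states.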

Judgment of voice length: this is divided into two parts, an over-length judgment and a too-short judgment. Both judgments rely on the fixed length of the training corpus, combined with the signal-to-noise ratio of the front-end speech and the back-end speech. If the power of the back-end speech does not decrease relative to the power of the front-end speech, the voice is too long; if the power of the back-end speech decreases relative to the power of the front-end speech, the voice is too short.

Judgment of voice similarity: a similarity threshold is predefined according to experimental results. The cosine similarity is then used to judge the similarity between different utterances. If the similarity is greater than the threshold, the utterances are similar; otherwise, they are dissimilar.
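A minimal sketch of the cosine-based similarity judgment follows. It assumes that each utterance has already been reduced to a fixed-length feature vector; the default threshold of 0.8 and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_similar(a, b, threshold=0.8):
    """True when the similarity is greater than the (hypothetical) threshold."""
    return cosine_similarity(a, b) > threshold
```

Because the cosine measure ignores vector magnitude, two utterances that differ only in overall loudness still compare as similar, which leaves loudness to the separate judgment above.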

Judgment of speech complexity: The peak characteristics of the training corpus are used. If the number of peaks is greater than a predefined threshold, the training corpus is qualified; otherwise, it is unqualified.
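The complexity judgment may be illustrated by counting local maxima. The disclosure does not define what counts as a peak, so the sketch assumes that a peak is a sample of an energy envelope that is strictly greater than both of its neighbours. The names `count_peaks` and `is_complex_enough` are hypothetical.

```python
def count_peaks(envelope):
    """Count strict local maxima in a sequence (assumed definition of a peak)."""
    return sum(
        1
        for i in range(1, len(envelope) - 1)
        if envelope[i] > envelope[i - 1] and envelope[i] > envelope[i + 1]
    )


def is_complex_enough(envelope, peak_threshold):
    """Qualified when the number of peaks is greater than the threshold."""
    return count_peaks(envelope) > peak_threshold


# Usage: three peaks against a threshold of two is qualified.
qualified = is_complex_enough([0, 1, 0, 2, 0, 3, 0], peak_threshold=2)
```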

Judgment of environmental noise: A noise threshold is set according to environmental characteristics, and the training corpus is analyzed. If the noise of the training corpus is lower than the threshold, the environment is suitable; otherwise, the noise is too high.

S504. Perform the next wake-up word collection and training until the Nth wake-up word training succeeds.

Where N is a positive integer.

That is, if the first wake-up word training succeeds, the second wake-up word training can be performed. If the first wake-up word training does not succeed, the first training is performed again. In addition, if the number of consecutive unsuccessful trainings reaches three, a prompt message can be generated, for example "Wake word training failed, please enter other wake words for training", so as to remind the user to switch to a wake-up word that is easier to train successfully.
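The repeat-until-success flow with the failure prompt may be sketched as follows. The outcome of each attempt is supplied as a Boolean so that the control flow can be shown in isolation. Restarting the failure count after a prompt is an assumption, as is the function name `run_training`.

```python
FAILURE_PROMPT = "Wake word training failed, please enter other wake words for training"


def run_training(attempt_results, n_required, max_consecutive_failures=3):
    """Walk through attempt outcomes until n_required trainings have succeeded.

    attempt_results : iterable of booleans, one per attempt (True = success)
    Returns (completed, prompts), where prompts lists the messages generated.
    """
    successes = 0
    consecutive_failures = 0
    prompts = []
    for succeeded in attempt_results:
        if succeeded:
            successes += 1
            consecutive_failures = 0
            if successes == n_required:
                return True, prompts
        else:
            # A failed round is repeated; only the failure streak grows.
            consecutive_failures += 1
            if consecutive_failures == max_consecutive_failures:
                prompts.append(FAILURE_PROMPT)
                consecutive_failures = 0  # assumption: the count restarts after prompting
    return False, prompts


# Usage: one failed attempt is retried, and three successes complete the training.
completed, prompts = run_training([True, False, True, True], n_required=3)
```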

In an embodiment of the present disclosure, when performing the Mth training, as shown in FIG. 6, the following steps may be specifically included:

S601. Extract feature information of the wake-up word input at the Mth time.

S602. Detect and determine that the feature information of the wake-up word input at the Mth time meets a preset standard, and calculate the similarity between the feature information of the wake-up word input at the Mth time and the feature information of the wake-up words input in the previous M-1 times.

S603. Detect and determine that the similarity between the feature information of the wake-up word input at the Mth time and the feature information of the wake-up words input in the previous M-1 times is greater than a preset similarity, and determine that the wake-up word training is successful.

Assuming M = 5, when the fifth training is performed, the similarity between the feature information of the wake-up word input the fifth time and the feature information of the wake-up words input the first, second, third, and fourth times is calculated respectively. All four similarities need to be greater than a preset similarity, such as 85%, for the fifth wake-up word training to be determined successful.
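Steps S602 and S603 may be sketched as a check of the Mth input against every earlier input. Cosine similarity is used as the similarity measure, which is an assumption carried over from the similarity judgment described earlier, and 0.85 mirrors the 85% example. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mth_training_succeeds(new_features, previous_features, preset_similarity=0.85):
    """True when the new input is similar enough to each of the previous M-1 inputs.

    For M = 1 the list of previous inputs is empty and the check passes, so the
    first training depends only on the preset-standard checks.
    """
    return all(
        cosine_similarity(new_features, earlier) > preset_similarity
        for earlier in previous_features
    )


# Usage with M = 5: four earlier inputs, all close to the fifth.
earlier_inputs = [[1.0, 2.0, 3.0], [1.1, 2.0, 2.9], [0.9, 2.1, 3.0], [1.0, 1.9, 3.1]]
fifth_ok = mth_training_succeeds([1.0, 2.0, 3.0], earlier_inputs)
```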

It should be understood that each wake-up word training may record the sound multiple times. For example, during the first training, the sound signals input by the user three times can be collected, the feature information of the three wake-up words can be extracted, and their average value can be used as the feature information of the wake-up word of the first training, which improves the success rate of wake-up word training.
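Averaging the feature information of several recordings may be sketched as an element-wise mean. The sketch assumes that each recording has already been reduced to a feature vector of the same length. The function name `average_features` is hypothetical.

```python
def average_features(recordings):
    """Element-wise mean of several equal-length feature vectors."""
    if not recordings:
        raise ValueError("at least one recording is required")
    length = len(recordings[0])
    if any(len(r) != length for r in recordings):
        raise ValueError("all feature vectors must have the same length")
    count = len(recordings)
    # zip(*recordings) groups the i-th feature of every recording together.
    return [sum(column) / count for column in zip(*recordings)]


# Usage: three recordings of the same wake-up word in one training round.
averaged = average_features([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```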

In the wake-up word training method for a home appliance in the embodiment of the present disclosure, the home appliance is controlled to enter a custom wake-up word mode, the input wake-up word is collected and trained, and when it is detected and determined that the wake-up word training is successful, the next training is performed, until the Nth wake-up word training succeeds. User-defined wake-up words are thereby realized, the personalized needs of users are met, and the trained wake-up words have high accuracy.

In another embodiment of the present disclosure, as shown in FIG. 7, the wake-up word training method for a home appliance may further include:

S505. After the Nth wake-up word training succeeds, determine that the wake-up word is effective, and save the effective wake-up word locally.

The effective wake-up word is saved in the local custom wake-up word library.

In another embodiment of the present disclosure, as shown in FIG. 8, the wake-up word training method for a home appliance may further include:

S506. After determining that the wake-up word is effective, receive an input of the effective wake-up word.

S507. Wake up the home appliance according to the effective wake-up word.

After the wake-up word has been customized, the user can use the effective wake-up word to wake up the home appliance.

Specifically, the feature information of the input effective wake-up word can be extracted and then compared with the feature information stored in the custom wake-up word library. If the similarity between the two is higher than a preset value, a wake-up instruction may be generated, and the home appliance may be woken up according to the wake-up instruction. Otherwise, waking up the home appliance fails.

The following uses a specific example for illustration:

The speech recognition device is installed in the cooking device so that the cooking device has a speech recognition function. The factory setting of the cooking device is that the command word for starting custom wake-up word training is "change a name".

After the cooking device is powered on, the voice module of the speech recognition device is activated. The user says "change a name", and the cooking device enters the custom wake-up word training mode. At this point, the cooking device can play "Please say a new wake-up word after a beep". The user speaks a new wake-up word according to the prompt. The cooking device receives the new wake-up word and determines whether the new wake-up word is successfully trained. If the training is successful, the cooking device may give the voice prompt "Training is successful, please say the wake word again"; if the training is not successful, the cooking device may give the voice prompt "Sound **, please say the wake word again", where ** can be "too small", "too big", "too long", "too short", "too simple", "inconsistent with the last training result", and so on. The above training steps are repeated, and when the third training is successful, the cooking device may give the voice prompt "Training is completed, and the new wake-up word has taken effect", thereby ending the training. The wake-up word training process can be as shown in FIG. 9. By this method, the accuracy of the wake-up word is greatly improved, the recognition rate of the wake-up word is increased, and the misrecognition rate is reduced.

To achieve the above embodiments, the present disclosure also proposes a wake word training apparatus for a home appliance.

FIG. 10 is a structural block diagram of a wake-up word training apparatus for a home appliance provided in Embodiment 8 of the present disclosure.

As shown in FIG. 10, the wake-up word training device for a home appliance may include: a first acquisition module 110, an extraction module 120, a judging module 130, and a first saving module 140.

The first acquisition module 110 is configured to collect voice data samples of a wake-up word.

As a possible implementation manner, the first acquisition module 110 is further configured to re-collect voice data samples of the wake-up word when it is detected and determined that the normalized feature information does not satisfy the preset condition.

The extraction module 120 is configured to extract feature information of a voice data sample.

The judging module 130 is configured to normalize the feature information, and detect and determine whether the normalized feature information meets a preset condition.

The first saving module 140 is configured to save the voice data samples of the wake-up word into a custom wake-up word library.

In another embodiment of the present disclosure, as shown in FIG. 11, the wake-up word training apparatus for a home appliance may further include a pre-processing module 150.

The preprocessing module 150 is configured to perform denoising processing on the voice data samples after collecting the voice data samples of the wake word.

In another embodiment of the present disclosure, as shown in FIG. 12, the wake-up word training apparatus for a home appliance may further include a first control module 160.

The first control module 160 is configured to control the home appliance to enter a custom wake-up word mode before the voice data samples of the wake-up word are collected.

In still another embodiment of the present disclosure, as shown in FIG. 13, the wake-up word training device for a home appliance may further include a first receiving module 210, a recognition module 220, and a first wake-up module 230.

The first receiving module 210 is configured to receive input voice information.

The recognition module 220 is configured to detect and determine, based on the custom wake-up word library, that the voice information is a custom wake-up word.

As a possible implementation manner, the recognition module 220 is configured to: extract feature information of the voice information; normalize the feature information of the voice information, and use a dynamic time planning algorithm to compare the feature information of the voice information with the feature information of the wake-up words in the custom wake-up word library; obtain the comparison result with the highest similarity; and detect and determine that the comparison result with the highest similarity satisfies the set value, and determine that the voice information is a custom wake-up word.
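The comparison performed by the recognition module may be illustrated as follows. The sketch assumes that the "dynamic time planning algorithm" refers to dynamic time warping (DTW), that feature information is a sequence of per-frame vectors, and that the smallest DTW distance corresponds to the highest similarity. The names `dtw_distance`, `recognize`, and `max_distance` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature frames."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(features, library, max_distance):
    """Return the best-matching library entry, or None if no entry is close enough.

    library : dict mapping a wake-up word name to its stored frame sequence
    Assumption: "satisfies the set value" means distance <= max_distance.
    """
    best_name, best_distance = None, float("inf")
    for name, template in library.items():
        distance = dtw_distance(features, template)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= max_distance else None


# Usage: the input matches "hello" even though one frame is repeated in the template.
library = {"hello": [[0.0], [1.0], [1.0], [2.0]], "other": [[5.0], [5.0]]}
match = recognize([[0.0], [1.0], [2.0]], library, max_distance=0.5)
```

DTW tolerates differences in speaking rate, which is why the repeated frame in the stored template adds no cost.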

The first wake-up module 230 is configured to generate a wake-up instruction when it is detected and determined that the voice information is a custom wake-up word, and wake up the home appliance according to the wake-up instruction.

In a specific embodiment of the present disclosure, as shown in FIG. 14, the wake-up word training apparatus for a home appliance may further include a prompting module 240.

The prompting module 240 is configured to prompt the user to re-enter the voice information when it is detected and determined that the voice information is not a custom wake-up word.

It should be noted that the foregoing explanation of the wake word training method for home appliances is also applicable to the wake word training device for home appliances in the embodiments of the present disclosure. Details not disclosed in the embodiments of the present disclosure will not be repeated here.

The wake-up word training apparatus for a home appliance in the embodiment of the present disclosure collects voice data samples of the wake-up word, extracts feature information of the voice data samples, normalizes the feature information, detects and determines that the normalized feature information satisfies the preset condition, and saves the voice data samples of the wake-up word into the custom wake-up word library, thereby realizing user-defined wake-up words and meeting the personalized needs of users.

FIG. 15 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 13 of the present disclosure.

As shown in FIG. 15, the wake-up word training device for a home appliance may include a second control module 310, a second acquisition module 320, and a training module 330.

The second control module 310 is configured to control a home appliance to enter a custom wake-up word mode.

The second acquisition module 320 is configured to collect an input wake-up word.

The training module 330 is configured to train the wake-up word; when it is detected and determined that the wake-up word training is successful, the second acquisition module 320 performs the next collection and the training module 330 performs the next training, until the Nth wake-up word training succeeds.

As a possible implementation manner, the training module 330 is specifically configured to: extract feature information of the wake-up word; and detect and determine that the feature information of the wake-up word meets a preset standard, and determine that the wake-up word training is successful.

As a possible implementation manner, the training module 330 is further configured to: extract feature information of the wake-up word input at the Mth time; detect and determine that the feature information of the wake-up word input at the Mth time meets a preset standard, and calculate the similarity between the feature information of the wake-up word input at the Mth time and the feature information of the wake-up words input in the previous M-1 times; and detect and determine that the similarity between the feature information of the wake-up word input at the Mth time and the feature information of the wake-up words input in the previous M-1 times is greater than a preset similarity, and determine that the wake-up word training is successful.

In another embodiment of the present disclosure, as shown in FIG. 16, the wake-up word training apparatus for a home appliance may further include a second saving module 340.

The second saving module 340 is configured to determine that the wake-up word is effective after the Nth wake-up word training succeeds, and save the effective wake-up word locally.

In another embodiment of the present disclosure, as shown in FIG. 17, the wake-up word training device for a home appliance may further include a second receiving module 350 and a second wake-up module 360.

The second receiving module 350 is configured to receive an inputted effective wake-up word after determining that the wake-up word is valid.

The second wake-up module 360 is configured to wake up the home appliance according to the effective wake-up word.

It should be noted that the foregoing explanation of the wake word training method for home appliances is also applicable to the wake word training device for home appliances in the embodiment of the present disclosure. Details not disclosed in the embodiments of the present disclosure will not be repeated here.

The wake-up word training device for a home appliance in the embodiment of the present disclosure controls the home appliance to enter a custom wake-up word mode, collects the input wake-up word, trains the wake-up word, and, when it is detected and determined that the wake-up word training is successful, performs the next training, until the Nth wake-up word training succeeds. User-defined wake-up words are thereby realized, the personalized needs of users are met, and the trained wake-up words have high accuracy.

In order to implement the above embodiments, the present disclosure also proposes a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the wake-up word training method for a home appliance as proposed by the foregoing embodiments of the present disclosure.

In order to implement the above embodiments, the present disclosure also provides a home appliance including a processor, a memory, and a computer program stored on the memory and executable on the processor. The processor is configured to execute the wake-up word training method for a home appliance as proposed in the foregoing embodiments of the present disclosure.

In the description of this specification, a description made with reference to the terms "one embodiment", "some embodiments", "examples", "specific examples", or "some examples" and the like means that the specific features, structures, materials, or characteristics described in conjunction with the embodiment or example are included in at least one embodiment or example of the present disclosure. In this specification, the schematic expressions of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided there is no contradiction, those skilled in the art may combine different embodiments or examples, and features of the different embodiments or examples, described in this specification.

In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Therefore, the features defined as "first" and "second" may explicitly or implicitly include at least one of the features. In the description of the present disclosure, "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.

Any process or method description in a flowchart or otherwise described herein can be understood as representing a module, fragment, or portion of code that includes one or more executable instructions for implementing steps of a custom logic function or process. In addition, the scope of the preferred embodiments of the present disclosure includes additional implementations in which the functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order according to the functions involved, as should be understood by those skilled in the art to which the embodiments of the present disclosure belong.

The logic and/or steps represented in the flowchart or otherwise described herein, for example, a sequenced list of executable instructions that can be considered to implement a logical function, can be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from the instruction execution system, apparatus, or device). For the purposes of this specification, a "computer-readable medium" may be any device that can contain, store, communicate, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include the following: electrical connections (electronic devices) with one or more wirings, portable computer disk cartridges (magnetic devices), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example, by optically scanning the paper or other medium, followed by editing, interpretation, or other suitable processing if necessary, and then stored in computer memory.

It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits (ASICs) with suitable combinational logic gates, programmable gate arrays (PGAs), field programmable gate arrays (FPGAs), etc.

A person of ordinary skill in the art can understand that all or part of the steps carried by the methods in the foregoing embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer-readable storage medium. When executed, the program performs one or a combination of the steps of the method embodiments.

In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing module, or each unit may exist separately physically, or two or more units may be integrated into one module. The above integrated modules can be implemented in the form of hardware or software functional modules. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.

The aforementioned storage medium may be a read-only memory, a magnetic disk, or an optical disk. Although the embodiments of the present disclosure have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limitations on the present disclosure. Those skilled in the art can make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present disclosure.

Claims

A wake-up word training method for a home appliance, characterized in that it includes:

collecting voice data samples of a wake-up word;

extracting feature information of the voice data samples;

normalizing the feature information, and detecting and determining that the normalized feature information meets a preset condition; and

saving the voice data samples of the wake-up word in a custom wake-up word library.
The method according to claim 1, further comprising:

detecting and determining that the normalized feature information does not satisfy the preset condition, and re-collecting voice data samples of the wake-up word.
The method according to claim 1 or 2, further comprising:

after collecting the voice data samples of the wake-up word, performing denoising processing on the voice data samples.
The method according to any one of claims 1-3, before collecting the voice data samples of the wake-up word, further comprising:

controlling the home appliance to enter a custom wake-up word mode.
The method according to any one of claims 1-4, further comprising:

receiving input voice information;

detecting and determining, based on the custom wake-up word library, that the voice information is a custom wake-up word; and

generating a wake-up instruction, and waking up the home appliance according to the wake-up instruction.
The method according to claim 5, wherein detecting and determining, based on the custom wake-up word library, that the voice information is a custom wake-up word comprises:

extracting feature information of the voice information;

normalizing the feature information of the voice information, and using a dynamic time planning algorithm to compare the feature information of the voice information with the feature information of the wake-up words in the custom wake-up word library;

obtaining the comparison result with the highest similarity; and

detecting and determining that the comparison result with the highest similarity satisfies the set value, and determining that the voice information is a custom wake-up word.
The method according to claim 5 or 6, further comprising:

detecting and determining that the voice information is not a custom wake-up word, and prompting to re-enter the voice information.
A wake-up word training method for a home appliance, characterized in that it includes:

controlling a home appliance to enter a custom wake-up word mode;

collecting an input wake-up word;

training the wake-up word, and detecting and determining that the wake-up word training is successful; and

performing the next wake-up word collection and training until the Nth wake-up word training succeeds, where N is a positive integer.
The method according to claim 8, further comprising:

after the Nth wake-up word training succeeds, determining that the wake-up word is effective, and saving the effective wake-up word locally.
The method according to claim 9, further comprising:

after determining that the wake-up word is effective, receiving an input of the effective wake-up word; and

waking up the home appliance according to the effective wake-up word.
The method according to any one of claims 8-10, wherein training the wake-up word, and detecting and determining that the wake-up word training is successful, comprises:

extracting feature information of the wake-up word; and

detecting and determining that the feature information of the wake-up word meets a preset standard, and determining that the wake-up word training is successful.
The method according to any one of claims 8-11, wherein performing the next wake-up word collection and training until the Nth wake-up word training succeeds comprises:

extracting feature information of the wake-up word input at the Mth time;

detecting and determining that the feature information of the wake-up word input at the Mth time meets the preset standard, and calculating the similarity between the feature information of the wake-up word input at the Mth time and the feature information of the wake-up words input in the previous M-1 times; and

detecting and determining that the similarity between the feature information of the wake-up word input at the Mth time and the feature information of the wake-up words input in the previous M-1 times is greater than a preset similarity, and determining that the wake-up word training is successful.
A wake-up word training device for a home appliance, characterized in that it includes:

a first acquisition module, configured to collect voice data samples of a wake-up word;

an extraction module, configured to extract feature information of the voice data samples;

a judging module, configured to normalize the feature information, and detect and determine that the normalized feature information meets a preset condition; and

a first saving module, configured to save the voice data samples of the wake-up word in a custom wake-up word library.
The device according to claim 13, wherein the first acquisition module is further configured to:

detect and determine that the normalized feature information does not satisfy the preset condition, and re-collect voice data samples of the wake-up word.
The device according to claim 13 or 14, further comprising:

a preprocessing module, configured to perform denoising processing on the voice data samples after the voice data samples of the wake-up word are collected.
The device according to any one of claims 13-15, wherein the device further comprises:

a first control module, configured to control the home appliance to enter a custom wake-up word mode before the voice data samples of the wake-up word are collected.
The device according to any one of claims 13-16, wherein the device further comprises:

a first receiving module, configured to receive input voice information;

a recognition module, configured to detect and determine, based on the custom wake-up word library, that the voice information is a custom wake-up word; and

a first wake-up module, configured to generate a wake-up instruction and wake up the home appliance according to the wake-up instruction.
The device according to claim 17, wherein the recognition module is configured to:

extract feature information of the voice information;

normalize the feature information of the voice information, and use a dynamic time planning algorithm to compare the feature information of the voice information with the feature information of the wake-up words in the custom wake-up word library;

obtain the comparison result with the highest similarity; and

detect and determine that the comparison result with the highest similarity satisfies the set value, and determine that the voice information is a custom wake-up word.
The device according to claim 17 or 18, wherein the device further comprises:

a prompting module, configured to detect and determine that the voice information is not a custom wake-up word, and prompt to re-enter the voice information.
A wake-up word training device for a home appliance, characterized in that it includes:

a second control module, configured to control a home appliance to enter a custom wake-up word mode;

a second acquisition module, configured to collect an input wake-up word; and

a training module, configured to train the wake-up word, and detect and determine that the wake-up word training is successful, whereupon the second acquisition module performs the next collection and the training module performs the next training until the Nth wake-up word training succeeds, where N is a positive integer.
The device according to claim 20, further comprising:

a second saving module, configured to determine that the wake-up word is effective after the Nth wake-up word training succeeds, and save the effective wake-up word locally.
The device according to claim 21, wherein the device further comprises:

a second receiving module, configured to receive an input of the effective wake-up word after it is determined that the wake-up word is effective; and

a second wake-up module, configured to wake up the home appliance according to the effective wake-up word.
The device according to any one of claims 20-22, wherein the training module is configured to:

extract feature information of the wake-up word; and

detect and determine that the feature information of the wake-up word meets a preset standard, and determine that the wake-up word training is successful.
The device according to any one of claims 20-23, wherein the training module is further configured to:

extract feature information of the wake-up word input at the Mth time;

detect and determine that the feature information of the wake-up word input at the Mth time meets the preset standard, and calculate the similarity between the feature information of the wake-up word input at the Mth time and the feature information of the wake-up words input in the previous M-1 times; and

detect and determine that the similarity between the feature information of the wake-up word input at the Mth time and the feature information of the wake-up words input in the previous M-1 times is greater than a preset similarity, and determine that the wake-up word training is successful.
A non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the wake-up word training method for a home appliance according to any one of claims 1-7, or implements the wake-up word training method for a home appliance according to any one of claims 8-12.
A home appliance including a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the processor is configured to execute the wake-up word training method for a home appliance according to any one of claims 1-7, or the wake-up word training method for a home appliance according to any one of claims 8-12.