WO2019242312A1 - Wakeup word training method and device of household appliance, and household appliance - Google Patents

Wakeup word training method and device of household appliance, and household appliance

Info

Publication number
WO2019242312A1
Authority
WO
WIPO (PCT)
Prior art keywords
word
wake
training
feature information
home appliance
Prior art date
Application number
PCT/CN2019/074317
Other languages
French (fr)
Chinese (zh)
Inventor
孙裕文
谭博钊
Original Assignee
广东美的厨房电器制造有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201810628693.7A external-priority patent/CN109036393A/en
Priority claimed from CN201810885079.9A external-priority patent/CN109166571B/en
Application filed by 广东美的厨房电器制造有限公司 filed Critical 广东美的厨房电器制造有限公司
Publication of WO2019242312A1 publication Critical patent/WO2019242312A1/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]

Definitions

  • The present disclosure relates to the technical field of household appliances, and in particular to a wake word training method and device for a household appliance, and to a household appliance.
  • Voice recognition technology falls mainly into two categories.
  • One is cloud-based semantic recognition.
  • Voice signals are transmitted over the network to a server for semantic analysis and understanding, and the results are returned over the network.
  • Typical examples include Apple's Siri voice assistant, Amazon's Echo speaker, and Microsoft Xiaobing.
  • This approach requires a network connection, so its usage scenarios are limited.
  • The other is local entry recognition, which does not require a network. It can process voice control command words in real time on an embedded high-performance processor. However, it can recognize only preset voice control command entries, and it must recognize a complete command entry before responding. It cannot achieve free semantic understanding, and the user experience is poor.
  • the present disclosure provides a wake word training method and device for a home appliance, and a home appliance to implement a user-defined wake word to meet a user's personalized needs.
  • An embodiment of the first aspect of the present disclosure provides a wakeup word training method for a home appliance, including:
  • the speech data sample of the wake-up word is saved in a custom wake-up word library.
  • the method further includes:
  • the method further includes:
  • After collecting the voice data samples of the wake word, the voice data samples are denoised.
  • the method further includes:
  • Before collecting voice data samples of the wake word, control the home appliance to enter a custom wake word mode.
  • the method further includes:
  • detecting and determining that the voice information is a custom wake-up word based on the custom wake-up thesaurus includes:
  • the method further includes:
  • The wake-up word training method for a home appliance according to the embodiment of the present disclosure collects voice data samples of a wake-up word, extracts feature information of the voice data samples, and normalizes the feature information. When it is detected and determined that the normalized feature information satisfies a preset condition, the voice data samples of the wake-up word are saved in a custom wake-up word library, thereby realizing a user-defined wake-up word and satisfying the personalized needs of the user.
  • An embodiment of the second aspect of the present disclosure provides another wake word training method for a home appliance, including:
  • N is a positive integer.
  • the method further includes:
  • After the N-th training of the wake word succeeds, it is determined that the wake word is valid, and the valid wake word is saved locally.
  • the method further includes:
  • training the awake word, and detecting and determining that the training awake word is successful includes:
  • the next wake-word acquisition and training is performed until the N-th training wake-word is successful, including:
  • The wake word training method for a home appliance according to the embodiment of the present disclosure controls the home appliance to enter a custom wake word mode, collects the entered wake word, and trains the wake word. Each time it is detected and determined that a round of training succeeds, the next collection and training are performed, until the N-th training of the wake word succeeds. The user can thus customize the wake word to meet personalized needs, and the trained wake word is highly accurate.
  • An embodiment of the third aspect of the present disclosure provides a wake word training apparatus for a home appliance, including:
  • a first acquisition module configured to collect speech data samples of wake words
  • An extraction module configured to extract feature information of the voice data samples
  • the first saving module is configured to save the speech data samples of the wake-up word to a custom wake-up word bank.
  • the first acquisition module is further configured to:
  • the apparatus further includes:
  • the preprocessing module is configured to perform denoising processing on the voice data samples after collecting the voice data samples of the wake word.
  • the apparatus further includes:
  • the first control module is configured to control the home appliance to enter a custom wake-up word mode before collecting voice data samples of the wake-up word.
  • the apparatus further includes:
  • a first receiving module configured to receive input voice information
  • a recognition module configured to detect and determine that the voice information is a custom wake-up word based on the custom wake-up thesaurus;
  • the first wake-up module is configured to generate a wake-up instruction and wake up the home appliance according to the wake-up instruction.
  • the identification module is configured to:
  • the apparatus further includes:
  • a prompting module is configured to detect and determine that the voice information is not a custom wake-up word, and prompt to re-enter the voice information.
  • The wake word training device for a home appliance in the embodiment of the present disclosure collects voice data samples of a wake-up word, extracts feature information of the voice data samples, and normalizes the feature information. When it is detected and determined that the normalized feature information satisfies a preset condition, the voice data samples of the wake-up word are saved in a custom wake-up word library, thereby realizing a user-defined wake-up word and satisfying the personalized needs of the user.
  • An embodiment of the fourth aspect of the present disclosure provides another wake word training device for a home appliance, including:
  • a second control module for controlling a home appliance to enter a custom wake-up word mode
  • a second acquisition module configured to collect an input wake-up word
  • the apparatus further includes:
  • the second saving module is configured to determine that the wake-up word is valid after the N-th training of the wake-up word succeeds, and to save the valid wake-up word locally.
  • the apparatus further includes:
  • a second receiving module configured to receive an input effective wake-up word after determining that the wake-up word is valid
  • the second wake-up module is configured to wake up the home appliance according to the valid wake-up word.
  • the training module is configured to:
  • the training module is further configured to:
  • The wake word training device for a home appliance in the embodiment of the present disclosure controls a home appliance to enter a custom wake word mode, collects the entered wake word, and trains the wake word. Each time it is detected and determined that a round of training succeeds, the next collection and training are performed, until the N-th training of the wake word succeeds. The user can thus customize the wake word to meet personalized needs, and the trained wake word is highly accurate.
  • An embodiment of the fifth aspect of the present disclosure provides a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the wake-up word training method for a home appliance as described in the embodiment of the first aspect.
  • An embodiment of the sixth aspect of the present disclosure provides a home appliance including a processor, a memory, and a computer program stored on the memory and executable on the processor, where the processor is configured to execute the wake-up word training method of a home appliance according to the embodiment of the first aspect, or the wake-up word training method of a home appliance according to the embodiment of the second aspect.
  • FIG. 1 is a flowchart of a wake-up word training method for a home appliance according to a first embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart of training the same wake word multiple times according to an embodiment of the present disclosure
  • FIG. 3 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 2 of the present disclosure
  • FIG. 4 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 3 of the present disclosure
  • FIG. 5 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 4 of the present disclosure
  • FIG. 6 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 5 of the present disclosure
  • FIG. 7 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 6 of the present disclosure.
  • FIG. 8 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 7 of the present disclosure.
  • FIG. 9 is a schematic flowchart of wake word training according to a specific example of the present disclosure.
  • FIG. 10 is a structural block diagram of a wake-up word training apparatus for a home appliance provided in Embodiment 8 of the present disclosure.
  • FIG. 11 is a structural block diagram of a wake-up word training apparatus for a home appliance provided in Embodiment 9 of the present disclosure.
  • FIG. 12 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 10 of the present disclosure.
  • FIG. 13 is a structural block diagram of a wake-up word training apparatus for a home appliance provided in Embodiment 11 of the present disclosure
  • FIG. 14 is a structural block diagram of a wake-up word training apparatus for a home appliance provided in Embodiment 12 of the present disclosure.
  • FIG. 15 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 13 of the present disclosure.
  • FIG. 16 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 14 of the present disclosure.
  • FIG. 17 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 15 of the present disclosure.
  • speech recognition technology mainly includes cloud semantic recognition and local entry recognition.
  • Cloud semantic recognition must rely on the network to be used, and the use scenarios are limited.
  • Local entry recognition can only recognize pre-set voice control command entries, and cannot achieve free semantic understanding.
  • the present disclosure proposes a wake-up word training method for home appliances, which can customize local wake-up words to meet personalized needs, does not need to rely on the network, has fast response speed, and is not limited by scenarios.
  • FIG. 1 is a flowchart of a wake-up word training method for a home appliance according to a first embodiment of the present disclosure.
  • a wake-up word training method for a home appliance includes:
  • a custom wake-up word mode is set for the home appliance device, so that the user can train a custom wake-up word that meets his own needs.
  • the user may first control the home appliance to enter the user-defined wake-up word mode.
  • the way of entering may be to trigger a physical button or issue a voice command.
  • The user can be prompted to say the wake-up word that they want to set.
  • The home appliance can collect voice data samples of the wake word in a preset audio format through a voice input device such as a microphone. For example, the sound signal is collected at a sampling frequency of 16 kHz with a sample width of 16 bits. If the user does not say the wake word within 5 seconds, the user may be reminded to re-enter it.
  • Feature information can be extracted using MFCC (Mel-frequency cepstral coefficients) or other feature extraction algorithms.
  • speech can be divided into low frequency, intermediate frequency and high frequency. Therefore, when extracting features, feature information can be extracted separately in the low frequency range, the intermediate frequency range, and the high frequency range.
  • The weights assigned to the feature information of the low frequency range, the intermediate frequency range and the high frequency range differ from one another.
  • the feature information may include features such as sound intensity, male voice, or female voice.
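  • The per-band extraction with different weights described above can be sketched roughly as follows. This is a minimal illustration, not the method of the disclosure: the band edges, the weights, the use of a plain DFT energy per band (rather than MFCC), and the names `band_energies` and `weighted_features` are all assumptions made for the example.

```python
import cmath
import math


def band_energies(frame, sample_rate, bands):
    """Return the spectral energy of `frame` inside each (low_hz, high_hz) band."""
    n = len(frame)
    energies = [0.0] * len(bands)
    # Naive DFT over the non-negative frequencies; adequate for a short demo frame.
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        freq = k * sample_rate / n
        for i, (low, high) in enumerate(bands):
            if low <= freq < high:
                energies[i] += abs(coeff) ** 2
    return energies


def weighted_features(frame, sample_rate=16000):
    """Weight the low, intermediate and high band energies differently."""
    # Band edges and weights are illustrative assumptions, not values from the source.
    bands = [(0, 1000), (1000, 4000), (4000, 8001)]
    weights = [0.5, 0.3, 0.2]
    energies = band_energies(frame, sample_rate, bands)
    return [w * e for w, e in zip(weights, energies)]


# A 500 Hz tone sampled at 16 kHz: almost all energy falls in the low band.
tone = [math.sin(2 * math.pi * 500 * t / 16000) for t in range(64)]
features = weighted_features(tone)
```

  • A real implementation would more likely compute MFCCs per frame with an FFT; the naive DFT is used here only to keep the sketch free of dependencies.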
  • S103 Normalize the feature information, and determine whether the normalized feature information meets a preset condition. If yes, perform step S104; if not, perform step S105.
  • Normalization processes the feature information (by a certain algorithm) so that it is limited to a certain range, allowing the normalized feature information to be compared against preset conditions. For example: whether the length feature is too short or too long compared to the preset length range, or whether the intensity feature is too large or too small compared to the preset intensity range, and so on.
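  • As a hedged illustration of this step, the sketch below uses min-max scaling as the normalization and a per-feature accepted range as the preset condition. The source does not name the algorithm or the ranges, so the scaling method, the feature names `length_s` and `intensity_db`, and every numeric value are assumptions.

```python
def min_max_normalize(value, lower, upper):
    """Scale `value` into [0, 1] relative to [lower, upper], clamping outliers."""
    scaled = (value - lower) / (upper - lower)
    return max(0.0, min(1.0, scaled))


def meets_preset_condition(features, limits, accepted):
    """Normalize each feature and compare it with its accepted range.

    Returns (ok, reasons), where `reasons` lists every feature out of range.
    """
    reasons = []
    for name, value in features.items():
        lower, upper = limits[name]
        norm = min_max_normalize(value, lower, upper)
        low_ok, high_ok = accepted[name]
        if norm < low_ok:
            reasons.append(f"{name} too small")
        elif norm > high_ok:
            reasons.append(f"{name} too large")
    return (not reasons), reasons


# Hypothetical physical limits and accepted normalized ranges.
LIMITS = {"length_s": (0.0, 5.0), "intensity_db": (0.0, 100.0)}
ACCEPTED = {"length_s": (0.1, 0.6), "intensity_db": (0.4, 0.8)}

good = meets_preset_condition({"length_s": 1.2, "intensity_db": 62.0}, LIMITS, ACCEPTED)
short = meets_preset_condition({"length_s": 0.3, "intensity_db": 62.0}, LIMITS, ACCEPTED)
```

  • Returning the reasons alongside the verdict makes it straightforward to tell the user why a sample was rejected before re-collecting it.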
  • the training of the wake-up word is successful, that is, the voice data sample of the wake-up word is saved in the custom wake-up word library.
  • the wake-up word training method for home appliances may further include:
  • the home appliance may remind the user to re-enter the wake-up word, and thereby re-collect voice data samples of the wake-up word.
  • The same wake-up word can be trained multiple times.
  • The same wake-up word spoken by the user is collected three times; feature information is extracted and normalized, and the normalized feature information is then filtered to find the feature information that meets the conditions (that is, a successfully trained wake-up word).
  • The trained wake-up words are stored in a local custom wake-up word library.
  • The wake-up word training method for a home appliance according to the embodiment of the present disclosure collects voice data samples of a wake-up word, extracts feature information of the voice data samples, and normalizes the feature information. When it is detected and determined that the normalized feature information satisfies the preset condition, the voice data samples of the wake-up word are saved in the custom wake-up word library, thereby realizing user-defined wake-up words and meeting the personalized needs of users.
  • the wake-up word training method for a home appliance may further include:
  • the collected voice data samples need to be denoised first to avoid noise effects and improve accuracy.
  • the wake-up word training method for a home appliance may further include:
  • S401 Receive input voice information.
  • voice information input by a user may be received.
  • S402 Identify whether the voice information is a custom wake-up word based on the custom wake-up word library. If yes, perform step S403; if no, perform step S404.
  • whether the voice information is a custom wake-up word can be identified based on the custom wake-up thesaurus.
  • The feature information of the voice information can be extracted and normalized, and then compared with the feature information of all the wake-up words in the custom wake-up word library using a dynamic time warping (DTW) algorithm. For example, the similarity between the feature information B of the voice information and each of the feature information of the wake-up words A1, A2 and A3 in the custom wake-up word library is calculated.
  • The comparison result with the highest similarity is then obtained. If it satisfies the set value, it is determined that the voice information is a custom wake-up word; otherwise, it is determined that the voice information is not a custom wake-up word.
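  • The comparison against every stored wake-up word can be sketched with a textbook dynamic time warping distance, where the lowest distance plays the role of the highest similarity. The function names, the Euclidean frame distance, the toy one-dimensional features and the `max_distance` threshold are assumptions for illustration only.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def match_wake_word(query, library, max_distance):
    """Return the name of the closest stored wake word, or None if none is close enough."""
    best_name, best_dist = None, float("inf")
    for name, template in library.items():
        dist = dtw_distance(query, template)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


# Toy library with one-dimensional features per frame.
library = {
    "A1": [(0.0,), (1.0,), (2.0,), (1.0,)],
    "A2": [(5.0,), (5.0,), (6.0,)],
}
# B is A1 spoken slightly slower (one frame repeated).
B = [(0.0,), (1.0,), (1.0,), (2.0,), (1.0,)]
result = match_wake_word(B, library, max_distance=1.0)
```

  • DTW suits this comparison because two utterances of the same word rarely have the same duration; the warping path absorbs the difference in speaking rate.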
  • S403 Generate a wake-up instruction, and wake up the home appliance according to the wake-up instruction.
  • a wake-up instruction is generated, and the home appliance is woken up according to the wake-up instruction.
  • If the voice information is not a custom wake-up word, the user is prompted to re-enter the voice information, which improves the success rate of waking up the home appliance.
  • The local custom wake-up word library is used to identify whether the voice information input by the user is a custom wake-up word. Compared with traditional network-based recognition, the response is faster, recognition does not depend on the network, and more usage scenarios are supported.
  • FIG. 5 is a flowchart of a wake-up word training method for home appliances provided in Embodiment 4 of the present disclosure.
  • a wake-up word training method for a home appliance includes:
  • a custom wake-up word mode is set for the home appliance device, so that the user can train a custom wake-up word that meets his own needs.
  • Before training the user-defined wake-up word, the user may first control the home appliance to enter the user-defined wake-up word mode.
  • the way of entering may be to trigger a physical button or issue a voice command.
  • S502 Collect the input wake-up word.
  • The user can be prompted to say the wake-up word that they want to set.
  • The user then speaks the wake word.
  • The home appliance can collect voice data samples of the wake word in a preset audio format through a voice input device such as a microphone.
  • For example, the sound signal is collected at a sampling frequency of 16 kHz with a sample width of 16 bits. If the user does not say the wake word within 5 seconds, the user may be reminded to re-enter it.
  • S503 Train the wake-up word and determine whether the training succeeded. If yes, go to step S504; if no, return to step S502.
  • The feature information of the wake-up word may be extracted first and then compared with a preset standard to determine whether it meets that standard.
  • a suitable maximum time length can be set for the wake-up word.
  • The quality and consistency of the training corpus must be strictly ensured. Therefore, throughout the training process, whether the wake word meets the preset standard is judged from the loudness of the voice, the length of the voice, the similarity of the voice, the complexity of the voice, and the environmental noise.
  • The collected wake-up word is a time-domain signal. It can be converted into a frequency-domain signal (that is, feature information is extracted) and then compared and analyzed.
  • Judgment of voice sound level: first, four predefined thresholds are set according to experimental results: the maximum volume vh, the minimum volume vl, the maximum allowed count above the maximum volume vhm, and the maximum allowed count below the minimum volume vlm. Then the count vhr of the training corpus above the maximum volume and the count vlr below the minimum volume are obtained. If vhr > vhm, the sound is too loud; if vlr > vlm, the sound is too quiet. If vhr ≤ vhm and vlr ≤ vlm, the voice sound level meets the standard.
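  • The threshold logic above can be sketched as follows. The symbols vh, vl, vhm, vlm, vhr and vlr come from the text; treating the counts as numbers of per-frame volume levels, the function name `check_volume` and all numeric values are assumptions.

```python
def check_volume(frame_levels, vh, vl, vhm, vlm):
    """Classify a recording as 'too loud', 'too quiet' or 'ok'.

    frame_levels: per-frame volume levels (assumed unit: normalized amplitude).
    vh, vl:       maximum and minimum volume thresholds.
    vhm, vlm:     maximum allowed counts above vh and below vl.
    """
    vhr = sum(1 for level in frame_levels if level > vh)  # frames above the maximum
    vlr = sum(1 for level in frame_levels if level < vl)  # frames below the minimum
    if vhr > vhm:
        return "too loud"
    if vlr > vlm:
        return "too quiet"
    return "ok"


normal = check_volume([0.2, 0.5, 0.6, 0.4], vh=0.9, vl=0.1, vhm=1, vlm=1)
loud = check_volume([0.95, 0.97, 0.5], vh=0.9, vl=0.1, vhm=1, vlm=1)
quiet = check_volume([0.01, 0.02, 0.05, 0.5], vh=0.9, vl=0.1, vhm=1, vlm=1)
```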
  • Judgment of speech length: this is divided into two parts, an over-long judgment and a too-short judgment. Both judgments rely on the fixed length of the training corpus, combined with the signal-to-noise ratio of the front-end speech and the back-end speech. If the power of the back-end speech does not decrease relative to the power of the front-end speech, the speech is too long; if the power of the back-end speech decreases relative to the power of the front-end speech, the speech is too short.
  • Judgment of speech similarity: a similarity threshold is predefined, and the cosine distance is then used to judge the similarity between different utterances. If the similarity is greater than the threshold, the utterances are similar; otherwise, they are dissimilar.
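  • A minimal sketch of this check, assuming each utterance has already been reduced to a fixed-length feature vector; the function names and the 0.9 threshold are illustrative and not taken from the source.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def is_similar(u, v, threshold=0.9):
    """Two utterances count as similar when their similarity exceeds the threshold."""
    return cosine_similarity(u, v) > threshold


same_direction = is_similar([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = is_similar([1.0, 0.0], [0.0, 1.0])
```

  • Cosine similarity ignores overall scale, so a louder repetition of the same word still compares as similar; loudness is handled separately by the sound level judgment.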
  • Judgment of speech complexity: the peak characteristics of the training corpus are used. If the number of peaks is greater than a predefined threshold, the training corpus is qualified; otherwise, it is unqualified.
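  • One possible reading of the peak criterion is to count local maxima of an energy envelope, as sketched below. What exactly counts as a peak is not specified in the source, so the local-maximum definition, the optional `min_height` filter and the function names are assumptions.

```python
def count_peaks(envelope, min_height=0.0):
    """Count strict local maxima of `envelope` that reach at least `min_height`."""
    peaks = 0
    for prev, cur, nxt in zip(envelope, envelope[1:], envelope[2:]):
        if cur > prev and cur > nxt and cur >= min_height:
            peaks += 1
    return peaks


def is_complex_enough(envelope, min_peaks, min_height=0.0):
    """The corpus qualifies when the peak count is greater than the threshold."""
    return count_peaks(envelope, min_height) > min_peaks


three_syllables = [0, 1, 0, 2, 0, 3, 0]
one_syllable = [0, 1, 0]
```

  • Under this reading, a wake word with several syllables produces several envelope peaks, while a single short sound produces too few and would be rejected as "too simple".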
  • Judgment of environmental noise: a noise threshold is set according to environmental characteristics, and the training corpus is analyzed. If the noise of the training corpus is lower than the threshold, the environment is suitable; otherwise, the noise is too loud.
  • N is a positive integer.
  • If the first training of the wake word succeeds, the second training of the wake word can be performed. If the first training is unsuccessful, the first training is performed again. In addition, if the number of consecutive unsuccessful trainings reaches 3, a prompt message can be generated.
  • The message can be, for example, "Wake word training failed, please enter other wake words for training", so as to remind the user to switch to a wake word that is easier to train successfully.
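  • The collect-and-train loop with its retry rule can be sketched as below. The callbacks `collect`, `train_once` and `notify` are hypothetical stand-ins for the recording, training and prompting steps, and stopping the session after three consecutive failures is an assumption; the source says only that a prompt message can be generated.

```python
def train_wake_word(collect, train_once, n_rounds=3, max_failures=3, notify=print):
    """Repeat collection and training until `n_rounds` rounds have succeeded.

    collect():                    returns one recorded sample.
    train_once(sample, accepted): returns True when this round succeeds.
    Returns the accepted samples, or None after `max_failures` consecutive failures.
    """
    accepted = []
    failures = 0
    while len(accepted) < n_rounds:
        sample = collect()
        if train_once(sample, accepted):
            accepted.append(sample)
            failures = 0  # only consecutive failures count
        else:
            failures += 1
            if failures >= max_failures:
                notify("Wake word training failed, please enter other wake words for training")
                return None
    return accepted


# Simulated session: one bad recording between good ones.
recordings = iter(["a", "bad", "a", "a"])
session = train_wake_word(
    collect=lambda: next(recordings),
    train_once=lambda sample, accepted: sample == "a",
    notify=lambda message: None,
)
```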
  • S602 Detect and determine that the feature information of the wake word input the M-th time meets a preset standard, and calculate the similarity between the feature information of the wake word input the M-th time and the feature information of the wake words input the previous M-1 times.
  • S603 Detect and determine that this similarity is greater than a preset similarity, and determine that the training of the wake word is successful.
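  • Steps S602 and S603 can be sketched as a single check. Using cosine similarity here and requiring the M-th input to be similar to each of the previous M-1 inputs individually are assumptions; the source does not say how the M-1 comparisons are combined. The name `mth_training_succeeds` and the `meets_standard` callback are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def mth_training_succeeds(new_features, previous_features, meets_standard, min_similarity):
    """Check the M-th input against the preset standard and the previous M-1 inputs."""
    # S602: the new input must first meet the preset standard.
    if not meets_standard(new_features):
        return False
    # S603: it must also be similar enough to every earlier input.
    # For M = 1 there are no earlier inputs, so this part passes trivially.
    return all(
        cosine_similarity(new_features, old) > min_similarity
        for old in previous_features
    )


always_ok = lambda features: True
first = mth_training_succeeds([1.0, 1.0], [], always_ok, 0.9)
consistent = mth_training_succeeds([1.0, 1.0], [[2.0, 2.0], [1.0, 1.1]], always_ok, 0.9)
different = mth_training_succeeds([1.0, 0.0], [[0.0, 1.0]], always_ok, 0.9)
```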
  • Each training of the wake word may record the sound multiple times.
  • The sound signals input by the user three times can be collected, the feature information of the three wake words extracted, and their average used as the feature information of the wake word for the first training, which improves the success rate of training the wake word.
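  • Averaging the feature information of several recordings can be sketched as an element-wise mean, assuming each recording has already been reduced to a feature vector of the same length. The function name `average_features` and the sample values are illustrative.

```python
def average_features(recordings):
    """Element-wise mean of several equal-length feature vectors."""
    if not recordings:
        raise ValueError("at least one recording is required")
    length = len(recordings[0])
    if any(len(r) != length for r in recordings):
        raise ValueError("all feature vectors must have the same length")
    return [sum(column) / len(recordings) for column in zip(*recordings)]


# Three recordings of the same wake word, two features each.
mean_features = average_features([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

  • Averaging smooths out variation between individual recordings, which is one plausible reason it helps the later similarity checks succeed.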
  • The wake word training method for a home appliance in the embodiment of the present disclosure controls the home appliance to enter a custom wake word mode, collects the entered wake word, and trains the wake word. Each time it is detected and determined that a round of training succeeds, the next collection and training are performed, until the N-th training of the wake word succeeds. This realizes user-defined wake words, meets the personalized needs of users, and yields trained wake words with high accuracy.
  • the wake-up word training method for a home appliance may further include:
  • the effective wake-up words are saved in the local custom wake-up thesaurus.
  • the wake-up word training method for a home appliance may further include:
  • The feature information of the input valid wake-up word can be extracted and then compared with the feature information stored in the custom wake-up word library. If the similarity between the two is higher than a preset value, a wake-up instruction may be generated, and the home appliance woken up according to the wake-up instruction. Otherwise, the home appliance is not woken up.
  • The speech recognition device is installed in the cooking device so that the cooking device has a speech recognition function.
  • The factory setting of the cooking device is that the command phrase for starting custom wake-up word training is "change a name".
  • The voice module of the speech recognition device is activated.
  • The user says "change a name", and the cooking device enters the custom wake-up word training mode.
  • The cooking device can play "Please say a new wake-up word after the beep".
  • The user speaks a new wake-up word following the voice prompt.
  • The cooking device receives the new wake-up word and determines whether it is successfully trained. If the training is successful, the cooking device may give the voice prompt "Training is successful, please say the wake word again"; if the training is not successful, the cooking device may give the voice prompt "Sound **, please say the wake word again".
  • Here ** can be "too small", "too big", "too long", "too short", "too simple", "inconsistent with the last training result" and so on.
  • The above training steps are repeated, and when the third training succeeds, the cooking device may give the voice prompt "Training is completed, and the new wake-up word has taken effect", thereby ending the training.
  • the training wake-up word process can be shown in FIG. 9.
  • the present disclosure also proposes a wake word training apparatus for a home appliance.
  • FIG. 10 is a structural block diagram of a wake-up word training apparatus for a home appliance provided in Embodiment 8 of the present disclosure.
  • The wake-up word training device for a home appliance may include: a first collection module 110, an extraction module 120, a determination module 130, and a first saving module 140.
  • The first collection module 110 is configured to collect voice data samples of wake words.
  • The first collection module 110 is further configured to re-collect voice data samples of the wake-up word when it is detected and determined that the normalized feature information does not satisfy the preset condition.
  • The extraction module 120 is configured to extract feature information of a voice data sample.
  • The determination module 130 is configured to normalize the feature information, and to detect and determine that the normalized feature information meets a preset condition.
  • The first saving module 140 is configured to save a voice data sample of the wake-up word into a custom wake-up word library.
  • the wake-up word training apparatus for a home appliance may further include a pre-processing module 150.
  • the preprocessing module 150 is configured to perform denoising processing on the voice data samples after collecting the voice data samples of the wake word.
  • the wake-up word training apparatus for a home appliance may further include a first control module 160.
  • The first control module 160 is configured to control the home appliance to enter a custom wake-up word mode before collecting voice data samples of the wake-up word.
  • the wake-up word training device for a home appliance may further include a first receiving module 210, a recognition module 220, and a first wake-up module 230.
  • the first receiving module 210 is configured to receive input voice information.
  • the recognition module 220 is configured to detect and determine that the voice information is a custom wake-up word based on the custom wake-up thesaurus.
  • The recognition module 220 is configured to: extract feature information of the voice information; normalize the feature information of the voice information, and use a dynamic time warping (DTW) algorithm to compare it with the feature information of the wake-up words in the custom wake-up word library; obtain the comparison result with the highest similarity; and, upon detecting and determining that the comparison result with the highest similarity satisfies the set value, determine that the voice information is a custom wake-up word.
  • the first wake-up module 230 is configured to detect and determine that the voice information is a custom wake-up word, generate a wake-up instruction, and wake up the home appliance according to the wake-up instruction.
  • the wake-up word training apparatus for a home appliance may further include a prompting module 240.
  • the prompting module 240 is configured to detect and determine that the voice information is not a custom wake-up word, and prompt to re-enter the voice information.
  • The wake-up word training apparatus for a home appliance in the embodiment of the present disclosure collects voice data samples of a wake-up word, extracts feature information of the voice data samples, and normalizes the feature information. When it is detected and determined that the normalized feature information satisfies the preset condition, the voice data samples of the wake-up word are saved in the custom wake-up word library, thereby realizing user-defined wake-up words and meeting the personalized needs of users.
  • the present disclosure also proposes a wake word training apparatus for a home appliance.
  • FIG. 15 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 13 of the present disclosure.
  • The wake word training device for a home appliance may include a second control module 310, a second collection module 320, and a training module 330.
  • The second control module 310 is configured to control a home appliance to enter a custom wake-up word mode.
  • The second collection module 320 is configured to collect an input wake-up word.
  • The training module 330 is configured to train the wake-up word. When it is detected and determined that the training of the wake-up word succeeds, the second collection module 320 performs the next collection and the training module 330 performs the next training, until the N-th training of the wake-up word succeeds.
  • The training module 330 is specifically configured to: extract feature information of the wake-up word; and, upon detecting and determining that the feature information of the wake-up word meets a preset standard, determine that training of the wake-up word is successful.
  • The training module 330 is further configured to: extract feature information of the wake-up word input the M-th time; upon detecting and determining that the feature information of the wake-up word input the M-th time meets a preset standard, calculate the similarity between that feature information and the feature information of the wake-up words input the previous M-1 times; and, upon detecting and determining that the similarity is greater than the preset similarity, determine that the training of the wake-up word is successful.
  • The wake-up word training apparatus for a home appliance may further include a second saving module 340.
  • The second saving module 340 is configured to determine, after the wake-up word is trained successfully for the N-th time, that the wake-up word takes effect, and to save the effective wake-up word locally.
  • The wake-up word training apparatus for a home appliance may further include a second receiving module 350 and a second wake-up module 360.
  • The second receiving module 350 is configured to receive an input of the effective wake-up word after it is determined that the wake-up word has taken effect.
  • The second wake-up module 360 is configured to wake up the home appliance according to the effective wake-up word.
  • The wake-up word training apparatus for a home appliance in the embodiment of the present disclosure controls the home appliance to enter a custom wake-up word mode, collects the input wake-up word, trains the wake-up word, and, upon detecting and determining that the training succeeded, proceeds to the next training until the wake-up word is trained successfully for the N-th time, thereby implementing user-defined wake-up words, meeting users' personalized needs, and yielding a trained wake-up word with high accuracy.
  • The present disclosure also proposes a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the wake-up word training method for a home appliance proposed by the foregoing embodiments of the present disclosure.
  • The present disclosure also provides a home appliance including a processor, a memory, and a computer program stored on the memory and executable on the processor.
  • The processor is configured to execute the wake-up word training method for a home appliance proposed by the foregoing embodiments of the present disclosure.
  • The terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Therefore, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present disclosure, "plurality" means at least two, for example, two or three, unless specifically defined otherwise.
  • Any process or method description in a flowchart or otherwise described herein can be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing steps of a custom logic function or process, and the scope of the preferred embodiments of the present disclosure includes additional implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order, depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present disclosure belong.
  • A sequenced list of executable instructions that can be considered to implement a logical function can be embodied in any computer-readable medium for use by an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from the instruction execution system, apparatus, or device), or for use in combination with such an instruction execution system, apparatus, or device.
  • A "computer-readable medium" may be any apparatus that can contain, store, communicate, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Examples of computer-readable media include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CD-ROM).
  • The computer-readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example, by optically scanning the paper or other medium and then editing, interpreting, or otherwise suitably processing it, and can then be stored in a computer memory.
  • Portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof.
  • Multiple steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system.
  • Hardware implementations may use, for example, discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits (ASICs) with suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
  • A person of ordinary skill in the art can understand that all or part of the steps of the methods in the foregoing embodiments may be implemented by a program instructing related hardware.
  • The program may be stored in a computer-readable storage medium.
  • When executed, the program performs one or a combination of the steps of the method embodiments.
  • The functional units in the embodiments of the present disclosure may be integrated into one processing module, or each unit may exist separately physically, or two or more units may be integrated into one module.
  • The above integrated modules can be implemented in the form of hardware or in the form of software functional modules. If an integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • The aforementioned storage medium may be a read-only memory, a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

A wakeup word training method and device of a household appliance, and a household appliance. The method comprises: collecting a voice data sample of a wakeup word (S101); extracting the feature information of the voice data sample (S102); normalizing the feature information, and detecting and determining whether the normalized feature information meets a preset condition; and saving the voice data sample of the wakeup word to a user-defined wakeup word library (S104). By collecting the voice data sample of the wakeup word, extracting the feature information of the voice data sample, normalizing the feature information, detecting and determining whether the normalized feature information meets a preset condition, and saving the voice data sample of the wakeup word to a user-defined wakeup word library, the wakeup word is defined by a user and the personalized demands of the user are satisfied.

Description

Wake-Up Word Training Method and Apparatus for a Home Appliance, and Home Appliance
Cross-Reference to Related Applications
This disclosure claims priority to Chinese Patent Application No. 201810628693.7, filed by Guangdong Midea Kitchen Appliance Manufacturing Co., Ltd. on June 19, 2018 and entitled "Wake-Up Word Training Method and Apparatus for a Home Appliance, and Home Appliance", and to Chinese Patent Application No. 201810885079.9, filed on August 6, 2018 and entitled "Wake-Up Word Training Method and Apparatus for a Home Appliance, and Home Appliance".
Technical Field
The present disclosure relates to the technical field of household appliances, and in particular to a wake-up word training method and apparatus for a home appliance, and a home appliance.
Background
With the continuous advancement of science and technology, products built on speech recognition technology are applied in ever more fields, including in-vehicle systems, robots, home services, banking services, medical services, and industrial control. At present, speech recognition technology falls mainly into two categories. One is cloud-based semantic recognition, in which the voice signal is transmitted over a network to a server for semantic analysis and understanding, and the result is then returned over the network. Typical examples include Apple's Siri (voice assistant), Amazon's Echo speaker, and Microsoft Xiaoice. However, this approach works only with a network connection, which limits its usage scenarios. The other is local entry recognition, which requires no network and processes voice control command words in real time on a high-performance processor embedded in the device. However, it can recognize only preset voice control command entries and responds only after a complete entry has been recognized; it cannot achieve free semantic understanding, and the user experience is poor.
Summary
The present disclosure proposes a wake-up word training method and apparatus for a home appliance, and a home appliance, so as to implement user-defined wake-up words and meet users' personalized needs.
An embodiment of the first aspect of the present disclosure proposes a wake-up word training method for a home appliance, including:
collecting a voice data sample of a wake-up word;
extracting feature information of the voice data sample;
normalizing the feature information, and detecting and determining that the normalized feature information meets a preset condition; and
saving the voice data sample of the wake-up word to a custom wake-up word library.
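The four steps above can be sketched as follows. This is only an illustration under stated assumptions: this passage does not specify the feature type, the normalization, or the preset condition, so the per-frame log energy and zero-crossing rate, the z-score normalization, and the minimum frame count used here are placeholder choices, and every function name and parameter is hypothetical.

```python
import math


def extract_features(samples, frame_len=160):
    """Return [log-energy, zero-crossing rate] for each full frame (assumed features)."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        feats.append([math.log(energy + 1e-10), crossings / (frame_len - 1)])
    return feats


def normalize(feats):
    """Z-score each feature dimension across frames (assumed normalization)."""
    if not feats:
        return []
    out = [row[:] for row in feats]
    for d in range(len(feats[0])):
        col = [row[d] for row in feats]
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        for row in out:
            row[d] = (row[d] - mean) / std if std > 0 else 0.0
    return out


def meets_preset_condition(norm_feats, min_frames=20):
    """Hypothetical preset condition: enough frames, and the features are not all constant."""
    if len(norm_feats) < min_frames:
        return False
    return any(v != 0.0 for row in norm_feats for v in row)


def enroll(samples, library):
    """Save the sample to the custom library only if its normalized features pass the check."""
    norm = normalize(extract_features(samples))
    if not meets_preset_condition(norm):
        return False  # the caller would re-collect the sample
    library.append({"samples": list(samples), "features": norm})
    return True
```

Storing the normalized features next to the raw sample is a convenience of this sketch, so that later comparisons need not recompute them; the text itself only requires that the voice data sample be saved.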
In a first possible implementation of the embodiment of the first aspect of the present disclosure, the method further includes:
detecting and determining that the normalized feature information does not meet the preset condition, and re-collecting the voice data sample of the wake-up word.
In a second possible implementation of the embodiment of the first aspect of the present disclosure, the method further includes:
after collecting the voice data sample of the wake-up word, performing denoising on the voice data sample.
In a third possible implementation of the embodiment of the first aspect of the present disclosure, the method further includes:
before collecting the voice data sample of the wake-up word, controlling the home appliance to enter a custom wake-up word mode.
In a fourth possible implementation of the embodiment of the first aspect of the present disclosure, the method further includes:
receiving input voice information;
detecting and determining, based on the custom wake-up word library, that the voice information is a custom wake-up word; and
generating a wake-up instruction, and waking up the home appliance according to the wake-up instruction.
In a fifth possible implementation of the embodiment of the first aspect of the present disclosure, detecting and determining, based on the custom wake-up word library, that the voice information is a custom wake-up word includes:
extracting feature information of the voice information;
normalizing the feature information of the voice information, and using a dynamic time planning algorithm to compare the feature information of the voice information with the feature information of the wake-up words in the custom wake-up word library;
obtaining the comparison result with the highest similarity; and
detecting and determining that the comparison result with the highest similarity satisfies a set value, and determining that the voice information is a custom wake-up word.
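The comparison steps above can be illustrated with a short sketch. It assumes that the "dynamic time planning algorithm" named in the text refers to dynamic time warping (DTW), which aligns two feature sequences of different lengths by dynamic programming; that reading is an assumption. The similarity mapping, the threshold value, and all function names are likewise hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def similarity(a, b):
    """Map the length-normalized DTW cost into (0, 1]; identical sequences give 1.0."""
    return 1.0 / (1.0 + dtw_distance(a, b) / (len(a) + len(b)))


def recognize(features, library, threshold=0.8):
    """Return (is_custom_wake_word, best_similarity) over all library entries."""
    if not features or not library:
        return False, 0.0
    best = max(similarity(features, ref) for ref in library)
    return best >= threshold, best
```

Taking the maximum over the library corresponds to "obtaining the comparison result with the highest similarity", and the threshold test corresponds to checking that result against the set value. A caller would generate the wake-up instruction when the first returned value is true and prompt for re-entry otherwise.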
In a sixth possible implementation of the embodiment of the first aspect of the present disclosure, the method further includes:
detecting and determining that the voice information is not a custom wake-up word, and prompting the user to re-enter the voice information.
The wake-up word training method for a home appliance according to the embodiment of the present disclosure collects a voice data sample of a wake-up word, extracts feature information of the voice data sample, normalizes the feature information, and, upon detecting and determining that the normalized feature information meets a preset condition, saves the voice data sample of the wake-up word to a custom wake-up word library, thereby implementing user-defined wake-up words and meeting users' personalized needs.
An embodiment of the second aspect of the present disclosure proposes another wake-up word training method for a home appliance, including:
controlling the home appliance to enter a custom wake-up word mode;
collecting an input wake-up word;
training the wake-up word, and detecting and determining that the wake-up word is trained successfully; and
performing the next wake-up word collection and training until the wake-up word is trained successfully for the N-th time, where N is a positive integer.
In a first possible implementation of the embodiment of the second aspect of the present disclosure, the method further includes:
after the wake-up word is trained successfully for the N-th time, determining that the wake-up word takes effect, and saving the effective wake-up word locally.
In a second possible implementation of the embodiment of the second aspect of the present disclosure, the method further includes:
after determining that the wake-up word has taken effect, receiving an input of the effective wake-up word; and
waking up the home appliance according to the effective wake-up word.
In a third possible implementation of the embodiment of the second aspect of the present disclosure, training the wake-up word and detecting and determining that the wake-up word is trained successfully includes:
extracting feature information of the wake-up word; and
detecting and determining that the feature information of the wake-up word meets a preset standard, and determining that the wake-up word is trained successfully.
In a fourth possible implementation of the embodiment of the second aspect of the present disclosure, performing the next wake-up word collection and training until the wake-up word is trained successfully for the N-th time includes:
extracting feature information of the wake-up word input for the M-th time;
detecting and determining that the feature information of the wake-up word input for the M-th time meets the preset standard, and calculating the similarity between that feature information and the feature information of each of the wake-up words input the previous M-1 times; and
detecting and determining that the similarities between the feature information of the wake-up word input for the M-th time and the feature information of the wake-up words input the previous M-1 times are all greater than a preset similarity, and determining that the wake-up word is trained successfully.
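The repeated-training rule above (the M-th input must meet the preset standard and be sufficiently similar to every one of the previous M-1 accepted inputs) can be sketched as below. The fixed-length feature vectors, the cosine similarity measure, the "preset standard" check, and the class and parameter names are all assumptions made for illustration; the text does not specify them.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def meets_standard(feat, min_norm=1e-3):
    """Hypothetical preset standard: the feature vector is not (nearly) silent."""
    return math.sqrt(sum(x * x for x in feat)) >= min_norm


class WakeWordTrainer:
    """Tracks successful trainings until N of them have been accepted."""

    def __init__(self, n_required=3, min_similarity=0.9):
        self.n_required = n_required          # N in the text
        self.min_similarity = min_similarity  # preset similarity
        self.accepted = []                    # features of earlier successful inputs

    def train_once(self, feat):
        """Return True if this input counts as a successful training."""
        if not meets_standard(feat):
            return False
        # Every similarity to the earlier inputs must exceed the preset similarity.
        if any(cosine_similarity(feat, prev) <= self.min_similarity
               for prev in self.accepted):
            return False
        self.accepted.append(list(feat))
        return True

    @property
    def effective(self):
        """True once the wake-up word has been trained successfully N times."""
        return len(self.accepted) >= self.n_required
```

For the first input there are no earlier inputs, so only the preset standard applies, which matches the third implementation above. In this sketch a rejected input is simply discarded and the caller collects the wake-up word again.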
The wake-up word training method for a home appliance according to the embodiment of the present disclosure controls the home appliance to enter a custom wake-up word mode, collects the input wake-up word, trains the wake-up word, and, upon detecting and determining that the training succeeded, proceeds to the next training until the wake-up word is trained successfully for the N-th time, thereby implementing user-defined wake-up words, meeting users' personalized needs, and yielding a trained wake-up word with high accuracy.
An embodiment of the third aspect of the present disclosure proposes a wake-up word training apparatus for a home appliance, including:
a first acquisition module, configured to collect a voice data sample of a wake-up word;
an extraction module, configured to extract feature information of the voice data sample;
a judging module, configured to normalize the feature information, and to detect and determine that the normalized feature information meets a preset condition; and
a first saving module, configured to save the voice data sample of the wake-up word to a custom wake-up word library.
In a first possible implementation of the embodiment of the third aspect of the present disclosure, the first acquisition module is further configured to:
detect and determine that the normalized feature information does not meet the preset condition, and re-collect the voice data sample of the wake-up word.
In a second possible implementation of the embodiment of the third aspect of the present disclosure, the apparatus further includes:
a preprocessing module, configured to perform denoising on the voice data sample after the voice data sample of the wake-up word is collected.
In a third possible implementation of the embodiment of the third aspect of the present disclosure, the apparatus further includes:
a first control module, configured to control the home appliance to enter a custom wake-up word mode before the voice data sample of the wake-up word is collected.
In a fourth possible implementation of the embodiment of the third aspect of the present disclosure, the apparatus further includes:
a first receiving module, configured to receive input voice information;
a recognition module, configured to detect and determine, based on the custom wake-up word library, that the voice information is a custom wake-up word; and
a first wake-up module, configured to generate a wake-up instruction and wake up the home appliance according to the wake-up instruction.
In a fifth possible implementation of the embodiment of the third aspect of the present disclosure, the recognition module is configured to:
extract feature information of the voice information;
normalize the feature information of the voice information, and use a dynamic time planning algorithm to compare the feature information of the voice information with the feature information of the wake-up words in the custom wake-up word library;
obtain the comparison result with the highest similarity; and
detect and determine that the comparison result with the highest similarity satisfies a set value, and determine that the voice information is a custom wake-up word.
In a sixth possible implementation of the embodiment of the third aspect of the present disclosure, the apparatus further includes:
a prompting module, configured to detect and determine that the voice information is not a custom wake-up word, and to prompt the user to re-enter the voice information.
The wake-up word training apparatus for a home appliance according to the embodiment of the present disclosure collects a voice data sample of a wake-up word, extracts feature information of the voice data sample, normalizes the feature information, and, upon detecting and determining that the normalized feature information meets a preset condition, saves the voice data sample of the wake-up word to a custom wake-up word library, thereby implementing user-defined wake-up words and meeting users' personalized needs.
An embodiment of the fourth aspect of the present disclosure proposes another wake-up word training apparatus for a home appliance, including:
a second control module, configured to control the home appliance to enter a custom wake-up word mode;
a second acquisition module, configured to collect an input wake-up word; and
a training module, configured to train the wake-up word and to detect and determine that the wake-up word is trained successfully, whereupon the second acquisition module performs the next collection and the training module performs the next training, until the wake-up word is trained successfully for the N-th time, where N is a positive integer.
In a first possible implementation of the embodiment of the fourth aspect of the present disclosure, the apparatus further includes:
a second saving module, configured to determine, after the wake-up word is trained successfully for the N-th time, that the wake-up word takes effect, and to save the effective wake-up word locally.
In a second possible implementation of the embodiment of the fourth aspect of the present disclosure, the apparatus further includes:
a second receiving module, configured to receive an input of the effective wake-up word after it is determined that the wake-up word has taken effect; and
a second wake-up module, configured to wake up the home appliance according to the effective wake-up word.
In a third possible implementation of the embodiment of the fourth aspect of the present disclosure, the training module is configured to:
extract feature information of the wake-up word; and
detect and determine that the feature information of the wake-up word meets a preset standard, and determine that the wake-up word is trained successfully.
In a fourth possible implementation of the embodiment of the fourth aspect of the present disclosure, the training module is further configured to:
extract feature information of the wake-up word input for the M-th time;
detect and determine that the feature information of the wake-up word input for the M-th time meets the preset standard, and calculate the similarity between that feature information and the feature information of each of the wake-up words input the previous M-1 times; and
detect and determine that the similarities between the feature information of the wake-up word input for the M-th time and the feature information of the wake-up words input the previous M-1 times are all greater than a preset similarity, and determine that the wake-up word is trained successfully.
The wake-up word training apparatus for a home appliance according to the embodiment of the present disclosure controls the home appliance to enter a custom wake-up word mode, collects the input wake-up word, trains the wake-up word, and, upon detecting and determining that the training succeeded, proceeds to the next training until the wake-up word is trained successfully for the N-th time, thereby implementing user-defined wake-up words, meeting users' personalized needs, and yielding a trained wake-up word with high accuracy.
An embodiment of the fifth aspect of the present disclosure proposes a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the wake-up word training method for a home appliance according to the embodiment of the first aspect, or implements the wake-up word training method for a home appliance according to the embodiment of the second aspect.
An embodiment of the sixth aspect of the present disclosure proposes a home appliance including a processor, a memory, and a computer program stored on the memory and executable on the processor, where the processor is configured to execute the wake-up word training method for a home appliance according to the embodiment of the first aspect, or to execute the wake-up word training method for a home appliance according to the embodiment of the second aspect.
Additional aspects and advantages of the present disclosure will be set forth in part in the following description, and in part will become apparent from the following description or be learned through practice of the present disclosure.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present disclosure more clearly, the accompanying drawings used in the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present disclosure, and those of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 1 of the present disclosure;
FIG. 2 is a schematic flowchart of training the same wake-up word multiple times according to an embodiment of the present disclosure;
FIG. 3 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 2 of the present disclosure;
FIG. 4 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 3 of the present disclosure;
FIG. 5 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 4 of the present disclosure;
FIG. 6 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 5 of the present disclosure;
FIG. 7 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 6 of the present disclosure;
FIG. 8 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 7 of the present disclosure;
FIG. 9 is a schematic flowchart of wake-up word training according to a specific example of the present disclosure;
FIG. 10 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 8 of the present disclosure;
FIG. 11 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 9 of the present disclosure;
FIG. 12 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 10 of the present disclosure;
FIG. 13 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 11 of the present disclosure;
FIG. 14 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 12 of the present disclosure;
FIG. 15 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 13 of the present disclosure;
FIG. 16 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 14 of the present disclosure;
FIG. 17 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 15 of the present disclosure.
Detailed Description
Embodiments of the present disclosure are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present disclosure and should not be construed as limiting it.
At present, speech recognition technology mainly takes two forms: cloud semantic recognition and local entry recognition. Cloud semantic recognition works only with a network connection, which restricts where it can be used. Local entry recognition can recognize only preset voice control command entries and cannot achieve free semantic understanding. The present disclosure therefore proposes a wake-up word training method for a home appliance that allows the local wake-up word to be customized to meet individual needs, does not depend on a network, responds quickly, and is not restricted by the usage scenario.
A wake-up word training method and apparatus for a home appliance, and a home appliance, according to embodiments of the present disclosure are described below with reference to the accompanying drawings.
FIG. 1 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 1 of the present disclosure.
As shown in FIG. 1, the wake-up word training method for a home appliance includes:
S101: Collect a voice data sample of a wake-up word.
In the field of intelligent voice interaction, a user can wake up a dormant device by speaking a wake-up word. The wake-up word is usually predefined by the manufacturer and cannot be changed, so it cannot meet users' individual needs. This embodiment therefore provides the home appliance with a custom wake-up word mode in which the user can train a custom wake-up word that suits their own needs.
In one embodiment of the present disclosure, before training a custom wake-up word, the user may first control the home appliance to enter the custom wake-up word mode, for example by pressing a physical button or issuing a voice command. After the home appliance enters the custom wake-up word mode, it may prompt the user to speak the wake-up word they want to set. The user speaks the wake-up word within a predetermined period, for example within 5 seconds. The home appliance can then collect a voice data sample of the wake-up word in a preset audio format through a voice input device such as a microphone. For example, the sound signal is collected at a sampling frequency of 16 kHz and a transmission rate of 16 bits. If the user does not speak the wake-up word within 5 seconds, the user may be prompted to input it again.
S102: Extract feature information of the voice data sample.
The feature information may be extracted using MFCC (Mel-frequency cepstral coefficients) or another feature extraction algorithm. Speech can be divided by frequency into low, intermediate, and high frequencies. When extracting features, feature information can therefore be extracted separately in the low-frequency range, the intermediate-frequency range, and the high-frequency range, and the feature information of each of the three ranges is assigned a different weight value. The feature information may include features such as sound intensity and whether the voice is male or female.
S103: Normalize the feature information and judge whether the normalized feature information satisfies a preset condition. If yes, perform step S104; if not, perform step S105.
Normalization processes the feature information with a given algorithm so that it is confined to a given range, which allows the normalized feature information to be compared against the preset condition. For example, the length feature is compared with a preset length range to judge whether it is too short or too long, or the intensity feature is compared with a preset intensity range to judge whether it is too large or too small.
S104: Save the voice data sample of the wake-up word to a custom wake-up word library.
In one embodiment of the present disclosure, if the normalized feature information satisfies the preset condition, the wake-up word has been trained successfully, and the voice data sample of the wake-up word is saved to the custom wake-up word library.
In addition, the wake-up word training method for a home appliance may further include:
S105: Collect the voice data sample of the wake-up word again.
In one embodiment of the present disclosure, if the normalized feature information does not satisfy the preset condition, the home appliance may prompt the user to input the wake-up word again, so that the voice data sample of the wake-up word is collected again.
To improve accuracy, the same wake-up word may be trained multiple times. As shown in FIG. 2, the same wake-up word spoken by the user is collected three times, feature information is extracted from each recording and normalized, and the normalized feature information is then filtered to select the feature information that satisfies the condition (the successfully trained wake-up word). Finally, the trained wake-up word is stored in the local custom wake-up word library.
In the wake-up word training method for a home appliance according to this embodiment of the present disclosure, a voice data sample of a wake-up word is collected, feature information of the voice data sample is extracted and normalized, and, once it is detected and determined that the normalized feature information satisfies a preset condition, the voice data sample of the wake-up word is saved to a custom wake-up word library. This lets the user define their own wake-up word and meets the user's individual needs.
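The decision in steps S103 to S105 can be illustrated with a minimal Python sketch. It is not the method of the disclosure itself: the feature names, the reference scales in `REFERENCE_SCALE`, and the acceptance ranges in `PRESET_RANGE` are hypothetical values chosen only for illustration, since the text gives no numbers and does not name the normalization algorithm.

```python
# Hypothetical reference scales and acceptance ranges (assumptions, not from the source).
REFERENCE_SCALE = {"length": 5.0, "intensity": 32768.0}
PRESET_RANGE = {"length": (0.1, 0.6), "intensity": (0.05, 0.9)}


def normalize(features):
    """Map each raw feature into [0, 1] by dividing by its reference scale and clamping."""
    return {
        name: min(max(value / REFERENCE_SCALE[name], 0.0), 1.0)
        for name, value in features.items()
    }


def meets_preset_condition(normalized):
    """Return True only if every normalized feature lies inside its preset range."""
    return all(
        PRESET_RANGE[name][0] <= value <= PRESET_RANGE[name][1]
        for name, value in normalized.items()
    )


def train_once(sample, features, library):
    """S103 to S105: save the sample on success, otherwise request a new recording."""
    if meets_preset_condition(normalize(features)):
        library.append(sample)  # S104: keep the sample in the custom library
        return "saved"
    return "recollect"  # S105: the caller should prompt the user to speak again
```

Calling `train_once` once per recording and prompting the user whenever it returns `"recollect"` mirrors the repeated collection shown in FIG. 2.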
In another embodiment of the present disclosure, as shown in FIG. 3, the wake-up word training method for a home appliance may further include:
S106: After the voice data sample of the wake-up word is collected, denoise the voice data sample.
Because of ambient noise and other interfering sounds, the voice data sample collected when the user speaks the wake-up word needs to be denoised first, which avoids the influence of noise and improves accuracy.
In yet another embodiment of the present disclosure, as shown in FIG. 4, the wake-up word training method for a home appliance may further include:
S401: Receive input voice information.
Once the custom wake-up word has been set successfully, it can be used to wake up the home appliance.
In one embodiment of the present disclosure, voice information input by the user may be received.
S402: Identify, based on the custom wake-up word library, whether the voice information is a custom wake-up word. If yes, perform step S403; if not, perform step S404.
After the voice information is received, whether it is a custom wake-up word can be identified based on the custom wake-up word library.
Specifically, feature information of the voice information is extracted and normalized, and a dynamic time planning algorithm is then used to compare the feature information of the voice information with the feature information of every wake-up word in the custom wake-up word library. For example, the similarity between the feature information B of the voice information and the feature information of each of wake-up word A1, wake-up word A2, and wake-up word A3 in the custom wake-up word library is calculated.
The comparison result with the highest similarity is then obtained. If that result satisfies a set value, the voice information is determined to be a custom wake-up word; if it does not satisfy the set value, the voice information is determined not to be a custom wake-up word.
S403: Generate a wake-up instruction and wake up the home appliance according to the wake-up instruction.
In one embodiment of the present disclosure, if the voice information is a custom wake-up word, a wake-up instruction is generated and the home appliance is woken up according to the wake-up instruction.
S404: Prompt the user to input the voice information again.
In one embodiment of the present disclosure, if the voice information is not a custom wake-up word, the user is prompted to input the voice information again, which improves the success rate of waking up the home appliance.
This embodiment uses the local custom wake-up word library to identify whether the voice information input by the user is a custom wake-up word. Compared with conventional networked recognition, it responds faster, is not limited by network availability, and can be used in more scenarios.
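The matching in steps S402 to S404 can be sketched as follows. The sketch assumes that the "dynamic time planning algorithm" refers to classic dynamic time warping (DTW); the mapping from DTW distance to a similarity score, `1 / (1 + distance)`, and the set value of 0.8 are illustrative assumptions, and the function names are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two sequences of equal-dimension feature vectors."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def similarity(a, b):
    """Assumed mapping: identical sequences score 1.0, distant ones approach 0."""
    return 1.0 / (1.0 + dtw_distance(a, b))


def recognize(utterance, library, set_value=0.8):
    """Return the best-matching wake-up word name, or None if no match reaches set_value."""
    if not library:
        return None
    best_name, best_sim = max(
        ((name, similarity(utterance, feats)) for name, feats in library.items()),
        key=lambda pair: pair[1],
    )
    return best_name if best_sim >= set_value else None
```

A return value of `None` corresponds to step S404 (prompt the user to speak again); any other value corresponds to step S403 (generate the wake-up instruction).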
The present disclosure also proposes another wake-up word training method for a home appliance. FIG. 5 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 4 of the present disclosure.
As shown in FIG. 5, the wake-up word training method for a home appliance includes:
S501: Control the home appliance to enter a custom wake-up word mode.
In the field of intelligent voice interaction, a user can wake up a dormant device by speaking a wake-up word. The wake-up word is usually predefined by the manufacturer and cannot be changed, so it cannot meet users' individual needs. This embodiment therefore provides the home appliance with a custom wake-up word mode in which the user can train a custom wake-up word that suits their own needs.
In one embodiment of the present disclosure, before training a custom wake-up word, the user may first control the home appliance to enter the custom wake-up word mode, for example by pressing a physical button or issuing a voice command.
S502: Collect the input wake-up word.
After the home appliance enters the custom wake-up word mode, it may prompt the user to speak the wake-up word they want to set. The user speaks the wake-up word within a predetermined period, for example within 5 seconds. The home appliance can then collect a voice data sample of the wake-up word in a preset audio format through a voice input device such as a microphone. For example, the sound signal is collected at a sampling frequency of 16 kHz and a transmission rate of 16 bits. If the user does not speak the wake-up word within 5 seconds, the user may be prompted to input it again.
S503: Train the wake-up word and judge whether the training succeeded. If yes, perform step S504; if not, perform step S502.
In one embodiment of the present disclosure, feature information of the wake-up word may be extracted first and then compared with a preset standard to judge whether the feature information meets the preset standard.
For example, a suitable maximum duration can be set for the wake-up word based on users' usage habits.
The quality and consistency of the training corpus (the wake-up word) must be strictly guaranteed during training. Throughout the training process, whether the wake-up word meets the preset standard is therefore judged in terms of speech loudness, speech length, speech similarity, speech complexity, ambient noise, and the like.
The collected wake-up word is a time-domain signal, which can be converted into a frequency-domain signal (feature information extraction) before the comparison and analysis are performed.
Speech loudness judgment: four predefined thresholds are first set according to experimental results, namely the maximum volume vh, the minimum volume vl, the maximum number of values allowed above the maximum volume vhm, and the maximum number of values allowed below the minimum volume vlm. The number of values in the training corpus above the maximum volume, vhr, and the number below the minimum volume, vlr, are then counted. If vhr > vhm, the sound is too loud; if vlr > vlm, the sound is too quiet. If vhr < vhm and vlr < vlm, the speech loudness meets the standard.
Speech length judgment: this consists of two parts, a too-long judgment and a too-short judgment. Both rely on the fixed length of the training corpus and are made using the signal-to-noise ratio of the front-end speech and the back-end speech. If the power of the back-end speech has not weakened relative to that of the front-end speech, the speech is too long; if the power of the back-end speech weakens early relative to that of the front-end speech, the speech is too short.
Speech similarity judgment: a similarity threshold is predefined according to experimental results, and the cosine distance is then used to judge the similarity between different utterances. If the similarity is greater than the threshold, the utterances are similar; otherwise they are not.
Speech complexity judgment: this uses the peak characteristics of the training corpus. If the number of peaks is greater than a predefined threshold, the training corpus is qualified; otherwise it is unqualified.
Ambient noise judgment: a noise threshold is set according to the characteristics of the environment, and the training corpus is analyzed. If the noise in the training corpus is below the threshold, the environment is suitable; otherwise the noise is too high.
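Two of the judgments above, speech loudness and speech complexity, can be sketched in Python as follows. The symbols vh, vl, vhm, vlm, vhr, and vlr follow the text; how a "peak" is detected is not specified in the source, so counting strict local maxima of an amplitude envelope is an assumption, as is treating the boundary case (a count exactly equal to its limit) as acceptable.

```python
def judge_loudness(volumes, vh, vl, vhm, vlm):
    """Classify loudness from per-frame volume values using the four thresholds."""
    vhr = sum(1 for v in volumes if v > vh)  # values above the maximum volume
    vlr = sum(1 for v in volumes if v < vl)  # values below the minimum volume
    if vhr > vhm:
        return "too loud"
    if vlr > vlm:
        return "too quiet"
    return "ok"  # assumption: counts equal to the limits are also accepted


def count_peaks(envelope):
    """Count strict local maxima (assumed definition of a peak)."""
    return sum(
        1
        for i in range(1, len(envelope) - 1)
        if envelope[i - 1] < envelope[i] > envelope[i + 1]
    )


def judge_complexity(envelope, min_peaks):
    """The corpus qualifies only if its peak count exceeds the predefined threshold."""
    return count_peaks(envelope) > min_peaks
```

The returned labels could drive prompts of the kind used in the cooking-device example, such as "too loud" or "too simple".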
S504: Perform the next round of wake-up word collection and training, until the N-th training of the wake-up word succeeds.
Here, N is a positive integer.
That is, if the first training of the wake-up word succeeds, the second training can proceed. If the first training does not succeed, the first training is performed again. In addition, if training fails three times in a row, a prompt message may be generated, for example "Wake-up word training failed, please input another wake-up word for training", to remind the user to switch to a wake-up word that is easier to train successfully.
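The control flow of steps S502 to S504 can be sketched as a small loop. The sketch takes a list of per-attempt outcomes instead of real recordings, and the returned status strings are hypothetical names; only the requirement of N successful trainings and the limit of three consecutive failures come from the text.

```python
def run_training(attempt_outcomes, n_required, max_consecutive_failures=3):
    """Replay training attempts (True = success) and report the resulting state."""
    successes = 0
    consecutive_failures = 0
    for ok in attempt_outcomes:
        if ok:
            successes += 1
            consecutive_failures = 0  # a success resets the failure streak
            if successes == n_required:
                return "completed"  # the N-th training succeeded
        else:
            consecutive_failures += 1  # the same training round is repeated
            if consecutive_failures == max_consecutive_failures:
                return "prompt_other_word"  # suggest choosing another wake-up word
    return "in_progress"
```

For N = 3, the outcomes success, failure, success, success complete the training, because the single failure only causes the second round to be repeated.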
In one embodiment of the present disclosure, as shown in FIG. 6, the M-th training may specifically include the following steps:
S601: Extract feature information of the wake-up word input the M-th time.
S602: Upon detecting and determining that the feature information of the wake-up word input the M-th time meets the preset standard, calculate the similarity between that feature information and the feature information of each of the wake-up words input the previous M-1 times.
S603: Upon detecting and determining that the similarities between the feature information of the wake-up word input the M-th time and the feature information of the wake-up words input the previous M-1 times are all greater than a preset similarity, determine that the wake-up word has been trained successfully.
Suppose M = 5. In the fifth training, the similarity between the feature information of the wake-up word input the fifth time and the feature information of each of the wake-up words input the first, second, third, and fourth times must be calculated. The fifth training is determined to be successful only if all four similarities are greater than the preset similarity, for example 85%.
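Steps S601 to S603 can be sketched as follows. Using cosine similarity on fixed-length feature vectors is an assumption borrowed from the speech similarity judgment described for step S503; the 0.85 default reflects the 85% example in the text, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def mth_training_succeeds(new_features, previous_features, meets_standard,
                          preset_similarity=0.85):
    """S602/S603: succeed only if the standard is met and all M-1 similarities pass."""
    if not meets_standard:
        return False
    return all(
        cosine_similarity(new_features, prev) > preset_similarity
        for prev in previous_features
    )
```

With M = 5, `previous_features` holds the four earlier feature vectors and all four comparisons must exceed the preset similarity. For M = 1 the list is empty, so only the preset standard decides the outcome.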
It should be understood that each training of the wake-up word may itself use multiple recordings. For example, in the first training, the sound signal input by the user may be collected three times, the feature information of the three wake-up word recordings extracted, and their average used as the feature information of the wake-up word for the first training, which improves the success rate of wake-up word training.
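The averaging of several recordings can be sketched as an element-wise mean. The sketch assumes every recording has already been reduced to a feature vector of the same length; how recordings of different durations are aligned is not described in the text.

```python
def average_features(recordings):
    """Element-wise mean of equal-length feature vectors, one vector per recording."""
    if not recordings:
        raise ValueError("at least one recording is required")
    length = len(recordings[0])
    if any(len(r) != length for r in recordings):
        raise ValueError("all feature vectors must have the same length")
    return [sum(column) / len(recordings) for column in zip(*recordings)]
```

The averaged vector then stands in for the single feature vector used in one training round.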
In the wake-up word training method for a home appliance according to this embodiment of the present disclosure, the home appliance is controlled to enter a custom wake-up word mode, the input wake-up word is collected and trained, and, once it is detected and determined that the training succeeded, the next training is performed, until the N-th training of the wake-up word succeeds. This lets the user define their own wake-up word, meets the user's individual needs, and yields a trained wake-up word with high accuracy.
In another embodiment of the present disclosure, as shown in FIG. 7, the wake-up word training method for a home appliance may further include:
S505: After the N-th training of the wake-up word succeeds, determine that the wake-up word has taken effect, and save the effective wake-up word locally.
The effective wake-up word is saved in the local custom wake-up word library.
In yet another embodiment of the present disclosure, as shown in FIG. 8, the wake-up word training method for a home appliance may further include:
S506: After it is determined that the wake-up word has taken effect, receive the input effective wake-up word.
S507: Wake up the home appliance according to the effective wake-up word.
Once the custom wake-up word has been set successfully, the effective wake-up word can be used to wake up the home appliance.
Specifically, feature information of the input effective wake-up word is extracted and compared with the feature information stored in the custom wake-up word library. If the similarity between the two is higher than a preset value, a wake-up instruction is generated and the home appliance is woken up according to the wake-up instruction. Otherwise, the home appliance is not woken up.
A specific example is described below:
A speech recognition device is installed in a cooking device so that the cooking device has a speech recognition function. In the factory settings of the cooking device, the command word that starts custom wake-up word training is "change a name".
After the cooking device is powered on, the voice module of the speech recognition device starts. When the user says "change a name", the cooking device enters the custom wake-up word training mode and may play "Please say the new wake-up word after the beep". Following this voice prompt, the user says the new wake-up word. The cooking device receives the new wake-up word and judges whether it has been trained successfully. If the training succeeds, the cooking device may give the voice prompt "Training succeeded, please say the wake-up word again"; if the training fails, the cooking device may give the voice prompt "The sound is **, please say the wake-up word again", where ** may be "too quiet", "too loud", "too long", "too short", "too simple", "inconsistent with the previous training result", and so on. These training steps are repeated, and when the third training succeeds, the cooking device may give the voice prompt "Training is complete, the new wake-up word has taken effect", which ends the training. This wake-up word training process is shown in FIG. 9. The method improves the accuracy of the trained wake-up word, which in turn raises the recognition rate of the wake-up word and lowers the false recognition rate.
To implement the above embodiments, the present disclosure also proposes a wake-up word training apparatus for a home appliance.
FIG. 10 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 8 of the present disclosure.
As shown in FIG. 10, the wake-up word training apparatus for a home appliance may include a first collection module 110, an extraction module 120, a judgment module 130, and a first saving module 140.
The first collection module 110 is configured to collect a voice data sample of a wake-up word.
In one possible implementation, the first collection module 110 is further configured to collect the voice data sample of the wake-up word again upon detecting and determining that the normalized feature information does not satisfy the preset condition.
The extraction module 120 is configured to extract feature information of the voice data sample.
The judgment module 130 is configured to normalize the feature information and to detect and determine that the normalized feature information satisfies a preset condition.
The first saving module 140 is configured to save the voice data sample of the wake-up word to a custom wake-up word library.
In another embodiment of the present disclosure, as shown in FIG. 11, the wake-up word training apparatus for a home appliance may further include a preprocessing module 150.
The preprocessing module 150 is configured to denoise the voice data sample after the voice data sample of the wake-up word is collected.
In yet another embodiment of the present disclosure, as shown in FIG. 12, the wake-up word training apparatus for a home appliance may further include a first control module 160.
The first control module 160 is configured to control the home appliance to enter a custom wake-up word mode before the voice data sample of the wake-up word is collected.
In still another embodiment of the present disclosure, as shown in FIG. 13, the wake-up word training apparatus for a home appliance may further include a first receiving module 210, a recognition module 220, and a first wake-up module 230.
The first receiving module 210 is configured to receive input voice information.
The recognition module 220 is configured to detect and determine, based on the custom wake-up word library, that the voice information is a custom wake-up word.
In one possible implementation, the recognition module 220 is configured to: extract feature information of the voice information; normalize the feature information of the voice information and use a dynamic time planning algorithm to compare it with the feature information of the wake-up words in the custom wake-up word library; obtain the comparison result with the highest similarity; and, upon detecting and determining that the comparison result with the highest similarity satisfies a set value, determine that the voice information is a custom wake-up word.
The first wake-up module 230 is configured to generate a wake-up instruction upon detecting and determining that the voice information is a custom wake-up word, and to wake up the home appliance according to the wake-up instruction.
In a specific embodiment of the present disclosure, as shown in FIG. 14, the wake-up word training apparatus for a home appliance may further include a prompting module 240.
The prompting module 240 is configured to prompt the user to input the voice information again upon detecting and determining that the voice information is not a custom wake-up word.
It should be noted that the foregoing explanation of the wake-up word training method for a home appliance also applies to the wake-up word training apparatus for a home appliance according to the embodiments of the present disclosure; details not disclosed in this embodiment are not repeated here.
In the wake-up word training apparatus for a home appliance according to this embodiment of the present disclosure, a voice data sample of a wake-up word is collected, feature information of the voice data sample is extracted and normalized, and, once it is detected and determined that the normalized feature information satisfies a preset condition, the voice data sample of the wake-up word is saved to a custom wake-up word library. This lets the user define their own wake-up word and meets the user's individual needs.
为实现上述实施例,本公开还提出一种家电设备的唤醒词训练装置。To achieve the above embodiments, the present disclosure also proposes a wake word training apparatus for a home appliance.
图15为本公开实施例十三所提出的家电设备的唤醒词训练装置的结构框图。FIG. 15 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 13 of the present disclosure.
如图15所示,家电设备的唤醒词训练装置可包括:第二控制模块310、第二采集模块320和训练模块330。As shown in FIG. 15, the wake word training device for a home appliance may include a second control module 310, a second acquisition module 320, and a training module 330.
其中,第二控制模块310,用于控制家电设备进入自定义唤醒词模式。The second control module 310 is configured to control a home appliance to enter a custom wake-up word mode.
第二采集模块320,用于采集输入的唤醒词。The second collection module 320 is configured to collect an input wake-up word.
训练模块330,用于对唤醒词进行训练,检测并确定训练唤醒词成功,采集模块320进行下一次采集,训练模块330进行下一次训练,直至第N次训练唤醒词成功。The training module 330 is configured to train the wake-up words, detect and determine that the training wake-up words are successful, the collection module 320 performs the next collection, and the training module 330 performs the next training until the N-th training wake-up word succeeds.
作为一种可能的实现方式,训练模块330,具体用于:提取所述唤醒词的特征信息;检测并确定所述唤醒词的特征信息符合预设标准,确定训练唤醒词成功。As a possible implementation manner, the training module 330 is specifically configured to: extract feature information of the awake word; detect and determine that the feature information of the awake word meets a preset standard, and determine that training of the awake word is successful.
作为一种可能的实现方式,训练模块330,还用于:提取第M次输入的唤醒词的特征信息;检测并确定第M次输入的唤醒词的特征信息符合预设标准,将第M次输入的唤醒词的特征信息分别与前M-1次输入的唤醒词的特征信息进行相似度计算;检测并确定第M次输入的唤醒词的特征信息与前M-1次输入的唤醒词的特征信息的相似度均大于预设相似度,确定训练唤醒词成功。As a possible implementation manner, the training module 330 is further configured to: extract feature information of the wake-up word inputted at the Mth time; detect and determine that the feature information of the wake-up word inputted at the Mth time conforms to a preset standard, and convert the Mth time The feature information of the awake words input is calculated similarly to the feature information of the awake words inputted before M-1 times; the feature information of the awake word inputted the Mth times and the wakeup word of the first M-1 times are detected and determined. The similarity of the feature information is greater than the preset similarity, and it is determined that the training awakening word is successful.
In another embodiment of the present disclosure, as shown in FIG. 16, the wake-up word training device for a home appliance may further include a second saving module 340.
The second saving module 340 is configured to determine, after the N-th training of the wake-up word is successful, that the wake-up word takes effect, and to save the effective wake-up word locally.
In yet another embodiment of the present disclosure, as shown in FIG. 17, the wake-up word training device for a home appliance may further include a second receiving module 350 and a second wake-up module 360.
The second receiving module 350 is configured to receive the input effective wake-up word after it is determined that the wake-up word has taken effect.
The second wake-up module 360 is configured to wake up the home appliance according to the effective wake-up word.
It should be noted that the foregoing explanation of the wake-up word training method for a home appliance also applies to the wake-up word training device for a home appliance in the embodiments of the present disclosure. Details not disclosed in the device embodiments are not repeated here.
The wake-up word training device for a home appliance in the embodiments of the present disclosure controls the home appliance to enter a custom wake-up word mode, collects an input wake-up word, trains the wake-up word, detects and determines that training of the wake-up word is successful, and then performs the next training until the N-th training of the wake-up word is successful. In this way, users can define their own wake-up words, which meets their personalized needs, and the trained wake-up word has high accuracy.
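The repeated collect-and-train loop summarized above can be sketched as follows. This is a sketch under stated assumptions rather than the disclosed implementation: the disclosure does not say what happens when a round fails, so retrying with the next utterance up to a hypothetical `max_attempts` limit is an assumption, and `train_wake_word`, `extract_features`, and `round_succeeds` are illustrative names.

```python
from typing import Callable, Iterable, List, Optional, Sequence

Vector = Sequence[float]


def train_wake_word(
    utterances: Iterable[Sequence[float]],
    extract_features: Callable[[Sequence[float]], Vector],
    round_succeeds: Callable[[Vector, List[Vector]], bool],
    n_rounds: int,
    max_attempts: int = 10,  # hypothetical retry limit, not in the source
) -> Optional[List[Vector]]:
    """Collect and train until N rounds have succeeded.

    Returns the feature information of the N accepted inputs, which a
    caller could then save locally as the effective wake-up word, or
    None if N successful rounds were not reached.
    """
    accepted: List[Vector] = []
    attempts = 0
    for utterance in utterances:
        if attempts >= max_attempts:
            break
        attempts += 1
        features = extract_features(utterance)
        # A round succeeds only if the caller-supplied check passes
        # against the inputs accepted so far.
        if round_succeeds(features, accepted):
            accepted.append(features)
            if len(accepted) == n_rounds:
                return accepted
    return None
```

Passing the success check in as a callable keeps the loop independent of how the preset standard and similarity threshold are defined.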
To implement the above embodiments, the present disclosure further provides a non-transitory computer-readable storage medium storing a computer program that, when executed by a processor, implements the wake-up word training method for a home appliance provided in the foregoing embodiments of the present disclosure.
To implement the above embodiments, the present disclosure further provides a home appliance including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the processor is configured to execute the wake-up word training method for a home appliance provided in the foregoing embodiments of the present disclosure.
In this specification, a description that refers to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present disclosure. Such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict each other, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present disclosure, "a plurality of" means at least two, for example two or three, unless specifically defined otherwise.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process. The scope of the preferred embodiments of the present disclosure includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in the reverse order, depending on the functions involved. This should be understood by those skilled in the art to which the embodiments of the present disclosure belong.
The logic and/or steps represented in a flowchart or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any apparatus that can contain, store, communicate, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that the parts of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the present disclosure have been shown and described above, it should be understood that these embodiments are exemplary and are not to be construed as limiting the present disclosure. Those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present disclosure.

Claims (26)

  1. A wake-up word training method for a home appliance, characterized by comprising:
    collecting a voice data sample of a wake-up word;
    extracting feature information of the voice data sample;
    normalizing the feature information, and detecting and determining that the normalized feature information satisfies a preset condition; and
    saving the voice data sample of the wake-up word to a custom wake-up word library.
  2. The method according to claim 1, further comprising:
    detecting and determining that the normalized feature information does not satisfy the preset condition, and re-collecting the voice data sample of the wake-up word.
  3. The method according to claim 1 or 2, further comprising:
    after collecting the voice data sample of the wake-up word, denoising the voice data sample.
  4. The method according to any one of claims 1-3, further comprising, before collecting the voice data sample of the wake-up word:
    controlling the home appliance to enter a custom wake-up word mode.
  5. The method according to any one of claims 1-4, further comprising:
    receiving input voice information;
    detecting and determining, based on the custom wake-up word library, that the voice information is a custom wake-up word; and
    generating a wake-up instruction, and waking up the home appliance according to the wake-up instruction.
  6. The method according to claim 5, wherein detecting and determining, based on the custom wake-up word library, that the voice information is a custom wake-up word comprises:
    extracting feature information of the voice information;
    normalizing the feature information of the voice information, and comparing, by using a dynamic time planning algorithm, the feature information of the voice information with the feature information of the wake-up words in the custom wake-up word library;
    obtaining the comparison result with the highest similarity; and
    detecting and determining that the comparison result with the highest similarity satisfies a set value, and determining that the voice information is a custom wake-up word.
  7. The method according to claim 5 or 6, further comprising:
    detecting and determining that the voice information is not a custom wake-up word, and prompting for the voice information to be input again.
  8. A wake-up word training method for a home appliance, characterized by comprising:
    controlling the home appliance to enter a custom wake-up word mode;
    collecting an input wake-up word;
    training the wake-up word, and detecting and determining that training of the wake-up word is successful; and
    performing the next wake-up word collection and training until the N-th training of the wake-up word is successful, where N is a positive integer.
  9. The method according to claim 8, further comprising:
    after the N-th training of the wake-up word is successful, determining that the wake-up word takes effect, and saving the effective wake-up word locally.
  10. The method according to claim 9, further comprising:
    after determining that the wake-up word has taken effect, receiving the input effective wake-up word; and
    waking up the home appliance according to the effective wake-up word.
  11. The method according to any one of claims 8-10, wherein training the wake-up word and detecting and determining that training of the wake-up word is successful comprises:
    extracting feature information of the wake-up word; and
    detecting and determining that the feature information of the wake-up word meets a preset standard, and determining that training of the wake-up word is successful.
  12. The method according to any one of claims 8-11, wherein performing the next wake-up word collection and training until the N-th training of the wake-up word is successful comprises:
    extracting feature information of the wake-up word input for the M-th time;
    detecting and determining that the feature information of the M-th input meets the preset standard, and calculating the similarity between the feature information of the M-th input and the feature information of each of the previous M-1 inputs; and
    detecting and determining that the similarities between the feature information of the M-th input and the feature information of the previous M-1 inputs are all greater than a preset similarity, and determining that training of the wake-up word is successful.
  13. A wake-up word training device for a home appliance, characterized by comprising:
    a first collection module, configured to collect a voice data sample of a wake-up word;
    an extraction module, configured to extract feature information of the voice data sample;
    a judging module, configured to normalize the feature information, and to detect and determine that the normalized feature information satisfies a preset condition; and
    a first saving module, configured to save the voice data sample of the wake-up word to a custom wake-up word library.
  14. The device according to claim 13, wherein the first collection module is further configured to:
    detect and determine that the normalized feature information does not satisfy the preset condition, and re-collect the voice data sample of the wake-up word.
  15. The device according to claim 13 or 14, further comprising:
    a preprocessing module, configured to denoise the voice data sample after the voice data sample of the wake-up word is collected.
  16. The device according to any one of claims 13-15, further comprising:
    a first control module, configured to control the home appliance to enter a custom wake-up word mode before the voice data sample of the wake-up word is collected.
  17. The device according to any one of claims 13-16, further comprising:
    a first receiving module, configured to receive input voice information;
    a recognition module, configured to detect and determine, based on the custom wake-up word library, that the voice information is a custom wake-up word; and
    a first wake-up module, configured to generate a wake-up instruction and wake up the home appliance according to the wake-up instruction.
  18. The device according to claim 17, wherein the recognition module is configured to:
    extract feature information of the voice information;
    normalize the feature information of the voice information, and compare, by using a dynamic time planning algorithm, the feature information of the voice information with the feature information of the wake-up words in the custom wake-up word library;
    obtain the comparison result with the highest similarity; and
    detect and determine that the comparison result with the highest similarity satisfies a set value, and determine that the voice information is a custom wake-up word.
  19. The device according to claim 17 or 18, further comprising:
    a prompting module, configured to detect and determine that the voice information is not a custom wake-up word, and to prompt for the voice information to be input again.
  20. A wake-up word training device for a home appliance, characterized by comprising:
    a second control module, configured to control the home appliance to enter a custom wake-up word mode;
    a second collection module, configured to collect an input wake-up word; and
    a training module, configured to train the wake-up word, and to detect and determine that training of the wake-up word is successful, wherein the second collection module performs the next collection and the training module performs the next training until the N-th training of the wake-up word is successful, where N is a positive integer.
  21. The device according to claim 20, further comprising:
    a second saving module, configured to determine, after the N-th training of the wake-up word is successful, that the wake-up word takes effect, and to save the effective wake-up word locally.
  22. The device according to claim 21, further comprising:
    a second receiving module, configured to receive the input effective wake-up word after it is determined that the wake-up word has taken effect; and
    a second wake-up module, configured to wake up the home appliance according to the effective wake-up word.
  23. The device according to any one of claims 20-22, wherein the training module is configured to:
    extract feature information of the wake-up word; and
    detect and determine that the feature information of the wake-up word meets a preset standard, and determine that training of the wake-up word is successful.
  24. The device according to any one of claims 20-23, wherein the training module is further configured to:
    extract feature information of the wake-up word input for the M-th time;
    detect and determine that the feature information of the M-th input meets the preset standard, and calculate the similarity between the feature information of the M-th input and the feature information of each of the previous M-1 inputs; and
    detect and determine that the similarities between the feature information of the M-th input and the feature information of the previous M-1 inputs are all greater than a preset similarity, and determine that training of the wake-up word is successful.
  25. A non-transitory computer-readable storage medium storing a computer program that, when executed by a processor, implements the wake-up word training method for a home appliance according to any one of claims 1-7, or implements the wake-up word training method for a home appliance according to any one of claims 8-12.
  26. A home appliance, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor is configured to execute the wake-up word training method for a home appliance according to any one of claims 1-7, or to execute the wake-up word training method for a home appliance according to any one of claims 8-12.
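The comparison step recited in claims 6 and 18 can be illustrated with a short sketch. It assumes that the "dynamic time planning algorithm" refers to classic dynamic time warping (DTW), which the source does not state explicitly. It also assumes Euclidean distance between feature frames and treats the "set value" as an upper bound on the DTW distance (a lower distance meaning a higher similarity). The names `dtw_distance`, `match_wake_word`, and `max_distance` are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence

Frame = Sequence[float]


def frame_distance(a: Frame, b: Frame) -> float:
    """Euclidean distance between two feature frames (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(s: Sequence[Frame], t: Sequence[Frame]) -> float:
    """Classic dynamic time warping distance between two frame sequences."""
    n, m = len(s), len(t)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(s[i - 1], t[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = d + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def match_wake_word(
    query: Sequence[Frame],
    library: Dict[str, Sequence[Frame]],
    max_distance: float,  # hypothetical stand-in for the "set value"
) -> Optional[str]:
    """Return the best-matching library entry, or None if no entry is
    close enough to count as a custom wake-up word."""
    best_name: Optional[str] = None
    best_dist = math.inf
    for name, template in library.items():
        dist = dtw_distance(query, template)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None
```

DTW tolerates differences in speaking rate because one frame of the query may align with several frames of a stored template, and vice versa.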
PCT/CN2019/074317 2018-06-19 2019-02-01 Wakeup word training method and device of household appliance, and household appliance WO2019242312A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201810628693.7 2018-06-19
CN201810628693.7A CN109036393A (en) 2018-06-19 2018-06-19 Wake-up word training method, device and the household appliance of household appliance
CN201810885079.9A CN109166571B (en) 2018-08-06 2018-08-06 Household appliance awakening word training method and device and household appliance
CN201810885079.9 2018-08-06

Publications (1)

Publication Number Publication Date
WO2019242312A1 true WO2019242312A1 (en) 2019-12-26

Family

ID=68983484

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/074317 WO2019242312A1 (en) 2018-06-19 2019-02-01 Wakeup word training method and device of household appliance, and household appliance

Country Status (1)

Country Link
WO (1) WO2019242312A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104282307A (en) * 2014-09-05 2015-01-14 中兴通讯股份有限公司 Method, device and terminal for awakening voice control system
US20150154953A1 (en) * 2013-12-02 2015-06-04 Spansion Llc Generation of wake-up words
CN104795068A (en) * 2015-04-28 2015-07-22 深圳市锐曼智能装备有限公司 Robot awakening control method and robot awakening control system
US20160293168A1 (en) * 2015-03-30 2016-10-06 Opah Intelligence Ltd. Method of setting personal wake-up word by text for voice control
CN106161755A (en) * 2015-04-20 2016-11-23 钰太芯微电子科技(上海)有限公司 A kind of key word voice wakes up system and awakening method and mobile terminal up
CN107369439A (en) * 2017-07-31 2017-11-21 北京捷通华声科技股份有限公司 A kind of voice awakening method and device
CN109036393A (en) * 2018-06-19 2018-12-18 广东美的厨房电器制造有限公司 Wake-up word training method, device and the household appliance of household appliance
CN109166571A (en) * 2018-08-06 2019-01-08 广东美的厨房电器制造有限公司 Wake-up word training method, device and the household appliance of household appliance



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19822929

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.05.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19822929

Country of ref document: EP

Kind code of ref document: A1