CN111386566A - Device control method, cloud device, intelligent device, computer medium and device - Google Patents
- Publication number
- CN111386566A (application CN201780097241.4A)
- Authority
- CN
- China
- Prior art keywords
- awakening
- intelligent
- word
- same
- words
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L15/26—Speech to text systems
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Artificial Intelligence (AREA)
- Telephonic Communication Services (AREA)
Abstract
Disclosed herein are a device control method, a cloud device, an intelligent device, a computer medium, and a computer device. The method, as applied to the cloud device, includes: acquiring the wake-up words of a plurality of intelligent devices in the same voice receiving area, where each intelligent device can be woken by its own wake-up word; detecting whether the wake-up words of the plurality of intelligent devices include the same wake-up word; and, if they do, generating an instruction for modifying the same wake-up word. The method can thereby identify intelligent devices that are in the same voice receiving area and use the same wake-up word, and either modify the wake-up word directly or notify the intelligent devices to modify it. This prevents intelligent devices from sharing a wake-up word, ensures that the user can control each device accurately, and improves the user experience.
Description
Embodiments of the invention relate to the field of Internet technology, and in particular to a device control method, a cloud device, an intelligent device, a computer medium and a computer device.
With the development of technology, smart home devices of many kinds are becoming widely used. Existing smart home devices generally support voice wake-up. For example, the user speaks a wake-up word, the smart home device captures the voice signal with its voice acquisition apparatus, and if the content of the signal is recognized as the wake-up word set for that device, the device turns on automatically. However, as the number of smart home devices in a home grows, several devices may use the same wake-up word. The user then cannot accurately control the intended device by voice, which inconveniences the user and interferes with the execution of commands.
Disclosure of Invention
To solve the above technical problem, a device control method, a cloud device, an intelligent device, a computer medium and a computer device are provided herein.
The device control method provided herein is applied to the cloud device and comprises the following steps:
acquiring awakening words of a plurality of intelligent devices in the same voice receiving area, wherein the plurality of intelligent devices can be awakened through the respective awakening words;
detecting whether the awakening words of the plurality of intelligent devices have the same awakening word;
and generating an instruction for modifying the same awakening word under the condition that the same awakening word exists in the awakening words of the plurality of intelligent devices.
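The three steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary layout, the function names and the `modify_wake_word` action string are all assumptions made for the example.

```python
from collections import defaultdict


def find_duplicate_wake_words(wake_words):
    """Map each wake-up word used by two or more devices to those devices.

    wake_words: dict of device id -> wake-up word (hypothetical layout).
    """
    devices_by_word = defaultdict(list)
    for device_id, word in wake_words.items():
        devices_by_word[word].append(device_id)
    # Keep only the words that more than one device reported.
    return {w: ids for w, ids in devices_by_word.items() if len(ids) > 1}


def generate_modify_instructions(wake_words):
    """Build one 'modify' instruction per device that shares a wake-up word."""
    instructions = []
    for word, device_ids in find_duplicate_wake_words(wake_words).items():
        for device_id in device_ids:
            instructions.append(
                {
                    "device_id": device_id,
                    "action": "modify_wake_word",  # hypothetical action name
                    "current": word,
                }
            )
    return instructions
```

If no two devices report the same wake-up word, `generate_modify_instructions` returns an empty list and nothing is sent.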
The equipment control method also has the following characteristics:
after generating the instruction for modifying the same wake-up word, the method further comprises: determining, according to a wake-up word update rule, the new wake-up words that replace the same wake-up word.
The equipment control method also has the following characteristics:
the wake word update rule includes adding suffix words having a sequential relationship to the wake words.
The equipment control method also has the following characteristics:
while acquiring the wake-up words of the plurality of intelligent devices in the same voice receiving area, the method further comprises:
acquiring the hardware addresses of the routers to which the plurality of intelligent devices in the same voice receiving area belong;
detecting whether the wake-up words of the plurality of intelligent devices include the same wake-up word comprises:
detecting, according to the hardware addresses of the routers to which the plurality of intelligent devices belong, whether any of the intelligent devices belong to the same router;
if intelligent devices belonging to the same router exist among the plurality of intelligent devices, detecting whether the wake-up words of those devices include the same wake-up word;
generating the instruction for modifying the same wake-up word when the same wake-up word exists among the wake-up words of the plurality of intelligent devices comprises:
generating the instruction for modifying the same wake-up word when the same wake-up word exists among the wake-up words of the intelligent devices belonging to the same router.
A computer-readable storage medium is provided herein, having stored thereon a computer program which, when executed by a processor, performs the steps of the above-described method.
A computer device is provided herein, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the program.
The cloud device provided herein includes:
a first acquisition module, configured to acquire the wake-up words of a plurality of intelligent devices in the same voice receiving area, where each intelligent device can be woken by its own wake-up word;
the first detection module is used for detecting whether the awakening words of the plurality of intelligent devices have the same awakening word;
the first generation module is used for generating an instruction for modifying the same awakening word under the condition that the same awakening word exists in the awakening words of the plurality of intelligent devices.
The cloud equipment further has the following characteristics:
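the cloud device further comprises a second acquisition module, configured to acquire the hardware addresses of the routers to which the plurality of intelligent devices in the same voice receiving area belong while the first acquisition module acquires their wake-up words;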
the first detection module comprises a second detection module and is used for detecting whether intelligent equipment belonging to the same router exists in the plurality of intelligent equipment or not according to the hardware address of the router belonging to the plurality of intelligent equipment, and detecting whether the same awakening word exists in the awakening words of the intelligent equipment belonging to the same router or not under the condition that the intelligent equipment belonging to the same router exists in the plurality of intelligent equipment;
the first generation module comprises a second generation module and is used for generating an instruction for modifying the same awakening word under the condition that the same awakening word exists in the awakening words of the intelligent devices belonging to the same router.
The device control method provided by the invention is applied to intelligent devices and comprises the following steps:
sending a wake-up word of the intelligent device to the cloud device, wherein the intelligent device can be woken up through the wake-up word;
when the cloud device detects that the awakening words of other intelligent devices in the same voice receiving area with the intelligent device are the same as the awakening words of the intelligent device, receiving and storing a new awakening word sent by the cloud device or receiving an instruction sent by the cloud device for modifying the awakening words of the intelligent device.
The equipment control method also has the following characteristics:
after receiving an instruction sent by the cloud device to modify a wakeup word of the smart device, the method further includes: and determining a new awakening word according to the awakening word updating rule or receiving and storing the new awakening word set by the user according to the instruction.
The equipment control method also has the following characteristics:
after the new wake word is determined, the method further comprises: and changing the pre-constructed voice recognition decoding network according to the new awakening words.
The equipment control method also has the following characteristics:
changing the pre-constructed speech recognition decoding network according to the new awakening word, comprising the following steps:
determining a triphone node sequence corresponding to the new wake-up word, determining the state index value corresponding to each triphone in the triphone node sequence according to the correspondence between triphones and model output state index values in a pre-constructed acoustic model, and forming a state index value sequence; the pre-constructed acoustic model is a model obtained by training on full phonemes, garbage phonemes and noise phonemes using a first criterion and a second criterion, where the first criterion is the maximum likelihood estimation criterion and the second criterion is a discriminative training criterion that uses the minimum phoneme error criterion as its optimization criterion;
and replacing the state index value sequence corresponding to the original awakening word in the pre-constructed voice recognition decoding network with the state index value sequence corresponding to the new awakening word.
The equipment control method also has the following characteristics:
after the pre-constructed speech recognition decoding network is changed according to the new awakening word, the method further comprises the following steps:
receiving voice data input by a user; extracting voice features from the voice data; decoding the voice characteristics by using a voice recognition decoding network to obtain a decoding result; in case that the decoding result includes a new wake-up word, a wake-up operation is performed.
The equipment control method also has the following characteristics:
when sending the awakening word of the intelligent device to the cloud device, the method further comprises the following steps: sending a hardware address of a router to which the intelligent device belongs to the cloud device;
receiving and storing a new wake-up word sent by the cloud device, or receiving an instruction sent by the cloud device for modifying the wake-up word of the intelligent device, when the cloud device detects that the wake-up word of another intelligent device in the same voice receiving area is the same as the wake-up word of the intelligent device, comprises:
when the cloud device detects that the wake-up word of another intelligent device that is in the same voice receiving area as the intelligent device and whose router has the same hardware address is the same as the wake-up word of the intelligent device, receiving and storing a new wake-up word sent by the cloud device, or receiving an instruction sent by the cloud device for modifying the wake-up word of the intelligent device.
Another computer-readable storage medium provided herein, having stored thereon a computer program which, when executed by a processor, performs the steps of the above-described method.
Another computer device provided herein comprises a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the program.
A smart device provided herein comprises:
a first sending module, configured to send the wake-up word of the intelligent device to the cloud device, where the intelligent device can be woken by the wake-up word;
the first receiving module is used for receiving a new awakening word sent by the cloud equipment or receiving an instruction sent by the cloud equipment for modifying the awakening word of the intelligent equipment when the cloud equipment detects that the awakening words of other intelligent equipment in the same voice receiving area with the intelligent equipment are the same as the awakening words of the intelligent equipment.
The intelligent equipment also has the following characteristics:
the smart device further includes:
the second receiving module is used for receiving and storing a new awakening word set by the user according to the instruction after the first receiving module receives the instruction for modifying the awakening word of the intelligent device, which is sent by the cloud device;
the awakening word updating module is used for determining a new awakening word according to the awakening word updating rule after the first receiving module receives the instruction for modifying the awakening word of the intelligent device, which is sent by the cloud device.
The intelligent equipment also has the following characteristics:
the intelligent device also comprises a network updating module used for changing a pre-constructed voice recognition decoding network according to the new awakening words after the new awakening words are determined;
the network updating module comprises a computing module and a network maintenance module;
the calculation module is used for determining a triphone node sequence corresponding to the new awakening word, determining a state index value corresponding to each triphone in the triphone node sequence according to the corresponding relation between the triphone in the pre-constructed acoustic model and the model output state index value, and forming a state index value sequence;
and the network maintenance module is used for replacing the state index value sequence corresponding to the original awakening word in the pre-constructed voice recognition decoding network with the state index value sequence corresponding to the new awakening word.
The intelligent equipment also has the following characteristics:
the intelligent device further comprises a training module, configured to obtain the pre-constructed acoustic model by training on full phonemes, garbage phonemes and noise phonemes using a first criterion and a second criterion, where the first criterion is the maximum likelihood estimation criterion and the second criterion is a discriminative training criterion that uses the minimum phoneme error criterion as its optimization criterion.
The intelligent equipment also has the following characteristics:
the smart device further includes:
the second sending module is used for sending the hardware address of the router to which the intelligent device belongs to the cloud device while the first sending module sends the awakening word of the intelligent device to the cloud device;
the first receiving module comprises a third receiving module and is used for receiving and storing a new awakening word sent by the cloud equipment or receiving an instruction sent by the cloud equipment for modifying the awakening word of the intelligent equipment when the cloud equipment detects that the awakening words of other intelligent equipment which are in the same voice receiving area with the intelligent equipment and have the same hardware address with the router are the same as the awakening words of the intelligent equipment.
The method can effectively identify a plurality of intelligent devices which are in the same voice receiving area and use the same awakening words, generate an instruction for modifying the same awakening words, and then actively modify or inform the intelligent devices of modifying the awakening words according to the instruction, so that the situation that the same awakening words appear among the intelligent devices is prevented, accurate control of the intelligent devices by a user is guaranteed, and user experience is improved.
The accompanying drawings described here provide a further understanding of the disclosure and form a part of this application. The illustrative embodiments and their description explain the disclosure and do not limit it. In the drawings:
fig. 1 is a flowchart of the device control method applied to the cloud device in the first embodiment;
fig. 2 is a structural diagram of the cloud device in the first embodiment;
fig. 3 is a flowchart of the device control method applied to the intelligent device in the second embodiment;
fig. 4 is a structural diagram of the intelligent device in the second embodiment.
Hereinafter, the present invention is described in detail with reference to the accompanying drawings and the embodiments. It should be noted that the embodiments in this application, and the features of those embodiments, may be combined with each other where they do not conflict.
The intelligent device in the embodiment of the invention is typically an intelligent household appliance.
Embodiment one
As shown in fig. 1, the device control method applied to the cloud device includes:
step 101, acquiring the wake-up words of a plurality of intelligent devices in the same voice receiving area, where each intelligent device can be woken by its own wake-up word;
step 102, detecting whether the wake-up words of the plurality of intelligent devices include the same wake-up word;
step 103, generating an instruction for modifying the same wake-up word when the same wake-up word exists among the wake-up words of the plurality of intelligent devices.
The embodiment of the invention can effectively identify a plurality of intelligent devices which are positioned in the same voice receiving area and use the same awakening word, generate the instruction for modifying the same awakening word, and actively modify or inform the intelligent devices of modifying the awakening word according to the instruction, thereby preventing the same awakening word from appearing among the intelligent devices, ensuring the accurate control of the user on the intelligent devices and improving the user experience.
The method may further comprise step 104: determining, according to a wake-up word update rule, the new wake-up words that replace the same wake-up word. The wake-up word update rule includes adding suffix words having a sequential relationship to the wake-up word. For example, if the original wake-up word is "power on", the updated wake-up words are "power on 1", "power on 2", "power on 3", and so on; or the updated wake-up words are "power on A", "power on B", "power on C", and so on.
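The suffix rule can be illustrated with a short sketch. The function name and its parameters are hypothetical; the text specifies only that suffixes with a sequential relationship are appended, so the default numbering and the error raised when suffixes run out are assumptions.

```python
from itertools import count


def apply_suffix_rule(base_word, device_ids, suffixes=None):
    """Give each device the shared wake-up word plus the next suffix in order.

    suffixes: any ordered iterable (for example "ABC"); defaults to 1, 2, 3, ...
    """
    suffix_iter = iter(count(1) if suffixes is None else suffixes)
    new_words = {}
    for device_id in device_ids:
        try:
            suffix = next(suffix_iter)
        except StopIteration:
            # Assumption: running out of suffixes is treated as an error.
            raise ValueError("not enough suffixes for all devices") from None
        new_words[device_id] = f"{base_word} {suffix}"
    return new_words
```

For example, `apply_suffix_rule("power on", ["tv", "fan"], "AB")` assigns "power on A" to the first device and "power on B" to the second.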
According to the above description, in this embodiment the wake-up word may be updated by the cloud device. The embodiments of the invention provide more than one way of updating, so that the user can select a suitable way according to need, which improves the user experience.
In step 101, while acquiring the wake-up words of the plurality of intelligent devices in the same voice receiving area, the method further includes: acquiring the hardware addresses of the routers to which the plurality of intelligent devices in the same voice receiving area belong.
In step 102, detecting whether the wake-up words of the plurality of intelligent devices include the same wake-up word includes: detecting, according to the hardware addresses of the routers to which the plurality of intelligent devices belong, whether any of the intelligent devices belong to the same router; and, if intelligent devices belonging to the same router exist, detecting whether the wake-up words of those devices include the same wake-up word.
In step 103, generating the instruction for modifying the same wake-up word when the same wake-up word exists among the wake-up words of the plurality of intelligent devices includes: generating the instruction for modifying the same wake-up word when the same wake-up word exists among the wake-up words of the intelligent devices belonging to the same router.
In the above method, intelligent devices that are in the same local area network and use the same wake-up word are treated as intelligent devices in the same voice receiving area. Devices in the same local area network are generally located in the same small area (for example, a home or a company) and are close to one another, so if several of them use the same wake-up word, all of them may receive that word when the user speaks it.
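A minimal sketch of the router-based grouping follows. The tuple layout, the function name and the lower-casing of the hardware address are assumptions made for the example; the text specifies only that devices are grouped by the hardware address of their router before wake-up words are compared.

```python
from collections import defaultdict


def duplicates_per_router(reports):
    """Find wake-up words shared by devices behind the same router.

    reports: iterable of (device_id, router_mac, wake_word) tuples
    (hypothetical layout). Returns {router_mac: {wake_word: [device ids]}}.
    """
    by_router = defaultdict(lambda: defaultdict(list))
    for device_id, router_mac, wake_word in reports:
        # Assumption: hardware addresses are compared case-insensitively.
        by_router[router_mac.lower()][wake_word].append(device_id)

    conflicts = {}
    for mac, words in by_router.items():
        shared = {w: ids for w, ids in words.items() if len(ids) > 1}
        if shared:
            conflicts[mac] = shared
    return conflicts
```

Devices that use the same wake-up word but sit behind different routers are not reported, which matches the idea that they are unlikely to be in the same voice receiving area.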
The first embodiment further includes a computer readable storage medium, on which a computer program is stored, and the program is executed by a processor to implement the steps of the method.
The first embodiment further includes a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the steps of the method are implemented.
As shown in fig. 2, the cloud device in the first embodiment includes:
a first acquisition module, configured to acquire the wake-up words of a plurality of intelligent devices in the same voice receiving area, where each intelligent device can be woken by its own wake-up word;
the first detection module is used for detecting whether the awakening words of the plurality of intelligent devices have the same awakening word;
the first generation module is used for generating an instruction for modifying the same awakening word under the condition that the same awakening word exists in the awakening words of the plurality of intelligent devices.
The cloud device further comprises a second acquisition module, which is used for acquiring the hardware addresses of the routers of the plurality of intelligent devices in the same voice receiving area when the first acquisition module acquires the awakening words of the plurality of intelligent devices in the same voice receiving area.
The first detection module comprises a second detection module and is used for detecting whether intelligent equipment belonging to the same router exists in the plurality of intelligent equipment or not according to the hardware address of the router belonging to the plurality of intelligent equipment, and detecting whether the same awakening word exists in the awakening words of the intelligent equipment belonging to the same router or not under the condition that the intelligent equipment belonging to the same router exists in the plurality of intelligent equipment.
The first generation module comprises a second generation module and is used for generating an instruction for modifying the same awakening word under the condition that the same awakening word exists in the awakening words of the intelligent devices belonging to the same router.
The cloud device further includes a wake-up word modification module, configured to determine, according to the wake-up word update rule, the new wake-up words that replace the same wake-up word. The wake-up word update rule includes adding suffix words having a sequential relationship to the wake-up word. For example, if the original wake-up word is "power on", the updated wake-up words are "power on 1", "power on 2", "power on 3", and so on; or the updated wake-up words are "power on A", "power on B", "power on C", and so on.
Embodiment two
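As shown in fig. 3, the device control method applied to the intelligent device includes:
step 301, sending the wake-up word of the intelligent device to the cloud device, where the intelligent device can be woken by the wake-up word;
step 302, when the cloud device detects that the wake-up word of another intelligent device in the same voice receiving area is the same as the wake-up word of the intelligent device, receiving and storing a new wake-up word sent by the cloud device, or receiving an instruction sent by the cloud device for modifying the wake-up word of the intelligent device.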
In the method, after the instruction for modifying the wake-up word of the intelligent device, sent by the cloud device, is received in step 302, the method further includes: determining a new wake-up word according to the wake-up word update rule, or receiving and storing a new wake-up word set by the user in response to the instruction.
In the second embodiment, after the intelligent device receives the instruction for modifying its wake-up word, the wake-up word may be updated by the intelligent device itself or by the user.
To enable the cloud device to detect whether the wake-up words of a plurality of intelligent devices include the same wake-up word, the method further includes, while sending the wake-up word of the intelligent device to the cloud device: sending the hardware address of the router to which the intelligent device belongs to the cloud device. In this case, step 302 specifically includes: when the cloud device detects that the wake-up word of another intelligent device that is in the same voice receiving area as the intelligent device and whose router has the same hardware address is the same as the wake-up word of the intelligent device, receiving and storing a new wake-up word sent by the cloud device, or receiving an instruction sent by the cloud device for modifying the wake-up word of the intelligent device.
After the new wake-up word is determined in step 302, the method further includes: changing the pre-constructed speech recognition decoding network according to the new wake-up word.
It should be noted that the new wake-up word may be determined in any of three ways: the intelligent device receives and stores the new wake-up word sent by the cloud device; the intelligent device determines the new wake-up word according to the wake-up word update rule; or the intelligent device receives and stores the new wake-up word set by the user in response to the instruction.
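The ways of determining the new wake-up word can be sketched as a single dispatch function on the intelligent device. The message keys, the type strings and the preference for a user-set word over the update rule are assumptions made for illustration; the text does not prescribe a message format or a priority between the two.

```python
def resolve_new_wake_word(message, current_word, update_rule=None, ask_user=None):
    """Return the wake-up word the device should store after a cloud message.

    message: dict with hypothetical keys "type" and, optionally, "wake_word".
    update_rule: callable mapping the current word to a new one.
    ask_user: callable returning a user-chosen word, or None if the user declines.
    """
    if message["type"] == "new_wake_word":
        # The cloud device already chose the replacement.
        return message["wake_word"]
    if message["type"] == "modify_wake_word":
        # Assumption: a user-set word takes priority over the update rule.
        if ask_user is not None:
            chosen = ask_user(current_word)
            if chosen:
                return chosen
        if update_rule is not None:
            return update_rule(current_word)
        raise ValueError("no way to determine a new wake-up word")
    # Unrelated messages leave the wake-up word unchanged.
    return current_word
```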
To implement the wake-up function, the intelligent device builds an acoustic model in advance. The pre-constructed acoustic model is obtained by training on full phonemes, garbage phonemes and noise phonemes using a first criterion and a second criterion. The first criterion is the Maximum Likelihood Estimation (MLE) criterion, and the second criterion is a discriminative training criterion that uses the Minimum Phoneme Error (MPE) criterion as its optimization criterion.
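For orientation, the two criteria are commonly written as below. These are the standard textbook forms, not formulas taken from this document, and every symbol is an assumption introduced here: O_r is the observation sequence of training utterance r, s_r its reference transcription, \lambda the model parameters, \kappa an acoustic scaling factor, P(s) a prior over candidate phoneme sequences s, and A(s, s_r) the phoneme accuracy of s measured against s_r.

```latex
% Maximum likelihood estimation over R training utterances
\hat{\lambda}_{\mathrm{MLE}} = \arg\max_{\lambda} \sum_{r=1}^{R} \log p_{\lambda}(O_r \mid s_r)

% Minimum phoneme error objective (maximized): expected phoneme accuracy
F_{\mathrm{MPE}}(\lambda) = \sum_{r=1}^{R}
  \frac{\sum_{s} p_{\lambda}(O_r \mid s)^{\kappa}\, P(s)\, A(s, s_r)}
       {\sum_{u} p_{\lambda}(O_r \mid u)^{\kappa}\, P(u)}
```

In the usual recipe the MLE-trained model serves as the starting point and the MPE objective then refines it discriminatively, which is consistent with the first and second criteria named above.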
When the acoustic model is constructed, the full phonemes in the full-phoneme library are clustered according to pronunciation variation rules into a garbage phoneme library consisting of M consonant phonemes and N vowel phonemes, which is used to train the garbage model. Because the garbage phoneme model is trained on a mixture of several normal phonemes, when a wake-up word is encountered during decoding, the scores of the input speech's acoustic features on the garbage phonemes cannot compete with their scores on the phonemes of the wake-up word. When speech other than the wake-up word is encountered, the garbage phonemes contain the corresponding phonemes and therefore compete naturally with the phonemes of the wake-up word.
Compared with acoustic models common in the prior art, this acoustic model adds training on noise phonemes, which can improve its noise robustness. A further major advantage of building a full-phoneme acoustic model is that no specific wake-up word needs to be specified during training: a full-phoneme Hidden Markov Model (HMM) with complete coverage can be generated from the data set using the maximum likelihood method alone.
When the range of the wake-up words is not known, a full-phoneme speech library is used to train the full-phoneme acoustic model. When the range of the wake-up words is known in advance, speech data covering that range can be collected to build a wake-up word speech library, and the acoustic model is updated with this library so that it matches the wake-up words better.
After the acoustic model is trained successfully, the corresponding relation between the triphone of each pronunciation and the model output state index value is stored.
The intelligent device also constructs a decoding network, which comprises, in parallel, garbage phoneme index value nodes, noise phoneme index value nodes, and the state index value sequence corresponding to the wake-up word currently in use. The decoding network is a looping network: it jumps from the exit node back to the entry node, so that multiple consecutive speech segments can be covered.
The method for changing the pre-constructed speech recognition decoding network according to the new awakening words comprises the following steps: determining a triphone node sequence corresponding to the new awakening word, determining a state index value corresponding to each triphone in the triphone node sequence according to a corresponding relation between the triphone in the pre-constructed acoustic model and the model output state index value, and forming a state index value sequence; and replacing the state index value sequence corresponding to the original awakening word in the pre-constructed voice recognition decoding network with the state index value sequence corresponding to the new awakening word.
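The replacement step can be sketched as follows. This is a minimal illustration, not the patented implementation: the `DecodingNetwork` class, the `update_wake_word` function, the numeric index values, and the one-index-per-triphone table are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DecodingNetwork:
    """Hypothetical container for the three parallel branches of the network."""
    garbage_states: List[int]
    noise_states: List[int]
    wake_word_states: List[int] = field(default_factory=list)


def update_wake_word(network: DecodingNetwork,
                     triphones: List[str],
                     triphone_to_state: Dict[str, int]) -> DecodingNetwork:
    """Replace the wake-up word branch with the state indices of a new word."""
    # Every triphone must already have a stored state index in the acoustic model.
    missing = [t for t in triphones if t not in triphone_to_state]
    if missing:
        raise KeyError(f"no state index stored for triphones: {missing}")
    # Only the wake-up word branch changes; garbage and noise branches are kept.
    network.wake_word_states = [triphone_to_state[t] for t in triphones]
    return network


# Illustrative table and network (all index values are made up).
table = {"sil-n+i": 101, "n-i+h": 102, "i-h+ao": 103, "h-ao+sil": 104}
net = DecodingNetwork(garbage_states=[1, 2], noise_states=[3],
                      wake_word_states=[7, 8, 9])
update_wake_word(net, ["sil-n+i", "n-i+h", "i-h+ao", "h-ao+sil"], table)
print(net.wake_word_states)  # [101, 102, 103, 104]
```

Because only the stored index sequence is swapped, the acoustic model itself does not need to be retrained when the wake-up word changes.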
The triphone node sequence corresponding to the wake-up word is determined by combining each phoneme of each word with its preceding and following phonemes to form a triphone. The sequence may also be built so that the first phoneme of the first triphone and the last phoneme of the last triphone are the silence phoneme (sil).
For example, if the wake-up word is "hello tv" (in pinyin, "ni hao dian shi"), the corresponding triphone node sequence is:
sil-n+i,n-i+h,i-h+ao,h-ao+d,ao-d+ian,d-ian+sh,ian-sh+i,sh-i+sil
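The sequence above follows the common `left-center+right` triphone notation. A minimal sketch that reproduces it is shown below; the function name `triphone_sequence` and the phoneme split of the wake-up word are assumptions inferred from the listed sequence.

```python
from typing import List


def triphone_sequence(phonemes: List[str], boundary: str = "sil") -> List[str]:
    """Build left-center+right triphones, padding both ends with silence."""
    padded = [boundary, *phonemes, boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


# Phonemes assumed for the wake-up word in the example above.
phonemes = ["n", "i", "h", "ao", "d", "ian", "sh", "i"]
print(",".join(triphone_sequence(phonemes)))
# sil-n+i,n-i+h,i-h+ao,h-ao+d,ao-d+ian,d-ian+sh,ian-sh+i,sh-i+sil
```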
After the pre-constructed speech recognition decoding network has been changed according to the new wake-up word, the new wake-up word is used as follows: receive voice data input by a user; extract voice features from the voice data, for example Mel-Frequency Cepstral Coefficients (MFCC); decode the voice features with the speech recognition decoding network to obtain a decoding result; and, if the decoding result includes the new wake-up word, perform the wake-up operation.
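The control flow of these four steps can be sketched as below. The feature extractor and decoder are passed in as placeholder callables, since the source does not specify their interfaces; `handle_utterance`, the stub lambdas, and the substring test on the decoding result are all assumptions for illustration.

```python
from typing import Callable, List


def handle_utterance(audio: bytes,
                     extract_features: Callable[[bytes], List[float]],
                     decode: Callable[[List[float]], str],
                     wake_word: str,
                     on_wake: Callable[[], None]) -> bool:
    """Run feature extraction and decoding, then wake if the word was decoded."""
    features = extract_features(audio)   # e.g. MFCC vectors in a real system
    result = decode(features)            # text produced by the decoding network
    if wake_word in result:              # assumed: simple substring match
        on_wake()
        return True
    return False


# Stub components standing in for a real front end and decoder.
events: List[str] = []
woke = handle_utterance(
    audio=b"\x00\x01",
    extract_features=lambda a: [float(b) for b in a],
    decode=lambda f: "hello tv two",
    wake_word="hello tv two",
    on_wake=lambda: events.append("wake"),
)
print(woke, events)  # True ['wake']
```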
A second embodiment provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of the above-mentioned method.
The second embodiment also provides a computer device, which includes a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method when executing the program.
As shown in fig. 4, the intelligent device in the second embodiment includes:
a first sending module, configured to send the wake-up word of the intelligent device to the cloud device, wherein the intelligent device can be woken up by the wake-up word;
a first receiving module, configured to receive and store a new wake-up word sent by the cloud device, or to receive an instruction sent by the cloud device for modifying the wake-up word of the intelligent device, when the cloud device detects that the wake-up word of another intelligent device in the same voice receiving area as the intelligent device is the same as the wake-up word of the intelligent device.
The intelligent device also comprises a second receiving module or a wake-up word updating module. The second receiving module is used for receiving and storing a new awakening word set by the user according to the instruction after the first receiving module receives the instruction for modifying the awakening word of the intelligent device, which is sent by the cloud device. The awakening word updating module is used for determining a new awakening word according to the awakening word updating rule after the first receiving module receives the instruction for modifying the awakening word of the intelligent device, which is sent by the cloud device.
The intelligent device also comprises a network updating module which is used for changing the pre-constructed speech recognition decoding network according to the new awakening words after the new awakening words are determined.
The network updating module comprises a computing module and a network maintenance module.
The calculation module is used for determining a triphone node sequence corresponding to the new awakening word, determining a state index value corresponding to each triphone in the triphone node sequence according to the corresponding relation between the triphone in the pre-constructed acoustic model and the model output state index value, and forming a state index value sequence.
The network maintenance module is used for replacing the state index value sequence corresponding to the original awakening word in the pre-constructed voice recognition decoding network with the state index value sequence corresponding to the new awakening word.
The intelligent device further comprises a training module, wherein the training module is used for training the holophone, the garbage phone and the noise phone by adopting a first criterion and a second criterion to obtain the pre-constructed acoustic model, the first criterion is a maximum likelihood estimation criterion, and the second criterion is a discriminative training criterion taking a minimum phone error criterion as an optimization criterion.
The intelligent device further comprises a second sending module, and the second sending module is used for sending the wake-up word of the intelligent device to the cloud device and sending the hardware address of the router to which the intelligent device belongs to the cloud device. And the first receiving module comprises a third receiving module for receiving and storing a new awakening word sent by the cloud device or receiving an instruction sent by the cloud device for modifying the awakening word of the intelligent device when the cloud device detects that the awakening words of other intelligent devices which are in the same voice receiving area with the intelligent device and have the same hardware address with the router are the same as the awakening words of the intelligent device.
The intelligent device also comprises a fourth receiving module, an extracting module, a decoding module and an executing module, wherein the fourth receiving module is used for receiving voice data input by a user after the network updating module changes a pre-constructed voice recognition decoding network according to the new awakening words; the extraction module is used for extracting voice features from the voice data; the decoding module is used for decoding the voice characteristics by using a voice recognition decoding network to obtain a decoding result; and the execution module is used for executing the awakening operation under the condition that the decoding result comprises the new awakening word.
Various modifications and alterations to the embodiments of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the embodiments of the present invention shall be included in the protection scope of the present invention.
The above-described aspects may be implemented individually or in various combinations, and such variations are within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.
The method can effectively identify a plurality of intelligent devices which are in the same voice receiving area and use the same awakening words, actively modify or inform the intelligent devices of modifying the awakening words, prevent the same awakening words from appearing among the intelligent devices, ensure the accurate control of the intelligent devices by users, and improve the use experience of the users.
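The cloud-side check can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the router hardware address is used as the only grouping key, the first device in each group keeps its wake-up word, and the numeric suffix is one possible reading of "suffix words having a sequential relationship". The names `find_conflicts` and `assign_new_wake_words` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Device = Tuple[str, str, str]  # (device_id, router_mac, wake_word)


def find_conflicts(devices: Iterable[Device]) -> Dict[Tuple[str, str], List[str]]:
    """Group devices by (router, wake-up word) and keep groups with duplicates."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for device_id, router_mac, wake_word in devices:
        groups[(router_mac, wake_word)].append(device_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def assign_new_wake_words(
        conflicts: Dict[Tuple[str, str], List[str]]) -> Dict[str, str]:
    """Assumed rule: the first device keeps its word, the rest get ' 2', ' 3', ..."""
    updates: Dict[str, str] = {}
    for (_, wake_word), ids in conflicts.items():
        for n, device_id in enumerate(ids[1:], start=2):
            updates[device_id] = f"{wake_word} {n}"
    return updates


devices = [
    ("tv-1", "aa:bb", "hello tv"),
    ("tv-2", "aa:bb", "hello tv"),
    ("speaker", "aa:bb", "hello speaker"),
    ("tv-3", "cc:dd", "hello tv"),  # different router, so no conflict
]
conflicts = find_conflicts(devices)
print(assign_new_wake_words(conflicts))  # {'tv-2': 'hello tv 2'}
```

The sketch does not check whether a generated word collides with a wake-up word already in use; a real system would need to handle that case.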
Claims (21)
- A device control method, applied to a cloud device, comprising: acquiring wake-up words of a plurality of intelligent devices in the same voice receiving area, wherein the intelligent devices can be woken up by their respective wake-up words; detecting whether any of the wake-up words of the plurality of intelligent devices are the same; and generating an instruction for modifying the same wake-up word in the case where the same wake-up word exists among the wake-up words of the plurality of intelligent devices.
- The device control method of claim 1, wherein after generating the instruction to modify the same wake word, the method further comprises: and determining the same awakening word according to an awakening word updating rule.
- The device control method according to claim 2, wherein the wake-up word update rule includes adding suffix words having a sequential relationship to wake-up words.
- The device control method according to any one of claims 1 to 3, wherein, while acquiring the wake-up words of a plurality of intelligent devices in the same voice receiving area, the method further comprises: acquiring hardware addresses of the routers to which the plurality of intelligent devices in the same voice receiving area belong; detecting whether the wake-up words of the plurality of intelligent devices include the same wake-up word comprises: detecting, according to the hardware addresses of the routers to which the plurality of intelligent devices belong, whether intelligent devices belonging to the same router exist among the plurality of intelligent devices; and, in the case where intelligent devices belonging to the same router exist, detecting whether the wake-up words of the intelligent devices belonging to the same router include the same wake-up word; and generating an instruction for modifying the same wake-up word in the case where the same wake-up word exists among the wake-up words of the plurality of intelligent devices comprises: generating an instruction for modifying the same wake-up word in the case where the same wake-up word exists among the wake-up words of the intelligent devices belonging to the same router.
- A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 4.
- A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method of any one of claims 1 to 4 when executing the program.
- A cloud device, comprising: a first acquisition module, configured to acquire wake-up words of a plurality of intelligent devices in the same voice receiving area, wherein the plurality of intelligent devices can be woken up by their respective wake-up words; a first detection module, configured to detect whether any of the wake-up words of the plurality of intelligent devices are the same; and a first generation module, configured to generate an instruction for modifying the same wake-up word in the case where the same wake-up word exists among the wake-up words of the plurality of intelligent devices.
- The cloud device of claim 7, wherein: the first acquisition module is used for acquiring the wake-up words of the intelligent devices in the same voice receiving area; the first detection module comprises a second detection module, which is used for detecting, according to the hardware addresses of the routers to which the plurality of intelligent devices belong, whether intelligent devices belonging to the same router exist among the plurality of intelligent devices, and, in the case where such devices exist, detecting whether the wake-up words of the intelligent devices belonging to the same router include the same wake-up word; and the first generation module comprises a second generation module, which is used for generating an instruction for modifying the same wake-up word in the case where the same wake-up word exists among the wake-up words of the intelligent devices belonging to the same router.
- A device control method, applied to an intelligent device, comprising: sending a wake-up word of the intelligent device to a cloud device, wherein the intelligent device can be woken up by the wake-up word; and, when the cloud device detects that the wake-up word of another intelligent device in the same voice receiving area as the intelligent device is the same as the wake-up word of the intelligent device, receiving and storing a new wake-up word sent by the cloud device, or receiving an instruction sent by the cloud device for modifying the wake-up word of the intelligent device.
- The device control method of claim 9, wherein after receiving the instruction sent by the cloud device to modify the wake-up word of the smart device, the method further comprises: determining a new wake-up word according to the wake-up word updating rule, or receiving and storing the new wake-up word set by the user according to the instruction.
- The device control method according to claim 9 or 10, wherein after the new wake-up word is determined, the method further comprises: changing the pre-constructed speech recognition decoding network according to the new wake-up word.
- The device control method of claim 11, wherein modifying the pre-constructed speech recognition decoding network according to the new wake-up word comprises: determining a triphone node sequence corresponding to the new wake-up word, determining a state index value corresponding to each triphone in the triphone node sequence according to a corresponding relation between the triphones in a pre-constructed acoustic model and the model output state index values, and forming a state index value sequence, wherein the pre-constructed acoustic model is a model obtained by training holophones, garbage phonemes and noise phonemes using a first criterion and a second criterion, the first criterion being a maximum likelihood estimation criterion and the second criterion being a discriminative training criterion that takes a minimum phoneme error criterion as its optimization criterion; and replacing the state index value sequence corresponding to the original wake-up word in the pre-constructed speech recognition decoding network with the state index value sequence corresponding to the new wake-up word.
- The device control method according to claim 11 or 12, wherein after modifying the pre-constructed speech recognition decoding network according to the new wake-up word, the method further comprises: receiving voice data input by a user; extracting voice features from the voice data; decoding the voice features by using the speech recognition decoding network to obtain a decoding result; and executing a wake-up operation if the decoding result comprises the new wake-up word.
- The device control method according to any one of claims 9 to 12, wherein, when sending the wake-up word of the smart device to a cloud device, the method further includes: sending the hardware address of the router to which the intelligent device belongs to the cloud device; and wherein receiving and storing a new wake-up word sent by the cloud device, or receiving an instruction sent by the cloud device for modifying the wake-up word of the intelligent device, when the cloud device detects that the wake-up word of another intelligent device in the same voice receiving area as the intelligent device is the same as the wake-up word of the intelligent device, comprises: when the cloud device detects that the wake-up word of another intelligent device that is in the same voice receiving area as the intelligent device and has the same router hardware address is the same as the wake-up word of the intelligent device, receiving and storing a new wake-up word sent by the cloud device, or receiving an instruction sent by the cloud device for modifying the wake-up word of the intelligent device.
- A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 9 to 14.
- A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method of any one of claims 9 to 14 when executing the program.
- A smart device, comprising: a first sending module, configured to send a wake-up word of the intelligent device to a cloud device, wherein the intelligent device can be woken up by the wake-up word; and a first receiving module, configured to receive and store a new wake-up word sent by the cloud device, or to receive an instruction sent by the cloud device for modifying the wake-up word of the intelligent device, when the cloud device detects that the wake-up word of another intelligent device in the same voice receiving area as the intelligent device is the same as the wake-up word of the intelligent device.
- The smart device of claim 17, wherein the smart device further comprises a second receiving module or a wake-up word updating module: the second receiving module is used for receiving and storing a new wake-up word set by a user according to the instruction after the first receiving module receives the instruction, sent by the cloud device, for modifying the wake-up word of the intelligent device; the wake-up word updating module is used for determining a new wake-up word according to a wake-up word updating rule after the first receiving module receives the instruction, sent by the cloud device, for modifying the wake-up word of the intelligent device.
- The smart device of claim 17 or 18, wherein the smart device further comprises a network update module for, after a new wake-up word is determined, modifying a pre-constructed speech recognition decoding network according to the new wake-up word; the network update module comprises a computing module and a network maintenance module; the computing module is used for determining a triphone node sequence corresponding to the new wake-up word, determining a state index value corresponding to each triphone in the triphone node sequence according to a corresponding relation between the triphones in a pre-constructed acoustic model and the model output state index values, and forming a state index value sequence; and the network maintenance module is used for replacing the state index value sequence corresponding to the original wake-up word in the pre-constructed speech recognition decoding network with the state index value sequence corresponding to the new wake-up word.
- The smart device of claim 19, wherein the smart device further comprises a training module for training holophones, garbage phonemes and noise phonemes using a first criterion and a second criterion to obtain the pre-constructed acoustic model, the first criterion being a maximum likelihood estimation criterion and the second criterion being a discriminative training criterion that takes a minimum phoneme error criterion as its optimization criterion.
- The smart device of claim 17, wherein the smart device further comprises: a second sending module, used for sending the hardware address of the router to which the intelligent device belongs to the cloud device while the first sending module sends the wake-up word of the intelligent device to the cloud device; and the first receiving module comprises a third receiving module, used for receiving and storing a new wake-up word sent by the cloud device, or receiving an instruction sent by the cloud device for modifying the wake-up word of the intelligent device, when the cloud device detects that the wake-up word of another intelligent device that is in the same voice receiving area as the intelligent device and has the same router hardware address is the same as the wake-up word of the intelligent device.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2017/116319 WO2019113911A1 (en) | 2017-12-15 | 2017-12-15 | Device control method, cloud device, smart device, computer medium and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111386566A (en) | 2020-07-07 |
Family
ID=66818893
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201780097241.4A Pending CN111386566A (en) | 2017-12-15 | 2017-12-15 | Device control method, cloud device, intelligent device, computer medium and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111386566A (en) |
WO (1) | WO2019113911A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023109129A1 (en) * | 2021-12-13 | 2023-06-22 | 海信视像科技股份有限公司 | Speech data processing method and apparatus |
CN117354623A (en) * | 2023-12-04 | 2024-01-05 | 深圳市冠旭电子股份有限公司 | Photographing control method and device, electronic equipment and storage medium |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112929724B (en) * | 2020-12-31 | 2022-09-30 | 海信视像科技股份有限公司 | Display device, set top box and far-field pickup awakening control method |
CN114697151A (en) * | 2022-03-15 | 2022-07-01 | 杭州控客信息技术有限公司 | Intelligent home system with non-voice awakening function and non-voice awakening method thereof |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102859583A (en) * | 2010-01-12 | 2013-01-02 | 弗劳恩霍弗实用研究促进协会 | Audio encoder, audio decoder, method for encoding and audio information, method for decoding an audio information and computer program using a modification of a number representation of a numeric previous context value |
CN103021409A (en) * | 2012-11-13 | 2013-04-03 | 安徽科大讯飞信息科技股份有限公司 | Voice activating photographing system |
US20170090864A1 (en) * | 2015-09-28 | 2017-03-30 | Amazon Technologies, Inc. | Mediation of wakeword response for multiple devices |
CN107358954A (en) * | 2017-08-29 | 2017-11-17 | 成都启英泰伦科技有限公司 | It is a kind of to change the device and method for waking up word in real time |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102999161B (en) * | 2012-11-13 | 2016-03-02 | 科大讯飞股份有限公司 | A kind of implementation method of voice wake-up module and application |
CN103971678B (en) * | 2013-01-29 | 2015-08-12 | 腾讯科技(深圳)有限公司 | Keyword spotting method and apparatus |
CN106469040B (en) * | 2015-08-19 | 2019-06-21 | 华为终端有限公司 | Communication means, server and equipment |
CN106960667B (en) * | 2017-03-08 | 2020-07-17 | 杭州联络互动信息科技股份有限公司 | Position reminding method, device and system |
CN107134279B (en) * | 2017-06-30 | 2020-06-19 | 百度在线网络技术(北京)有限公司 | Voice awakening method, device, terminal and storage medium |
- 2017
  - 2017-12-15 CN CN201780097241.4A patent/CN111386566A/en active Pending
  - 2017-12-15 WO PCT/CN2017/116319 patent/WO2019113911A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2019113911A1 (en) | 2019-06-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107644638B (en) | Audio recognition method, device, terminal and computer readable storage medium | |
US10332507B2 (en) | Method and device for waking up via speech based on artificial intelligence | |
CN108694940B (en) | Voice recognition method and device and electronic equipment | |
US9378738B2 (en) | System and method for advanced turn-taking for interactive spoken dialog systems | |
US10649727B1 (en) | Wake word detection configuration | |
US9679561B2 (en) | System and method for rapid customization of speech recognition models | |
CN111386566A (en) | Device control method, cloud device, intelligent device, computer medium and device | |
JP6812843B2 (en) | Computer program for voice recognition, voice recognition device and voice recognition method | |
US20080201147A1 (en) | Distributed speech recognition system and method and terminal and server for distributed speech recognition | |
JP2016075740A (en) | Voice processing device, voice processing method, and program | |
CN108932944B (en) | Decoding method and device | |
US10909983B1 (en) | Target-device resolution | |
US20170249935A1 (en) | System and method for estimating the reliability of alternate speech recognition hypotheses in real time | |
CN112861521B (en) | Speech recognition result error correction method, electronic device and storage medium | |
US20240013784A1 (en) | Speaker recognition adaptation | |
JPWO2010128560A1 (en) | Speech recognition apparatus, speech recognition method, and speech recognition program | |
CN111243604B (en) | Training method for speaker recognition neural network model supporting multiple awakening words, speaker recognition method and system | |
CN111081254B (en) | Voice recognition method and device | |
CN111128172B (en) | Voice recognition method, electronic equipment and storage medium | |
CN112863496B (en) | Voice endpoint detection method and device | |
CN111739515B (en) | Speech recognition method, equipment, electronic equipment, server and related system | |
CN111508481A (en) | Training method and device of voice awakening model, electronic equipment and storage medium | |
US10885899B2 (en) | Retraining voice model for trigger phrase using training data collected during usage | |
US10929601B1 (en) | Question answering for a multi-modal system | |
US11645468B2 (en) | User data processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||