CN107622770A - voice awakening method and device - Google Patents
Voice wake-up method and device
- Publication number: CN107622770A
- Application number: CN201710922732.XA
- Authority: CN (China)
- Prior art keywords: wake, voice, terminal device, threshold, similarity
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The present invention proposes a voice wake-up method and device. When the local first acoustic model finds that the similarity between the detected wake-up voice and the preset wake-up word signal is neither high nor low, the wake-up voice is recognized again by a second acoustic model on a cloud server. This avoids, as far as possible, both false wake-ups of the terminal device and cases in which the device should be woken but is not, and improves the user experience. In addition, when the first acoustic model finds that the similarity between the wake-up voice and the preset wake-up word signal is clearly high or clearly low, the terminal device itself decides whether to perform the wake-up operation, without sending the voice to the cloud server for recognition, which improves the efficiency with which the terminal device performs the wake-up operation.
Description
Technical field
The present invention relates to the technical field of intelligent human-computer interaction, and in particular to a voice wake-up method and device.
Background art
Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. AI is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a way similar to human intelligence. Research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems.
With the development of speech recognition technology, more and more intelligent terminals are equipped with a voice wake-up function. The user speaks a segment of voice to the intelligent terminal, and the terminal uses a built-in algorithm to judge whether the input voice contains the wake-up word; if it does, the terminal switches from the sleep state to the awake state.
However, the user may be in different scenes. For example, when the user is attending a concert, the scene is noisy and the voice received by the intelligent terminal contains considerable noise, which may cause a false wake-up and degrade the user experience.
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, a first object of the present invention is to propose a voice wake-up method. In this method, when the local first acoustic model finds that the similarity between the detected wake-up voice and the preset wake-up word signal is neither high nor low, the wake-up voice can be recognized again by the second acoustic model of a cloud server. This avoids, as far as possible, false wake-ups of the terminal device and cases in which the device should be woken but is not, and improves the user experience.
A second object of the present invention is to propose a voice wake-up device.
A third object of the present invention is to propose a computer device.
A fourth object of the present invention is to propose a computer program product.
A fifth object of the present invention is to propose a non-transitory computer-readable storage medium.
To achieve the above objects, an embodiment of the first aspect of the present invention proposes a voice wake-up method, including:
Detecting a wake-up voice input to a terminal device and the current scene in which the terminal device is located;
Obtaining a first threshold and a second threshold according to the current scene and a correspondence between scenes and thresholds, wherein the first threshold is greater than the second threshold;
Analyzing the acoustic features of the wake-up voice according to a first acoustic model to obtain a first similarity between the wake-up voice and a preset wake-up word signal;
Judging whether the first similarity is greater than the second threshold and less than the first threshold;
If the judgment result is yes, sending the wake-up voice to a cloud server, so that the cloud server determines, according to a second acoustic model, a second similarity between the wake-up voice and the preset wake-up word signal and, if the second similarity is greater than the first threshold, generates a wake-up instruction for waking up the terminal device; wherein the recognition accuracy of the second acoustic model is higher than that of the first acoustic model;
Receiving the wake-up instruction and performing the operation of waking up the terminal device.
In the method described above, generating the wake-up instruction for waking up the terminal device if the second similarity is greater than the first threshold includes:
Analyzing the acoustic features of the wake-up voice according to the second acoustic model to obtain the pronunciation sequence corresponding to the wake-up voice;
Analyzing the pronunciation sequence corresponding to the wake-up voice according to a language model to obtain the text sequence corresponding to the wake-up voice;
Matching the text sequence corresponding to the wake-up voice against the text sequence corresponding to the preset wake-up word signal;
If the match succeeds, generating the wake-up instruction for waking up the terminal device.
In the method described above, analyzing the acoustic features of the wake-up voice according to the first acoustic model to obtain the first similarity between the wake-up voice and the preset wake-up word signal includes:
Determining, according to the acoustic features of the wake-up voice and the first acoustic model, the feature similarity between each acoustic feature of the wake-up voice and the corresponding acoustic feature of the preset wake-up word signal;
Determining the first similarity between the wake-up voice and the preset wake-up word signal according to the feature similarities.
In the method described above, detecting the current scene in which the terminal device is located includes:
Detecting the current location of the terminal device, and determining the current scene in which the terminal device is located according to the current location;
Or detecting the scene voice of the terminal device, performing corpus analysis on the scene voice to obtain the corpus set of the scene voice, determining the scene corresponding to the corpus set, and taking that scene as the current scene in which the terminal device is located.
The method described above further includes:
If the first similarity is greater than the first threshold, performing the operation of waking up the terminal device;
Or if the first similarity is less than the second threshold, not performing the operation of waking up the terminal device.
To achieve the above objects, an embodiment of the second aspect of the present invention proposes a voice wake-up device, including:
A first detection module, configured to detect a wake-up voice input to a terminal device;
A second detection module, configured to detect the current scene in which the terminal device is located;
A threshold module, configured to obtain a first threshold and a second threshold according to the current scene and a correspondence between scenes and thresholds, wherein the first threshold is greater than the second threshold;
An analysis module, configured to analyze the acoustic features of the wake-up voice according to a first acoustic model to obtain a first similarity between the wake-up voice and a preset wake-up word signal;
A judgment module, configured to judge whether the first similarity is greater than the second threshold and less than the first threshold and, if the judgment result is yes, to trigger a sending module;
The sending module, configured to send the wake-up voice to a cloud server, so that the cloud server determines, according to a second acoustic model, a second similarity between the wake-up voice and the preset wake-up word signal and, if the second similarity is greater than the first threshold, generates a wake-up instruction for waking up the terminal device; wherein the recognition accuracy of the second acoustic model is higher than that of the first acoustic model;
A first execution module, configured to receive the wake-up instruction and perform the operation of waking up the terminal device.
In the device described above, the cloud server includes a wake-up instruction generation module;
The wake-up instruction generation module is specifically configured to:
Analyze the acoustic features of the wake-up voice according to the second acoustic model to obtain the pronunciation sequence corresponding to the wake-up voice;
Analyze the pronunciation sequence corresponding to the wake-up voice according to a language model to obtain the text sequence corresponding to the wake-up voice;
Match the text sequence corresponding to the wake-up voice against the text sequence corresponding to the preset wake-up word signal;
If the match succeeds, generate the wake-up instruction for waking up the terminal device.
In the device described above, the analysis module is specifically configured to:
Determine, according to the acoustic features of the wake-up voice and the first acoustic model, the feature similarity between each acoustic feature of the wake-up voice and the corresponding acoustic feature of the preset wake-up word signal;
Determine the first similarity between the wake-up voice and the preset wake-up word signal according to the feature similarities.
In the device described above, the second detection module is specifically configured to:
Detect the current location of the terminal device, and determine the current scene in which the terminal device is located according to the current location;
Or the second detection module is specifically configured to: detect the scene voice of the terminal device, perform corpus analysis on the scene voice to obtain the corpus set of the scene voice, determine the scene corresponding to the corpus set, and take that scene as the current scene in which the terminal device is located.
The device described above further includes a second execution module and a third execution module;
If the judgment result of the judgment module is that the first similarity is greater than the first threshold, the second execution module is triggered; wherein the second execution module is configured to perform the operation of waking up the terminal device;
Or if the judgment result of the judgment module is that the first similarity is less than the second threshold, the third execution module is triggered; wherein the third execution module is configured not to perform the operation of waking up the terminal device.
To achieve the above objects, an embodiment of the third aspect of the present invention proposes a computer device, including a memory and a processor, wherein the processor reads executable program code stored in the memory and runs the program corresponding to that code, so as to implement the voice wake-up method described in the embodiment of the first aspect of the present invention.
To achieve the above objects, an embodiment of the fourth aspect of the present invention proposes a computer program product; when the instructions in the computer program product are executed by a processor, the voice wake-up method described in the embodiment of the first aspect is performed.
To achieve the above objects, an embodiment of the fifth aspect of the present invention proposes a non-transitory computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the voice wake-up method described in the embodiment of the first aspect is implemented.
Additional aspects and advantages of the present invention will be set forth in part in the following description; in part they will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of the voice wake-up method proposed by one embodiment of the present invention;
Fig. 2 is a schematic flowchart of the voice wake-up method proposed by another embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the voice wake-up device proposed by one embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the voice wake-up device proposed by another embodiment of the present invention;
Fig. 5 is a block diagram of an exemplary computer device suitable for implementing embodiments of the present invention.
Detailed description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements having identical or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and should not be construed as limiting it.
The voice wake-up method and device of the embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of the voice wake-up method proposed by one embodiment of the present invention. The method is executed by a voice wake-up device, which may be implemented in hardware and/or software and may also be integrated into a terminal device.
As shown in Fig. 1, the voice wake-up method proposed by this embodiment includes the following steps:
S101, detect the wake-up voice input to the terminal device and the current scene in which the terminal device is located.
For example, suppose the user says a phrase such as "Xiaodu Xiaodu" to the terminal device. Because this phrase contains the wake-up word "Xiaodu", which is either set by the user or provided by default, the voice spoken by the user is a wake-up voice. The terminal device can receive the wake-up voice input by the user through a voice detection device, such as a receiver, with which it is equipped.
Specifically, the user may be in different scenes. For example, when the user is attending a concert, the scene is noisy and the voice received by the intelligent terminal contains considerable noise, which may cause a false wake-up and degrade the user experience. The current scene in which the terminal device is located therefore needs to be detected, so that the terminal device can be woken adaptively according to the scene and false wake-ups, as well as cases in which the device should be woken but is not, are avoided as far as possible. Note that the detected actual scenes can be subdivided, for example into quiet scenes and noisy scenes. A terminal device is less likely to be falsely woken in a quiet scene than in a noisy scene.
In one possible implementation, the current scene in which the terminal device is located is detected as follows: the current location of the terminal device is detected, and the current scene is determined according to that location. For example, the terminal device is equipped with a positioning module such as GPS (Global Positioning System). If the positioning module detects that the current location of the terminal device is a KTV (Karaoke Television) entertainment venue, the current scene is determined to be a noisy scene. As another example, if the positioning module detects that the current location of the terminal device is a library, the current scene is determined to be a quiet scene.
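As a minimal sketch of the location-based variant, the following Python maps a place category to a scene. The category names, the `PLACE_CATEGORY_TO_SCENE` table, and the `scene_from_place` function are hypothetical; the source only states that a location such as a KTV venue implies a noisy scene and a library implies a quiet scene, and does not say how a position is resolved to a place category.

```python
# Hypothetical mapping from a place category (as a positioning service might
# report it) to a scene label. The categories are illustrative assumptions.
PLACE_CATEGORY_TO_SCENE = {
    "ktv": "noisy",
    "concert_hall": "noisy",
    "library": "quiet",
}


def scene_from_place(category: str, default: str = "quiet") -> str:
    """Return the scene for a place category, or the default if unknown."""
    return PLACE_CATEGORY_TO_SCENE.get(category.lower(), default)
```

Falling back to a default scene for unknown places is a design choice of this sketch, not something the source prescribes.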
In another possible implementation, the current scene in which the terminal device is located is detected as follows: the scene voice of the terminal device is detected, corpus analysis is performed on the scene voice to obtain the corpus set of the scene voice, the scene corresponding to the corpus set is determined, and that scene is taken as the current scene in which the terminal device is located.
Here, scene voice can be understood as the detected sound of the environment surrounding the terminal device. The scene voice may be detected before the wake-up voice is detected, after the wake-up voice is detected, or at the same time as the wake-up voice; no specific limitation is imposed here.
For example, the scene voice detected in a library contains specific corpus items such as borrowing and returning books; the scene voice detected in a KTV entertainment venue contains specific corpus items such as singer names, song titles, and requests to sing another song. In this embodiment, corpus analysis is performed on the detected scene voice from multiple angles, such as semantics, speech, and context, to obtain all corpus items of the scene voice, which together form the corpus set. Optionally, the terminal device is configured with a scene model capable of deep learning on the corpora corresponding to different scenes; by inputting the corpus set into the scene model, the scene corresponding to the corpus set can be obtained, and in this embodiment that scene is taken as the current scene in which the terminal device is located. Optionally, the scenes corresponding to corpus sets are subdivided into quiet scenes and noisy scenes, so that the current scene in which the terminal device is located can be determined to be either a quiet scene or a noisy scene.
It should be pointed out that detection of the current scene in which the terminal device is located is not limited to the examples given above.
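The corpus-based variant can be illustrated with a much simpler stand-in than the deep-learning scene model mentioned above: the sketch below picks the scene whose vocabulary overlaps most with the corpus set. The keyword-overlap scoring, the `SCENE_VOCABULARY` table, and the function name are assumptions made purely for illustration.

```python
from typing import Set

# Hypothetical per-scene vocabularies; a real scene model would be learned
# from corpora rather than written by hand.
SCENE_VOCABULARY = {
    "quiet": {"borrow", "return", "book", "shelf"},
    "noisy": {"singer", "song", "sing", "encore"},
}


def scene_from_corpus(corpus: Set[str], default: str = "quiet") -> str:
    """Pick the scene whose vocabulary shares the most items with the corpus."""
    best_scene, best_overlap = default, 0
    for scene, vocabulary in SCENE_VOCABULARY.items():
        overlap = len(corpus & vocabulary)
        if overlap > best_overlap:
            best_scene, best_overlap = scene, overlap
    return best_scene
```

For instance, a corpus set containing "singer" and "song" would be assigned to the noisy scene, while an empty corpus set falls back to the default.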
S102, obtain a first threshold and a second threshold according to the current scene and the correspondence between scenes and thresholds, wherein the first threshold is greater than the second threshold.
Specifically, the first threshold and the second threshold may be set by the user or set by the manufacturer before the terminal device leaves the factory; no specific limitation is imposed here. In this embodiment, different first and second thresholds are set for different scenes. For example, the first threshold for a noisy scene is higher than the first threshold for a quiet scene, and the second threshold for a noisy scene is higher than the second threshold for a quiet scene. The first and second thresholds are thus adjusted adaptively according to the scene, which avoids, as far as possible, the false wake-ups and missed wake-ups that fixed thresholds would cause, and improves the user's experience of the terminal device. More specifically, the correspondence between scenes and thresholds is configured in advance, and the first threshold and the second threshold can be obtained accurately from the current scene and that correspondence.
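A pre-configured correspondence between scenes and thresholds can be sketched as a simple lookup table. The numeric values below are invented for illustration (the source gives none); they only respect the two constraints stated above: the first threshold exceeds the second, and both are higher for the noisy scene than for the quiet scene. The names `Thresholds`, `SCENE_THRESHOLDS`, and `thresholds_for` are hypothetical.

```python
from typing import NamedTuple


class Thresholds(NamedTuple):
    first: float   # upper threshold
    second: float  # lower threshold


# Illustrative values only; the source does not specify any numbers.
SCENE_THRESHOLDS = {
    "quiet": Thresholds(first=0.80, second=0.50),
    "noisy": Thresholds(first=0.90, second=0.60),
}


def thresholds_for(scene: str) -> Thresholds:
    """Look up the threshold pair for a scene and check first > second."""
    thresholds = SCENE_THRESHOLDS[scene]
    if not thresholds.first > thresholds.second:
        raise ValueError("first threshold must be greater than second threshold")
    return thresholds
```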
For example, the similarity between the wake-up voice and the preset wake-up word signal serves as the basis for setting the first threshold and the second threshold. Specifically, if the similarity between the wake-up voice and the preset wake-up word signal is higher than the first threshold, the wake-up voice can be considered to match the preset wake-up word signal; if the similarity is lower than the second threshold, the wake-up voice can be considered not to match the preset wake-up word signal; if the similarity lies between the second threshold and the first threshold, the degree of match is neither high nor low, and in this case it must be further confirmed whether the wake-up voice matches the preset wake-up word signal, such as "Xiaodu Xiaodu".
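The three cases above amount to splitting the similarity range into three bands. A minimal sketch follows; the `Decision` enum and `classify` function are hypothetical names, and treating a similarity exactly equal to either threshold as "uncertain" is an assumption, since the source does not say how boundary values are handled.

```python
from enum import Enum


class Decision(Enum):
    MATCH = "match"          # similarity above the first threshold
    MISMATCH = "mismatch"    # similarity below the second threshold
    UNCERTAIN = "uncertain"  # neither high nor low: needs further confirmation


def classify(similarity: float, first: float, second: float) -> Decision:
    """Place a similarity value into one of the three bands."""
    if first <= second:
        raise ValueError("first threshold must be greater than second threshold")
    if similarity > first:
        return Decision.MATCH
    if similarity < second:
        return Decision.MISMATCH
    # Assumption: values equal to a threshold are treated as uncertain.
    return Decision.UNCERTAIN
```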
S103, analyze the acoustic features of the wake-up voice according to the first acoustic model to obtain the first similarity between the wake-up voice and the preset wake-up word signal.
Specifically, the acoustic model is one of the most important parts of a speech recognition system. An acoustic model can be used to obtain the pronunciation sequence corresponding to the input voice, and also the similarity between the input voice and a preset voice. For details of acoustic models, refer to the prior art; they are not repeated here.
In this embodiment, voice endpoint detection may be used to separate the silent portion of the detected wake-up voice from the actual wake-up voice portion. Acoustic features are then extracted from the actual wake-up voice portion, and the extracted features are input to the first acoustic model for analysis, yielding the first similarity between the wake-up voice and the preset wake-up word signal. Optionally, the first acoustic model is built on a hidden Markov model (HMM).
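The source does not say which endpoint detection technique is used. As one simple possibility, the sketch below applies a fixed energy threshold per frame and keeps the span from the first to the last frame judged to contain speech. The frame length, the threshold value, and the function name `trim_silence` are assumptions.

```python
from typing import List, Sequence


def trim_silence(samples: Sequence[float],
                 frame_len: int = 160,
                 energy_threshold: float = 1e-3) -> List[float]:
    """Energy-based endpoint detection (illustrative only).

    Returns the samples from the first to the last frame whose mean energy
    exceeds the threshold, or an empty list if no frame does.
    """
    voiced_starts = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy > energy_threshold:
            voiced_starts.append(start)
    if not voiced_starts:
        return []
    return list(samples[voiced_starts[0]:voiced_starts[-1] + frame_len])
```

The trimmed samples would then be passed to feature extraction; that stage is outside the scope of this sketch.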
In one possible implementation, step S103 is implemented as follows: the feature similarity between each acoustic feature of the wake-up voice and the corresponding acoustic feature of the preset wake-up word signal is determined according to the acoustic features of the wake-up voice and the first acoustic model; the first similarity between the wake-up voice and the preset wake-up word signal is then determined according to the feature similarities.
For example, the wake-up voice has multiple different acoustic features and, correspondingly, the preset wake-up word signal has multiple different acoustic features. The first acoustic model first analyzes the feature similarity between each acoustic feature of the wake-up voice and the corresponding acoustic feature of the preset wake-up word signal, and then performs statistical analysis on the resulting feature similarities. For example, the maximum likelihood principle may be applied to the feature similarities to obtain the maximum likelihood value between the acoustic features of the wake-up voice and those of the preset wake-up word signal, and this maximum likelihood value is taken as the first similarity between the wake-up voice and the preset wake-up word signal.
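The source gives no formula for the statistical analysis. Purely as an illustration of combining per-feature similarities into one score, the sketch below treats each feature similarity as a probability-like value in (0, 1] and takes the geometric mean, that is, the exponential of the mean log score. This is an assumed stand-in and not necessarily the maximum likelihood computation the source has in mind; the function name is hypothetical.

```python
import math
from typing import Sequence


def combine_feature_similarities(scores: Sequence[float]) -> float:
    """Combine per-feature similarities via their geometric mean (assumption)."""
    if not scores:
        raise ValueError("at least one feature similarity is required")
    if any(not 0.0 < s <= 1.0 for s in scores):
        raise ValueError("feature similarities must lie in (0, 1]")
    mean_log = sum(math.log(s) for s in scores) / len(scores)
    return math.exp(mean_log)
```

With a geometric mean, a single very poor feature match pulls the overall similarity down more strongly than an arithmetic mean would.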
S104, judge whether the first similarity is greater than the second threshold and less than the first threshold.
Specifically, when the first similarity is greater than the second threshold and less than the first threshold, the similarity between the detected wake-up voice and the preset wake-up word signal is neither high nor low. In this case it must be further confirmed whether the wake-up voice matches the preset wake-up word signal, such as "Xiaodu Xiaodu".
S105, if the judgment result is yes, send the wake-up voice to the cloud server, so that the cloud server determines, according to the second acoustic model, the second similarity between the wake-up voice and the preset wake-up word signal and, if the second similarity is greater than the first threshold, generates a wake-up instruction for waking up the terminal device; wherein the recognition accuracy of the second acoustic model is higher than that of the first acoustic model.
In this embodiment, the first acoustic model is deployed locally, that is, in the terminal device, whereas the second acoustic model is deployed on the cloud server. The cloud server has powerful data processing capability; for example, it can build a second acoustic model with higher recognition accuracy by mining more relevant data for deep learning. In this embodiment, the recognition accuracy of the second acoustic model is higher than that of the first acoustic model, so when the local first acoustic model finds that the similarity between the detected wake-up voice and the preset wake-up word signal is neither high nor low, the wake-up voice can be recognized again by the second acoustic model of the cloud server.
If the second acoustic model of the cloud server determines that the second similarity between the wake-up voice and the preset wake-up word signal is greater than the first threshold, the wake-up voice can be considered to match the preset wake-up word signal. Taking the preset wake-up word signal "Xiaodu Xiaodu" as an example, a matching recognition result indicates that the user has said a wake-up voice such as "Xiaodu Xiaodu", and the operation of waking up the terminal device can then be performed. Specifically, in this embodiment, if the second similarity is greater than the first threshold, the wake-up instruction for waking up the terminal device is generated; if the second similarity is less than the first threshold, the wake-up instruction for waking up the terminal device is not generated.
In one possible implementation, generating the wake-up instruction for waking up the terminal device if the second similarity is greater than the first threshold is implemented as follows:
S1, analyze the acoustic features of the wake-up voice according to the second acoustic model to obtain the pronunciation sequence corresponding to the wake-up voice.
In this embodiment, the pronunciation sequence that best matches the wake-up voice can be determined by the second acoustic model.
S2, analyze the pronunciation sequence corresponding to the wake-up voice according to the language model to obtain the text sequence corresponding to the wake-up voice.
Specifically, the language model is one of the most important parts of a speech recognition system. A language model can be used to obtain the text sequence corresponding to the input voice, that is, to convert the input voice into text. Optionally, the language model is an N-gram model.
In this embodiment, the pronunciation sequence that best matches the wake-up voice is determined by the second acoustic model, and the text sequence that best matches the wake-up voice is then determined by the language model.
S3, match the text sequence corresponding to the wake-up voice against the text sequence corresponding to the preset wake-up word signal.
S4, if the match succeeds, generate the wake-up instruction for waking up the terminal device.
In this embodiment, the second acoustic model first makes a preliminary judgment on the similarity between the acoustic features of the wake-up voice and those of the preset wake-up word signal; the language model is then used to match the text sequence corresponding to the wake-up voice against the text sequence corresponding to the preset wake-up word signal. Matching is thus performed twice, from the two angles of speech and text, which makes the voice wake-up method more accurate and reliable.
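The cloud-side steps S1 to S4 can be sketched as follows, with heavy simplification. The second acoustic model is not modeled at all: its outputs (a pronunciation sequence and the second similarity) are passed in as arguments. A small hand-written lexicon with greedy longest-match lookup stands in for the language model, which the source says may be an N-gram model. The lexicon entries, the romanized wake-word text, the instruction format, and all function names are placeholders invented for this sketch.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical pronunciation lexicon standing in for the language model.
LEXICON: Dict[Tuple[str, ...], str] = {
    ("xiao", "du"): "Xiaodu",
}


def pronunciations_to_text(pronunciations: List[str]) -> str:
    """S2 stand-in: greedy longest-match decoding of syllables into words."""
    words, i = [], 0
    max_len = max(len(key) for key in LEXICON)
    while i < len(pronunciations):
        for n in range(min(max_len, len(pronunciations) - i), 0, -1):
            word = LEXICON.get(tuple(pronunciations[i:i + n]))
            if word is not None:
                words.append(word)
                i += n
                break
        else:
            words.append("?")  # syllable not covered by the lexicon
            i += 1
    return " ".join(words)


def make_wake_instruction(pronunciations: List[str],
                          second_similarity: float,
                          first_threshold: float,
                          wake_word_text: str) -> Optional[dict]:
    """Return a wake-up instruction only if both checks pass, else None."""
    if not second_similarity > first_threshold:
        return None                      # acoustic check failed
    text = pronunciations_to_text(pronunciations)
    if text != wake_word_text:           # S3: text sequence match
        return None
    return {"action": "wake"}            # S4: placeholder instruction format
```

Requiring both the acoustic similarity check and the text match mirrors the "matched twice" idea described above.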
S106, receive the wake-up instruction and perform the operation of waking up the terminal device.
The voice wake-up method provided by this embodiment of the present invention includes: detecting the wake-up voice input to the terminal device and the current scene in which the terminal device is located; obtaining a first threshold and a second threshold according to the current scene and the correspondence between scenes and thresholds, wherein the first threshold is greater than the second threshold; analyzing the acoustic features of the wake-up voice according to the first acoustic model to obtain the first similarity between the wake-up voice and the preset wake-up word signal; judging whether the first similarity is greater than the second threshold and less than the first threshold; if the judgment result is yes, sending the wake-up voice to the cloud server, so that the cloud server determines, according to the second acoustic model, the second similarity between the wake-up voice and the preset wake-up word signal and, if the second similarity is greater than the first threshold, generates a wake-up instruction for waking up the terminal device, wherein the recognition accuracy of the second acoustic model is higher than that of the first acoustic model; and receiving the wake-up instruction and performing the operation of waking up the terminal device. In this method, when the local first acoustic model finds that the similarity between the detected wake-up voice and the preset wake-up word signal is neither high nor low, the wake-up voice can be recognized again by the second acoustic model of the cloud server, which avoids, as far as possible, false wake-ups of the terminal device and cases in which the device should be woken but is not, and improves the user experience.
Fig. 2 is a schematic flowchart of the voice wake-up method proposed by another embodiment of the present invention. On the basis of the above embodiment, if the first similarity is greater than the first threshold, the operation of waking up the terminal device is performed; or, if the first similarity is less than the second threshold, the operation of waking up the terminal device is not performed.
As shown in Fig. 2, the voice awakening method proposed in this embodiment comprises the following steps:
S201, detect the wake-up voice input to the terminal device and the current scene in which the terminal device is located, then perform step S202.
S202, obtain a first threshold and a second threshold according to the current scene and the correspondence between scenes and thresholds, wherein the first threshold is greater than the second threshold, then perform step S203.
S203, analyze the acoustic features of the wake-up voice according to a first acoustic model to obtain a first similarity between the wake-up voice and a preset wake-up word signal, then perform step S204.
S204, judge whether the first similarity is greater than the second threshold and less than the first threshold, then perform one of step S205, step S207, or step S208 according to the result.
S205, if the judgment result is yes, send the wake-up voice to the cloud server, so that the cloud server determines, according to a second acoustic model, a second similarity between the wake-up voice and the preset wake-up word signal, and generates a wake-up instruction for waking up the terminal device if the second similarity is greater than the first threshold; wherein the recognition accuracy of the second acoustic model is higher than that of the first acoustic model. Then perform step S206.
S206, receive the wake-up instruction and perform the operation of waking up the terminal device.
It should be noted that steps S201, S202, S203, S204, S205, and S206 in this embodiment are implemented in the same way as steps S101, S102, S103, S104, S105, and S106 in the above embodiment, respectively, and are not described again here.
S207, if the first similarity is greater than the first threshold, perform the operation of waking up the terminal device.
Specifically, when the local first acoustic model determines that the first similarity is greater than the first threshold, the wake-up voice is considered to match the preset wake-up word signal. Taking the preset wake-up word signal "Xiaodu Xiaodu" as an example, a matching recognition result indicates that the user has spoken the wake-up voice "Xiaodu Xiaodu", and the operation of waking up the terminal device can be performed.
S208, if the first similarity is less than the second threshold, do not perform the operation of waking up the terminal device.
Specifically, when the local first acoustic model determines that the first similarity is less than the second threshold, the wake-up voice is considered not to match the preset wake-up word signal. Taking the preset wake-up word signal "Xiaodu Xiaodu" as an example, a non-matching recognition result indicates that the user has not spoken the wake-up voice "Xiaodu Xiaodu", and the operation of waking up the terminal device is not performed.
In the voice awakening method provided by this embodiment of the present invention, when the local first acoustic model determines that the first similarity is greater than the first threshold, the operation of waking up the terminal device is performed; when the local first acoustic model determines that the first similarity is less than the second threshold, the operation of waking up the terminal device is not performed. That is, when the similarity between the wake-up voice and the preset wake-up word signal identified by the first acoustic model is clearly high or clearly low, the terminal device itself decides whether to perform the wake-up operation, without sending the wake-up voice to the cloud server for recognition. This improves the efficiency with which the terminal device performs the wake-up operation.
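The three branches of steps S205, S207, and S208 can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Decision`, `local_decision`, `should_wake`, and `cloud_confirms` are hypothetical, and the cloud round trip is represented by a plain callable that returns whether a wake-up instruction was received.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    WAKE = "wake"            # step S207: wake up locally
    IGNORE = "ignore"        # step S208: do not wake up
    ASK_CLOUD = "ask_cloud"  # step S205: defer to the cloud server


def local_decision(first_similarity: float,
                   first_threshold: float,
                   second_threshold: float) -> Decision:
    """Classify the similarity from the local first acoustic model."""
    if first_threshold <= second_threshold:
        raise ValueError("first threshold must be greater than second threshold")
    if first_similarity > first_threshold:
        return Decision.WAKE
    if first_similarity < second_threshold:
        return Decision.IGNORE
    # Assumption: a similarity exactly equal to a threshold is treated as
    # ambiguous; the text only specifies the strict inequalities.
    return Decision.ASK_CLOUD


def should_wake(first_similarity: float,
                first_threshold: float,
                second_threshold: float,
                cloud_confirms: Callable[[], bool]) -> bool:
    """Return True if the terminal device should be woken up.

    `cloud_confirms` is a hypothetical stand-in for sending the wake-up
    voice to the cloud server and receiving (or not) a wake-up instruction.
    It is only called when the local result is ambiguous.
    """
    decision = local_decision(first_similarity, first_threshold, second_threshold)
    if decision is Decision.WAKE:
        return True
    if decision is Decision.IGNORE:
        return False
    return cloud_confirms()
```

Because the cloud callable is only invoked in the ambiguous band, the clearly high and clearly low cases incur no network round trip, which is the efficiency gain this embodiment describes.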
Fig. 3 is a schematic structural diagram of a voice awakening device according to an embodiment of the present invention. The device may be implemented in hardware and/or software, and may also be integrated into a terminal device, for performing the voice awakening method.
As shown in Fig. 3, the voice awakening device provided in this embodiment includes:
First detection module 01, for detecting the wake-up voice input to the terminal device;
Second detection module 02, for detecting the current scene in which the terminal device is located;
Threshold module 03, for obtaining a first threshold and a second threshold according to the current scene and the correspondence between scenes and thresholds, wherein the first threshold is greater than the second threshold;
Analysis module 04, for analyzing the acoustic features of the wake-up voice according to a first acoustic model to obtain a first similarity between the wake-up voice and a preset wake-up word signal;
Judge module 05, for judging whether the first similarity is greater than the second threshold and less than the first threshold, and, if the judgment result is yes, triggering the sending module;
Sending module 06, for sending the wake-up voice to the cloud server, so that the cloud server determines, according to a second acoustic model, a second similarity between the wake-up voice and the preset wake-up word signal, and generates a wake-up instruction for waking up the terminal device if the second similarity is greater than the first threshold; wherein the recognition accuracy of the second acoustic model is higher than that of the first acoustic model;
First execution module 07, for receiving the wake-up instruction and performing the operation of waking up the terminal device.
Further, the cloud server includes a wake-up instruction generation module;
The wake-up instruction generation module is specifically used for:
Analyzing the acoustic features of the wake-up voice according to the second acoustic model to obtain the pronunciation sequence corresponding to the wake-up voice;
Analyzing the pronunciation sequence corresponding to the wake-up voice according to a language model to obtain the text sequence corresponding to the wake-up voice;
Matching the text sequence corresponding to the wake-up voice against the text sequence corresponding to the preset wake-up word signal;
If the match is successful, generating the wake-up instruction for waking up the terminal device.
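The cloud-side steps above (acoustic model to pronunciation sequence, language model to text sequence, then text matching) can be sketched as a short pipeline. This is only an illustration: the two models are passed in as plain callables, and `toy_acoustic_model`, `toy_language_model`, the token strings, and the dictionary-shaped wake-up instruction are all hypothetical placeholders rather than anything specified in the text.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical model interfaces: the text does not specify their form.
AcousticModel = Callable[[Sequence[float]], List[str]]   # features -> pronunciation sequence
LanguageModel = Callable[[List[str]], str]               # pronunciation sequence -> text


def generate_wake_instruction(acoustic_features: Sequence[float],
                              acoustic_model: AcousticModel,
                              language_model: LanguageModel,
                              wake_word_text: str) -> Optional[dict]:
    """Return a wake-up instruction if the recognized text matches, else None."""
    pronunciation = acoustic_model(acoustic_features)   # second acoustic model
    text = language_model(pronunciation)                # language model
    if text == wake_word_text:                          # text sequence matching
        return {"type": "wake"}                         # hypothetical instruction format
    return None


def toy_acoustic_model(features: Sequence[float]) -> List[str]:
    """Toy stand-in: maps 'loud' input to the wake-word pronunciation."""
    return ["xiao", "du", "xiao", "du"] if sum(features) > 1.0 else ["ni", "hao"]


def toy_language_model(pronunciation: List[str]) -> str:
    """Toy stand-in: joins pronunciation tokens into a text sequence."""
    return " ".join(pronunciation)
```

A call such as `generate_wake_instruction([0.6, 0.7], toy_acoustic_model, toy_language_model, "xiao du xiao du")` yields an instruction, while an input that decodes to different text yields `None`, in which case no wake-up instruction would be sent back to the terminal device.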
Further, the analysis module 04 is specifically used for:
Determining, according to the acoustic features of the wake-up voice and the first acoustic model, the feature similarity between each acoustic feature of the wake-up voice and the corresponding acoustic feature of the preset wake-up word signal;
Determining the first similarity between the wake-up voice and the preset wake-up word signal according to each feature similarity.
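One way to picture the two stages of the analysis module is to compute a similarity per feature and then combine the results. The text does not say which similarity measure or combination rule is used, so the sketch below makes loud assumptions: each acoustic feature is a numeric vector, features are aligned by index, the per-feature measure is cosine similarity, and the first similarity is their mean. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def first_similarity(voice_features: List[Sequence[float]],
                     wake_word_features: List[Sequence[float]]) -> float:
    """Combine per-feature similarities into one score.

    Assumptions (not from the source): features are aligned one-to-one
    and the combination rule is a plain average.
    """
    if not voice_features or len(voice_features) != len(wake_word_features):
        raise ValueError("feature lists must be non-empty and of equal length")
    similarities = [cosine_similarity(v, w)
                    for v, w in zip(voice_features, wake_word_features)]
    return sum(similarities) / len(similarities)
```

The resulting score is what would then be compared against the first and second thresholds by the judge module.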
Further, the second detection module 02 is specifically used for:
Detecting the current location of the terminal device, and determining the current scene in which the terminal device is located according to the current location;
Or, the second detection module 02 is specifically used for: detecting the scene voice of the terminal device, performing a concordance analysis on the scene voice to obtain the corpus set of the scene voice, determining the scene corresponding to the corpus set, and taking the scene corresponding to the corpus set as the current scene in which the terminal device is located.
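Once the current scene is known, the threshold module's use of the correspondence between scenes and thresholds can be sketched as a table lookup. The scene names, threshold values, and default fallback below are hypothetical; the text only requires that each scene map to a first threshold that is greater than its second threshold.

```python
from typing import Dict, Tuple

# Hypothetical correspondence between scenes and (first, second) thresholds.
SCENE_THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "home": (0.80, 0.40),
    "car": (0.85, 0.50),
    "office": (0.90, 0.60),
}

# Assumption: scenes missing from the table fall back to a default pair.
DEFAULT_THRESHOLDS: Tuple[float, float] = (0.85, 0.50)


def thresholds_for_scene(scene: str) -> Tuple[float, float]:
    """Return (first_threshold, second_threshold) for the detected scene."""
    first, second = SCENE_THRESHOLDS.get(scene, DEFAULT_THRESHOLDS)
    if first <= second:
        raise ValueError("first threshold must be greater than second threshold")
    return first, second
```

Keeping the thresholds in a per-scene table lets the width of the ambiguous band differ between, say, a quiet and a noisy environment without changing the decision logic.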
For the device in this embodiment, the specific manner in which each module performs its operation has been described in detail in the embodiments of the method, and is not elaborated here.
The voice awakening device provided by this embodiment of the present invention includes: a first detection module, for detecting the wake-up voice input to the terminal device; a second detection module, for detecting the current scene in which the terminal device is located; a threshold module, for obtaining a first threshold and a second threshold according to the current scene and the correspondence between scenes and thresholds, wherein the first threshold is greater than the second threshold; an analysis module, for analyzing the acoustic features of the wake-up voice according to a first acoustic model to obtain a first similarity between the wake-up voice and a preset wake-up word signal; a judge module, for judging whether the first similarity is greater than the second threshold and less than the first threshold, and, if the judgment result is yes, triggering the sending module; a sending module, for sending the wake-up voice to the cloud server, so that the cloud server determines, according to a second acoustic model, a second similarity between the wake-up voice and the preset wake-up word signal, and generates a wake-up instruction for waking up the terminal device if the second similarity is greater than the first threshold, wherein the recognition accuracy of the second acoustic model is higher than that of the first acoustic model; and a first execution module, for receiving the wake-up instruction and performing the operation of waking up the terminal device. With this device, when the similarity between the detected wake-up voice and the preset wake-up word signal, as identified by the local first acoustic model, is neither high nor low, the wake-up voice can be recognized again by the second acoustic model of the cloud server. This avoids, as far as possible, falsely waking the terminal device or failing to wake it when it should be woken, and improves the user experience.
Fig. 4 is a schematic structural diagram of a voice awakening device according to another embodiment of the present invention. On the basis of the above embodiment, the voice awakening device also includes a second execution module and a third execution module.
As shown in Fig. 4, the voice awakening device provided in this embodiment includes:
First detection module 01, for detecting the wake-up voice input to the terminal device;
Second detection module 02, for detecting the current scene in which the terminal device is located;
Threshold module 03, for obtaining a first threshold and a second threshold according to the current scene and the correspondence between scenes and thresholds, wherein the first threshold is greater than the second threshold;
Analysis module 04, for analyzing the acoustic features of the wake-up voice according to a first acoustic model to obtain a first similarity between the wake-up voice and a preset wake-up word signal;
Judge module 05, for judging whether the first similarity is greater than the second threshold and less than the first threshold, and, if the judgment result is yes, triggering the sending module; or, if the judgment result of the judge module is that the first similarity is greater than the first threshold, triggering the second execution module; or, if the judgment result of the judge module is that the first similarity is less than the second threshold, triggering the third execution module;
Sending module 06, for sending the wake-up voice to the cloud server, so that the cloud server determines, according to a second acoustic model, a second similarity between the wake-up voice and the preset wake-up word signal, and generates a wake-up instruction for waking up the terminal device if the second similarity is greater than the first threshold; wherein the recognition accuracy of the second acoustic model is higher than that of the first acoustic model;
First execution module 07, for receiving the wake-up instruction and performing the operation of waking up the terminal device.
Further, the cloud server includes a wake-up instruction generation module;
The wake-up instruction generation module is specifically used for:
Analyzing the acoustic features of the wake-up voice according to the second acoustic model to obtain the pronunciation sequence corresponding to the wake-up voice;
Analyzing the pronunciation sequence corresponding to the wake-up voice according to a language model to obtain the text sequence corresponding to the wake-up voice;
Matching the text sequence corresponding to the wake-up voice against the text sequence corresponding to the preset wake-up word signal;
If the match is successful, generating the wake-up instruction for waking up the terminal device.
Further, the analysis module 04 is specifically used for:
Determining, according to the acoustic features of the wake-up voice and the first acoustic model, the feature similarity between each acoustic feature of the wake-up voice and the corresponding acoustic feature of the preset wake-up word signal;
Determining the first similarity between the wake-up voice and the preset wake-up word signal according to each feature similarity.
Further, the second detection module 02 is specifically used for:
Detecting the current location of the terminal device, and determining the current scene in which the terminal device is located according to the current location;
Or, the second detection module 02 is specifically used for: detecting the scene voice of the terminal device, performing a concordance analysis on the scene voice to obtain the corpus set of the scene voice, determining the scene corresponding to the corpus set, and taking the scene corresponding to the corpus set as the current scene in which the terminal device is located.
Second execution module 08, for performing the operation of waking up the terminal device.
Third execution module 09, for not performing the operation of waking up the terminal device.
For the device in this embodiment, the specific manner in which each module performs its operation has been described in detail in the embodiments of the method, and is not elaborated here.
In the voice awakening device provided by this embodiment of the present invention, when the local first acoustic model determines that the first similarity is greater than the first threshold, the operation of waking up the terminal device is performed; when the local first acoustic model determines that the first similarity is less than the second threshold, the operation of waking up the terminal device is not performed. That is, when the similarity between the wake-up voice and the preset wake-up word signal identified by the first acoustic model is clearly high or clearly low, the terminal device itself decides whether to perform the wake-up operation, without sending the wake-up voice to the cloud server for recognition. This improves the efficiency with which the terminal device performs the wake-up operation.
Fig. 5 shows a block diagram of an exemplary computer device 20 suitable for implementing embodiments of the present invention. The computer device 20 shown in Fig. 5 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
As shown in Fig. 5, the computer device 20 takes the form of a general-purpose computing device. The components of the computer device 20 may include, but are not limited to: one or more processors or processing units 21, a system memory 22, and a bus 23 connecting the different system components (including the system memory 22 and the processing unit 21).
The bus 23 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 20 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer device 20, including volatile and non-volatile media, and removable and non-removable media.
The system memory 22 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 30 and/or a cache memory 32. The computer device may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in Fig. 5, commonly referred to as a "hard disk drive"). Although not shown in Fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") may be provided, as well as an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a Compact Disc Read Only Memory (CD-ROM), a Digital Video Disc Read Only Memory (DVD-ROM), or other optical media). In these cases, each drive may be connected to the bus 23 through one or more data media interfaces. The memory 22 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory 22. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present invention.
The computer device 20 may also communicate with one or more external devices 50 (such as a keyboard, a pointing device, a display 60, etc.), with one or more devices that enable a user to interact with the computer device 20, and/or with any device (such as a network card, a modem, etc.) that enables the computer device 20 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 24. In addition, the computer device 20 may communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 25. As shown in the figure, the network adapter 25 communicates with the other modules of the computer device 20 through the bus 23. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the computer device 20, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processing unit 21 runs the programs stored in the system memory 22 to perform various functional applications and data processing, for example to implement the voice awakening method shown in Fig. 1 and Fig. 2.
Any combination of one or more computer-readable media may be used. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and that computer-readable medium can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
The program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
Computer program code for performing the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
In order to implement the above embodiments, the present invention also proposes a computer program product; when the instructions in the computer program product are executed by a processor, the voice awakening method of the foregoing embodiments is performed.
In order to implement the above embodiments, the present invention also proposes a non-transitory computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the voice awakening method of the foregoing embodiments can be implemented.
In the description of this specification, a description with reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided there is no conflict, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, for example two or three, unless otherwise specifically defined.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions for implementing logic functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented with software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented with any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or part of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, each unit may exist physically on its own, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.
Claims (13)
- 1. A voice awakening method, characterized by comprising: detecting the wake-up voice input to a terminal device and the current scene in which the terminal device is located; obtaining a first threshold and a second threshold according to the current scene and the correspondence between scenes and thresholds, wherein the first threshold is greater than the second threshold; analyzing the acoustic features of the wake-up voice according to a first acoustic model to obtain a first similarity between the wake-up voice and a preset wake-up word signal; judging whether the first similarity is greater than the second threshold and less than the first threshold; if the judgment result is yes, sending the wake-up voice to a cloud server, so that the cloud server determines, according to a second acoustic model, a second similarity between the wake-up voice and the preset wake-up word signal, and generates a wake-up instruction for waking up the terminal device if the second similarity is greater than the first threshold, wherein the recognition accuracy of the second acoustic model is higher than that of the first acoustic model; and receiving the wake-up instruction and performing the operation of waking up the terminal device.
- 2. The method according to claim 1, characterized in that generating the wake-up instruction for waking up the terminal device if the second similarity is greater than the first threshold comprises: analyzing the acoustic features of the wake-up voice according to the second acoustic model to obtain the pronunciation sequence corresponding to the wake-up voice; analyzing the pronunciation sequence corresponding to the wake-up voice according to a language model to obtain the text sequence corresponding to the wake-up voice; matching the text sequence corresponding to the wake-up voice against the text sequence corresponding to the preset wake-up word signal; and, if the match is successful, generating the wake-up instruction for waking up the terminal device.
- 3. The method according to claim 1, characterized in that analyzing the acoustic features of the wake-up voice according to the first acoustic model to obtain the first similarity between the wake-up voice and the preset wake-up word signal comprises: determining, according to the acoustic features of the wake-up voice and the first acoustic model, the feature similarity between each acoustic feature of the wake-up voice and the corresponding acoustic feature of the preset wake-up word signal; and determining the first similarity between the wake-up voice and the preset wake-up word signal according to each feature similarity.
- 4. The method according to claim 1, characterized in that detecting the current scene in which the terminal device is located comprises: detecting the current location of the terminal device, and determining the current scene in which the terminal device is located according to the current location; or detecting the scene voice of the terminal device, performing a concordance analysis on the scene voice to obtain the corpus set of the scene voice, determining the scene corresponding to the corpus set, and taking the scene corresponding to the corpus set as the current scene in which the terminal device is located.
- 5. The method according to claim 1, characterized by further comprising: if the first similarity is greater than the first threshold, performing the operation of waking up the terminal device; or, if the first similarity is less than the second threshold, not performing the operation of waking up the terminal device.
- A kind of 6. voice Rouser, it is characterised in that including:First detection module, the wake-up voice of terminal device is input to for detecting;Second detection module, for detecting the current scene residing for the terminal device;Threshold module, for obtaining first threshold and the second threshold according to the corresponding relation of the current scene and scene and threshold value Value, wherein, the first threshold is more than the Second Threshold;Analysis module, for being analyzed according to the first acoustic model the acoustic feature of the wake-up voice, called out described in acquisition The first similarity waken up between voice and default wake-up word signal;Judge module, the Second Threshold and it is less than the first threshold for judging whether first similarity is more than, if Judged result is yes, triggers sending module;Sending module, for the wake-up voice to be sent into cloud server so that cloud server is according to the second acoustic model The wake-up voice and default second similarity waken up between word signal are judged, if second similarity is more than described First threshold, then generate the wake-up for waking up the terminal device and instruct;Wherein, the accuracy of identification of second acoustic model More than the accuracy of identification of first acoustic model;First execution module, for receive it is described wake up to instruct and perform wake up the operation of the terminal device.
- 7. The device of claim 6, wherein the cloud server comprises a wake-up instruction generation module, and the wake-up instruction generation module is specifically configured to: analyze the acoustic features of the wake-up voice according to the second acoustic model to obtain a pronunciation sequence corresponding to the wake-up voice; analyze the pronunciation sequence corresponding to the wake-up voice according to a language model to obtain a text sequence corresponding to the wake-up voice; match the text sequence corresponding to the wake-up voice against the text sequence corresponding to the preset wake-up word signal; and, if the match succeeds, generate the wake-up instruction for waking up the terminal device.
- 8. The device of claim 6, wherein the analysis module is specifically configured to: determine, according to the acoustic features of the wake-up voice and the first acoustic model, feature similarities between the acoustic features of the wake-up voice and the acoustic features of the preset wake-up word signal; and determine the first similarity between the wake-up voice and the preset wake-up word signal according to the feature similarities.
- 9. The device of claim 6, wherein the second detection module is specifically configured to: detect the current location of the terminal device, and determine, according to the current location, the current scene in which the terminal device is located; or the second detection module is specifically configured to: detect scene voice of the terminal device, perform corpus analysis on the scene voice to obtain a corpus set of the scene voice, determine the scene corresponding to the corpus set, and take the scene corresponding to the corpus set as the current scene in which the terminal device is located.
- 10. The device of claim 6, further comprising a second execution module and a third execution module, wherein: if the judgment result of the judgment module is that the first similarity is greater than the first threshold, the second execution module is triggered, the second execution module being configured to perform the operation of waking up the terminal device; or if the judgment result of the judgment module is that the first similarity is less than the second threshold, the third execution module is triggered, the third execution module being configured not to perform the operation of waking up the terminal device.
- 11. A computer device, comprising a processor and a memory, wherein the processor reads executable program code stored in the memory and runs a program corresponding to the executable program code, so as to implement the voice wake-up method of any one of claims 1-5.
- 12. A computer program product, wherein the voice wake-up method of any one of claims 1-5 is performed when instructions in the computer program product are executed by a processor.
- 13. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the voice wake-up method of any one of claims 1-5 is implemented when the computer program is executed by a processor.
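
The two-threshold decision recited in claims 5 and 6 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the scene names, the threshold values, the `SCENE_THRESHOLDS` table and the `cloud_similarity` callback (which stands in for the cloud server and its second acoustic model) are all hypothetical. The claims do not say what happens when the first similarity exactly equals a threshold; the sketch assumes such boundary cases are deferred to the cloud.

```python
from enum import Enum
from typing import Callable, Dict, Tuple


class Decision(Enum):
    WAKE = "wake"
    DO_NOT_WAKE = "do_not_wake"


# Hypothetical scene -> (first_threshold, second_threshold) correspondence.
# The values are made up for illustration; the claims only require that the
# first threshold be greater than the second.
SCENE_THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "home": (0.80, 0.40),
    "car": (0.85, 0.50),
}


def thresholds_for(scene: str) -> Tuple[float, float]:
    """Look up the threshold pair for a scene and check first > second."""
    first, second = SCENE_THRESHOLDS[scene]
    if first <= second:
        raise ValueError("first threshold must be greater than second threshold")
    return first, second


def decide_wake(
    first_similarity: float,
    scene: str,
    cloud_similarity: Callable[[], float],
) -> Decision:
    """Decide locally when the score is clear-cut, otherwise ask the cloud.

    `cloud_similarity` is a hypothetical stand-in for sending the wake-up
    voice to the cloud server and obtaining the second similarity. In the
    claims the server itself generates the wake-up instruction; the final
    comparison is done here only to keep the sketch self-contained.
    """
    first, second = thresholds_for(scene)
    if first_similarity > first:
        return Decision.WAKE          # clearly similar: wake locally
    if first_similarity < second:
        return Decision.DO_NOT_WAKE   # clearly dissimilar: stay asleep
    # Ambiguous band (boundary values included by assumption): defer to cloud.
    second_similarity = cloud_similarity()
    if second_similarity > first:
        return Decision.WAKE
    return Decision.DO_NOT_WAKE
```

For example, `decide_wake(0.6, "home", lambda: 0.9)` falls in the ambiguous band for the hypothetical "home" scene, consults the callback, and returns `Decision.WAKE`, whereas a local score of 0.9 or 0.1 is resolved on the device without calling the callback at all.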
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710922732.XA CN107622770B (en) | 2017-09-30 | 2017-09-30 | Voice wake-up method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107622770A (en) | 2018-01-23 |
CN107622770B (en) | 2021-03-16 |
Family
ID=61091402
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710922732.XA Active CN107622770B (en) | 2017-09-30 | 2017-09-30 | Voice wake-up method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107622770B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160189706A1 (en) * | 2014-12-30 | 2016-06-30 | Broadcom Corporation | Isolated word training and detection |
CN106297777A (en) * | 2016-08-11 | 2017-01-04 | 广州视源电子科技股份有限公司 | Method and apparatus for waking up a voice service |
CN106448663A (en) * | 2016-10-17 | 2017-02-22 | 海信集团有限公司 | Voice wakeup method and voice interaction device |
CN106653031A (en) * | 2016-10-17 | 2017-05-10 | 海信集团有限公司 | Voice wake-up method and voice interaction device |
CN107134279A (en) * | 2017-06-30 | 2017-09-05 | 百度在线网络技术(北京)有限公司 | Voice wake-up method, device, terminal and storage medium |
Cited By (60)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10964317B2 (en) | 2017-07-05 | 2021-03-30 | Baidu Online Network Technology (Beijing) Co., Ltd. | Voice wakeup method, apparatus and system, cloud server and readable medium |
CN108198548A (en) * | 2018-01-25 | 2018-06-22 | 苏州奇梦者网络科技有限公司 | Voice wake-up method and system |
CN110444195B (en) * | 2018-01-31 | 2021-12-14 | 腾讯科技(深圳)有限公司 | Method and device for recognizing voice keywords |
US11222623B2 (en) | 2018-01-31 | 2022-01-11 | Tencent Technology (Shenzhen) Company Limited | Speech keyword recognition method and apparatus, computer-readable storage medium, and computer device |
CN110444193A (en) * | 2018-01-31 | 2019-11-12 | 腾讯科技(深圳)有限公司 | Method and device for recognizing voice keywords |
CN110444195A (en) * | 2018-01-31 | 2019-11-12 | 腾讯科技(深圳)有限公司 | Method and device for recognizing voice keywords |
CN108335696A (en) * | 2018-02-09 | 2018-07-27 | 百度在线网络技术(北京)有限公司 | Voice awakening method and device |
CN108196465A (en) * | 2018-03-07 | 2018-06-22 | 佛山市云米电器科技有限公司 | Smart speaker controlled by voice commands and control method thereof |
CN108537019A (en) * | 2018-03-20 | 2018-09-14 | 努比亚技术有限公司 | Unlocking method and device, and storage medium |
WO2019179285A1 (en) * | 2018-03-22 | 2019-09-26 | 腾讯科技(深圳)有限公司 | Speech recognition method, apparatus and device, and storage medium |
US11450312B2 (en) | 2018-03-22 | 2022-09-20 | Tencent Technology (Shenzhen) Company Limited | Speech recognition method, apparatus, and device, and storage medium |
US11574632B2 (en) | 2018-04-23 | 2023-02-07 | Baidu Online Network Technology (Beijing) Co., Ltd. | In-cloud wake-up method and system, terminal and computer-readable storage medium |
CN108665900B (en) * | 2018-04-23 | 2020-03-03 | 百度在线网络技术(北京)有限公司 | Cloud wake-up method and system, terminal and computer readable storage medium |
CN108665900A (en) * | 2018-04-23 | 2018-10-16 | 百度在线网络技术(北京)有限公司 | Cloud wake-up method and system, terminal and computer readable storage medium |
CN108924337A (en) * | 2018-05-02 | 2018-11-30 | 宇龙计算机通信科技(深圳)有限公司 | Method and device for controlling wake-up performance |
CN110600023A (en) * | 2018-06-12 | 2019-12-20 | Tcl集团股份有限公司 | Terminal equipment interaction method and device and terminal equipment |
CN108831477A (en) * | 2018-06-14 | 2018-11-16 | 出门问问信息科技有限公司 | Speech recognition method, device, equipment and storage medium |
CN108962240A (en) * | 2018-06-14 | 2018-12-07 | 百度在线网络技术(北京)有限公司 | Earphone-based voice control method and system |
CN109215647A (en) * | 2018-08-30 | 2019-01-15 | 出门问问信息科技有限公司 | Voice awakening method, electronic equipment and non-transitory computer readable storage medium |
CN112655043A (en) * | 2018-09-11 | 2021-04-13 | 日本电信电话株式会社 | Keyword detection device, keyword detection method, and program |
CN109473092A (en) * | 2018-12-03 | 2019-03-15 | 珠海格力电器股份有限公司 | Voice endpoint detection method and device |
CN109584873A (en) * | 2018-12-13 | 2019-04-05 | 北京极智感科技有限公司 | Wake-up method, device, readable medium and equipment for vehicle-mounted voice system |
CN109817200A (en) * | 2019-01-30 | 2019-05-28 | 北京声智科技有限公司 | Device and method for optimizing voice wake-up |
CN110049107B (en) * | 2019-03-22 | 2022-04-08 | 钛马信息网络技术有限公司 | Internet vehicle awakening method, device, equipment and medium |
CN110049107A (en) * | 2019-03-22 | 2019-07-23 | 钛马信息网络技术有限公司 | Internet vehicle awakening method, device, equipment and medium |
CN110060678A (en) * | 2019-04-16 | 2019-07-26 | 深圳欧博思智能科技有限公司 | Virtual role control method based on intelligent device and intelligent device |
CN110060678B (en) * | 2019-04-16 | 2021-09-14 | 深圳欧博思智能科技有限公司 | Virtual role control method based on intelligent device and intelligent device |
CN110070857A (en) * | 2019-04-25 | 2019-07-30 | 北京梧桐车联科技有限责任公司 | Model parameter adjusting method and device of voice awakening model and voice equipment |
CN110070857B (en) * | 2019-04-25 | 2021-11-23 | 北京梧桐车联科技有限责任公司 | Model parameter adjusting method and device of voice awakening model and voice equipment |
CN110047487A (en) * | 2019-06-05 | 2019-07-23 | 广州小鹏汽车科技有限公司 | Wake-up method and device for vehicle-mounted voice equipment, vehicle and machine-readable medium |
CN110390934A (en) * | 2019-06-25 | 2019-10-29 | 华为技术有限公司 | Information prompting method and voice interaction terminal |
CN110390934B (en) * | 2019-06-25 | 2022-07-26 | 华为技术有限公司 | Information prompting method and voice interaction terminal |
CN110544468A (en) * | 2019-08-23 | 2019-12-06 | Oppo广东移动通信有限公司 | Application awakening method and device, storage medium and electronic equipment |
CN110515449A (en) * | 2019-08-30 | 2019-11-29 | 北京安云世纪科技有限公司 | Method and device for awakening intelligent equipment |
CN110515449B (en) * | 2019-08-30 | 2021-06-04 | 北京安云世纪科技有限公司 | Method and device for awakening intelligent equipment |
CN110718212A (en) * | 2019-10-12 | 2020-01-21 | 出门问问信息科技有限公司 | Voice wake-up method, device and system, terminal and computer readable storage medium |
CN110706703A (en) * | 2019-10-16 | 2020-01-17 | 珠海格力电器股份有限公司 | Voice wake-up method, device, medium and equipment |
CN110808030A (en) * | 2019-11-22 | 2020-02-18 | 珠海格力电器股份有限公司 | Voice awakening method, system, storage medium and electronic equipment |
CN111081251A (en) * | 2019-11-27 | 2020-04-28 | 云知声智能科技股份有限公司 | Voice wake-up method and device |
CN111081251B (en) * | 2019-11-27 | 2022-03-04 | 云知声智能科技股份有限公司 | Voice wake-up method and device |
CN110910878A (en) * | 2019-11-27 | 2020-03-24 | 珠海格力电器股份有限公司 | Voice wake-up control method and device, storage medium and household appliance |
CN110910878B (en) * | 2019-11-27 | 2022-02-11 | 珠海格力电器股份有限公司 | Voice wake-up control method and device, storage medium and household appliance |
CN113192499A (en) * | 2020-01-10 | 2021-07-30 | 青岛海信移动通信技术股份有限公司 | Voice awakening method and terminal |
WO2021179854A1 (en) * | 2020-03-12 | 2021-09-16 | Oppo广东移动通信有限公司 | Voiceprint wakeup method and apparatus, device, and storage medium |
CN111696562A (en) * | 2020-04-29 | 2020-09-22 | 华为技术有限公司 | Voice wake-up method, device and storage medium |
CN111696562B (en) * | 2020-04-29 | 2022-08-19 | 华为技术有限公司 | Voice wake-up method, device and storage medium |
CN111627439B (en) * | 2020-05-21 | 2022-07-22 | 腾讯科技(深圳)有限公司 | Audio data processing method and device, storage medium and electronic equipment |
CN111627439A (en) * | 2020-05-21 | 2020-09-04 | 腾讯科技(深圳)有限公司 | Audio data processing method and device, storage medium and electronic equipment |
CN111724766B (en) * | 2020-06-29 | 2024-01-05 | 合肥讯飞数码科技有限公司 | Language identification method, related equipment and readable storage medium |
CN111724766A (en) * | 2020-06-29 | 2020-09-29 | 合肥讯飞数码科技有限公司 | Language identification method, related equipment and readable storage medium |
CN112133301A (en) * | 2020-08-21 | 2020-12-25 | 深圳数联天下智能科技有限公司 | Voice recognition method, control device, voice recognition circuit and household equipment |
WO2022156438A1 (en) * | 2021-01-20 | 2022-07-28 | 华为技术有限公司 | Wakeup method and electronic device |
CN113516977A (en) * | 2021-03-15 | 2021-10-19 | 南京每深智能科技有限责任公司 | Keyword recognition method and system |
CN113330513A (en) * | 2021-04-20 | 2021-08-31 | 华为技术有限公司 | Voice information processing method and device |
CN113205809A (en) * | 2021-04-30 | 2021-08-03 | 思必驰科技股份有限公司 | Voice wake-up method and device |
CN113613079B (en) * | 2021-10-11 | 2022-01-04 | 浙江德塔森特数据技术有限公司 | Intelligent device video advertisement processing method and intelligent device |
CN113613079A (en) * | 2021-10-11 | 2021-11-05 | 浙江德塔森特数据技术有限公司 | Intelligent device video advertisement processing method and intelligent device |
CN114915514A (en) * | 2022-03-28 | 2022-08-16 | 青岛海尔科技有限公司 | Intention processing method and device, storage medium and electronic device |
CN114915514B (en) * | 2022-03-28 | 2024-03-22 | 青岛海尔科技有限公司 | Method and device for processing intention, storage medium and electronic device |
WO2024034980A1 (en) * | 2022-08-09 | 2024-02-15 | Samsung Electronics Co., Ltd. | Context-aware false trigger mitigation for automatic speech recognition (asr) systems or other systems |
Also Published As
Publication number | Publication date |
---|---|
CN107622770B (en) | 2021-03-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107622770A (en) | Voice awakening method and device | |
US11854527B2 (en) | Electronic device and method of controlling speech recognition by electronic device | |
CN110718223B (en) | Method, apparatus, device and medium for voice interaction control | |
US11670302B2 (en) | Voice processing method and electronic device supporting the same | |
CN109427333B (en) | Method for activating speech recognition service and electronic device for implementing said method | |
WO2021093449A1 (en) | Wakeup word detection method and apparatus employing artificial intelligence, device, and medium | |
KR102582291B1 (en) | Emotion information-based voice synthesis method and device | |
KR102371313B1 (en) | Electronic apparatus for recognizing keyword included in your utterance to change to operating state and controlling method thereof | |
CN106469552B (en) | Speech recognition apparatus and method | |
CN109036396A (en) | Interaction method and system for third-party applications | |
CN107134279A (en) | Voice wake-up method, device, terminal and storage medium | |
US20240021202A1 (en) | Method and apparatus for recognizing voice, electronic device and medium | |
US20220076674A1 (en) | Cross-device voiceprint recognition | |
TW201403588A (en) | Power-efficient voice activation | |
CN107545029A (en) | Voice feedback method and equipment for smart device, and computer-readable storage medium | |
CN110248021A (en) | Volume control method and system for smart device | |
CN114038457B (en) | Method, electronic device, storage medium, and program for voice wakeup | |
EP3550449A1 (en) | Search method and electronic device using the method | |
CN112735418A (en) | Voice interaction processing method and device, terminal and storage medium | |
KR102426411B1 (en) | Electronic apparatus for processing user utterance and server | |
KR102409873B1 (en) | Method and system for training speech recognition models using augmented consistency regularization | |
KR102380717B1 (en) | Electronic apparatus for processing user utterance and controlling method thereof | |
CN115132195B (en) | Voice wakeup method, device, equipment, storage medium and program product | |
CN112037772B (en) | Response obligation detection method, system and device based on multiple modes | |
CN111508481B (en) | Training method and device of voice awakening model, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||