CN106448663A - Voice wakeup method and voice interaction device - Google Patents

Publication number
CN106448663A
Authority
CN
China
Prior art keywords
voice
signal
acoustic model
wake
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610901706.4A
Other languages
Chinese (zh)
Other versions
CN106448663B (en)
Inventor
杨香斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense Group Co Ltd
Original Assignee
Hisense Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hisense Group Co Ltd
Priority to CN201610901706.4A
Publication of CN106448663A
Application granted
Publication of CN106448663B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/10: Speech classification or search using distance or distortion measures between unknown speech and reference templates
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Telephone Function (AREA)

Abstract

The present invention provides a voice wake-up method and a voice interaction device. The method includes: receiving a voice input signal; determining, according to a first acoustic model, a first similarity between the voice input signal and a preset wake-up voice signal, and judging whether the first similarity exceeds a first preset threshold; if it does, determining, according to a second acoustic model, a second similarity between the voice input signal and the preset wake-up voice signal, and judging whether the second similarity exceeds a second preset threshold; and if it does, waking up a voice interaction function, wherein the accuracy of the second acoustic model is higher than that of the first acoustic model. The voice wake-up method and voice interaction device provided by the embodiments of the invention have low power consumption and a low false wake-up rate.

Description

Voice wake-up method and voice interaction device
Technical field
Embodiments of the present invention relate to the technical field of speech recognition, and in particular to a voice wake-up method and a voice interaction device.
Background
With the rapid development of speech recognition technology, voice interaction is being applied in more and more scenarios; smart TVs, in-vehicle systems, smart homes, and intelligent robots are its main application scenarios. Meanwhile, as the demands on user experience in human-machine interaction keep rising, human-machine voice interaction is no longer limited to near-field speech (within 50 cm). With multi-microphone technology, far-field voice interaction at 3 to 5 meters is now achievable.
Far-field voice interaction also raises a problem: when to start capturing audio and begin recognition. There are currently two technical schemes. In the first, a low-power chip continuously receives sound through the microphone array, performs the corresponding signal processing (signal enhancement, noise suppression, echo cancellation), and then performs speech recognition to judge whether the user has spoken the wake-up word; if so, it notifies the main module, which starts capturing audio and performing speech recognition. In the second, the front-end module performs only signal processing, and the main module continuously captures audio and performs speech recognition to judge whether the user has spoken the wake-up word. Both schemes have drawbacks. In the first, the front-end processing module must be low-power, so its recognition performance is relatively low and its false wake-up rate is relatively high. In the second, the main chip module must run at full speed all the time, so power consumption is relatively high, and because the demands on the main chip module are higher, the cost of the scheme is also higher. At present, no scheme balances power consumption against false wake-up rate.
Summary of the invention
Embodiments of the present invention provide a voice wake-up method and a voice interaction device, to solve the problem that the prior art cannot balance power consumption against false wake-up rate.
A first aspect of the embodiments of the present invention provides a voice wake-up method, the method including:
receiving a voice input signal;
determining, according to a first acoustic model, a first similarity between the voice input signal and a preset wake-up voice signal, and judging whether the first similarity exceeds a first preset threshold;
if it does, determining, according to a second acoustic model, a second similarity between the voice input signal and the preset wake-up voice signal, and judging whether the second similarity exceeds a second preset threshold, wherein the accuracy of the second acoustic model is higher than that of the first acoustic model;
if it does, waking up a voice interaction function.
A second aspect of the embodiments of the present invention provides a voice interaction device, the device including:
a receiving module, configured to receive a voice input signal;
a first determining module, configured to determine, according to a first acoustic model, a first similarity between the voice input signal and a preset wake-up voice signal, and to judge whether the first similarity exceeds a first preset threshold;
a second determining module, configured to, when the first similarity exceeds the first preset threshold, determine, according to a second acoustic model, a second similarity between the voice input signal and the preset wake-up voice signal, and to judge whether the second similarity exceeds a second preset threshold, wherein the accuracy of the second acoustic model is higher than that of the first acoustic model;
a wake-up module, configured to wake up a voice interaction function when the second similarity exceeds the second preset threshold.
In the embodiments of the present invention, a first acoustic model of lower accuracy first performs preliminary wake-up recognition on the voice input signal. When the similarity between the voice input signal and the preset wake-up voice signal is found to exceed the first preset threshold, a second acoustic model of higher accuracy performs a second wake-up recognition on the voice input signal, and the result of the second recognition determines whether the voice interaction function is woken up. Because the first recognition uses the lower-accuracy acoustic model, its power consumption is low. Only when the first recognition passes, that is, when the similarity between the voice input signal and the preset wake-up voice signal exceeds the first preset threshold, is the higher-accuracy second acoustic model enabled for the second wake-up recognition. Combining the lower-accuracy and higher-accuracy acoustic models in this way avoids the low recognition accuracy and high false wake-up rate of using the low-accuracy model alone, and also avoids the high power consumption of using the high-accuracy model alone, thereby achieving both low power consumption and a low false wake-up rate.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the voice wake-up method provided by an embodiment of the present invention;
Fig. 2 is an architecture diagram of the voice interaction device provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the voice interaction device provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The terms "comprising" and "having" in the description and claims of this specification, and any variations of them, are intended to cover a non-exclusive inclusion. For example, a process or device that contains a series of steps or structures is not necessarily limited to those steps or structures expressly listed, but may include other steps or structures that are not expressly listed or that are inherent to such a process or device.
Fig. 1 is a schematic flowchart of the voice wake-up method provided by an embodiment of the present invention. The method may be executed by a voice interaction device with a voice interaction function, such as a smart TV, an in-vehicle system, a smart home device, or an intelligent robot. As shown in Fig. 1, the method provided by this embodiment includes the following steps:
Step S101: receive a voice input signal.
In practice, the voice interaction device may receive, through a microphone array arranged on it, a voice signal input by a user or a terminal device. After receiving the voice signal, it applies delay compensation to ensure that the received signal is complete, so that no part of the voice signal is missed and the wake-up judgment is not affected.
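The text does not say how delay compensation is implemented. One common approach, assumed here purely for illustration, is a pre-roll buffer that always holds the most recent audio frames, so that the frames just before speech is detected can be prepended to the utterance. The class name `PreRollBuffer` and the parameter `max_frames` are hypothetical and do not come from the patent.

```python
from collections import deque


class PreRollBuffer:
    """Holds the most recent frames so the onset of speech is not lost.

    Hypothetical helper: the patent only states that delay compensation
    keeps the received voice signal complete.
    """

    def __init__(self, max_frames):
        # A bounded deque silently discards the oldest frame when full.
        self._frames = deque(maxlen=max_frames)

    def push(self, frame):
        self._frames.append(frame)

    def drain(self):
        """Return the buffered frames in arrival order and empty the buffer."""
        frames = list(self._frames)
        self._frames.clear()
        return frames


# Frames f0..f3 arrive before speech is detected; only the last 3 are retained.
buf = PreRollBuffer(max_frames=3)
for frame in ["f0", "f1", "f2", "f3"]:
    buf.push(frame)

# When speech is detected on frame f4, prepend the retained history to it.
utterance = buf.drain() + ["f4"]
```

The buffer size trades memory against how much of the utterance onset can be recovered; a suitable value depends on the detector's latency, which the text does not specify.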
Further, after the complete voice signal is obtained, it is preprocessed to obtain the "voice input signal" referred to in this embodiment. Specifically, preprocessing includes at least noise suppression, echo cancellation, and voice enhancement; these are similar to speech processing in the prior art and are not described again here.
Step S102: according to the first acoustic model, determine the first similarity between the voice input signal and the preset wake-up voice signal, and judge whether the first similarity exceeds the first preset threshold; if not, end this wake-up operation; if so, perform step S103.
The first preset threshold may be user-defined according to actual needs, or set by default by the terminal device; the embodiment of the present invention does not limit this.
Specifically, the voice wake-up method provided in this embodiment includes two judgment processes, the first of which may be performed by a DSP module. In the first judgment, a feature signal is first extracted from the voice input signal obtained in step S101. For example, the feature signal may be obtained by extracting the Mel-frequency cepstral coefficients (MFCCs) of the voice input signal; this process is the same as in the prior art and is not described again here.
Further, in practice, a simple acoustic model may be built into the DSP module. This acoustic model decodes the feature signal obtained above, and a maximum likelihood ratio calculation is used to judge the similarity between the feature signal and the wake-up voice signal. The basic principle is to compare each feature point in the feature signal with the corresponding feature point of the preset wake-up voice signal in the acoustic model, and then combine all the points into a single maximum likelihood value. The formula itself is missing from this text.
In that formula, $x_i$ is the sample value of the $i$-th feature point in the feature signal, $\mu$ is the value in the model, and $\theta$ is the maximum likelihood value to be calculated; the similarity between the current voice input signal and the preset wake-up voice signal is calculated from this maximum likelihood value. When the calculated similarity exceeds the first preset threshold, the second wake-up judgment is started; otherwise the wake-up operation ends. In this embodiment, the process by which the DSP module performs the first wake-up judgment on the voice input signal is similar to the prior art and is not described again here.
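Because the formula is not available, the following is only a sketch of one way a point-by-point likelihood comparison could be combined into a single score. It assumes, without support from the text, that each feature point is scored with a Gaussian centred on the model value and that the per-point log-likelihood ratios (relative to a perfect match) are averaged. The function name `static_similarity` and the parameter `sigma` are hypothetical.

```python
import math


def static_similarity(features, template, sigma=1.0):
    """Score how closely `features` matches `template`, in the range (0, 1].

    Assumption (not from the patent): each point x_i is modelled as
    Gaussian around the model value mu_i with a shared spread `sigma`.
    The score is exp(mean log-likelihood ratio against a perfect match),
    so identical inputs score exactly 1.0.
    """
    if not features or len(features) != len(template):
        raise ValueError("features and template must be non-empty and equal length")
    squared = [((x - mu) / sigma) ** 2 for x, mu in zip(features, template)]
    mean_log_ratio = -0.5 * sum(squared) / len(squared)
    return math.exp(mean_log_ratio)


perfect = static_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
near = static_similarity([1.1, 2.0, 2.9], [1.0, 2.0, 3.0])
far = static_similarity([3.0, 0.0, 1.0], [1.0, 2.0, 3.0])
```

Each point is scored independently of its neighbours, which is what makes this kind of comparison cheap and "static"; it ignores the order-dependent structure of speech.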
It should be noted that, because the first wake-up judgment uses a relatively simple acoustic model, the demands on the DSP module are low and so is its power consumption.
The above is only an example and not the sole limitation of the present invention. For example, in practice the similarity of two speech segments may also be calculated with a windowed DTW (dynamic time warping) method; its biggest problem, however, is that differences in speaking style can seriously affect the recognition rate.
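For reference, a minimal sketch of DTW with an optional window constraint is shown below. It is the textbook algorithm on one-dimensional sequences with an absolute-difference cost, not the variant the patent has in mind (which is not described); the function name `dtw_distance` and the `window` parameter are illustrative.

```python
import math


def dtw_distance(a, b, window=None):
    """Dynamic time warping distance between two 1-D sequences.

    `window` limits how far the alignment may stray from the diagonal
    (a Sakoe-Chiba style band); None means unconstrained.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    # The band must be at least |n - m| wide or no alignment path exists.
    w = max(n, m) if window is None else max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3], window=1)
shifted = dtw_distance([0, 0], [1, 1])
```

Here the second sequence is a time-stretched copy of the first, so the warped distance is zero even though the lengths differ. DTW absorbs differences in speaking rate this way, but it still compares raw templates, which is consistent with the weakness the text notes for differing speaking styles.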
Step S103: according to the second acoustic model, determine the second similarity between the voice input signal and the preset wake-up voice signal, and judge whether the second similarity exceeds the second preset threshold; if so, wake up the voice interaction function, otherwise do not. The accuracy of the second acoustic model is higher than that of the first acoustic model.
In this embodiment, the second wake-up judgment may be performed by a main chip processing module. After the first wake-up judgment, if the similarity between the voice input signal and the preset wake-up voice signal exceeds the first preset threshold, the main chip processing module is activated. It then obtains the above feature signal from the DSP module and, using its built-in higher-accuracy acoustic model (the second acoustic model) and that feature signal, determines the second similarity between the voice input signal and the preset wake-up voice signal. The calculated second similarity is then compared with the second preset threshold: when the second similarity exceeds the second preset threshold, the voice interaction function is woken up; otherwise it is not.
It should be noted that, until the DSP module determines that the similarity between the voice input signal and the preset wake-up voice signal exceeds the first preset threshold, the main chip processing module is inactive, that is, in a low-power working state or a sleep state. When the DSP module determines that this similarity exceeds the first preset threshold, it sends the feature signal corresponding to the voice signal to the main chip processing module, thereby activating it.
Specifically, in this embodiment the second wake-up judgment uses a different method from the first. The second judgment uses a more complex similarity decoding algorithm, such as Viterbi, a dynamic programming algorithm that can account for the relationship between preceding and following states in the voice signal content. The first judgment is a static similarity calculation that only computes the maximum likelihood value of each sampling point. The two acoustic models also differ: the one in the DSP module is a very simple model that is easy to compute, whereas the one in the main chip processing module is more complex and more precise.
As an example, assume the wake-up word in the wake-up voice is "Vidaa, Vidaa". The calculation in the DSP module can be thought of as decomposing this segment of speech into 256 sampling points and then using the maximum likelihood algorithm to compare, across these 256 points, how probable it is that the values in the acoustic model coincide with those of the captured voice input signal. This is a static calculation method; for example, once this probability reaches 70%, the user is considered to have possibly said "Vidaa Vidaa".
The second wake-up judgment then starts. The main chip processing module may feed the voice input signal and the wake-up voice signal into a trained HMM acoustic model with high accuracy and high robustness, and use the Viterbi algorithm to calculate the similarity between the voice input signal and the wake-up voice signal. This is a dynamic programming algorithm that calculates, for each point in the voice signal, the transition probabilities to the preceding and following pronunciation units. When a person speaks, the pronunciation of each word is continuous, which is determined by the vocal cords; the pronunciation characteristics of each pinyin syllable or phoneme therefore determine the transition probabilities between adjacent points. This part is computationally heavy but also highly accurate. Therefore, if the similarity calculated by Viterbi exceeds the second preset threshold (for example 90%), the user is considered to have actually said "Vidaa Vidaa". The above is only an example and not the sole limitation of the present invention.
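To make the dynamic-programming idea concrete, here is a minimal log-domain Viterbi sketch over a toy two-unit HMM. It is the standard algorithm, not the patent's trained model: the unit names `"vi"` and `"daa"`, the discrete observation symbols `"a"` and `"b"`, and every probability below are made-up values for illustration. How the resulting path score is mapped to the "second similarity" is not specified in the text and is not attempted here.

```python
import math


def safe_log(p):
    """log(p) that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state path and its log-probability."""
    # best[s]: log-probability of the best path so far that ends in state s.
    best = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    back = []  # back[t][s]: best predecessor of s at step t + 1
    for obs in observations[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            best[s] = prev[p] + log_trans[p][s] + log_emit[s][obs]
            pointers[s] = p
        back.append(pointers)
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


# Toy left-to-right model: "vi" must precede "daa" and cannot be re-entered.
states = ["vi", "daa"]
log_start = {"vi": safe_log(1.0), "daa": safe_log(0.0)}
log_trans = {
    "vi": {"vi": safe_log(0.5), "daa": safe_log(0.5)},
    "daa": {"vi": safe_log(0.0), "daa": safe_log(1.0)},
}
log_emit = {
    "vi": {"a": safe_log(0.9), "b": safe_log(0.1)},
    "daa": {"a": safe_log(0.2), "b": safe_log(0.8)},
}

path, log_prob = viterbi(["a", "a", "b", "b"], states, log_start, log_trans, log_emit)
```

Unlike a per-point comparison, the score of each step depends on which unit came before it, which is why this stage is both more expensive and more discriminating.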
It should be noted that, in this embodiment, the purpose of the second wake-up recognition is to recognize the voice input signal more accurately and so avoid false wake-ups. In practice, therefore, the second preset threshold should be set greater than or equal to the first preset threshold.
In this embodiment, a first acoustic model of lower accuracy first performs preliminary wake-up recognition on the voice input signal. When the similarity between the voice input signal and the preset wake-up voice signal is found to exceed the first preset threshold, a second acoustic model of higher accuracy performs a second wake-up recognition on the voice input signal, and the result of the second recognition determines whether the voice interaction function is woken up. Because the first recognition uses the lower-accuracy acoustic model, its power consumption is low. Only when the first recognition passes, that is, when the similarity between the voice input signal and the preset wake-up voice signal exceeds the first preset threshold, is the higher-accuracy second acoustic model enabled for the second wake-up recognition. Combining the lower-accuracy and higher-accuracy acoustic models in this way avoids the low recognition accuracy and high false wake-up rate of using the low-accuracy model alone, and also avoids the high power consumption of using the high-accuracy model alone, thereby achieving both low power consumption and a low false wake-up rate.
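The two-stage flow of steps S102 and S103 can be sketched as a small cascade. The default thresholds of 0.70 and 0.90 echo the 70% and 90% figures in the example above; the function and parameter names are hypothetical, and the two scoring callables are stand-ins for the DSP-side and main-chip-side models rather than real implementations.

```python
def wake_decision(features, first_stage, second_stage,
                  first_threshold=0.70, second_threshold=0.90):
    """Return True only if both stages exceed their thresholds.

    `first_stage` and `second_stage` are callables returning a similarity
    score; the expensive second stage runs only when the first one passes.
    """
    if second_threshold < first_threshold:
        raise ValueError("second threshold should be >= first threshold")
    if first_stage(features) <= first_threshold:
        return False  # end this wake-up operation without the costly model
    return second_stage(features) > second_threshold


# Stand-in scorers that read precomputed scores and record second-stage use.
second_stage_calls = []


def cheap_score(features):
    return features["cheap"]


def accurate_score(features):
    second_stage_calls.append(features)
    return features["accurate"]


rejected_early = wake_decision({"cheap": 0.50, "accurate": 0.99}, cheap_score, accurate_score)
calls_after_early_reject = len(second_stage_calls)
rejected_late = wake_decision({"cheap": 0.80, "accurate": 0.85}, cheap_score, accurate_score)
accepted = wake_decision({"cheap": 0.80, "accurate": 0.95}, cheap_score, accurate_score)
```

The first call never reaches the accurate scorer, which is the source of the power saving: in the common case of no wake-up word, only the cheap model runs.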
Fig. 2 is an architecture diagram of the voice interaction device provided by an embodiment of the present invention. As shown in Fig. 2, the voice interaction device includes a DSP module and a main chip processing module. A relatively simple acoustic model (that is, the lower-accuracy acoustic model) is built into the DSP module, and an acoustic model with higher accuracy and robustness is built into the main chip processing module. When the main chip processing module has not been triggered by the DSP module, it is in a low-power working state or a sleep state; preferably it is in the sleep state, which minimizes the power consumption of the main chip.
In practice, after the microphone array receives a voice input signal, the DSP module uses endpoint detection (voice activity detection, VAD) to determine whether a voice signal is being input, for example with an existing algorithm based on short-time energy and short-time zero-crossing rate. The algorithm is applied in this embodiment in the same way as in the prior art and is not described again here. After endpoint detection completes, delay compensation is required to ensure that the voice input signal is complete. Before signal processing is applied to the voice input signal, this segment of the voice input signal must be saved in full so that it can be sent to a cloud server for recognition. Signal processing includes at least noise suppression, echo cancellation, and voice enhancement. In practice, noise suppression may be performed on the basis of a combination of multiple filters. Echo cancellation and voice enhancement are performed as in the prior art and are not described again here.
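As a rough illustration of the short-time energy and zero-crossing-rate cues mentioned above, the sketch below classifies a single frame. The decision rule and all threshold values are assumptions chosen for the example, not values from the patent: a frame counts as speech if it is loud, or if it is moderately loud with many zero crossings (as unvoiced consonants tend to be).

```python
def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        raise ValueError("frame needs at least two samples")
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_speech(frame, high_energy=0.01, low_energy=0.001, zcr_threshold=0.3):
    """Illustrative rule with made-up thresholds (see lead-in)."""
    energy = short_time_energy(frame)
    if energy > high_energy:
        return True
    return energy > low_energy and zero_crossing_rate(frame) > zcr_threshold


silence = [0.0] * 8
loud = [0.5, -0.5] * 4
quiet_fricative = [0.05, -0.05] * 4
```

Both measures need only additions, multiplications, and sign checks per sample, which suits a low-power DSP front end.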
Further, after the above signal processing is complete, a feature signal is first extracted from the voice input signal. The simple acoustic model in the DSP module then decodes the extracted feature signal, and the similarity between the feature signal and the preset wake-up voice signal is calculated. When the calculated similarity exceeds the first preset threshold, the main chip processing module is triggered to perform a further wake-up judgment; otherwise this wake-up operation is abandoned. It should be noted that the DSP module makes only a preliminary wake-up judgment with a simple acoustic model, so it only needs to operate in a low-power working environment.
Further, when the main chip processing module is triggered, it can obtain, through the data interface between it and the DSP module, the feature signal that the DSP module obtained during the first wake-up judgment, and perform the second wake-up recognition on the voice input signal using its built-in higher-accuracy acoustic model and that feature signal. The main chip processing module performs the second wake-up recognition here in the same way as the second wake-up recognition in the embodiment shown in Fig. 1, which is not described again here.
The architecture shown in Fig. 2 uses the fast, low-power front-end DSP module to perform preliminary wake-up recognition on the voice input signal, and also uses the DSP module's computing resources to perform feature extraction once, saving computing resources for the second wake-up recognition in the main chip processing module. Until it receives the trigger signal from the DSP module, the main chip processing module runs in low-power mode throughout. Once triggered, it uses its own large storage and computing resources, together with the feature signal sent by the DSP module, to perform wake-up recognition on the voice input signal quickly and efficiently. The whole architecture can therefore achieve both low power consumption and high accuracy.
Fig. 3 is a schematic structural diagram of the voice interaction device provided by an embodiment of the present invention. As shown in Fig. 3, the device provided by this embodiment includes:
a receiving module 11, configured to receive a voice input signal;
a first determining module 12, configured to determine, according to a first acoustic model, a first similarity between the voice input signal and a preset wake-up voice signal, and to judge whether the first similarity exceeds a first preset threshold;
a second determining module 13, configured to, when the first similarity exceeds the first preset threshold, determine, according to a second acoustic model, a second similarity between the voice input signal and the preset wake-up voice signal, and to judge whether the second similarity exceeds a second preset threshold, wherein the accuracy of the second acoustic model is higher than that of the first acoustic model;
a wake-up module 14, configured to wake up a voice interaction function when the second similarity exceeds the second preset threshold.
The second preset threshold is greater than or equal to the first preset threshold.
The first determining module 12 includes:
an acquisition submodule 121, configured to extract a feature signal from the voice input signal;
a first determining submodule 122, configured to determine, according to the first acoustic model and the feature signal, a first maximum likelihood value between the feature signal and the preset wake-up voice signal, and to determine, according to the first maximum likelihood value, the first similarity between the voice input signal and the preset wake-up voice signal.
The second determining module 13 includes:
a second determining submodule 131, configured to determine, according to the second acoustic model, a first transition probability between each pronunciation unit in the feature signal and the pronunciation units before or after it, and a second transition probability between each pronunciation unit in the corresponding wake-up voice signal and the pronunciation units before or after it; and to determine, according to the first transition probability and the second transition probability, the second similarity between the feature signal and the wake-up voice signal.
The voice interaction device provided by this embodiment can be used to execute the method shown in Fig. 1; its specific manner of execution and beneficial effects are similar to those of the embodiment shown in Fig. 1 and are not described again here.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some or all of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A voice wake-up method, characterized by comprising:
receiving a voice input signal;
determining, according to a first acoustic model, a first similarity between the voice input signal and a preset wake-up voice signal, and judging whether the first similarity exceeds a first preset threshold;
if it does, determining, according to a second acoustic model, a second similarity between the voice input signal and the preset wake-up voice signal, and judging whether the second similarity exceeds a second preset threshold, wherein the accuracy of the second acoustic model is higher than the accuracy of the first acoustic model;
if it does, waking up a voice interaction function.
2. The method according to claim 1, characterized in that the second preset threshold is greater than the first preset threshold.
3. The method according to claim 2, characterized in that determining, according to the first acoustic model, the first similarity between the voice input signal and the preset wake-up voice signal comprises:
extracting a feature signal from the voice input signal;
determining, according to the first acoustic model and the feature signal, a first maximum likelihood value between the feature signal and the preset wake-up voice signal;
determining, according to the first maximum likelihood value, the first similarity between the voice input signal and the preset wake-up voice signal.
4. The method according to claim 3, characterized in that, when the first similarity exceeds the first preset threshold, determining, according to the second acoustic model, the second similarity between the voice input signal and the preset wake-up voice signal comprises:
determining, according to the second acoustic model, a first transition probability between a pronunciation unit in the feature signal and the pronunciation unit before or after it, and a second transition probability between a pronunciation unit in the corresponding wake-up voice signal and the pronunciation unit before or after it;
determining, according to the first transition probability and the second transition probability, the second similarity between the feature signal and the wake-up voice signal.
5. The method according to any one of claims 1 to 4, characterized in that the first acoustic model is provided in a DSP module and the second acoustic model is provided in a main chip processing module.
6. A voice interaction device, characterized by comprising:
a receiving module, configured to receive a voice input signal;
a first determining module, configured to determine, according to a first acoustic model, a first similarity between the voice input signal and a preset wake-up voice signal, and to judge whether the first similarity exceeds a first preset threshold;
a second determining module, configured to, when the first similarity exceeds the first preset threshold, determine, according to a second acoustic model, a second similarity between the voice input signal and the preset wake-up voice signal, and to judge whether the second similarity exceeds a second preset threshold, wherein the accuracy of the second acoustic model is higher than the accuracy of the first acoustic model;
a wake-up module, configured to wake up a voice interaction function when the second similarity exceeds the second preset threshold.
7. The device according to claim 6, characterized in that the second preset threshold is greater than the first preset threshold.
8. The device according to claim 7, characterized in that the first determining module comprises:
an acquisition submodule, configured to extract a feature signal from the voice input signal;
a first determination submodule, configured to determine, according to the first acoustic model and the feature signal, a first maximum likelihood value between the feature signal and the preset wake-up voice signal;
and to determine, according to the first maximum likelihood value, the first similarity between the voice input signal and the preset wake-up voice signal.
9. The device according to claim 8, characterized in that the second determining module comprises:
a second determination submodule, configured to determine, according to the second acoustic model, a first transition probability between a pronunciation unit in the feature signal and the pronunciation unit before and/or after it, and a second transition probability between a pronunciation unit in the corresponding wake-up voice signal and the pronunciation unit before and/or after it;
and to determine, according to the first transition probability and the second transition probability, the second similarity between the feature signal and the wake-up voice signal.
10. The device according to any one of claims 6 to 9, characterized in that the first acoustic model is provided in a DSP module and the second acoustic model is provided in a main chip processing module.
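
For illustration only, and not as part of the claims, the two-stage check recited in claims 1 and 2 might be sketched in Python as follows. The class name `TwoStageWakeup`, the scorer functions `coarse_score` and `fine_score`, and the threshold values 0.5 and 0.8 are all assumptions introduced for this sketch; the patent does not specify how either acoustic model computes its similarity at this level of detail.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# A scorer maps extracted features to a similarity score against the
# preset wake-up voice signal. The scorers used here are hypothetical stand-ins.
Scorer = Callable[[Sequence[float]], float]


@dataclass
class TwoStageWakeup:
    first_model: Scorer       # cheaper, less accurate (e.g. hosted on a DSP module)
    second_model: Scorer      # costlier, more accurate (e.g. hosted on the main chip)
    first_threshold: float
    second_threshold: float   # claim 2: chosen greater than first_threshold

    def should_wake(self, features: Sequence[float]) -> bool:
        # Stage 1: coarse check; on failure the second model is never run.
        if self.first_model(features) <= self.first_threshold:
            return False
        # Stage 2: confirm with the more accurate model.
        return self.second_model(features) > self.second_threshold


# Toy scorers, assumed purely for demonstration.
def coarse_score(features: Sequence[float]) -> float:
    return sum(features) / len(features)


def fine_score(features: Sequence[float]) -> float:
    return min(features)


detector = TwoStageWakeup(coarse_score, fine_score, 0.5, 0.8)
```

In this sketch, `detector.should_wake(features)` returns `True` only when both stages pass. Inputs rejected by the first stage never reach the second model, which matches the ordering recited in claim 1.
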
CN201610901706.4A 2016-10-17 2016-10-17 Voice awakening method and voice interaction device Active CN106448663B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610901706.4A CN106448663B (en) 2016-10-17 2016-10-17 Voice awakening method and voice interaction device

Publications (2)

Publication Number Publication Date
CN106448663A (en) 2017-02-22
CN106448663B (en) 2020-10-23

Family

ID=58174603

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610901706.4A Active CN106448663B (en) 2016-10-17 2016-10-17 Voice awakening method and voice interaction device

Country Status (1)

Country Link
CN (1) CN106448663B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120316879A1 (en) * 2008-05-28 2012-12-13 Koreapowervoice Co., Ltd. System for detecting speech interval and recognizing continous speech in a noisy environment through real-time recognition of call commands
CN103811003A (en) * 2012-11-13 2014-05-21 联想(北京)有限公司 Voice recognition method and electronic equipment
CN104143326A (en) * 2013-12-03 2014-11-12 腾讯科技(深圳)有限公司 Voice command recognition method and device
CN104599667A (en) * 2015-01-16 2015-05-06 联想(北京)有限公司 Information processing method and electronic device
CN105575395A (en) * 2014-10-14 2016-05-11 中兴通讯股份有限公司 Voice wake-up method and apparatus, terminal, and processing method thereof
US20160253995A1 (en) * 2013-11-14 2016-09-01 Huawei Technologies Co., Ltd. Voice recognition method, voice recognition device, and electronic device

Cited By (52)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10347252B2 (en) 2017-05-02 2019-07-09 Realtek Semiconductor Corp. Electronic device with wake on voice function and operation method thereof
TWI643123B (en) * 2017-05-02 2018-12-01 瑞昱半導體股份有限公司 Electronic device having wake on voice function and operating method thereof
CN108877788A (en) * 2017-05-08 2018-11-23 瑞昱半导体股份有限公司 Electronic device and its operating method with voice arousal function
US11276402B2 (en) 2017-05-08 2022-03-15 Cloudminds Robotics Co., Ltd. Method for waking up robot and robot thereof
WO2018205083A1 (en) * 2017-05-08 2018-11-15 深圳前海达闼云端智能科技有限公司 Robot wakeup method and device, and robot
CN107239897A (en) * 2017-05-31 2017-10-10 中南大学 A kind of personality occupation type method of testing and system
CN107396158A (en) * 2017-08-21 2017-11-24 深圳创维-Rgb电子有限公司 A kind of acoustic control interactive device, acoustic control exchange method and television set
CN107464565A (en) * 2017-09-20 2017-12-12 百度在线网络技术(北京)有限公司 A kind of far field voice awakening method and equipment
CN107464565B (en) * 2017-09-20 2020-08-04 百度在线网络技术(北京)有限公司 Far-field voice awakening method and device
CN107680591A (en) * 2017-09-21 2018-02-09 百度在线网络技术(北京)有限公司 Voice interactive method, device and its equipment based on car-mounted terminal
CN107742516B (en) * 2017-09-29 2020-11-17 上海望潮数据科技有限公司 Intelligent recognition method, robot and computer readable storage medium
CN107742516A (en) * 2017-09-29 2018-02-27 上海与德通讯技术有限公司 Intelligent identification Method, robot and computer-readable recording medium
CN107622770B (en) * 2017-09-30 2021-03-16 百度在线网络技术(北京)有限公司 Voice wake-up method and device
CN107622770A (en) * 2017-09-30 2018-01-23 百度在线网络技术(北京)有限公司 voice awakening method and device
CN107895573A (en) * 2017-11-15 2018-04-10 百度在线网络技术(北京)有限公司 Method and device for identification information
US10803861B2 (en) 2017-11-15 2020-10-13 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for identifying information
CN107895573B (en) * 2017-11-15 2021-08-24 百度在线网络技术(北京)有限公司 Method and device for identifying information
CN108122563B (en) * 2017-12-19 2021-03-30 北京声智科技有限公司 Method for improving voice awakening rate and correcting DOA
CN108122563A (en) * 2017-12-19 2018-06-05 北京声智科技有限公司 Improve voice wake-up rate and the method for correcting DOA
CN108198548A (en) * 2018-01-25 2018-06-22 苏州奇梦者网络科技有限公司 A kind of voice awakening method and its system
US11222623B2 (en) 2018-01-31 2022-01-11 Tencent Technology (Shenzhen) Company Limited Speech keyword recognition method and apparatus, computer-readable storage medium, and computer device
CN110444193A (en) * 2018-01-31 2019-11-12 腾讯科技(深圳)有限公司 The recognition methods of voice keyword and device
US11450312B2 (en) 2018-03-22 2022-09-20 Tencent Technology (Shenzhen) Company Limited Speech recognition method, apparatus, and device, and storage medium
WO2019179285A1 (en) * 2018-03-22 2019-09-26 腾讯科技(深圳)有限公司 Speech recognition method, apparatus and device, and storage medium
CN108831477A (en) * 2018-06-14 2018-11-16 出门问问信息科技有限公司 A kind of audio recognition method, device, equipment and storage medium
CN109215647A (en) * 2018-08-30 2019-01-15 出门问问信息科技有限公司 Voice awakening method, electronic equipment and non-transient computer readable storage medium
CN109065046A (en) * 2018-08-30 2018-12-21 出门问问信息科技有限公司 Method, apparatus, electronic equipment and the computer readable storage medium that voice wakes up
CN110890087A (en) * 2018-09-10 2020-03-17 北京嘉楠捷思信息技术有限公司 Voice recognition method and device based on cosine similarity
CN109036428A (en) * 2018-10-31 2018-12-18 广东小天才科技有限公司 A kind of voice wake-up device, method and computer readable storage medium
CN112740321A (en) * 2018-11-20 2021-04-30 深圳市欢太科技有限公司 Method and device for waking up equipment, storage medium and electronic equipment
CN109360550A (en) * 2018-12-07 2019-02-19 上海智臻智能网络科技股份有限公司 Test method, device, equipment and the storage medium of voice interactive system
CN109785825A (en) * 2018-12-29 2019-05-21 广东长虹日电科技有限公司 A kind of algorithm and storage medium, the electric appliance using it of speech recognition
CN109979438A (en) * 2019-04-04 2019-07-05 Oppo广东移动通信有限公司 Voice awakening method and electronic equipment
CN110534099A (en) * 2019-09-03 2019-12-03 腾讯科技(深圳)有限公司 Voice wakes up processing method, device, storage medium and electronic equipment
CN110570873B (en) * 2019-09-12 2022-08-05 Oppo广东移动通信有限公司 Voiceprint wake-up method and device, computer equipment and storage medium
CN110570873A (en) * 2019-09-12 2019-12-13 Oppo广东移动通信有限公司 voiceprint wake-up method and device, computer equipment and storage medium
CN110534102B (en) * 2019-09-19 2020-10-30 北京声智科技有限公司 Voice wake-up method, device, equipment and medium
CN110534102A (en) * 2019-09-19 2019-12-03 北京声智科技有限公司 A kind of voice awakening method, device, equipment and medium
CN110706691A (en) * 2019-10-12 2020-01-17 出门问问信息科技有限公司 Voice verification method and device, electronic equipment and computer readable storage medium
CN110890093A (en) * 2019-11-22 2020-03-17 腾讯科技(深圳)有限公司 Intelligent device awakening method and device based on artificial intelligence
CN110890093B (en) * 2019-11-22 2024-02-09 腾讯科技(深圳)有限公司 Intelligent equipment awakening method and device based on artificial intelligence
CN111161714A (en) * 2019-12-25 2020-05-15 联想(北京)有限公司 Voice information processing method, electronic equipment and storage medium
CN111831201A (en) * 2020-05-25 2020-10-27 中国人民解放军陆军军医大学第二附属医院 Human-computer interaction system and method for automatically detecting bone marrow cell morphology
CN112259085A (en) * 2020-09-28 2021-01-22 上海声瀚信息科技有限公司 Two-stage voice awakening algorithm based on model fusion framework
CN112885353A (en) * 2021-01-26 2021-06-01 维沃移动通信有限公司 Voice wake-up method and device and electronic equipment
CN113256937A (en) * 2021-07-07 2021-08-13 常州分音塔科技有限公司 Intelligent home nursing method and system based on intelligent detection of audio event
CN113593561A (en) * 2021-07-29 2021-11-02 普强时代(珠海横琴)信息技术有限公司 Ultra-low power consumption awakening method and device based on multi-stage trigger mechanism
CN113611304A (en) * 2021-08-30 2021-11-05 深圳鱼亮科技有限公司 Noise reduction mixing system and method based on large-screen voice awakening recognition
CN113611304B (en) * 2021-08-30 2024-02-06 深圳鱼亮科技有限公司 Large-screen voice awakening recognition noise reduction mixing system and method
CN113947855A (en) * 2021-09-18 2022-01-18 中标慧安信息技术股份有限公司 Intelligent building personnel safety alarm system based on voice recognition
CN117012206A (en) * 2023-10-07 2023-11-07 山东省智能机器人应用技术研究院 Man-machine voice interaction system
CN117012206B (en) * 2023-10-07 2024-01-16 山东省智能机器人应用技术研究院 Man-machine voice interaction system

Also Published As

Publication number Publication date
CN106448663B (en) 2020-10-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant