CN102999161B - Implementation method and application of a voice wake-up module - Google Patents


Info

Publication number
CN102999161B
Authority
CN
China
Prior art keywords
wake
word
voice
score
acoustic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210455175.2A
Other languages
Chinese (zh)
Other versions
CN102999161A (en)
Inventor
操文祥
王海坤
康怀茂
钱勇
谢信珍
黄海兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek South China Artificial Intelligence Research Institute (Guangzhou) Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd
Priority to CN201210455175.2A
Publication of CN102999161A
Application granted
Publication of CN102999161B
Active (legal status)
Anticipated expiration

Links

Landscapes

  • User Interface Of Digital Computer (AREA)

Abstract

An implementation method and application of a voice wake-up module, comprising: voice input (1), a voice wake-up algorithm (2) and wake-up execution (3). The voice wake-up algorithm (2) is realized mainly through acoustic feature extraction (4), wake-word detection (5), wake-word confirmation (6), construction of a wake-word detection network (7), training of an acoustic model (8) and construction of a wake-word confirmation network (9). With the present invention, the voice wake-up function can be triggered by a spoken wake word even in a noisy environment, whether or not music is playing, and the wake-up recognition performance is good. The method can be ported to and run on a general-purpose ARM or DSP processor, and is applicable to vehicle-mounted and household-appliance fields.

Description

Implementation method and application of a voice wake-up module
Technical field
The invention discloses an implementation method and application of a voice wake-up module, and specifically relates to a method in which the user speaks a predetermined voice wake word to trigger the system to perform the user's next operation. It can be applied in fields that need voice wake-up, such as vehicle-mounted equipment and household appliances.
Background art
The present invention relates to a published invention patent application, publication number CN102645977A, filing date 2012.03.26, inventors Yin Jianhong, Wang Zhong and Zhou Yanhuang, titled "A vehicle-mounted voice wake-up human-machine interaction system and method", which is incorporated herein by reference. The vehicle-mounted voice wake-up of that invention works as follows: a speech library, a vehicle-noise library, a speech engine and other information are stored in a preset flash memory; a voice command input through the microphone is compared by the master controller MCU against the voice-command information stored in the memory to perform speech recognition; and the voice-command information determined by the matching is used as an execution instruction to control the vehicle-mounted control function modules, realizing the corresponding function. The data stored in the flash memory of that invention are all fixed. In a vehicle environment, however, driving speed, road conditions, weather, and whether the air conditioner is on or the windows are open all change the vehicle noise (engine noise, tire noise and so on); the music played in the car varies; and differences between speakers change the speech library that has to be referenced. That invention is therefore only suitable for voice wake-up in fixed scenarios. The present invention instead collects recordings of different speakers in a variety of scenarios and trains an acoustic model on them, and additionally constructs a wake-word detection network and a confirmation network, so that it adapts to a wider range of scenarios while achieving good voice wake-up performance.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art by providing an implementation method of a voice wake-up system with which the voice wake-up function can be triggered by a spoken wake word even in a noisy environment, whether or not music is playing, with good wake-up performance. The invention further provides applications of the voice wake-up system, including applications in vehicle-mounted and household-appliance fields.
The present invention is achieved by the following technical solution: an implementation method of a voice wake-up module comprises the steps of voice input 1, voice wake-up algorithm 2 and wake-up execution 3. The voice wake-up algorithm 2 obtains the voice signal from voice input 1, performs voice wake-up processing, and outputs the result to wake-up execution 3, thereby completing the wake-up operation;
The voice wake-up algorithm 2 is realized through acoustic feature extraction 4, wake-word detection 5, wake-word confirmation 6, construction of the wake-word detection network 7, training of the acoustic model 8 and construction of the wake-word confirmation network 9. The specific implementation process is as follows:
Step 1, acoustic feature extraction 4: the voice signal is obtained from voice input 1, and discriminative features based on the characteristics of human hearing are extracted; the MFCC (Mel-Frequency Cepstral Coefficient) features used in speech recognition are usually chosen as the acoustic features;
Step 2, wake-word detection 5: using the trained acoustic model 8, acoustic scores are computed for the extracted acoustic features on the wake-word detection network 7. If the best-scoring path contains the wake word to be detected, the wake word is deemed detected and the method proceeds to step 3; otherwise it returns to step 1 and performs acoustic feature extraction 4 again;
Step 3, wake-word confirmation 6: using the trained acoustic model 8, the extracted acoustic features are confirmed on the wake-word confirmation network 9, yielding the final confirmation score of the wake word. To judge whether the detected wake word is a real wake word, this final confirmation score is compared with a preset threshold. If the final confirmation score is greater than or equal to the threshold, the wake word is deemed real, voice wake-up succeeds, and the result is output to wake-up execution 3, completing the voice wake-up operation. If the final confirmation score is less than the threshold, the wake word is deemed false, and the method returns to step 1 and performs acoustic feature extraction 4 again.
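The three steps above form a simple control loop, which can be sketched as follows. This is only an illustration under stated assumptions: the function `run_wake_up` and its callback parameters (`extract`, `detect`, `confirm`, `execute`) are hypothetical names that do not appear in the patent, and the callbacks stand in for acoustic feature extraction 4, wake-word detection 5, wake-word confirmation 6 and wake-up execution 3.

```python
from typing import Callable, Iterable, Optional, TypeVar

Audio = TypeVar("Audio")   # one chunk of input speech (voice input 1)
Feat = TypeVar("Feat")     # acoustic features of that chunk


def run_wake_up(
    chunks: Iterable[Audio],
    extract: Callable[[Audio], Feat],
    detect: Callable[[Feat], Optional[str]],
    confirm: Callable[[Feat, str], float],
    threshold: float,
    execute: Callable[[str], None],
) -> Optional[int]:
    """Return the index of the chunk that woke the system, or None."""
    for index, chunk in enumerate(chunks):
        features = extract(chunk)              # step 1: acoustic feature extraction
        candidate = detect(features)           # step 2: wake-word detection
        if candidate is None:
            continue                           # nothing detected: back to step 1
        score = confirm(features, candidate)   # step 3: wake-word confirmation
        if score >= threshold:                 # ">=" as in the text above
            execute(candidate)                 # hand over to wake-up execution
            return index
        # score below the threshold: false wake word, back to step 1
    return None
```

The same features are passed to both detection and confirmation, which matches the description that confirmation reuses the features extracted for detection.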
The training of the acoustic model 8 is divided into two parts: the phoneme acoustic model and the garbage model. The phoneme acoustic model is obtained with the acoustic model training method of conventional speech recognition: a database is chosen, and the model is trained under the MLE (Maximum Likelihood Estimation) criterion and the discriminative MPE (Minimum Phone Error) training criterion. The garbage model absorbs unrelated speech other than the wake word. Using the same database as for the phoneme models, the similarity between the phoneme models is computed and the phonemes are divided into 20 classes; all training data of the phonemes in each class are merged, and the corresponding garbage model is trained under the MLE criterion, yielding 20 classes of garbage models.
The wake-word detection network 7 is implemented by computing the optimal-score path, the formula for which is:
$$W = \arg\max_{W} P(W)\,P(X \mid W) \tag{2}$$
where $X$ represents the acoustic feature vectors extracted from the input speech and $W$ represents the optimal word sequence with the maximum score. The conditional probability $P(X \mid W)$ is the acoustic model score, computed by the trained acoustic model 8. The prior probability $P(W)$ is the language model score, which here is the penalty (Penalty) added for the different acoustic models. $P(X)$ is the total probability, which is a constant once the acoustic model and the wake-word detection network are fixed.
The wake-word confirmation network 9 is implemented as follows:
a. The detected wake word is decoded down to the phoneme level, and all scores ($Score_{phone_1}, Score_{phone_2}, \ldots, Score_{phone_N}$) are recorded, where $N$ is the total number of phonemes in the wake word;
$Score_{phone_1}, Score_{phone_2}, \ldots, Score_{phone_N}$ denote the decoding scores of the phonemes in the wake word, the subscript identifying each of the $N$ phonemes.
b. Using the same features as for wake-word detection, the corresponding acoustic scores are obtained at frame level ($Score_{frame_1}, Score_{frame_2}, \ldots, Score_{frame_M}$), where $M$ is the total duration of the features, in frames;
c. The confirmation score of each phoneme of the wake word is calculated as follows:
$$CM_{phone_i} = \left( Score_{phone_i} - \sum_{k=K_i^{start}}^{K_i^{end}} Score_{frame_k} \right) \Big/ \left( K_i^{end} - K_i^{start} \right) \tag{3}$$
where $K_i^{start}$ and $K_i^{end}$ are the start time and end time of the i-th phoneme;
$CM_{phone_i}$ denotes the confirmation score of the i-th phoneme, the subscript $phone_i$ denoting the i-th phoneme; $Score_{phone_i}$ is the decoding score of the i-th phoneme as defined above; and $Score_{frame_k}$ denotes the score of the k-th frame obtained by decoding on the wake-word confirmation network.
d. The final confirmation score of the wake word is calculated as follows:
$$CM_{word} = \frac{1}{N} \sum_{i=1}^{N} CM_{phone_i} \tag{4}$$
The method of the present invention can be ported to a general-purpose ARM or DSP processor and applied in vehicle-mounted and household-appliance fields.
A vehicle-mounted voice wake-up system, characterized by comprising: a microprocessor, a voice wake-up module, an audio conversion device, a recording device, an audio processing device and a public address system, wherein the voice wake-up module runs on the microprocessor. The specific implementation process is as follows:
Step 1: the microprocessor is connected to the audio processing device and controls it to output audio information; the audio processing device is connected to the public address system, which power-amplifies the audio to be played in order to drive the loudspeakers, completing the audio playback operation;
Step 2: the recording device is connected to the audio conversion device; when the user speaks the voice wake word, the recording device records the speech and passes it to the audio conversion device for conversion, completing the voice acquisition operation;
Step 3: the audio conversion device converts the speech recorded by the recording device and passes the converted data to the microprocessor, which performs the computation of the voice wake-up module described in claim 1, completing the voice data conversion operation;
Step 4: the microprocessor is connected to the audio conversion device and performs the computation of the voice wake-up module on the speech input from the audio conversion device. If the voice wake-up information is recognized correctly, the microprocessor controls the audio processing device to play a voice prompt tone, completing the vehicle-mounted voice wake-up and prompt-tone playback operation; if recognition fails, the voice acquisition operation of step 2 continues.
The advantages of the present invention over the prior art are:
(1) The present invention uses the user's spoken wake word as the trigger source and adds wake-word detection and wake-word confirmation, so that the voice wake-up function can be triggered by the spoken wake word even in a noisy environment, whether or not music is playing, with good wake-up performance. The user also does not need to use their hands: the wake-up function is triggered quickly by a voice command alone, after which the next interactive operation can proceed.
(2) The present invention is inexpensive to implement and its code is easy to port, giving it good application value.
(3) The present invention can be widely used in vehicle-mounted, household-appliance and other fields, and in any other field where voice wake-up is needed during audio playback. In a vehicle environment, before this system is used, a user who wants to start the recognition function while driving has to operate a button by hand and pause the music currently playing, which creates a safety hazard while driving and also gives a poor user experience.
(4) The value brought by the present invention is that, once this system is used, the voice wake-up function is triggered by speaking the agreed wake word, without pausing audio playback beforehand; actual tests have verified that the correct wake-up rate can exceed 90%. In other fields such as household appliances, a user who is watching television and wants to turn on the speech recognition function can likewise do so with the spoken wake word, making voice interaction more convenient and more user-friendly.
(5) The voice wake-up function of the present invention is implemented entirely by software algorithms and can easily be ported to general-purpose processors such as ARM or DSP.
Brief description of the drawings
Fig. 1 is a schematic block diagram of the implementation of the present invention;
Fig. 2 is a schematic block diagram of the construction of the wake-word detection network of the present invention;
Fig. 3 is a schematic block diagram of the construction of the wake-word confirmation network of the present invention;
Fig. 4 is a schematic diagram of a specific implementation of the present invention in the automotive field.
Detailed description of the embodiments
As shown in Fig. 1, the voice wake-up module of the present invention is realized by the steps of voice input 1, voice wake-up algorithm 2 and wake-up execution 3.
The voice wake-up algorithm 2 is realized mainly through acoustic feature extraction 4, wake-word detection 5, wake-word confirmation 6, construction of the wake-word detection network 7, training of the acoustic model 8 and construction of the wake-word confirmation network 9. The specific implementation process is:
(1) Training the acoustic model 8: the training of the acoustic model is divided into two parts, the phoneme acoustic model and the garbage model. The phoneme acoustic model is obtained with the acoustic model training method of conventional speech recognition: a suitable database is chosen, and the model is trained under the MLE (Maximum Likelihood Estimation) criterion and the discriminative MPE (Minimum Phone Error) training criterion. The garbage model absorbs unrelated speech other than the wake word. Using the same database as for the phoneme models, the similarity between the phoneme models is computed and the phonemes are divided into 20 classes; all training data of the phonemes in each class are merged, and the corresponding garbage model is trained under the MLE criterion, which yields 20 classes of garbage models. The garbage model is thus trained on the merged training data of the clustered phonemes and serves two purposes: in the wake-word detection network it absorbs speech other than the wake word, and in the wake-word confirmation network it is used to compute the score of the confirmation network.
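The grouping of phonemes into classes for garbage-model training can be illustrated with a small clustering sketch. The patent states only that similarity between phoneme models is computed and that 20 classes result; it does not name the similarity measure or the clustering algorithm. The sketch below therefore makes its own assumptions: each phoneme model is summarised by a hypothetical mean vector, similarity is Euclidean distance between those vectors, and classes are formed by greedy agglomerative merging of the closest centroids. The function name `cluster_phonemes` is illustrative.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence


def _distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, used here as an assumed (dis)similarity measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_phonemes(means: Dict[str, Sequence[float]], n_classes: int) -> List[List[str]]:
    """Greedily merge the two closest clusters until n_classes remain."""
    clusters: List[List[str]] = [[name] for name in sorted(means)]

    def centroid(cluster: List[str]) -> List[float]:
        dim = len(means[cluster[0]])
        return [sum(means[p][d] for p in cluster) / len(cluster) for d in range(dim)]

    while len(clusters) > n_classes:
        # Pick the pair of clusters whose centroids are closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: _distance(centroid(clusters[ij[0]]), centroid(clusters[ij[1]])),
        )
        clusters[i] = clusters[i] + clusters[j]   # i < j, so deleting j is safe
        del clusters[j]
    return clusters
```

In the setting described above, `n_classes` would be 20, and the training data of all phonemes in one returned class would then be pooled to train that class's garbage model under the MLE criterion.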
(2) Acoustic feature extraction 4: the voice signal is obtained from voice input 1, and features that are discriminative and based on the characteristics of human hearing are extracted; the MFCC (Mel-Frequency Cepstral Coefficient) features used in speech recognition are generally chosen.
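For readers unfamiliar with MFCC features, the following is a minimal textbook-style sketch of computing them for a single frame using only the Python standard library. It is not the patent's front end: the frame length, sampling rate, number of mel filters (26), number of coefficients (13), Hamming window and the mel-scale formula are common defaults assumed here for illustration, and the naive DFT is chosen for clarity rather than speed.

```python
import cmath
import math
from typing import List, Sequence


def hz_to_mel(f: float) -> float:
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m: float) -> float:
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame: Sequence[float]) -> List[float]:
    """Hamming-window the frame and return |DFT|^2 / n for bins 0..n/2."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))) for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(win[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(acc) ** 2 / n)
    return spec


def mel_filterbank(n_filters: int, n_fft: int, sample_rate: int) -> List[List[float]]:
    """Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist."""
    mel_max = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(i * mel_max / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling slope
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc_frame(frame: Sequence[float], sample_rate: int = 16000,
               n_filters: int = 26, n_ceps: int = 13) -> List[float]:
    """Log mel filterbank energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_e = [math.log(max(sum(w * s for w, s in zip(row, spec)), 1e-10)) for row in bank]
    return [
        sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters) for m, e in enumerate(log_e))
        for c in range(n_ceps)
    ]
```

Calling `mfcc_frame` on each 25 ms frame of a 16 kHz recording (400 samples) would give one 13-dimensional feature vector per frame; practical systems typically add pre-emphasis, frame overlap and delta features, which are omitted here.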
(3) Wake-word detection 5: using the acoustic model 8, acoustic scores are computed for the extracted acoustic features on the wake-word detection network 7. If the best-scoring path contains the wake word to be detected, the wake word is detected and the next step is entered; otherwise acoustic feature extraction is performed again. The network must ensure that the wake word is detected normally while invalid speech is effectively absorbed. The wake-word detection network therefore consists mainly of the wake word selected by the user and the garbage models, as shown in Fig. 2. In speech recognition such a network is also called a recognition network; because the structure of the wake-word detection network is very simple, it can be constructed by a simple program or by hand. Owing to the complexity of real operating environments, the received wake-up speech is in many cases contaminated by noise. The score of the corresponding acoustic features on the phoneme acoustic model then drops considerably, whereas the garbage models, being trained on the merged data of many phonemes, are not very precise in the first place, so the score of the acoustic features on the garbage models drops only slightly. The wake-up speech is then absorbed by the garbage models, and the system's wake-up rate falls.
To prevent this, when decoding on the wake-word detection network a certain penalty (Penalty) is applied to the decoding score of the arcs on which the garbage models lie, so that they cannot compete on equal terms with the phoneme acoustic model; this ensures that wake-up speech contaminated by noise can still be detected normally. The specific penalty size needs to be tuned experimentally for each wake word.
The wake-word detection network 7 is implemented by computing the optimal-score path.
The optimal-score path is obtained with the classical Bayes formula, as follows:
$$W = \arg\max_{W} P(W \mid X) = \arg\max_{W} \frac{P(W)\,P(X \mid W)}{P(X)} \tag{1}$$
In the above formula, $X$ represents the acoustic feature vectors extracted from the input speech and $W$ represents the optimal word sequence with the maximum score. The conditional probability $P(X \mid W)$ is the acoustic model score, which can be computed by the trained phoneme acoustic model and garbage model. The prior probability $P(W)$ is the language model score, which can be understood here as the penalty (Penalty) added for the different acoustic models. $P(X)$ is the total probability, which is a constant once the acoustic model and the wake-word detection network are fixed, so formula (1) can be written as:
$$W = \arg\max_{W} P(W)\,P(X \mid W) \tag{2}$$
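The effect of the garbage-arc penalty on the path selection of formula (2) can be shown with a small sketch in the log domain, where the product $P(W)P(X \mid W)$ becomes a sum. This is an illustration under assumptions, not the patent's decoder: a real system would search the network with Viterbi decoding, whereas here the candidate paths are simply enumerated, and the names `Arc`, `path_score` and `best_path` as well as the numeric scores are hypothetical.

```python
from typing import Dict, List, Tuple

# One arc of a candidate path: (unit name, acoustic log-score, lies on a garbage model?)
Arc = Tuple[str, float, bool]


def path_score(path: List[Arc], garbage_penalty: float) -> float:
    """log P(W) + log P(X|W), with log P(W) modelled as a per-arc garbage penalty."""
    total = 0.0
    for _name, acoustic_logp, is_garbage in path:
        total += acoustic_logp
        if is_garbage:
            total -= garbage_penalty   # handicap garbage arcs
    return total


def best_path(paths: Dict[str, List[Arc]], garbage_penalty: float) -> str:
    """Return the label of the highest-scoring candidate path."""
    return max(paths, key=lambda label: path_score(paths[label], garbage_penalty))
```

For example, suppose a noisy wake-word utterance scores -13.0 on each of four phoneme arcs (total -52.0) but -12.5 on each of four garbage arcs (total -50.0). Without a penalty the garbage path wins and the wake word is absorbed; with a penalty of 1.0 per garbage arc the garbage path drops to -54.0 and the wake-word path is selected.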
(4) Wake-word confirmation 6: because of the inaccuracy of the acoustic model itself and the complexity of real operating environments, a wake word obtained by wake-word detection is not necessarily a real wake word. To reduce the false wake-ups caused by non-wake words and the problems these would cause later, the detected wake word needs to be confirmed further. The present invention constructs the wake-word confirmation network 9 as shown in Fig. 3. Like the wake-word detection network, the wake-word confirmation network is a recognition network in the speech recognition sense; it contains only the garbage models and can be constructed by a simple program or by hand.
The main steps of wake-word confirmation are as follows:
a) The wake word obtained by wake-word detection is decoded down to the phoneme level, and all its scores ($Score_{phone_1}, Score_{phone_2}, \ldots, Score_{phone_N}$) are recorded, where $N$ is the total number of phonemes in the wake word.
b) Using the same features as for wake-word detection, the corresponding acoustic scores are obtained on the wake-word confirmation network at frame level ($Score_{frame_1}, Score_{frame_2}, \ldots, Score_{frame_M}$), where $M$ is the total duration of the features, in frames.
c) The confirmation score of each phoneme of the wake word is calculated as follows:
$$CM_{phone_i} = \left( Score_{phone_i} - \sum_{k=K_i^{start}}^{K_i^{end}} Score_{frame_k} \right) \Big/ \left( K_i^{end} - K_i^{start} \right) \tag{3}$$
where $K_i^{start}$ and $K_i^{end}$ are the start time and end time of the i-th phoneme.
d) The final confirmation score of the wake word is calculated as follows:
$$CM_{word} = \frac{1}{N} \sum_{i=1}^{N} CM_{phone_i} \tag{4}$$
e) To judge whether the wake word is a real wake word, its final confirmation score is compared with a preset threshold: if the confirmation score $CM_{word}$ is greater than the threshold $T$, the wake word is deemed real and wake-up succeeds; if $CM_{word}$ is less than the threshold $T$, the wake word is deemed false and acoustic feature extraction is performed again.
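Formulas (3) and (4) and the threshold test can be sketched directly in code. The sketch follows the formulas literally: the frame scores are summed over the inclusive range from the start frame to the end frame and the difference is divided by the end index minus the start index. The names `PhoneSegment`, `phone_confirmation`, `word_confirmation` and `is_real_wake_word` are hypothetical, frame indices are assumed to be zero-based, and the comparison uses "greater than or equal to" as in the summary and the claims.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PhoneSegment:
    score: float   # Score_{phone_i}: decoding score of phoneme i from wake-word detection
    start: int     # K_i^start: first frame index of phoneme i
    end: int       # K_i^end: last frame index of phoneme i


def phone_confirmation(seg: PhoneSegment, frame_scores: Sequence[float]) -> float:
    """Formula (3): duration-normalised gap between phoneme and confirmation-network scores."""
    if seg.end <= seg.start:
        raise ValueError("formula (3) needs end > start to avoid division by zero")
    network_score = sum(frame_scores[seg.start:seg.end + 1])   # inclusive sum over k
    return (seg.score - network_score) / (seg.end - seg.start)


def word_confirmation(segs: List[PhoneSegment], frame_scores: Sequence[float]) -> float:
    """Formula (4): mean of the per-phoneme confirmation scores."""
    return sum(phone_confirmation(s, frame_scores) for s in segs) / len(segs)


def is_real_wake_word(cm_word: float, threshold: float) -> bool:
    """Accept the wake word when its confirmation score reaches the threshold."""
    return cm_word >= threshold
```

Here `frame_scores` holds the per-frame scores obtained on the confirmation network, which contains only garbage models, so a large positive confirmation score means the phoneme models explain the speech much better than the garbage models do.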
The above steps realize the voice wake-up function; the result is finally fed back to wake-up execution 3, which performs the wake-up operation.
Fig. 4 gives a schematic diagram of a specific implementation of the present invention in the automotive field. The vehicle-mounted voice wake-up system comprises: a microprocessor 11, preferably an ARM9 processor but not limited to this microprocessor, on which the voice wake-up module runs; an audio conversion device 12, preferably a WM8731 but not limited to this audio conversion device; a recording device 13, preferably a cost-effective electret microphone but not limited to this recording device; an audio processing device 14, preferably a TDA7419 but not limited to this audio processing device; a public address system 15, using a TDA7388 power amplifier and the car's own four loudspeaker units (front left, rear left, front right and rear right), but not limited to this power amplifier and these loudspeaker units; and a voice wake-up command word, preferably "automobile language point" but not limited to this wake word.
The working principle mainly comprises the steps of audio playback, voice data acquisition, voice data conversion, and voice wake-up with prompt-tone playback. Specifically:
First, when the user listens to music with this system while driving, the music may be audio provided by the playback module of the ARM9 microprocessor, or another source such as radio/TV/DVD/line-in connected to the TDA7419 audio processor. All music being played first undergoes sound-effect processing in the audio processor and is then played through the car loudspeakers driven by the TDA7388 power amplifier, completing audio playback;
Second, when the user speaks the specific voice wake word, "automobile language point", the speaking volume should be kept at a normal speaking level: if the voice is too quiet the electret microphone cannot pick up the speech signal, and if it is too loud the recording clips, and either causes the wake-up function to fail. The microphone signal containing the wake-word information undergoes analog-to-digital conversion in the WM8731 audio converter, completing speech signal acquisition;
Third, the voice acquisition module of the ARM9 microprocessor controls the WM8731 audio converter over the IIC bus to perform analog-to-digital conversion, converting the analog microphone signal into a digital signal, which is returned to the microprocessor over the IIS bus, completing voice data conversion;
Fourth, the microprocessor realizes the voice wake-up function by training the acoustic model, extracting the acoustic features of the user's speech from the microphone signal, and passing them through the wake-word detection network and the wake-word confirmation network. At the same time the audio processor plays the prompt-tone signal, completing the whole voice wake-up and prompt-tone playback operation.
The above are preferred embodiments of the present invention. When no music is playing or the user is not driving, the user can likewise turn on the speech recognition function with the specific voice wake word.
Parts of the present invention not described in detail belong to techniques well known in the art. The above embodiments do not limit the present invention in any form; any technical solution obtained by equivalent replacement or equivalent transformation falls within the protection scope of the present invention.

Claims (4)

1. An implementation method of a voice wake-up module, characterized by comprising the steps of voice input (1), voice wake-up algorithm (2) and wake-up execution (3), wherein the voice wake-up algorithm (2) obtains the voice signal from voice input (1), performs voice wake-up processing, and outputs the result to wake-up execution (3), thereby completing the wake-up operation;
The voice wake-up algorithm (2) is realized through acoustic feature extraction (4), wake-word detection (5), wake-word confirmation (6), construction of the wake-word detection network (7), training of the acoustic model (8) and construction of the wake-word confirmation network (9), the specific implementation process being as follows:
Step 1, acoustic feature extraction (4): the voice signal is obtained from voice input (1), discriminative features based on the characteristics of human hearing are extracted, and the Mel-frequency cepstral coefficient features used in speech recognition are chosen as the acoustic features;
Step 2, wake-word detection (5): using the trained acoustic model (8), acoustic scores are computed for the extracted acoustic features on the wake-word detection network (7); if the path with the optimal acoustic score contains the wake word to be detected, the wake word is deemed detected and the method proceeds to step 3; otherwise it returns to step 1 and performs acoustic feature extraction (4) again;
Step 3, wake-word confirmation (6): using the trained acoustic model (8), the extracted acoustic features are confirmed on the wake-word confirmation network (9), yielding the final confirmation score of the wake word; to judge whether the detected wake word is a real wake word, this final confirmation score is compared with a preset threshold; if the final confirmation score is greater than or equal to the threshold, the wake word is deemed real, voice wake-up succeeds, and the result is output to wake-up execution (3), completing the voice wake-up operation; if the final confirmation score is less than the threshold, the wake word is deemed false, and the method returns to step 1 and performs acoustic feature extraction (4) again;
The wake-word detection network (7) is implemented by computing the path with the optimal acoustic score, the formula for which is:
$$W = \arg\max_{W} P(W)\,P(X \mid W)$$
where $X$ represents the acoustic feature vectors extracted from the input speech and $W$ represents the optimal word sequence with the maximum score; the conditional probability $P(X \mid W)$ is the acoustic model score, computed by the trained acoustic model (8); the prior probability $P(W)$ is the language model score, which is the penalty (Penalty) added for the different acoustic models; $P(X)$ is the total probability;
The described word that wakes up confirms that network (9) implementation method is:
A. the word that wakes up detected is decoded to phoneme one-level, and records all score Score phone1, Score phone2..., Score phoneN, wherein N wakes phoneme number total in word up,
Score phone1, Score phone2..., Score phoneNwhat represent that this wakes all phonemes in word up respectively is decoding score, and wherein subscript represents the mark of N number of phoneme of phoneme;
B. use and wake word up and detect same feature, obtain corresponding acoustic score, and be accurate to frame one-level Score frame1, Score frame2..., Score frameM, wherein M is the total duration of this feature, in units of frame;
C. calculate the acoustic score waking each phoneme of word up, account form is as follows:
CM p h o n e i = ( Score p h o n e i - Σ k = K i s t a r t K i e n d Score f r a m e k ) / ( K i e n d - K i s t a r t )
Wherein K istartand K iendbe respectively initial time and the end time of i-th phoneme;
CM phoneirepresent that i-th phoneme is recognized point really, subscript phonei represents i-th phoneme, Score phoneirepresent the decoding score of i-th phone, Score framekrepresent to use and wake the score that word confirms the kth frame that network decoding obtains up;
D. calculate the final confirmation score that this wakes word up, account form is as follows:
CM w o r d = 1 N Σ i = 1 N CM p h o n e i .
2. The implementation method of a voice wake-up module according to claim 1, characterized in that: the training of the acoustic model (8) is divided into two parts, the phoneme acoustic model and the garbage model; the phoneme acoustic model is obtained with the acoustic model training method of conventional speech recognition, by choosing a database and training under the maximum likelihood estimation criterion and the discriminative minimum phone error training criterion; the garbage model absorbs unrelated speech other than the wake word; using the same database as for the phoneme models, the similarity between the phoneme models is computed and the phonemes are divided into 20 classes, all training data of the phonemes in each class are merged, and the corresponding garbage model is trained under the maximum likelihood estimation criterion, yielding 20 classes of garbage models.
3. The implementation method of a voice wake-up module according to claim 1, characterized in that: the method can be ported to and run on a general-purpose ARM or DSP processor, and is applied in vehicle-mounted and household-appliance fields.
4. An in-vehicle voice wake-up system, characterized by comprising: a microprocessor, the voice wake-up module of claim 1, an audio conversion device, a recording device, an audio processing device and a public address system; the voice wake-up module runs on the microprocessor, and the specific implementation process is as follows:
First step, the microprocessor is connected to the audio processing device and controls the audio processing device to output audio information; the audio processing device is connected to the public address system, which power-amplifies the audio information to be played so as to drive the loudspeaker, completing the audio playback operation;
Second step, the recording device is connected to the audio conversion device; when the user speaks the voice wake-up word, the recording device captures the speech and passes it to the audio conversion device for conversion, completing the voice capture operation;
Third step, the audio conversion device converts the voice information captured by the recording device and passes the converted data to the microprocessor for computation by the voice wake-up module, completing the voice data conversion operation;
Fourth step, the microprocessor is connected to the audio conversion device and runs the voice wake-up module on the voice information supplied by the audio conversion device; if the voice wake-up information is correctly recognized, the microprocessor controls the audio processing device to play a voice prompt tone, completing the voice wake-up and prompt tone playback operation; if recognition fails, the process returns to the voice capture operation of the second step.
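The four-step process of claim 4 can be read as a simple control loop. The following is a minimal sketch under stated assumptions, not the patented implementation: the class `WakeLoop`, its `detect` and `convert` callables and the log strings are all hypothetical stand-ins for the voice wake-up module, the audio conversion device and the hardware actions.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class WakeLoop:
    """Hypothetical model of the microprocessor control flow in claim 4."""
    detect: Callable[[bytes], bool]    # stand-in for the voice wake-up module
    convert: Callable[[bytes], bytes]  # stand-in for the audio conversion device
    log: List[str] = field(default_factory=list)

    def run(self, recordings: Iterable[bytes]) -> bool:
        # First step: the microprocessor starts audio playback through the
        # audio processing device and the public address system.
        self.log.append("play_audio")
        for raw in recordings:
            # Second step: the recording device captures an utterance.
            self.log.append("record")
            # Third step: the audio conversion device converts the data.
            data = self.convert(raw)
            self.log.append("convert")
            # Fourth step: run the wake-up module on the converted data.
            if self.detect(data):
                self.log.append("play_prompt")
                return True
            # Recognition failed: go back to the second step.
            self.log.append("retry")
        return False


# Toy usage: "conversion" is upper-casing, "detection" is an exact match.
loop = WakeLoop(detect=lambda d: d == b"WAKE", convert=lambda r: r.upper())
woke = loop.run([b"noise", b"wake"])
```

In this toy run the first utterance is rejected and capture is retried, which mirrors the return to the second step when recognition fails.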
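The Garbage model training in claim 2 (group similar phoneme models into classes, merge each class's training data, then fit one model per class by maximum likelihood) might be sketched as below. This is an illustration only: the claim does not say how similarity is measured or how phonemes are grouped, so the Euclidean distance between mean vectors, the greedy agglomerative merging, and the function names `cluster_phonemes` and `train_garbage_models` are all assumptions. The toy data uses 2 classes rather than the 20 in the claim, and each class model is reduced to the maximum likelihood mean of a Gaussian.

```python
from itertools import combinations
from statistics import fmean


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [fmean(col) for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance (assumed similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_phonemes(phoneme_means, num_classes):
    """Greedy agglomerative clustering: repeatedly merge the two closest
    clusters until only num_classes clusters remain."""
    clusters = [[name] for name in sorted(phoneme_means)]

    def gap(pair):
        i, j = pair
        ci = centroid([phoneme_means[p] for p in clusters[i]])
        cj = centroid([phoneme_means[p] for p in clusters[j]])
        return sq_dist(ci, cj)

    while len(clusters) > num_classes:
        i, j = min(combinations(range(len(clusters)), 2), key=gap)
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


def train_garbage_models(clusters, frames_by_phoneme):
    """Pool the frames of every phoneme in a class and take the sample mean,
    which is the maximum likelihood estimate of a Gaussian mean."""
    models = []
    for members in clusters:
        pooled = [f for p in members for f in frames_by_phoneme[p]]
        models.append({"phonemes": members, "mean": centroid(pooled)})
    return models


# Toy data (hypothetical): two vowel-like and two fricative-like phonemes.
means = {"a": [0.0, 0.0], "o": [0.1, 0.0], "s": [5.0, 5.0], "sh": [5.1, 5.0]}
frames = {"a": [[0.0, 0.0]], "o": [[0.2, 0.0]], "s": [[5.0, 5.0]], "sh": [[5.0, 5.0]]}
classes = cluster_phonemes(means, num_classes=2)
garbage = train_garbage_models(classes, frames)
```

A real system would fit full acoustic models (for example HMMs with Gaussian mixtures) per class; the single mean here only shows where the merged data enters the estimate.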
CN201210455175.2A 2012-11-13 2012-11-13 A kind of implementation method of voice wake-up module and application Active CN102999161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210455175.2A CN102999161B (en) 2012-11-13 2012-11-13 A kind of implementation method of voice wake-up module and application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210455175.2A CN102999161B (en) 2012-11-13 2012-11-13 A kind of implementation method of voice wake-up module and application

Publications (2)

Publication Number Publication Date
CN102999161A CN102999161A (en) 2013-03-27
CN102999161B true CN102999161B (en) 2016-03-02

Family

ID=47927817

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210455175.2A Active CN102999161B (en) 2012-11-13 2012-11-13 A kind of implementation method of voice wake-up module and application

Country Status (1)

Country Link
CN (1) CN102999161B (en)

Families Citing this family (93)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9240182B2 (en) * 2013-09-17 2016-01-19 Qualcomm Incorporated Method and apparatus for adjusting detection threshold for activating voice assistant function
CN103714815A (en) * 2013-12-09 2014-04-09 何永 Voice control method and device thereof
CN103943105A (en) * 2014-04-18 2014-07-23 安徽科大讯飞信息科技股份有限公司 Voice interaction method and system
CN104282307A (en) * 2014-09-05 2015-01-14 中兴通讯股份有限公司 Method, device and terminal for awakening voice control system
CN105575395A (en) * 2014-10-14 2016-05-11 中兴通讯股份有限公司 Voice wake-up method and apparatus, terminal, and processing method thereof
CN104464723B (en) * 2014-12-16 2018-03-20 科大讯飞股份有限公司 A kind of voice interactive method and system
CN104616653B (en) * 2015-01-23 2018-02-23 北京云知声信息技术有限公司 Wake up word matching process, device and voice awakening method, device
CN106161755A (en) * 2015-04-20 2016-11-23 钰太芯微电子科技(上海)有限公司 A kind of key word voice wakes up system and awakening method and mobile terminal up
CN105096939B (en) * 2015-07-08 2017-07-25 百度在线网络技术(北京)有限公司 voice awakening method and device
CN106469554B (en) * 2015-08-21 2019-11-15 科大讯飞股份有限公司 A kind of adaptive recognition methods and system
CN105141919A (en) * 2015-09-01 2015-12-09 武汉同迅智能科技有限公司 Monitoring terminal device remotely controlled by voice
CN106653010B (en) * 2015-11-03 2020-07-24 络达科技股份有限公司 Electronic device and method for waking up electronic device through voice recognition
US9792907B2 (en) * 2015-11-24 2017-10-17 Intel IP Corporation Low resource key phrase detection for wake on voice
CN105632486B (en) * 2015-12-23 2019-12-17 北京奇虎科技有限公司 Voice awakening method and device of intelligent hardware
CN105654949B (en) * 2016-01-07 2019-05-07 北京云知声信息技术有限公司 A kind of voice awakening method and device
CN105702253A (en) * 2016-01-07 2016-06-22 北京云知声信息技术有限公司 Voice awakening method and device
US9772817B2 (en) 2016-02-22 2017-09-26 Sonos, Inc. Room-corrected voice detection
US10095470B2 (en) 2016-02-22 2018-10-09 Sonos, Inc. Audio response playback
US9811314B2 (en) 2016-02-22 2017-11-07 Sonos, Inc. Metadata exchange involving a networked playback system and a networked microphone system
US10264030B2 (en) 2016-02-22 2019-04-16 Sonos, Inc. Networked microphone device control
CN105812573A (en) * 2016-04-28 2016-07-27 努比亚技术有限公司 Voice processing method and mobile terminal
US10134399B2 (en) 2016-07-15 2018-11-20 Sonos, Inc. Contextualization of voice inputs
US10115400B2 (en) * 2016-08-05 2018-10-30 Sonos, Inc. Multiple voice services
CN106297777B (en) * 2016-08-11 2019-11-22 广州视源电子科技股份有限公司 A kind of method and apparatus waking up voice service
CN107767863B (en) * 2016-08-22 2021-05-04 科大讯飞股份有限公司 Voice awakening method and system and intelligent terminal
CN107767861B (en) * 2016-08-22 2021-07-02 科大讯飞股份有限公司 Voice awakening method and system and intelligent terminal
CN106094673A (en) * 2016-08-30 2016-11-09 奇瑞商用车(安徽)有限公司 Automobile wakes up word system and control method thereof up
US10181323B2 (en) 2016-10-19 2019-01-15 Sonos, Inc. Arbitration-based voice recognition
CN106611597B (en) * 2016-12-02 2019-11-08 百度在线网络技术(北京)有限公司 Voice awakening method and device based on artificial intelligence
CN106847273B (en) * 2016-12-23 2020-05-05 北京云知声信息技术有限公司 Awakening word selection method and device for voice recognition
CN106653022B (en) * 2016-12-29 2020-06-23 百度在线网络技术(北京)有限公司 Voice awakening method and device based on artificial intelligence
CN108447472B (en) * 2017-02-16 2022-04-05 腾讯科技(深圳)有限公司 Voice wake-up method and device
CN107220532B (en) * 2017-04-08 2020-10-23 网易(杭州)网络有限公司 Method and apparatus for recognizing user identity through voice
CN107123417B (en) * 2017-05-16 2020-06-09 上海交通大学 Customized voice awakening optimization method and system based on discriminant training
US20190043295A1 (en) * 2017-08-07 2019-02-07 Microchip Technology Incorporated Voice-Activated Actuation of Automotive Features
US10475449B2 (en) 2017-08-07 2019-11-12 Sonos, Inc. Wake-word detection suppression
CN108122556B (en) * 2017-08-08 2021-09-24 大众问问(北京)信息科技有限公司 Method and device for reducing false triggering of voice wake-up instruction words of driver
CN107591151B (en) * 2017-08-22 2021-03-16 百度在线网络技术(北京)有限公司 Far-field voice awakening method and device and terminal equipment
US10048930B1 (en) 2017-09-08 2018-08-14 Sonos, Inc. Dynamic computation of system response volume
US10482868B2 (en) 2017-09-28 2019-11-19 Sonos, Inc. Multi-channel acoustic echo cancellation
US10466962B2 (en) 2017-09-29 2019-11-05 Sonos, Inc. Media playback system with voice assistance
CN109672775B (en) * 2017-10-16 2021-10-29 腾讯科技(北京)有限公司 Method, device and terminal for adjusting awakening sensitivity
CN110809796B (en) * 2017-10-24 2020-09-18 北京嘀嘀无限科技发展有限公司 Speech recognition system and method with decoupled wake phrases
CN107895573B (en) * 2017-11-15 2021-08-24 百度在线网络技术(北京)有限公司 Method and device for identifying information
CN111386566A (en) * 2017-12-15 2020-07-07 海尔优家智能科技(北京)有限公司 Device control method, cloud device, intelligent device, computer medium and device
CN108320733B (en) * 2017-12-18 2022-01-04 上海科大讯飞信息科技有限公司 Voice data processing method and device, storage medium and electronic equipment
CN108198548B (en) * 2018-01-25 2020-11-20 苏州奇梦者网络科技有限公司 Voice awakening method and system
CN108039175B (en) 2018-01-29 2021-03-26 北京百度网讯科技有限公司 Voice recognition method and device and server
CN110097870B (en) * 2018-01-30 2023-05-30 阿里巴巴集团控股有限公司 Voice processing method, device, equipment and storage medium
CN108536668B (en) * 2018-02-26 2022-06-07 科大讯飞股份有限公司 Wake-up word evaluation method and device, storage medium and electronic equipment
CN108597506A (en) * 2018-03-13 2018-09-28 广州势必可赢网络科技有限公司 Intelligent wearable device warning method and intelligent wearable device
CN110390933A (en) * 2018-04-20 2019-10-29 比亚迪股份有限公司 State methods of exhibiting, device and the displaying vehicle system of vehicle intelligent voice system
US11175880B2 (en) 2018-05-10 2021-11-16 Sonos, Inc. Systems and methods for voice-assisted media content selection
US10959029B2 (en) 2018-05-25 2021-03-23 Sonos, Inc. Determining and adapting to changes in microphone performance of playback devices
CN108962240B (en) * 2018-06-14 2021-09-21 百度在线网络技术(北京)有限公司 Voice control method and system based on earphone
US11076035B2 (en) 2018-08-28 2021-07-27 Sonos, Inc. Do not disturb feature for audio notifications
JP7001029B2 (en) * 2018-09-11 2022-01-19 日本電信電話株式会社 Keyword detector, keyword detection method, and program
US10587430B1 (en) 2018-09-14 2020-03-10 Sonos, Inc. Networked devices, systems, and methods for associating playback devices based on sound codes
CN109243426A (en) * 2018-09-19 2019-01-18 易诚博睿(南京)科技有限公司 A kind of automatization judgement voice false wake-up system and its judgment method
US11024331B2 (en) 2018-09-21 2021-06-01 Sonos, Inc. Voice detection optimization using sound metadata
US11100923B2 (en) * 2018-09-28 2021-08-24 Sonos, Inc. Systems and methods for selective wake word detection using neural network models
CN109102806A (en) * 2018-09-29 2018-12-28 百度在线网络技术(北京)有限公司 Method, apparatus, equipment and computer readable storage medium for interactive voice
CN111128134B (en) * 2018-10-11 2023-06-06 阿里巴巴集团控股有限公司 Acoustic model training method, voice awakening method and device and electronic equipment
CN111819533B (en) * 2018-10-11 2022-06-14 华为技术有限公司 Method for triggering electronic equipment to execute function and electronic equipment
US11899519B2 (en) 2018-10-23 2024-02-13 Sonos, Inc. Multiple stage network microphone device with reduced power consumption and processing load
CN109192210B (en) * 2018-10-25 2023-09-22 腾讯科技(深圳)有限公司 Voice recognition method, wake-up word detection method and device
CN109119078A (en) * 2018-10-26 2019-01-01 北京石头世纪科技有限公司 Automatic robot's control method, device, automatic robot and medium
US11183183B2 (en) 2018-12-07 2021-11-23 Sonos, Inc. Systems and methods of operating media playback systems having multiple voice assistant services
US11132989B2 (en) 2018-12-13 2021-09-28 Sonos, Inc. Networked microphone devices, systems, and methods of localized arbitration
CN109448720A (en) * 2018-12-18 2019-03-08 维拓智能科技(深圳)有限公司 Convenience service self-aided terminal and its voice awakening method
CN109878218A (en) * 2019-01-30 2019-06-14 厦门爱立得科技有限公司 A kind of printer and its Method of printing with intelligent sound control
CN109753665B (en) * 2019-01-30 2020-10-16 北京声智科技有限公司 Method and device for updating wake-up model
CN111862963B (en) * 2019-04-12 2024-05-10 阿里巴巴集团控股有限公司 Voice wakeup method, device and equipment
CN110033758B (en) * 2019-04-24 2021-09-24 武汉水象电子科技有限公司 Voice wake-up implementation method based on small training set optimization decoding network
US11120794B2 (en) 2019-05-03 2021-09-14 Sonos, Inc. Voice assistant persistence across multiple network microphone devices
CN110177317B (en) * 2019-05-17 2020-12-22 腾讯科技(深圳)有限公司 Echo cancellation method, echo cancellation device, computer-readable storage medium and computer equipment
US11200894B2 (en) 2019-06-12 2021-12-14 Sonos, Inc. Network microphone device with command keyword eventing
CN110473536B (en) * 2019-08-20 2021-10-15 北京声智科技有限公司 Awakening method and device and intelligent device
CN110600008A (en) * 2019-09-23 2019-12-20 苏州思必驰信息科技有限公司 Voice wake-up optimization method and system
CN110727821A (en) * 2019-10-12 2020-01-24 深圳海翼智新科技有限公司 Method, apparatus, system and computer storage medium for preventing device from being awoken by mistake
US11189286B2 (en) 2019-10-22 2021-11-30 Sonos, Inc. VAS toggle based on device orientation
CN110989963B (en) * 2019-11-22 2023-08-01 北京梧桐车联科技有限责任公司 Wake-up word recommendation method and device and storage medium
US11200900B2 (en) 2019-12-20 2021-12-14 Sonos, Inc. Offline voice control
US11562740B2 (en) 2020-01-07 2023-01-24 Sonos, Inc. Voice verification for media playback
US11308958B2 (en) 2020-02-07 2022-04-19 Sonos, Inc. Localized wakeword verification
US11482224B2 (en) 2020-05-20 2022-10-25 Sonos, Inc. Command keywords with input detection windowing
CN111739513B (en) * 2020-07-22 2020-12-11 江苏清微智能科技有限公司 Automatic voice awakening test system and test method thereof
US11984123B2 (en) 2020-11-12 2024-05-14 Sonos, Inc. Network device interaction by range
CN112420051A (en) * 2020-11-18 2021-02-26 青岛海尔科技有限公司 Equipment determination method, device and storage medium
CN113038048B (en) * 2021-03-02 2022-10-28 海信视像科技股份有限公司 Far-field voice awakening method and display device
CN113535913B (en) * 2021-06-02 2023-12-01 科大讯飞股份有限公司 Answer scoring method and device, electronic equipment and storage medium
CN115731926A (en) * 2021-08-30 2023-03-03 佛山市顺德区美的电子科技有限公司 Control method and device of intelligent equipment, intelligent equipment and readable storage medium
CN115223573A (en) * 2022-07-15 2022-10-21 北京百度网讯科技有限公司 Voice wake-up method and device, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1256460A (en) * 1999-11-19 2000-06-14 清华大学 Phonetic command controller
CN101516005A (en) * 2008-02-23 2009-08-26 华为技术有限公司 Speech recognition channel selecting system, method and channel switching device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020143540A1 (en) * 2001-03-28 2002-10-03 Narendranath Malayath Voice recognition system using implicit speaker adaptation
KR101056511B1 (en) * 2008-05-28 2011-08-11 (주)파워보이스 Speech Segment Detection and Continuous Speech Recognition System in Noisy Environment Using Real-Time Call Command Recognition

Also Published As

Publication number Publication date
CN102999161A (en) 2013-03-27

Similar Documents

Publication Publication Date Title
CN102999161B (en) A kind of implementation method of voice wake-up module and application
CN103021409B (en) A kind of vice activation camera system
CN111161714B (en) Voice information processing method, electronic equipment and storage medium
CN102568478B (en) Video play control method and system based on voice recognition
EP3923273B1 (en) Voice recognition method and device, storage medium, and air conditioner
CN102111314B (en) Smart home voice control system and method based on Bluetooth transmission
CN106463112A (en) Voice recognition method, voice wake-up device, voice recognition device and terminal
CN110047481B (en) Method and apparatus for speech recognition
CN104575504A (en) Method for personalized television voice wake-up by voiceprint and voice identification
CN110648553B (en) Site reminding method, electronic equipment and computer readable storage medium
CN109285543A (en) A kind of vehicle-mounted multimedia navigating instrument voice automatization test system
CN206595039U (en) A kind of interactive system for vehicle-mounted voice
CN205354646U (en) Intelligence speech recognition system for mobile unit
CN107600075A (en) The control method and device of onboard system
CN103198829A (en) Method, device and equipment of reducing interior noise and improving voice recognition rate
CN107403619A (en) A kind of sound control method and system applied to bicycle environment
CN111145763A (en) GRU-based voice recognition method and system in audio
CN111833870A (en) Awakening method and device of vehicle-mounted voice system, vehicle and medium
CN110970020A (en) Method for extracting effective voice signal by using voiceprint
CN112185425A (en) Audio signal processing method, device, equipment and storage medium
CN110808050B (en) Speech recognition method and intelligent device
CN111613223B (en) Voice recognition method, system, mobile terminal and storage medium
CN204926573U (en) Intelligent robot of auxiliary exercise mandarin
CN110737422B (en) Sound signal acquisition method and device
CN106094673A (en) Automobile wakes up word system and control method thereof up

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Wangjiang Road high tech Development Zone Hefei city Anhui province 230088 No. 666

Applicant after: Iflytek Co., Ltd.

Address before: Wangjiang Road high tech Development Zone Hefei city Anhui province 230088 No. 666

Applicant before: Anhui USTC iFLYTEK Co., Ltd.

COR Change of bibliographic data
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20190212

Address after: 511458 X1301-G5145 (Cluster Registration) (JM) No. 106 Fengze East Road, Nansha District, Guangzhou, Guangdong Province

Patentee after: iFLYTEK South China Institute of Artificial Intelligence (Guangzhou) Co., Ltd.

Address before: 230088 666 Wangjiang West Road, Hefei hi tech Development Zone, Anhui

Patentee before: Iflytek Co., Ltd.