CN102999161B - Implementation method and application of a voice wake-up module - Google Patents


Info

Publication number
CN102999161B
CN102999161B CN201210455175.2A CN201210455175A CN102999161B CN 102999161 B CN102999161 B CN 102999161B CN 201210455175 A CN201210455175 A CN 201210455175A CN 102999161 B CN102999161 B CN 102999161B
Authority
CN
China
Prior art keywords
up
wake
word
voice
score
Prior art date
Application number
CN201210455175.2A
Other languages
Chinese (zh)
Other versions
CN102999161A (en)
Inventor
操文祥
王海坤
康怀茂
钱勇
谢信珍
黄海兵
Original Assignee
科大讯飞股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 科大讯飞股份有限公司
Priority to CN201210455175.2A
Publication of CN102999161A
Application granted
Publication of CN102999161B


Abstract

An implementation method and application of a voice wake-up module, comprising: voice input (1), a voice wake-up algorithm (2) and wake-up execution (3). The voice wake-up algorithm (2) is realized mainly through acoustic feature extraction (4), wake-word detection (5), wake-word confirmation (6), construction of a wake-word detection network (7), training of an acoustic model (8) and construction of a wake-word confirmation network (9). Even in a noisy environment, and whether or not music is playing, the voice wake-up function can be triggered by a spoken wake word, with good recognition and wake-up performance. The method can be ported to run on general-purpose ARM or DSP processors and is applied in the vehicle-mounted and household-appliance fields.

Description

Implementation method and application of a voice wake-up module

Technical field

The invention discloses an implementation method and application of a voice wake-up module. Specifically, the user speaks a predetermined voice wake word, which triggers the system to carry out the user's next operation. The invention can be applied in fields that require voice wake-up, such as vehicle-mounted systems and household appliances.

Background art

The present invention relates to a published invention patent application, publication number CN102645977A, filed 2012.03.26, inventors Yin Jianhong, Wang Zhong and Zhou Yanhuang, titled "A vehicle-mounted voice wake-up human-machine interaction system and method", which is incorporated herein by reference. The vehicle-mounted voice wake-up in that invention works as follows: a speech library, a vehicle noise library, a speech engine and other information are stored in a preset flash memory; a voice instruction input through the microphone is compared by the main controller (MCU) against the voice-instruction information stored in the memory to perform speech recognition; and the voice-instruction information determined by the match is used as an execution instruction to control the vehicle control function unit modules and realize the corresponding function. The data stored in flash in that invention are all fixed. In a vehicle, however, driving speed, road conditions, weather, and whether the air conditioner is on or the windows are open all change the engine noise, tire noise and other contents of the vehicle noise library; the music played in the car varies; and differences between speakers change the reference speech library. That invention is therefore suitable for voice wake-up only in fixed scenarios. The present invention, by contrast, collects recordings from different speakers in many kinds of scenarios to train an acoustic model, and constructs a wake-word detection network and a wake-word confirmation network, so that it adapts to a wider range of scenarios while achieving good wake-up performance.

Summary of the invention

The object of the invention is to overcome the deficiencies of the prior art by providing an implementation method for a voice wake-up system in which, even in a noisy environment and whether or not music is playing, the voice wake-up function can be triggered by a spoken wake word with good wake-up performance. The invention also provides applications of the voice wake-up system, including applications in the vehicle-mounted and household-appliance fields.

The invention is achieved by the following technical solution. An implementation method of a voice wake-up module comprises the steps of voice input 1, voice wake-up algorithm 2 and wake-up execution 3. The voice wake-up algorithm 2 obtains the voice signal from voice input 1, performs voice wake-up processing, and outputs the result to wake-up execution 3, thereby completing the wake-up operation.

The voice wake-up algorithm 2 is realized through acoustic feature extraction 4, wake-word detection 5, wake-word confirmation 6, construction of the wake-word detection network 7, training of the acoustic model 8 and construction of the wake-word confirmation network 9. The specific process is as follows:

First step, acoustic feature extraction 4: obtain the voice signal from voice input 1 and extract features that are discriminative and based on the characteristics of human hearing. The MFCC (Mel-Frequency Cepstrum Coefficient) features used in speech recognition are usually chosen as the acoustic features;

Second step, wake-word detection 5: using the trained acoustic model 8, compute acoustic scores for the extracted acoustic features on the wake-word detection network 7. If the path with the optimal score contains the wake word to be detected, the wake word is considered detected and the process moves to the third step; otherwise it returns to the first step and repeats acoustic feature extraction 4;

Third step, wake-word confirmation 6: using the trained acoustic model 8, confirm the extracted acoustic features on the wake-word confirmation network 9 to obtain the final confirmation score of the wake word. To judge whether the detected wake word is a true wake word, the final confirmation score is compared with a preset threshold. If the final confirmation score is greater than or equal to the threshold, the wake word is considered true, voice wake-up succeeds, and the result is output to wake-up execution 3, completing the voice wake-up operation. If the final confirmation score is less than the threshold, the wake word is considered false and the process returns to the first step to repeat acoustic feature extraction 4.
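The three steps above can be sketched as a simple control loop. This is a minimal illustration, not the patented implementation: the function name `wake_loop` and the callables `detect` and `confirm_score` are hypothetical stand-ins for wake-word detection 5 and wake-word confirmation 6, and each element of `segments` stands for the acoustic features extracted from one stretch of input speech.

```python
from typing import Callable, Iterable, Optional, Sequence

# One segment = a sequence of feature vectors (e.g. one MFCC vector per frame).
Features = Sequence[Sequence[float]]


def wake_loop(
    segments: Iterable[Features],
    detect: Callable[[Features], bool],
    confirm_score: Callable[[Features], float],
    threshold: float,
) -> Optional[int]:
    """Return the index of the first segment that wakes the system, or None.

    detect        -- hypothetical stand-in for wake-word detection (step 2)
    confirm_score -- hypothetical stand-in for wake-word confirmation (step 3)
    """
    for index, features in enumerate(segments):
        # Step 2: no wake word on the best path, go back to feature extraction.
        if not detect(features):
            continue
        # Step 3: accept only if the final confirmation score reaches the threshold.
        if confirm_score(features) >= threshold:
            return index
        # Otherwise the detection was a false wake word; keep listening.
    return None
```

Keeping detection and confirmation as separate callables mirrors the text: the same features are scored twice, once on each network.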

Training of the acoustic model 8 is divided into two parts: the phoneme acoustic models and the garbage models (Garbage models). The phoneme acoustic models are trained with the acoustic model training methods of conventional speech recognition: a database is chosen, and the models are obtained under the MLE (Maximum Likelihood Estimation) criterion and the MPE (Minimum Phone Error) discriminative training criterion. The Garbage models absorb speech other than the wake word. They are trained on the same database as the phoneme models: the similarity between each pair of phoneme models is computed, the phonemes are divided into 20 classes, all training data belonging to the phonemes of each class are merged, and the corresponding Garbage model is trained under the MLE criterion, yielding 20 classes of Garbage models.
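The text does not say which clustering procedure is used to divide the phonemes into 20 classes. As one possible illustration only, the sketch below applies average-linkage agglomerative clustering to a precomputed phoneme-to-phoneme distance matrix; the function name `cluster_phonemes` and the use of a distance (rather than similarity) matrix are assumptions made for this example.

```python
from typing import List


def cluster_phonemes(distance: List[List[float]], n_classes: int) -> List[List[int]]:
    """Greedily merge phonemes into n_classes groups (average linkage).

    distance[i][j] is an assumed dissimilarity between phoneme models i and j;
    smaller values mean more similar models.
    """
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")

    # Start with every phoneme in its own cluster.
    clusters = [[i] for i in range(len(distance))]

    def linkage(a: List[int], b: List[int]) -> float:
        # Average pairwise distance between the members of two clusters.
        return sum(distance[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_classes:
        best = None  # (distance, index_x, index_y) of the closest pair so far
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        # Merge the closest pair; y > x, so deleting y leaves index x valid.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters
```

With `n_classes=20`, each returned group would correspond to one Garbage model trained on the merged data of its member phonemes.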

The wake-word detection network 7 is implemented by computing the path with the optimal score. The formula for the optimal-score path is:

$$ W = \arg\max_{W} P(W)\,P(X \mid W) \tag{2} $$

where X is the acoustic feature vector extracted from the input speech and W is the optimal word sequence with the highest score. The conditional probability P(X|W) is the acoustic model score, computed by the trained acoustic model 8. The prior probability P(W) is the language model score, which here is the Penalty added for the different acoustic models. P(X) is the total probability; once the acoustic model and the wake-word detection network are fixed it is a constant, so it does not appear in formula (2).

The wake-word confirmation network (9) is implemented as follows:

A. Decode the detected wake word down to the phoneme level and record all scores ($Score_{phone_1}, Score_{phone_2}, \ldots, Score_{phone_N}$), where N is the total number of phonemes in the wake word;

$Score_{phone_1}, Score_{phone_2}, \ldots, Score_{phone_N}$ are the decoding scores of the individual phonemes in the wake word, where the subscript identifies each of the N phonemes.

B. Using the same features as wake-word detection, obtain the corresponding acoustic scores at the frame level ($Score_{frame_1}, Score_{frame_2}, \ldots, Score_{frame_M}$), where M is the total duration of the features in frames;

C. Compute the confirmation score of each phoneme of the wake word as follows:

$$ CM_{phone_i} = \frac{Score_{phone_i} - \sum_{k=K_i^{start}}^{K_i^{end}} Score_{frame_k}}{K_i^{end} - K_i^{start}} \tag{3} $$

where $K_i^{start}$ and $K_i^{end}$ are the start time and end time of the i-th phoneme;

$CM_{phone_i}$ is the confirmation score of the i-th phoneme, the subscript $phone_i$ denotes the i-th phoneme, $Score_{phone_i}$ is the decoding score of the i-th phoneme as defined above, and $Score_{frame_k}$ is the score of the k-th frame obtained by decoding on the wake-word confirmation network.

D. Compute the final confirmation score of the wake word as follows:

$$ CM_{word} = \frac{1}{N} \sum_{i=1}^{N} CM_{phone_i} \tag{4} $$

The method of the invention can be ported to general-purpose ARM or DSP processors and applied in the vehicle-mounted and household-appliance fields.

A vehicle-mounted voice wake-up system comprises: a microprocessor, a voice wake-up module, an audio conversion device, a recording device, an audio processing device and a public address system. The voice wake-up module runs on the microprocessor. The specific process is as follows:

First step: the microprocessor is connected to the audio processing device and controls it to output audio. The audio processing device is connected to the public address system, which amplifies the audio to be played and drives the loudspeakers, completing the audio playback operation;

Second step: the recording device is connected to the audio conversion device. When the user speaks the voice wake word, the recording device captures the speech and passes it to the audio conversion device for conversion, completing the voice collection operation;

Third step: the audio conversion device converts the speech captured by the recording device and passes the converted data to the microprocessor, which runs the voice wake-up module described in claim 1, completing the voice data conversion operation;

Fourth step: the microprocessor is connected to the audio conversion device and runs the voice wake-up module on the speech input from the audio conversion device. If the voice wake-up information is correctly recognized, the microprocessor controls the audio processing device to play a voice prompt tone, completing the vehicle-mounted voice wake-up and prompt tone playback operation. If recognition fails, the process continues with the voice collection operation of the second step.

Compared with the prior art, the invention has the following advantages:

(1) The invention uses the user's spoken wake word as the trigger source and adds wake-word detection and wake-word confirmation. Even in a noisy environment, and whether or not music is playing, the voice wake-up function can be triggered by the spoken wake word with good wake-up performance. The user does not need to use their hands; a voice command alone quickly wakes the system for the next interaction.

(2) The invention is low-cost to implement and its code is easy to port, giving it good application value.

(3) The invention can be widely used in the vehicle-mounted and household-appliance fields, and in any other field where voice wake-up is needed while audio is playing. In a vehicle without this system, a user who wants to start the recognition function while driving must manually press a button and pause the music that is playing, which creates a safety hazard while driving and makes for a poor user experience.

(4) With this system, the user starts the voice wake-up function by speaking the agreed wake word, without pausing audio playback first. In actual tests the correct wake-up rate reached more than 90%. In other fields such as household appliances, a user who is watching TV and wants to start the speech recognition function can likewise do so with the wake word, making voice interaction more convenient and more user-friendly.

(5) The voice wake-up function of the invention is implemented entirely by software algorithms and can easily be ported to general-purpose processors such as ARM or DSP.

Brief description of the drawings

Fig. 1 is a schematic block diagram of the implementation of the invention;

Fig. 2 is a schematic block diagram of the construction of the wake-word detection network of the invention;

Fig. 3 is a schematic block diagram of the construction of the wake-word confirmation network of the invention;

Fig. 4 is a schematic diagram of a specific implementation of the invention in the automotive field.

Detailed description of the embodiments

As shown in Fig. 1, the voice wake-up module of the invention is realized by the steps of voice input 1, voice wake-up algorithm 2 and wake-up execution 3.

The voice wake-up algorithm 2 is realized mainly through acoustic feature extraction 4, wake-word detection 5, wake-word confirmation 6, construction of the wake-word detection network 7, training of the acoustic model 8 and construction of the wake-word confirmation network 9. The specific process is:

(1) Training the acoustic model 8: training is divided into two parts, the phoneme acoustic models and the garbage models (Garbage models). The phoneme acoustic models are trained with the acoustic model training methods of conventional speech recognition: a suitable database is chosen, and the models are obtained under the MLE (Maximum Likelihood Estimation) criterion and the MPE (Minimum Phone Error) discriminative training criterion. The Garbage models absorb speech other than the wake word. They are trained on the same database as the phoneme models: the similarity between each pair of phoneme models is computed, the phonemes are divided into 20 classes, all training data belonging to the phonemes of each class are merged, and the corresponding Garbage model is trained under the MLE criterion, yielding 20 classes of Garbage models. Because the Garbage models are trained on the merged data of clustered phonemes, they serve two purposes: in the wake-word detection network they absorb speech other than the wake word, and in the wake-word confirmation network they are used to compute the confirmation network score.

(2) Acoustic feature extraction 4: obtain the voice signal from voice input 1 and extract features that are reasonably discriminative and based on the characteristics of human hearing. The MFCC (Mel-Frequency Cepstrum Coefficient) features used in speech recognition are generally chosen.
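MFCC features model human hearing by spacing their filterbank evenly on the mel scale. The sketch below shows only that one ingredient: the Hz-to-mel mapping and the resulting filter edge frequencies. The constants 2595 and 700 are the commonly used convention and are not taken from the source; the function names and the example values are likewise assumptions for illustration. Framing, windowing, the FFT and the final cosine transform of a full MFCC pipeline are omitted.

```python
import math
from typing import List


def hz_to_mel(hz: float) -> float:
    """Map a frequency in Hz to the mel scale (common 2595/700 convention)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_edges(n_filters: int, low_hz: float, high_hz: float) -> List[float]:
    """Return n_filters + 2 edge frequencies, equally spaced in mel.

    Triangular filter i would span edges[i] .. edges[i + 2] and peak at
    edges[i + 1]; narrow filters at low frequencies, wide ones at high.
    """
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_filters + 1)
    return [mel_to_hz(low_mel + i * step) for i in range(n_filters + 2)]
```

For example, `mel_filter_edges(26, 0.0, 8000.0)` gives 28 edge frequencies whose spacing grows with frequency, which is what makes the resulting features follow the resolution of human hearing.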

(3) Wake-word detection 5: using the acoustic model 8, compute acoustic scores for the extracted acoustic features on the wake-word detection network 7. If the path with the optimal score contains the wake word to be detected, the wake word is detected and the process moves to the next step; otherwise acoustic features are extracted again. To ensure that the wake word is detected normally while invalid speech is effectively absorbed, the wake-word detection network consists mainly of the wake word chosen by the user and the Garbage models, as shown in Fig. 2. In speech recognition this kind of network is also called a recognition network. Because the structure of the wake-word detection network is very simple, it can be built by a simple program or by hand. In real environments, the wake-up speech received is often contaminated by noise. The score of its acoustic features on the phoneme acoustic models then drops considerably, whereas the Garbage models, being trained on the merged data of many phonemes, are not very precise to begin with, so the score of the features on the Garbage models drops only slightly. The wake-up speech is then absorbed by the Garbage models by mistake, and the system's wake-up rate falls.

To prevent this, a penalty (Penalty) is applied to the decoding score of the arcs containing Garbage models when decoding on the wake-word detection network, so that they cannot compete on equal terms with the phoneme acoustic models. This ensures that wake-up speech contaminated by noise can still be detected normally. The size of the penalty must be tuned experimentally for each wake word.
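The effect of the penalty described above can be illustrated with a two-path comparison. This is a simplified sketch, assuming log-domain path scores and a single additive penalty; the name `garbage_penalty` and the example numbers are hypothetical, and a real decoder would apply the penalty per arc during the search rather than to whole paths afterwards.

```python
def best_path(keyword_logp: float, garbage_logp: float, garbage_penalty: float) -> str:
    """Choose between the wake-word path and the Garbage path.

    keyword_logp    -- log score of the path through the wake-word phonemes
    garbage_logp    -- log score of the competing path through Garbage models
    garbage_penalty -- non-negative handicap subtracted from the Garbage path
                       (assumed additive in the log domain)
    """
    penalised_garbage = garbage_logp - garbage_penalty
    return "keyword" if keyword_logp >= penalised_garbage else "garbage"


# Noisy wake-up speech: the phoneme models score slightly worse than Garbage.
noisy_keyword, noisy_garbage = -120.0, -115.0
print(best_path(noisy_keyword, noisy_garbage, garbage_penalty=0.0))   # garbage
print(best_path(noisy_keyword, noisy_garbage, garbage_penalty=10.0))  # keyword
```

A larger penalty recovers more noisy wake-ups but lets more non-wake speech through to the confirmation stage, which is why the text calls for tuning it per wake word.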

The wake-word detection network 7 is implemented by computing the path with the optimal score.

The optimal-score path is obtained with the classical Bayes formula:

$$ W = \arg\max_{W} P(W \mid X) = \arg\max_{W} \frac{P(W)\,P(X \mid W)}{P(X)} \tag{1} $$

In the formula above, X is the acoustic feature vector extracted from the input speech and W is the optimal word sequence with the highest score. The conditional probability P(X|W) is the acoustic model score, computed by the trained phoneme acoustic models and Garbage models. The prior probability P(W) is the language model score, which here can be understood as the Penalty added for the different acoustic models. P(X) is the total probability; once the acoustic model and the wake-word detection network are fixed it is a constant, so formula (1) can be written as:

$$ W = \arg\max_{W} P(W)\,P(X \mid W) \tag{2} $$
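Decoders usually evaluate formula (2) in the log domain, where the product becomes a sum and a penalty becomes an additive term. The restatement below is a sketch under that assumption. The symbols $\lambda_g$ (penalty per Garbage arc) and $n_g(W)$ (number of Garbage arcs on path W) are introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
W &= \arg\max_{W} \frac{P(W)\,P(X \mid W)}{P(X)}
   = \arg\max_{W} P(W)\,P(X \mid W)
   && \text{since } P(X) \text{ does not depend on } W \\
  &= \arg\max_{W} \bigl[\, \log P(W) + \log P(X \mid W) \,\bigr]
   && \text{since } \log \text{ is monotonic} \\
\log P(W) &= -\lambda_g \, n_g(W), \qquad \lambda_g \ge 0
   && \text{assumed form of the Penalty}
\end{aligned}
```

Under this reading, a path made only of wake-word phonemes carries no penalty, while every Garbage arc lowers the path score by $\lambda_g$.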

(4) Wake-word confirmation 6: because the acoustic model itself is imprecise and real environments are complex, a wake word obtained by wake-word detection is not necessarily a true wake word. To reduce false wake-ups caused by non-wake-word speech and the problems they cause in subsequent operation, the detected wake word must be confirmed further. The invention builds the wake-word confirmation network 9 as shown in Fig. 3. Like the wake-word detection network, the wake-word confirmation network is a recognition network in the speech recognition sense. It contains only Garbage models and can be built by a simple program or by hand.

The main steps of wake-word confirmation are as follows:

a) Decode the wake word obtained by wake-word detection down to the phoneme level and record all of its scores ($Score_{phone_1}, Score_{phone_2}, \ldots, Score_{phone_N}$), where N is the total number of phonemes in the wake word.

b) Using the same features as wake-word detection, obtain the corresponding acoustic scores on the wake-word confirmation network at the frame level ($Score_{frame_1}, Score_{frame_2}, \ldots, Score_{frame_M}$), where M is the total duration of the features in frames.

c) Compute the confirmation score of each phoneme of the wake word as follows:

$$ CM_{phone_i} = \frac{Score_{phone_i} - \sum_{k=K_i^{start}}^{K_i^{end}} Score_{frame_k}}{K_i^{end} - K_i^{start}} \tag{3} $$

where $K_i^{start}$ and $K_i^{end}$ are the start time and end time of the i-th phoneme.

d) Compute the final confirmation score of the wake word as follows:

$$ CM_{word} = \frac{1}{N} \sum_{i=1}^{N} CM_{phone_i} \tag{4} $$

e) Judge whether the wake word is a true wake word by comparing its final confirmation score with the preset threshold. If the confirmation score $CM_{word}$ is greater than or equal to the threshold T, the wake word is considered true and wake-up succeeds. If $CM_{word}$ is less than T, the wake word is considered false and acoustic feature extraction is repeated.
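Steps c) to e) can be sketched directly from formulas (3) and (4). This is an illustrative reading, not the authors' code: the function names are hypothetical, frame indices are assumed to be zero-based, and, following formula (3) literally, the frame sum includes both the start and the end frame while the divisor is the difference of the two indices.

```python
from typing import List, Sequence, Tuple


def phoneme_confidence(
    phone_score: float, frame_scores: Sequence[float], start: int, end: int
) -> float:
    """Formula (3): phoneme score minus the summed confirmation-network frame
    scores over the phoneme's span, normalised by its duration."""
    if end <= start:
        raise ValueError("a phoneme must span more than one frame index")
    garbage_sum = sum(frame_scores[start:end + 1])  # frames start..end inclusive
    return (phone_score - garbage_sum) / (end - start)


def word_confidence(
    phone_scores: Sequence[float],
    frame_scores: Sequence[float],
    boundaries: Sequence[Tuple[int, int]],
) -> float:
    """Formula (4): mean of the per-phoneme confirmation scores."""
    cms: List[float] = [
        phoneme_confidence(score, frame_scores, start, end)
        for score, (start, end) in zip(phone_scores, boundaries)
    ]
    return sum(cms) / len(cms)


def is_true_wake_word(cm_word: float, threshold: float) -> bool:
    """Step e): accept when the final score reaches the preset threshold."""
    return cm_word >= threshold
```

Intuitively, the score is high when the wake-word phonemes explain the audio much better than the Garbage-only confirmation network does over the same frames.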

The work described above realizes the voice wake-up function. The final result is fed to wake-up execution 3, which performs the wake-up operation.

Fig. 4 shows a specific implementation of the invention in the automotive field. The vehicle-mounted voice wake-up system comprises: a microprocessor 11, preferably an ARM9 processor but not limited to it, on which the voice wake-up module runs; an audio conversion device 12, preferably a WM8731 but not limited to it; a recording device 13, preferably a cost-effective electret microphone but not limited to it; an audio processing device 14, preferably a TDA7419 but not limited to it; a public address system 15, which uses a TDA7388 power amplifier and the car's own four-speaker unit (left front, left rear, right front and right rear loudspeakers) but is not limited to this amplifier and loudspeaker unit; and a voice wake-up command word, preferably "automobile language point" but not limited to this wake word.

The working principle comprises audio playback, voice data collection, voice data conversion, and voice wake-up with prompt tone playback. Specifically:

First, when the user listens to music through the system while driving, the music may be audio supplied by the playback module of the ARM9 microprocessor or another source connected to the TDA7419 audio processor, such as radio, TV, DVD or line-in. All music is first processed for sound effects by the audio processor and then driven to the car loudspeakers by the TDA7388 power amplifier, completing audio playback;

Second, when the user speaks the specific voice wake word "automobile language point", the user should speak at a normal volume. If the voice is too quiet, the electret microphone does not pick up the speech signal; if it is too loud, the recording clips. Either case causes wake-up to fail. The microphone signal containing the wake word undergoes analog-to-digital conversion in the WM8731 audio converter, completing voice signal collection;

Third, the voice collection module of the ARM9 microprocessor controls the WM8731 audio converter over the I2C bus to perform analog-to-digital conversion, turning the analog microphone signal into a digital signal that is returned to the microprocessor over the I2S bus, completing voice data conversion;

Fourth, using the trained acoustic model, the microprocessor extracts acoustic features from the user's microphone input and passes them through the wake-word detection network and the wake-word confirmation network to realize the voice wake-up function. It then plays the prompt tone through the audio processor, completing the whole voice wake-up and prompt tone playback operation.
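The second step above notes that speech that is too quiet or clipped causes wake-up to fail. The source does not describe any level check, so the following is only a hypothetical pre-check on a block of 16-bit PCM samples; the function name and all thresholds (`quiet_peak`, `clip_value`, `max_clipped_ratio`) are illustrative assumptions that would need tuning for a real microphone and converter.

```python
from typing import Sequence


def check_recording_level(
    samples: Sequence[int],
    quiet_peak: int = 500,            # assumed: peaks below this count as too quiet
    clip_value: int = 32767,          # full scale for signed 16-bit PCM
    max_clipped_ratio: float = 0.01,  # assumed: tolerate up to 1% clipped samples
) -> str:
    """Classify a block of 16-bit PCM samples as 'too_quiet', 'clipped' or 'ok'."""
    if not samples:
        return "too_quiet"
    peak = max(abs(s) for s in samples)
    if peak < quiet_peak:
        return "too_quiet"
    # Count samples sitting at (or beyond) full scale, a sign of clipping.
    clipped = sum(1 for s in samples if abs(s) >= clip_value)
    if clipped / len(samples) > max_clipped_ratio:
        return "clipped"
    return "ok"
```

Such a check could run before feature extraction so that the system can ask the user to repeat the wake word instead of silently failing.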

The above is a preferred embodiment of the invention. When no music is playing or the user is not driving, the user can likewise start the speech recognition function with the specific voice wake word.

Parts of the invention not described in detail belong to techniques well known in the art. The above embodiment does not limit the invention in any way; all technical solutions obtained by equivalent replacement or equivalent transformation fall within the scope of protection of the invention.

Claims (4)

1. An implementation method of a voice wake-up module, characterized by comprising the steps of voice input (1), voice wake-up algorithm (2) and wake-up execution (3), wherein the voice wake-up algorithm (2) obtains the voice signal from voice input (1), performs voice wake-up processing, and outputs the result to wake-up execution (3), thereby completing the wake-up operation;
the voice wake-up algorithm (2) is realized through acoustic feature extraction (4), wake-word detection (5), wake-word confirmation (6), construction of a wake-word detection network (7), training of an acoustic model (8) and construction of a wake-word confirmation network (9), the specific process being as follows:
First step, acoustic feature extraction (4): obtain the voice signal from voice input (1), extract features that are discriminative and based on the characteristics of human hearing, and choose the Mel-frequency cepstrum coefficient features used in speech recognition as the acoustic features;
Second step, wake-word detection (5): using the trained acoustic model (8), compute acoustic scores for the extracted acoustic features on the wake-word detection network (7); if the path with the optimal acoustic score contains the wake word to be detected, the wake word is determined to be detected and the process moves to the third step; otherwise it returns to the first step and repeats acoustic feature extraction (4);
Third step, wake-word confirmation (6): using the trained acoustic model (8), confirm the extracted acoustic features on the wake-word confirmation network (9) to obtain the final confirmation score of the wake word; to judge whether the detected wake word is a true wake word, compare the final confirmation score with a preset threshold; if the final confirmation score is greater than or equal to the threshold, the wake word is considered true, voice wake-up succeeds, and the result is output to wake-up execution (3), completing the voice wake-up operation; if the final confirmation score is less than the threshold, the wake word is considered false and the process returns to the first step to repeat acoustic feature extraction (4);
the wake-word detection network (7) is implemented by computing the path with the optimal acoustic score, the formula for which is:
$$ W = \arg\max_{W} P(W)\,P(X \mid W) $$
where X is the acoustic feature vector extracted from the input speech and W is the optimal word sequence with the highest score; the conditional probability P(X|W) is the acoustic model score, computed by the trained acoustic model (8); the prior probability P(W) is the language model score, which is the Penalty added for the different acoustic models; P(X) is the total probability;
the wake-word confirmation network (9) is implemented as follows:
A. decode the detected wake word down to the phoneme level and record all scores $Score_{phone_1}, Score_{phone_2}, \ldots, Score_{phone_N}$, where N is the total number of phonemes in the wake word,
$Score_{phone_1}, Score_{phone_2}, \ldots, Score_{phone_N}$ being the decoding scores of the individual phonemes in the wake word, where the subscript identifies each of the N phonemes;
B. using the same features as wake-word detection, obtain the corresponding acoustic scores at the frame level, $Score_{frame_1}, Score_{frame_2}, \ldots, Score_{frame_M}$, where M is the total duration of the features in frames;
C. compute the acoustic score of each phoneme of the wake word as follows:
$$ CM_{phone_i} = \frac{Score_{phone_i} - \sum_{k=K_i^{start}}^{K_i^{end}} Score_{frame_k}}{K_i^{end} - K_i^{start}} $$
where $K_i^{start}$ and $K_i^{end}$ are the start time and end time of the i-th phoneme;
$CM_{phone_i}$ is the confirmation score of the i-th phoneme, the subscript $phone_i$ denotes the i-th phoneme, $Score_{phone_i}$ is the decoding score of the i-th phoneme, and $Score_{frame_k}$ is the score of the k-th frame obtained by decoding on the wake-word confirmation network;
D. compute the final confirmation score of the wake word as follows:
$$ CM_{word} = \frac{1}{N} \sum_{i=1}^{N} CM_{phone_i}. $$
2. The implementation method of a voice wake-up module according to claim 1, characterized in that: training of the acoustic model (8) is divided into two parts, the phoneme acoustic models and the garbage models, i.e. Garbage models; the phoneme acoustic models are trained with the acoustic model training methods of conventional speech recognition, a database being chosen and the models being obtained under the maximum likelihood estimation criterion and the minimum phone error discriminative training criterion; the Garbage models absorb speech other than the wake word and are trained on the same database as the phoneme models: the similarity between each pair of phoneme models is computed, the phonemes are divided into 20 classes, all training data belonging to the phonemes of each class are merged, and the corresponding Garbage model is trained under the maximum likelihood estimation criterion, yielding 20 classes of Garbage models.
3. The implementation method of a voice wake-up module according to claim 1, characterized in that: the method can be ported to run on general-purpose ARM or DSP processors and is applied in the vehicle-mounted and household-appliance fields.
4. A vehicle-mounted voice wake-up system, characterized by comprising: a microprocessor, the voice wake-up module of claim 1, an audio conversion device, a recording device, an audio processing device and a public address system, the voice wake-up module running on the microprocessor, the specific process being as follows:
First step: the microprocessor is connected to the audio processing device and controls it to output audio; the audio processing device is connected to the public address system, which amplifies the audio to be played and drives the loudspeakers, completing the audio playback operation;
Second step: the recording device is connected to the audio conversion device; when the user speaks the voice wake word, the recording device captures the speech and passes it to the audio conversion device for conversion, completing the voice collection operation;
Third step: the audio conversion device converts the speech captured by the recording device and passes the converted data to the microprocessor, which runs the voice wake-up module, completing the voice data conversion operation;
Fourth step: the microprocessor is connected to the audio conversion device and runs the voice wake-up module on the speech input from the audio conversion device; if the voice wake-up information is correctly recognized, the microprocessor controls the audio processing device to play a voice prompt tone, completing the voice wake-up and prompt tone playback operation; if recognition fails, the process continues with the voice collection operation of the second step.
CN201210455175.2A 2012-11-13 2012-11-13 A kind of implementation method of voice wake-up module and application CN102999161B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210455175.2A CN102999161B (en) 2012-11-13 2012-11-13 A kind of implementation method of voice wake-up module and application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210455175.2A CN102999161B (en) 2012-11-13 2012-11-13 A kind of implementation method of voice wake-up module and application

Publications (2)

Publication Number Publication Date
CN102999161A CN102999161A (en) 2013-03-27
CN102999161B true CN102999161B (en) 2016-03-02

Family

ID=47927817

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210455175.2A CN102999161B (en) 2012-11-13 2012-11-13 A kind of implementation method of voice wake-up module and application

Country Status (1)

Country Link
CN (1) CN102999161B (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9240182B2 (en) * 2013-09-17 2016-01-19 Qualcomm Incorporated Method and apparatus for adjusting detection threshold for activating voice assistant function
CN103714815A (en) * 2013-12-09 2014-04-09 何永 Voice control method and device thereof
CN103943105A (en) * 2014-04-18 2014-07-23 安徽科大讯飞信息科技股份有限公司 Voice interaction method and system
CN104282307A (en) * 2014-09-05 2015-01-14 中兴通讯股份有限公司 Method, device and terminal for awakening voice control system
CN105575395A (en) * 2014-10-14 2016-05-11 中兴通讯股份有限公司 Voice wake-up method and apparatus, terminal, and processing method thereof
CN104464723B (en) * 2014-12-16 2018-03-20 科大讯飞股份有限公司 A kind of voice interactive method and system
CN104616653B (en) * 2015-01-23 2018-02-23 北京云知声信息技术有限公司 Wake up word matching process, device and voice awakening method, device
CN105096939B (en) * 2015-07-08 2017-07-25 百度在线网络技术(北京)有限公司 voice awakening method and device
CN106469554B (en) * 2015-08-21 2019-11-15 科大讯飞股份有限公司 A kind of adaptive recognition methods and system
CN105141919A (en) * 2015-09-01 2015-12-09 武汉同迅智能科技有限公司 Monitoring terminal device remotely controlled by voice
CN106653010A (en) * 2015-11-03 2017-05-10 络达科技股份有限公司 Electronic apparatus and voice trigger method therefor
CN105632486B (en) * 2015-12-23 2019-12-17 北京奇虎科技有限公司 Voice awakening method and device of intelligent hardware
CN105654949B (en) * 2016-01-07 2019-05-07 北京云知声信息技术有限公司 A kind of voice awakening method and device
CN105702253A (en) * 2016-01-07 2016-06-22 北京云知声信息技术有限公司 Voice awakening method and device
CN105812573A (en) * 2016-04-28 2016-07-27 努比亚技术有限公司 Voice processing method and mobile terminal
CN106297777B (en) * 2016-08-11 2019-11-22 广州视源电子科技股份有限公司 A kind of method and apparatus waking up voice service
CN106094673A (en) * 2016-08-30 2016-11-09 奇瑞商用车(安徽)有限公司 Automobile wake-up word system and control method thereof
CN106611597B (en) * 2016-12-02 2019-11-08 百度在线网络技术(北京)有限公司 Voice awakening method and device based on artificial intelligence
CN106847273A (en) * 2016-12-23 2017-06-13 北京云知声信息技术有限公司 Wake-up word selection method and device for speech recognition
CN106653022A (en) * 2016-12-29 2017-05-10 百度在线网络技术(北京)有限公司 Voice awakening method and device based on artificial intelligence
CN107220532A (en) * 2017-04-08 2017-09-29 网易(杭州)网络有限公司 Method and apparatus for identifying a user by voice
CN107123417A (en) * 2017-05-16 2017-09-01 上海交通大学 Customized voice wake-up optimization method and system based on discriminative training
WO2019113911A1 (en) * 2017-12-15 2019-06-20 海尔优家智能科技(北京)有限公司 Device control method, cloud device, smart device, computer medium and device
CN108962240A (en) * 2018-06-14 2018-12-07 百度在线网络技术(北京)有限公司 A kind of sound control method and system based on earphone
CN109878218A (en) * 2019-01-30 2019-06-14 厦门爱立得科技有限公司 A kind of printer and its Method of printing with intelligent sound control

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1256460A (en) * 1999-11-19 2000-06-14 清华大学 Phonetic command controller
CN101516005A (en) * 2008-02-23 2009-08-26 华为技术有限公司 Speech recognition channel selecting system, method and channel switching device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020143540A1 (en) * 2001-03-28 2002-10-03 Narendranath Malayath Voice recognition system using implicit speaker adaptation
KR101056511B1 (en) * 2008-05-28 2011-08-11 (주)파워보이스 Speech Segment Detection and Continuous Speech Recognition System in Noisy Environment Using Real-Time Call Command Recognition

Also Published As

Publication number Publication date
CN102999161A (en) 2013-03-27

Similar Documents

Publication Publication Date Title
CN101206857B (en) Method and system for modifying speech processing arrangement
US9117449B2 (en) Embedded system for construction of small footprint speech recognition with user-definable constraints
EP3050052B1 (en) Speech recognizer with multi-directional decoding
US20150066506A1 (en) System and Method of Text Zoning
KR101383552B1 (en) Speech recognition method of sentence having multiple instruction
KR101417975B1 (en) Method and system for endpoint automatic detection of audio record
EP1301922B1 (en) System and method for voice recognition with a plurality of voice recognition engines
US20130218563A1 (en) Speech understanding method and system
KR100856358B1 (en) Spoken user interface for speech-enabled devices
EP2550651B1 (en) Context based voice activity detection sensitivity
JP2008501991A (en) Performance prediction for interactive speech recognition systems.
US8972252B2 (en) Signal processing apparatus having voice activity detection unit and related signal processing methods
JP2014077969A (en) Dialogue system and determination method of speech to dialogue system
US9443527B1 (en) Speech recognition capability generation and control
US9437186B1 (en) Enhanced endpoint detection for speech recognition
CN103236260B (en) Speech recognition system
US20110218798A1 (en) Obfuscating sensitive content in audio sources
US5983186A (en) Voice-activated interactive speech recognition device and method
CN101281745B (en) Interactive system for vehicle-mounted voice
JP2002140089A (en) Method and apparatus for pattern recognition training wherein noise reduction is performed after inserted noise is used
JP4304952B2 (en) On-vehicle controller and program for causing computer to execute operation explanation method thereof
CN102298443B (en) Smart home voice control system combined with video channel and control method thereof
KR20020004954A (en) Spoken user interface for speech-enabled devices
TWI466101B (en) Method and system for speech recognition
CN105009204A (en) Speech recognition power management

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
COR Change of bibliographic data
CB02 Change of applicant information

Address after: No. 666 Wangjiang Road, High-tech Development Zone, Hefei, Anhui Province 230088

Applicant after: Iflytek Co., Ltd.

Address before: No. 666 Wangjiang Road, High-tech Development Zone, Hefei, Anhui Province 230088

Applicant before: Anhui USTC iFLYTEK Co., Ltd.

C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20190212

Address after: 511458 X1301-G5145 (Cluster Registration) (JM) No. 106 Fengze East Road, Nansha District, Guangzhou, Guangdong Province

Patentee after: iFLYTEK South China Artificial Intelligence Research Institute (Guangzhou) Co., Ltd.

Address before: 230088 666 Wangjiang West Road, Hefei hi tech Development Zone, Anhui

Patentee before: Iflytek Co., Ltd.