CN107240397A

CN107240397A - Smart lock based on voiceprint recognition, and speech recognition method and system thereof

Info

Publication number: CN107240397A
Application number: CN201710692968.9A
Authority: CN
Inventors: 王炜婷; 温坤华; 朱慧广; 陈俊
Original assignee: Guangdong University of Technology
Current assignee: Guangdong University of Technology
Priority date: 2017-08-14
Filing date: 2017-08-14
Publication date: 2017-10-10

Abstract

The present invention discloses a speech recognition method for a smart lock based on voiceprint recognition, comprising: extracting the Mel-frequency cepstral coefficients (MFCCs) of a voice signal to be verified; feeding the MFCCs as the input layer into a discriminative deep belief network (DDBN) with a preset parameter space to obtain the hidden-layer output of the DDBN, and taking that output as the feature vector of the MFCCs; comparing the feature vector with the Gaussian mixture model (GMM) built in advance for each registered voice signal, and calculating the posterior probability that the feature vector matches each registered voice signal; and judging whether the maximum of the posterior probabilities exceeds a preset threshold. If it does, the voice signal to be verified passes verification and the lock is unlocked; otherwise the lock remains locked. The invention improves the recognition rate for the target speaker, reduces the false acceptance probability, and ensures the security of the door lock. The invention also discloses a speech recognition system and a smart lock, which have the same advantages.

Description

Smart lock based on voiceprint recognition, and speech recognition method and system thereof

Technical field

The present invention relates to the technical field of signal processing, and in particular to a speech recognition method for a smart lock based on voiceprint recognition. The invention further relates to a speech recognition system for a smart lock based on voiceprint recognition, and to a smart lock based on voiceprint recognition that includes this speech recognition system.

Background art

With burglaries occurring frequently, how to upgrade household door locks and keep homes secure is a new challenge that people have to face.

At present, traditional household door locks on the market are mostly unlocked with a key or a password. Keys are easy to lose, easy to copy, and not tied to a particular person. Likewise, passwords are easily forgotten and easily leaked, which wastes resources and exposes the user to security risks. In addition, existing smart devices that use other biometric authentication techniques, such as fingerprint recognition and iris recognition, are expensive and require inconvenient, unhygienic contact: fingerprint recognition requires the finger to be placed on a sensor, and iris recognition requires the eye to be brought close to a camera. They do not provide a good user experience and place an economic burden on the user.

Smart door locks based on voiceprint recognition already exist in the prior art. A voiceprint lock relies on pattern recognition of sound and works on much the same principle as a fingerprint lock. The owner opens the lock by speaking a code word set in advance, and the lock does not open if someone else speaks the same code word; the main basis on which such a lock distinguishes voices is timbre. However, most voiceprint locks identify and verify voices using only a GMM (Gaussian Mixture Model) training method. Experiments show that, with a GMM-only speaker recognition model, the text content of the speech used in training and testing has a large influence on the recognition result. When the text spoken by a non-target speaker is close to the text spoken by the target speaker, using that speech in a target-speaker test greatly increases the probability of false acceptance.

Therefore, how to improve the recognition rate of a voiceprint lock for the target speaker, reduce the false acceptance probability, and ensure the security of the door lock is a technical problem that those skilled in the art urgently need to solve.

Summary of the invention

An object of the present invention is to provide a speech recognition method for a smart lock based on voiceprint recognition that improves the recognition rate of the voiceprint lock for the target speaker, reduces the false acceptance probability, and ensures the security of the door lock. Another object of the present invention is to provide a speech recognition system for a smart lock based on voiceprint recognition, and a smart lock based on voiceprint recognition that includes this speech recognition system.

To solve the above technical problem, the present invention provides a speech recognition method for a smart lock based on voiceprint recognition, comprising:

extracting the Mel-frequency cepstral coefficients (MFCCs) of a voice signal to be verified;

feeding the MFCCs as the input layer into a discriminative deep belief network (DDBN) with a preset parameter space to obtain the hidden-layer output of the DDBN, and taking that output as the feature vector of the MFCCs;

comparing the feature vector with the Gaussian mixture model (GMM) built in advance for each registered voice signal, and calculating the posterior probability that the feature vector matches each registered voice signal;

judging whether the maximum of the posterior probabilities exceeds a preset threshold; if so, the voice signal to be verified passes verification and the lock is unlocked; otherwise the lock remains locked.

Preferably, before extracting the MFCCs of the voice signal to be verified, the method further comprises:

recording registration speech from the registrant of the smart lock, and building a GMM for each registered voice signal that is input.

Preferably, building a GMM for each registered voice signal that is input specifically comprises:

extracting the MFCCs of each registered voice signal, and binding each input registered voice signal to a preset number;

training the DDBN with the MFCCs of each registered voice signal as the input layer and the number bound to each registered voice signal as the output layer, and obtaining the parameter space of the DDBN;

feeding the MFCCs of each registered voice signal into the DDBN to obtain the hidden-layer output of the DDBN, and taking that output as the feature vector of the MFCCs of each registered voice signal;

taking each feature vector as input, and building the GMM with the expectation-maximization (EM) algorithm as the criterion.

Preferably, extracting the MFCCs of the voice signal to be verified or of each registered voice signal specifically comprises, in order: applying pre-emphasis and a Hamming window to the signal, denoising it by Wiener filtering, applying a fast Fourier transform (FFT), filtering with triangular band-pass filters, and applying a discrete cosine transform (DCT).

Preferably, after feeding the MFCCs of each registered voice signal into the DDBN to obtain its hidden-layer output, the method further comprises:

checking the quality of the hidden-layer output of the DDBN by the formula:

[formula missing in the source text]

If the value of D exceeds a preset threshold, the quality of the hidden-layer output meets the preset requirement.

Here D is the discrimination, L_i is the weight of the feature vector corresponding to each registered voice signal, S_i is an element of the matrix S, S = S_b - S_w, S_b is the between-class scatter matrix, and S_w is the within-class scatter matrix.

Preferably, before extracting the MFCCs of the voice signal to be verified and after recording registration speech from the registrant of the smart lock, the method further comprises:

collecting several untrained voice signals;

if the current number of untrained voice signals is below a preset threshold, feeding the MFCCs of each untrained voice signal into the DDBN to correct its parameter space;

if the current number of untrained voice signals exceeds the preset threshold, feeding the MFCCs of each untrained voice signal into the corrected DDBN to obtain the corresponding corrected feature vectors, and using the corrected feature vectors to correct the GMM.

Preferably, correcting the GMM specifically comprises:

supposing that the untrained voice signals correspond to T corrected feature vectors:

[expression missing in the source text]

and that the likelihood ratios corresponding to the corrected feature vectors are:

{K₁,K₂,K₃,...,K_T}

then correcting the mean and variance of the GMM by the formula:

[formula missing in the source text]

The present invention also provides a speech recognition system for a smart lock based on voiceprint recognition, comprising:

an extraction module, for extracting the MFCCs of a voice signal to be verified;

a correction module, for feeding the MFCCs as the input layer into a DDBN with a preset parameter space to obtain the hidden-layer output of the DDBN, and taking that output as the feature vector of the MFCCs;

a computing module, for comparing the feature vector with the GMM built in advance for each registered voice signal, and calculating the posterior probability that the feature vector matches each registered voice signal;

a verification module, for judging whether the maximum of the posterior probabilities exceeds a preset threshold; if so, the voice signal to be verified passes verification and the lock is unlocked; otherwise the lock remains locked.

Preferably, the system further comprises:

a registration module, for recording registration speech from the registrant of the smart lock;

a training module, for building a GMM for each registered voice signal that is input.

The present invention also provides a smart lock based on voiceprint recognition, comprising a sound collector, a lock, and the speech recognition system according to either of the two items above.

The speech recognition method for a smart lock based on voiceprint recognition provided by the present invention mainly comprises four steps. In the first step, after the voice signal to be verified is input, it is pre-processed and its MFCCs are extracted. In the second step, the MFCCs are processed further by the DDBN: they are fed as the input layer into the DDBN, which has a preset parameter space, so the hidden-layer output of the DDBN is obtained directly and is taken as the feature vector of the MFCCs of the voice signal to be verified. In the third step, the smart lock generally stores several preset registered voice signals, and a GMM has been built in advance for each of them so that each can be analysed accurately. The voice signal to be verified, after pre-processing and DDBN processing, is compared with each registered voice signal; that is, the feature vector obtained in the second step is compared with the GMM of each registered voice signal, and the posterior probability that the feature vector matches each registered voice signal is calculated during the comparison. In the fourth step, to improve recognition speed and quality, the maximum of the posterior probabilities is compared with a preset threshold. If the maximum is greater than or equal to the preset threshold, the confidence of that posterior probability is high, and so is the confidence that the voice signal to be verified matches the corresponding registered voice signal; the voice signal to be verified passes verification and the lock can be unlocked normally. Otherwise the voice signal to be verified fails verification and the lock remains locked.

In summary, the speech recognition method provided by the present invention pre-processes the voice signal to be verified to obtain its MFCCs, feeds the MFCCs into the DDBN to obtain a feature vector, and compares that feature vector with the GMM of each registered voice signal to calculate the recognition probability of the voice signal to be verified. Compared with the prior art, the present invention uses the DDBN to correct the MFCCs of the voice signal to be verified, which reduces the dependence on speech text that arises when only a GMM is used, improves the recognition rate of the voiceprint lock for the target speaker, reduces the false acceptance probability, and ensures the security of the door lock.

Brief description of the drawings

To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed to describe them are briefly introduced below. The drawings described below are only embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flow chart of an embodiment provided by the present invention;

Fig. 2 is a module diagram of an embodiment provided by the present invention;

Fig. 3 is a structural diagram of an embodiment provided by the present invention;

Fig. 4 is a schematic diagram of the internal structure of Fig. 3.

In Figs. 2 to 4:

extraction module 1, correction module 2, computing module 3, verification module 4, registration module 5, training module 6, sound collector 7, button 8, display screen 9, voice prompter 10, memory 11, lock 12, controller 13.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

Refer to Fig. 1, which is a flow chart of an embodiment provided by the present invention.

In an embodiment provided by the present invention, the speech recognition method for a smart lock based on voiceprint recognition mainly comprises four steps: extracting the MFCCs of the voice signal to be verified; feeding the MFCCs as the input layer into a DDBN with a preset parameter space to obtain the hidden-layer output of the DDBN, and taking that output as the feature vector of the MFCCs; comparing the feature vector with the GMM built in advance for each registered voice signal, and calculating the posterior probability that the feature vector matches each registered voice signal; and judging whether the maximum of the posterior probabilities exceeds a preset threshold. If it does, the voice signal to be verified passes verification and the lock is unlocked; otherwise the lock remains locked.

In the first step, after the voice signal to be verified is input, it is pre-processed and its MFCCs are extracted. The pre-processing consists of the following operations, applied in order: pre-emphasis and a Hamming window, denoising by Wiener filtering, a fast Fourier transform (FFT), filtering with triangular band-pass filters, and a discrete cosine transform (DCT). The result is the Mel-frequency cepstral coefficients (MFCCs).
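The pre-processing chain above can be sketched in NumPy as follows. This is a minimal illustration, not the patent's implementation: the function name `mfcc` and every parameter value (16 kHz sampling, 25 ms frames, 26 filters, 13 coefficients, pre-emphasis factor 0.97) are assumptions, and the Wiener-filter denoising step is left out to keep the sketch short.

```python
import numpy as np

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13, preemph=0.97):
    """Return an (n_frames, n_ceps) array of MFCCs (all defaults are assumed)."""
    signal = np.asarray(signal, dtype=float)
    # 1. Pre-emphasis: boost the high-frequency part of the signal.
    x = np.append(signal[0], signal[1:] - preemph * signal[:-1])
    # 2. Split into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hamming(frame_len)
    # (Wiener-filter denoising from the text is omitted in this sketch.)
    # 3. FFT power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # 4. Triangular band-pass filters spaced evenly on the mel scale.
    def to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)
    def from_mel(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = from_mel(np.linspace(to_mel(0.0), to_mel(sample_rate / 2), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, mid):          # rising edge of the triangle
            fbank[m - 1, k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge of the triangle
            fbank[m - 1, k] = (hi - k) / (hi - mid)
    log_energy = np.log(power @ fbank.T + 1e-10)
    # 5. DCT-II of the log filter-bank energies; keep the first n_ceps terms.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    return log_energy @ dct.T
```

With the assumed defaults, one second of audio at 16 kHz yields 98 frames of 13 coefficients each.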

In the second step, the MFCCs of the voice signal to be verified are processed further by the DDBN. The MFCCs are fed as the input layer into the DDBN, which has a preset parameter space, so the hidden-layer output of the DDBN is obtained directly and is taken as the feature vector of the MFCCs of the voice signal to be verified.
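As a rough illustration of this step, the sketch below pushes MFCC frames through a stack of sigmoid layers and returns the last hidden activation as the feature vector. The layer sizes, the random placeholder weights, and the names `hidden_output` and `layers` are assumptions; the source does not specify the network topology or how the parameter space is stored.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_output(mfcc_frames, layers):
    """Propagate MFCC frames through the stored hidden layers.

    layers: list of (W, b) pairs standing in for the preset parameter space.
    Returns the activation of the last hidden layer, one row per frame.
    """
    h = mfcc_frames
    for W, b in layers:
        h = sigmoid(h @ W + b)
    return h

rng = np.random.default_rng(0)
# Placeholder parameters; a real lock would load the trained weights instead.
layers = [(rng.normal(size=(13, 32)), np.zeros(32)),
          (rng.normal(size=(32, 16)), np.zeros(16))]
features = hidden_output(rng.normal(size=(98, 13)), layers)
print(features.shape)  # (98, 16)
```

No training happens here: at verification time the parameters are fixed, so the feature vector is obtained with a single forward pass.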

In the third step, the smart lock generally stores several preset registered voice signals, and a GMM has been built in advance for each of them so that each can be analysed accurately. The voice signal to be verified, after pre-processing and DDBN processing, is compared with each registered voice signal; that is, the feature vector obtained in the second step is compared with the GMM of each registered voice signal, and the posterior probability that the feature vector matches each registered voice signal is calculated during the comparison.

In the fourth step, after the posterior probabilities that the voice signal to be verified matches each registered voice signal have been calculated, the maximum of the posterior probabilities is compared with a preset threshold to improve recognition speed and quality. If the maximum is greater than or equal to the preset threshold, the confidence of that posterior probability is high, and so is the confidence that the voice signal to be verified matches the corresponding registered voice signal. The voice signal to be verified then passes verification and the lock can be unlocked normally. Otherwise the voice signal to be verified fails verification and the lock remains locked.
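The fourth step reduces to a maximum-and-threshold test, which might look like the sketch below. The function name `verify`, the dictionary layout, and the handling of an empty registry are assumptions made for illustration.

```python
def verify(posteriors, threshold):
    """Decide whether to unlock.

    posteriors: dict mapping a registration number to the posterior
    probability that the voice signal matches that registered signal.
    Returns (unlock, matched_number).
    """
    if not posteriors:
        return False, None  # nothing registered: stay locked (assumed behaviour)
    best = max(posteriors, key=posteriors.get)
    # The text treats "greater than" as including equality.
    if posteriors[best] >= threshold:
        return True, best
    return False, None
```

For example, `verify({"01": 0.2, "02": 0.9}, 0.8)` unlocks for registration number "02", while lowering the best score to 0.5 keeps the lock closed.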

In summary, the speech recognition method provided by this embodiment pre-processes the voice signal to be verified to obtain its MFCCs, feeds the MFCCs into the DDBN to obtain a feature vector, and compares that feature vector with the GMM of each registered voice signal to calculate the recognition probability of the voice signal to be verified. Compared with the prior art, this embodiment uses the DDBN to correct the MFCCs of the voice signal to be verified, which reduces the dependence on speech text that arises when only a GMM is used, improves the recognition rate of the voiceprint lock for the target speaker, reduces the false acceptance probability, and ensures the security of the door lock.

So that the voice signal to be verified can be compared with registered voice signals, a registration step must be carried out before the verification step. That is, before the MFCCs of the voice signal to be verified are extracted, the method further comprises: recording registration speech from the registrant of the smart lock, and building a GMM for each registered voice signal that is input.

Specifically, the registrant (that is, the owner of the smart lock) speaks several segments of speech to the smart lock following a prompt text, for example two segments for the same text, and the smart lock records and saves them. After recording, the signals are first processed in the same way as a voice signal to be verified: the MFCCs of each registered voice signal are extracted. The MFCCs of each registered voice signal are then bound to a preset number so that they can be identified and matched in later steps. Preferably, the number is entered manually by the registrant; if the entered number already exists, the smart lock prompts the user to enter a new one.
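The number-binding rule described above can be sketched with a small in-memory store. The class name `EnrollmentStore` and its method are hypothetical; the source only states that a number which already exists must be rejected so that the user can enter a new one.

```python
class EnrollmentStore:
    """Hypothetical store that binds MFCCs to registration numbers."""

    def __init__(self):
        self._by_number = {}

    def bind(self, number, mfccs):
        """Bind MFCCs to a number; return False if the number is taken."""
        if number in self._by_number:
            return False  # caller should prompt the user for a new number
        self._by_number[number] = mfccs
        return True

store = EnrollmentStore()
store.bind("01", [[0.1, 0.2]])      # accepted
store.bind("01", [[0.3, 0.4]])      # rejected: "01" is already bound
```

Returning a flag rather than raising an exception matches the interactive flow in the text, where a duplicate is an expected event that leads to a re-entry prompt.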

Once the registered voice signals have been bound to numbers, each registered voice signal can be processed further and corrected to improve signal quality and recognition resolution. Specifically, a discriminative deep belief network (DDBN, Division Deep Belief Network) is used: the MFCCs of each registered voice signal serve as the input layer and the number bound to each registered voice signal serves as the output layer, and the DDBN is trained on this basis. The parameter space of the DDBN is obtained when training is complete.

After the DDBN has been trained, the MFCCs of each registered voice signal are fed into the DDBN to obtain its hidden-layer output, and that output is taken as the feature vector of the MFCCs of each registered voice signal. In this way the DDBN processes each registered voice signal further and corrects its MFCCs, which improves the clarity and distinguishability of each registered voice signal.

Finally, the feature vectors of the registered voice signals obtained after DDBN correction are used as input to build the GMM. To improve accuracy and analysis quality, the GMM can be built with the expectation-maximization (EM) algorithm as the criterion. After the GMM has been built, the smart lock binds it to the corresponding number and stores the trained GMM.
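For readers unfamiliar with EM, the sketch below fits a GMM to a set of feature vectors in plain NumPy. It assumes diagonal covariances, a fixed number of components and iterations, and random initialisation from the data; the source specifies none of these, and the function names are illustrative.

```python
import numpy as np

def log_component_densities(X, weights, means, variances):
    """Log of weight_k * N(x | mean_k, diag(var_k)), shape (n_samples, k)."""
    diff = X[:, None, :] - means[None, :, :]
    return (np.log(weights)
            - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
            - 0.5 * np.sum(diff ** 2 / variances, axis=2))

def fit_gmm(X, n_components=2, n_iter=50, seed=0):
    """Fit a diagonal-covariance GMM with EM (all settings are assumed)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    means = X[rng.choice(n, n_components, replace=False)]
    variances = np.tile(X.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        log_p = log_component_densities(X, weights, means, variances)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        resp = np.exp(log_p - log_norm)
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X ** 2) / nk[:, None] - means ** 2, 1e-6)
    return weights, means, variances

def log_likelihood(X, gmm):
    """Total log-likelihood of the rows of X under a fitted GMM."""
    weights, means, variances = gmm
    log_p = log_component_densities(X, weights, means, variances)
    return float(np.logaddexp.reduce(log_p, axis=1).sum())
```

One model would be fitted per registration number, and `log_likelihood` then supplies the score that the verification stage compares across registered voice signals.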

In the verification stage, the posterior probability that the voice signal to be verified matches each registered voice signal can be calculated as follows. Let the feature vector of the MFCCs of the voice signal to be verified be $X$, let the number of registered voice signals be N, and let $\lambda_n$ be the GMM of the n-th registered voice signal. The posterior probability that the voice signal to be verified is registered voice signal n is:

$$P(\lambda_n \mid X) = \frac{P(X \mid \lambda_n)\,P(\lambda_n)}{P(X)}$$

Here $P(\lambda_n)$ is the prior probability that the n-th registered voice signal is input, and $P(X)$ is the probability of the feature vector $X$ over all voice signals.

The final recognition result is given by the maximum a posteriori criterion:

$$n^* = \arg\max_{1 \le n \le N} P(\lambda_n \mid X)$$

In general, because the prior probability of each voice signal is unknown, the priors can be taken to be equal:

$$P(\lambda_n) = \frac{1}{N}, \quad n = 1, 2, \ldots, N$$

In addition, for a given observed feature vector $X$, $P(X)$ is a fixed constant that is the same for all voice signals. Finding the maximum posterior probability therefore reduces to finding the maximum likelihood:

$$n^* = \arg\max_{1 \le n \le N} P(X \mid \lambda_n)$$

To make the model more general, the log-likelihood can be used as the criterion. Suppose the voice signal to be verified attains the maximum posterior probability for registered voice signal $n^*$, whose GMM is $\lambda_{n^*}$, and let $\lambda_{\bar{n}}$ be the GMM of another registered voice signal. The log-likelihood ratio is then:

$$\Lambda(X) = \log P(X \mid \lambda_{n^*}) - \log P(X \mid \lambda_{\bar{n}})$$

Here $\lambda_{\bar{n}}$ can be the GMM of any one other registered voice signal, or it can range over the GMMs of all other registered voice signals. In the former case the single log-likelihood ratio must exceed the threshold K; in the latter case every log-likelihood ratio must exceed K before the voice signal to be verified and the registered voice signal are confirmed to belong to the same person.
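The log-likelihood-ratio test can be sketched as below, given one log-likelihood per registered model. The function name `accept`, the choice of the first other model for the "any one other" variant, and the rejection when only one model is registered (the ratio is then undefined) are all assumptions.

```python
def accept(log_likes, threshold_k, compare_all=True):
    """Return the matched registration number, or None to stay locked.

    log_likes: dict mapping a registration number to the log-likelihood
    of the feature vector under that registered model.
    """
    best = max(log_likes, key=log_likes.get)
    others = [v for k, v in log_likes.items() if k != best]
    if not others:
        return None  # ratio undefined with a single model (assumed behaviour)
    ratios = [log_likes[best] - v for v in others]
    if compare_all:
        ok = all(r > threshold_k for r in ratios)   # every ratio must exceed K
    else:
        ok = ratios[0] > threshold_k                # one arbitrary other model
    return best if ok else None
```

Comparing against all other models is the stricter variant: a single close competitor is enough to reject the attempt.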

Further, if the log-likelihood ratio exceeds K*, where K* > K, the voice signal is of high quality. Its MFCCs are marked as untrained and bound to the registrant's number and the log-likelihood ratio, and the smart lock stores the MFCCs of the voice signal to be verified.

Moreover, a GMM usually needs a large amount of training data; if the data are insufficient, system performance and accuracy suffer noticeably. This embodiment therefore adds a training-data collection step before the MFCCs of the voice signal to be verified are extracted and after the registrant of the smart lock has recorded registration speech.

Specifically, when the smart lock is in neither the registration stage nor the verification stage, several untrained voice signals of the registrant are collected. If the current number of untrained voice signals is below a preset threshold, for example fewer than 50, the MFCCs of each untrained voice signal are fed directly into the DDBN as training data to correct its parameter space, which improves the accuracy with which the DDBN corrects the MFCCs of voice signals to be verified and of registered voice signals. If enough untrained voice signals have been collected, for example more than 50, the MFCCs of each untrained voice signal are fed into the corrected DDBN to obtain the corresponding feature vectors, and these feature vectors are used to correct the GMM built earlier.
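The bookkeeping described above (keep high-quality samples, then choose what to refine based on how many have accumulated) might be organised as in the sketch below. The class `UntrainedPool`, its method names, and the action labels are hypothetical, and treating a count exactly equal to the threshold as "enough" is an assumption, since the source only covers the below-threshold and above-threshold cases.

```python
class UntrainedPool:
    """Hypothetical pool of verified samples not yet used for training."""

    def __init__(self, min_count=50):
        self.min_count = min_count
        self.samples = []

    def offer(self, mfccs, number, llr, k_star):
        """Keep a sample only if its log-likelihood ratio exceeds K*."""
        if llr > k_star:
            self.samples.append({"mfccs": mfccs, "number": number, "llr": llr})
            return True
        return False

    def next_action(self):
        if not self.samples:
            return "idle"
        if len(self.samples) < self.min_count:
            return "refine_ddbn"   # too few samples: adjust the parameter space
        return "refine_gmm"        # enough samples: correct the GMM as well

pool = UntrainedPool(min_count=50)
pool.offer([[0.1, 0.2]], "01", llr=9.0, k_star=8.0)
print(pool.next_action())  # refine_ddbn
```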

Specifically, when the GMM is corrected, suppose the untrained voice signals correspond to T corrected feature vectors:

[expression missing in the source text]

and that the likelihood ratios corresponding to the corrected feature vectors are:

{K₁, K₂,K₃,...,K_T}

Finally, the mean and variance of the GMM already built are corrected by the formula:

[formula missing in the source text]

where L_i is the weight corresponding to each feature vector or corrected feature vector.

In addition, whether the DDBN is correcting the MFCCs of a voice signal to be verified or of a registered voice signal, its hidden-layer output directly affects the accuracy of the feature vector. This embodiment therefore adds a quality check on the hidden-layer output of the DDBN.

Specifically, the quality of the hidden-layer output of the DDBN can be checked by the formula:

[formula missing in the source text]

The discrimination D can be defined under the criterion of maximum between-class distance and minimum within-class distance. Suppose there are K registered voice signals. Taking registered voice signal n as an example, it has c feature vectors, each with a corresponding weight L_i; the average weight of the feature vectors of registered voice signal n is then:

[formula missing in the source text]

Define the matrix S = S_b - S_w, with S_i an element of S, where S_b is the between-class scatter matrix and S_w is the within-class scatter matrix. The larger D is, the better the quality of the feature components extracted by the hidden layer, and vice versa. If the value of D exceeds the preset threshold, the quality of the hidden-layer output meets the preset requirement.
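The scatter-matrix part of this check can be illustrated as follows. The sketch computes S = S_b - S_w from labelled feature vectors using the standard definitions of between-class and within-class scatter. Because the weighted formula for D is not available in this text, the scalar `np.trace(S)` is used purely as a stand-in summary and is not the patent's D; the function name and the test data are likewise assumptions.

```python
import numpy as np

def scatter_difference(features, labels):
    """Return S = S_b - S_w for feature vectors grouped by label."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    overall_mean = features.mean(axis=0)
    d = features.shape[1]
    s_b = np.zeros((d, d))   # between-class scatter
    s_w = np.zeros((d, d))   # within-class scatter
    for c in np.unique(labels):
        xc = features[labels == c]
        mc = xc.mean(axis=0)
        diff = (mc - overall_mean)[:, None]
        s_b += len(xc) * (diff @ diff.T)
        centered = xc - mc
        s_w += centered.T @ centered
    return s_b - s_w

rng = np.random.default_rng(0)
labels = np.array([0] * 50 + [1] * 50)
separated = np.vstack([rng.normal(-10, 1, (50, 2)), rng.normal(10, 1, (50, 2))])
# Stand-in score (NOT the patent's D): positive when classes are well separated.
score = np.trace(scatter_difference(separated, labels))
```

Well-separated speakers give a large positive score, while heavily overlapping ones drive it negative, which mirrors the "larger is better" reading of D in the text.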

As shown in Fig. 2 Fig. 2 is a kind of module map of embodiment provided by the present invention.

The present embodiment also provides a kind of speech recognition system of the smart lock based on Application on Voiceprint Recognition, mainly including extraction module 1st, rectification module 2, computing module 3 and authentication module 4.Wherein, extraction module 1 is mainly used in the corresponding plum of voice signal to be verified That cepstrum coefficient.Rectification module 2 is mainly used in having parameter preset space using the mel cepstrum coefficients as input layer input Differentiation depth confidence network, exported, and fallen as the Mel with the hidden layer for obtaining the differentiation depth confidence network The characteristic vector of spectral coefficient.Computing module 3 is mainly used in build the characteristic vector and each registration voice signal in advance Gauss hybrid models are contrasted, and it is general with each posteriority for matching of registration voice signal respectively to calculate the characteristic vector Rate.Authentication module 4 is mainly used in judging whether the maximum in each posterior probability is more than predetermined threshold value, if it is, treating Voice signal is verified by checking, and unlocking operation is carried out to lockset 12；It is on the contrary then make lockset 12 keep lock-out state.

The speech recognition method used by this speech recognition system is the same as that described above and is not repeated here.

In addition, registration module 5 and training module 6 are added in the present embodiment. Registration module 5 is mainly used to record registration voice from the registrants of the smart lock. Training module 6 is in signal connection with registration module 5 and is mainly used to build a Gaussian mixture model for each registered voice signal that is input.
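The interplay of the two modules can be illustrated with a deliberately simplified Python sketch. As an assumption made only to keep the example short, each speaker model is a one-component mixture (a single diagonal Gaussian fitted by maximum likelihood) rather than a full multi-component Gaussian mixture model trained by expectation-maximization. The class `Registry`, the function `fit_single_gaussian`, and the variance floor are hypothetical.

```python
from typing import Dict, List, Tuple

# One mixture component: (weight, means, variances), diagonal covariance assumed.
Component = Tuple[float, List[float], List[float]]


def fit_single_gaussian(
    vectors: List[List[float]], var_floor: float = 1e-6
) -> List[Component]:
    """Maximum-likelihood mean and diagonal variance, returned as a 1-component mixture."""
    if not vectors:
        raise ValueError("at least one feature vector is required")
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dim)]
    variances = [
        # Floor the variance so a constant dimension does not yield zero variance.
        max(sum((v[d] - means[d]) ** 2 for v in vectors) / n, var_floor)
        for d in range(dim)
    ]
    return [(1.0, means, variances)]


class Registry:
    """Hypothetical stand-in for registration module 5 plus training module 6."""

    def __init__(self) -> None:
        self.models: Dict[str, List[Component]] = {}

    def register(self, number: str, vectors: List[List[float]]) -> None:
        """Bind a speaker number to a model trained on that speaker's feature vectors."""
        self.models[number] = fit_single_gaussian(vectors)


registry = Registry()
registry.register("001", [[0.0, 1.0], [2.0, 3.0]])
```

Here the feature vectors stand for the hidden-layer outputs obtained for one registrant, and the string key stands for the number bound to that registrant's voice signal.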

As shown in Fig. 3 and Fig. 4, Fig. 3 is a structure diagram of a specific embodiment provided by the present invention, and Fig. 4 is a schematic diagram of the internal structure of Fig. 3.

The present embodiment also provides a smart lock based on voiceprint recognition, mainly comprising sound collector 7, lock 12, buttons 8, display screen 9, voice prompting device 10, memory 11, controller 13, and a speech recognition system. The speech recognition system is the same as that described above and is not repeated here. Sound collector 7 is mainly used to collect voice signals. Lock 12 can be an electromagnetic lock. Buttons 8 are mainly used for the user to input a numeric number and the like. Display screen 9 is mainly used to provide feedback information to the user, such as the speech text, the number, or a prompt to re-enter. Voice prompting device 10 is mainly used to provide feedback information to the user, such as audio of the speech text. Memory 11 is mainly used to store the numbers corresponding to the Mel cepstrum coefficients of the registered voice signals or of the voice signal to be verified. Controller 13 is mainly used to control lock 12, according to the recognition result of the recognition system, to unlock or to remain locked.

The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A speech recognition method for a smart lock based on voiceprint recognition, characterized by comprising:

extracting the Mel cepstrum coefficients corresponding to the voice signal to be verified;

inputting the Mel cepstrum coefficients, as the input layer, into a discriminative deep belief network having a preset parameter space, so as to obtain the hidden-layer output of the discriminative deep belief network, which serves as the feature vector of the Mel cepstrum coefficients;

comparing the feature vector with the Gaussian mixture model built in advance for each registered voice signal, and calculating the posterior probability that the feature vector matches each registered voice signal;

judging whether the maximum of the posterior probabilities is greater than a preset threshold; if so, the voice signal to be verified passes verification and the lock is unlocked; otherwise, the lock is kept locked.

2. The speech recognition method according to claim 1, characterized in that, before extracting the Mel cepstrum coefficients corresponding to the voice signal to be verified, the method further comprises:

recording registration voice from the registrants of the smart lock, and building a Gaussian mixture model for each registered voice signal that is input.

3. The speech recognition method according to claim 2, characterized in that building a Gaussian mixture model for each registered voice signal that is input specifically comprises:

extracting the Mel cepstrum coefficients corresponding to each registered voice signal, and binding a preset number to each registered voice signal that is input;

training the discriminative deep belief network with the Mel cepstrum coefficients corresponding to each registered voice signal as the input layer and the number bound to each registered voice signal as the output layer, and obtaining the parameter space of the discriminative deep belief network;

inputting the Mel cepstrum coefficients corresponding to each registered voice signal into the discriminative deep belief network, so as to obtain the hidden-layer output of the discriminative deep belief network, which serves as the feature vector of the Mel cepstrum coefficients corresponding to each registered voice signal；

4. The speech recognition method according to claim 3, characterized in that extracting the Mel cepstrum coefficients corresponding to the voice signal to be verified or to each registered voice signal specifically comprises: successively applying, to the voice signal to be verified or to each registered voice signal, pre-emphasis, Hamming windowing, denoising by Wiener filtering, fast Fourier transform, filtering through triangular band-pass filters, and discrete cosine transform.

5. The speech recognition method according to claim 4, characterized in that, after inputting the Mel cepstrum coefficients corresponding to each registered voice signal into the discriminative deep belief network to obtain the hidden-layer output of the discriminative deep belief network, the method further comprises:

checking, by the formula:

$$D = \sum_{i=1}^{k} L_i \cdot S_i$$

the hidden-layer output quality of the discriminative deep belief network, wherein, if the value of D is greater than a preset threshold, the hidden-layer output quality meets the preset requirement;

wherein D is the discrimination, L_i is the weight corresponding to the feature vector of each registered voice signal, S_i is an element of matrix S, S = S_b - S_w, S_b is the between-class scatter matrix, and S_w is the within-class scatter matrix.

6. The speech recognition method according to claim 5, characterized in that, before extracting the Mel cepstrum coefficients corresponding to the voice signal to be verified and after recording registration voice from the registrants of the smart lock, the method further comprises:

collecting several untrained voice signals;

if the number of current untrained voice signals is less than a preset threshold, inputting the Mel cepstrum coefficients corresponding to each untrained voice signal into the discriminative deep belief network to correct its parameter space;

if the number of current untrained voice signals exceeds the preset threshold, inputting the Mel cepstrum coefficients corresponding to each untrained voice signal into the corrected discriminative deep belief network to obtain the corresponding corrected feature vectors, and correcting the Gaussian mixture model according to the corrected feature vectors.

7. The speech recognition method according to claim 6, characterized in that correcting the Gaussian mixture model specifically comprises:

$$\{K_1, K_2, K_3, \ldots, K_T\}$$

then, by the formula:

$$L_i = \frac{K_i}{\sum_{m=1}^{T} K_m}$$

correcting the mean and variance of the Gaussian mixture model.

8. A speech recognition system for a smart lock based on voiceprint recognition, characterized by comprising:

an extraction module, for extracting the Mel cepstrum coefficients corresponding to the voice signal to be verified;

a rectification module, for inputting the Mel cepstrum coefficients, as the input layer, into a discriminative deep belief network having a preset parameter space, so as to obtain the hidden-layer output of the discriminative deep belief network, which serves as the feature vector of the Mel cepstrum coefficients;

a computing module, for comparing the feature vector with the Gaussian mixture model built in advance for each registered voice signal, and calculating the posterior probability that the feature vector matches each registered voice signal;

an authentication module, for judging whether the maximum of the posterior probabilities is greater than a preset threshold; if so, the voice signal to be verified passes verification and the lock is unlocked; otherwise, the lock is kept locked.

9. The speech recognition system according to claim 8, characterized by further comprising:

10. A smart lock based on voiceprint recognition, characterized by comprising a sound collector, a lock, and the speech recognition system according to claim 8 or 9.