CN107240397A - Smart lock based on voiceprint recognition, and speech recognition method and system thereof - Google Patents
- Publication number
- CN107240397A CN201710692968.9A
- Authority
- CN
- China
- Prior art keywords
- voice signal
- registration
- characteristic vector
- mel cepstrum
- cepstrum coefficients
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G07—CHECKING-DEVICES
- G07C—TIME OR ATTENDANCE REGISTERS; REGISTERING OR INDICATING THE WORKING OF MACHINES; GENERATING RANDOM NUMBERS; VOTING OR LOTTERY APPARATUS; ARRANGEMENTS, SYSTEMS OR APPARATUS FOR CHECKING NOT PROVIDED FOR ELSEWHERE
- G07C9/00—Individual registration on entry or exit
- G07C9/00174—Electronically operated locks; Circuits therefor; Nonmechanical keys therefor, e.g. passive or active electrical keys or other data carriers without mechanical keys
- G07C9/00563—Electronically operated locks; Circuits therefor; Nonmechanical keys therefor, e.g. passive or active electrical keys or other data carriers without mechanical keys using personal physical data of the operator, e.g. finger prints, retinal images, voicepatterns
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L15/144—Training of HMMs
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Lock And Its Accessories (AREA)
Abstract
The present invention discloses a speech recognition method for a smart lock based on voiceprint recognition, comprising: extracting the Mel-frequency cepstral coefficients (MFCCs) of a voice signal to be verified; feeding the MFCCs as the input layer into a discriminative deep belief network (DDBN) with a preset parameter space, and taking the hidden-layer output of the DDBN as the feature vector of the MFCCs; comparing the feature vector with the Gaussian mixture model (GMM) built in advance for each registered voice signal, and calculating the posterior probability that the feature vector matches each registered voice signal; and judging whether the maximum of these posterior probabilities exceeds a preset threshold: if so, the voice signal to be verified passes verification and the lock is unlocked; otherwise the lock remains locked. The invention improves the recognition rate for the target speaker, reduces the false-acceptance probability, and ensures the security of the door lock. The invention further discloses a speech recognition system and a smart lock, which have the same advantages.
Description
Technical field
The present invention relates to the technical field of signal processing, and in particular to a speech recognition method for a smart lock based on voiceprint recognition. The invention further relates to a speech recognition system for a smart lock based on voiceprint recognition, and to a smart lock based on voiceprint recognition that includes this speech recognition system.
Background art
As burglaries occur frequently, upgrading household door locks to keep homes secure has become a challenge that people must address.
At present, conventional household door locks on the market are mostly opened with a key or a password. Keys are easily lost, easily copied, and not tied to a specific person; passwords are easily forgotten and easily leaked. Both waste resources and expose users to security risks. Smart devices that use other biometric authentication technologies, such as fingerprint recognition or iris recognition, are costly, and their contact-based operation is inconvenient and unhygienic: fingerprint recognition requires placing a finger on a sensor, and iris recognition requires bringing the eye close to a camera. As a result, they fail to provide a good user experience and impose a financial burden on users.
Smart door locks based on voiceprint recognition already exist in the prior art. A voiceprint lock relies on pattern recognition of sound and serves the same purpose as a fingerprint lock by different means. The lock opens only when the owner speaks a previously set code word; it does not open if someone else speaks the same code word, because the lock distinguishes voices mainly by timbre. However, most voiceprint locks identify and verify voices using only a training method based on the GMM (Gaussian Mixture Model). Experiments show that with a GMM-only speaker recognition model, the textual content of the speech used in training and testing strongly affects the recognition result. When the text spoken by a non-target speaker is close to the text used to recognize the target speaker, the false-acceptance probability rises sharply when that speech is tested against the target speaker.
How to improve a voiceprint lock's recognition rate for the target speaker, reduce the false-acceptance probability, and ensure door lock security is therefore a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
An object of the present invention is to provide a speech recognition method for a smart lock based on voiceprint recognition that improves the voiceprint lock's recognition rate for the target speaker, reduces the false-acceptance probability, and ensures door lock security. Another object of the invention is to provide a speech recognition system for a smart lock based on voiceprint recognition, and a smart lock based on voiceprint recognition that includes this speech recognition system.
To solve the above technical problem, the present invention provides a speech recognition method for a smart lock based on voiceprint recognition, comprising:
extracting the Mel-frequency cepstral coefficients (MFCCs) of the voice signal to be verified;
feeding the MFCCs as the input layer into a discriminative deep belief network (DDBN) with a preset parameter space to obtain the hidden-layer output of the DDBN, and taking that output as the feature vector of the MFCCs;
comparing the feature vector with the Gaussian mixture model (GMM) built in advance for each registered voice signal, and calculating the posterior probability that the feature vector matches each registered voice signal;
judging whether the maximum of the posterior probabilities exceeds a preset threshold; if so, the voice signal to be verified passes verification and the lock is unlocked; otherwise the lock remains locked.
Preferably, before extracting the Mel-frequency cepstral coefficients of the voice signal to be verified, the method further comprises:
recording registration voice from the registrant of the smart lock, and building a Gaussian mixture model for each registered voice signal that is entered.
Preferably, building a Gaussian mixture model (GMM) for each entered registered voice signal specifically comprises:
extracting the Mel-frequency cepstral coefficients (MFCCs) of each registered voice signal, and binding each entered registered voice signal to a preset number;
training the discriminative deep belief network (DDBN) with the MFCCs of each registered voice signal as the input layer and the number bound to each registered voice signal as the output layer, thereby obtaining the parameter space of the DDBN;
feeding the MFCCs of each registered voice signal into the DDBN to obtain its hidden-layer output, and taking that output as the feature vector of the MFCCs of the corresponding registered voice signal;
taking each feature vector as input and building the GMM using the expectation-maximization (EM) algorithm as the criterion.
Preferably, extracting the Mel-frequency cepstral coefficients of the voice signal to be verified or of each registered voice signal specifically comprises applying, in sequence: pre-emphasis and a Hamming window, denoising by Wiener filtering, a fast Fourier transform, filtering by triangular band-pass filters, and a discrete cosine transform.
Preferably, after feeding the Mel-frequency cepstral coefficients of each registered voice signal into the discriminative deep belief network to obtain its hidden-layer output, the method further comprises:
checking the quality of the hidden-layer output of the discriminative deep belief network by the formula:
[equation not reproduced in the source text]
if the value of D exceeds a preset threshold, the quality of the hidden-layer output meets the preset requirement;
where D is the degree of discrimination, L_i is the weight of the feature vector corresponding to each registered voice signal, S_i is an element of the matrix S, S = S_b - S_w, S_b is the between-class scatter matrix, and S_w is the within-class scatter matrix.
Preferably, before extracting the Mel-frequency cepstral coefficients (MFCCs) of the voice signal to be verified and after recording registration voice from the registrant of the smart lock, the method further comprises:
collecting a number of untrained voice signals;
if the current number of untrained voice signals is below a preset threshold, feeding the MFCCs of each untrained voice signal into the discriminative deep belief network to correct its parameter space;
if the current number of untrained voice signals exceeds the preset threshold, feeding the MFCCs of each untrained voice signal into the corrected discriminative deep belief network to obtain the corresponding correction feature vectors, and using the correction feature vectors to correct the Gaussian mixture model.
Preferably, correcting the Gaussian mixture model specifically comprises:
if the untrained voice signals correspond to T correction feature vectors:
[set of T correction feature vectors; notation not reproduced in the source text]
and the likelihood ratios corresponding to the correction feature vectors are:
{K1, K2, K3, ..., KT}
then correcting the mean and variance of the Gaussian mixture model by the formula:
[equation not reproduced in the source text]
The present invention also provides a speech recognition system for a smart lock based on voiceprint recognition, comprising:
an extraction module, for extracting the Mel-frequency cepstral coefficients (MFCCs) of the voice signal to be verified;
a rectification module, for feeding the MFCCs as the input layer into a discriminative deep belief network with a preset parameter space to obtain the hidden-layer output of that network, and taking that output as the feature vector of the MFCCs;
a computing module, for comparing the feature vector with the Gaussian mixture model built in advance for each registered voice signal, and calculating the posterior probability that the feature vector matches each registered voice signal;
a verification module, for judging whether the maximum of the posterior probabilities exceeds a preset threshold; if so, the voice signal to be verified passes verification and the lock is unlocked; otherwise the lock remains locked.
Preferably, the system further comprises:
a registration module, for recording registration voice from the registrant of the smart lock;
a training module, for building a Gaussian mixture model for each entered registered voice signal.
The present invention also provides a smart lock based on voiceprint recognition, comprising a sound collector, a lock, and the speech recognition system according to either of the two preceding items.
The speech recognition method for a smart lock based on voiceprint recognition provided by the present invention mainly comprises four steps. In the first step, after the voice signal to be verified is input, it is first preprocessed to extract its Mel-frequency cepstral coefficients (MFCCs). In the second step, the MFCCs are processed in depth by a discriminative deep belief network (DDBN): they are fed as the input layer into the DDBN, which has a preset parameter space, so the hidden-layer output can be obtained directly and is taken as the feature vector of the MFCCs of the voice signal to be verified. In the third step, the voice signal to be verified, after preprocessing and deep processing, is compared with each registered voice signal. A smart lock typically stores several preset registered voice signals, and a Gaussian mixture model (GMM) has been built in advance for each of them so that each can be analyzed accurately. Specifically, the feature vector obtained in the second step is compared with the GMM of each registered voice signal, and the posterior probability that the feature vector matches each registered voice signal is calculated during the comparison. In the fourth step, to improve recognition speed and quality, the maximum of these posterior probabilities is selected and compared with a preset threshold. If the maximum is greater than or equal to the preset threshold, the confidence of that posterior probability is high, and so is the confidence that the voice signal to be verified matches the corresponding registered voice signal; the voice signal to be verified then passes verification and the lock can be unlocked normally. Otherwise the voice signal to be verified fails verification and the lock remains locked. In summary, the method preprocesses the voice signal to be verified to obtain its MFCCs, feeds the MFCCs into the DDBN for deep processing to obtain the feature vector, compares the feature vector with the GMM of each registered voice signal, and calculates the recognition probability of the voice signal to be verified. Compared with the prior art, the invention uses the DDBN to correct the MFCCs of the voice signal to be verified, which reduces the dependence on speech text that arises when only a GMM is used, improves the voiceprint lock's recognition rate for the target speaker, reduces the false-acceptance probability, and ensures door lock security.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of an embodiment provided by the present invention;
Fig. 2 is a module diagram of an embodiment provided by the present invention;
Fig. 3 is a structural diagram of an embodiment provided by the present invention;
Fig. 4 is a schematic diagram of the internal structure of Fig. 3.
In Figs. 2-4:
extraction module - 1, rectification module - 2, computing module - 3, verification module - 4, registration module - 5, training module - 6, sound collector - 7, button - 8, display screen - 9, voice prompter - 10, memory - 11, lock - 12, controller - 13.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Refer to Fig. 1, which is a flow chart of an embodiment provided by the present invention.
In one embodiment provided by the present invention, the speech recognition method for a smart lock based on voiceprint recognition mainly comprises four steps: extracting the Mel-frequency cepstral coefficients (MFCCs) of the voice signal to be verified; feeding the MFCCs as the input layer into a discriminative deep belief network (DDBN) with a preset parameter space to obtain the hidden-layer output of the DDBN, and taking that output as the feature vector of the MFCCs; comparing the feature vector with the Gaussian mixture model (GMM) built in advance for each registered voice signal, and calculating the posterior probability that the feature vector matches each registered voice signal; and judging whether the maximum of the posterior probabilities exceeds a preset threshold: if so, the voice signal to be verified passes verification and the lock is unlocked; otherwise the lock remains locked.
In the first step, after the voice signal to be verified is input, it is first preprocessed to extract its Mel-frequency cepstral coefficients. Specifically, the preprocessing comprises, in sequence: pre-emphasis and Hamming windowing of the voice signal to be verified, denoising by Wiener filtering, a fast Fourier transform, filtering by triangular band-pass filters, and a discrete cosine transform, after which the Mel-frequency cepstral coefficients (MFCC) are obtained.
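The preprocessing chain described above can be sketched for a single frame as follows. This is a minimal illustration using only the Python standard library, not the patent's implementation: the pre-emphasis factor 0.97, the filter count, the coefficient count, and all function names are assumptions, the logarithm before the DCT follows common MFCC practice rather than the text, a naive DFT stands in for the FFT, and the Wiener denoising stage is omitted for brevity.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    # y[n] = x[n] - alpha * x[n-1]; alpha = 0.97 is a common choice (assumption)
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]

def hamming(n_points):
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1)) for n in range(n_points)]

def power_spectrum(frame):
    # Naive DFT for clarity; a real implementation would use an FFT
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular band-pass filters with centers equally spaced on the Mel scale
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2)
    mels = [low + (high - low) * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge, peak at center
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank

def dct2(values, n_coeffs):
    # Unnormalized DCT-II
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n_coeffs)]

def mfcc_frame(frame, sample_rate, n_filters=8, n_coeffs=5):
    windowed = [s * w for s, w in zip(pre_emphasis(frame), hamming(len(frame)))]
    spectrum = power_spectrum(windowed)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for empty filters
    log_energies = [math.log(sum(f * p for f, p in zip(filt, spectrum)) + 1e-10) for filt in bank]
    return dct2(log_energies, n_coeffs)
```

For example, `mfcc_frame(frame, sample_rate=8000)` on a 64-sample frame returns five coefficients. In practice the signal would be split into overlapping frames and each frame processed this way.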
In the second step, the Mel-frequency cepstral coefficients (MFCCs) of the voice signal to be verified are processed in depth by a discriminative deep belief network (DDBN). The MFCCs are fed as the input layer into the DDBN, which has a preset parameter space, so the hidden-layer output of the DDBN can be obtained directly and is taken as the feature vector of the MFCCs of the voice signal to be verified.
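As a rough illustration of taking a hidden-layer output as the feature vector, the sketch below runs a forward pass through already-trained sigmoid layers. The layer representation, the function names, and the choice of the last hidden layer as the feature are assumptions; the text does not specify the network architecture.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def layer_output(inputs, weights, biases):
    # weights[j][i] links input unit i to hidden unit j
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def extract_feature_vector(mfcc, layers):
    # layers: list of (weights, biases) pairs, one per hidden layer,
    # standing in for the network's preset parameter space (hypothetical format)
    activation = list(mfcc)
    for weights, biases in layers:
        activation = layer_output(activation, weights, biases)
    return activation  # output of the last hidden layer, used as the feature vector
```

The same function would serve both verification and registration, since both stages feed MFCCs through the same trained network.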
In the third step, the voice signal to be verified, after preprocessing and deep processing, is compared with each registered voice signal. A smart lock typically stores several preset registered voice signals, and a Gaussian mixture model (GMM) has been built in advance for each of them so that each can be analyzed accurately. Specifically, the feature vector obtained in the second step is compared with the GMM of each registered voice signal, and the posterior probability that the feature vector matches each registered voice signal is calculated during the comparison.
In the fourth step, once the posterior probability that the voice signal to be verified matches each registered voice signal has been calculated, the maximum of these posterior probabilities is selected and compared with a preset threshold to improve recognition speed and quality. If the maximum is greater than or equal to the preset threshold, the confidence of that posterior probability is high, and so is the confidence that the voice signal to be verified matches the corresponding registered voice signal; the voice signal to be verified then passes verification and the lock can be unlocked normally. Otherwise the voice signal to be verified fails verification and the lock remains locked.
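The fourth-step decision can be sketched as below. The function name and the dictionary layout are illustrative assumptions; the comparison is inclusive because the description states that equality with the threshold also passes.

```python
def verify(posteriors, threshold):
    """Return (unlock, matched_id) for a dict mapping registration number -> posterior."""
    if not posteriors:
        return False, None  # no registered voice signal: stay locked
    best_id = max(posteriors, key=posteriors.get)
    unlock = posteriors[best_id] >= threshold  # maximum posterior against the preset threshold
    return unlock, (best_id if unlock else None)
```

For instance, `verify({"01": 0.2, "02": 0.7}, 0.6)` unlocks for registration number `"02"`, while lowering the best posterior to 0.5 keeps the lock closed.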
In summary, the speech recognition method provided by this embodiment preprocesses the voice signal to be verified to obtain its Mel-frequency cepstral coefficients (MFCCs), feeds the MFCCs into the discriminative deep belief network (DDBN) for deep processing to obtain the feature vector, compares the feature vector with the Gaussian mixture model (GMM) of each registered voice signal, and calculates the recognition probability of the voice signal to be verified. Compared with the prior art, this embodiment uses the DDBN to correct the MFCCs of the voice signal to be verified, which reduces the dependence on speech text that arises when only a GMM is used, improves the voiceprint lock's recognition rate for the target speaker, reduces the false-acceptance probability, and ensures door lock security.
To enable comparison between the voice signal to be verified and the registered voice signals, a registration step must be carried out before the verification steps. Thus, before the Mel-frequency cepstral coefficients of the voice to be verified are extracted, the method further comprises: recording registration voice from the registrant of the smart lock, and building a Gaussian mixture model for each entered registered voice signal.
Specifically, the registrant (i.e. the owner of the smart lock) speaks several segments of speech to the smart lock according to a prompt text, for example two segments of speech for the same text, and the smart lock records and saves them. After recording, the registered voice signals are first processed in the same way as the voice signal to be verified, i.e. the Mel-frequency cepstral coefficients (MFCCs) of each registered voice signal are extracted. The MFCCs of each registered voice signal can then be bound to a preset number to facilitate identification and matching in subsequent steps. Preferably, the number can be entered manually by the registrant; if the entered number already exists, the smart lock prompts the user to enter a new one.
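The number-binding step with its duplicate check might look like the following sketch. The class name, method names, and in-memory dictionary are hypothetical; a real lock would persist the data in its memory component.

```python
class RegistrationStore:
    """Binds the MFCCs of each registered voice signal to a unique number."""

    def __init__(self):
        self._mfcc_by_number = {}

    def bind(self, number, mfcc_frames):
        # Returns False when the number is taken, so the caller can
        # prompt the registrant to enter a new one.
        if number in self._mfcc_by_number:
            return False
        self._mfcc_by_number[number] = list(mfcc_frames)
        return True

    def lookup(self, number):
        return self._mfcc_by_number.get(number)
```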
After the numbers have been bound to the registered voice signals, each registered voice signal can undergo deep processing and correction to improve signal quality and discriminability. Specifically, using the discriminative deep belief network (DDBN, Division Deep Belief Network, DBN), the Mel-frequency cepstral coefficients of each registered voice signal are taken as the input layer and the number bound to each registered voice signal as the output layer, and the network is trained on this basis; once training is complete, the parameter space of the discriminative deep belief network is obtained.
Once the discriminative deep belief network has been trained, the Mel-frequency cepstral coefficients (MFCCs) of each registered voice signal are fed into it to obtain the hidden-layer output, which is taken as the feature vector of the MFCCs of each registered voice signal. Deep processing of each registered voice signal by the network thus corrects its MFCCs and makes each registered voice signal clearer and more distinguishable.
Finally, the feature vectors obtained for each registered voice signal after correction by the discriminative deep belief network are taken as input to build the Gaussian mixture model (GMM). To improve accuracy and analysis quality, the GMM can be built using the expectation-maximization (EM) algorithm as the criterion. After the GMM has been built, the smart lock can also bind it to the corresponding number and store the trained GMM.
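To make the EM-based model building concrete, here is a minimal sketch of fitting a Gaussian mixture with diagonal covariances by expectation-maximization, using only the standard library. The diagonal-covariance restriction, the random initialization, the variance floor, the iteration count, and all names are assumptions; the text only states that the EM algorithm is used as the criterion.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    # Log density of a Gaussian with diagonal covariance
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def log_sum_exp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def fit_gmm(data, n_components, n_iter=50, seed=0, var_floor=1e-3):
    rng = random.Random(seed)
    dim = len(data[0])
    means = [list(p) for p in rng.sample(data, n_components)]  # init from data points
    variances = [[1.0] * dim for _ in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each feature vector
        resp = []
        for x in data:
            logp = [math.log(w) + log_gauss_diag(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            norm = log_sum_exp(logp)
            resp.append([math.exp(lp - norm) for lp in logp])
        # M-step: re-estimate weights, means and variances
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-12  # guard against empty components
            weights[k] = nk / len(data)
            means[k] = [sum(r[k] * x[d] for r, x in zip(resp, data)) / nk
                        for d in range(dim)]
            variances[k] = [max(var_floor,
                                sum(r[k] * (x[d] - means[k][d]) ** 2
                                    for r, x in zip(resp, data)) / nk)
                            for d in range(dim)]
    return weights, means, variances

def gmm_log_likelihood(x, gmm):
    # log P(x | model) for one feature vector
    weights, means, variances = gmm
    return log_sum_exp([math.log(w) + log_gauss_diag(x, m, v)
                        for w, m, v in zip(weights, means, variances)])
```

One model would be fitted per registered voice signal and stored under its bound number; `gmm_log_likelihood` then scores a feature vector against a stored model during verification.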
In addition, in the verification stage, the posterior probability that the voice signal to be verified matches each registered voice signal can be calculated as follows. Let the feature vector of the Mel-frequency cepstral coefficients of the voice signal to be verified be $X$, let the number of registered voice signals be $N$, and let $\lambda_n$ be the Gaussian mixture model of the $n$-th registered voice signal. The posterior probability that the voice signal to be verified belongs to registered voice signal $n$ is:
$$P(\lambda_n \mid X) = \frac{P(X \mid \lambda_n)\,P(\lambda_n)}{P(X)}$$
where $P(\lambda_n)$ is the prior probability that the $n$-th registered voice signal is input, and $P(X)$ is the probability of the feature vector $X$ over all voice signals.
The final recognition result is given by the maximum a posteriori criterion, i.e.:
$$n^* = \arg\max_{1 \le n \le N} P(\lambda_n \mid X)$$
In general, because the prior probability of each voice signal is unknown, the priors can be assumed equal, i.e.:
$$P(\lambda_n) = \frac{1}{N}, \quad n = 1, 2, \ldots, N$$
Moreover, for a given observed feature vector $X$, $P(X)$ is a fixed constant that is the same for all voice signals. The problem of finding the maximum posterior probability therefore reduces to finding the maximum likelihood, i.e.:
$$n^* = \arg\max_{1 \le n \le N} P(X \mid \lambda_n)$$
To make the model more general, the log-likelihood can be used as the decision criterion. Suppose the voice signal to be verified attains the maximum posterior probability for registered voice signal $n^*$, whose Gaussian mixture model is $\lambda_{n^*}$, and let $\lambda_{n'}$ ($n' \ne n^*$) denote the Gaussian mixture model of another registered voice signal. The log-likelihood ratio is then:
$$\Lambda = \log P(X \mid \lambda_{n^*}) - \log P(X \mid \lambda_{n'})$$
Here $\lambda_{n'}$ can be the Gaussian mixture model of any one other registered voice signal, or it can range over the Gaussian mixture models of all other registered voice signals. In the former case the single log-likelihood ratio obtained must exceed the threshold $K$; in the latter case every log-likelihood ratio obtained must exceed $K$ before the voice signal to be verified and the registered voice signal are confirmed to belong to the same person.
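The equal-prior posterior and the log-likelihood ratio test can be sketched numerically as follows, assuming the per-model log-likelihoods of the feature vector have already been computed. The function names and the dictionary layout are illustrative; the optional `other` argument models the variant that compares against a single other model, and omitting it traverses all other models.

```python
import math

def posteriors_equal_prior(log_likelihoods):
    # With equal priors 1/N, the posterior of model n is its likelihood
    # divided by the sum of all likelihoods (computed stably in log space).
    top = max(log_likelihoods.values())
    scaled = {n: math.exp(ll - top) for n, ll in log_likelihoods.items()}
    total = sum(scaled.values())
    return {n: s / total for n, s in scaled.items()}

def accept_by_llr(log_likelihoods, k, other=None):
    # best: the registered model with maximum likelihood (n* in the text)
    best = max(log_likelihoods, key=log_likelihoods.get)
    rivals = [other] if other is not None else [n for n in log_likelihoods if n != best]
    ratios = [log_likelihoods[best] - log_likelihoods[n] for n in rivals]
    # Accept only if every log-likelihood ratio considered exceeds the threshold K
    return best, bool(ratios) and all(r > k for r in ratios)
```

With log-likelihoods `{"a": -10.0, "b": -12.0, "c": -20.0}` the best model is `"a"` and the ratios against `"b"` and `"c"` are 2.0 and 10.0, so a threshold of 1.5 accepts under both variants, while a threshold of 3.0 accepts only when compared against `"c"` alone.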
Further, if the log-likelihood ratio exceeds K*, where K* > K, the voice segment is a high-quality voice signal. Its Mel-frequency cepstral coefficients are then marked as untrained and bound to the registrant's number and the log-likelihood ratio, and the smart lock stores the Mel-frequency cepstral coefficients of the voice signal to be verified.
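The two thresholds K and K* give three outcomes, sketched below. The function name, the record layout, and the list used as storage are hypothetical; only the thresholding logic comes from the description.

```python
def triage(llr, mfcc, registrant_number, k, k_star, store):
    """Apply the thresholds K and K* (with K* > K) to one verification attempt."""
    if k_star <= k:
        raise ValueError("K* must be greater than K")
    if llr > k_star:
        # High-quality sample: keep it, marked as not yet used for training
        store.append({"mfcc": mfcc, "number": registrant_number,
                      "llr": llr, "trained": False})
        return "accept_and_store"
    if llr > k:
        return "accept"
    return "reject"
```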
Furthermore, a Gaussian mixture model usually places high demands on the amount of training data; if the data are insufficient, system performance and accuracy suffer markedly. This embodiment therefore adds a training-data collection step before the Mel-frequency cepstral coefficients of the voice signal to be verified are extracted and after registration voice has been recorded from the registrant of the smart lock.
Specifically, when the smart lock is in neither the registration stage nor the verification stage,
several untrained voice signals of the registrant are collected. When the number of untrained voice
signals is below a preset threshold, for example fewer than 50, the Mel cepstral coefficients of
each untrained voice signal can be input directly into the discriminative deep belief network as
training data to correct its parameter space, which improves the accuracy with which the network
corrects the Mel cepstral coefficients of the voice signal to be verified and of the registered
voice signals. Once enough untrained voice signals have been collected, for example more than 50,
the Mel cepstral coefficients of each untrained voice signal can be input into the corrected
discriminative deep belief network to obtain the corresponding feature vectors, and these feature
vectors are used to correct the previously built Gaussian mixture models.
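The branching on the number of untrained voice signals can be sketched as a small dispatcher. The
function name and the returned labels are hypothetical, and the text does not say what happens when
the count equals the threshold exactly, so this sketch simply takes no action in that case.

```python
def next_training_action(untrained_count, threshold=50):
    """Decide how collected untrained samples are used.

    Returns one of three illustrative labels:
      "fine_tune_dbn": fewer samples than the threshold, correct the network's parameter space
      "update_gmm":    more samples than the threshold, correct the Gaussian mixture models
      "none":          nothing collected, or the unspecified boundary case
    """
    if untrained_count <= 0:
        return "none"
    if untrained_count < threshold:
        return "fine_tune_dbn"
    if untrained_count > threshold:
        return "update_gmm"
    return "none"  # count == threshold is not specified in the text
```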
Specifically, when the Gaussian mixture model is corrected, each untrained voice signal can be taken
to correspond to T correction feature vectors, whose respective likelihood ratios are:
{K1, K2, K3, ..., KT}
The weights are then obtained by the formula:
$$L_i = \frac{K_i}{\sum_{m=1}^{T} K_m}$$
and are used to correct the mean and variance of the Gaussian mixture model that has already been
built, where L_i is the weight corresponding to each feature vector or correction feature vector.
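As an illustration, the weights L_i are the likelihood ratios normalised so that they sum to one (as
in claim 7). The text does not spell out how the weights enter the mean and variance correction; the
sketch below assumes a plain weighted mean and weighted variance over scalar feature values, and
assumes positive likelihood ratios. All names are hypothetical.

```python
def normalised_weights(likelihood_ratios):
    """L_i = K_i / sum(K_m): each likelihood ratio divided by the total."""
    total = sum(likelihood_ratios)
    if total <= 0:
        raise ValueError("likelihood ratios are assumed to have a positive sum")
    return [k / total for k in likelihood_ratios]

def weighted_mean_and_variance(values, weights):
    """Assumed use of the weights: a weighted mean and variance of scalar features."""
    mean = sum(w * v for w, v in zip(weights, values))
    variance = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return mean, variance

# Illustrative numbers only.
ratios = [2.0, 6.0, 12.0]
weights = normalised_weights(ratios)
mean, variance = weighted_mean_and_variance([1.0, 2.0, 3.0], weights)
```

Under this assumption, feature vectors with a higher likelihood ratio pull the corrected mean and
variance more strongly.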
Moreover, whether for the voice signal to be verified or for a registered voice signal, when the
discriminative deep belief network is used to correct the corresponding Mel cepstral coefficients,
the hidden-layer output of the network directly affects the accuracy of the feature vector.
Therefore, the present embodiment also adds a quality check of the hidden-layer output of the
discriminative deep belief network.
Specifically, the hidden-layer output quality of the discriminative deep belief network can be
checked by the formula:
$$D = \sum_{i=1}^{k} L_i \cdot s_i$$
Specifically, the discrimination D can be defined according to the criterion of maximum
between-class distance and minimum within-class distance. Suppose there are K registered voice
signals and take registered voice signal n as an example: it has c feature vectors, and the weight
of each feature vector is L_i, so the average weight of the feature vectors of registered voice
signal n is the mean of L_i over its c feature vectors.
Define the matrix S = Sb - Sw, with s_i an element of S, where Sb is the between-class scatter
matrix and Sw is the within-class scatter matrix. The larger D is, the better the quality of the
feature components extracted by the hidden layer, and vice versa. If the value of D exceeds a preset
threshold, the hidden-layer output quality meets the preset requirement.
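A rough sketch of the discrimination measure is given below. The text does not state which elements
of S are used as s_i; this sketch assumes the diagonal elements (one per feature dimension) and
takes one weight L_i per dimension. The function names and the toy data are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(dim)]

def scatter_difference(classes):
    """Diagonal of S = Sb - Sw for a mapping {label: [feature vectors]}."""
    all_vectors = [v for vecs in classes.values() for v in vecs]
    overall = mean_vector(all_vectors)
    dim = len(overall)
    s_b = [0.0] * dim  # between-class scatter, per dimension
    s_w = [0.0] * dim  # within-class scatter, per dimension
    for vecs in classes.values():
        centre = mean_vector(vecs)
        for j in range(dim):
            s_b[j] += len(vecs) * (centre[j] - overall[j]) ** 2
            s_w[j] += sum((v[j] - centre[j]) ** 2 for v in vecs)
    return [b - w for b, w in zip(s_b, s_w)]

def discrimination(weights, s):
    """D = sum of L_i * s_i."""
    return sum(l * e for l, e in zip(weights, s))

# Toy hidden-layer outputs for two registrants (illustrative values only).
features = {
    "A": [[0.0, 1.0], [0.2, 1.2]],
    "B": [[2.0, 1.0], [2.2, 0.8]],
}
s = scatter_difference(features)
d_value = discrimination([0.5, 0.5], s)
```

In this toy data the first dimension separates the two registrants well and the second does not, so
almost all of D comes from the first dimension.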
Fig. 2 is a module diagram of an embodiment provided by the present invention.
The present embodiment also provides a speech recognition system for a smart lock based on voiceprint
recognition, mainly comprising an extraction module 1, a correction module 2, a computing module 3
and a verification module 4. The extraction module 1 is mainly used to extract the Mel cepstral
coefficients corresponding to the voice signal to be verified. The correction module 2 is mainly
used to input the Mel cepstral coefficients, as the input layer, into the discriminative deep belief
network having a preset parameter space, to obtain the hidden-layer output of the network, and to
take that output as the feature vector of the Mel cepstral coefficients. The computing module 3 is
mainly used to compare the feature vector with the Gaussian mixture model built in advance for each
registered voice signal and to calculate the posterior probability that the feature vector matches
each registered voice signal. The verification module 4 is mainly used to judge whether the maximum
of the posterior probabilities exceeds a preset threshold: if so, the voice signal to be verified
passes verification and the lockset 12 is unlocked; otherwise the lockset 12 is kept locked.
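The chain of four modules can be sketched as a simple pipeline of callables. This is an illustrative
outline only: the function name, the stand-in callables and the numeric values are assumptions, and
real feature extraction, network inference and mixture-model scoring are deliberately left out.

```python
def recognise(signal, extract, correct, compute, threshold):
    """Run the four modules in order and report whether the lock should open.

    extract: signal  -> Mel cepstral coefficients        (extraction module 1)
    correct: mfcc    -> feature vector                   (correction module 2)
    compute: feature -> posterior per registered signal  (computing module 3)
    The threshold comparison plays the role of verification module 4.
    """
    mfcc = extract(signal)
    feature = correct(mfcc)
    posteriors = compute(feature)
    best = max(range(len(posteriors)), key=lambda n: posteriors[n])
    unlocked = posteriors[best] > threshold
    return unlocked, best

# Stand-in callables with made-up outputs, used only to exercise the control flow.
fake_extract = lambda signal: [float(len(signal))]
fake_correct = lambda mfcc: mfcc
fake_compute = lambda feature: [0.10, 0.85, 0.05]

result_open = recognise("hello", fake_extract, fake_correct, fake_compute, threshold=0.8)
result_closed = recognise("hello", fake_extract, fake_correct, fake_compute, threshold=0.9)
```

Passing the modules in as callables keeps the decision logic independent of how each stage is
actually implemented.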
The speech recognition method used by this speech recognition system is the same as that described
above and is not repeated here.
In addition, the present embodiment also adds a registration module 5 and a training module 6. The
registration module 5 is mainly used to carry out registration voice entry for the registrant of the
smart lock; the training module 6 is signal-connected to the registration module 5 and is mainly
used to build a Gaussian mixture model for each input registered voice signal.
Fig. 3 is a structural diagram of an embodiment provided by the present invention, and Fig. 4 is a
schematic diagram of the internal structure of Fig. 3.
The present embodiment also provides a smart lock based on voiceprint recognition, mainly comprising a
sound collector 7, a lockset 12, keys 8, a display screen 9, a voice prompter 10, a memory 11, a
controller 13 and the speech recognition system. The speech recognition system is the same as
described above and is not repeated here. The sound collector 7 is mainly used to collect voice
signals. The lockset 12 can be an electromagnetic lock. The keys 8 are mainly used for the user to
input numeric identifiers and the like. The display screen 9 is mainly used to give the user
feedback, such as the speech text or a prompt to re-enter the number. The voice prompter 10 is
mainly used to give the user audible feedback, such as reading out the speech text. The memory 11 is
mainly used to store the numbers corresponding to the Mel cepstral coefficients of the registered
voice signals or of the voice signal to be verified, and the like. The controller 13 is mainly used
to control the lockset 12, according to the recognition result of the recognition system, to perform
an unlocking operation or remain locked.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement
or use the present invention. Various modifications to these embodiments will be apparent to those
skilled in the art, and the general principles defined herein may be implemented in other
embodiments without departing from the spirit or scope of the present invention. Therefore, the
present invention is not limited to the embodiments shown herein, but is to be accorded the widest
scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A speech recognition method for a smart lock based on voiceprint recognition, characterized by
comprising:
extracting the Mel cepstral coefficients corresponding to a voice signal to be verified;
inputting the Mel cepstral coefficients, as the input layer, into a discriminative deep belief
network having a preset parameter space, to obtain the hidden-layer output of the discriminative
deep belief network, and taking the hidden-layer output as the feature vector of the Mel cepstral
coefficients;
comparing the feature vector with the Gaussian mixture model built in advance for each registered
voice signal, and calculating the posterior probability that the feature vector matches each
registered voice signal;
judging whether the maximum of the posterior probabilities exceeds a preset threshold; if so, the
voice signal to be verified passes verification and the lockset is unlocked; otherwise the lockset
is kept locked.
2. The speech recognition method according to claim 1, characterized in that, before extracting the
Mel cepstral coefficients corresponding to the voice signal to be verified, the method further
comprises:
carrying out registration voice entry for the registrant of the smart lock, and building a Gaussian
mixture model for each input registered voice signal.
3. The speech recognition method according to claim 2, characterized in that building a Gaussian
mixture model for each input registered voice signal specifically comprises:
extracting the Mel cepstral coefficients corresponding to each registered voice signal, and binding a
preset number to each input registered voice signal;
taking the Mel cepstral coefficients corresponding to each registered voice signal as the input
layer and the number bound to each registered voice signal as the output layer, training the
discriminative deep belief network, and obtaining the parameter space of the discriminative deep
belief network;
inputting the Mel cepstral coefficients corresponding to each registered voice signal into the
discriminative deep belief network to obtain the hidden-layer output of the discriminative deep
belief network, and taking the hidden-layer output as the feature vector of the Mel cepstral
coefficients corresponding to each registered voice signal;
taking each feature vector as input, and building the Gaussian mixture model by the
expectation-maximization (EM) algorithm.
4. The speech recognition method according to claim 3, characterized in that extracting the Mel
cepstral coefficients corresponding to the voice signal to be verified or to each registered voice
signal specifically comprises: sequentially performing, on the voice signal to be verified or on each
registered voice signal, pre-emphasis, Hamming windowing, denoising by Wiener filtering, fast Fourier
transform, filtering by triangular band-pass filters, and discrete cosine transform.
5. The speech recognition method according to claim 4, characterized in that, after the Mel cepstral
coefficients corresponding to each registered voice signal are input into the discriminative deep
belief network to obtain the hidden-layer output of the discriminative deep belief network, the
method further comprises:
checking the hidden-layer output quality of the discriminative deep belief network by the formula:
$$D = \sum_{i=1}^{k} L_i \cdot s_i$$
wherein, if the value of D exceeds a preset threshold, the hidden-layer output quality meets the
preset requirement;
and wherein D is the discrimination, L_i is the weight of the feature vector corresponding to each
registered voice signal, s_i is an element of the matrix S, S = Sb - Sw, Sb is the between-class
scatter matrix, and Sw is the within-class scatter matrix.
6. The speech recognition method according to claim 5, characterized in that, before extracting the
Mel cepstral coefficients corresponding to the voice signal to be verified and after carrying out
registration voice entry for the registrant of the smart lock, the method further comprises:
collecting several untrained voice signals;
if the number of current untrained voice signals is below a preset threshold, inputting the Mel
cepstral coefficients corresponding to each untrained voice signal into the discriminative deep
belief network to correct its parameter space;
if the number of current untrained voice signals exceeds the preset threshold, inputting the Mel
cepstral coefficients corresponding to each untrained voice signal into the corrected discriminative
deep belief network to obtain the corresponding correction feature vectors, and correcting the
Gaussian mixture model with the correction feature vectors.
7. The speech recognition method according to claim 6, characterized in that correcting the Gaussian
mixture model specifically comprises:
if each untrained voice signal corresponds to T correction feature vectors, and the likelihood
ratios corresponding to the correction feature vectors are respectively:
{K1, K2, K3, ..., KT}
then correcting the mean and variance of the Gaussian mixture model by the formula:
$$L_i = \frac{K_i}{\sum_{m=1}^{T} K_m}$$
8. A speech recognition system for a smart lock based on voiceprint recognition, characterized by
comprising:
an extraction module, for extracting the Mel cepstral coefficients corresponding to a voice signal
to be verified;
a correction module, for inputting the Mel cepstral coefficients, as the input layer, into a
discriminative deep belief network having a preset parameter space, to obtain the hidden-layer
output of the discriminative deep belief network, and taking the hidden-layer output as the feature
vector of the Mel cepstral coefficients;
a computing module, for comparing the feature vector with the Gaussian mixture model built in
advance for each registered voice signal, and calculating the posterior probability that the feature
vector matches each registered voice signal;
a verification module, for judging whether the maximum of the posterior probabilities exceeds a
preset threshold; if so, the voice signal to be verified passes verification and the lockset is
unlocked; otherwise the lockset is kept locked.
9. The speech recognition system according to claim 8, characterized by further comprising:
a registration module, for carrying out registration voice entry for the registrant of the smart
lock;
a training module, for building a Gaussian mixture model for each input registered voice signal.
10. A smart lock based on voiceprint recognition, characterized by comprising a sound collector, a
lockset and the speech recognition system according to claim 8 or 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710692968.9A CN107240397A (en) | 2017-08-14 | 2017-08-14 | A kind of smart lock and its audio recognition method and system based on Application on Voiceprint Recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107240397A true CN107240397A (en) | 2017-10-10 |
Family
ID=59992018
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710692968.9A Pending CN107240397A (en) | 2017-08-14 | 2017-08-14 | A kind of smart lock and its audio recognition method and system based on Application on Voiceprint Recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107240397A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20171010 |