CN106098068A - Voiceprint recognition method and device - Google Patents
Voiceprint recognition method and device
- Publication number
- CN106098068A (CN201610416650.3A)
- Authority
- CN
- China
- Prior art keywords
- voice information
- character
- verification
- feature
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/08—Use of distortion metrics or a particular distance between probe pattern and reference templates
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Game Theory and Decision Science (AREA)
- Business, Economics & Management (AREA)
- Telephonic Communication Services (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Telephone Function (AREA)
Abstract
An embodiment of the invention discloses a voiceprint recognition method and device. The method comprises: acquiring verification voice information produced when a verification user reads aloud a first character string; performing speech recognition on the verification voice information to obtain the speech segments contained in it that respectively correspond to the characters of the first character string; extracting the voiceprint feature of the speech segment corresponding to each character; training, according to the voiceprint feature of the speech segment corresponding to each character and in combination with the preset universal background model corresponding to that character, the feature vector corresponding to each character in the verification voice information; and calculating a similarity score between the feature vector of each character in the verification voice information and the feature vector of the corresponding character in preset registration voice information, and, if the similarity score reaches a preset verification threshold, determining the verification user to be the registered user corresponding to the registration voice information. The invention can effectively improve the accuracy of voiceprint recognition.
Description
Technical field
The present invention relates to the field of speech recognition technology, and in particular to a voiceprint recognition method and device.
Background
Voiceprint recognition is a biometric identification method comprising two stages: user registration and user identification. In the registration stage, speech is mapped to a user model through a series of processing steps. In the identification stage, a piece of speech of unknown identity is matched against the model for similarity, and a decision is made as to whether the identity of the unknown speech is the same as that of the registered speech. Existing voiceprint modeling methods typically model the speaker in a text-independent manner to describe speaker identity. With text-independent modeling, however, recognition accuracy is low when users read aloud different content, and it is difficult to meet requirements.
Summary of the invention
In view of this, embodiments of the present invention provide a voiceprint recognition method and device that can effectively improve the accuracy of voiceprint recognition.
To solve the above technical problem, an embodiment of the present invention provides a voiceprint recognition method, the method comprising:
Acquiring verification voice information produced when a verification user reads aloud a first character string;
Performing speech recognition on the verification voice information to obtain the speech segments contained in it that respectively correspond to the characters of the first character string;
Extracting the voiceprint feature of the speech segment corresponding to each character;
Training, according to the voiceprint feature of the speech segment corresponding to each character and in combination with the preset universal background model corresponding to that character, the feature vector corresponding to each character in the verification voice information;
Calculating a similarity score between the feature vector of each character in the verification voice information and the feature vector of the corresponding character in preset registration voice information, and, if the similarity score reaches a preset verification threshold, determining the verification user to be the registered user corresponding to the registration voice information.
Correspondingly, an embodiment of the present invention further provides a voiceprint recognition device, the device comprising:
A voice acquisition module, configured to acquire verification voice information produced when a verification user reads aloud a first character string;
A speech segment recognition module, configured to perform speech recognition on the verification voice information to obtain the speech segments contained in it that respectively correspond to the characters of the first character string;
A voiceprint feature extraction module, configured to extract the voiceprint feature of the speech segment corresponding to each character in the verification voice information;
A feature model training module, configured to train, according to the voiceprint feature of the speech segment corresponding to each character and in combination with the preset universal background model corresponding to that character, the feature vector corresponding to each character in the verification voice information;
A similarity judgment module, configured to calculate a similarity score between the feature vector of each character in the verification voice information and the feature vector of the corresponding character in preset registration voice information;
A user identification module, configured to determine the verification user to be the registered user corresponding to the registration voice information if the similarity score reaches a preset verification threshold.
In this embodiment, the voiceprint feature of the speech segment corresponding to each character in the verification user's verification voice information is obtained, the feature vector corresponding to each character in the verification voice information is trained in combination with the preset UBM of that character, and the feature vector of each character in the verification voice information is compared for similarity with the feature vector of the corresponding character in the registration voice information, thereby determining the identity of the verification user. Because the user feature vectors being compared correspond to specific characters, this approach takes full account of the voiceprint features a user exhibits when reading different characters, and can therefore effectively improve the accuracy of voiceprint recognition.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic overview of the stages of the voiceprint recognition method in an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a voiceprint recognition method in an embodiment of the present invention;
Fig. 3 is a schematic diagram of the principle of recognizing the speech segments corresponding to multiple characters from voice information in an embodiment of the present invention;
Fig. 4 is a schematic diagram of the principle of obtaining the feature vector corresponding to each character from voice information in an embodiment of the present invention;
Fig. 5 is a schematic flow chart of voiceprint registration for a registered user in an embodiment of the present invention;
Fig. 6 is a schematic flow chart of the voiceprint recognition method in another embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a voiceprint recognition device in an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of the speech segment recognition module in an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Embodiments of the present invention provide a voiceprint recognition method and device. The method and device can be applied in any scenario or equipment that needs to identify a user of unknown identity. The characters in the character string used for voiceprint recognition can be Arabic numerals, English letters, characters of other languages, and so on. To simplify the description, the embodiments of the present invention use Arabic numerals as the example characters.
The voiceprint recognition method in the embodiments of the present invention can be divided into two stages, as shown in Fig. 1:
1) The voiceprint registration stage of a registered user
In the voiceprint registration stage, the registered user reads aloud a registration character string (the second character string mentioned below), and the voiceprint recognition device captures the registration voice information produced while the user reads it. The device then performs speech recognition on the registration voice information to obtain the speech segments contained in it that respectively correspond to the characters of the registration character string, and performs voiceprint feature extraction and voiceprint model training on the speech segment corresponding to each character. This includes training, according to the voiceprint feature of each character's speech segment and in combination with the preset universal background model of that character (Universal Background Model, UBM, i.e. GMM-UBM), the feature vector corresponding to each character in the registration voice information. The voiceprint recognition device can then store, in its model library, the feature vectors corresponding to the characters in the registration voice information read aloud by each registered user during the voiceprint registration stage.
For example, if the registration character string is the digit string 0185851, which contains four distinct digits "0", "1", "5" and "8", the voiceprint recognition device performs voiceprint feature extraction and voiceprint model training on the speech segment corresponding to each character in the registration voice information, obtains the voiceprint features of the speech segments corresponding to "0", "1", "5" and "8", and then, in combination with the preset UBM of each character, trains the feature vector corresponding to each character in the registration voice information: the feature vector corresponding to digit "0", the feature vector corresponding to digit "1", the feature vector corresponding to digit "5" and the feature vector corresponding to digit "8".
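As a rough illustration of the registration stage described above, the model library can be viewed as a mapping from each registered user to one feature vector per distinct character. The sketch below is an assumption about data layout only: `model_library`, `enroll` and the `train_vector` callback are hypothetical names, and the placeholder trainer merely stands in for the UBM-based training step.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]

# Hypothetical model library: user id -> character -> feature vector.
model_library: Dict[str, Dict[str, Vector]] = {}


def enroll(user_id: str,
           segments: Sequence[Tuple[str, Sequence[float]]],
           train_vector: Callable[[str, List[Sequence[float]]], Vector]) -> None:
    """Store one feature vector per distinct character of the registration string.

    `segments` holds one (character, voiceprint_features) pair per spoken
    character; `train_vector` stands in for the UBM-based training step.
    """
    by_char: Dict[str, List[Sequence[float]]] = {}
    for char, features in segments:
        by_char.setdefault(char, []).append(features)
    model_library[user_id] = {
        char: train_vector(char, feats) for char, feats in by_char.items()
    }


# Registration string "0185851" with dummy one-dimensional features per segment.
dummy_segments = [(c, [float(i)]) for i, c in enumerate("0185851")]
# Placeholder trainer: averages the first feature value over a character's segments.
enroll("user_A", dummy_segments,
       lambda char, feats: [sum(f[0] for f in feats) / len(feats)])
print(sorted(model_library["user_A"]))  # the four distinct characters
```

Keying the library by character is what later allows the verification stage to compare only vectors that belong to the same character.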
2) The identification stage of a verification user
In the identification stage, the verification user, i.e. a user of unknown identity, reads aloud a verification character string (the first character string mentioned below; the second character string has at least one character in common with the first character string), and the voiceprint recognition device captures the verification voice information produced while the user reads it. The device then performs speech recognition on the verification voice information to obtain the speech segments contained in it that respectively correspond to the characters of the verification character string, and performs voiceprint feature extraction and voiceprint model training on the speech segment corresponding to each character. This includes training, according to the voiceprint feature of each character's speech segment and in combination with the preset UBM of that character, the feature vector corresponding to each character in the verification voice information. Finally, the device calculates a similarity score between the feature vector of each character in the verification voice information and the feature vector of the corresponding character in the preset registration voice information; if the similarity score reaches a preset verification threshold, the verification user is determined to be the registered user corresponding to the registration voice information.
For example, if the verification character string is the digit string 85851510, the voiceprint recognition device performs voiceprint feature extraction and voiceprint model training on the speech segment corresponding to each character in the verification voice information produced when the verification user reads it aloud, obtains the GMMs corresponding to "0", "1", "5" and "8", and then, in combination with the preset UBM of each character, computes the feature vectors of the verification user's verification voice information: the feature vector corresponding to digit "0", the feature vector corresponding to digit "1", the feature vector corresponding to digit "5" and the feature vector corresponding to digit "8". It then calculates the similarity scores between the feature vectors corresponding to "0", "1", "5" and "8" in the verification voice information and the feature vectors corresponding to "0", "1", "5" and "8" in the registration voice information, respectively. If the similarity score reaches the preset verification threshold, the verification user is determined to be the registered user corresponding to the registration voice information.
It should be noted that the voiceprint registration stage of the registered user and the identification stage of the verification user can be carried out on the same equipment or device, or on different equipment or devices. For example, the voiceprint registration stage can be carried out on a first device, which then sends the feature vectors corresponding to the characters in the registration voice information to a second device, so that the identification stage of the verification user can be carried out on the second device.
The two processes are described in detail below through specific embodiments.
Fig. 2 is a schematic flow chart of a voiceprint recognition method in an embodiment of the present invention. As shown in the figure, the voiceprint recognition flow in this embodiment may include:
S201: Acquire verification voice information produced when a verification user reads aloud a first character string.
The verification user is a user of unknown identity whose identity needs to be verified by the voiceprint recognition device. The first character string is the character string used to authenticate the verification user. It can be randomly generated, or it can be a preset fixed character string, for example a character string that is at least partly identical to the second character string corresponding to previously generated registration voice information. Specifically, the character string can contain m characters, of which n are mutually different, where m and n are positive integers and m ≥ n.
For example, the first character string "12358948" has 8 characters in total and contains 7 mutually different characters: "1", "2", "3", "4", "5", "8" and "9".
In an optional embodiment, the voiceprint recognition device can generate and display the first character string so that the verification user reads it aloud from the display.
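The counts m and n for the example string in S201 can be checked directly. This is a minimal sketch; the function name `count_characters` is hypothetical.

```python
from typing import Tuple


def count_characters(s: str) -> Tuple[int, int]:
    """Return (m, n): the total number of characters and the number of distinct ones."""
    return len(s), len(set(s))


m, n = count_characters("12358948")
print(m, n)  # 8 7
```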
S202: Perform speech recognition on the verification voice information to obtain the speech segments contained in it that respectively correspond to the characters of the first character string.
As shown in Fig. 3, the voiceprint recognition device can use speech recognition and sound intensity filtering to divide the verification voice information into the speech segments corresponding to the individual characters. Optionally, it can also discard invalid speech segments so that they do not take part in subsequent processing.
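The text does not say how sound intensity filtering is performed. One simple possibility, shown here purely as an assumption, is to discard segments whose root-mean-square amplitude falls below a threshold. The names `rms` and `drop_invalid_segments` and the threshold value 0.01 are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Segment = Tuple[str, Sequence[float]]  # (recognized character, audio samples)


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of a segment (0.0 for an empty segment)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def drop_invalid_segments(segments: List[Segment],
                          min_rms: float = 0.01) -> List[Segment]:
    """Keep only segments whose RMS amplitude reaches `min_rms` (assumed threshold)."""
    return [(char, samples) for char, samples in segments if rms(samples) >= min_rms]


segments = [
    ("1", [0.2, -0.3, 0.25]),       # clearly voiced
    ("2", [0.001, -0.002, 0.001]),  # near silence
    ("3", []),                      # empty segment
]
kept = drop_invalid_segments(segments)
print([char for char, _ in kept])  # ['1']
```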
S203: Extract the voiceprint feature of the speech segment corresponding to each character.
Specifically, the voiceprint recognition device can extract the MFCC (Mel Frequency Cepstrum Coefficient) or PLP (Perceptual Linear Predictive) coefficients of the speech segment corresponding to each character as the voiceprint feature of that segment.
S204: According to the voiceprint feature of the speech segment corresponding to each character, and in combination with the preset universal background model corresponding to that character, train the feature vector corresponding to each character in the verification voice information.
The universal background model (UBM) in the embodiments of the present invention is a Gaussian mixture model trained jointly on speech segments of one specific digit spoken by a large number of speakers. It characterizes the distribution of that digit's speech in feature space. Because the training data comes from a large number of speakers, the model does not characterize any particular speaker; it is identity-independent and can therefore be regarded as a universal background model. For example, the UBM can be trained on speech samples from more than 1000 speakers with a duration of more than 20 hours, in which the characters occur with relatively balanced frequency. The mathematical expression of the UBM is:
$$P(x) = \sum_{i=1}^{C} a_i \, N(x \mid \mu_i, \Sigma_i) \tag{1}$$
where P(x) is the probability distribution of the UBM, C is the number of Gaussian components in the UBM (the sum runs over all C components), a_i is the weight of the i-th Gaussian component, μ_i is the mean of the i-th Gaussian component, Σ_i is the variance of the i-th Gaussian component, N(·) denotes the Gaussian distribution, and x is a sample, i.e. the input voiceprint feature.
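Formula (1) can be evaluated numerically as follows. This is a minimal sketch that assumes diagonal covariances, so each Σ_i is represented by a list of per-dimension variances; the function names are hypothetical and the toy parameters are not taken from the patent.

```python
import math
from typing import Sequence


def diag_gaussian_pdf(x: Sequence[float],
                      mean: Sequence[float],
                      var: Sequence[float]) -> float:
    """Density N(x | mean, diag(var)) of a Gaussian with diagonal covariance."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def gmm_density(x: Sequence[float],
                weights: Sequence[float],
                means: Sequence[Sequence[float]],
                variances: Sequence[Sequence[float]]) -> float:
    """P(x) = sum_i a_i * N(x | mu_i, Sigma_i), as in formula (1)."""
    return sum(a * diag_gaussian_pdf(x, mu, var)
               for a, mu, var in zip(weights, means, variances))


# Toy two-component, one-dimensional mixture.
p = gmm_density([0.0], weights=[0.5, 0.5],
                means=[[0.0], [2.0]], variances=[[1.0], [1.0]])
print(p)
```

Real systems usually work with log-likelihoods to avoid underflow on high-dimensional features; the direct form is kept here only because it mirrors the formula.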
The voiceprint recognition device can take the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the parameters of the preset universal background model of the corresponding character. That is, after the voiceprint feature of each character's speech segment is substituted into formula (1) as the input sample, the parameters of the preset universal background model of that character are adjusted iteratively so that the posterior probability P(x) is maximized. The feature vector corresponding to that character in the verification voice information can then be determined from the parameters that maximize P(x).
Because a large body of experiments and papers has shown that the means of the Gaussian components in a UBM can be used to distinguish speaker identity, the mean supervector of the UBM is defined as the concatenation of the mean vectors μ_1, ..., μ_C of its C Gaussian components.
Thus, the voiceprint recognition device can take the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the mean supervector of the preset universal background model of the corresponding character. That is, after the voiceprint feature of each character's speech segment is substituted into formula (1) as the input sample, the mean supervector is adjusted iteratively so that the posterior probability P(x) is maximized, and the mean supervector that maximizes P(x) is taken as the feature vector corresponding to that character in the verification voice information.
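The text does not spell out the MAP update. One widely used form is relevance-factor MAP adaptation of the component means, in which each mean is moved toward the posterior-weighted average of the input frames. The sketch below swaps in that classical update as an illustration and assumes diagonal covariances; the relevance factor value and all function names are assumptions, not taken from the patent.

```python
import math
from typing import List, Sequence


def diag_gaussian_pdf(x: Sequence[float], mean: Sequence[float],
                      var: Sequence[float]) -> float:
    """Density of a Gaussian with diagonal covariance."""
    log_p = sum(-0.5 * (math.log(2.0 * math.pi * v) + (a - m) ** 2 / v)
                for a, m, v in zip(x, mean, var))
    return math.exp(log_p)


def map_adapt_means(frames: Sequence[Sequence[float]],
                    weights: Sequence[float],
                    means: Sequence[Sequence[float]],
                    variances: Sequence[Sequence[float]],
                    relevance: float = 16.0) -> List[List[float]]:
    """Relevance-factor MAP adaptation of the UBM means (weights/variances fixed)."""
    n_comp, dim = len(weights), len(means[0])
    occupancy = [0.0] * n_comp                        # soft frame count per component
    first_order = [[0.0] * dim for _ in range(n_comp)]
    for x in frames:
        lik = [a * diag_gaussian_pdf(x, mu, var)
               for a, mu, var in zip(weights, means, variances)]
        total = sum(lik)
        if total == 0.0:
            continue                                  # frame too far from every component
        for i in range(n_comp):
            gamma = lik[i] / total                    # posterior of component i
            occupancy[i] += gamma
            for d in range(dim):
                first_order[i][d] += gamma * x[d]
    adapted = []
    for i in range(n_comp):
        alpha = occupancy[i] / (occupancy[i] + relevance)
        if occupancy[i] > 0.0:
            data_mean = [s / occupancy[i] for s in first_order[i]]
        else:
            data_mean = list(means[i])
        adapted.append([alpha * e + (1.0 - alpha) * m
                        for e, m in zip(data_mean, means[i])])
    return adapted


def supervector(means: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate the component means into one mean supervector."""
    return [value for mu in means for value in mu]


# 16 identical frames near the first component; relevance 16 gives alpha = 0.5 there.
adapted = map_adapt_means([[1.0]] * 16, [0.5, 0.5],
                          [[0.0], [10.0]], [[1.0], [1.0]])
print(supervector(adapted))
```

In this toy run the first mean moves halfway toward the data while the second, which receives almost no posterior mass, stays essentially at its UBM value.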
In another optional embodiment, to mitigate the slow convergence caused by the high dimensionality of the supervector, probabilistic principal component analysis (PPCA) is used to restrict the variation of the mean supervector to a subspace. The voiceprint recognition device can take the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, use the maximum a posteriori algorithm to adjust the mean supervector of the preset universal background model of the corresponding character, and combine this with a preset supervector subspace matrix to obtain the feature vector corresponding to each character in the verification voice information. In a specific implementation, the following formula can be used to adjust the mean supervector of the preset universal background model of the corresponding character so that the posterior probability of the adjusted universal background model is maximized:
M = m + Tω, where M is the mean supervector of the universal background model of a given character after adjustment, m is the mean supervector of the universal background model of that character before adjustment, T is the preset supervector subspace matrix, and ω is the feature vector corresponding to that character in the verification voice information. That is, after the voiceprint feature of each character's speech segment in the verification voice information is substituted into formula (1) as the input sample, adjusting ω iteratively adjusts the mean supervector in formula (1) so that the posterior probability P(x) is maximized, and the ω that maximizes P(x) is taken as the feature vector corresponding to that character in the verification voice information. The supervector subspace matrix T is determined from the correlations between the dimensions of the mean supervector of the Gaussian mixture model.
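The relation M = m + Tω is a plain matrix-vector product plus an offset. The sketch below only evaluates it for given values; it does not estimate T or ω, and the function name and toy numbers are hypothetical.

```python
from typing import List, Sequence


def adjusted_supervector(m: Sequence[float],
                         T: Sequence[Sequence[float]],
                         w: Sequence[float]) -> List[float]:
    """Return M = m + T @ w, with T given as a list of rows."""
    if len(T) != len(m) or any(len(row) != len(w) for row in T):
        raise ValueError("T must have len(m) rows and len(w) columns")
    return [mi + sum(t * wj for t, wj in zip(row, w))
            for mi, row in zip(m, T)]


# A 3-dimensional supervector constrained to a 2-dimensional subspace.
m = [1.0, 2.0, 3.0]
T = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
w = [0.5, -1.0]
print(adjusted_supervector(m, T, w))  # [1.5, 1.0, 2.5]
```

Because ω has far fewer dimensions than M, searching over ω rather than over the full supervector is what reduces the dimensionality of the adaptation problem.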
S205: Calculate a similarity score between the feature vector of each character in the verification voice information and the feature vector of the corresponding character in preset registration voice information; if the similarity score reaches a preset verification threshold, determine the verification user to be the registered user corresponding to the registration voice information.
Specifically, the voiceprint recognition device can acquire the registration voice information of a registered user in the voiceprint registration stage and, through voiceprint feature extraction and voiceprint model training similar to those in this embodiment, obtain the feature vector corresponding to the speech segment of each character in the registration voice information. The registration voice information can be the voice information produced when a registered user reads aloud a second character string and captured by the voiceprint recognition device. The second character string has at least one character in common with the first character string, i.e. the second character string corresponding to the registration voice information is at least partly identical to the first character string. In an optional embodiment, the voiceprint recognition device can also obtain the feature vectors corresponding to the characters in the registration voice information from an external source. That is, after a registered user records registration voice information through another device, that device or a server obtains the feature vector corresponding to the speech segment of each character through voiceprint feature extraction and voiceprint model training, and the voiceprint recognition device retrieves these feature vectors from that device or server for comparison with the feature vector of each character in the verification voice information during the identification stage.
In a specific implementation, the similarity score is the value with which the voiceprint recognition device, after comparing the feature vector of each character in the verification voice information with the feature vector of the corresponding character in the preset registration voice information, measures how similar the two feature vectors of the same character are. In an optional embodiment, the cosine distance between the feature vector of each character in the verification voice information and the feature vector of the corresponding character in the preset registration voice information can be used as the similarity score. That is, the similarity score between the feature vector of a character in the verification voice information and its feature vector in the registration voice information is calculated by the following formula:
$$\mathrm{score}_i = \frac{\langle \omega_i(\mathrm{tar}),\, \omega_i(\mathrm{test}) \rangle}{\lVert \omega_i(\mathrm{tar}) \rVert \, \lVert \omega_i(\mathrm{test}) \rVert}$$
where the subscript i denotes the i-th character shared by the verification voice information and the registration voice information, ω_i(tar) is the feature vector corresponding to that character in the verification voice information, and ω_i(test) is the feature vector corresponding to that character in the registration voice information. If the verification voice information and the registration voice information share several characters, the similarity scores calculated for the individual characters with the above formula can be averaged; if the mean similarity score reaches the corresponding preset verification threshold, the verification user is determined to be the registered user corresponding to the registration voice information. If there are several registered users, such as registered users A, B and C shown in Fig. 1, the feature vector of a given character of the verification user can be compared with the feature vector of that character for each registered user. When the feature vector of a registered user's character has the highest similarity score with the feature vector of that character in the verification speech, and the score reaches the preset verification threshold, that registered user is taken as the identification result for the verification user.
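The scoring and decision logic described above can be sketched as follows: a cosine score per shared character, the mean over shared characters, and selection of the best-scoring registered user subject to the threshold. All names, the toy vectors and the threshold value 0.8 are assumptions for illustration.

```python
import math
from typing import Dict, List, Optional

Vector = List[float]


def cosine_score(u: Vector, v: Vector) -> float:
    """Cosine of the angle between u and v (0.0 if either vector is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_score(verify: Dict[str, Vector],
               enrolled: Dict[str, Vector]) -> Optional[float]:
    """Average cosine score over the characters both utterances share."""
    shared = verify.keys() & enrolled.keys()
    if not shared:
        return None
    return sum(cosine_score(verify[c], enrolled[c]) for c in shared) / len(shared)


def identify(verify: Dict[str, Vector],
             library: Dict[str, Dict[str, Vector]],
             threshold: float) -> Optional[str]:
    """Return the best-scoring registered user if the score reaches the threshold."""
    best_user, best = None, None
    for user, enrolled in library.items():
        score = mean_score(verify, enrolled)
        if score is not None and (best is None or score > best):
            best_user, best = user, score
    return best_user if best is not None and best >= threshold else None


library = {
    "A": {"0": [1.0, 0.0], "1": [0.0, 1.0]},
    "B": {"0": [0.0, 1.0], "1": [1.0, 0.0]},
}
print(identify({"0": [2.0, 0.0], "1": [0.0, 3.0]}, library, threshold=0.8))  # A
print(identify({"0": [1.0, 1.0]}, library, threshold=0.8))                   # None
```

In the second call both users score about 0.71, below the assumed threshold, so no registered user is returned.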
In an optional embodiment, if the same character occurs more than once in the verification voice information (for example, in the verification voice information shown in Fig. 2, the characters 1, 0, 5 and 8 each occur twice), the two speech segments corresponding to character 0 can each be processed into a feature vector, each of these feature vectors can be scored against the feature vector of character 0 in the preset registration voice information, and the mean of the two similarity scores is used as the similarity score between the feature vector of character 0 in the verification voice information and the feature vector of character 0 in the preset registration voice information; the other repeated characters are handled in the same way.
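The handling of repeated characters can be illustrated with a short sketch. It assumes that a similarity score has already been computed for every spoken occurrence of a character; the function name average_repeated and the example scores are hypothetical.

```python
from collections import defaultdict


def average_repeated(occurrence_scores):
    """Average the scores of all occurrences of the same character.

    occurrence_scores is a list of (character, score) pairs, one pair per
    spoken occurrence in the verification voice information.
    """
    grouped = defaultdict(list)
    for character, score in occurrence_scores:
        grouped[character].append(score)
    return {ch: sum(vals) / len(vals) for ch, vals in grouped.items()}


# Character "0" is spoken twice, character "5" once (assumed scores).
per_char = average_repeated([("0", 0.70), ("0", 0.80), ("5", 0.60)])
```

Each character then contributes exactly one score to the later averaging over characters, regardless of how often it was spoken.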
It should be pointed out that there are many other ways of measuring the similarity between two feature vectors; the above is only one embodiment provided by the present invention. On the basis of the scheme disclosed by the present invention, those skilled in the art can obtain, without creative effort, further ways of calculating the similarity score between the feature vectors of the characters common to the verification voice information and the registration voice information, and the present invention does not list them exhaustively.
Thus, in this embodiment, the voiceprint features of the speech segment corresponding to each character in the verification user's verification voice information are obtained, the feature vector corresponding to each character in the verification voice information is obtained by training in combination with the preset UBM of the respective character, and the feature vector of each character in the verification voice information is compared for similarity with the feature vector of the respective character in the registration voice information, thereby determining the identity of the verification user. In this approach each user feature vector used for comparison corresponds to a specific character, so that the voiceprint characteristics of the user reading different characters aloud are fully taken into account, which effectively improves the accuracy of voiceprint recognition.
Fig. 5 is a schematic flowchart of voiceprint registration for a registered user in an embodiment of the present invention. As shown in the figure, the voiceprint registration flow in this embodiment may include:
S501, obtaining registration voice information produced by a registered user reading a second character string aloud, the second character string having at least one character in common with the first character string.
The registered user is a user whose legitimate identity has been established. The second character string is the character string used to collect the registered user's voiceprint feature vectors; it may be randomly generated or may be a preset fixed character string. Specifically, the second character string may also contain m characters, of which n are mutually different, where m and n are positive integers and m ≥ n.
In an optional embodiment, the voiceprint recognition apparatus may generate and display the second character string so that the registered user reads it aloud from the display.
S502, performing speech recognition on the registration voice information to obtain the speech segments contained in the registration voice information that respectively correspond to a plurality of characters in the second character string.
Through speech recognition and sound-intensity filtering, the voiceprint recognition apparatus may divide the registration voice information into the speech segments corresponding to the plurality of characters; optionally, invalid speech segments may also be removed so that they do not take part in subsequent processing.
S503, extracting the voiceprint features of the speech segment corresponding to each character in the registration voice information.
Specifically, the voiceprint recognition apparatus may extract the MFCC (Mel Frequency Cepstrum Coefficient) or PLP (Perceptual Linear Predictive) coefficients of the speech segment corresponding to each character as the voiceprint features of that speech segment.
S504, obtaining, by training according to the voiceprint features of the speech segment corresponding to each character in the registration voice information in combination with the preset universal background model corresponding to the respective character, the feature vector corresponding to each character in the registration voice information.
For the expression of the UBM, refer to the foregoing embodiment. This step of the voiceprint registration flow is similar to S204 of the voiceprint recognition flow: the voiceprint recognition apparatus may take the voiceprint features of the speech segment corresponding to each character in the registration voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the parameters of the preset universal background model corresponding to the respective character. That is, after the voiceprint features of the speech segment corresponding to each character in the registration voice information are substituted into formula (1) as input samples, the parameters of the preset universal background model corresponding to the respective character are adjusted iteratively so that the posterior probability P(x) is maximized, and the feature vector corresponding to the respective character in the registration voice information can then be determined from the parameters that maximize P(x).
Moreover, since the means of the Gaussian components in the UBM can be used to distinguish the identity information of speakers, the voiceprint recognition apparatus may take the voiceprint features of the speech segment corresponding to each character in the registration voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the mean supervector of the preset universal background model corresponding to the respective character. That is, after the voiceprint features of the speech segment corresponding to each character in the registration voice information are substituted into formula (1) as input samples, the mean supervector is adjusted iteratively so that the posterior probability P(x) is maximized, and the mean supervector that maximizes P(x) can then be used as the feature vector corresponding to the respective character in the registration voice information.
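To make the idea of adapting the UBM means to the speech of one character more concrete, the following sketch uses relevance-factor MAP adaptation of the component means, a form of MAP adaptation commonly used with GMM-UBM systems. This is an assumption: the text above does not state which update rule is used, and the one-dimensional features, the relevance factor of 16 and all names in the sketch are hypothetical.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Relevance-MAP adaptation of GMM means (assumed update rule)."""
    soft_count = [0.0] * len(means)    # sum of component posteriors
    weighted_sum = [0.0] * len(means)  # posterior-weighted sum of frames
    for x in frames:
        likes = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        if total == 0.0:
            continue  # frame is too far from every component
        for c, like in enumerate(likes):
            gamma = like / total
            soft_count[c] += gamma
            weighted_sum[c] += gamma * x
    adapted = []
    for c, mu in enumerate(means):
        alpha = soft_count[c] / (soft_count[c] + relevance)
        data_mean = weighted_sum[c] / soft_count[c] if soft_count[c] > 0 else mu
        # Interpolate between the data mean and the prior (UBM) mean.
        adapted.append(alpha * data_mean + (1.0 - alpha) * mu)
    return adapted


# Toy two-component UBM for one character and 60 toy feature frames.
weights, means, variances = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
frames = [1.4, 1.5, 1.6] * 20
adapted = map_adapt_means(frames, weights, means, variances)
```

In this toy case the mean of the second component moves from 1.0 towards the frames around 1.5, while a component that sees no data keeps its UBM mean. Concatenating the adapted means gives an adapted mean supervector of the kind used as the feature vector in this embodiment.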
In another optional embodiment, the following formula may be used to adjust the mean supervector of the preset universal background model corresponding to the respective character, so that the posterior probability of the adjusted universal background model corresponding to the respective character is maximized:
M = m + Tω, where M denotes the mean supervector of the universal background model of a given character after adjustment, m denotes the mean supervector of the universal background model of that character before adjustment, T is a preset supervector subspace matrix, and ω is the feature vector corresponding to the respective character in the registration voice information. That is, after the voiceprint features of the speech segment corresponding to each character in the registration voice information are substituted into formula (1) as input samples, the mean supervector in formula (1) can be adjusted by iteratively adjusting ω so that the posterior probability P(x) is maximized, and the ω that maximizes P(x) can then be used as the feature vector corresponding to the respective character in the registration voice information.
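The relation M = m + Tω can be illustrated numerically. The sketch below only evaluates the formula for given values; it does not estimate ω, and the dimensions and numbers are hypothetical toy values chosen for readability.

```python
def adapted_supervector(m, T, omega):
    """Evaluate M = m + T * omega with plain Python lists.

    m is the UBM mean supervector (length D), T is the subspace matrix
    (D rows, R columns) and omega is the low-dimensional vector (length R).
    """
    if len(T) != len(m) or any(len(row) != len(omega) for row in T):
        raise ValueError("dimension mismatch between m, T and omega")
    return [m_i + sum(t * w for t, w in zip(row, omega))
            for m_i, row in zip(m, T)]


# Toy example: a 4-dimensional supervector controlled by a 2-dimensional omega.
m = [1.0, 2.0, 3.0, 4.0]
T = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0],
     [0.5, -0.5]]
omega = [0.2, -0.1]
M = adapted_supervector(m, T, omega)  # approximately [1.2, 1.9, 3.1, 4.15]
```

Because ω has fewer dimensions than M, searching over ω instead of over M directly reduces the number of free parameters, which is the motivation for this alternative.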
Fig. 6 is a schematic flowchart of a voiceprint recognition method in another embodiment of the present invention. As shown in the figure, the voiceprint recognition method in this embodiment may include the following flow:
S601, randomly generating a first character string and displaying it.
S602, obtaining verification voice information produced by a verification user reading the first character string aloud.
S603, identifying the valid speech segments and the invalid speech segments in the verification voice information.
Specifically, the verification voice may be divided according to sound intensity, and speech segments with low sound intensity are regarded as invalid speech segments (for example, including silent sections and impulse noise).
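A division by sound intensity of the kind described in S603 can be sketched with a simple frame-energy threshold. The frame length, the threshold and the toy signal are hypothetical, and the sketch only removes low-energy stretches; it makes no attempt to detect impulse noise.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    return [sum(x * x for x in samples[i:i + frame_len]) / frame_len
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def valid_segments(samples, frame_len, threshold):
    """Return (start, end) sample indices of runs of frames at or above threshold."""
    energies = frame_energies(samples, frame_len)
    segments, start = [], None
    for k, energy in enumerate(energies):
        if energy >= threshold and start is None:
            start = k * frame_len                      # a valid run begins
        elif energy < threshold and start is not None:
            segments.append((start, k * frame_len))    # the run ends
            start = None
    if start is not None:
        segments.append((start, len(energies) * frame_len))
    return segments


# Toy signal: silence, a loud burst, silence, a second burst.
signal = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.0] * 4 + [0.4, -0.4, 0.4, -0.4]
segs = valid_segments(signal, frame_len=4, threshold=0.01)
```

The two bursts are returned as valid segments and the silent stretches are dropped; in the flow above, only the valid segments would be passed on to speech recognition in S604.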
S604, performing speech recognition on the valid speech segments to obtain the speech segments respectively corresponding to a plurality of characters in the first character string.
Speech recognition yields the speech segments respectively corresponding to the plurality of characters in the first character string.
S605, determining that the order of the speech segments of the plurality of characters in the verification voice information is consistent with the order of the respective characters in the first character string.
To effectively prevent the voice information of a registered user from being illegally recorded or copied and then used for voiceprint recognition, a different first character string may be randomly generated each time, and in this step it is judged whether the order of the speech segments of the plurality of characters in the verification voice information is consistent with the order of the respective characters in the first character string. If the order is not consistent, voiceprint recognition can be determined to have failed; if the order is consistent with that of the respective characters in the first character string, the subsequent flow is performed.
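The combination of a fresh random character string and an order check can be sketched as follows. The sketch assumes that the character string consists of digits, that speech recognition returns the recognized characters in spoken order, and that consistency means exact equality of the two sequences; the function names are hypothetical.

```python
import secrets


def generate_prompt(length=8, alphabet="0123456789"):
    """Randomly generate a new character string for each verification attempt."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


def order_matches(prompt, recognized_chars):
    """Check that the recognized characters appear in the prompted order."""
    return list(prompt) == list(recognized_chars)


prompt = generate_prompt()
ok = order_matches(prompt, list(prompt))                        # user read the prompt
replayed = order_matches("12358948", list("84985321"))          # stale recording
```

A recording of an earlier session would contain the characters of an earlier character string, so it is unlikely to match the order of a newly generated one and the attempt can be rejected before any voiceprint comparison.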
S606, extracting the voiceprint features of the speech segment corresponding to each character.
Specifically, the voiceprint recognition apparatus may extract the MFCC (Mel Frequency Cepstrum Coefficient) or PLP (Perceptual Linear Predictive) coefficients of the speech segment corresponding to each character as the voiceprint features of that speech segment.
S607, taking the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data, and using the maximum a posteriori algorithm to adjust the mean supervector of the preset universal background model corresponding to the respective character, thereby estimating the feature vector corresponding to each character in the verification voice information.
Since a large number of experiments and papers have shown that the means of the Gaussian components in the UBM can be used to distinguish the identity information of speakers, the voiceprint recognition apparatus may take the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the mean supervector of the preset universal background model corresponding to the respective character. That is, after the voiceprint features of the speech segment corresponding to each character in the verification voice information are substituted into formula (1) as input samples, the mean supervector is adjusted iteratively so that the posterior probability P(x) is maximized, and the mean supervector that maximizes P(x) can then be used as the feature vector corresponding to the respective character in the verification voice information.
In another optional embodiment, in order to mitigate the slow convergence caused by the high dimensionality of the supervector, the voiceprint recognition apparatus may use the following formula to adjust the mean supervector of the preset universal background model corresponding to the respective character, so that the posterior probability of the adjusted universal background model corresponding to the respective character is maximized:
M = m + Tω, where M denotes the mean supervector of the universal background model of a given character after adjustment, m denotes the mean supervector of the universal background model of that character before adjustment, T is a preset supervector subspace matrix, and ω is the feature vector corresponding to the respective character in the verification voice information. That is, after the voiceprint features of the speech segment corresponding to each character in the verification voice information are substituted into formula (1) as input samples, the mean supervector in formula (1) can be adjusted by iteratively adjusting ω so that the posterior probability P(x) is maximized, and the ω that maximizes P(x) can then be used as the feature vector corresponding to the respective character in the verification voice information.
S608, calculating the similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information; if the similarity score reaches a preset verification threshold, the verification user is determined to be the registered user corresponding to the registration voice information.
In this embodiment, the voiceprint recognition apparatus may calculate the cosine distance value between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information as the similarity score. That is, the similarity score between the feature vector corresponding to a given character in the verification voice information and its feature vector in the registration voice information is calculated with the following formula:
$$\mathrm{score}_i = \frac{\langle \omega_i(tar),\, \omega_i(test) \rangle}{\lVert \omega_i(tar) \rVert \, \lVert \omega_i(test) \rVert}$$
Here the subscript i denotes the i-th character common to the verification voice information and the registration voice information, ωi(tar) denotes the feature vector corresponding to that character in the verification voice information, and ωi(test) denotes the feature vector corresponding to that character in the registration voice information. If the verification voice information and the registration voice information have several characters in common, the similarity scores calculated for the individual characters with the above formula can be averaged; if the average similarity score reaches the corresponding preset verification threshold, the verification user is determined to be the registered user corresponding to the registration voice information. If there are several registered users, such as registered users A, B and C shown in Fig. 1, the feature vector of a given character of the verification user can be compared with the feature vector of the same character of each registered user; the registered user whose feature vector for that character has the highest similarity score with the feature vector of that character in the verification voice, provided that the score also reaches the preset verification threshold, is taken as the identification result for the verification user.
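The cosine-based similarity score of S608 can be sketched in a few lines. The function name cosine_score and the example vectors are hypothetical; returning 0.0 for a zero-length vector is a choice made for this sketch and is not stated in the disclosure.

```python
import math


def cosine_score(a, b):
    """Cosine of the angle between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # undefined for a zero vector; treated as no similarity here
    return dot / (norm_a * norm_b)


same = cosine_score([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # parallel vectors
orth = cosine_score([1.0, 0.0], [0.0, 1.0])            # orthogonal vectors
```

Parallel vectors score close to 1 and orthogonal vectors score 0, so a larger value indicates that the two feature vectors of the same character are more alike; the score is insensitive to the overall length of the vectors.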
Thus, in this embodiment, the feature vector corresponding to each character in the verification voice information is compared for similarity with the feature vector of the respective character in the registration voice information, and the judgment is combined with the temporal order of the speech segments, which further ensures that the identity of the verification user is determined accurately.
Fig. 7 is a schematic structural diagram of a voiceprint recognition apparatus in an embodiment of the present invention. As shown in the figure, the voiceprint recognition apparatus in this embodiment may include:
Voice acquisition module 710, configured to obtain verification voice information produced by a verification user reading a first character string aloud.
The verification user is a user of unknown identity whose identity needs to be verified by the voiceprint recognition apparatus. The first character string is the character string used to authenticate the verification user; it may be randomly generated or may be a preset fixed character string, for example a character string that is at least partly identical to the second character string corresponding to previously generated registration voice information. Specifically, the character string may contain m characters, of which n are mutually different, where m and n are positive integers and m ≥ n.
For example, the first character string is "12358948", which has 8 characters in total and contains 7 mutually different characters: "1", "2", "3", "4", "5", "8" and "9".
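The values of m and n for the example character string can be checked directly; the variable names below are chosen only for this illustration.

```python
first_string = "12358948"

m = len(first_string)            # total number of characters
distinct = sorted(set(first_string))
n = len(distinct)                # number of mutually different characters
```

Here m is 8 and n is 7 because the character "8" occurs twice, which satisfies m ≥ n.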
Speech segment identification module 720, configured to perform speech recognition on the verification voice information to obtain the speech segments contained in the verification voice information that respectively correspond to a plurality of characters in the first character string.
As shown in Fig. 3, the speech segment identification module 720 may, through speech recognition and sound-intensity filtering, divide the verification voice information into the speech segments corresponding to the plurality of characters; optionally, invalid speech segments may also be removed so that they do not take part in subsequent processing.
In an optional embodiment, as shown in Fig. 8, the speech segment identification module may further include:
Valid segment recognition unit 721, configured to identify the valid speech segments and the invalid speech segments in the verification voice information.
Specifically, the valid segment recognition unit 721 may divide the verification voice according to sound intensity and regard speech segments with low sound intensity as invalid speech segments (for example, including silent sections and impulse noise).
Speech recognition unit 722, configured to perform speech recognition on the valid speech segments to obtain the speech segments respectively corresponding to the plurality of characters in the first character string.
Voiceprint feature extraction module 730, configured to extract the voiceprint features of the speech segment corresponding to each character in the verification voice information.
Specifically, the voiceprint feature extraction module 730 may extract the MFCC (Mel Frequency Cepstrum Coefficient) or PLP (Perceptual Linear Predictive) coefficients of the speech segment corresponding to each character as the voiceprint features of that speech segment.
Feature model training module 740, configured to obtain, by training according to the voiceprint features of the speech segment corresponding to each character in combination with the preset universal background model corresponding to the respective character, the feature vector corresponding to each character in the verification voice information.
The feature model training module 740 may take the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the parameters of the preset universal background model corresponding to the respective character. That is, after the voiceprint features of the speech segment corresponding to each character in the verification voice information are substituted into formula (1) as input samples, the parameters of the preset universal background model corresponding to the respective character are adjusted iteratively so that the posterior probability P(x) is maximized, and the feature model training module 740 can then determine the feature vector corresponding to the respective character in the verification voice information from the parameters that maximize P(x).
Since a large number of experiments and papers have shown that the means of the Gaussian components in the UBM can be used to distinguish the identity information of speakers, we define the mean supervector of the UBM as the vector obtained by concatenating the mean vectors of all the Gaussian components of the model.
Thus, the feature model training module 740 may take the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the mean supervector of the preset universal background model corresponding to the respective character. That is, after the voiceprint features of the speech segment corresponding to each character in the verification voice information are substituted into formula (1) as input samples, the mean supervector is adjusted iteratively so that the posterior probability P(x) is maximized, and the feature model training module 740 can use the mean supervector that maximizes P(x) as the feature vector corresponding to the respective character in the verification voice information.
In another optional embodiment, in order to mitigate the slow convergence caused by the high dimensionality of the supervector, we use probabilistic principal component analysis (PPCA) to confine the range of variation of the mean supervector to a subspace. The feature model training module 740 may take the voiceprint features of the speech segment corresponding to each character in the verification voice information as training sample data, use the maximum a posteriori algorithm to adjust the mean supervector of the preset universal background model corresponding to the respective character, and combine this with a preset supervector subspace matrix to obtain the feature vector corresponding to each character in the verification voice information. In a specific implementation, the feature model training module 740 may use the following formula to adjust the mean supervector of the preset universal background model corresponding to the respective character, so that the posterior probability of the adjusted universal background model corresponding to the respective character is maximized:
M = m + Tω, where M denotes the mean supervector of the universal background model of a given character after adjustment, m denotes the mean supervector of the universal background model of that character before adjustment, T is a preset supervector subspace matrix, and ω is the feature vector corresponding to the respective character in the verification voice information. That is, after the voiceprint features of the speech segment corresponding to each character in the verification voice information are substituted into formula (1) as input samples, the mean supervector in formula (1) can be adjusted by iteratively adjusting ω so that the posterior probability P(x) is maximized, and the ω that maximizes P(x) can then be used as the feature vector corresponding to the respective character in the verification voice information. The supervector subspace matrix T is determined according to the correlation between the individual dimensions of the mean supervector of the Gaussian mixture model.
Similarity judging module 750, configured to calculate the similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information.
Specifically, the voiceprint recognition apparatus may obtain the registration voice information of a registered user in the voiceprint registration stage and, through the speech segment identification module 720, the voiceprint feature extraction module 730 and the feature model training module 740, obtain the feature vector corresponding to the speech segment of each character in the registration voice information. The registration voice information may be registration voice information that the voiceprint recognition apparatus obtains when the registered user reads a second character string aloud, the second character string having at least one character in common with the first character string; that is, the second character string corresponding to the registration voice information is at least partly identical to the first character string. In an optional embodiment, the voiceprint recognition apparatus may also obtain the feature vectors corresponding to the respective characters in the registration voice information from an external source. That is, after the registered user enters the registration voice information through another device, that device or a server obtains the feature vector corresponding to the speech segment of each character in the registration voice information through voiceprint feature extraction and voiceprint model training, and the voiceprint recognition apparatus obtains these feature vectors from that device or server, so that in the identification stage the similarity judging module 750 can compare them with the feature vector corresponding to each character in the verification voice information.
In a specific implementation, the similarity score is a value that measures the degree of similarity between the two feature vectors of the same character, obtained after the voiceprint recognition apparatus compares the feature vector corresponding to each character in the verification voice information with the feature vector corresponding to the respective character in the preset registration voice information. In an optional embodiment, the similarity judging module 750 may calculate the cosine distance value between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information as the similarity score. That is, the similarity score between the feature vector corresponding to a given character in the verification voice information and its feature vector in the registration voice information is calculated with the following formula:
$$\mathrm{score}_i = \frac{\langle \omega_i(tar),\, \omega_i(test) \rangle}{\lVert \omega_i(tar) \rVert \, \lVert \omega_i(test) \rVert}$$
Here the subscript i denotes the i-th character common to the verification voice information and the registration voice information, ωi(tar) denotes the feature vector corresponding to that character in the verification voice information, and ωi(test) denotes the feature vector corresponding to that character in the registration voice information. In an optional embodiment, if the same character occurs more than once in the verification voice information (for example, in the verification voice information shown in Fig. 2, the characters 1, 0, 5 and 8 each occur twice), the two speech segments corresponding to character 0 can each be processed into a feature vector, each of these feature vectors can be scored against the feature vector of character 0 in the preset registration voice information, and the mean of the two similarity scores is used as the similarity score between the feature vector of character 0 in the verification voice information and the feature vector of character 0 in the preset registration voice information; the other repeated characters are handled in the same way.
It should be pointed out that there are many other ways of measuring the similarity between two feature vectors; the above is only one embodiment provided by the present invention. On the basis of the scheme disclosed by the present invention, those skilled in the art can obtain, without creative effort, further ways of calculating the similarity score between the feature vectors of the characters common to the verification voice information and the registration voice information, and the present invention does not list them exhaustively.
User identification module 760, configured to determine the verification user to be the registered user corresponding to the registration voice information if the similarity score reaches a preset verification threshold.
If the verification voice information and the registration voice information have several characters in common, the user identification module 760 may average the similarity scores calculated for the individual characters by the similarity judging module 750; if the average similarity score reaches the corresponding preset verification threshold, the verification user is determined to be the registered user corresponding to the registration voice information. If there are several registered users, such as registered users A, B and C shown in Fig. 1, the user identification module 760 may compare the feature vector of a given character of the verification user with the feature vector of the same character of each registered user; the registered user whose feature vector for that character has the highest similarity score with the feature vector of that character in the verification voice, provided that the score also reaches the preset verification threshold, is taken as the identification result for the verification user.
Furthermore, in an optional embodiment, the voice acquisition module 710 is further configured to obtain registration voice information produced by a registered user reading a second character string aloud, the second character string having at least one character in common with the first character string;
the speech segment identification module 720 is further configured to perform speech recognition on the registration voice information to obtain the speech segments contained in the registration voice information that respectively correspond to a plurality of characters in the second character string;
the voiceprint feature extraction module 730 is further configured to extract the voiceprint features of the speech segment corresponding to each character in the registration voice information;
the feature model training module 740 is further configured to obtain, by training according to the voiceprint features of the speech segment corresponding to each character in the registration voice information in combination with the preset universal background model corresponding to the respective character, the feature vector corresponding to each character in the registration voice information.
In an optional embodiment, the voiceprint recognition apparatus may further include:
Character order determining module 770, configured to determine that the order of the speech segments of the plurality of characters in the verification voice information is consistent with the order of the respective characters in the first character string.
To effectively prevent the voice information of a registered user from being illegally recorded or copied and then used for voiceprint recognition, a different first character string may be randomly generated each time, and it is judged whether the order of the speech segments of the plurality of characters in the verification voice information is consistent with the order of the respective characters in the first character string. If the order is not consistent, voiceprint recognition can be determined to have failed; if the order is consistent with that of the respective characters in the first character string, the voiceprint feature extraction module 730 or the feature model training module 740 may be notified to perform feature extraction and voiceprint training for this verification voice information.
In an optional embodiment, the voiceprint recognition apparatus may further include:
Character string display module 700, configured to randomly generate the first character string and display it.
Thus, in this embodiment, the voiceprint features of the speech segment corresponding to each character in the verification user's verification voice information are obtained, the feature vector corresponding to each character in the verification voice information is obtained by training in combination with the preset UBM of the respective character, and the feature vector of each character in the verification voice information is compared for similarity with the feature vector of the respective character in the registration voice information, thereby determining the identity of the verification user. In this approach each user feature vector used for comparison corresponds to a specific character, so that the voiceprint characteristics of the user reading different characters aloud are fully taken into account, which effectively improves the accuracy of voiceprint recognition.
In an actual test, with training samples from 1,000 people and 290,000 trials (about 10,000 trials with matching identities and about 280,000 trials with non-matching identities), the method achieved a recall of 79.8% at an error rate of one in a thousand and an equal error rate (EER) of 3.39%. Compared with the traditional text-independent modeling method, voiceprint recognition performance improved by more than 40%.
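For readers unfamiliar with the equal error rate, the sketch below shows one common way to estimate it from two lists of trial scores: sweep a threshold and take the point where the false acceptance rate and the false rejection rate are closest. The function names and the toy scores are assumptions for illustration and are unrelated to the test data reported above.

```python
def far_frr(target_scores, impostor_scores, threshold):
    """False acceptance and false rejection rates at a given threshold."""
    frr = sum(s < threshold for s in target_scores) / len(target_scores)
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    return far, frr


def equal_error_rate(target_scores, impostor_scores):
    """Sweep every observed score as a threshold and return (eer, threshold)
    at the point where FAR and FRR are closest to each other."""
    best = None
    for t in sorted(set(target_scores) | set(impostor_scores)):
        far, frr = far_frr(target_scores, impostor_scores, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores: matching-identity trials and non-matching trials.
eer, thr = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(eer, thr)
```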
A person of ordinary skill in the art will understand that all or part of the flows in the method embodiments above may be implemented by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the flows of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above discloses only preferred embodiments of the present invention, which certainly cannot be used to limit the scope of the rights of the present invention; equivalent variations made in accordance with the claims of the present invention therefore still fall within the scope covered by the present invention.
Claims (20)
1. A voiceprint recognition method, characterized in that the method comprises:
obtaining verification voice information produced by a verifying user reading a first character string aloud;
performing speech recognition on the verification voice information to obtain the speech segments contained in the verification voice information that respectively correspond to multiple characters in the first character string;
extracting the voiceprint feature of the speech segment corresponding to each character;
training, according to the voiceprint feature of the speech segment corresponding to each character and in combination with a preset universal background model corresponding to the respective character, to obtain the feature vector corresponding to each character in the verification voice information;
calculating a similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in preset registration voice information, and, if the similarity score reaches a preset verification threshold, determining the verifying user to be the registered user corresponding to the registration voice information.
2. The voiceprint recognition method according to claim 1, characterized in that, before the obtaining of the verification voice information produced by the verifying user reading the first character string aloud, the method further comprises:
obtaining registration voice information produced by a registered user reading a second character string aloud, the second character string having at least one character in common with the first character string;
performing speech recognition on the registration voice information to obtain the speech segments contained in the registration voice information that respectively correspond to multiple characters in the second character string;
extracting the voiceprint feature of the speech segment corresponding to each character in the registration voice information;
training, according to the voiceprint feature of the speech segment corresponding to each character in the registration voice information and in combination with the preset universal background model corresponding to the respective character, to obtain the feature vector corresponding to each character in the registration voice information.
3. The voiceprint recognition method according to claim 1, characterized in that the training, according to the voiceprint feature of the speech segment corresponding to each character and in combination with the preset universal background model corresponding to the respective character, to obtain the feature vector corresponding to each character in the verification voice information comprises:
taking the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, and adjusting the mean supervector of the preset universal background model corresponding to the respective character by using a maximum a posteriori (MAP) algorithm, thereby estimating the feature vector corresponding to each character in the verification voice information.
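The adjustment of the mean supervector in claim 3 can be illustrated with the relevance-MAP update that is widely used for GMM-UBM systems. The sketch below is written under stated assumptions and is not taken from the claims: a diagonal-covariance UBM, a relevance factor of 16, and the function names are all choices made for the example.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Posterior probability of each UBM component given frame x."""
    logs = [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    peak = max(logs)  # subtract the peak for numerical stability
    exps = [math.exp(value - peak) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Relevance-MAP update of the UBM means for one character.
    Returns the adapted means concatenated into a mean supervector."""
    dim = len(means[0])
    counts = [0.0] * len(means)            # zero-order statistics
    sums = [[0.0] * dim for _ in means]    # first-order statistics
    for x in frames:
        for c, g in enumerate(posteriors(x, weights, means, variances)):
            counts[c] += g
            for d in range(dim):
                sums[c][d] += g * x[d]
    supervector = []
    for c, mean in enumerate(means):
        alpha = counts[c] / (counts[c] + relevance)  # data-dependent mixing weight
        for d in range(dim):
            data_mean = sums[c][d] / counts[c] if counts[c] > 0 else mean[d]
            supervector.append(alpha * data_mean + (1.0 - alpha) * mean[d])
    return supervector
```

Components that see many frames move toward the observed data, while components that see few frames stay close to the UBM means, which keeps the estimate stable for the short speech segment of a single character.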
4. The voiceprint recognition method according to claim 3, characterized in that the taking of the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, and adjusting the mean supervector of the preset universal background model corresponding to the respective character by using a maximum a posteriori algorithm, thereby estimating the feature vector corresponding to each character in the verification voice information comprises:
taking the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, adjusting the mean supervector of the preset universal background model corresponding to the respective character by using a maximum a posteriori algorithm, and, in combination with a preset supervector subspace matrix, obtaining the feature vector corresponding to each character in the verification voice information.
5. The voiceprint recognition method according to claim 4, characterized in that the taking of the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, adjusting the mean supervector of the preset universal background model corresponding to the respective character by using a maximum a posteriori algorithm, and, in combination with a preset supervector subspace matrix, obtaining the feature vector corresponding to each character in the verification voice information comprises:
taking the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, and adjusting the mean supervector of the preset universal background model corresponding to the respective character by using the following formula, so that the posterior probability of the adjusted universal background model corresponding to the respective character is maximized:
M = m + Tω, where M denotes the mean supervector of the universal background model of a given character after adjustment, m denotes the mean supervector of the universal background model of the respective character before adjustment, T is the preset supervector subspace matrix, and ω is the feature vector corresponding to the respective character in the verification voice information.
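The formula in claim 5 can be written out with explicit dimensions as follows. The symbols C (number of Gaussian components), F (feature dimension), R (subspace rank), N, Σ and F̃ are assumptions introduced for illustration; the last line is the point estimate of ω commonly used in the total-variability literature and is not stated in the claim itself.

```latex
% Dimensions: C mixture components, F-dimensional features, R-dimensional subspace
M,\; m \in \mathbb{R}^{CF}, \qquad
T \in \mathbb{R}^{CF \times R}, \qquad
\omega \in \mathbb{R}^{R}

% Relation stated in the claim
M = m + T\omega

% Commonly used estimate of \omega (assumption, not part of the claim):
% N is the block-diagonal matrix of zero-order statistics, \Sigma the UBM
% covariance, and \tilde{F} the first-order statistics centered on m
\hat{\omega} = \left(I + T^{\top}\Sigma^{-1} N\, T\right)^{-1} T^{\top}\Sigma^{-1}\tilde{F}
```

Since R is typically much smaller than CF, ω serves as a compact per-character representation of the speaker that can be compared directly between registration and verification.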
6. The voiceprint recognition method according to claim 4, characterized in that the supervector subspace matrix is determined according to the correlation between the weights of the Gaussian modules in the universal background model.
7. The voiceprint recognition method according to claim 1, characterized in that the calculating of the similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information comprises:
calculating the cosine distance value between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information as the similarity score.
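The cosine-based score of claim 7 and the threshold decision of claim 1 can be sketched as below. How the per-character scores are combined is not specified in the claims; averaging over the shared characters, the dictionary layout, and the threshold value of 0.5 are assumptions of this example.

```python
import math


def cosine_score(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify(verify_vectors, enrolled_vectors, threshold=0.5):
    """Compare per-character feature vectors (dict: character -> vector).
    Returns (accepted, score), where score is the mean cosine score over
    the characters present in both dictionaries (an assumed combination rule)."""
    shared = [c for c in verify_vectors if c in enrolled_vectors]
    if not shared:
        return False, 0.0
    score = sum(
        cosine_score(verify_vectors[c], enrolled_vectors[c]) for c in shared
    ) / len(shared)
    return score >= threshold, score


accepted, score = verify(
    {"1": [1.0, 0.0], "2": [0.0, 2.0]},
    {"1": [2.0, 0.0], "2": [0.0, 1.0], "3": [1.0, 1.0]},
)
print(accepted, score)
```

Only characters that appear in both the verification and the registration voice information contribute to the score, which matches the requirement in claim 2 that the two character strings share at least one character.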
8. The voiceprint recognition method according to claim 1, characterized in that the performing of speech recognition on the verification voice information to obtain the speech segments contained in the verification voice information that respectively correspond to multiple characters in the first character string comprises:
identifying valid speech segments and invalid speech segments in the verification voice information;
performing speech recognition on the valid speech segments to obtain the speech segments respectively corresponding to the multiple characters in the first character string.
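The claim does not say how valid and invalid speech segments are told apart. One simple possibility, shown here purely as an assumption, is a short-time energy threshold: frames whose mean energy reaches the threshold are treated as valid speech, and consecutive valid frames are merged into segments. The frame length and threshold values are hypothetical.

```python
def frame_energies(samples, frame_len):
    """Mean energy of each non-overlapping frame (a trailing partial frame is dropped)."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def valid_segments(samples, frame_len=160, threshold=1e-3):
    """Return (start, end) sample index pairs for runs of frames whose
    mean energy is at least the threshold; everything else is treated as invalid."""
    segments = []
    start = None
    energies = frame_energies(samples, frame_len)
    for i, energy in enumerate(energies):
        if energy >= threshold and start is None:
            start = i * frame_len                  # a valid run begins
        elif energy < threshold and start is not None:
            segments.append((start, i * frame_len))  # the run ends
            start = None
    if start is not None:                          # run continues to the last frame
        segments.append((start, len(energies) * frame_len))
    return segments


# Silence, then a burst of signal, then silence again.
audio = [0.0] * 320 + [0.5] * 320 + [0.0] * 160
print(valid_segments(audio))
```

Only the returned ranges would then be passed to speech recognition, so silence and low-level background noise do not contribute to the per-character speech segments.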
9. The voiceprint recognition method according to claim 1, characterized in that, before the determining of the verifying user to be the registered user corresponding to the registration voice information, the method further comprises:
determining that the order of the speech segments of the multiple characters in the verification voice information is consistent with the order of the corresponding characters in the first character string.
10. The voiceprint recognition method according to any one of claims 1-9, characterized in that, before the obtaining of the verification voice information produced by the verifying user reading the first character string aloud, the method further comprises:
randomly generating and displaying the first character string.
11. A voiceprint recognition device, characterized in that the device comprises:
a voice acquisition module, configured to obtain verification voice information produced by a verifying user reading a first character string aloud;
a speech segment identification module, configured to perform speech recognition on the verification voice information to obtain the speech segments contained in the verification voice information that respectively correspond to multiple characters in the first character string;
a voiceprint feature extraction module, configured to extract the voiceprint feature of the speech segment corresponding to each character in the verification voice information;
a feature model training module, configured to train, according to the voiceprint feature of the speech segment corresponding to each character and in combination with a preset universal background model corresponding to the respective character, to obtain the feature vector corresponding to each character in the verification voice information;
a similarity judgment module, configured to calculate a similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in preset registration voice information;
a user identification module, configured to determine the verifying user to be the registered user corresponding to the registration voice information if the similarity score reaches a preset verification threshold.
12. The voiceprint recognition device according to claim 11, characterized in that
the voice acquisition module is further configured to obtain registration voice information produced by a registered user reading a second character string aloud, the second character string having at least one character in common with the first character string;
the speech segment identification module is further configured to perform speech recognition on the registration voice information to obtain the speech segments contained in the registration voice information that respectively correspond to multiple characters in the second character string;
the voiceprint feature extraction module is further configured to extract the voiceprint feature of the speech segment corresponding to each character in the registration voice information;
the feature model training module is further configured to train, according to the voiceprint feature of the speech segment corresponding to each character in the registration voice information and in combination with the preset universal background model corresponding to the respective character, to obtain the feature vector corresponding to each character in the registration voice information.
13. The voiceprint recognition device according to claim 11, characterized in that the feature vector computing module is configured to:
take the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, and adjust the mean supervector of the preset universal background model corresponding to the respective character by using a maximum a posteriori algorithm, thereby estimating the feature vector corresponding to each character in the verification voice information.
14. The voiceprint recognition device according to claim 13, characterized in that the feature vector computing module is configured to:
take the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, adjust the mean supervector of the preset universal background model corresponding to the respective character by using a maximum a posteriori algorithm, and, in combination with a preset supervector subspace matrix, obtain the feature vector corresponding to each character in the verification voice information.
15. The voiceprint recognition device according to claim 14, characterized in that the feature vector computing module is specifically configured to:
take the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, and adjust the mean supervector of the preset universal background model corresponding to the respective character by using the following formula, so that the posterior probability of the adjusted universal background model corresponding to the respective character is maximized:
M = m + Tω, where M denotes the mean supervector of the universal background model of a given character after adjustment, m denotes the mean supervector of the universal background model of the respective character before adjustment, T is the preset supervector subspace matrix, and ω is the feature vector corresponding to the respective character in the verification voice information.
16. The voiceprint recognition device according to claim 14, characterized in that the supervector subspace matrix is determined according to the correlation between the vectors of each dimension in the mean supervector of the Gaussian mixture model.
17. The voiceprint recognition device according to claim 11, characterized in that the similarity judgment module is configured to:
calculate the cosine distance value between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information as the similarity score.
18. The voiceprint recognition device according to claim 11, characterized in that the speech segment identification module comprises:
a valid segment identification unit, configured to identify valid speech segments and invalid speech segments in the verification voice information;
a speech recognition unit, configured to perform speech recognition on the valid speech segments to obtain the speech segments respectively corresponding to the multiple characters in the first character string.
19. The voiceprint recognition device according to claim 11, characterized in that the device further comprises:
a character order determining module, configured to determine that the order of the speech segments of the multiple characters in the verification voice information is consistent with the order of the corresponding characters in the first character string.
20. The voiceprint recognition device according to any one of claims 11-19, characterized in that the device further comprises:
a character string display module, configured to randomly generate and display the first character string.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610416650.3A CN106098068B (en) | 2016-06-12 | 2016-06-12 | A kind of method for recognizing sound-groove and device |
PCT/CN2017/087911 WO2017215558A1 (en) | 2016-06-12 | 2017-06-12 | Voiceprint recognition method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610416650.3A CN106098068B (en) | 2016-06-12 | 2016-06-12 | A kind of method for recognizing sound-groove and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106098068A true CN106098068A (en) | 2016-11-09 |
CN106098068B CN106098068B (en) | 2019-07-16 |
Family
ID=57846666
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610416650.3A Active CN106098068B (en) | 2016-06-12 | 2016-06-12 | A kind of method for recognizing sound-groove and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN106098068B (en) |
WO (1) | WO2017215558A1 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109147767B (en) * | 2018-08-16 | 2024-06-21 | 平安科技(深圳)有限公司 | Method, device, computer equipment and storage medium for recognizing numbers in voice |
CN111199729B (en) * | 2018-11-19 | 2023-09-26 | 阿里巴巴集团控股有限公司 | Voiceprint recognition method and voiceprint recognition device |
CN110738998A (en) * | 2019-09-11 | 2020-01-31 | 深圳壹账通智能科技有限公司 | Voice-based personal credit evaluation method, device, terminal and storage medium |
CN112037815B (en) * | 2020-08-28 | 2024-09-06 | 中移(杭州)信息技术有限公司 | Audio fingerprint extraction method, server and storage medium |
CN112435673B (en) * | 2020-12-15 | 2024-05-14 | 北京声智科技有限公司 | Model training method and electronic terminal |
WO2024077588A1 (en) * | 2022-10-14 | 2024-04-18 | Qualcomm Incorporated | Voice-based user authentication |
CN115641852A (en) * | 2022-10-18 | 2023-01-24 | 中国电信股份有限公司 | Voiceprint recognition method and device, electronic equipment and computer readable storage medium |
CN115550075B (en) * | 2022-12-01 | 2023-05-09 | 中网道科技集团股份有限公司 | Anti-counterfeiting processing method and equipment for community correction object public welfare activity data |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101997689A (en) * | 2010-11-19 | 2011-03-30 | 吉林大学 | USB (universal serial bus) identity authentication method based on voiceprint recognition and system thereof |
CN102163427A (en) * | 2010-12-20 | 2011-08-24 | 北京邮电大学 | Method for detecting audio exceptional event based on environmental model |
CN102254559A (en) * | 2010-05-20 | 2011-11-23 | 盛乐信息技术(上海)有限公司 | Identity authentication system and method based on vocal print |
CN102314877A (en) * | 2010-07-08 | 2012-01-11 | 盛乐信息技术(上海)有限公司 | Voiceprint identification method for character content prompt |
CN102737634A (en) * | 2012-05-29 | 2012-10-17 | 百度在线网络技术(北京)有限公司 | Authentication method and device based on voice |
CN103679452A (en) * | 2013-06-20 | 2014-03-26 | 腾讯科技(深圳)有限公司 | Payment authentication method, device thereof and system thereof |
CN104064189A (en) * | 2014-06-26 | 2014-09-24 | 厦门天聪智能软件有限公司 | Vocal print dynamic password modeling and verification method |
CN104282303A (en) * | 2013-07-09 | 2015-01-14 | 威盛电子股份有限公司 | Method for conducting voice recognition by voiceprint recognition and electronic device thereof |
CN104575504A (en) * | 2014-12-24 | 2015-04-29 | 上海师范大学 | Method for personalized television voice wake-up by voiceprint and voice identification |
CN104901808A (en) * | 2015-04-14 | 2015-09-09 | 时代亿宝(北京)科技有限公司 | Voiceprint authentication system and method based on time type dynamic password |
CN105096121A (en) * | 2015-06-25 | 2015-11-25 | 百度在线网络技术(北京)有限公司 | Voiceprint authentication method and device |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100406307B1 (en) * | 2001-08-09 | 2003-11-19 | 삼성전자주식회사 | Voice recognition method and system based on voice registration method and system |
CN102238189B (en) * | 2011-08-01 | 2013-12-11 | 安徽科大讯飞信息科技股份有限公司 | Voiceprint password authentication method and system |
CN105656887A (en) * | 2015-12-30 | 2016-06-08 | 百度在线网络技术(北京)有限公司 | Artificial intelligence-based voiceprint authentication method and device |
CN106098068B (en) * | 2016-06-12 | 2019-07-16 | 腾讯科技(深圳)有限公司 | A kind of method for recognizing sound-groove and device |
Cited By (48)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017215558A1 (en) * | 2016-06-12 | 2017-12-21 | 腾讯科技(深圳)有限公司 | Voiceprint recognition method and device |
US11283631B2 (en) | 2017-01-03 | 2022-03-22 | Nokia Technologies Oy | Apparatus, method and computer program product for authentication |
WO2018126338A1 (en) * | 2017-01-03 | 2018-07-12 | Nokia Technologies Oy | Apparatus, method and computer program product for authentication |
CN108447471A (en) * | 2017-02-15 | 2018-08-24 | 腾讯科技(深圳)有限公司 | Audio recognition method and speech recognition equipment |
CN108447471B (en) * | 2017-02-15 | 2021-09-10 | 腾讯科技(深圳)有限公司 | Speech recognition method and speech recognition device |
WO2018223727A1 (en) * | 2017-06-09 | 2018-12-13 | 平安科技(深圳)有限公司 | Voiceprint recognition method, apparatus and device, and medium |
CN109102812A (en) * | 2017-06-21 | 2018-12-28 | 北京搜狗科技发展有限公司 | A kind of method for recognizing sound-groove, system and electronic equipment |
CN109102812B (en) * | 2017-06-21 | 2021-08-31 | 北京搜狗科技发展有限公司 | Voiceprint recognition method and system and electronic equipment |
US11100934B2 (en) | 2017-06-30 | 2021-08-24 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and apparatus for voiceprint creation and registration |
WO2019000832A1 (en) * | 2017-06-30 | 2019-01-03 | 百度在线网络技术(北京)有限公司 | Method and apparatus for voiceprint creation and registration |
CN107248410A (en) * | 2017-07-19 | 2017-10-13 | 浙江联运知慧科技有限公司 | The method that Application on Voiceprint Recognition dustbin opens the door |
CN109559759B (en) * | 2017-09-27 | 2021-10-08 | 华硕电脑股份有限公司 | Electronic device with incremental registration unit and method thereof |
CN109559759A (en) * | 2017-09-27 | 2019-04-02 | 华硕电脑股份有限公司 | The electronic equipment and its method for having increment registering unit |
US11335352B2 (en) * | 2017-09-29 | 2022-05-17 | Tencent Technology (Shenzhen) Company Limited | Voice identity feature extractor and classifier training |
CN107886943A (en) * | 2017-11-21 | 2018-04-06 | 广州势必可赢网络科技有限公司 | Voiceprint recognition method and device |
CN108154588B (en) * | 2017-12-29 | 2020-11-27 | 深圳市艾特智能科技有限公司 | Unlocking method and system, readable storage medium and intelligent device |
CN108154588A (en) * | 2017-12-29 | 2018-06-12 | 深圳市艾特智能科技有限公司 | Unlocking method, system, readable storage medium storing program for executing and smart machine |
CN110047491A (en) * | 2018-01-16 | 2019-07-23 | 中国科学院声学研究所 | A kind of relevant method for distinguishing speek person of random digit password and device |
CN108269590A (en) * | 2018-01-17 | 2018-07-10 | 广州势必可赢网络科技有限公司 | Vocal cord recovery scoring method and device |
CN108447489A (en) * | 2018-04-17 | 2018-08-24 | 清华大学 | A kind of continuous voiceprint authentication method and system of band feedback |
CN108447489B (en) * | 2018-04-17 | 2020-05-22 | 清华大学 | Continuous voiceprint authentication method and system with feedback |
CN110875044B (en) * | 2018-08-30 | 2022-05-03 | 中国科学院声学研究所 | Speaker identification method based on word correlation score calculation |
CN110875044A (en) * | 2018-08-30 | 2020-03-10 | 中国科学院声学研究所 | Speaker identification method based on word correlation score calculation |
CN109117622A (en) * | 2018-09-19 | 2019-01-01 | 北京容联易通信息技术有限公司 | A kind of identity identifying method based on audio-frequency fingerprint |
CN109117622B (en) * | 2018-09-19 | 2020-09-01 | 北京容联易通信息技术有限公司 | Identity authentication method based on audio fingerprints |
CN109257362A (en) * | 2018-10-11 | 2019-01-22 | 平安科技(深圳)有限公司 | Method, apparatus, computer equipment and the storage medium of voice print verification |
CN109473107A (en) * | 2018-12-03 | 2019-03-15 | 厦门快商通信息技术有限公司 | A kind of relevant method for recognizing sound-groove of text half and system |
CN109473107B (en) * | 2018-12-03 | 2020-12-22 | 厦门快商通信息技术有限公司 | Text semi-correlation voiceprint recognition method and system |
CN111669350A (en) * | 2019-03-05 | 2020-09-15 | 阿里巴巴集团控股有限公司 | Identity verification method, verification information generation method, payment method and payment device |
US20220229891A1 (en) * | 2019-07-29 | 2022-07-21 | Huawei Technologies Co., Ltd. | Voiceprint recognition method and device |
WO2021017982A1 (en) * | 2019-07-29 | 2021-02-04 | 华为技术有限公司 | Voiceprint recognition method, and device |
CN110517695A (en) * | 2019-09-11 | 2019-11-29 | 国微集团(深圳)有限公司 | Verification method and device based on vocal print |
CN110971763B (en) * | 2019-12-10 | 2021-01-26 | Oppo广东移动通信有限公司 | Arrival reminding method and device, storage medium and electronic equipment |
CN110971763A (en) * | 2019-12-10 | 2020-04-07 | Oppo(重庆)智能科技有限公司 | Arrival reminding method and device, storage medium and electronic equipment |
CN110956732A (en) * | 2019-12-19 | 2020-04-03 | 重庆特斯联智慧科技股份有限公司 | Safety entrance guard based on thing networking |
CN111081256A (en) * | 2019-12-31 | 2020-04-28 | 苏州思必驰信息科技有限公司 | Digital string voiceprint password verification method and system |
CN111081260A (en) * | 2019-12-31 | 2020-04-28 | 苏州思必驰信息科技有限公司 | Method and system for identifying voiceprint of awakening word |
CN111597531A (en) * | 2020-04-07 | 2020-08-28 | 北京捷通华声科技股份有限公司 | Identity authentication method and device, electronic equipment and readable storage medium |
CN111613230A (en) * | 2020-06-24 | 2020-09-01 | 泰康保险集团股份有限公司 | Voiceprint verification method, voiceprint verification device, voiceprint verification equipment and storage medium |
CN112820299B (en) * | 2020-12-29 | 2021-09-14 | 马上消费金融股份有限公司 | Voiceprint recognition model training method and device and related equipment |
CN112820299A (en) * | 2020-12-29 | 2021-05-18 | 马上消费金融股份有限公司 | Voiceprint recognition model training method and device and related equipment |
CN113113022A (en) * | 2021-04-15 | 2021-07-13 | 吉林大学 | Method for automatically identifying identity based on voiceprint information of speaker |
CN113570754A (en) * | 2021-07-01 | 2021-10-29 | 汉王科技股份有限公司 | Voiceprint lock control method and device and electronic equipment |
CN113570754B (en) * | 2021-07-01 | 2022-04-29 | 汉王科技股份有限公司 | Voiceprint lock control method and device and electronic equipment |
CN116530944A (en) * | 2023-07-06 | 2023-08-04 | 荣耀终端有限公司 | Sound processing method and electronic equipment |
CN116530944B (en) * | 2023-07-06 | 2023-10-20 | 荣耀终端有限公司 | Sound processing method and electronic equipment |
CN116978368A (en) * | 2023-09-25 | 2023-10-31 | 腾讯科技(深圳)有限公司 | Wake-up word detection method and related device |
CN116978368B (en) * | 2023-09-25 | 2023-12-15 | 腾讯科技(深圳)有限公司 | Wake-up word detection method and related device |
Also Published As
Publication number | Publication date |
---|---|
CN106098068B (en) | 2019-07-16 |
WO2017215558A1 (en) | 2017-12-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106098068A (en) | Voiceprint recognition method and device | |
CN106057206B (en) | Voiceprint model training method, voiceprint recognition method and device | |
CN107610707B (en) | Voiceprint recognition method and device | |
CN110457432B (en) | Interview scoring method, interview scoring device, interview scoring equipment and interview scoring storage medium | |
CN110310647B (en) | Voice identity feature extractor, classifier training method and related equipment | |
TWI527023B (en) | A voiceprint recognition method and apparatus | |
Kelly et al. | Deep neural network based forensic automatic speaker recognition in VOCALISE using x-vectors | |
Das et al. | Development of multi-level speech based person authentication system | |
CN107104803A (en) | User identity authentication method combining digital password and voiceprint confirmation | |
CN111402862B (en) | Speech recognition method, device, storage medium and equipment | |
CN100363938C (en) | Multi-modal identity recognition method based on score-difference weighted compromise | |
Mansour et al. | Voice recognition using dynamic time warping and mel-frequency cepstral coefficients algorithms | |
CN104217149A (en) | Biometric authentication method and equipment based on voice | |
CN101465123A (en) | Verification method and device for speaker authentication and speaker authentication system | |
CN110047504B (en) | Speaker identification method under identity vector x-vector linear transformation | |
CN106782603A (en) | Intelligent sound evaluating method and system | |
CN101609672B (en) | Speech recognition semantic confidence feature extraction method and device | |
Meyer et al. | Anonymizing speech with generative adversarial networks to preserve speaker privacy | |
Umesh et al. | Frequency warping and the Mel scale | |
Beigi | Challenges of Large-Scale Speaker Recognition | |
CN110111798A (en) | Method and terminal for identifying a speaker | |
Büyük | Sentence‐HMM state‐based i‐vector/PLDA modelling for improved performance in text dependent single utterance speaker verification | |
Ghaemmaghami et al. | Speaker attribution of australian broadcast news data | |
Misra et al. | Maximum-likelihood linear transformation for unsupervised domain adaptation in speaker verification | |
Mandalapu et al. | Multilingual voice impersonation dataset and evaluation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 20230712. Address after: 35th floor, Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 518057. Patentee after: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.; TENCENT CLOUD COMPUTING (BEIJING) Co.,Ltd. Address before: Room 403, East Block 2, SEG Science and Technology Park, Zhenxing Road, Futian District, Shenzhen, Guangdong, 518000. Patentee before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd. |