CN106098068A - Voiceprint recognition method and device - Google Patents

Voiceprint recognition method and device

Info

Publication number
CN106098068A
CN106098068A CN201610416650.3A CN201610416650A CN106098068A CN 106098068 A CN106098068 A CN 106098068A CN 201610416650 A CN201610416650 A CN 201610416650A CN 106098068 A CN106098068 A CN 106098068A
Authority
CN
China
Prior art keywords
voice information
character
verification
feature
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610416650.3A
Other languages
Chinese (zh)
Other versions
CN106098068B (en)
Inventor
李为
钱柄桦
金星明
李科
吴富章
吴永坚
黄飞跃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201610416650.3A (CN106098068B)
Publication of CN106098068A
Priority to PCT/CN2017/087911 (WO2017215558A1)
Application granted
Publication of CN106098068B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/08 Use of distortion metrics or a particular distance between probe pattern and reference templates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/04 Training, enrolment or model building

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Business, Economics & Management (AREA)
  • Telephonic Communication Services (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Telephone Function (AREA)

Abstract

An embodiment of the invention discloses a voiceprint recognition method and device. The method comprises: obtaining verification voice information produced when a verification user reads a first character string aloud; performing speech recognition on the verification voice information to obtain the voice segments it contains that correspond to the respective characters of the first character string; extracting the voiceprint feature of the voice segment corresponding to each character; training, from the voiceprint feature of the voice segment corresponding to each character and a preset universal background model for that character, a feature vector for each character in the verification voice information; and calculating a similarity score between the feature vector of each character in the verification voice information and the feature vector of the same character in preset registration voice information, and, if the similarity score reaches a preset verification threshold, determining that the verification user is the registered user to whom the registration voice information belongs. The invention can effectively improve voiceprint recognition accuracy.

Description

Voiceprint recognition method and device
Technical field
The present invention relates to the field of voice recognition technology, and in particular to a voiceprint recognition method and device.
Background
Voiceprint recognition, as a biometric identification method, comprises two stages: user registration and user identification. In the registration stage, speech is mapped to a user model through a series of processing steps. In the identification stage, a segment of speech of unknown identity is matched against the model for similarity, and a decision is made as to whether the identity of the unknown speech is consistent with that of the registered speech. Existing voiceprint modeling methods usually describe the speaker's identity characteristics in a text-independent way, but when users read different content aloud, text-independent modeling yields low recognition accuracy and struggles to meet requirements.
Summary of the invention
In view of this, embodiments of the present invention provide a voiceprint recognition method and device that can effectively improve voiceprint recognition accuracy.
To solve the above technical problem, an embodiment of the present invention provides a voiceprint recognition method, the method comprising:
obtaining verification voice information produced when a verification user reads a first character string aloud;
performing speech recognition on the verification voice information to obtain the voice segments it contains that correspond to the respective characters of the first character string;
extracting the voiceprint feature of the voice segment corresponding to each character;
training, from the voiceprint feature of the voice segment corresponding to each character and a preset universal background model for that character, a feature vector for each character in the verification voice information;
calculating a similarity score between the feature vector of each character in the verification voice information and the feature vector of the same character in preset registration voice information, and, if the similarity score reaches a preset verification threshold, determining that the verification user is the registered user to whom the registration voice information belongs.
Correspondingly, an embodiment of the present invention further provides a voiceprint recognition device, the device comprising:
a voice acquisition module, configured to obtain verification voice information produced when a verification user reads a first character string aloud;
a voice segment recognition module, configured to perform speech recognition on the verification voice information to obtain the voice segments it contains that correspond to the respective characters of the first character string;
a voiceprint feature extraction module, configured to extract the voiceprint feature of the voice segment corresponding to each character in the verification voice information;
a feature model training module, configured to train, from the voiceprint feature of the voice segment corresponding to each character and a preset universal background model for that character, a feature vector for each character in the verification voice information;
a similarity judgment module, configured to calculate a similarity score between the feature vector of each character in the verification voice information and the feature vector of the same character in preset registration voice information;
a user identification module, configured to determine that the verification user is the registered user to whom the registration voice information belongs if the similarity score reaches a preset verification threshold.
In this embodiment, the voiceprint features of the voice segments corresponding to each character in the verification user's verification voice information are obtained, a feature vector for each character is trained in combination with the preset UBM of that character, and the feature vector of each character in the verification voice information is compared for similarity with the feature vector of the same character in the registration voice information to determine the identity of the verification user. Because the feature vectors being compared are tied to specific characters, this approach takes full account of the voiceprint characteristics a user exhibits when reading different characters, and can therefore effectively improve voiceprint recognition accuracy.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is an overview diagram of the stages of the voiceprint recognition method in an embodiment of the present invention;
Fig. 2 is a flow diagram of a voiceprint recognition method in an embodiment of the present invention;
Fig. 3 is a schematic diagram of the principle of recognizing, from voice information, the voice segments corresponding to multiple characters in an embodiment of the present invention;
Fig. 4 is a schematic diagram of the principle of obtaining, from voice information, the feature vector corresponding to each character in an embodiment of the present invention;
Fig. 5 is a flow diagram of voiceprint registration for a registered user in an embodiment of the present invention;
Fig. 6 is a flow diagram of the voiceprint recognition method in another embodiment of the present invention;
Fig. 7 is a structural diagram of a voiceprint recognition device in an embodiment of the present invention;
Fig. 8 is a structural diagram of the voice segment recognition module in an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Embodiments of the present invention provide a voiceprint recognition method and device. The method and device can be applied in any scenario or equipment that needs to identify a user of unknown identity. The characters in the character string used for voiceprint recognition may be Arabic numerals, English letters, characters of other languages, and so on. To simplify the description, the embodiments of the present invention use Arabic numerals as the example.
The voiceprint recognition method in the embodiments of the present invention can be divided into two stages, as shown in Fig. 1:
1) Voiceprint registration stage of a registered user
In the voiceprint registration stage, the registered user reads a registration character string aloud (the second character string referred to hereinafter). The voiceprint recognition device captures the registration voice information produced while the registered user reads this string, then performs speech recognition on the registration voice information to obtain the voice segments it contains that correspond to the respective characters of the registration character string. It then performs voiceprint feature extraction and voiceprint model training on the voice segment corresponding to each character: from the voiceprint feature of each character's voice segment, combined with the preset universal background model for that character (Universal Background Model, UBM, i.e. GMM-UBM), it trains the feature vector corresponding to each character in the registration voice information. The device can then store in its model library, for each registered user, the feature vectors corresponding to the characters in the registration voice information that user read aloud during the registration stage.
For example, if the registration character string is the digit string 0185851, which contains the four digits "0", "1", "5" and "8", the voiceprint recognition device performs voiceprint feature extraction and voiceprint model training on the voice segment corresponding to each character in the registration voice information, obtaining the voiceprint features of the voice segments corresponding to "0", "1", "5" and "8". It then trains, in combination with the preset UBM for each of those characters, the feature vector corresponding to each character in the registration voice information: a feature vector for the digit "0", one for the digit "1", one for the digit "5" and one for the digit "8".
2) Identification stage of a verification user
In the identification stage, the verification user, that is, a user of unknown identity, reads a verification character string aloud (the first character string referred to hereinafter; the second character string and the first character string have at least one character in common). The voiceprint recognition device captures the verification voice information produced while the verification user reads this string, then performs speech recognition on it to obtain the voice segments it contains that correspond to the respective characters of the verification character string. It then performs voiceprint feature extraction and voiceprint model training on the voice segment corresponding to each character: from the voiceprint feature of each character's voice segment, combined with the preset UBM for that character, it trains the feature vector corresponding to each character in the verification voice information. Finally it calculates the similarity score between the feature vector of each character in the verification voice information and the feature vector of the same character in the preset registration voice information; if the similarity score reaches the preset verification threshold, the verification user is determined to be the registered user to whom the registration voice information belongs.
For example, if the verification character string is the digit string 85851510, the voiceprint recognition device performs voiceprint feature extraction and voiceprint model training on the voice segment corresponding to each character in the verification voice information produced when the verification user reads it aloud, obtaining the GMMs corresponding to "0", "1", "5" and "8". Combined with the preset UBM for each of those characters, it then computes the feature vectors of the verification user's verification voice information: a feature vector for the digit "0", one for the digit "1", one for the digit "5" and one for the digit "8". It then calculates the similarity scores between the feature vectors for "0", "1", "5" and "8" in the verification voice information and the feature vectors for "0", "1", "5" and "8" in the registration voice information, respectively. If the similarity score reaches the preset verification threshold, the verification user is determined to be the registered user to whom the registration voice information belongs.
It should be noted that the voiceprint registration stage of the registered user and the identification stage of the verification user may be carried out in the same equipment or device, or in different ones. For example, the voiceprint registration stage may be carried out in a first device, which then sends the feature vectors corresponding to the characters in the registration voice information to a second device, so that the identification stage of the verification user can be carried out in the second device.
The two stages are described in detail below through specific embodiments.
Fig. 2 is a flow diagram of a voiceprint recognition method in an embodiment of the present invention. As shown in the figure, the voiceprint recognition flow in this embodiment may include:
S201: obtain verification voice information produced when a verification user reads a first character string aloud.
The verification user is a user of unknown identity whose identity needs to be verified by the voiceprint recognition device. The first character string is the string used to authenticate the verification user. It may be randomly generated, or it may be a preset fixed string, for example a string that is at least partly identical to the previously generated second character string corresponding to the registration voice information. Specifically, the character string may contain m characters, of which n are mutually distinct, where m and n are positive integers and m ≥ n.
For example, the first character string "12358948" has 8 characters in total and contains 7 mutually distinct characters: "1", "2", "3", "4", "5", "8" and "9".
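The relationship between m, n and the characters shared by the two strings can be checked with a few lines of Python. This is only an illustrative sketch; the function names `describe_prompt` and `shared_characters` are hypothetical and do not come from the patent.

```python
def describe_prompt(prompt):
    """Return (m, n): total number of characters and number of distinct ones."""
    return len(prompt), len(set(prompt))


def shared_characters(first_string, second_string):
    """Characters present in both strings, i.e. those that can be compared."""
    return set(first_string) & set(second_string)


m, n = describe_prompt("12358948")
print(m, n)  # 8 7

# Verification and registration strings taken from the stage overview.
print(sorted(shared_characters("85851510", "0185851")))  # ['0', '1', '5', '8']
```

Only characters in the shared set can contribute a similarity score later, which is why the first and second character strings need at least one character in common.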
In an optional embodiment, the voiceprint recognition device may generate and display the first character string, so that the verification user reads the displayed string aloud.
S202: perform speech recognition on the verification voice information to obtain the voice segments it contains that correspond to the respective characters of the first character string.
As shown in Fig. 3, the voiceprint recognition device may use speech recognition and sound intensity filtering to divide the verification voice information into the voice segments corresponding to multiple characters. Optionally, invalid voice segments may be discarded so that they take no part in subsequent processing.
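One way to picture this step is sketched below in Python. It assumes a recognizer has already produced a time alignment as `(character, start_seconds, end_seconds)` tuples and that a per-frame energy track is available; that input format, the 10 ms frame shift, the energy threshold and the function name `split_by_character` are all assumptions made for illustration, not details given in the patent.

```python
from collections import defaultdict


def split_by_character(alignment, frame_energy, frame_shift_s=0.01,
                       min_energy=0.1, allowed=None):
    """Group aligned segments by character, dropping invalid ones.

    alignment: list of (char, start_s, end_s) from a hypothetical recognizer.
    frame_energy: one energy value per frame, frame_shift_s seconds apart.
    allowed: optional set of characters that appear in the prompted string.
    """
    segments = defaultdict(list)
    for char, start_s, end_s in alignment:
        if allowed is not None and char not in allowed:
            continue  # recognized character is not part of the prompt
        first = int(round(start_s / frame_shift_s))
        last = int(round(end_s / frame_shift_s))
        frames = frame_energy[first:last]
        if not frames:
            continue  # empty segment
        if sum(frames) / len(frames) < min_energy:
            continue  # too quiet: treated here as an invalid segment
        segments[char].append((first, last))
    return dict(segments)
```

The result maps each character to the frame ranges where it was spoken, so every occurrence of a character stays available for feature extraction.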
S203: extract the voiceprint feature of the voice segment corresponding to each character.
Specifically, the voiceprint recognition device may extract Mel-frequency cepstral coefficients (MFCC) or perceptual linear prediction (PLP) coefficients from the voice segment corresponding to each character and use them as the voiceprint feature of that voice segment.
S204: from the voiceprint feature of the voice segment corresponding to each character, combined with the preset universal background model for that character, train the feature vector corresponding to each character in the verification voice information.
The universal background model (UBM) in the embodiments of the present invention is a Gaussian mixture model trained jointly on speech segments of one particular digit spoken by a large number of speakers. It characterizes the distribution of that digit's speech in feature space. Because the training data comes from a large number of speakers, the model does not represent any specific speaker and is identity-independent, so it can be regarded as a universal background model. Illustratively, a UBM may be trained on speech samples from more than 1000 speakers, totaling more than 20 hours, in which the characters occur with roughly balanced frequency. The mathematical expression of the UBM is:
$$P(x) = \sum_{i=1}^{C} a_i \, N(x \mid \mu_i, \Sigma_i) \qquad (1)$$
where P(x) denotes the probability distribution of the UBM, C is the number of Gaussian components in the UBM (the sum runs over all C of them), a_i is the weight of the i-th Gaussian component, μ_i is the mean of the i-th Gaussian component, Σ_i is the variance of the i-th Gaussian component, N(·) denotes a Gaussian distribution, and x denotes a sample, that is, an input voiceprint feature.
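Formula (1) can be evaluated directly, as the following minimal Python sketch shows. It assumes diagonal covariances, which is a common simplification but is not stated in the patent, and the function names are hypothetical.

```python
import math


def gaussian_diag(x, mean, var):
    """Density at x of a Gaussian with diagonal covariance (per-dimension var)."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def ubm_density(x, weights, means, variances):
    """Formula (1): P(x) = sum over i of a_i * N(x | mu_i, Sigma_i)."""
    return sum(a * gaussian_diag(x, mu, var)
               for a, mu, var in zip(weights, means, variances))
```

With a single one-dimensional component of weight 1, mean 0 and variance 1, `ubm_density([0.0], [1.0], [[0.0]], [[1.0]])` reduces to the standard normal density at zero, 1/sqrt(2π).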
The voiceprint recognition device may use the voiceprint features of the voice segment corresponding to each character in the verification voice information as training sample data and apply the maximum a posteriori (MAP) algorithm to adjust the parameters of the preset universal background model for that character. That is, after the voiceprint features of each character's voice segment are substituted into formula (1) as input samples, the parameters of the preset universal background model for that character are adjusted iteratively so that the posterior probability P(x) is maximized, and the feature vector corresponding to that character in the verification voice information is determined from the parameters that maximize P(x).
Because a large body of experiments and papers has shown that the means of the Gaussian components in a UBM can be used to distinguish speaker identity, we define the mean supervector of the UBM as:
$$\begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_C \end{bmatrix}$$
Thus, the voiceprint recognition device may use the voiceprint features of the voice segment corresponding to each character in the verification voice information as training sample data and apply the maximum a posteriori (MAP) algorithm to adjust the mean supervector of the preset universal background model for that character. That is, after the voiceprint features of each character's voice segment are substituted into formula (1) as input samples, the mean supervector is adjusted iteratively so that the posterior probability P(x) is maximized, and the mean supervector that maximizes P(x) is taken as the feature vector corresponding to that character in the verification voice information.
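The patent names MAP adaptation but does not give the update rule. A widely used variant is relevance-factor MAP adaptation of the component means, sketched below in Python. The diagonal covariances, the relevance factor of 16, the single adaptation pass and the function names are assumptions of this sketch, not details from the source.

```python
import math


def _log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Adapt the UBM means to the frames; return the mean supervector.

    The supervector is the adapted component means concatenated in order.
    """
    C, D = len(means), len(means[0])
    n = [0.0] * C                        # zero-order statistics per component
    f = [[0.0] * D for _ in range(C)]    # first-order statistics per component
    for x in frames:
        logs = [math.log(weights[c]) + _log_gauss(x, means[c], variances[c])
                for c in range(C)]
        top = max(logs)
        post = [math.exp(value - top) for value in logs]
        total = sum(post)
        for c in range(C):
            gamma = post[c] / total      # posterior of component c for frame x
            n[c] += gamma
            for d in range(D):
                f[c][d] += gamma * x[d]
    supervector = []
    for c in range(C):
        alpha = n[c] / (n[c] + relevance)   # data-dependent mixing weight
        for d in range(D):
            expected = f[c][d] / n[c] if n[c] > 0 else means[c][d]
            supervector.append(alpha * expected + (1.0 - alpha) * means[c][d])
    return supervector
```

Components that see little data stay close to the UBM means, while components with many assigned frames move toward the observed average. That is the property that makes the adapted supervector usable as a per-character speaker representation.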
In another optional embodiment, to mitigate the slow convergence caused by the high dimensionality of the supervector, we use probabilistic principal component analysis (PPCA) to confine the variation of the mean supervector to a subspace. The voiceprint recognition device may use the voiceprint features of the voice segment corresponding to each character in the verification voice information as training sample data, apply the maximum a posteriori algorithm to adjust the mean supervector of the preset universal background model for that character, and combine this with a preset supervector subspace matrix to obtain the feature vector corresponding to each character in the verification voice information. In a specific implementation, the following formula may be used to adjust the mean supervector of the preset universal background model for a character so that the posterior probability of the adjusted model is maximized:
$$M = m + T\omega$$
where M denotes the mean supervector of the universal background model of a given character after adjustment, m denotes the mean supervector of the universal background model of that character before adjustment, T is the preset supervector subspace matrix, and ω is the feature vector corresponding to that character in the verification voice information. That is, after the voiceprint features of each character's voice segment in the verification voice information are substituted into formula (1) as input samples, adjusting ω iteratively adjusts the mean supervector in formula (1) so that the posterior probability P(x) is maximized, and the ω that maximizes P(x) is taken as the feature vector corresponding to that character in the verification voice information. The supervector subspace matrix T is determined from the correlations between the dimensions of the mean supervector of the Gaussian mixture model.
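The patent does not state how ω is computed. For models of the form M = m + Tω, a closed-form estimate commonly used in the i-vector literature is given below as a hedged illustration. The frame posteriors γ_t(c), the statistics N_c and F̃_c, and the choice of this estimator are assumptions, not something the source specifies.

```latex
\begin{aligned}
\gamma_t(c) &= \frac{a_c \, N(x_t \mid \mu_c, \Sigma_c)}
                    {\sum_{k=1}^{C} a_k \, N(x_t \mid \mu_k, \Sigma_k)}, \\
N_c &= \sum_t \gamma_t(c), \qquad
\tilde{F}_c = \sum_t \gamma_t(c)\,(x_t - \mu_c), \\
\omega &= \left( I + T^{\top} \Sigma^{-1} N \, T \right)^{-1}
          T^{\top} \Sigma^{-1} \tilde{F}.
\end{aligned}
```

Here x_t are the voiceprint feature frames of one character's voice segments, μ_c and Σ_c are the UBM component parameters from formula (1) (the μ_c are the blocks of m), N is the block-diagonal matrix whose c-th block is N_c times the identity, Σ is the block-diagonal matrix of the Σ_c, and F̃ stacks the F̃_c into one supervector.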
S205: calculate the similarity score between the feature vector of each character in the verification voice information and the feature vector of the same character in the preset registration voice information; if the similarity score reaches the preset verification threshold, determine that the verification user is the registered user to whom the registration voice information belongs.
Specifically, the voiceprint recognition device may obtain the registration voice information of a registered user during the voiceprint registration stage and, through voiceprint feature extraction and voiceprint model training similar to those of this embodiment, obtain the feature vector corresponding to the voice segment of each character in the registration voice information. The registration voice information may be voice information obtained by the voiceprint recognition device when the registered user reads a second character string aloud; the second character string and the first character string have at least one character in common, that is, the second character string corresponding to the registration voice information is at least partly identical to the first character string. In an optional embodiment, the voiceprint recognition device may also obtain the feature vectors of the characters in the registration voice information from an external source: after the registered user records the registration voice information on another device, that device or a server obtains the feature vector corresponding to the voice segment of each character through voiceprint feature extraction and voiceprint model training, and the voiceprint recognition device retrieves those feature vectors from the other device or server, so that in the identification stage they can be compared with the feature vectors of the characters in the verification voice information.
In a specific implementation, the similarity score is a value that measures how similar the two feature vectors of the same character are, obtained when the voiceprint recognition device compares the feature vector of each character in the verification voice information with the feature vector of the same character in the preset registration voice information. In an optional embodiment, the cosine distance between the feature vector of each character in the verification voice information and the feature vector of the same character in the preset registration voice information may be used as the similarity score. That is, the similarity score between the feature vector of a character in the verification voice information and its feature vector in the registration voice information is calculated by the following formula:
$$\mathrm{score} = \frac{\omega_i(tar)^{T} \, \omega_i(test)}{\lVert \omega_i(tar) \rVert \, \lVert \omega_i(test) \rVert}$$
where the subscript i denotes the i-th character shared by the verification voice information and the registration voice information, ω_i(tar) denotes the feature vector of that character in the verification voice information, and ω_i(test) denotes the feature vector of that character in the registration voice information. If the verification voice information and the registration voice information share several characters, the similarity scores calculated for each of those characters by the formula above may be averaged; if the average similarity score reaches the corresponding preset verification threshold, the verification user is determined to be the registered user to whom the registration voice information belongs. If there are several registered users, such as registered users A, B and C shown in Fig. 1, the feature vector of a given character of the verification user may be compared with the feature vector of the same character of each registered user. The registered user whose feature vector for that character has the highest similarity score against the feature vector of that character in the verification speech, provided the score reaches the preset verification threshold, is taken as the identification result for the verification user.
In an optional embodiment, if the same character occurs more than once in the verification voice information, for example 0, 1, 5 and 8 each occurring twice in the verification voice information shown in Fig. 2, then the feature vectors obtained from the voice segments of the two occurrences of character 0 may each be scored against the feature vector of character 0 in the preset registration voice information, and the mean of those similarity scores is used as the similarity score between the feature vector of character 0 in the verification voice information and the feature vector of character 0 in the preset registration voice information, and so on for the other characters.
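The scoring rules described above (cosine score per character, averaging over repeated occurrences and over shared characters, thresholding, and choosing among several registered users) can be sketched in Python as follows. The data layout, in which the probe maps each character to a list of feature vectors and each enrollment maps a character to a single vector, and all function names are assumptions made for illustration.

```python
import math


def cosine_score(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def character_score(occurrences, enrolled):
    """Average score over every occurrence of one character in the probe."""
    return sum(cosine_score(v, enrolled) for v in occurrences) / len(occurrences)


def mean_score(probe, enrollment):
    """Average the per-character scores over characters both sides share."""
    shared = [c for c in probe if c in enrollment and probe[c]]
    if not shared:
        return None
    return sum(character_score(probe[c], enrollment[c]) for c in shared) / len(shared)


def verify(probe, enrollment, threshold):
    """Accept the claimed identity if the mean score reaches the threshold."""
    score = mean_score(probe, enrollment)
    return score is not None and score >= threshold


def identify(probe, model_library, threshold):
    """Return the best-scoring registered user, or None if below threshold."""
    best_user, best_score = None, None
    for user, enrollment in model_library.items():
        score = mean_score(probe, enrollment)
        if score is not None and (best_score is None or score > best_score):
            best_user, best_score = user, score
    if best_score is not None and best_score >= threshold:
        return best_user
    return None
```

Because the cosine score is normalized by vector length, it depends only on the direction of the feature vectors, so a single threshold can be applied across characters. The threshold value itself has to be tuned on held-out data, and the patent does not give one.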
It should be noted that there are many other ways to measure the similarity between two feature vectors; the above is only one embodiment provided by the present invention. On the basis of the scheme disclosed herein, those skilled in the art can obtain, without creative effort, further ways of calculating the similarity score between the feature vectors of the characters common to the verification voice information and the registration voice information, and the present invention need not enumerate them exhaustively.
Thus, in this embodiment, the voiceprint feature of the speech segment corresponding to each character in the verification user's verification voice information is obtained, and the feature vector corresponding to each character in the verification voice information is obtained by training in combination with the preset UBM of the respective character. The feature vector corresponding to each character in the verification voice information is then compared for similarity with the feature vector of the respective character in the registration voice information to determine the identity of the verification user. Because the user feature vectors compared in this way correspond to specific characters, the voiceprint characteristics of the user when reading different characters aloud are fully taken into account, which effectively improves the accuracy of voiceprint recognition.
Fig. 5 is a schematic flowchart of voiceprint registration for a registered user in an embodiment of the present invention. As shown in the figure, the voiceprint registration flow in this embodiment may include:
S501: obtain registration voice information produced by a registered user reading a second character string aloud, where the second character string and the first character string have at least one character in common.
The registered user is a user whose legal identity has been determined. The second character string is the character string used to collect the registered user's voiceprint feature vectors; it may be randomly generated or may be a preset fixed character string. Specifically, the second character string may also contain m characters, of which n are mutually distinct, where m and n are positive integers and m ≥ n.
In an alternative embodiment, the voiceprint recognition device may generate and display the second character string so that the registered user reads it aloud as displayed.
S502: perform speech recognition on the registration voice information to obtain the speech segments contained in the registration voice information that correspond respectively to the multiple characters in the second character string.
The voiceprint recognition device may, through speech recognition and sound-intensity filtering, divide the registration voice information into speech segments corresponding to the multiple characters; optionally, invalid speech segments may also be discarded so that they take no part in subsequent processing.
S503: extract the voiceprint feature of the speech segment corresponding to each character in the registration voice information.
Specifically, the voiceprint recognition device may extract MFCCs (Mel Frequency Cepstral Coefficients) or PLP (Perceptual Linear Predictive) coefficients from the speech segment corresponding to each character as the voiceprint feature of that speech segment.
S504: according to the voiceprint feature of the speech segment corresponding to each character in the registration voice information, and in combination with the preset universal background model (UBM) corresponding to the respective character, obtain by training the feature vector corresponding to each character in the registration voice information.
For the expression of the UBM, refer to the embodiments above. This step of the voiceprint registration flow is similar to S204 of the voiceprint recognition flow. The voiceprint recognition device may take the voiceprint feature of the speech segment corresponding to each character in the registration voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the parameters of the preset universal background model corresponding to the respective character. That is, after the voiceprint feature of the speech segment corresponding to each character in the registration voice information is substituted into formula (1) as an input sample, the parameters of the preset universal background model corresponding to the respective character are adjusted repeatedly so that the posterior probability P(x) is maximized, and the feature vector corresponding to the respective character in the registration voice information can then be determined from the parameters that maximize P(x).
Moreover, because the mean of each Gaussian component in the UBM can be used to distinguish the identity of the speaker, the voiceprint recognition device may take the voiceprint feature of the speech segment corresponding to each character in the registration voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the mean supervector of the preset universal background model corresponding to the respective character. That is, after the voiceprint feature of the speech segment corresponding to each character in the registration voice information is substituted into formula (1) as an input sample, the mean supervector is adjusted repeatedly so that the posterior probability P(x) is maximized, and the mean supervector that maximizes P(x) can then be taken as the feature vector corresponding to the respective character in the registration voice information.
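The description does not state the exact update rule used to adjust the means. As an illustration only, the sketch below uses the relevance-MAP mean update that is common in GMM-UBM speaker verification; that choice, the relevance factor, the diagonal covariances and every function name are assumptions rather than details taken from this document.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """One pass of relevance-MAP adaptation of the UBM component means.

    frames: feature frames (e.g. MFCC vectors) of one character's segment.
    weights, means, variances: parameters of that character's UBM.
    """
    n_comp, dim = len(means), len(means[0])
    occupancy = [0.0] * n_comp                        # soft frame count per component
    first_order = [[0.0] * dim for _ in range(n_comp)]  # posterior-weighted frame sums
    for x in frames:
        log_p = [math.log(w) + log_gaussian_diag(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        top = max(log_p)                              # subtract the max for stability
        lik = [math.exp(lp - top) for lp in log_p]
        total = sum(lik)
        for c in range(n_comp):
            gamma = lik[c] / total                    # posterior of component c
            occupancy[c] += gamma
            for d in range(dim):
                first_order[c][d] += gamma * x[d]
    adapted = []
    for c in range(n_comp):
        alpha = occupancy[c] / (occupancy[c] + relevance)
        if occupancy[c] > 0.0:
            data_mean = [s / occupancy[c] for s in first_order[c]]
        else:
            data_mean = means[c]                      # no data: keep the UBM mean
        adapted.append([alpha * e + (1.0 - alpha) * m
                        for e, m in zip(data_mean, means[c])])
    return adapted


# Toy example: a one-component, one-dimensional UBM and four identical frames.
adapted_means = map_adapt_means([[2.0]] * 4, [1.0], [[0.0]], [[1.0]], relevance=4.0)
```

Components that see little data stay close to the UBM mean, while well-observed components move toward the data; stacking the adapted means then gives the per-character feature vector described above.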
In another alternative embodiment, the following formula may be used to adjust the mean supervector of the preset universal background model corresponding to the respective character, so that the posterior probability of the adjusted universal background model is maximized:
M = m + Tω, where M denotes the mean supervector of the universal background model of a given character after adjustment, m denotes the mean supervector of the universal background model of that character before adjustment, T is a preset supervector subspace matrix, and ω is the feature vector corresponding to the respective character in the registration voice information. That is, after the voiceprint feature of the speech segment corresponding to each character in the registration voice information is substituted into formula (1) as an input sample, the mean supervector in formula (1) can be adjusted by repeatedly adjusting ω so that the posterior probability P(x) is maximized, and the ω that maximizes P(x) can then be taken as the feature vector corresponding to the respective character in the registration voice information.
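To make the relation M = m + Tω concrete, the following sketch evaluates it for small toy values. The dimensions and numbers are assumptions chosen for illustration; in practice the supervector is far longer than ω, which is why searching over ω is cheaper than searching over M directly.

```python
def adjusted_supervector(m, T, omega):
    """Return M = m + T * omega, with T given as a list of rows."""
    return [m_i + sum(t_ij * w_j for t_ij, w_j in zip(row, omega))
            for m_i, row in zip(m, T)]


# Toy sizes: a 4-dimensional supervector driven by a 2-dimensional omega.
m = [0.0, 1.0, 2.0, 3.0]
T = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0],
     [0.5, -0.5]]
omega = [2.0, -1.0]
M = adjusted_supervector(m, T, omega)
```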
Fig. 6 is a schematic flowchart of a voiceprint recognition method in another embodiment of the present invention. As shown in the figure, the voiceprint recognition method in this embodiment may include the following flow:
S601: randomly generate a first character string and display it.
S602: obtain verification voice information produced by a verification user reading the first character string aloud.
S603: identify the valid speech segments and the invalid speech segments in the verification voice information.
Specifically, the verification voice may be divided according to sound intensity, and speech segments with low sound intensity are regarded as invalid speech segments (for example, silent segments and impulse noise).
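Dividing the voice by sound intensity can be sketched with a simple frame-energy threshold. This is an assumed, simplified stand-in for the step described above: the frame length, the threshold and the function name are hypothetical, and handling of impulse noise would need additional logic not shown here.

```python
def split_by_intensity(samples, frame_len, threshold):
    """Return (start, end) sample ranges of runs of frames whose mean energy
    reaches the threshold; everything outside these ranges is treated as an
    invalid (low-intensity) segment."""
    segments = []
    start = None
    n_frames = len(samples) // frame_len
    for k in range(n_frames):
        frame = samples[k * frame_len:(k + 1) * frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy >= threshold:
            if start is None:
                start = k * frame_len          # a valid run begins here
        elif start is not None:
            segments.append((start, k * frame_len))
            start = None
    if start is not None:                      # close a run that reaches the end
        segments.append((start, n_frames * frame_len))
    return segments


# Toy signal: silence, a loud burst, silence, a second burst.
signal = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] * 2 + [0.0] * 4 + [0.4] * 4
valid_segments = split_by_intensity(signal, frame_len=4, threshold=0.01)
```

Only the returned ranges would be passed on to speech recognition in S604.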
S604: perform speech recognition on the valid speech segments to obtain the speech segments corresponding respectively to the multiple characters in the first character string.
Through speech recognition, the speech segments corresponding respectively to the multiple characters in the first character string can be obtained.
S605: determine that the order of the speech segments of the multiple characters in the verification voice information is consistent with the order of the corresponding characters in the first character string.
To effectively prevent a registered user's voice information from being used for voiceprint recognition after it has been illegally recorded or copied, a different first character string may be randomly generated each time, and this step judges whether the order of the speech segments of the multiple characters in the verification voice information is consistent with the order of the corresponding characters in the first character string. If the orders are inconsistent, voiceprint recognition may be determined to have failed; if they are consistent, the subsequent flow is performed.
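The random challenge and the order check can be sketched as below. The digit alphabet, the default length of 8 and the function names are assumptions for illustration; the recognized character sequence is taken to be the output of the speech recognition step.

```python
import secrets


def generate_challenge(length=8, alphabet="0123456789"):
    """Randomly generate the character string the user is asked to read."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


def order_matches(recognized_chars, challenge):
    """True only if the recognized characters appear in exactly the order
    of the displayed character string."""
    return list(recognized_chars) == list(challenge)


challenge = generate_challenge()
replay_rejected = not order_matches("12359848", "12358948")
```

Because a fresh string is generated for each attempt, a recording of an earlier session is unlikely to reproduce the required order.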
S606: extract the voiceprint feature of the speech segment corresponding to each character.
Specifically, the voiceprint recognition device may extract MFCCs (Mel Frequency Cepstral Coefficients) or PLP (Perceptual Linear Predictive) coefficients from the speech segment corresponding to each character as the voiceprint feature of that speech segment.
S607: take the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, and use the maximum a posteriori algorithm to adjust the mean supervector of the preset universal background model corresponding to the respective character, thereby estimating the feature vector corresponding to each character in the verification voice information.
Because a large body of experiments and papers has shown that the mean of each Gaussian component in the UBM can be used to distinguish the identity of the speaker, the voiceprint recognition device may take the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the mean supervector of the preset universal background model corresponding to the respective character. That is, after the voiceprint feature of the speech segment corresponding to each character in the verification voice information is substituted into formula (1) as an input sample, the mean supervector is adjusted repeatedly so that the posterior probability P(x) is maximized, and the mean supervector that maximizes P(x) can then be taken as the feature vector corresponding to the respective character in the verification voice information.
In another alternative embodiment, to mitigate the slow convergence caused by the high dimensionality of the supervector, the voiceprint recognition device may use the following formula to adjust the mean supervector of the preset universal background model corresponding to the respective character, so that the posterior probability of the adjusted universal background model is maximized:
M = m + Tω, where M denotes the mean supervector of the universal background model of a given character after adjustment, m denotes the mean supervector of the universal background model of that character before adjustment, T is a preset supervector subspace matrix, and ω is the feature vector corresponding to the respective character in the verification voice information. That is, after the voiceprint feature of the speech segment corresponding to each character in the verification voice information is substituted into formula (1) as an input sample, the mean supervector in formula (1) can be adjusted by repeatedly adjusting ω so that the posterior probability P(x) is maximized, and the ω that maximizes P(x) can then be taken as the feature vector corresponding to the respective character in the verification voice information.
S608: calculate the similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information; if the similarity score reaches the preset verification threshold, determine the verification user to be the registered user corresponding to the registration voice information.
In this embodiment, the voiceprint recognition device may calculate the cosine distance between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information as the similarity score. That is, the similarity score between the feature vector of a given character in the verification voice information and its feature vector in the registration voice information is calculated by the following formula:
$$\mathrm{score} = \frac{\omega_i(\mathrm{tar})^{T}\,\omega_i(\mathrm{test})}{\lVert \omega_i(\mathrm{tar}) \rVert \, \lVert \omega_i(\mathrm{test}) \rVert}$$
Here, the subscript i denotes the i-th character common to the verification voice information and the registration voice information, ω_i(tar) denotes the feature vector corresponding to this character in the verification voice information, and ω_i(test) denotes the feature vector corresponding to this character in the registration voice information. If the verification voice information and the registration voice information have multiple characters in common, the similarity scores calculated for the individual characters by the above formula may be averaged; if the average similarity score reaches the corresponding preset verification threshold, the verification user is determined to be the registered user corresponding to the registration voice information. If there are multiple registered users, such as registered users A, B and C shown in Fig. 1, the feature vector of a given character of the verification user may be compared with the feature vector of the corresponding character of each registered user; when the feature vector of a registered user's corresponding character has the highest similarity score with the feature vector of that character in the verification voice and the score reaches the preset verification threshold, that registered user is taken as the identification result for the verification user.
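The scoring and threshold decision for a single registered user can be sketched as follows. The dictionary layout (character mapped to feature vector), the threshold value and the function names are assumptions made for this illustration.

```python
import math


def cosine_score(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def verify(verification, registration, threshold):
    """Score every character common to both utterances, average the scores,
    and accept if the average reaches the threshold.

    verification, registration: dicts mapping a character to its feature vector.
    Returns (accepted, average_score); average_score is None if no character is shared.
    """
    shared = sorted(set(verification) & set(registration))
    if not shared:
        return False, None
    scores = [cosine_score(verification[c], registration[c]) for c in shared]
    average = sum(scores) / len(scores)
    return average >= threshold, average


verification = {"1": [1.0, 0.0], "2": [0.0, 2.0], "9": [1.0, 1.0]}
registration = {"1": [2.0, 0.0], "2": [0.0, 1.0], "3": [1.0, 0.0]}
accepted, average = verify(verification, registration, threshold=0.8)
```

Characters that appear in only one of the two utterances ("9" and "3" here) are ignored, which matches the requirement that only common characters are compared.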
Thus, in this embodiment, the feature vector corresponding to each character in the verification voice information is compared for similarity with the feature vector of the respective character in the registration voice information, and the order of the speech segments is also checked, which further ensures that the identity of the verification user is determined accurately.
Fig. 7 is a schematic structural diagram of a voiceprint recognition device in an embodiment of the present invention. As shown in the figure, the voiceprint recognition device in this embodiment may include:
A voice acquisition module 710, configured to obtain verification voice information produced by a verification user reading a first character string aloud.
The verification user is a user of unknown identity whose identity needs to be verified by the voiceprint recognition device. The first character string is the character string used for the verification user's identity verification; it may be randomly generated or may be a preset fixed character string, for example a character string at least partly identical to the second character string corresponding to previously generated registration voice information. Specifically, the character string may contain m characters, of which n are mutually distinct, where m and n are positive integers and m ≥ n.
For example, the first character string is "12358948", which has 8 characters in total and contains 7 mutually distinct characters: "1", "2", "3", "4", "5", "8" and "9".
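The values of m and n for this example string can be checked with a few lines of code. The variable names are illustrative only.

```python
from collections import Counter

first_string = "12358948"
m = len(first_string)                       # total number of characters
counts = Counter(first_string)
distinct = sorted(counts)                   # the mutually distinct characters
n = len(distinct)
repeated = sorted(c for c, k in counts.items() if k > 1)
```

Here the character "8" is the only one spoken twice, so it is the only character whose score would be averaged over two occurrences.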
A speech segment recognition module 720, configured to perform speech recognition on the verification voice information to obtain the speech segments contained in the verification voice information that correspond respectively to the multiple characters in the first character string.
As shown in Fig. 3, the speech segment recognition module 720 may, through speech recognition and sound-intensity filtering, divide the verification voice information into speech segments corresponding to the multiple characters; optionally, invalid speech segments may also be discarded so that they take no part in subsequent processing.
In an alternative embodiment, as shown in Fig. 8, the speech segment recognition module may further include:
A valid segment recognition unit 721, configured to identify the valid speech segments and the invalid speech segments in the verification voice information.
Specifically, the valid segment recognition unit 721 may divide the verification voice according to sound intensity and regard speech segments with low sound intensity as invalid speech segments (for example, silent segments and impulse noise).
A speech recognition unit 722, configured to perform speech recognition on the valid speech segments to obtain the speech segments corresponding respectively to the multiple characters in the first character string.
A voiceprint feature extraction module 730, configured to extract the voiceprint feature of the speech segment corresponding to each character in the verification voice information.
Specifically, the voiceprint feature extraction module 730 may extract MFCCs (Mel Frequency Cepstral Coefficients) or PLP (Perceptual Linear Predictive) coefficients from the speech segment corresponding to each character as the voiceprint feature of that speech segment.
A feature model training module 740, configured to obtain by training, according to the voiceprint feature of the speech segment corresponding to each character and in combination with the preset universal background model corresponding to the respective character, the feature vector corresponding to each character in the verification voice information.
The feature model training module 740 may take the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the parameters of the preset universal background model corresponding to the respective character. That is, after the voiceprint feature of the speech segment corresponding to each character in the verification voice information is substituted into formula (1) as an input sample, the parameters of the preset universal background model corresponding to the respective character are adjusted repeatedly so that the posterior probability P(x) is maximized, and the feature model training module 740 can then determine the feature vector corresponding to the respective character in the verification voice information from the parameters that maximize P(x).
Because a large body of experiments and papers has shown that the mean of each Gaussian component in the UBM can be used to distinguish the identity of the speaker, we define the mean supervector of the UBM as:
$$\begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_C \end{bmatrix}$$
Thus, the feature model training module 740 may take the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data and use the maximum a posteriori (MAP) algorithm to adjust the mean supervector of the preset universal background model corresponding to the respective character. That is, after the voiceprint feature of the speech segment corresponding to each character in the verification voice information is substituted into formula (1) as an input sample, the mean supervector is adjusted repeatedly so that the posterior probability P(x) is maximized, and the feature model training module 740 can take the mean supervector that maximizes P(x) as the feature vector corresponding to the respective character in the verification voice information.
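The mean supervector defined above is simply the C component means stacked into one long vector. A minimal sketch, with hypothetical names and toy values (C = 3 components, 2 feature dimensions each):

```python
def mean_supervector(component_means):
    """Stack the mean vectors of the C Gaussian components into one vector."""
    return [value for mean in component_means for value in mean]


# Toy UBM with C = 3 components and 2-dimensional features.
component_means = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
supervector = mean_supervector(component_means)
```

With C components and F-dimensional features the supervector has C × F entries, which is the high dimensionality that the subspace formulation discussed next is meant to reduce.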
In another alternative embodiment, to mitigate the slow convergence caused by the high dimensionality of the supervector, we use probabilistic principal component analysis (PPCA) to restrict the range of variation of the mean supervector to a subspace. The feature model training module 740 may take the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, use the maximum a posteriori algorithm to adjust the mean supervector of the preset universal background model corresponding to the respective character, and combine this with the preset supervector subspace matrix to obtain the feature vector corresponding to each character in the verification voice information. In a specific implementation, the feature model training module 740 may use the following formula to adjust the mean supervector of the preset universal background model corresponding to the respective character, so that the posterior probability of the adjusted universal background model is maximized:
M = m + Tω, where M denotes the mean supervector of the universal background model of a given character after adjustment, m denotes the mean supervector of the universal background model of that character before adjustment, T is a preset supervector subspace matrix, and ω is the feature vector corresponding to the respective character in the verification voice information. That is, after the voiceprint feature of the speech segment corresponding to each character in the verification voice information is substituted into formula (1) as an input sample, the mean supervector in formula (1) can be adjusted by repeatedly adjusting ω so that the posterior probability P(x) is maximized, and the ω that maximizes P(x) can then be taken as the feature vector corresponding to the respective character in the verification voice information. The supervector subspace matrix T is determined from the correlations between the dimensions of the mean supervector of the Gaussian mixture model.
A similarity judgment module 750, configured to calculate the similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information.
Specifically, the voiceprint recognition device may obtain the registration voice information of a registered user in the voiceprint registration stage and, through the speech segment recognition module 720, the voiceprint feature extraction module 730 and the feature model training module 740, obtain the feature vector corresponding to the speech segment of each character in the registration voice information. The registration voice information may be registration voice information that the voiceprint recognition device obtains when a registered user reads a second character string aloud, where the second character string and the first character string have at least one character in common; that is, the second character string corresponding to the registration voice information is at least partly identical to the first character string. In a further alternative embodiment, the voiceprint recognition device may also obtain the feature vectors corresponding to the respective characters in the registration voice information from an external source. That is, after the registered user enters registration voice information through another device, that device or a server obtains the feature vector corresponding to the speech segment of each character in the registration voice information through voiceprint feature extraction and voiceprint model training, and the voiceprint recognition device obtains these feature vectors from that device or server, so that in the identification stage the similarity judgment module 750 can compare them with the feature vector corresponding to each character in the verification voice information.
In a specific implementation, the similarity score is a value obtained after the voiceprint recognition device compares the feature vector corresponding to each character in the verification voice information with the feature vector corresponding to the respective character in the preset registration voice information; it measures the degree of similarity between the two feature vectors of the same character. In an alternative embodiment, the similarity judgment module 750 may calculate the cosine distance between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information as the similarity score. That is, the similarity score between the feature vector of a given character in the verification voice information and its feature vector in the registration voice information is calculated by the following formula:
$$\mathrm{score} = \frac{\omega_i(\mathrm{tar})^{T}\,\omega_i(\mathrm{test})}{\lVert \omega_i(\mathrm{tar}) \rVert \, \lVert \omega_i(\mathrm{test}) \rVert}$$
Here, the subscript i denotes the i-th character common to the verification voice information and the registration voice information, ω_i(tar) denotes the feature vector corresponding to this character in the verification voice information, and ω_i(test) denotes the feature vector corresponding to this character in the registration voice information. In an alternative embodiment, if the same character occurs more than once in the verification voice information, for example the characters 0, 1, 5 and 8 each occur twice in the verification voice information shown in Fig. 2, then the feature vectors obtained from the two speech segments of character 0 are each scored against the feature vector of character 0 in the preset registration voice information, and the mean of these similarity scores is taken as the similarity score between the feature vector of character 0 in the verification voice information and the feature vector of character 0 in the preset registration voice information, and so on for the other characters.
It should be noted that there are many other ways to measure the similarity between two feature vectors; the above is only one embodiment provided by the present invention. On the basis of the scheme disclosed herein, those skilled in the art can obtain, without creative effort, further ways of calculating the similarity score between the feature vectors of the characters common to the verification voice information and the registration voice information, and the present invention need not enumerate them exhaustively.
A user identification module 760, configured to determine the verification user to be the registered user corresponding to the registration voice information if the similarity score reaches the preset verification threshold.
If the verification voice information and the registration voice information have multiple characters in common, the user identification module 760 may average the similarity scores calculated for the individual characters by the similarity judgment module 750; if the average similarity score reaches the corresponding preset verification threshold, the verification user is determined to be the registered user corresponding to the registration voice information. If there are multiple registered users, such as registered users A, B and C shown in Fig. 1, the user identification module 760 may compare the feature vector of a given character of the verification user with the feature vector of the corresponding character of each registered user; when the feature vector of a registered user's corresponding character has the highest similarity score with the feature vector of that character in the verification voice and the score reaches the preset verification threshold, that registered user is taken as the identification result for the verification user.
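Identification among several registered users reduces to choosing the highest score and then applying the threshold. The sketch below assumes the per-user scores have already been computed; the user labels, score values and threshold are made up for illustration.

```python
def identify(scores_by_user, threshold):
    """Return the registered user with the highest similarity score,
    or None if even the best score does not reach the threshold."""
    if not scores_by_user:
        return None
    best_user = max(scores_by_user, key=scores_by_user.get)
    return best_user if scores_by_user[best_user] >= threshold else None


# Hypothetical similarity scores of the verification user against A, B and C.
scores = {"A": 0.41, "B": 0.87, "C": 0.65}
result = identify(scores, threshold=0.8)
```

Returning None when no score reaches the threshold keeps an unregistered speaker from being mapped to whichever registered user happens to score highest.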
In a further alternative embodiment, the voice acquisition module 710 is further configured to obtain registration voice information produced by a registered user reading a second character string aloud, where the second character string and the first character string have at least one character in common;
The speech segment recognition module 720 is further configured to perform speech recognition on the registration voice information to obtain the speech segments contained in the registration voice information that correspond respectively to the multiple characters in the second character string;
The voiceprint feature extraction module 730 is further configured to extract the voiceprint feature of the speech segment corresponding to each character in the registration voice information;
The feature model training module 740 is further configured to obtain by training, according to the voiceprint feature of the speech segment corresponding to each character in the registration voice information and in combination with the preset universal background model corresponding to the respective character, the feature vector corresponding to each character in the registration voice information.
In an alternative embodiment, the voiceprint recognition device may further include:
A character order determination module 770, configured to determine that the order of the speech segments of the multiple characters in the verification voice information is consistent with the order of the corresponding characters in the first character string.
To effectively prevent a registered user's voice information from being used for voiceprint recognition after it has been illegally recorded or copied, a different first character string may be randomly generated each time, and the character order determination module 770 judges whether the order of the speech segments of the multiple characters in the verification voice information is consistent with the order of the corresponding characters in the first character string. If the orders are inconsistent, voiceprint recognition may be determined to have failed; if they are consistent, the voiceprint feature extraction module 730 or the feature model training module 740 may be notified to perform feature extraction and voiceprint training for this verification voice information.
In an alternative embodiment, the voiceprint recognition device may further include:
A character string display module 700, configured to randomly generate the first character string and display it.
Thus, in this embodiment, the voiceprint feature of the speech segment corresponding to each character in the verification user's verification voice information is obtained, and the feature vector corresponding to each character in the verification voice information is obtained by training in combination with the preset UBM of the respective character. The feature vector corresponding to each character in the verification voice information is then compared for similarity with the feature vector of the respective character in the registration voice information to determine the identity of the verification user. Because the user feature vectors compared in this way correspond to specific characters, the voiceprint characteristics of the user when reading different characters aloud are fully taken into account, which effectively improves the accuracy of voiceprint recognition.
In an actual test with training samples from 1,000 speakers and about 290,000 trials (about 10,000 trials with matching identities and about 280,000 trials with non-matching identities), the method achieved a recall of 79.8% at an error rate of one in a thousand and an equal error rate (EER) of 3.39%. Compared with the traditional text-independent modeling method, voiceprint recognition performance improved by more than 40%.
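For readers unfamiliar with the metric, the equal error rate is the operating point at which the false-acceptance rate and the false-rejection rate coincide. The sketch below approximates it by sweeping the decision threshold over the observed scores. The function name is hypothetical, both score lists are assumed to be non-empty, and any scores passed to it are the caller's own data rather than data from the test described above.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    target_scores come from matching-identity trials and impostor_scores
    from non-matching trials. A trial is accepted when score >= threshold.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # False rejection: a matching trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a non-matching trial scored at or above it.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer
```

With finitely many trials the two error rates rarely cross exactly, so the sketch reports the midpoint at the threshold where they are closest.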
A person of ordinary skill in the art will understand that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above disclosure is merely a preferred embodiment of the present invention and certainly cannot be used to limit the scope of rights of the present invention. Equivalent variations made according to the claims of the present invention therefore still fall within the scope covered by the present invention.

Claims (20)

1. A voiceprint recognition method, characterized in that the method comprises:
Obtaining verification voice information produced by a verifying user reading a first character string aloud;
Performing speech recognition on the verification voice information to obtain the speech segments contained in the verification voice information that respectively correspond to multiple characters in the first character string;
Extracting the voiceprint feature of the speech segment corresponding to each character;
Training, according to the voiceprint feature of the speech segment corresponding to each character and in combination with a preset universal background model corresponding to the respective character, to obtain the feature vector corresponding to each character in the verification voice information;
Calculating a similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in preset registration voice information, and, if the similarity score reaches a preset verification threshold, determining the verifying user to be the registered user corresponding to the registration voice information.
2. The voiceprint recognition method of claim 1, characterized in that, before obtaining the verification voice information produced by the verifying user reading the first character string aloud, the method further comprises:
Obtaining registration voice information produced by a registered user reading a second character string aloud, the second character string and the first character string having at least one character in common;
Performing speech recognition on the registration voice information to obtain the speech segments contained in the registration voice information that respectively correspond to multiple characters in the second character string;
Extracting the voiceprint feature of the speech segment corresponding to each character in the registration voice information;
Training, according to the voiceprint feature of the speech segment corresponding to each character in the registration voice information and in combination with the preset universal background model corresponding to the respective character, to obtain the feature vector corresponding to each character in the registration voice information.
3. The voiceprint recognition method of claim 1, characterized in that training, according to the voiceprint feature of the speech segment corresponding to each character and in combination with the preset universal background model corresponding to the respective character, to obtain the feature vector corresponding to each character in the verification voice information comprises:
Using the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, and adjusting the mean supervector of the preset universal background model corresponding to the respective character by a maximum a posteriori (MAP) algorithm, thereby estimating the feature vector corresponding to each character in the verification voice information.
4. The voiceprint recognition method of claim 3, characterized in that using the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, and adjusting the mean supervector of the preset universal background model corresponding to the respective character by a maximum a posteriori algorithm, thereby estimating the feature vector corresponding to each character in the verification voice information comprises:
Using the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, adjusting the mean supervector of the preset universal background model corresponding to the respective character by a maximum a posteriori algorithm, and combining a preset supervector subspace matrix to obtain the feature vector corresponding to each character in the verification voice information.
5. The voiceprint recognition method of claim 4, characterized in that using the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, adjusting the mean supervector of the preset universal background model corresponding to the respective character by a maximum a posteriori algorithm, and combining the preset supervector subspace matrix to obtain the feature vector corresponding to each character in the verification voice information comprises:
Using the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, and adjusting the mean supervector of the preset universal background model corresponding to the respective character by the following formula, so that the posterior probability of the adjusted universal background model corresponding to the respective character is maximized:
M = m + Tω, where M denotes the mean supervector of the universal background model of a given character after adjustment, m denotes the mean supervector of the universal background model of that character before adjustment, T is the preset supervector subspace matrix, and ω is the feature vector corresponding to that character in the verification voice information.
6. The voiceprint recognition method of claim 4, characterized in that the supervector subspace matrix is determined according to the correlation between the weights of the Gaussian components in the universal background model.
7. The voiceprint recognition method of claim 1, characterized in that calculating the similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information comprises:
Calculating the cosine distance value between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information, and using it as the similarity score.
8. The voiceprint recognition method of claim 1, characterized in that performing speech recognition on the verification voice information to obtain the speech segments contained in the verification voice information that respectively correspond to the multiple characters in the first character string comprises:
Identifying the valid speech segments and the invalid speech segments in the verification voice information;
Performing speech recognition on the valid speech segments to obtain the speech segments that respectively correspond to the multiple characters in the first character string.
9. The voiceprint recognition method of claim 1, characterized in that, before determining the verifying user to be the registered user corresponding to the registration voice information, the method further comprises:
Determining that the order of the speech segments of the multiple characters in the verification voice information is consistent with the order of the corresponding characters in the first character string.
10. The voiceprint recognition method of any one of claims 1 to 9, characterized in that, before obtaining the verification voice information produced by the verifying user reading the first character string aloud, the method further comprises:
Randomly generating and displaying the first character string.
11. A voiceprint recognition device, characterized in that the device comprises:
Voice acquisition module, configured to obtain verification voice information produced by a verifying user reading a first character string aloud;
Speech segment recognition module, configured to perform speech recognition on the verification voice information to obtain the speech segments contained in the verification voice information that respectively correspond to multiple characters in the first character string;
Voiceprint feature extraction module, configured to extract the voiceprint feature of the speech segment corresponding to each character in the verification voice information;
Feature model training module, configured to train, according to the voiceprint feature of the speech segment corresponding to each character and in combination with a preset universal background model corresponding to the respective character, to obtain the feature vector corresponding to each character in the verification voice information;
Similarity judging module, configured to calculate a similarity score between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in preset registration voice information;
User identification module, configured to determine the verifying user to be the registered user corresponding to the registration voice information if the similarity score reaches a preset verification threshold.
12. The voiceprint recognition device of claim 11, characterized in that:
The voice acquisition module is further configured to obtain registration voice information produced by a registered user reading a second character string aloud, the second character string and the first character string having at least one character in common;
The speech segment recognition module is further configured to perform speech recognition on the registration voice information to obtain the speech segments contained in the registration voice information that respectively correspond to multiple characters in the second character string;
The voiceprint feature extraction module is further configured to extract the voiceprint feature of the speech segment corresponding to each character in the registration voice information;
The feature model training module is further configured to train, according to the voiceprint feature of the speech segment corresponding to each character in the registration voice information and in combination with the preset universal background model corresponding to the respective character, to obtain the feature vector corresponding to each character in the registration voice information.
13. The voiceprint recognition device of claim 11, characterized in that the feature vector computing module is configured to:
Use the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, and adjust the mean supervector of the preset universal background model corresponding to the respective character by a maximum a posteriori (MAP) algorithm, thereby estimating the feature vector corresponding to each character in the verification voice information.
14. The voiceprint recognition device of claim 13, characterized in that the feature vector computing module is configured to:
Use the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, adjust the mean supervector of the preset universal background model corresponding to the respective character by a maximum a posteriori algorithm, and combine a preset supervector subspace matrix to obtain the feature vector corresponding to each character in the verification voice information.
15. The voiceprint recognition device of claim 14, characterized in that the feature vector computing module is specifically configured to:
Use the voiceprint feature of the speech segment corresponding to each character in the verification voice information as training sample data, and adjust the mean supervector of the preset universal background model corresponding to the respective character by the following formula, so that the posterior probability of the adjusted universal background model corresponding to the respective character is maximized:
M = m + Tω, where M denotes the mean supervector of the universal background model of a given character after adjustment, m denotes the mean supervector of the universal background model of that character before adjustment, T is the preset supervector subspace matrix, and ω is the feature vector corresponding to that character in the verification voice information.
16. The voiceprint recognition device of claim 14, characterized in that the supervector subspace matrix is determined according to the correlation between the vectors of each dimension in the mean supervector of the Gaussian mixture model.
17. The voiceprint recognition device of claim 11, characterized in that the similarity judging module is configured to:
Calculate the cosine distance value between the feature vector corresponding to each character in the verification voice information and the feature vector corresponding to the respective character in the preset registration voice information, and use it as the similarity score.
18. The voiceprint recognition device of claim 11, characterized in that the speech segment recognition module comprises:
Valid segment recognition unit, configured to identify the valid speech segments and the invalid speech segments in the verification voice information;
Speech recognition unit, configured to perform speech recognition on the valid speech segments to obtain the speech segments that respectively correspond to the multiple characters in the first character string.
19. The voiceprint recognition device of claim 11, characterized in that the device further comprises:
Character order determining module, configured to determine that the order of the speech segments of the multiple characters in the verification voice information is consistent with the order of the corresponding characters in the first character string.
20. The voiceprint recognition device of any one of claims 11 to 19, characterized in that the device further comprises:
Character string display module, configured to randomly generate and display the first character string.
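The adjustment of a per-character universal background model by maximum a posteriori estimation, recited in the claims above, can be illustrated with classical relevance-MAP adaptation of the component means of a diagonal-covariance Gaussian mixture model. This is a commonly used technique swapped in for illustration and is an assumption here; the claims do not state that this exact update is used. The function names, the `relevance` parameter and its default, and the list-based data layout are all hypothetical.

```python
import math


def gaussian_log_density(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Relevance-MAP adaptation of UBM means toward one character's frames.

    frames: list of feature vectors from one character's speech segment.
    weights, means, variances: parameters of the character's UBM.
    Returns the adapted means, one list per Gaussian component.
    """
    num_components, dim = len(means), len(means[0])
    counts = [0.0] * num_components  # soft frame counts per component
    sums = [[0.0] * dim for _ in range(num_components)]  # weighted frame sums
    for x in frames:
        log_post = [
            math.log(weights[k]) + gaussian_log_density(x, means[k], variances[k])
            for k in range(num_components)
        ]
        peak = max(log_post)  # subtract the peak for numerical stability
        post = [math.exp(lp - peak) for lp in log_post]
        total = sum(post)
        for k in range(num_components):
            gamma = post[k] / total
            counts[k] += gamma
            for d in range(dim):
                sums[k][d] += gamma * x[d]
    # Interpolate between the data statistics and the prior (UBM) means.
    return [
        [
            (sums[k][d] + relevance * means[k][d]) / (counts[k] + relevance)
            for d in range(dim)
        ]
        for k in range(num_components)
    ]
```

Concatenating the adapted means gives an adjusted mean supervector, written M in the formula M = m + Tω. Recovering ω additionally requires the trained subspace matrix T, which this sketch does not cover.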
CN201610416650.3A 2016-06-12 2016-06-12 A kind of method for recognizing sound-groove and device Active CN106098068B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610416650.3A CN106098068B (en) 2016-06-12 2016-06-12 A kind of method for recognizing sound-groove and device
PCT/CN2017/087911 WO2017215558A1 (en) 2016-06-12 2017-06-12 Voiceprint recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610416650.3A CN106098068B (en) 2016-06-12 2016-06-12 A kind of method for recognizing sound-groove and device

Publications (2)

Publication Number Publication Date
CN106098068A true CN106098068A (en) 2016-11-09
CN106098068B CN106098068B (en) 2019-07-16

Family

ID=57846666

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610416650.3A Active CN106098068B (en) 2016-06-12 2016-06-12 A kind of method for recognizing sound-groove and device

Country Status (2)

Country Link
CN (1) CN106098068B (en)
WO (1) WO2017215558A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109147767B (en) * 2018-08-16 2024-06-21 平安科技(深圳)有限公司 Method, device, computer equipment and storage medium for recognizing numbers in voice
CN111199729B (en) * 2018-11-19 2023-09-26 阿里巴巴集团控股有限公司 Voiceprint recognition method and voiceprint recognition device
CN110738998A (en) * 2019-09-11 2020-01-31 深圳壹账通智能科技有限公司 Voice-based personal credit evaluation method, device, terminal and storage medium
CN112037815B (en) * 2020-08-28 2024-09-06 中移(杭州)信息技术有限公司 Audio fingerprint extraction method, server and storage medium
CN112435673B (en) * 2020-12-15 2024-05-14 北京声智科技有限公司 Model training method and electronic terminal
WO2024077588A1 (en) * 2022-10-14 2024-04-18 Qualcomm Incorporated Voice-based user authentication
CN115641852A (en) * 2022-10-18 2023-01-24 中国电信股份有限公司 Voiceprint recognition method and device, electronic equipment and computer readable storage medium
CN115550075B (en) * 2022-12-01 2023-05-09 中网道科技集团股份有限公司 Anti-counterfeiting processing method and equipment for community correction object public welfare activity data

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101997689A (en) * 2010-11-19 2011-03-30 吉林大学 USB (universal serial bus) identity authentication method based on voiceprint recognition and system thereof
CN102163427A (en) * 2010-12-20 2011-08-24 北京邮电大学 Method for detecting audio exceptional event based on environmental model
CN102254559A (en) * 2010-05-20 2011-11-23 盛乐信息技术(上海)有限公司 Identity authentication system and method based on vocal print
CN102314877A (en) * 2010-07-08 2012-01-11 盛乐信息技术(上海)有限公司 Voiceprint identification method for character content prompt
CN102737634A (en) * 2012-05-29 2012-10-17 百度在线网络技术(北京)有限公司 Authentication method and device based on voice
CN103679452A (en) * 2013-06-20 2014-03-26 腾讯科技(深圳)有限公司 Payment authentication method, device thereof and system thereof
CN104064189A (en) * 2014-06-26 2014-09-24 厦门天聪智能软件有限公司 Vocal print dynamic password modeling and verification method
CN104282303A (en) * 2013-07-09 2015-01-14 威盛电子股份有限公司 Method for conducting voice recognition by voiceprint recognition and electronic device thereof
CN104575504A (en) * 2014-12-24 2015-04-29 上海师范大学 Method for personalized television voice wake-up by voiceprint and voice identification
CN104901808A (en) * 2015-04-14 2015-09-09 时代亿宝(北京)科技有限公司 Voiceprint authentication system and method based on time type dynamic password
CN105096121A (en) * 2015-06-25 2015-11-25 百度在线网络技术(北京)有限公司 Voiceprint authentication method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100406307B1 (en) * 2001-08-09 2003-11-19 삼성전자주식회사 Voice recognition method and system based on voice registration method and system
CN102238189B (en) * 2011-08-01 2013-12-11 安徽科大讯飞信息科技股份有限公司 Voiceprint password authentication method and system
CN105656887A (en) * 2015-12-30 2016-06-08 百度在线网络技术(北京)有限公司 Artificial intelligence-based voiceprint authentication method and device
CN106098068B (en) * 2016-06-12 2019-07-16 腾讯科技(深圳)有限公司 A kind of method for recognizing sound-groove and device

Cited By (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017215558A1 (en) * 2016-06-12 2017-12-21 腾讯科技(深圳)有限公司 Voiceprint recognition method and device
US11283631B2 (en) 2017-01-03 2022-03-22 Nokia Technologies Oy Apparatus, method and computer program product for authentication
WO2018126338A1 (en) * 2017-01-03 2018-07-12 Nokia Technologies Oy Apparatus, method and computer program product for authentication
CN108447471A (en) * 2017-02-15 2018-08-24 腾讯科技(深圳)有限公司 Audio recognition method and speech recognition equipment
CN108447471B (en) * 2017-02-15 2021-09-10 腾讯科技(深圳)有限公司 Speech recognition method and speech recognition device
WO2018223727A1 (en) * 2017-06-09 2018-12-13 平安科技(深圳)有限公司 Voiceprint recognition method, apparatus and device, and medium
CN109102812A (en) * 2017-06-21 2018-12-28 北京搜狗科技发展有限公司 A kind of method for recognizing sound-groove, system and electronic equipment
CN109102812B (en) * 2017-06-21 2021-08-31 北京搜狗科技发展有限公司 Voiceprint recognition method and system and electronic equipment
US11100934B2 (en) 2017-06-30 2021-08-24 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for voiceprint creation and registration
WO2019000832A1 (en) * 2017-06-30 2019-01-03 百度在线网络技术(北京)有限公司 Method and apparatus for voiceprint creation and registration
CN107248410A (en) * 2017-07-19 2017-10-13 浙江联运知慧科技有限公司 The method that Application on Voiceprint Recognition dustbin opens the door
CN109559759B (en) * 2017-09-27 2021-10-08 华硕电脑股份有限公司 Electronic device with incremental registration unit and method thereof
CN109559759A (en) * 2017-09-27 2019-04-02 华硕电脑股份有限公司 The electronic equipment and its method for having increment registering unit
US11335352B2 (en) * 2017-09-29 2022-05-17 Tencent Technology (Shenzhen) Company Limited Voice identity feature extractor and classifier training
CN107886943A (en) * 2017-11-21 2018-04-06 广州势必可赢网络科技有限公司 Voiceprint recognition method and device
CN108154588B (en) * 2017-12-29 2020-11-27 深圳市艾特智能科技有限公司 Unlocking method and system, readable storage medium and intelligent device
CN108154588A (en) * 2017-12-29 2018-06-12 深圳市艾特智能科技有限公司 Unlocking method, system, readable storage medium storing program for executing and smart machine
CN110047491A (en) * 2018-01-16 2019-07-23 中国科学院声学研究所 A kind of relevant method for distinguishing speek person of random digit password and device
CN108269590A (en) * 2018-01-17 2018-07-10 广州势必可赢网络科技有限公司 Vocal cord recovery scoring method and device
CN108447489A (en) * 2018-04-17 2018-08-24 清华大学 A kind of continuous voiceprint authentication method and system of band feedback
CN108447489B (en) * 2018-04-17 2020-05-22 清华大学 Continuous voiceprint authentication method and system with feedback
CN110875044B (en) * 2018-08-30 2022-05-03 中国科学院声学研究所 Speaker identification method based on word correlation score calculation
CN110875044A (en) * 2018-08-30 2020-03-10 中国科学院声学研究所 Speaker identification method based on word correlation score calculation
CN109117622A (en) * 2018-09-19 2019-01-01 北京容联易通信息技术有限公司 A kind of identity identifying method based on audio-frequency fingerprint
CN109117622B (en) * 2018-09-19 2020-09-01 北京容联易通信息技术有限公司 Identity authentication method based on audio fingerprints
CN109257362A (en) * 2018-10-11 2019-01-22 平安科技(深圳)有限公司 Method, apparatus, computer equipment and the storage medium of voice print verification
CN109473107A (en) * 2018-12-03 2019-03-15 厦门快商通信息技术有限公司 A kind of relevant method for recognizing sound-groove of text half and system
CN109473107B (en) * 2018-12-03 2020-12-22 厦门快商通信息技术有限公司 Text semi-correlation voiceprint recognition method and system
CN111669350A (en) * 2019-03-05 2020-09-15 阿里巴巴集团控股有限公司 Identity verification method, verification information generation method, payment method and payment device
US20220229891A1 (en) * 2019-07-29 2022-07-21 Huawei Technologies Co., Ltd. Voiceprint recognition method and device
WO2021017982A1 (en) * 2019-07-29 2021-02-04 华为技术有限公司 Voiceprint recognition method, and device
CN110517695A (en) * 2019-09-11 2019-11-29 国微集团(深圳)有限公司 Verification method and device based on vocal print
CN110971763B (en) * 2019-12-10 2021-01-26 Oppo广东移动通信有限公司 Arrival reminding method and device, storage medium and electronic equipment
CN110971763A (en) * 2019-12-10 2020-04-07 Oppo(重庆)智能科技有限公司 Arrival reminding method and device, storage medium and electronic equipment
CN110956732A (en) * 2019-12-19 2020-04-03 重庆特斯联智慧科技股份有限公司 Safety entrance guard based on thing networking
CN111081256A (en) * 2019-12-31 2020-04-28 苏州思必驰信息科技有限公司 Digital string voiceprint password verification method and system
CN111081260A (en) * 2019-12-31 2020-04-28 苏州思必驰信息科技有限公司 Method and system for identifying voiceprint of awakening word
CN111597531A (en) * 2020-04-07 2020-08-28 北京捷通华声科技股份有限公司 Identity authentication method and device, electronic equipment and readable storage medium
CN111613230A (en) * 2020-06-24 2020-09-01 泰康保险集团股份有限公司 Voiceprint verification method, voiceprint verification device, voiceprint verification equipment and storage medium
CN112820299B (en) * 2020-12-29 2021-09-14 马上消费金融股份有限公司 Voiceprint recognition model training method and device and related equipment
CN112820299A (en) * 2020-12-29 2021-05-18 马上消费金融股份有限公司 Voiceprint recognition model training method and device and related equipment
CN113113022A (en) * 2021-04-15 2021-07-13 吉林大学 Method for automatically identifying identity based on voiceprint information of speaker
CN113570754A (en) * 2021-07-01 2021-10-29 汉王科技股份有限公司 Voiceprint lock control method and device and electronic equipment
CN113570754B (en) * 2021-07-01 2022-04-29 汉王科技股份有限公司 Voiceprint lock control method and device and electronic equipment
CN116530944A (en) * 2023-07-06 2023-08-04 荣耀终端有限公司 Sound processing method and electronic equipment
CN116530944B (en) * 2023-07-06 2023-10-20 荣耀终端有限公司 Sound processing method and electronic equipment
CN116978368A (en) * 2023-09-25 2023-10-31 腾讯科技(深圳)有限公司 Wake-up word detection method and related device
CN116978368B (en) * 2023-09-25 2023-12-15 腾讯科技(深圳)有限公司 Wake-up word detection method and related device

Also Published As

Publication number Publication date
CN106098068B (en) 2019-07-16
WO2017215558A1 (en) 2017-12-21

Similar Documents

Publication Publication Date Title
CN106098068A (en) Voiceprint recognition method and device
CN106057206B (en) Voiceprint model training method, voiceprint recognition method and device
CN107610707B (en) Voiceprint recognition method and device
CN110457432B (en) Interview scoring method, interview scoring device, interview scoring equipment and interview scoring storage medium
CN110310647B (en) Voice identity feature extractor, classifier training method and related equipment
TWI527023B (en) A voiceprint recognition method and apparatus
Kelly et al. Deep neural network based forensic automatic speaker recognition in VOCALISE using x-vectors
Das et al. Development of multi-level speech based person authentication system
CN107104803A (en) User identity authentication method based on digital password combined with voiceprint confirmation
CN111402862B (en) Speech recognition method, device, storage medium and equipment
CN100363938C (en) Multi-modal identity recognition method based on score-difference weighted fusion
Mansour et al. Voice recognition using dynamic time warping and mel-frequency cepstral coefficients algorithms
CN104217149A (en) Biometric authentication method and equipment based on voice
CN101465123A (en) Verification method and device for speaker authentication and speaker authentication system
CN110047504B (en) Speaker identification method under identity vector x-vector linear transformation
CN106782603A (en) Intelligent sound evaluating method and system
CN101609672B (en) Speech recognition semantic confidence feature extraction method and device
Meyer et al. Anonymizing speech with generative adversarial networks to preserve speaker privacy
Umesh et al. Frequency warping and the Mel scale
Beigi Challenges of Large-Scale Speaker Recognition
CN110111798A (en) Method and terminal for identifying a speaker
Büyük Sentence‐HMM state‐based i‐vector/PLDA modelling for improved performance in text dependent single utterance speaker verification
Ghaemmaghami et al. Speaker attribution of Australian broadcast news data
Misra et al. Maximum-likelihood linear transformation for unsupervised domain adaptation in speaker verification
Mandalapu et al. Multilingual voice impersonation dataset and evaluation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230712

Address after: 35th Floor, Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 518057

Patentee after: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.

Patentee after: TENCENT CLOUD COMPUTING (BEIJING) Co.,Ltd.

Address before: Room 403, East, Building 2, SEG Science and Technology Park, Zhenxing Road, Futian District, Shenzhen, Guangdong, 518000

Patentee before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.
