CN104485102A

CN104485102A - Voiceprint recognition method and device

Info

Publication number: CN104485102A
Application number: CN201410815733.0A
Authority: CN
Inventors: 李光日
Original assignee: Wisdom Eyes (hunan) Technology Development Co Ltd
Current assignee: Wisdom Eyes (hunan) Technology Development Co Ltd
Priority date: 2014-12-23
Filing date: 2014-12-23
Publication date: 2015-04-01

Abstract

The invention discloses a voiceprint recognition method and a voiceprint recognition device. The method comprises the following steps: extracting a first voiceprint feature of a voiceprint to be recognized, wherein the first voiceprint feature at least comprises a wavelet feature of the voiceprint to be recognized; calculating a recognition degree of the voiceprint to be recognized according to the first voiceprint feature and a second voiceprint feature in a voiceprint recognition model; judging whether the recognition degree is greater than a predetermined threshold; and recognizing the voiceprint to be recognized as a target voiceprint if the recognition degree is greater than the predetermined threshold. The method and the device solve the technical problem of inaccurate recognition caused by existing voiceprint recognition methods.

Description

Voiceprint recognition method and device

Technical field

The present invention relates to the field of computers, and in particular to a voiceprint recognition method and device.

Background art

Nowadays, in order to strengthen the security management of important information, more and more people protect important information with authentication processes such as fingerprint recognition, face recognition, and voiceprint recognition. Voiceprint recognition (Voiceprint Recognition, VPR), also called speaker recognition (Speaker Recognition), comprises two classes: speaker identification (Speaker Identification) and speaker verification (Speaker Verification). The former determines which of several people spoke a given segment of speech and is a "one-of-many" problem; the latter confirms whether a given segment of speech was spoken by a specified person and is a "one-to-one" decision problem. Different tasks and applications use different voiceprint recognition techniques: for example, identification may be needed to narrow the scope of a criminal investigation, whereas verification is needed for bank transactions.

In the voiceprint recognition approaches commonly used at present, the most widely used features are mel-frequency cepstral coefficients (Mel-Frequency Cepstral Coefficients, MFCC), linear prediction cepstral coefficients (Linear Prediction Cepstrum Coefficients, LPCC), and their time-difference features. However, recognizing voiceprints with these features is still far from perfect: the voice of the same person is affected by health, age, mood, and so on; different microphones and channels affect voiceprint recognition differently; environmental noise interferes with voiceprint recognition; and when several speakers talk at the same time, each speaker's voiceprint feature is difficult to extract. In summary, the results obtained by existing voiceprint recognition approaches are easily disturbed by various factors, which makes recognition inaccurate. Furthermore, for text-independent voiceprint recognition, the collected voiceprints are varied and not uniquely determined, so the recognition model to be established is more complex and takes longer to build, and the stability and accuracy of voiceprint recognition are correspondingly poor.

No effective solution to the above problems in the prior art has yet been proposed.

Summary of the invention

Embodiments of the present invention provide a voiceprint recognition method and device, so as to at least solve the technical problem of inaccurate recognition caused by existing voiceprint recognition methods.

According to one aspect of the embodiments of the present invention, a voiceprint recognition method is provided, comprising: extracting a first voiceprint feature from a voiceprint to be recognized, wherein the first voiceprint feature at least comprises a wavelet feature of the voiceprint to be recognized; calculating a recognition degree of the voiceprint to be recognized at least according to the first voiceprint feature and a second voiceprint feature in a voiceprint recognition model; judging whether the recognition degree is greater than a predetermined threshold; and, if the recognition degree is judged to be greater than the predetermined threshold, recognizing the voiceprint to be recognized as a target voiceprint.

Optionally, before extracting the first voiceprint feature from the voiceprint to be recognized, the method further comprises: collecting the voiceprint to be recognized; and adjusting, at least according to a first voiceprint feature vector in the first voiceprint feature, the voiceprint feature vector parameter corresponding to the first voiceprint feature vector in a pre-established universal voiceprint model, so as to construct a second voiceprint feature vector of the second voiceprint feature in the voiceprint recognition model adapted to the voiceprint to be recognized.

Optionally, the first voiceprint feature comprises a plurality of first voiceprint feature vectors and the second voiceprint feature comprises a plurality of second voiceprint feature vectors, wherein calculating the recognition degree of the voiceprint to be recognized at least according to the first voiceprint feature and the second voiceprint feature in the voiceprint recognition model comprises: calculating the vector distance between each first voiceprint feature vector in the first voiceprint feature and the corresponding second voiceprint feature vector in the second voiceprint feature; calculating a target distance between the first voiceprint feature and the second voiceprint feature according to the plurality of calculated vector distances; and calculating the recognition degree of the voiceprint to be recognized at least using the target distance between the first voiceprint feature and the second voiceprint feature.

Optionally, before collecting the voiceprint to be recognized, the method further comprises: collecting a plurality of voiceprints and extracting a third voiceprint feature of each of the plurality of voiceprints, so as to build a plurality of background voiceprint models corresponding to the voiceprints, wherein the third voiceprint feature comprises a plurality of third voiceprint feature vectors; and establishing the universal voiceprint model according to the background voiceprint models.

Optionally, calculating the recognition degree of the voiceprint to be recognized at least using the distance between the first voiceprint feature and the second voiceprint feature comprises: calculating the background distances between the first voiceprint feature and the third voiceprint feature of each voiceprint corresponding to the plurality of background voiceprint models; calculating a distance mean and a distance standard deviation from the plurality of background distances; calculating the difference between the target distance of the first and second voiceprint features and the distance mean; and calculating the ratio of the difference to the distance standard deviation, and taking the ratio as the recognition degree of the voiceprint to be recognized.
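The ratio described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `recognition_degree` and the use of the population standard deviation are assumptions, and the ratio is computed exactly as worded (target distance minus background mean, divided by the standard deviation). Taken literally, a target distance smaller than the background mean gives a negative ratio; how that sign relates to the threshold comparison is not spelled out in this passage, so the sketch returns the raw ratio.

```python
from statistics import mean, pstdev


def recognition_degree(target_distance, background_distances):
    """Ratio of (target distance - mean background distance) to the
    standard deviation of the background distances."""
    if len(background_distances) < 2:
        raise ValueError("need at least two background distances")
    mu = mean(background_distances)
    # Population standard deviation is an assumption; the text only says
    # "distance standard deviation".
    sigma = pstdev(background_distances)
    if sigma == 0:
        raise ValueError("background distances have zero spread")
    return (target_distance - mu) / sigma


# Hypothetical distances to eight background voiceprint models.
background = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
score = recognition_degree(1.0, background)
```

Normalizing against the background distances makes the score comparable across speakers and recording conditions, because it measures how far the target distance lies from what is typical for unrelated voiceprints.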

Optionally, the wavelet feature comprises a real wavelet feature and/or a complex wavelet feature, wherein extracting the first voiceprint feature from the voiceprint to be recognized comprises: detecting voiced intervals in the voiceprint to be recognized; and detecting pitch intervals within each voiced interval, and extracting the real wavelet feature and/or the complex wavelet feature of the voiceprint feature in each pitch interval.

Optionally, extracting the real wavelet feature and/or the complex wavelet feature of the voiceprint feature in each pitch interval comprises: extracting a predetermined feature vector in each pitch interval, dividing the feature vectors of the plurality of pitch intervals into sample vectors of a predetermined length according to the wavelet filter, and normalizing the sample vectors of the predetermined length; and performing at least one of the following wavelet transforms on the normalized sample vectors of the predetermined length: performing a real wavelet transform on the normalized sample vectors to obtain real-part coefficients of first predetermined frequency bands, and selecting, from the first predetermined frequency bands, the frequency bands that satisfy a first predetermined condition for sampling, so as to obtain the real wavelet feature in the first voiceprint feature; performing a dual-tree complex wavelet transform on the normalized sample vectors to obtain real-part and imaginary-part coefficients of second predetermined frequency bands, and selecting, from the second predetermined frequency bands, the frequency bands that satisfy a second predetermined condition for sampling, so as to obtain the complex wavelet feature in the first voiceprint feature.
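As a rough illustration of the real-wavelet branch only, the sketch below normalizes one sample vector, applies a multi-level discrete wavelet transform, and keeps the coefficients of selected bands. Everything specific here is an assumption made for the example: the text does not name the wavelet filter (Haar is used because it is the simplest), the normalization rule, the number of levels, or the band-selection condition, and the dual-tree complex wavelet branch is not shown.

```python
import math


def normalize(vec):
    """Scale a sample vector to zero mean and unit peak amplitude (assumed rule)."""
    m = sum(vec) / len(vec)
    centered = [v - m for v in vec]
    peak = max(abs(v) for v in centered) or 1.0
    return [v / peak for v in centered]


def haar_step(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_bands(signal, levels):
    """Detail bands from finest to coarsest, followed by the final approximation."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("length must be divisible by 2**levels")
    bands = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        bands.append(detail)
    bands.append(current)
    return bands


def real_wavelet_feature(sample_vector, levels=3, keep=(1, 2)):
    """Concatenate the coefficients of the selected bands (hypothetical selection)."""
    bands = haar_bands(normalize(sample_vector), levels)
    feature = []
    for idx in keep:
        feature.extend(bands[idx])
    return feature


sample = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0]
bands = haar_bands(sample, 3)
feature = real_wavelet_feature(sample)
```

In practice a wavelet library would replace the hand-written transform; the point of the sketch is the order of operations (fixed-length sample vector, normalization, decomposition into bands, band selection).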

Optionally, after detecting the voiced intervals in the voiceprint to be recognized, extracting the first voiceprint feature from the voiceprint to be recognized further comprises: extracting the mel cepstral coefficients of each frame of the voiceprint to be recognized, so as to obtain the mel cepstral coefficient feature in the first voiceprint feature; and calculating the differential mel cepstral coefficient feature of each frame of the voiceprint to be recognized according to the mel cepstral coefficients, so as to obtain the differential mel cepstral coefficient feature in the first voiceprint feature.
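The differential step can be sketched as below, assuming the per-frame mel cepstral coefficients are already available. The text does not give the difference formula, so the sketch uses a simple central difference over neighboring frames with the edge frames repeated; the function name `delta_features` and the toy coefficients are hypothetical.

```python
def delta_features(frames):
    """First-order delta per frame: (c[t+1] - c[t-1]) / 2, edge frames repeated."""
    n = len(frames)
    deltas = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        deltas.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return deltas


# Four frames with two cepstral coefficients each (made-up values).
mfcc = [[1.0, 0.0], [2.0, 0.5], [4.0, 0.5], [7.0, 0.0]]
delta = delta_features(mfcc)
```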

According to another aspect of the embodiments of the present invention, a voiceprint recognition device is also provided, comprising: an extraction unit, configured to extract a first voiceprint feature from a voiceprint to be recognized, wherein the first voiceprint feature at least comprises a wavelet feature of the voiceprint to be recognized; a calculation unit, configured to calculate a recognition degree of the voiceprint to be recognized at least according to the first voiceprint feature and a second voiceprint feature in a voiceprint recognition model; a judging unit, configured to judge whether the recognition degree is greater than a predetermined threshold; and a recognition unit, configured to recognize the voiceprint to be recognized as a target voiceprint when the recognition degree is judged to be greater than the predetermined threshold.

Optionally, the device further comprises: a collecting unit, configured to collect the voiceprint to be recognized before the first voiceprint feature is extracted from the voiceprint to be recognized; and an adjustment unit, configured to adjust, at least according to a first voiceprint feature vector in the first voiceprint feature, the voiceprint feature vector parameter corresponding to the first voiceprint feature vector in a pre-established universal voiceprint model, so as to construct a second voiceprint feature vector of the second voiceprint feature in the voiceprint recognition model adapted to the voiceprint to be recognized.

Optionally, the first voiceprint feature comprises a plurality of first voiceprint feature vectors, the second voiceprint feature comprises a plurality of second voiceprint feature vectors, and the calculation unit comprises: a first calculation module, configured to calculate the vector distance between each first voiceprint feature vector in the first voiceprint feature and the corresponding second voiceprint feature vector in the second voiceprint feature; a second calculation module, configured to calculate a target distance between the first voiceprint feature and the second voiceprint feature according to the plurality of calculated vector distances; and a third calculation module, configured to calculate the recognition degree of the voiceprint to be recognized at least using the target distance between the first voiceprint feature and the second voiceprint feature.

Optionally, the device further comprises: a collecting unit, configured to collect a plurality of voiceprints and extract a third voiceprint feature of each of the plurality of voiceprints, so as to build a plurality of background voiceprint models corresponding to the voiceprints, wherein the third voiceprint feature comprises a plurality of third voiceprint feature vectors; and an establishing unit, configured to establish the universal voiceprint model according to the background voiceprint models.

Optionally, the third calculation module comprises: a first calculation sub-module, configured to calculate the background distances between the first voiceprint feature and the third voiceprint feature of each voiceprint corresponding to the plurality of background voiceprint models; a second calculation sub-module, configured to calculate a distance mean and a distance standard deviation from the plurality of background distances; a third calculation sub-module, configured to calculate the difference between the target distance of the first and second voiceprint features and the distance mean; and a fourth calculation sub-module, configured to calculate the ratio of the difference to the distance standard deviation and take the ratio as the recognition degree of the voiceprint to be recognized.

Optionally, the wavelet feature comprises a real wavelet feature and/or a complex wavelet feature, and the extraction unit comprises: a detection module, configured to detect voiced intervals in the voiceprint to be recognized; and a first extraction module, configured to detect pitch intervals within each voiced interval and extract the real wavelet feature and/or the complex wavelet feature of the voiceprint feature in each pitch interval.

Optionally, the first extraction module comprises: a first extraction sub-module, configured to extract a predetermined feature vector in each pitch interval, divide the feature vectors of the plurality of pitch intervals into sample vectors of a predetermined length according to the wavelet filter, and normalize the sample vectors of the predetermined length; and a transformation sub-module, configured to perform at least one of the following wavelet transforms on the normalized sample vectors of the predetermined length: performing a real wavelet transform on the normalized sample vectors to obtain real-part coefficients of first predetermined frequency bands, and selecting, from the first predetermined frequency bands, the frequency bands that satisfy a first predetermined condition for sampling, so as to obtain the real wavelet feature in the first voiceprint feature; performing a dual-tree complex wavelet transform on the normalized sample vectors to obtain real-part and imaginary-part coefficients of second predetermined frequency bands, and selecting, from the second predetermined frequency bands, the frequency bands that satisfy a second predetermined condition for sampling, so as to obtain the complex wavelet feature in the first voiceprint feature.

Optionally, the extraction unit further comprises: a second extraction module, configured to extract, after the voiced intervals in the voiceprint to be recognized are detected, the mel cepstral coefficients of each frame of the voiceprint to be recognized, so as to obtain the mel cepstral coefficient feature in the first voiceprint feature; and a third calculation module, configured to calculate the differential mel cepstral coefficient feature of each frame of the voiceprint to be recognized according to the mel cepstral coefficients, so as to obtain the differential mel cepstral coefficient feature in the first voiceprint feature.

In the embodiments of the present invention, after the first voiceprint feature is extracted from the voiceprint to be recognized, the recognition degree of the voiceprint to be recognized is calculated at least according to the first voiceprint feature and the second voiceprint feature in the voiceprint recognition model, and when the recognition degree is judged to be greater than the predetermined threshold, the voiceprint to be recognized is identified as the target voiceprint. Because the first voiceprint feature comprises a wavelet feature, that is, the wavelet feature of the voiceprint is combined with the original features, the accuracy and stability of the voiceprint recognition system are improved. This overcomes the problem that the results of existing voiceprint recognition approaches are inaccurate because they are easily disturbed by various factors. Furthermore, by comparing directly against the voiceprint recognition model, the complexity and build time of the model are reduced, which improves the stability and efficiency of voiceprint recognition.

Brief description of the drawings

The accompanying drawings, which form a part of this application, are provided for further understanding of the present invention. The exemplary embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:

Fig. 1 is a flowchart of an optional voiceprint recognition method according to an embodiment of the present invention;

Fig. 2 is a flowchart of establishing a UBM in an optional voiceprint recognition method according to an embodiment of the present invention;

Fig. 3 is a flowchart of feature extraction in an optional voiceprint recognition method according to an embodiment of the present invention; and

Fig. 4 is a schematic diagram of an optional voiceprint recognition device according to an embodiment of the present invention.

Detailed description of the embodiments

It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments can be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

In order to enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.

It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so as to accommodate the embodiments of the invention described herein. In addition, the terms "comprise" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.

Embodiment 1

According to an embodiment of the present invention, a voiceprint recognition method is provided. As shown in Fig. 1, the method comprises:

S102, extracting a first voiceprint feature from a voiceprint to be recognized, wherein the first voiceprint feature at least comprises a wavelet feature of the voiceprint to be recognized;

S104, calculating a recognition degree of the voiceprint to be recognized at least according to the first voiceprint feature and a second voiceprint feature in a voiceprint recognition model;

S106, judging whether the recognition degree is greater than a predetermined threshold;

S108, if the recognition degree is judged to be greater than the predetermined threshold, recognizing the voiceprint to be recognized as a target voiceprint.
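The control flow of steps S102 to S108 can be sketched as a single function. This is only an outline of the decision logic: `recognize`, `toy_extract`, and `toy_score` are hypothetical names, and the toy extractor and scorer stand in for the wavelet-based feature extraction and the recognition-degree calculation described in this document.

```python
from typing import Callable, Sequence


def recognize(
    voiceprint: Sequence[float],
    model_feature: Sequence[float],
    extract: Callable[[Sequence[float]], Sequence[float]],
    score: Callable[[Sequence[float], Sequence[float]], float],
    threshold: float,
) -> bool:
    first_feature = extract(voiceprint)            # S102: extract first feature
    degree = score(first_feature, model_feature)   # S104: recognition degree
    return degree > threshold                      # S106 and S108: decide


def toy_extract(samples):
    """Placeholder feature: the mean of the samples."""
    return [sum(samples) / len(samples)]


def toy_score(first, second):
    """Placeholder score: higher when the two features are closer."""
    return -abs(first[0] - second[0])


accepted = recognize([1.0, 2.0, 3.0], [2.0], toy_extract, toy_score, -0.5)
rejected = recognize([5.0, 6.0, 7.0], [2.0], toy_extract, toy_score, -0.5)
```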

Optionally, in this embodiment, the above voiceprint recognition method can be, but is not limited to being, applied to processes that use voiceprints for identity recognition and identity verification. For example, in a bank transaction, a voiceprint recognition model is registered in advance for the transacting party; when the transacting party needs to be authenticated, the above voiceprint recognition method is used, and the transaction proceeds only after the recognition degree of the transacting party's voiceprint satisfies the predetermined threshold and the voiceprint is recognized as the same voiceprint, thereby ensuring the security of the transaction. The above is only an example, and this embodiment places no limitation on it.

Optionally, in this embodiment, the first voiceprint feature in the voiceprint to be recognized can include, but is not limited to, a plurality of first voiceprint feature vectors, and the second voiceprint feature can include, but is not limited to, a plurality of second voiceprint feature vectors. Optionally, in this embodiment, the first voiceprint feature can include, but is not limited to: 4 real wavelet features, 4 dual-tree complex wavelet features, a mel cepstral coefficient feature, and a differential mel cepstral coefficient feature. The wavelet feature comprises at least one of the following: a real wavelet feature and a complex wavelet feature.

Optionally, in this embodiment, the voiceprint recognition model can be, but is not limited to, a model obtained by adjusting the universal voiceprint model according to the plurality of voiceprint feature vectors of the voiceprint feature in the voiceprint to be recognized, so that it is adapted to the voiceprint to be recognized and is used to recognize it. The universal model can be, but is not limited to being, obtained as follows: the voiceprints of many people are collected and a voiceprint feature is extracted from each person's voiceprint; a background voiceprint model corresponding to each person's voiceprint is established from the voiceprint features; and the third voiceprint features in the plurality of background voiceprint models are then clustered to establish the universal voiceprint model, for example a universal background model (Universal Background Model, UBM).

For example, Fig. 2 shows a flowchart of establishing a UBM from multiple speakers, in which one voiceprint feature comprises 10 voiceprint feature vectors (that is, 10 classes of features). Specifically, in step S202 the voiceprints of many people are collected and 10 classes of features are extracted from each person's voiceprint; in step S204 each class of features is clustered (for example, into 32 centers); and in step S206 a UBM is obtained that comprises 10 codebooks (that is, the voiceprint feature parameters corresponding to the 10 voiceprint feature vectors), each containing 32 codewords. Furthermore, each speaker can also establish a background voiceprint model from his or her own voiceprint features.
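The clustering step can be sketched as one codebook per feature class, trained on the vectors pooled from all speakers. The text does not name the clustering algorithm, so plain k-means with a deterministic initialization is assumed here; `train_codebook`, `build_universal_model`, and the toy data are hypothetical, and a tiny codebook size is used in place of the 32 centers mentioned above.

```python
import math


def nearest(vec, codebook):
    """Index of the codeword closest to vec (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(vec, codebook[i]))


def train_codebook(vectors, size, iterations=20):
    """Plain k-means; initial codewords are evenly spaced input vectors."""
    if len(vectors) < size:
        raise ValueError("need at least as many vectors as codewords")
    step = len(vectors) // size
    codebook = [list(vectors[i * step]) for i in range(size)]
    for _ in range(iterations):
        clusters = [[] for _ in range(size)]
        for v in vectors:
            clusters[nearest(v, codebook)].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old codeword if a cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def build_universal_model(features_by_class, size=32):
    """One codebook per feature class, trained on vectors pooled over speakers."""
    return {name: train_codebook(vecs, size)
            for name, vecs in features_by_class.items()}


pooled = {"mfcc": [[0.0, 0.0], [0.2, 0.0], [10.0, 10.0], [10.2, 10.0]]}
ubm = build_universal_model(pooled, size=2)
```

With 10 feature classes and `size=32`, the returned dictionary would hold the 10 codebooks of 32 codewords described above.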

With the embodiment provided in this application, after the first voiceprint feature is extracted from the voiceprint to be recognized, the recognition degree of the voiceprint to be recognized is calculated at least according to the first voiceprint feature and the second voiceprint feature in the voiceprint recognition model, and when the recognition degree is judged to be greater than the predetermined threshold, the voiceprint to be recognized is identified as the target voiceprint. Because the first voiceprint feature comprises a wavelet feature, that is, the wavelet feature of the voiceprint is combined with the original features, the accuracy and stability of the voiceprint recognition system are improved. This overcomes the problem that the results of existing voiceprint recognition approaches are inaccurate because they are easily disturbed by various factors. Furthermore, by comparing directly against the voiceprint recognition model, the complexity and build time of the model are reduced, which improves the stability and efficiency of voiceprint recognition.

As an optional scheme, before extracting the first voiceprint feature from the voiceprint to be recognized, the method further comprises:

S1, collecting the voiceprint to be recognized;

S2, adjusting, at least according to a first voiceprint feature vector in the first voiceprint feature, the voiceprint feature vector parameter corresponding to the first voiceprint feature vector in a pre-established universal voiceprint model, so as to construct a second voiceprint feature vector of the second voiceprint feature in the voiceprint recognition model adapted to the voiceprint to be recognized.

Optionally, in this embodiment, the voiceprint to be recognized can be collected by, but not limited to, using a microphone to record 5 seconds of speech to be recognized, wherein the audio format is a 16 kHz sampling rate, 16-bit quantization depth, mono.

Optionally, in this embodiment, the first voiceprint feature can include, but is not limited to, a plurality of first voiceprint feature vectors, and the second voiceprint feature can include, but is not limited to, a plurality of second voiceprint feature vectors. For example, each voiceprint feature comprises 10 VQ codebooks, that is, each first voiceprint feature vector corresponds to one VQ codebook, and each VQ codebook corresponds to one feature set.

Optionally, in this embodiment, the universal voiceprint model is adjusted according to the plurality of first voiceprint feature vectors in the first voiceprint feature to obtain the voiceprint recognition model adapted to the voiceprint to be recognized, which makes it convenient to use the voiceprint recognition model to recognize voiceprints collected afterwards.

With the embodiment provided in this application, before the first voiceprint feature is extracted from the voiceprint to be recognized, the universal model is adjusted to obtain the voiceprint recognition model adapted to the voiceprint to be recognized, which registers the voiceprint in advance. Voiceprint recognition can then be performed directly and accurately against the pre-registered voiceprint recognition model, which reduces the complexity and build time of the model and improves the reliability and efficiency of voiceprint recognition.

As an optional scheme, the first voiceprint feature comprises a plurality of first voiceprint feature vectors and the second voiceprint feature comprises a plurality of second voiceprint feature vectors, wherein calculating the recognition degree of the voiceprint to be recognized at least according to the first voiceprint feature and the second voiceprint feature in the voiceprint recognition model comprises:

S1, calculating the vector similarity between each first voiceprint feature vector in the first voiceprint feature and the corresponding second voiceprint feature vector in the second voiceprint feature;

S2, calculating a target distance between the first voiceprint feature and the second voiceprint feature according to the plurality of calculated vector similarities;

S3, calculating the recognition degree of the voiceprint to be recognized at least using the target distance between the first voiceprint feature and the second voiceprint feature.

Optionally, in this embodiment, calculating the vector similarity between a first voiceprint feature vector in the first voiceprint feature and a second voiceprint feature vector in the second voiceprint feature comprises: calculating the distance between the first voiceprint feature vector and the second voiceprint feature vector.
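The text does not define the distance itself. Since the voiceprint features are described elsewhere in this document as VQ codebooks, one plausible reading, shown here purely as an assumption, is the average quantization distortion: each vector of one feature class is matched to its nearest codeword in the model codebook, and the Euclidean distances are averaged. The function name `quantization_distortion` and the data are hypothetical.

```python
import math


def quantization_distortion(vectors, codebook):
    """Mean Euclidean distance from each vector to its nearest codeword."""
    if not vectors or not codebook:
        raise ValueError("vectors and codebook must be non-empty")
    total = 0.0
    for v in vectors:
        total += min(math.dist(v, c) for c in codebook)
    return total / len(vectors)


# Toy model codebook and toy vectors from the voiceprint to be recognized.
codebook = [[0.0, 0.0], [10.0, 0.0]]
frames = [[0.0, 3.0], [10.0, 4.0], [0.0, 0.0]]
d = quantization_distortion(frames, codebook)
```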

This is described with the following example. Suppose the vector distance between a first voiceprint feature vector in the first voiceprint feature of the voiceprint to be recognized and the corresponding second voiceprint feature vector in the second voiceprint feature of the voiceprint recognition model is a. The plurality of vector distances are normalized and then summed with weights to obtain the target distance S between the first voiceprint feature and the second voiceprint feature. The recognition degree of the first voiceprint feature is then calculated at least according to the target distance S. The weights can be, but are not limited to being, preset according to the importance of the different feature vectors; this embodiment places no limitation on this.
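The normalization and weighted summation can be sketched as follows. The normalization rule is not specified in the text; the sketch assumes each vector distance is divided by a per-feature reference scale, and the names `target_distance`, `scales`, and `weights` as well as all numeric values are illustrative only.

```python
def target_distance(distances, scales, weights):
    """Weighted sum of normalized vector distances, giving the target distance S."""
    if not (len(distances) == len(scales) == len(weights)):
        raise ValueError("distances, scales and weights must have equal length")
    # Assumed normalization: divide each distance by a per-feature scale.
    normalized = [d / s for d, s in zip(distances, scales)]
    return sum(w * n for w, n in zip(weights, normalized))


# Three feature vectors for brevity (the embodiment mentions 10 per feature).
distances = [2.0, 6.0, 1.0]   # vector distances a_1..a_3
scales = [4.0, 12.0, 2.0]     # hypothetical per-feature normalization scales
weights = [0.5, 0.3, 0.2]     # preset importance weights, summing to 1
S = target_distance(distances, scales, weights)
```

Normalizing first keeps a feature class with a naturally large distance range from dominating the sum, so the preset weights alone express the relative importance of the feature vectors.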

By the embodiment that the application provides, by calculating the vector distance of multiple vocal print proper vector in the first vocal print feature and the second vocal print feature, accurately calculate the target range of the first vocal print feature and the second vocal print feature after summation is weighted to multiple vector distance, and then ensure that the accuracy of Application on Voiceprint Recognition degree.
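The weighted summation described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the distance measure or the normalization, so the Euclidean distance and the per-feature `scales` divisor used here, as well as all function names, are hypothetical choices.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def target_distance(first_vectors, second_vectors, weights, scales):
    """Normalize each vector distance and combine the results by weighted sum.

    `scales` holds one assumed normalization constant per feature vector,
    for example a typical distance for that feature type.
    """
    total = 0.0
    for first, second, weight, scale in zip(first_vectors, second_vectors, weights, scales):
        normalized = euclidean(first, second) / scale
        total += weight * normalized
    return total
```

Under these assumptions, a feature vector considered more important simply receives a larger entry in `weights`.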

As an optional scheme, before the voiceprint to be recognized is collected, the method further comprises:

S1, collecting multiple voiceprints and extracting the third voiceprint feature of each of the multiple voiceprints, to build multiple background voiceprint models corresponding to the voiceprints, wherein the third voiceprint feature comprises multiple third voiceprint feature vectors;

S2, building a universal voiceprint model according to the background voiceprint models.

With the embodiment provided in this application, the voiceprints of multiple speakers are collected to build multiple background voiceprint models, and a universal voiceprint model containing the voiceprint features of many people is built from those background voiceprint models. This makes it convenient to build the voiceprint recognition model used for recognition in advance, which shortens the model building cycle and improves the efficiency of voiceprint recognition.

As an optional scheme, calculating the recognition degree of the voiceprint to be recognized at least using the distance between the first voiceprint feature and the second voiceprint feature comprises:

S1, calculating the background distance between the first voiceprint feature and the third voiceprint feature of each voiceprint corresponding to the multiple background voiceprint models;

S2, calculating a distance mean and a distance standard deviation from the multiple background distances;

S3, calculating the difference between the target distance of the first voiceprint feature and the second voiceprint feature and the distance mean;

S4, calculating the ratio of the difference to the distance standard deviation, and taking the ratio as the recognition degree of the voiceprint to be recognized.

Optionally, suppose the target distance between the first voiceprint feature of the voiceprint to be recognized and the second voiceprint feature is denoted S (written s in the formula below), and that i voiceprints are collected in total to build i background voiceprint models, where the background distances between the first voiceprint feature and the i third voiceprint features corresponding to the i voiceprints are D_1, D_2, D_3, ..., D_i respectively. Let the mean of these background distances be u and their standard deviation be σ. The recognition degree of the voiceprint to be recognized is then calculated by the following formula:

$$s' = \frac{s - u}{\sigma} \qquad (1)$$

Further, the recognition degree s' of the voiceprint to be recognized is compared with the predetermined threshold; if it is greater than the predetermined threshold, the voiceprint to be recognized is considered to be the target voiceprint.
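A minimal sketch of formula (1) and the threshold comparison is given below. The function names are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not say which is intended.

```python
import statistics

def recognition_degree(target_distance, background_distances):
    """Apply s' = (s - u) / sigma using the background distances D_1 ... D_i."""
    u = statistics.mean(background_distances)
    sigma = statistics.pstdev(background_distances)  # assumed: population form
    if sigma == 0:
        raise ValueError("background distances have zero spread")
    return (target_distance - u) / sigma

def is_target_voiceprint(target_distance, background_distances, threshold):
    """Return True when the recognition degree is greater than the threshold."""
    return recognition_degree(target_distance, background_distances) > threshold
```

The comparison follows the text literally ("greater than the predetermined threshold"); how the threshold is chosen is not specified in the source and is left to the caller.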

It should be noted that the hardware environment and conditions under which the voiceprint to be recognized is collected may differ from those under which the voiceprint recognition model was built. For example, if the microphone model changes, a large difference may arise between the first voiceprint feature of the voiceprint to be recognized and the second voiceprint feature of the voiceprint recognition model, which in turn affects the judgment of the voiceprint to be recognized. The recognition degree is therefore calculated in further combination with the voiceprint features in the background voiceprint models, which further ensures the accuracy of the recognition degree.

Consider the following example. If the recording uses the same sound card and microphone as were used to build the background voiceprint models, the voiceprint feature extracted from the recording is close to the voiceprint recognition model and also close to the background voiceprint models. If the recording uses a different sound card and microphone, the extracted feature is farther from the voiceprint recognition model and also farther from the background voiceprint models. Although both distances are larger, the distance between the voiceprint to be recognized and the voiceprint recognition model is still smaller than its distance to the background voiceprint models.

With the embodiment provided in this application, the recognition degree of the voiceprint to be recognized is calculated in combination with both the pre-registered voiceprint recognition model and the background voiceprint models, which overcomes the problem of inaccurate recognition degree calculation caused by changes in the environment and conditions under which the voiceprint to be recognized is collected.

As an optional scheme, the wavelet feature comprises a real wavelet feature and/or a complex wavelet feature, wherein extracting the first voiceprint feature of the voiceprint to be recognized comprises:

S1, detecting voiced intervals in the voiceprint to be recognized;

S2, detecting pitch intervals within each voiced interval, and extracting the real wavelet feature and/or the complex wavelet feature of the voiceprint feature in each pitch interval.

Optionally, in this embodiment, extracting the real wavelet feature and/or the complex wavelet feature of the voiceprint feature in each pitch interval comprises:

S22, extracting predetermined feature vectors in each pitch interval, dividing the feature vectors of the multiple pitch intervals into sample vectors of a predetermined length according to the wavelet filter, and normalizing the sample vectors of the predetermined length;

S24, performing at least one of the following wavelet transforms on the normalized sample vectors of the predetermined length:

1) performing a real wavelet transform on the normalized sample vectors of the predetermined length to obtain the real coefficients of first predetermined frequency bands, and selecting, among the first predetermined frequency bands, the frequency bands that satisfy a first predetermined condition for sampling, to obtain the real wavelet feature in the first voiceprint feature;

2) performing a dual-tree complex wavelet transform on the normalized sample vectors of the predetermined length to obtain the real coefficients and imaginary coefficients of second predetermined frequency bands, and selecting, among the second predetermined frequency bands, the frequency bands that satisfy a second predetermined condition for sampling, to obtain the complex wavelet feature in the first voiceprint feature.

Optionally, in this embodiment, the length of the sample vector may be, but is not limited to being, determined according to the length of the wavelet filter used.

Optionally, in this embodiment, after the voiced intervals in the voiceprint to be recognized are detected, extracting the first voiceprint feature of the voiceprint to be recognized further comprises:

S3, extracting the Mel cepstral coefficients of each frame of the voiceprint to be recognized, to obtain the Mel cepstral coefficient feature in the first voiceprint feature;

S4, calculating the differential Mel cepstral coefficients of each frame of the voiceprint to be recognized from the Mel cepstral coefficients, to obtain the differential Mel cepstral coefficient feature in the first voiceprint feature.

For example, as shown in S302-S306 of Fig. 3, the voiced intervals are detected and pre-emphasis is then applied to the voiceprint to be recognized, where the pre-emphasis is a high-pass filter. The specific formula is as follows:

$$y(n) = x(n) - 0.9375\,x(n-1) \qquad (2)$$
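The pre-emphasis filter of formula (2) can be sketched as below. The function name is hypothetical, and passing the first sample through unchanged is an assumption, because the formula needs a preceding sample that does not exist for the first one.

```python
def pre_emphasis(samples, coefficient=0.9375):
    """Apply y(n) = x(n) - 0.9375 * x(n-1) to a sequence of samples."""
    if not samples:
        return []
    # Assumed boundary handling: the first sample has no predecessor and is kept as is.
    output = [samples[0]]
    for n in range(1, len(samples)):
        output.append(samples[n] - coefficient * samples[n - 1])
    return output
```

A constant signal is almost cancelled by this filter while rapid changes pass through, which is the high-pass behaviour the text refers to.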

Feature extraction is then performed on the pre-emphasized voiceprint. As shown in S308-S318 of Fig. 3, a three-level real wavelet transform and a three-level dual-tree complex wavelet transform are carried out, the Mel cepstral coefficients are extracted, and the differential Mel cepstral coefficients are calculated from the Mel cepstral coefficients, yielding 10 groups of 20-dimensional voiceprint feature vectors. Steps S308 and S310 may be carried out simultaneously; the step numbering does not limit this embodiment.

With the embodiment provided in this application, the wavelet features of the voiceprint are extracted and combined with the original features to form new features. Because the wavelet features reflect speech characteristics that the original features cannot capture, the accuracy and stability of the voiceprint recognition system are improved.

This is described with the following example. Suppose the voiceprint recognition is applied to identity authentication at a bank. Bank B obtains the voiceprints of a predetermined number of speakers in advance to build background voiceprint models, and then builds a universal voiceprint model from those background voiceprint models. User A stores important data at bank B and wishes to protect it by voiceprint-based identity authentication, so voiceprint features are extracted from the voiceprint of user A to build a voiceprint recognition model. Later, when user A needs to read the data at bank C, the voiceprint recognition method provided in this embodiment can be used to avoid recognition errors caused by a change in the sound collection hardware: the voiceprint feature of user A is extracted, and the voiceprint recognition degree of user A is calculated according to the pre-registered voiceprint recognition model of user A and the background voiceprint models, thereby ensuring both the correctness of the identity authentication of user A and the security of the stored items.

Specifically, take the case in which the voiceprint feature in the voiceprint recognition model comprises 10 voiceprint feature vectors.

For example, a background voiceprint model is built for each speaker from the features extracted from the speech data of dozens of speakers. Each background voiceprint model comprises 10 VQ codebooks, corresponding to 10 features: the Mel cepstrum, the differential Mel cepstrum, 4 real wavelet features and 4 complex wavelet features, each feature being a 20-dimensional vector. A UBM is then built from the background voiceprint models. Further, the voiceprint to be recognized is registered: the voiceprint is collected and features are extracted from it, the VQ codebooks of the UBM are adapted to each feature group, and the VQ codebooks of the voiceprint recognition model (that is, the second voiceprint feature vectors in the second voiceprint feature) are thereby constructed.

Further, the Mel cepstral coefficients, the differential Mel cepstral coefficients and the 8 wavelet features (4 real wavelet and 4 complex wavelet) in each codebook are extracted.

Specifically, voiced intervals are detected in the input signal {s(i): i = 0, ..., N-1} using the energy, the energy ratio of the low and high frequency bands, and the zero-crossing rate. Pre-emphasis is then applied to the input signal:

$$s'(i) = s(i) - 0.9375\,s(i-1), \quad i = 1, \ldots, N-1$$
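The voiced-interval detection mentioned above, which precedes the pre-emphasis, can be sketched per frame as follows. Only two of the three cues named in the text (energy and zero-crossing rate) are illustrated; the low/high band energy ratio is omitted, and the threshold values and function names are hypothetical.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    if not frame:
        return 0.0
    return sum(v * v for v in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def is_voiced(frame, energy_min=0.01, zcr_max=0.25):
    """Assumed rule: voiced speech has high energy and a low zero-crossing rate."""
    return frame_energy(frame) >= energy_min and zero_crossing_rate(frame) <= zcr_max
```

Consecutive frames for which `is_voiced` holds would then be merged into one voiced interval; that merging step is not shown.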

The following operations are then performed on the pre-emphasized voiceprint:

S1, calculating the MFCC of every frame, where each frame has 360 samples and the frame shift is 180 samples.

The dimension of the calculated Mel cepstrum vector is 20.

$$\{\mathrm{MFCC}_i,\ i = 0, \ldots, N_m - 1\}$$

$$\mathrm{MFCC}_i = \{\mathrm{MFCC}_i(k),\ k = 0, \ldots, 19\}$$
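The framing in step S1 can be sketched as below, assuming that the 180-sample figure is the shift between the starts of consecutive frames, so that adjacent frames overlap by half. The function name is hypothetical, and discarding a trailing partial frame is an assumption. The MFCC computation itself is not shown.

```python
def split_frames(signal, frame_length=360, frame_shift=180):
    """Cut a signal into overlapping frames; a trailing partial frame is dropped."""
    last_start = len(signal) - frame_length
    return [signal[start:start + frame_length]
            for start in range(0, last_start + 1, frame_shift)]
```

For a 900-sample signal this yields frames starting at samples 0, 180, 360 and 540.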

S2, for every frame, calculating the difference of the Mel cepstrum vectors to form the differential Mel cepstrum vector.

$$\mathrm{DMFCC}_i = \mathrm{MFCC}_{i+2} - \mathrm{MFCC}_{i-2}$$
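The differential Mel cepstrum of step S2 can be sketched as follows. The function name is hypothetical, and clamping the frame index at the two ends of the sequence is an assumption, since the text does not say how the first and last two frames are handled.

```python
def differential_mfcc(mfcc_frames, offset=2):
    """Compute DMFCC_i = MFCC_(i+2) - MFCC_(i-2) for every frame."""
    count = len(mfcc_frames)
    result = []
    for i in range(count):
        ahead = mfcc_frames[min(i + offset, count - 1)]   # assumed: clamp at the end
        behind = mfcc_frames[max(i - offset, 0)]          # assumed: clamp at the start
        result.append([a - b for a, b in zip(ahead, behind)])
    return result
```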

S3, detecting pitch intervals within each voiced interval and, for each pitch interval obtained, calculating the pitch-synchronous real wavelet and complex wavelet features.

Here, the pitch intervals and their maximum peaks are detected in the input speech signal {s(i): i = 0, ..., N-1}, where N is the length of the speech interval and N_p is the number of pitch intervals. The start position and length of each pitch interval are as follows:

$$\{\mathrm{Pit\_st}(i):\ i = 0, \ldots, N_p - 1\}$$

$$\{\mathrm{Pit\_ln}(i):\ i = 0, \ldots, N_p - 1\}$$

Further, the real wavelet features are calculated as follows:

Four 20-dimensional feature vectors are extracted for each pitch interval. For each pitch interval, a segment is cut out that contains the interval together with a number of samples before and after it, giving the following vector:

$$\{s(\mathrm{Pit\_st}(i) - l_1),\ \ldots,\ s(\mathrm{Pit\_st}(i) + \mathrm{Pit\_ln}(i) + l_1)\},\quad i = 0, \ldots, N_p - 1$$

This vector is then normalized so that its norm is 1.
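The cutting and normalization of one pitch interval can be sketched as below. The function names are hypothetical; the margin argument stands for l_1, the Euclidean norm is assumed, and clamping the segment to the signal boundaries is an assumption not stated in the text.

```python
import math

def cut_pitch_segment(signal, start, length, margin):
    """Return samples s(start - margin) ... s(start + length + margin), inclusive."""
    low = max(start - margin, 0)                          # assumed: clamp at signal start
    high = min(start + length + margin, len(signal) - 1)  # assumed: clamp at signal end
    return signal[low:high + 1]

def unit_norm(vector):
    """Scale a vector so that its Euclidean norm is 1 (all-zero input is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0:
        return list(vector)
    return [v / norm for v in vector]
```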

For the above vector, a three-level real wavelet (for example, Daubechies wavelet) packet transform is carried out to obtain eight coefficient sequences:

$$\{RW_i^0\},\quad i = 1, \ldots, 8$$

$$RW_i^0 = \{RW_i^0(k)\},\quad k = 1, \ldots, M$$

Each sequence corresponds to a specific frequency band. All coefficient sequences have the same length, which is 1/8 of the length of the pitch segment.

Of the 8 sequences obtained above, the 4 sequences corresponding to the low frequency bands are resampled to produce four 20-dimensional vectors:

$$\{RW_i\},\quad i = 1, \ldots, 4$$

$$RW_i = \{RW_i(k)\},\quad k = 1, \ldots, 20$$
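The three-level packet transform and the resampling of the four low-frequency bands can be sketched as follows. This sketch substitutes the Haar wavelet, the shortest filter of the Daubechies family, so that it needs no external library; the linear-interpolation resampling and all function names are assumptions, and samples that do not fill a complete pair at any level are dropped.

```python
import math

def haar_split(x):
    """One Haar analysis step: returns (low band, high band), each half as long."""
    scale = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * scale for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * scale for i in range(0, len(x) - 1, 2)]
    return low, high

def wavelet_packet(x, levels=3):
    """Full packet decomposition; returns 2**levels bands ordered from low to high frequency."""
    bands = [list(x)]
    for _ in range(levels):
        next_bands = []
        for index, band in enumerate(bands):
            low, high = haar_split(band)
            # A band reached through a high-pass branch has a mirrored spectrum,
            # so its children are swapped to keep the list in frequency order.
            next_bands.extend([high, low] if index % 2 else [low, high])
        bands = next_bands
    return bands

def resample(sequence, size=20):
    """Linearly interpolate a coefficient sequence to a fixed length (assumed method)."""
    if not sequence:
        raise ValueError("empty coefficient sequence")
    output = []
    for k in range(size):
        position = k * (len(sequence) - 1) / (size - 1)
        i = int(position)
        fraction = position - i
        following = sequence[min(i + 1, len(sequence) - 1)]
        output.append(sequence[i] * (1.0 - fraction) + following * fraction)
    return output

def real_wavelet_features(segment):
    """Four 20-dimensional vectors from the four lowest of the eight bands."""
    bands = wavelet_packet(segment, levels=3)
    return [resample(band, 20) for band in bands[:4]]
```

The segment passed in should contain at least 8 samples so that every band is non-empty after three levels.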

Further, the complex wavelet features are calculated as follows:

Four 20-dimensional feature vectors are extracted for each pitch interval. For each pitch interval, a segment is cut out that contains the interval together with a number of samples before and after it, and the resulting vector is normalized so that its norm is 1.

For the above segment, a three-level dual-tree complex wavelet packet transform (DT-CWPT) is carried out to obtain the coefficients corresponding to 8 frequency bands. Each frequency band has real coefficients and imaginary coefficients, all coefficient sequences have the same length, and that length is 1/8 of the length of the pitch segment. For each frequency band, an absolute value sequence is obtained from the real and imaginary sequences.

$$\{CW_i\},\quad i = 1, \ldots, 4$$

$$CW_i = \{CW_i(k)\},\quad k = 1, \ldots, 20$$
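The absolute value sequence of one frequency band can be sketched as below. The dual-tree transform itself is not shown; the function name is hypothetical, and interpreting "absolute value" as the complex modulus of each real/imaginary coefficient pair is an assumption.

```python
import math

def magnitude_sequence(real_coefficients, imaginary_coefficients):
    """Combine real and imaginary coefficient sequences into their modulus, element by element."""
    return [math.hypot(re, im)
            for re, im in zip(real_coefficients, imaginary_coefficients)]
```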

S4, based on the 10 feature sets extracted above, a test normalization method is used to carry out the normalization calculation and obtain the similarity between the voiceprint to be recognized and the voiceprint recognition model. When the similarity is judged to be greater than the predetermined threshold, the two are recognized as the same voiceprint; that is, the speaker to be recognized is the same person as the speaker of the established voiceprint recognition model.

Embodiment 2

According to an embodiment of the present invention, a voiceprint recognition device for implementing the above voiceprint recognition method is also provided. As shown in Fig. 4, the device comprises:

1) an extraction unit 402, configured to extract the first voiceprint feature of the voiceprint to be recognized, wherein the first voiceprint feature comprises at least the wavelet feature of the voiceprint to be recognized;

2) a computing unit 404, configured to calculate the recognition degree of the voiceprint to be recognized at least according to the first voiceprint feature and the second voiceprint feature in the voiceprint recognition model;

3) a judging unit 406, configured to judge whether the recognition degree is greater than the predetermined threshold;

4) a recognition unit 408, configured to recognize the voiceprint to be recognized as the target voiceprint when the recognition degree is judged to be greater than the predetermined threshold.
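The cooperation of the four units can be sketched in software as follows. This is only one possible arrangement: the class and attribute names are hypothetical, and the extraction and computing steps are passed in as plain functions rather than implemented here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class VoiceprintRecognizer:
    extract: Callable[[Sequence[float]], List[float]]  # role of extraction unit 402
    compute: Callable[[List[float]], float]            # role of computing unit 404
    threshold: float                                   # the predetermined threshold

    def judge(self, degree: float) -> bool:
        """Role of judging unit 406: compare the recognition degree with the threshold."""
        return degree > self.threshold

    def recognize(self, samples: Sequence[float]) -> bool:
        """Role of recognition unit 408: True means the target voiceprint is recognized."""
        feature = self.extract(samples)
        return self.judge(self.compute(feature))
```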

Optionally, in this embodiment, the voiceprint recognition device may be, but is not limited to being, applied in processes that use a voiceprint for identification and identity authentication. For example, in bank transactions a voiceprint recognition model is registered in advance for the transaction party. When the transaction party needs to be authenticated, the above voiceprint recognition method is used, and the transaction proceeds only after the recognition degree of the transaction party's voiceprint is judged to satisfy the predetermined threshold and the voiceprint is recognized as the same voiceprint, thereby ensuring the security of the transaction. This is only an example, and this embodiment imposes no limitation in this respect.

Optionally, in this embodiment, the voiceprint recognition model may be, but is not limited to: a model, adapted to the voiceprint to be recognized and used to recognize it, obtained by adjusting the universal voiceprint model according to the multiple voiceprint feature vectors of the voiceprint feature of the voiceprint to be recognized. The universal model may be, but is not limited to being, built as follows: the voiceprints of many people are collected and voiceprint features are extracted from each person's voiceprint; a background voiceprint model corresponding to each person's voiceprint is built from the voiceprint features; and the third voiceprint features in the multiple background voiceprint models are then clustered to build the universal voiceprint model, for example a universal background model (Universal Background Model, UBM).

As an optional scheme, the device further comprises:

1) a collecting unit, configured to collect the voiceprint to be recognized before the first voiceprint feature of the voiceprint to be recognized is extracted;

2) an adjustment unit, configured to adjust, at least according to the first voiceprint feature vectors in the first voiceprint feature, the voiceprint feature vector parameters corresponding to the first voiceprint feature vectors in a pre-established universal voiceprint model, so as to construct the second voiceprint feature vectors in the second voiceprint feature of the voiceprint recognition model adapted to the voiceprint to be recognized.

As an optional scheme, the first voiceprint feature comprises multiple first voiceprint feature vectors, the second voiceprint feature comprises multiple second voiceprint feature vectors, and the computing unit 404 comprises:

1) a first computing module, configured to calculate the vector distance between each first voiceprint feature vector in the first voiceprint feature and the corresponding second voiceprint feature vector in the second voiceprint feature;

2) a second computing module, configured to calculate the target distance between the first voiceprint feature and the second voiceprint feature according to the multiple vector distances calculated;

3) a third computing module, configured to calculate the recognition degree of the voiceprint to be recognized at least using the target distance between the first voiceprint feature and the second voiceprint feature.

As an optional scheme, the device further comprises:

1) a collecting unit, configured to collect multiple voiceprints and extract the third voiceprint feature of each of the multiple voiceprints, to build multiple background voiceprint models corresponding to the voiceprints, wherein the third voiceprint feature comprises multiple third voiceprint feature vectors;

2) a building unit, configured to build the universal voiceprint model according to the background voiceprint models.

As an optional scheme, the third computing module comprises:

1) a first calculating submodule, configured to calculate the background distance between the first voiceprint feature and the third voiceprint feature of each voiceprint corresponding to the multiple background voiceprint models;

2) a second calculating submodule, configured to calculate a distance mean and a distance standard deviation from the multiple background distances;

3) a third calculating submodule, configured to calculate the difference between the target distance of the first voiceprint feature and the second voiceprint feature and the distance mean;

4) a fourth calculating submodule, configured to calculate the ratio of the difference to the distance standard deviation and take the ratio as the recognition degree of the voiceprint to be recognized.

$$s' = \frac{s - u}{\sigma} \qquad (3)$$

As an optional scheme, the wavelet feature comprises a real wavelet feature and/or a complex wavelet feature, and the extraction unit 402 comprises:

1) a detection module, configured to detect voiced intervals in the voiceprint to be recognized;

2) a first extraction module, configured to detect pitch intervals within each voiced interval and extract the real wavelet feature and/or the complex wavelet feature of the voiceprint feature in each pitch interval.

Optionally, in this embodiment, the first extraction module comprises:

1) a first extracting submodule, configured to extract predetermined feature vectors in each pitch interval, divide the feature vectors of the multiple pitch intervals into sample vectors of a predetermined length according to the wavelet filter, and normalize the sample vectors of the predetermined length;

2) a transformation submodule, configured to perform at least one of the following wavelet transforms on the normalized sample vectors of the predetermined length:

(1) performing a real wavelet transform on the normalized sample vectors of the predetermined length to obtain the real coefficients of first predetermined frequency bands, and selecting, among the first predetermined frequency bands, the frequency bands that satisfy a first predetermined condition for sampling, to obtain the real wavelet feature in the first voiceprint feature;

(2) performing a dual-tree complex wavelet transform on the normalized sample vectors of the predetermined length to obtain the real coefficients and imaginary coefficients of second predetermined frequency bands, and selecting, among the second predetermined frequency bands, the frequency bands that satisfy a second predetermined condition for sampling, to obtain the complex wavelet feature in the first voiceprint feature.

As an optional scheme, the extraction unit 402 further comprises:

1) a second extraction module, configured to extract, after the voiced intervals in the voiceprint to be recognized are detected, the Mel cepstral coefficients of each frame of the voiceprint to be recognized, to obtain the Mel cepstral coefficient feature in the first voiceprint feature;

2) a third computing module, configured to calculate the differential Mel cepstral coefficients of each frame of the voiceprint to be recognized from the Mel cepstral coefficients, to obtain the differential Mel cepstral coefficient feature in the first voiceprint feature.

$$y(n) = x(n) - 0.9375\,x(n-1) \qquad (4)$$

The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.

In the above embodiments, the description of each embodiment has its own emphasis. For a part not described in detail in one embodiment, reference may be made to the related description of the other embodiments.

In the several embodiments provided in this application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical or take other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the scheme of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, mobile terminal, server, network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), removable hard disk, magnetic disk, or optical disc.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims

1. A voiceprint recognition method, characterized by comprising:

extracting a first voiceprint feature of a voiceprint to be recognized, wherein the first voiceprint feature comprises at least a wavelet feature of the voiceprint to be recognized;

calculating a recognition degree of the voiceprint to be recognized at least according to the first voiceprint feature and a second voiceprint feature in a voiceprint recognition model;

judging whether the recognition degree is greater than a predetermined threshold;

if the recognition degree is judged to be greater than the predetermined threshold, recognizing the voiceprint to be recognized as a target voiceprint.

2. The voiceprint recognition method according to claim 1, characterized in that, before extracting the first voiceprint feature of the voiceprint to be recognized, the method further comprises:

collecting the voiceprint to be recognized;

adjusting, at least according to a first voiceprint feature vector in the first voiceprint feature, the voiceprint feature vector parameter corresponding to the first voiceprint feature vector in a pre-established universal voiceprint model, so as to construct a second voiceprint feature vector in the second voiceprint feature of the voiceprint recognition model adapted to the voiceprint to be recognized.

3. The voiceprint recognition method according to claim 2, characterized in that the first voiceprint feature comprises multiple first voiceprint feature vectors and the second voiceprint feature comprises multiple second voiceprint feature vectors, wherein calculating the recognition degree of the voiceprint to be recognized at least according to the first voiceprint feature and the second voiceprint feature in the voiceprint recognition model comprises:

calculating the vector distance between each first voiceprint feature vector in the first voiceprint feature and the corresponding second voiceprint feature vector in the second voiceprint feature;

calculating the target distance between the first voiceprint feature and the second voiceprint feature according to the multiple vector distances calculated;

calculating the recognition degree of the voiceprint to be recognized at least using the target distance between the first voiceprint feature and the second voiceprint feature.

4. The voiceprint recognition method according to claim 3, characterized in that, before collecting the voiceprint to be recognized, the method further comprises:

collecting multiple voiceprints and extracting a third voiceprint feature of each of the multiple voiceprints, to build multiple background voiceprint models corresponding to the voiceprints, wherein the third voiceprint feature comprises multiple third voiceprint feature vectors;

building the universal voiceprint model according to the background voiceprint models.

5. The voiceprint recognition method according to claim 4, characterized in that calculating the recognition degree of the voiceprint to be recognized at least using the distance between the first voiceprint feature and the second voiceprint feature comprises:

calculating the background distance between the first voiceprint feature and the third voiceprint feature of each voiceprint corresponding to the multiple background voiceprint models;

calculating a distance mean and a distance standard deviation from the multiple background distances;

calculating the difference between the target distance of the first voiceprint feature and the second voiceprint feature and the distance mean;

calculating the ratio of the difference to the distance standard deviation, and taking the ratio as the recognition degree of the voiceprint to be recognized.

6. The voiceprint recognition method according to claim 1, characterized in that the wavelet feature comprises a real wavelet feature and/or a complex wavelet feature, wherein extracting the first voiceprint feature of the voiceprint to be recognized comprises:

detecting voiced intervals in the voiceprint to be recognized;

detecting pitch intervals within each voiced interval, and extracting the real wavelet feature and/or the complex wavelet feature of the voiceprint feature in each pitch interval.

7. The voiceprint recognition method according to claim 6, characterized in that extracting, in each pitch interval, the real wavelet feature and/or the complex wavelet feature of the voiceprint feature comprises:

Extracting a predetermined feature vector in each pitch interval, dividing the feature vectors of the plurality of pitch intervals into sample vectors of a predetermined length according to a wavelet filter, and normalizing the sample vectors of the predetermined length;

Performing at least one of the following wavelet transforms on the normalized sample vectors of the predetermined length:

Performing a real-valued wavelet transform on the normalized sample vectors of the predetermined length to obtain real coefficients of a first predetermined frequency band, and selecting, within the first predetermined frequency band, a frequency band that satisfies a first predetermined condition for sampling, so as to obtain the real wavelet feature in the first voiceprint feature;

Performing a dual-tree complex wavelet transform on the normalized sample vectors of the predetermined length to obtain real-part coefficients and imaginary-part coefficients of a second predetermined frequency band, and selecting, within the second predetermined frequency band, a frequency band that satisfies a second predetermined condition for sampling, so as to obtain the complex wavelet feature in the first voiceprint feature.
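
The normalization and real-valued wavelet decomposition steps of claim 7 can be illustrated with the sketch below. It is not the filter bank of the claims: the Haar wavelet, the peak normalization, and the names `normalize`, `haar_step` and `haar_dwt` are assumptions chosen to keep the example short, and the dual-tree complex wavelet transform is not shown.

```python
import math


def normalize(sample):
    """Remove the mean and scale to unit peak magnitude.

    The claims do not say which normalization is used; this is one option.
    """
    if not sample:
        raise ValueError("sample vector must not be empty")
    average = sum(sample) / len(sample)
    centered = [x - average for x in sample]
    peak = max(abs(x) for x in centered)
    return centered if peak == 0 else [x / peak for x in centered]


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels):
    """Multi-level decomposition into frequency bands.

    Returns [detail level 1, ..., detail level `levels`, final approximation],
    ordered from the highest frequency band to the lowest.
    """
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


if __name__ == "__main__":
    # Hypothetical sample vector of predetermined length 8.
    sample = normalize([0.5, 2.0, -1.0, 3.0, 0.0, 1.0, 4.0, -2.0])
    for index, band in enumerate(haar_dwt(sample, 3)):
        print(index, band)
```

Selecting the band that satisfies the predetermined condition would then amount to picking one or more entries of the returned list; the condition itself is not given in the claims.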

8. The voiceprint recognition method according to claim 7, characterized in that, after detecting the voiced intervals in the voiceprint to be recognized, extracting the first voiceprint feature of the voiceprint to be recognized further comprises:

Extracting the Mel-frequency cepstral coefficients of each frame of the voiceprint to be recognized, so as to obtain the Mel-frequency cepstral coefficient feature in the first voiceprint feature;

Calculating the differential Mel-frequency cepstral coefficient feature of each frame of the voiceprint to be recognized according to the Mel-frequency cepstral coefficients, so as to obtain the differential Mel-frequency cepstral coefficient feature in the first voiceprint feature.
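
The differential coefficients of claim 8 are derived from the per-frame Mel cepstral coefficients. The sketch below uses the regression formula that is common in speech processing; the claims do not state which formula is used, so the formula, the window size, the edge handling, and the name `delta_features` are assumptions.

```python
def delta_features(frames, window=2):
    """Differential (delta) coefficients for a sequence of coefficient frames.

    Uses d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2),
    repeating the first and last frame at the edges (an assumed convention).
    `frames` is a list of equal-length lists of floats.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    count = len(frames)
    if count == 0:
        return []
    denominator = 2 * sum(n * n for n in range(1, window + 1))
    dimension = len(frames[0])
    deltas = []
    for t in range(count):
        row = []
        for k in range(dimension):
            total = 0.0
            for n in range(1, window + 1):
                following = frames[min(t + n, count - 1)][k]
                preceding = frames[max(t - n, 0)][k]
                total += n * (following - preceding)
            row.append(total / denominator)
        deltas.append(row)
    return deltas


if __name__ == "__main__":
    # Hypothetical two-dimensional coefficients for five frames.
    mfcc = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]
    print(delta_features(mfcc))
```

For a coefficient that rises by one per frame, the interior frames get a differential value of one, and a constant coefficient gets zero.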

9. A voiceprint recognition device, characterized by comprising:

An extraction unit, configured to extract a first voiceprint feature of a voiceprint to be recognized, wherein the first voiceprint feature at least comprises a wavelet feature of the voiceprint to be recognized;

A computing unit, configured to calculate a recognition degree of the voiceprint to be recognized at least according to the first voiceprint feature and a second voiceprint feature in a voiceprint recognition model;

A judging unit, configured to judge whether the recognition degree is greater than a predetermined threshold;

A recognition unit, configured to recognize the voiceprint to be recognized as a target voiceprint when the recognition degree is judged to be greater than the predetermined threshold.

10. The voiceprint recognition device according to claim 9, characterized by further comprising:

A collecting unit, configured to collect the voiceprint to be recognized before the first voiceprint feature of the voiceprint to be recognized is extracted;

An adjustment unit, configured to adjust, at least according to a first voiceprint feature vector in the first voiceprint feature, the voiceprint feature vector parameter corresponding to the first voiceprint feature vector in a pre-established universal voiceprint model, so as to construct a second voiceprint feature vector of the second voiceprint feature in the voiceprint recognition model adapted to the voiceprint to be recognized.

11. The voiceprint recognition device according to claim 10, characterized in that the first voiceprint feature comprises a plurality of the first voiceprint feature vectors, the second voiceprint feature comprises a plurality of the second voiceprint feature vectors, and the computing unit comprises:

A first computing module, configured to calculate the vector distance between each first voiceprint feature vector in the first voiceprint feature and the second voiceprint feature vector in the second voiceprint feature that corresponds to that first voiceprint feature vector;

A second computing module, configured to calculate the target distance between the first voiceprint feature and the second voiceprint feature according to the plurality of calculated vector distances;

A third computing module, configured to calculate the recognition degree of the voiceprint to be recognized at least using the target distance between the first voiceprint feature and the second voiceprint feature.
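
The first and second computing modules of claim 11 can be pictured with the sketch below. The claims leave the distance measure and the way the vector distances are combined open; Euclidean distance, pairing by position, averaging, and the names `vector_distance` and `target_distance` are assumptions made for illustration.

```python
import math


def vector_distance(first_vector, second_vector):
    """Euclidean distance between two feature vectors (assumed measure)."""
    if len(first_vector) != len(second_vector):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(first_vector, second_vector)))


def target_distance(first_feature, second_feature):
    """Combine the per-vector distances into one target distance.

    Vectors are paired by position and the distances are averaged; both
    choices are assumptions, since the claims only say the target distance
    is calculated from the plurality of vector distances.
    """
    if not first_feature or len(first_feature) != len(second_feature):
        raise ValueError("features must be non-empty and of equal length")
    distances = [vector_distance(a, b) for a, b in zip(first_feature, second_feature)]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    # Hypothetical first and second voiceprint features with two vectors each.
    first = [[0.0, 0.0], [1.0, 1.0]]
    second = [[3.0, 4.0], [1.0, 1.0]]
    print(target_distance(first, second))
```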

12. The voiceprint recognition device according to claim 11, characterized by further comprising:

A collecting unit, configured to collect a plurality of voiceprints and extract a third voiceprint feature of each of the plurality of voiceprints, so as to build a plurality of background voiceprint models corresponding to the voiceprints, wherein the third voiceprint feature comprises a plurality of third voiceprint feature vectors;

An establishing unit, configured to establish the universal voiceprint model according to the background voiceprint models.

13. The voiceprint recognition device according to claim 12, characterized in that the third computing module comprises:

A first calculating submodule, configured to calculate, respectively, the background distance between the first voiceprint feature and the third voiceprint feature of each voiceprint corresponding to the plurality of background voiceprint models;

A second calculating submodule, configured to calculate a distance average and a distance standard deviation from the plurality of background distances;

A third calculating submodule, configured to calculate the difference between the target distance, which is the distance between the first voiceprint feature and the second voiceprint feature, and the distance average;

A fourth calculating submodule, configured to calculate the ratio of the difference to the distance standard deviation, and to take the ratio as the recognition degree of the voiceprint to be recognized.

14. The voiceprint recognition device according to claim 9, characterized in that the wavelet feature comprises a real wavelet feature and/or a complex wavelet feature, and the extraction unit comprises:

A detection module, configured to detect voiced intervals in the voiceprint to be recognized;

A first extraction module, configured to detect pitch intervals within each voiced interval, and to extract, in each pitch interval, the real wavelet feature and/or the complex wavelet feature of the voiceprint feature.

15. The voiceprint recognition device according to claim 14, characterized in that the first extraction module comprises:

A first extraction submodule, configured to extract a predetermined feature vector in each pitch interval, to divide the feature vectors of the plurality of pitch intervals into sample vectors of a predetermined length according to a wavelet filter, and to normalize the sample vectors of the predetermined length;

A transformation submodule, configured to perform at least one of the following wavelet transforms on the normalized sample vectors of the predetermined length:

16. The voiceprint recognition device according to claim 15, characterized in that the extraction unit further comprises:

A second extraction module, configured to extract, after the voiced intervals in the voiceprint to be recognized are detected, the Mel-frequency cepstral coefficients of each frame of the voiceprint to be recognized, so as to obtain the Mel-frequency cepstral coefficient feature in the first voiceprint feature;

A third computing module, configured to calculate the differential Mel-frequency cepstral coefficient feature of each frame of the voiceprint to be recognized according to the Mel-frequency cepstral coefficients, so as to obtain the differential Mel-frequency cepstral coefficient feature in the first voiceprint feature.