CN106971712A

CN106971712A - Adaptive rapid voiceprint recognition method and system

Info

Publication number: CN106971712A
Application number: CN201610024133.1A
Authority: CN
Inventors: 祝铭明
Original assignee: Yutou Technology Hangzhou Co Ltd
Current assignee: Yutou Technology Hangzhou Co Ltd
Priority date: 2016-01-14
Filing date: 2016-01-14
Publication date: 2017-07-21

Abstract

The invention belongs to the field of voice signals, and in particular relates to an adaptive rapid voiceprint recognition method and system applied to a household robot, comprising: S1: collecting a voice signal; S2: preprocessing the voice signal; S3: extracting speech feature parameters from the preprocessed voice signal; S4: establishing an acoustic model for each family member; S5: dividing the acoustic models in advance, according to frequency of use, into a first acoustic model and a second acoustic model, the first acoustic model being used more frequently than the second acoustic model, and loading the first acoustic model into a cache at power-on, wherein the first acoustic model and the second acoustic model are updated in real time according to usage-frequency statistics collected over a predetermined time; S6: performing pattern matching on the voice signal under test against the first acoustic model and the second acoustic model to obtain a recognition result. The above technical scheme realizes voiceprint recognition adaptively and effectively increases the speed of voiceprint recognition.

Description

Adaptive rapid voiceprint recognition method and system

Technical field

The invention belongs to the field of voice signals, and in particular relates to an adaptive rapid voiceprint recognition method and system.

Background art

Household service robots are one of the most active fields of frontier high-technology research. They can perform services that benefit people, such as housework, entertainment and leisure, education, and security monitoring, and they have a broad potential customer base and market. Existing household service robots widely use speech recognition to realize human-machine interaction, allowing the robot to understand human speech and perform corresponding actions. However, existing robots cannot accurately identify the speaker and therefore cannot meet users' demand for personalization. Voiceprint recognition technology, which emerged with the development of computer technology and digital signal processing theory, extracts from a segment of a speaker's speech the feature parameters that reflect the speaker's physiological and psychological characteristics, then performs analysis, modeling, and pattern matching on those parameters in order to identify or verify an unknown speaker. However, existing voiceprint recognition systems are usually designed for one specific application scenario. When the application scenario changes, their adaptive ability is weak and free human-machine communication cannot be achieved. In addition, voiceprint recognition is too slow, which results in a poor user experience. This is undesirable to those skilled in the art.

Summary of the invention

To solve the above technical problems, an adaptive rapid voiceprint recognition method and system are provided, which overcome the defects of existing recognition methods.

The specific technical scheme is as follows:

An adaptive rapid voiceprint recognition method, applied to a household robot, comprising the following steps:

S1: Collect a voice signal;

S2: Preprocess the voice signal;

S3: Extract speech feature parameters from the preprocessed voice signal, the speech feature parameters comprising first-type feature parameters obtained by linear prediction and second-type feature parameters extracted by simulating the human ear's perception of sound frequency;

S4: Establish a codebook for each family member and store it in a speech database as that family member's voice template, all codebooks of the family member constituting an acoustic model;

S5: Divide the acoustic models in advance, according to frequency of use, into a first acoustic model and a second acoustic model, the first acoustic model being used more frequently than the second acoustic model, and load the first acoustic model into a cache at power-on; wherein the first acoustic model and the second acoustic model are updated in real time according to usage-frequency statistics collected over a predetermined time;

S6: Perform pattern matching on the voice signal under test against the first acoustic model and the second acoustic model to obtain a recognition result.

In the above adaptive rapid voiceprint recognition method, the preprocessing in step S2 comprises, in order:

Step S21, sample and quantize the preprocessed voice signal to obtain a digital speech signal;

Step S22, pass the digital speech signal through a filter bank to boost the high-frequency components of the digital signal;

Step S23, frame and window the voice signal obtained in step S22 to obtain the windowed voice signal.

In the above adaptive rapid voiceprint recognition method, the first-type feature parameters extracted in step S3 are linear prediction coefficients, extracted as follows:

Step S31a, define the short-time speech signal and the error signal;

Step S32a, compute the sum of squared errors from the short-time speech signal and the error signal;

Step S33a, differentiate the sum of squared errors and solve the resulting system of equations to obtain the first-type feature parameters.

In the above adaptive rapid voiceprint recognition method, extracting the second-type feature parameters in step S3 comprises:

Step S31b, apply a Fourier transform to the preprocessed voice signal to obtain a linear spectrum;

Step S32b, pass the linear spectrum through a bank of triangular band-pass filters to obtain the corresponding Mel spectrum;

Step S33b, compute the logarithmic spectrum of the Mel spectrum;

Step S34b, apply a discrete cosine transform to the logarithmic spectrum to obtain the second-type feature parameters.

In the above adaptive rapid voiceprint recognition method, step S4 specifically comprises:

Step S41, extract N feature vectors from the voice signal and classify the feature vectors by a clustering method to obtain M codebooks;

Step S42, obtain the codebook vector corresponding to each class;

Step S43, form the acoustic model from the set of codebook vectors established for each family member.

In the above adaptive rapid voiceprint recognition method, step S6 is specifically as follows:

Step S61, match the voice signal to be identified for similarity against the first acoustic model and then the second acoustic model, judging the match by a weighted Euclidean distance measure;

Step S62, select an appropriate distance measure as the threshold;

Step S63, take the result that falls within the threshold range as the recognition result.

Also provided is an adaptive rapid voiceprint recognition system, comprising:

A voice input module, for capturing a voice signal;

A preprocessing module, connected to the voice input module, for preprocessing the voice signal;

A first feature parameter extraction module, connected to the preprocessing module, for obtaining the first feature parameters of the voice signal;

A second feature parameter extraction module, connected to the preprocessing module, for obtaining the second feature parameters of the voice signal;

A training module, connected to the first feature parameter extraction module and the second feature parameter extraction module, for establishing a voice template for each family member, all codebooks of the family member constituting an acoustic model;

A classification processing module, connected to the training module, which divides the acoustic models in advance, according to frequency of use, into a first acoustic model and a second acoustic model, wherein the first acoustic model is used more frequently than the second acoustic model, and which loads the first acoustic model into a cache at power-on;

An update module, connected to the classification processing module, which updates the first acoustic model and the second acoustic model in real time according to usage-frequency statistics collected over a predetermined time;

A template matching module, connected to the classification processing module, which performs pattern matching on the voice signal under test against the first acoustic model and the second acoustic model to obtain a recognition result.

Beneficial effects: the above technical scheme realizes voiceprint recognition adaptively, effectively increases the speed of voiceprint recognition, handles human-machine communication under different application scenarios, and helps improve the user experience.

Brief description of the drawings

Fig. 1 is a flow chart of the method of the present invention;

Fig. 2 is a flow chart of step S2 of the method of the present invention;

Fig. 3 is a structural diagram of the system of the present invention.

Embodiments

The technical schemes in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

It should be noted that, provided they do not conflict, the embodiments of the present invention and the features of the embodiments may be combined with one another.

The invention is further described below with reference to the accompanying drawings and specific embodiments, which are not to be taken as limiting the invention.

Referring to Fig. 1, an adaptive rapid voiceprint recognition method, applied to a household robot, comprises the following steps:

S1: Collect a voice signal;

S2: Preprocess the voice signal;

S3: Extract speech feature parameters from the preprocessed voice signal, the speech feature parameters comprising first-type feature parameters obtained by linear prediction and second-type feature parameters extracted by simulating the human ear's perception of sound frequency;

S4: Establish a codebook for each family member and store it in a speech database as that family member's voice template, all codebooks of the family member constituting an acoustic model;

S5: Divide the acoustic models in advance, according to frequency of use, into a first acoustic model (frequently used) and a second acoustic model (infrequently used), the first acoustic model being used more frequently than the second acoustic model; at power-on, load the first acoustic model into a cache while the second acoustic model remains stored in the speech database; wherein the first acoustic model and the second acoustic model are updated in real time according to usage-frequency statistics collected over a predetermined time;

S6: Perform pattern matching on the voice signal under test against the first acoustic model and the second acoustic model to obtain a recognition result.
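The division and update described in step S5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `TieredModelStore`, the fixed `cache_size`, and the choice to reset the counters after each refresh are all hypothetical, and a plain dictionary stands in for the speech database.

```python
from collections import Counter


class TieredModelStore:
    """Keeps the most frequently used acoustic models in an in-memory cache."""

    def __init__(self, database, cache_size):
        self.database = database      # member name -> acoustic model (stand-in for the speech database)
        self.cache_size = cache_size  # assumption: number of models treated as "first" (frequent)
        self.usage = Counter()        # usage counts within the current time window
        self.cache = {}

    def power_on(self, initial_frequent):
        """Load the models that were pre-assigned as first acoustic models."""
        self.cache = {name: self.database[name] for name in initial_frequent}

    def record_use(self, name):
        """Count one successful recognition of the given family member."""
        self.usage[name] += 1

    def refresh(self):
        """Re-divide the models by the counted usage, then start a new window."""
        ranked = sorted(self.database, key=lambda n: (-self.usage[n], n))
        self.cache = {n: self.database[n] for n in ranked[:self.cache_size]}
        self.usage.clear()

    def first_models(self):
        return dict(self.cache)

    def second_models(self):
        return {n: m for n, m in self.database.items() if n not in self.cache}
```

Calling `refresh()` at the end of each predetermined time window moves a member whose model has become the most used into the cache and demotes the others to the database tier.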

Differences in the physiology of the vocal organs give each person a different manner of articulation and different speaking habits. The present invention combines the first-type feature parameters obtained by linear prediction with the second-type feature parameters extracted by simulating the human ear's perception of sound frequency to obtain the acoustic model, thereby improving on existing voiceprint recognition and the user experience.

In the above adaptive rapid voiceprint recognition method, referring to Fig. 2, the preprocessing in step S2 comprises, in order:

Step S21, sample and quantize the preprocessed voice signal to obtain a digital speech signal;

Step S22, pass the digital speech signal through a filter bank to boost the high-frequency components of the digital signal;

Step S23, frame and window the voice signal obtained in step S22 to obtain the windowed voice signal.
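The high-frequency boost of step S22 and the framing and windowing of step S23 can be sketched as below. The sketch assumes that sampling and quantization (step S21) have already been done by the capture hardware, substitutes a single first-order pre-emphasis filter for the filter bank named in the text, and uses a Hamming window. The coefficient `alpha`, the frame length, and the hop size are illustrative values that the source does not specify.

```python
import numpy as np


def pre_emphasize(signal, alpha=0.97):
    """Boost high frequencies with y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])


def frame_and_window(signal, frame_len=400, hop=160):
    """Split the signal into overlapping frames and apply a Hamming window to each frame."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row i holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)
```

At a 16 kHz sampling rate the defaults correspond to 25 ms frames with a 10 ms hop, a common but not mandatory choice.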

In the above adaptive rapid voiceprint recognition method, the first-type feature parameters extracted in step S3 may be linear prediction coefficients, extracted as follows:

Step S31a, define the short-time speech signal and the error signal;

Step S32a, compute the sum of squared errors from the short-time speech signal and the error signal;

Step S33a, differentiate the sum of squared errors and solve the resulting system of equations to obtain the first-type feature parameters.
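As a worked illustration of steps S31a to S33a, the standard least-squares formulation of linear prediction is written out below. The symbols $s(n)$ (short-time speech signal), $a_k$ (prediction coefficients), $p$ (prediction order), $e(n)$ (error signal), and $E$ (sum of squared errors) are notation assumed here; the source does not define them.

```latex
\begin{align}
\hat{s}(n) &= \sum_{k=1}^{p} a_k\, s(n-k), \qquad e(n) = s(n) - \hat{s}(n) \\
E &= \sum_{n} e(n)^2 = \sum_{n} \Bigl( s(n) - \sum_{k=1}^{p} a_k\, s(n-k) \Bigr)^{2} \\
\frac{\partial E}{\partial a_i} = 0 \;&\Longrightarrow\;
\sum_{k=1}^{p} a_k \sum_{n} s(n-k)\, s(n-i) = \sum_{n} s(n)\, s(n-i),
\qquad i = 1, \dots, p
\end{align}
```

The last line is a system of $p$ linear equations in the $p$ unknowns $a_1, \dots, a_p$; its solution gives the linear prediction coefficients.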

Because adjacent speech samples are correlated, linear prediction can be used to predict the present or a future sample value from past sample values; that is, the present speech sample is approximated by several past speech samples or by a linear combination of them.
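A minimal sketch of this prediction scheme, assuming the autocorrelation formulation of the least-squares problem, is given below. The function name `lpc_coefficients` and the default order of 12 are illustrative assumptions; the source does not state a prediction order or a particular solver.

```python
import numpy as np


def lpc_coefficients(frame, order=12):
    """Estimate coefficients a_1..a_p so that s[n] is approximated by sum_k a_k * s[n-k]."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation values r[0..order] of the short-time frame.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Normal equations R a = r[1:], where R[i, j] = r[|i - j|] (a Toeplitz matrix).
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])
```

The call fails with a `LinAlgError` on an all-zero frame because the matrix is then singular, so silent frames would need to be skipped or regularized before this step.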

In the above adaptive rapid voiceprint recognition method, extracting the second-type feature parameters in step S3 comprises:

Step S31b, apply a Fourier transform to the preprocessed voice signal to obtain a linear spectrum;

Step S32b, pass the linear spectrum through a bank of triangular band-pass filters to obtain the corresponding Mel spectrum;

Step S33b, compute the logarithmic spectrum of the Mel spectrum;

Step S34b, apply a discrete cosine transform to the logarithmic spectrum to obtain the second-type feature parameters.
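Steps S31b to S34b correspond to the usual computation of Mel-frequency cepstral coefficients, which can be sketched for a single windowed frame as follows. The Mel-scale formula, the power spectrum, the unnormalized DCT-II, and all numeric defaults (sampling rate, FFT size, number of filters and coefficients) are common conventions assumed here rather than values given in the source.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular band-pass filters spaced evenly on the Mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):        # rising edge of the triangle
            bank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge of the triangle
            bank[i - 1, k] = (right - k) / (right - center)
    return bank


def mfcc(frame, sample_rate=16000, n_fft=512, n_filters=26, n_coeffs=13):
    """Compute cepstral coefficients for one windowed frame."""
    spectrum = np.abs(np.fft.rfft(frame, n_fft)) ** 2                      # S31b: linear (power) spectrum
    mel_energy = mel_filterbank(n_filters, n_fft, sample_rate) @ spectrum  # S32b: Mel spectrum
    log_mel = np.log(mel_energy + 1e-10)                                   # S33b: logarithmic spectrum
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), 2 * n + 1) / (2 * n_filters))
    return basis @ log_mel                                                 # S34b: DCT-II
```

The small constant added before the logarithm only guards against taking the logarithm of zero for silent frames.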

In the above adaptive rapid voiceprint recognition method, step S4 specifically comprises:

Step S41, extract N feature vectors from the first-type feature parameters and the second-type feature parameters, and classify the feature vectors by a clustering method to obtain M codebooks;

Step S42, obtain the codebook vector corresponding to each class;

Step S43, form the acoustic model from the set of codebook vectors established for each family member.
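The clustering of steps S41 and S42 can be sketched with plain k-means, one common way to train a vector-quantization codebook. The source names only "a clustering method", so the use of k-means, the function name `train_codebook`, and the iteration count are assumptions. Each row of the returned array is treated as the codebook vector (centroid) of one class.

```python
import numpy as np


def train_codebook(features, n_codewords, n_iter=20, seed=0):
    """Cluster N feature vectors into n_codewords classes and return one centroid per class."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize the centroids with distinct training vectors.
    codebook = features[rng.choice(len(features), n_codewords, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every vector to every centroid.
        dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for m in range(n_codewords):
            members = features[labels == m]
            if len(members):                 # keep the old centroid if a class is empty
                codebook[m] = members.mean(axis=0)
    return codebook
```

Training one such codebook per family member and storing the results under the member's name would give the set of voice templates described in step S4.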

In the above adaptive rapid voiceprint recognition method, step S6 is specifically as follows:

Step S61, match the voice signal to be identified for similarity against the first acoustic model and then the second acoustic model, judging the match by a weighted Euclidean distance measure;

Step S62, select an appropriate distance measure as the threshold;

Step S63, take the result that falls within the threshold range as the recognition result.
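The successive matching of step S6 can be sketched as below, assuming each acoustic model is a codebook array keyed by family member. Averaging the nearest-codeword distances over all feature vectors, and returning as soon as the cached first-tier models yield a match within the threshold, are design assumptions made for this illustration; the source states only that the two tiers are matched in turn using a weighted Euclidean distance and a threshold.

```python
import numpy as np


def codebook_distortion(features, codebook, weights):
    """Mean weighted Euclidean distance from each feature vector to its nearest codebook vector."""
    diff = features[:, None, :] - codebook[None, :, :]
    dists = np.sqrt((weights * diff ** 2).sum(axis=2))
    return dists.min(axis=1).mean()


def identify(features, first_models, second_models, weights, threshold):
    """Search the cached (first) models before the database (second) models."""
    for tier in (first_models, second_models):
        best_name, best_dist = None, np.inf
        for name, codebook in tier.items():
            d = codebook_distortion(features, codebook, weights)
            if d < best_dist:
                best_name, best_dist = name, d
        if best_dist <= threshold:
            return best_name, best_dist
    return None, None  # no model within the threshold: speaker rejected
```

Under these assumptions a frequent speaker is recognized without touching the second tier at all, which is where the speed gain of the cache would come from.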

Also provided, referring to Fig. 3, is an adaptive rapid voiceprint recognition system, comprising:

A voice input module 1, for capturing a voice signal;

A preprocessing module 2, connected to the voice input module 1, for preprocessing the voice signal;

A first feature parameter extraction module 3, connected to the preprocessing module 2, for obtaining the first feature parameters of the voice signal;

A second feature parameter extraction module 4, connected to the preprocessing module 2, for obtaining the second feature parameters of the voice signal;

A training module 5, connected to the first feature parameter extraction module 3 and the second feature parameter extraction module 4, for establishing a voice template for each family member, all codebooks of the family member constituting an acoustic model;

A classification processing module 6, connected to the training module 5, which divides the acoustic models in advance, according to frequency of use, into a first acoustic model and a second acoustic model, wherein the first acoustic model is used more frequently than the second acoustic model; at power-on the first acoustic model is loaded into a cache, while the second acoustic model remains stored in the speech database;

An update module 7, connected to the classification processing module 6, which updates the first acoustic model and the second acoustic model in real time according to usage-frequency statistics collected over a predetermined time;

A template matching module 8, connected to the classification processing module 6, which performs pattern matching on the voice signal under test against the first acoustic model and then the second acoustic model to obtain a recognition result.

The above are only preferred embodiments of the present invention and do not thereby limit its embodiments or scope of protection. Those skilled in the art should appreciate that all schemes obtained through equivalent substitutions and obvious changes made using the description and drawings of the present invention fall within the scope of protection of the present invention.

Claims

1. An adaptive rapid voiceprint recognition method, characterized in that it is applied to a household robot and comprises the following steps:

S1: Collect a voice signal;

S2: Preprocess the voice signal;

S3: Extract speech feature parameters from the preprocessed voice signal, the speech feature parameters comprising first-type feature parameters obtained by linear prediction and second-type feature parameters extracted by simulating the human ear's perception of sound frequency;

S4: Establish a codebook for each family member and store it in a speech database as that family member's voice template, all codebooks of the family member constituting an acoustic model;

S5: Divide the acoustic models in advance, according to frequency of use, into a first acoustic model and a second acoustic model, the first acoustic model being used more frequently than the second acoustic model, and load the first acoustic model into a cache at power-on; wherein the first acoustic model and the second acoustic model are updated in real time according to usage-frequency statistics collected over a predetermined time;

S6: Perform pattern matching on the voice signal under test against the first acoustic model and the second acoustic model to obtain a recognition result.

2. The adaptive rapid voiceprint recognition method according to claim 1, characterized in that the preprocessing in step S2 comprises, in order:

Step S21, sample and quantize the preprocessed voice signal to obtain a digital speech signal;

Step S22, pass the digital speech signal through a filter bank to boost the high-frequency components of the digital signal;

Step S23, frame and window the voice signal obtained in step S22 to obtain the windowed voice signal.

3. The adaptive rapid voiceprint recognition method according to claim 1, characterized in that the first-type feature parameters extracted in step S3 are linear prediction coefficients, extracted as follows:

Step S31a, define the short-time speech signal and the error signal;

Step S32a, compute the sum of squared errors from the short-time speech signal and the error signal;

Step S33a, differentiate the sum of squared errors and solve the resulting system of equations to obtain the first-type feature parameters.

4. The adaptive rapid voiceprint recognition method according to claim 1, characterized in that extracting the second-type feature parameters in step S3 comprises:

Step S31b, apply a Fourier transform to the preprocessed voice signal to obtain a linear spectrum;

Step S32b, pass the linear spectrum through a bank of triangular band-pass filters to obtain the corresponding Mel spectrum;

Step S33b, compute the logarithmic spectrum of the Mel spectrum;

Step S34b, apply a discrete cosine transform to the logarithmic spectrum to obtain the second-type feature parameters.

5. The adaptive rapid voiceprint recognition method according to claim 1, characterized in that step S4 specifically comprises:

Step S41, extract N feature vectors from the first-type feature parameters and the second-type feature parameters, and classify the feature vectors by a clustering method to obtain M codebooks;

Step S42, obtain the codebook vector corresponding to each class;

Step S43, form the acoustic model from the set of codebook vectors established for each family member.

6. The adaptive rapid voiceprint recognition method according to claim 1, characterized in that step S6 is specifically as follows:

Step S61, match the voice signal to be identified for similarity against the first acoustic model and then the second acoustic model, judging the match by a weighted Euclidean distance measure;

Step S62, select an appropriate distance measure as the threshold;

Step S63, take the result that falls within the threshold range as the recognition result.

7. An adaptive rapid voiceprint recognition system, characterized in that it comprises:

A voice input module, for capturing a voice signal;

A preprocessing module, connected to the voice input module, for preprocessing the voice signal;

A first feature parameter extraction module, connected to the preprocessing module, for obtaining the first feature parameters of the voice signal;

A second feature parameter extraction module, connected to the preprocessing module, for obtaining the second feature parameters of the voice signal;

A training module, connected to the first feature parameter extraction module and the second feature parameter extraction module, for establishing a voice template for each family member, all codebooks of the family member constituting an acoustic model;

A classification processing module, connected to the training module, which divides the acoustic models in advance, according to frequency of use, into a first acoustic model and a second acoustic model, the first acoustic model being used more frequently than the second acoustic model, and which loads the first acoustic model into a cache at power-on;

An update module, connected to the classification processing module, which updates the first acoustic model and the second acoustic model in real time according to usage-frequency statistics collected over a predetermined time;

A template matching module, connected to the classification processing module, which performs pattern matching on the voice signal under test against the first acoustic model and the second acoustic model to obtain a recognition result.