CN109119085A

CN109119085A - An asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors

Info

Publication number: CN109119085A
Application number: CN201810973061.4A
Authority: CN
Inventors: 高原; 陶雯
Original assignee: Shenzhen Zhuyun Science & Technology Co Ltd
Current assignee: Shenzhen Zhuyun Science & Technology Co Ltd
Priority date: 2018-08-24
Filing date: 2018-08-24
Publication date: 2019-01-01

Abstract

The present invention relates to an asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors. The method comprises a short vector extraction algorithm, construction of a background model, construction of supervectors, and a pattern matching algorithm. The asymmetric text covers the training speech and the test speech: the content of the training speech is public, while the content of the test speech is not disclosed. Wavelet analysis can effectively analyze non-stationary signals such as speech, and supervectors can effectively improve the discrimination between different feature vectors, so the invention introduces both. Implementing the invention can prevent synthetic speech attacks and improve recognition performance.

Description

An asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors

Technical field

The present invention relates to a speech recognition method, and more specifically to an asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors.

Background art

Speech recognition technology identifies a speaker from speech samples and is widely used in identity authentication systems. A speech recognition model is generally divided into two modules: feature extraction and speaker clustering. In the feature extraction module, speech samples are converted into speech features that carry only information related to the speaker and unrelated to the speech content. In the speaker clustering module, a learning algorithm generalizes the information in the features and builds a speaker model. When an unknown utterance is to be identified, the speaker model is matched against it to determine the identity of the unknown speaker.

One of the most common speech features is the short vector, which characterizes speaker information with a set of low-dimensional vectors. The conventional method for obtaining short vectors is the Mel-frequency cepstral coefficient (MFCC) algorithm, which uses the discrete Fourier transform (DFT) to perform spectrum analysis on the speech signal. The supervector is a speech feature built on short vectors. Unlike short vectors, it characterizes the speaker information in a speech sample with a single high-dimensional vector. Because a supervector usually takes background information into account, it can achieve higher discrimination.
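The conventional DFT-based route can be sketched for a single frame as follows. This is a minimal illustration, not the procedure of the invention: the Hamming window, the number of Mel filters, the number of coefficients, and all function names are assumptions chosen for the example, and the naive DFT is written out only to keep the sketch free of external libraries.

```python
import cmath
import math

def hz_to_mel(f):
    """Convert a frequency in Hz to the Mel scale."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Convert a Mel value back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Hamming-window one frame and return its DFT power for bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spec.append(abs(s) ** 2 / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build triangular filters whose centres are evenly spaced in Mel."""
    top = hz_to_mel(sample_rate / 2.0)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising edge
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge, peak of 1 at k == mid
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12):
    """Return one short vector: DFT -> Mel filtering -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small offset avoids log(0) when a filter covers no energy.
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
             for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(1, n_coeffs + 1)]
```

Calling `mfcc(frame, 8000)` on a list of samples yields one 12-dimensional short vector; a full utterance would produce one such vector per frame.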

In text-dependent speech recognition methods, the content of the training speech and the test speech is fixed and identical. Because the speech content is identical, such a model cannot effectively resist a "synthetic speech attack".

Summary of the invention

The technical problem to be solved by the present invention is that existing speech tests are not secure enough and are vulnerable to synthetic speech attacks. To address this, the invention provides an asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors.

The technical solution adopted by the present invention to solve this technical problem is to construct an asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors. The method extracts short vectors with an algorithm based on wavelet analysis in order to improve the quality and noise immunity of the short vectors.
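As a rough illustration of what wavelet-based spectrum analysis of a frame might look like, the sketch below applies a multi-level Haar decomposition and reports the energy in each subband. The choice of the Haar wavelet, the number of levels, and the function names are assumptions made for the example; the text does not state which wavelet or decomposition the method uses.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail); the input length must be even.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s
              for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s
              for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def haar_subband_energies(signal, levels):
    """Energy of each detail subband (finest first), then the final approximation.

    The signal length is assumed to be divisible by 2 ** levels.
    """
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies
```

A constant frame puts all of its energy in the final approximation band, while a rapidly alternating frame puts it in the finest detail band, which is the kind of time-frequency localization that motivates wavelets for non-stationary signals.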

In the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the present invention, the steps of the method are: S1, experimental data set and platform; S2, testing the optimal wavelet; S3, obtaining the accuracy rate; S4, analyzing noise immunity.

In the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the present invention, the flow of the method is: short vector extraction algorithm, construction of the background model, construction of the supervector, and pattern matching algorithm.

In the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the present invention, the asymmetric text covers the training speech and the test speech: the content of the training speech is public, while the content of the test speech is not disclosed.

In the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the present invention, the short vector extraction algorithm is divided into preprocessing, spectrum analysis, Mel filtering, and cepstrum calculation.

In the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the present invention, the background model of the method is represented by a Gaussian mixture model.

The supervector is built from the target short vectors and the test short vectors; this construction process is referred to as the maximization adjustment process.
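The text states that the background model is a Gaussian mixture model and that supervectors are built from short vectors through an adjustment process, but it gives no formulas. A minimal sketch follows, assuming the widely used approach of adapting the component means of the background model to one utterance and concatenating the adapted means. The diagonal covariances, the relevance factor of 16, and all names are assumptions for illustration and may differ from the method of the invention.

```python
import math

def gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given x."""
    logs = [math.log(w) + gaussian_logpdf(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)                       # subtract the max for stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def supervector(short_vectors, weights, means, variances, relevance=16.0):
    """Adapt the background means to the short vectors and stack them.

    The result has len(means) * len(means[0]) entries.
    """
    n_comp, dim = len(means), len(means[0])
    counts = [0.0] * n_comp
    sums = [[0.0] * dim for _ in range(n_comp)]
    for x in short_vectors:
        for c, p in enumerate(posteriors(x, weights, means, variances)):
            counts[c] += p
            for d in range(dim):
                sums[c][d] += p * x[d]
    out = []
    for c in range(n_comp):
        # alpha -> 1 when the component saw much data, 0 when it saw none.
        alpha = counts[c] / (counts[c] + relevance)
        for d in range(dim):
            data_mean = sums[c][d] / counts[c] if counts[c] > 0 else means[c][d]
            out.append(alpha * data_mean + (1.0 - alpha) * means[c][d])
    return out
```

Components that receive little data stay close to the background means, which is one way a supervector can carry the background information mentioned in the background section.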

Implementing the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the present invention has the following advantages. The invention introduces wavelet analysis and supervectors: wavelet analysis can effectively analyze non-stationary signals such as speech, and supervectors can effectively improve the discrimination between different feature vectors. Implementing the invention can prevent synthetic speech attacks and improve recognition performance.

Brief description of the drawings

The present invention is further described below with reference to the accompanying drawings and embodiments. In the drawings:

Fig. 1 is a flow chart of the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the present invention.

Fig. 2 is a flow chart of supervector construction in the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the present invention.

Fig. 3 is a flow chart of the short vector extraction algorithm in the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the present invention.

Detailed description of the embodiments

In order to make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and are not intended to limit it.

As shown in Fig. 1, the steps of the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors are:

S1, experimental data set and platform. This embodiment collects speech samples from several experimenters. Each speech sample is 10 s long, and all samples are recorded in a quiet environment at the same frequency. The hardware platform for all experiments in this embodiment is a PC with an Intel core5 CPU and 8 GB of memory;

S2, testing the optimal wavelet. After the speech samples have been acquired, the algorithm based on wavelet analysis is used to analyze the speech. The analysis finds that the greater the energy of the wavelet spectrum, the stronger the ability of the wavelet analysis to capture important information;

S3, obtaining the accuracy rate. This embodiment uses the speech of several speakers as training speech and test speech. Performing spectrum analysis on the speech samples with wavelet analysis makes it possible to analyze both stationary and non-stationary signals more accurately.

S4, analyzing noise immunity. This embodiment adds noise to the speech samples of several speakers. The experiments show that accuracy declines for all samples as the noise intensity increases. However, when the noisy speech drops to 28 dB, the accuracy of the wavelet model, compared with the non-wavelet model, declines by less than 8%. This shows that the noise immunity of the model based on wavelet analysis is better than that of the non-wavelet model.
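The noise experiment in step S4 can be reproduced in outline by mixing noise into a clean sample at a chosen level. The sketch below assumes that the 28 dB figure denotes a signal-to-noise ratio and that the noise is white Gaussian; neither is stated in the text, and the function names are hypothetical.

```python
import math
import random

def add_noise(signal, snr_db, seed=0):
    """Return the signal plus white Gaussian noise at the requested SNR in dB."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    noise_power = sum(n * n for n in noise) / len(noise)
    # Target noise power follows from SNR = 10 * log10(P_signal / P_noise).
    target_power = signal_power / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_power / noise_power)
    return [x + scale * n for x, n in zip(signal, noise)]

def measured_snr_db(clean, noisy):
    """SNR in dB of a noisy signal relative to its clean reference."""
    signal_energy = sum(x * x for x in clean)
    noise_energy = sum((y - x) ** 2 for x, y in zip(clean, noisy))
    return 10.0 * math.log10(signal_energy / noise_energy)
```

Sweeping `snr_db` downward and re-running recognition on each noisy copy would give an accuracy-versus-noise curve for a wavelet model and a non-wavelet model.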

Further, the flow of the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors is: short vector extraction algorithm, construction of the background model, construction of the supervector, and pattern matching algorithm.

Further, the asymmetric text covers the training speech and the test speech: the content of the training speech is public, while the content of the test speech is not disclosed.

Further, the short vector extraction algorithm is divided into preprocessing, spectrum analysis, Mel filtering, and cepstrum calculation.

Further, the background model of the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors is represented by a Gaussian mixture model.
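The last stage of the flow, pattern matching, is named but not specified. One simple possibility, assumed here purely for illustration, is to score a test supervector against each enrolled speaker's supervector by cosine similarity and pick the highest score. The function names and the dictionary layout are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def identify(test_sv, enrolled):
    """Return (speaker, score) for the enrolled supervector closest to test_sv.

    `enrolled` maps a speaker name to that speaker's supervector.
    """
    best = max(enrolled,
               key=lambda name: cosine_similarity(test_sv, enrolled[name]))
    return best, cosine_similarity(test_sv, enrolled[best])
```

In practice a decision threshold on the score would also be needed so that an unknown speaker can be rejected rather than matched to the nearest enrolled one.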

Although the present invention has been disclosed through the above embodiments, its scope of protection is not limited to them. Variations, substitutions, and the like made to the above components without departing from the concept of the invention fall within the scope of the claims of the invention.

Claims

1. An asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors, characterized in that the steps of the method are: S1, experimental data set and platform; S2, testing the optimal wavelet; S3, obtaining the accuracy rate; S4, analyzing noise immunity.

2. The asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to claim 1, characterized in that the flow of the method is: short vector extraction algorithm, construction of the background model, construction of the supervector, and pattern matching algorithm.

3. The asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to claim 1, characterized in that the asymmetric text covers the training speech and the test speech: the content of the training speech is public, while the content of the test speech is not disclosed.

4. The asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to claim 2, characterized in that the short vector extraction algorithm is divided into preprocessing, spectrum analysis, Mel filtering, and cepstrum calculation.

5. The asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to claim 1, characterized in that the background model of the method is represented by a Gaussian mixture model.