CN109119085A - An asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors


Info

Publication number
CN109119085A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810973061.4A
Other languages
Chinese (zh)
Inventor
高原
陶雯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zhuyun Science & Technology Co Ltd
Original Assignee
Shenzhen Zhuyun Science & Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-08-24
Filing date
2018-08-24
Publication date
2019-01-01
Application filed by Shenzhen Zhuyun Science & Technology Co Ltd
Priority to CN201810973061.4A
Publication of CN109119085A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/14 Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Abstract

The present invention relates to an asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors. The process comprises a short-vector extraction algorithm, construction of a background model, construction of supervectors, and a pattern matching algorithm. The asymmetric text consists of the training speech and the test speech: the content of the training speech is public, while the content of the test speech is not disclosed. Wavelet analysis can effectively analyze speech, which is a non-stationary signal, and supervectors can effectively improve the discrimination between different feature vectors, so the invention introduces both. Implementing the invention can prevent synthesized-speech attacks and improve recognition performance.

Description

An asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors
Technical field
The present invention relates to a speaker recognition method, and more specifically to an asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors.
Background art
Speaker recognition is a technology that identifies a speaker from speech samples, and it is widely used in identity authentication systems. A speaker recognition model is generally divided into two modules: feature extraction and speaker clustering. In the feature extraction module, speech samples are converted into speech features, which should contain only information that is related to the speaker and unrelated to the spoken content. In the speaker clustering module, a learning algorithm generalizes the information in the features and builds a speaker model. When an unknown utterance is to be identified, the speaker model is matched against it to determine the identity of the unknown speaker.
The most common speech feature is the short vector, which characterizes speaker information with a set of low-dimensional vectors. The conventional way to obtain short vectors is the Mel-frequency cepstral coefficient (MFCC) algorithm, which uses the discrete Fourier transform (DFT) to perform spectral analysis of the speech signal. A supervector is a speech feature derived from short vectors. Unlike short vectors, it characterizes the speaker information in a speech sample with a single high-dimensional vector. Because a supervector usually also takes background information into account, it can achieve higher discrimination.
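For orientation, the conventional DFT-based MFCC computation mentioned above can be sketched for a single frame as follows. This is a minimal illustration, not the patent's implementation: the function names, the 2595/700 form of the mel scale, the filter count, and the number of coefficients are all assumptions chosen for the example.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to the mel scale (common 2595/700 form)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive O(N^2) DFT power spectrum for bins 0..N/2 of one frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = -sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(frame))
        spectrum.append((re * re + im * im) / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mels = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):   # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, num_filters=12, num_coeffs=8):
    """Short vector for one frame: DFT -> mel filtering -> log -> DCT-II."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # The small constant keeps log() defined for empty or silent bands.
    log_energy = [math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10)
                  for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_energy))
            for c in range(num_coeffs)]
```

Calling `mfcc_frame(frame, 8000)` on a 64-sample frame returns one 8-dimensional vector; a whole utterance would yield one such vector per frame. The naive DFT is used only to keep the sketch dependency-free; a real system would use an FFT.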
In a text-dependent speaker recognition method, the content of the training speech and the test speech is fixed and identical. Because the content is identical, such a model cannot effectively resist synthesized-speech attacks.
Summary of the invention
The technical problem to be solved by the present invention is that existing speech verification is not secure enough and is vulnerable to synthesized-speech attacks. To address this, the invention provides an asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors.
The technical solution adopted by the present invention to solve this problem is to construct an asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors, in which short vectors are extracted with a wavelet-analysis-based algorithm in order to improve their quality and noise immunity.
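As a rough illustration of what a wavelet-based spectral analysis produces, the sketch below decomposes a signal with the Haar wavelet and reports the energy in each sub-band. The patent does not name the wavelet it uses, so Haar, the function names, and the number of levels are assumptions made only to keep the example short.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail). A trailing sample of an odd-length
    input is dropped.
    """
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail


def haar_decompose(signal, levels):
    """Multi-level decomposition: [detail_1, ..., detail_L, approx_L]."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def band_energies(bands):
    """Sum of squared coefficients in each sub-band."""
    return [sum(c * c for c in band) for band in bands]
```

Because the transform is orthonormal, the sub-band energies of an even-length signal add up to the energy of the signal itself, which makes the per-band values directly comparable between frames.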
In the asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors of the present invention, the steps of the method are: S1, experimental data set and platform; S2, testing for the optimal wavelet; S3, obtaining the accuracy; S4, analyzing noise robustness.
In the asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors of the present invention, the process of the method comprises a short-vector extraction algorithm, construction of a background model, construction of supervectors, and a pattern matching algorithm.
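The text does not say which pattern matching algorithm is used in the last stage. As one plausible reading, the sketch below scores a test supervector against enrolled supervectors with cosine similarity and rejects scores below a threshold. The scoring rule, the threshold value, and the names `cosine_similarity` and `identify` are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(test_supervector, enrolled, threshold=0.8):
    """Return (best speaker name or None, best score).

    `enrolled` maps a speaker name to that speaker's supervector.
    The speaker is rejected (None) when the best score is below threshold.
    """
    best_name, best_score = None, -1.0
    for name, supervector in enrolled.items():
        score = cosine_similarity(test_supervector, supervector)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score
```

For example, with `enrolled = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}`, the call `identify([0.9, 0.1, 0.0], enrolled)` selects `"alice"`, while a test vector equally close to both falls below the threshold and is rejected.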
In the asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors of the present invention, the asymmetric text consists of the training speech and the test speech; the content of the training speech is public, and the content of the test speech is not disclosed.
In the asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors of the present invention, the short-vector extraction algorithm is divided into preprocessing, spectral analysis, Mel filtering, and cepstrum calculation.
In the asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors of the present invention, the background model of the method is represented by a Gaussian mixture model.
Supervectors are built for the target short vectors and the test short vectors; this construction process is referred to as the maximization adjustment process.
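The text gives no formula for this construction step. One common way to build a supervector from a Gaussian mixture background model is maximum a posteriori (MAP) adaptation of the component means, followed by concatenating the adapted means. The sketch below assumes that reading; the diagonal covariances, the relevance factor of 16, and all function names are assumptions rather than details taken from the patent.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def responsibilities(x, weights, means, variances):
    """Posterior probability of each mixture component given x."""
    logs = [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    peak = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_supervector(frames, weights, means, variances, relevance=16.0):
    """Adapt the background means to `frames` and concatenate them."""
    num_comp, dim = len(means), len(means[0])
    counts = [0.0] * num_comp
    sums = [[0.0] * dim for _ in range(num_comp)]
    for x in frames:
        post = responsibilities(x, weights, means, variances)
        for c in range(num_comp):
            counts[c] += post[c]
            for j in range(dim):
                sums[c][j] += post[c] * x[j]
    supervector = []
    for c in range(num_comp):
        # alpha near 1: trust the data; alpha near 0: keep the background mean.
        alpha = counts[c] / (counts[c] + relevance)
        for j in range(dim):
            data_mean = sums[c][j] / counts[c] if counts[c] > 0.0 else means[c][j]
            supervector.append(alpha * data_mean + (1.0 - alpha) * means[c][j])
    return supervector
```

With K mixture components and D-dimensional short vectors the result has K x D entries, which is what makes it a single high-dimensional vector per utterance. Components that see little data stay close to the background model, so the background information is carried into the supervector.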
Implementing the asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors of the present invention has the following beneficial effects: wavelet analysis and supervectors are introduced into the invention. Wavelet analysis can effectively analyze speech, which is a non-stationary signal, and supervectors can effectively improve the discrimination between different feature vectors. Implementing the invention can prevent synthesized-speech attacks and improve recognition performance.
Brief description of the drawings
The present invention is further described below with reference to the accompanying drawings and embodiments. In the drawings:
Fig. 1 is a flowchart of the asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors of the present invention.
Fig. 2 is a flowchart of supervector construction in the asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors of the present invention.
Fig. 3 is a flowchart of the short-vector extraction algorithm in the asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
As shown in Fig. 1, the steps of the asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors are:
S1, experimental data set and platform: this embodiment collects speech samples from several subjects. Each speech sample is 10 s long, and all samples are recorded in a quiet environment and at the same frequency. The hardware platform for all experiments in this embodiment is a PC with an Intel core5 CPU and 8 GB of memory;
S2, testing for the optimal wavelet: after the speech samples have been collected, the speech is analyzed with the wavelet-analysis-based algorithm. The analysis finds that the larger the energy of the wavelet spectrum, the stronger the ability of the wavelet analysis to capture important information;
S3, obtaining the accuracy: this embodiment uses the speech of several speakers as training speech and test speech. Performing spectral analysis of the speech samples with wavelet analysis makes it possible to analyze both stationary and non-stationary signals more accurately.
S4, analyzing noise robustness: this embodiment adds noise to the speech samples of several speakers. The experiments show that accuracy declines for all samples as the noise intensity increases. However, when the noisy speech drops to 28 dB, the accuracy of the wavelet model, compared with the non-wavelet model, declines by less than 8%. This indicates that the model based on wavelet analysis is more robust to noise than the non-wavelet model.
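A noise experiment of the kind described in S4 needs a way to corrupt clean speech at a controlled level. The sketch below adds white Gaussian noise at a target signal-to-noise ratio. Reading the 28 dB figure as a signal-to-noise ratio, using white Gaussian noise, and the function names are all assumptions; the text does not state the noise type or how the level was measured.

```python
import math
import random


def add_noise(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise at the requested SNR in dB."""
    if rng is None:
        rng = random.Random(0)  # fixed seed keeps the experiment repeatable
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """SNR in dB implied by a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)
```

Sweeping `snr_db` downward and re-running recognition at each level would produce an accuracy-versus-noise curve for each model, which is the kind of comparison the step describes.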
Further, the process of the asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors comprises a short-vector extraction algorithm, construction of a background model, construction of supervectors, and a pattern matching algorithm.
Further, the asymmetric text consists of the training speech and the test speech; the content of the training speech is public, and the content of the test speech is not disclosed.
Further, the short-vector extraction algorithm is divided into preprocessing, spectral analysis, Mel filtering, and cepstrum calculation.
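The text does not detail the preprocessing stage. In speech front ends it typically consists of pre-emphasis, framing, and windowing, and the sketch below assumes that convention. The 0.97 pre-emphasis coefficient, the Hamming window, and the function names are assumptions, not values taken from the patent.

```python
import math


def pre_emphasize(signal, coeff=0.97):
    """Boost high frequencies: y[n] = x[n] - coeff * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [signal[i] - coeff * signal[i - 1]
                          for i in range(1, len(signal))]


def split_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames; a short tail is discarded."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def hamming(frame):
    """Apply a Hamming window to reduce spectral leakage at frame edges."""
    n = len(frame)
    if n == 1:
        return list(frame)
    return [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def preprocess(signal, frame_len, hop):
    """Pre-emphasis, framing, and windowing in sequence."""
    emphasized = pre_emphasize(signal)
    return [hamming(frame) for frame in split_frames(emphasized, frame_len, hop)]
```

Each windowed frame would then go to the spectral analysis stage, followed by Mel filtering and cepstrum calculation to yield one short vector per frame.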
Further, the background model of the asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors is represented by a Gaussian mixture model.
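For reference, a Gaussian mixture model has the standard density below. The symbols are notation introduced here for illustration and do not appear in the patent: x is a short vector, K is the number of mixture components, and w_k, \mu_k, \Sigma_k are the weight, mean, and covariance of component k.

```latex
p(x \mid \lambda) = \sum_{k=1}^{K} w_k \, \mathcal{N}(x;\, \mu_k, \Sigma_k),
\qquad \sum_{k=1}^{K} w_k = 1, \quad w_k \ge 0,
\qquad \lambda = \{ w_k, \mu_k, \Sigma_k \}_{k=1}^{K}
```

A background model of this form is usually trained on speech from many speakers, so that it describes the general distribution of short vectors rather than any single speaker.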
Although the present invention has been disclosed through the above embodiments, its scope of protection is not limited to them. Modifications, substitutions, and the like made to the above components without departing from the concept of the invention fall within the scope of the claims of the invention.

Claims (5)

1. An asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors, characterized in that the steps of the method are: S1, experimental data set and platform; S2, testing for the optimal wavelet; S3, obtaining the accuracy; S4, analyzing noise robustness.
2. The asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors according to claim 1, characterized in that the process of the method comprises a short-vector extraction algorithm, construction of a background model, construction of supervectors, and a pattern matching algorithm.
3. The asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors according to claim 1, characterized in that the asymmetric text consists of the training speech and the test speech; the content of the training speech is public, and the content of the test speech is not disclosed.
4. The asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors according to claim 2, characterized in that the short-vector extraction algorithm is divided into preprocessing, spectral analysis, Mel filtering, and cepstrum calculation.
5. The asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors according to claim 1, characterized in that the background model of the method is represented by a Gaussian mixture model.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810973061.4A CN109119085A (en) 2018-08-24 2018-08-24 An asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors

Publications (1)

Publication Number Publication Date
CN109119085A true CN109119085A (en) 2019-01-01

Family

ID=64860770

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810973061.4A Pending CN109119085A (en) 2018-08-24 2018-08-24 An asymmetric text-dependent speaker recognition method based on wavelet analysis and supervectors

Country Status (1)

Country Link
CN (1) CN109119085A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113935329A (en) * 2021-10-13 2022-01-14 昆明理工大学 Asymmetric text matching method based on adaptive feature recognition and denoising

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101894561A (en) * 2010-07-01 2010-11-24 西北工业大学 Wavelet transform and variable-step least mean square algorithm-based voice denoising method
CN104008751A (en) * 2014-06-18 2014-08-27 周婷婷 Speaker recognition method based on BP neural network
CN108281146A (en) * 2017-12-29 2018-07-13 青岛真时科技有限公司 A kind of phrase sound method for distinguishing speek person and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
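Lei Lei et al.: "Asymmetric text-dependent speaker recognition model based on wavelet analysis and supervectors" (基于小波分析和超级向量的非对称文本相关的说话人识别模型), Journal of Information Security Research (《信息安全研究》) *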

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190101