CN109119085A - Asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors - Google Patents
- Publication number
- CN109119085A (application CN201810973061.4A)
- Authority
- CN
- China
- Prior art keywords
- supervector
- wavelet analysis
- asymmetric
- asymmetric text
- speech recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G - PHYSICS
- G10 - MUSICAL INSTRUMENTS; ACOUSTICS
- G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00 - Speaker identification or verification
- G10L17/06 - Decision making techniques; Pattern matching strategies
- G10L17/14 - Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
- G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Abstract
The invention relates to an asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors. The process comprises a short-vector extraction algorithm, construction of a background model, construction of a supervector, and a pattern-matching algorithm. The asymmetric text covers the training speech and the test speech: the content of the training speech is public, while the content of the test speech is not disclosed. Wavelet analysis can effectively analyze speech, which is a non-stationary signal, and the supervector can effectively improve the discrimination between different feature vectors, so both are introduced into the invention. Implementing the invention can prevent synthetic-speech attacks and improve recognition performance.
Description
Technical field
The invention relates to a speech recognition method, and more specifically to an asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors.
Background art
Speech recognition technology identifies a speaker from speech samples and is widely used in identity authentication systems. A speech recognition model is generally divided into two modules: feature extraction and speaker clustering. In the feature extraction module, speech samples are converted into speech features that contain only information related to the speaker and unrelated to the speech content. In the speaker clustering module, a learning algorithm summarizes the information in the features and builds a speaker model. When unknown speech is to be identified, the speaker model is matched against it to determine the identity of the unknown speaker.
The most common speech feature is the short vector, which characterizes speaker information with a set of low-dimensional vectors. The conventional way to obtain short vectors is the Mel-frequency cepstral coefficient (MFCC) algorithm, which uses the discrete Fourier transform (DFT) for spectral analysis of the speech signal. The supervector is a speech feature built on short vectors. Unlike short vectors, it characterizes the speaker information in a speech sample with a single high-dimensional vector. Because the supervector usually takes background information into account, it can achieve higher discrimination.
In text-dependent speech recognition methods, the content of the training speech and the test speech is fixed and identical. Because the content is identical, such a model cannot effectively resist synthetic-speech attacks.
Summary of the invention
The technical problem to be solved by the invention is that existing speech-based testing is not very secure and is vulnerable to synthetic-speech attacks. To address this, the invention provides an asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors.
The technical solution adopted by the invention to solve this problem is an asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors. The method extracts short vectors with an algorithm based on wavelet analysis in order to improve the quality and noise immunity of the short vectors.
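The wavelet-based analysis mentioned above can be illustrated with a small example. The patent does not name the wavelet or the decomposition depth, so the Haar wavelet, the three-level decomposition, and the function names `haar_dwt` and `subband_log_energies` below are assumptions made only for illustration; this is a minimal sketch, not the patented algorithm.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        signal = signal + [signal[-1]]  # pad odd-length input by repeating the last sample
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def subband_log_energies(frame, levels=3):
    """Log energy of each detail band plus the final approximation band."""
    energies = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return [math.log(e + 1e-12) for e in energies]

# Toy frame: 64 samples of a sine wave (made-up test signal).
frame = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
features = subband_log_energies(frame, levels=3)
```

Because the Haar transform used here is orthonormal, the sub-band energies add up to the energy of the frame, which makes the per-band values easy to compare across candidate wavelets.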
In the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to the invention, the steps of the method are: S1, experimental data set and platform; S2, testing for the optimal wavelet; S3, obtaining the accuracy; S4, analyzing noise robustness.
In the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to the invention, the process of the method comprises a short-vector extraction algorithm, construction of a background model, construction of a supervector, and a pattern-matching algorithm.
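The patent does not say which pattern-matching algorithm is used in the last stage. As a minimal sketch, assuming cosine similarity between supervectors and a hypothetical acceptance threshold of 0.5 (both are illustrative choices, as are the names `cosine_similarity` and `identify`), the matching step could look like this:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def identify(test_supervector, enrolled, threshold=0.5):
    """Return (best_speaker, score); the speaker is None if the score is below the threshold."""
    best_name, best_score = None, -1.0
    for name, supervector in enrolled.items():
        score = cosine_similarity(test_supervector, supervector)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score

# Made-up enrolled supervectors for two hypothetical speakers.
enrolled = {"speaker_a": [1.0, 0.0, 2.0], "speaker_b": [0.0, 3.0, 0.5]}
speaker, score = identify([0.9, 0.1, 2.1], enrolled)
```

Cosine scoring ignores vector length, so it compares the direction of two supervectors rather than their overall magnitude.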
In the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to the invention, the asymmetric text covers the training speech and the test speech: the content of the training speech is public, while the content of the test speech is not disclosed.
In the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to the invention, the short-vector extraction algorithm is divided into preprocessing, spectral analysis, Mel filtering, and cepstrum computation.
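The four stages can be sketched for a single frame as follows. This is a hedged illustration of the conventional DFT-based MFCC pipeline described in the background, not of the patented wavelet variant, whose spectral-analysis step is not specified in detail. The pre-emphasis factor 0.97, the Hamming window, the 12 filters, the 8 coefficients, the 8000 Hz sample rate, and all function names are assumptions.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def preprocess(frame, alpha=0.97):
    """Stage 1: pre-emphasis followed by a Hamming window."""
    n = len(frame)
    emphasized = [frame[0]] + [frame[i] - alpha * frame[i - 1] for i in range(1, n)]
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(emphasized)]

def power_spectrum(frame):
    """Stage 2: spectral analysis with a plain DFT (bins 0..N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def mel_filterbank(power, sample_rate, num_filters=12):
    """Stage 3: triangular filters spaced evenly on the Mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bin_hz = (sample_rate / 2.0) / (len(power) - 1)
    energies = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        total = 0.0
        for k, p in enumerate(power):
            f = k * bin_hz
            if lo < f <= mid:
                total += p * (f - lo) / (mid - lo)
            elif mid < f < hi:
                total += p * (hi - f) / (hi - mid)
        energies.append(total)
    return energies

def cepstrum(energies, num_coeffs=8):
    """Stage 4: log compression followed by a DCT-II."""
    logs = [math.log(e + 1e-10) for e in energies]
    m = len(logs)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / m) for i, v in enumerate(logs))
            for c in range(num_coeffs)]

# Toy frame: 256 samples of a 440 Hz tone (made-up test signal).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(256)]
short_vector = cepstrum(mel_filterbank(power_spectrum(preprocess(frame)), sample_rate))
```

In a wavelet-based variant, only `power_spectrum` would need to be replaced by a wavelet decomposition; the other three stages could stay unchanged in this sketch.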
In the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to the invention, the background model of the method is represented by a Gaussian mixture model.
For the target short vectors and the test short vectors, the process of constructing the supervector is referred to as the maximization adaptation process.
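The patent gives no formulas for this construction. A common way to derive a supervector from a Gaussian mixture background model is maximum a posteriori (MAP) adaptation of the component means followed by concatenation; whether that is what the patent means here is an assumption. The sketch below uses that technique with a hypothetical relevance factor of 16, diagonal covariances, and a made-up two-component background model.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def posteriors(x, weights, means, variances):
    """Responsibility of each mixture component for one feature vector."""
    logs = [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    peak = max(logs)
    exps = [math.exp(l - peak) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def map_supervector(vectors, weights, means, variances, relevance=16.0):
    """Adapt the background means toward the data and concatenate them."""
    k, dim = len(means), len(means[0])
    counts = [0.0] * k
    sums = [[0.0] * dim for _ in range(k)]
    for x in vectors:
        for c, g in enumerate(posteriors(x, weights, means, variances)):
            counts[c] += g
            for d in range(dim):
                sums[c][d] += g * x[d]
    supervector = []
    for c in range(k):
        alpha = counts[c] / (counts[c] + relevance)
        for d in range(dim):
            data_mean = sums[c][d] / counts[c] if counts[c] > 0 else means[c][d]
            supervector.append(alpha * data_mean + (1 - alpha) * means[c][d])
    return supervector

# Toy background model: two components in two dimensions (made-up values).
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0, 0.0], [4.0, 4.0]]
ubm_vars = [[1.0, 1.0], [1.0, 1.0]]
vectors = [[0.5, 0.2], [0.4, -0.1], [4.5, 3.8], [3.9, 4.4]]
sv = map_supervector(vectors, ubm_weights, ubm_means, ubm_vars)
```

With little data, the adapted means stay close to the background means; as more frames are assigned to a component, its adapted mean moves toward the data, which is how the supervector ends up carrying background information.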
Implementing the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the invention has the following advantages. Wavelet analysis and supervectors are introduced into the invention: wavelet analysis can effectively analyze speech, which is a non-stationary signal, and the supervector can effectively improve the discrimination between different feature vectors. Implementing the invention can prevent synthetic-speech attacks and improve recognition performance.
Brief description of the drawings
The invention is further described below with reference to the accompanying drawings and embodiments. In the drawings:
Fig. 1 is a flow chart of the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the invention.
Fig. 2 is a flow chart of supervector construction in the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the invention.
Fig. 3 is a flow chart of the short-vector extraction algorithm in the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only illustrate the invention and do not limit it.
As shown in Fig. 1, the steps of the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors are:
S1 Experimental data set and platform: this embodiment collects speech samples from several subjects. Each speech sample is 10 s long, and all samples are recorded in a quiet environment at the same frequency. The hardware platform for all experiments in this embodiment is a PC with an Intel core5 CPU and 8 GB of memory.
S2 Testing for the optimal wavelet: after the speech samples have been collected, the algorithm based on wavelet analysis is used to analyze the speech. The larger the energy of the wavelet spectrum, the stronger the ability of the wavelet analysis to capture important information.
S3 Obtaining the accuracy: this embodiment uses the speech of several speakers as training speech and test speech. Performing the spectral analysis of the speech samples with wavelet analysis allows both stationary and non-stationary signals to be analyzed more accurately.
S4 Analyzing noise robustness: this embodiment adds noise to the speech samples of several speakers. The experiment shows that the accuracy for all samples declines as the noise intensity increases. However, when the noisy speech drops to 28 dB, the accuracy of the wavelet model declines by less than 8% compared with the non-wavelet model. This indicates that the model based on wavelet analysis is more robust to noise than the non-wavelet model.
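The noise experiment in S4 can be reproduced in outline by mixing noise into clean speech at a chosen level. The patent does not state the noise type or how the 28 dB figure is measured, so the sketch below assumes white Gaussian noise and treats 28 dB as a signal-to-noise ratio; the function names `add_noise` and `measured_snr_db` and the test tone are hypothetical.

```python
import math
import random

def add_noise(signal, snr_db, seed=0):
    """Add white Gaussian noise scaled so that the mix has the requested SNR."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / noise_power)
    return [x + scale * n for x, n in zip(signal, noise)]

def measured_snr_db(clean, noisy):
    """SNR in dB computed from the clean signal and the noisy mix."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)

# One second of a 200 Hz tone at 8000 Hz stands in for a speech sample.
clean = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(8000)]
noisy = add_noise(clean, snr_db=28.0)
```

Sweeping `snr_db` over a range of values and re-running recognition on each noisy set would give an accuracy-versus-noise curve for the wavelet and non-wavelet models.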
Further, the process of the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors comprises a short-vector extraction algorithm, construction of a background model, construction of a supervector, and a pattern-matching algorithm.
Further, the asymmetric text covers the training speech and the test speech: the content of the training speech is public, while the content of the test speech is not disclosed.
Further, the short-vector extraction algorithm is divided into preprocessing, spectral analysis, Mel filtering, and cepstrum computation.
Further, the background model of the asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors is represented by a Gaussian mixture model.
Although the invention has been disclosed through the above embodiments, its scope of protection is not limited to them. Modifications, substitutions, and the like made to the above components without departing from the concept of the invention fall within the scope of the claims of the invention.
Claims (5)
1. An asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors, characterized in that the steps of the method are: S1, experimental data set and platform; S2, testing for the optimal wavelet; S3, obtaining the accuracy; S4, analyzing noise robustness.
2. The asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to claim 1, characterized in that the process of the method comprises a short-vector extraction algorithm, construction of a background model, construction of a supervector, and a pattern-matching algorithm.
3. The asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to claim 1, characterized in that the asymmetric text covers the training speech and the test speech: the content of the training speech is public, while the content of the test speech is not disclosed.
4. The asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to claim 2, characterized in that the short-vector extraction algorithm is divided into preprocessing, spectral analysis, Mel filtering, and cepstrum computation.
5. The asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors according to claim 1, characterized in that the background model of the method is represented by a Gaussian mixture model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810973061.4A CN109119085A (en) | 2018-08-24 | 2018-08-24 | Asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109119085A (en) | 2019-01-01 |
Family
ID=64860770
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810973061.4A Pending CN109119085A (en) | Asymmetric text-dependent speech recognition method based on wavelet analysis and supervectors | 2018-08-24 | 2018-08-24 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109119085A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113935329A (en) * | 2021-10-13 | 2022-01-14 | 昆明理工大学 | Asymmetric text matching method based on adaptive feature recognition and denoising |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101894561A (en) * | 2010-07-01 | 2010-11-24 | 西北工业大学 | Wavelet transform and variable-step least mean square algorithm-based voice denoising method |
CN104008751A (en) * | 2014-06-18 | 2014-08-27 | 周婷婷 | Speaker recognition method based on BP neural network |
CN108281146A (en) * | 2017-12-29 | 2018-07-13 | 青岛真时科技有限公司 | A kind of phrase sound method for distinguishing speek person and device |
Non-Patent Citations (1)
Title |
---|
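Lei Lei (雷磊) et al.: "Asymmetric text-dependent speaker recognition model based on wavelet analysis and supervectors" (基于小波分析和超级向量的非对称文本相关的说话人识别模型), 《信息安全研究》 * |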
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102968990B (en) | Speaker identifying method and system | |
US7877254B2 (en) | Method and apparatus for enrollment and verification of speaker authentication | |
CN108231067A (en) | Sound scenery recognition methods based on convolutional neural networks and random forest classification | |
CN104900235B (en) | Method for recognizing sound-groove based on pitch period composite character parameter | |
CN108986824B (en) | Playback voice detection method | |
CN102543073B (en) | Shanghai dialect phonetic recognition information processing method | |
CN105895078A (en) | Speech recognition method used for dynamically selecting speech model and device | |
CN104021789A (en) | Self-adaption endpoint detection method using short-time time-frequency value | |
Das et al. | Bangladeshi dialect recognition using Mel frequency cepstral coefficient, delta, delta-delta and Gaussian mixture model | |
CN108198561A (en) | A kind of pirate recordings speech detection method based on convolutional neural networks | |
Todkar et al. | Speaker recognition techniques: A review | |
Ting Yuan et al. | Frog sound identification system for frog species recognition | |
WO2018095167A1 (en) | Voiceprint identification method and voiceprint identification system | |
CN109961794A (en) | A kind of layering method for distinguishing speek person of model-based clustering | |
CN109920435A (en) | A kind of method for recognizing sound-groove and voice print identification device | |
CN112542174A (en) | VAD-based multi-dimensional characteristic parameter voiceprint identification method | |
CN111489763B (en) | GMM model-based speaker recognition self-adaption method in complex environment | |
CN111081223A (en) | Voice recognition method, device, equipment and storage medium | |
CN115510909A (en) | Unsupervised algorithm for DBSCAN to perform abnormal sound features | |
CN110415707B (en) | Speaker recognition method based on voice feature fusion and GMM | |
CN114023353A (en) | Transformer fault classification method and system based on cluster analysis and similarity calculation | |
Aroon et al. | Speaker recognition system using Gaussian Mixture model | |
CN109119085A (en) | A kind of relevant audio recognition method of asymmetric text based on wavelet analysis and super vector | |
Mini et al. | Feature vector selection of fusion of MFCC and SMRT coefficients for SVM classifier based speech recognition system | |
Mardhotillah et al. | Speaker recognition for digital forensic audio analysis using support vector machine |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190101 |