CN103778914A - Anti-noise voice identification method and device based on signal-to-noise ratio weighing template characteristic matching - Google Patents

Anti-noise voice identification method and device based on signal-to-noise ratio weighing template characteristic matching

Info

Publication number
CN103778914A
Authority
CN
China
Prior art keywords
snr
template
noise ratio
noise
mfcc
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410040474.9A
Other languages
Chinese (zh)
Other versions
CN103778914B (en
Inventor
宁更新
吴丽菲
宁小娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201410040474.9A priority Critical patent/CN103778914B/en
Publication of CN103778914A publication Critical patent/CN103778914A/en
Application granted granted Critical
Publication of CN103778914B publication Critical patent/CN103778914B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Complex Calculations (AREA)

Abstract

The invention discloses an anti-noise speech recognition method and device based on signal-to-noise ratio (SNR) weighted template feature matching. The method comprises the following steps: (1) the input speech signal is preprocessed and the phase coefficients are obtained; (2) the feature of the input speech, namely the phase MFCC, is computed; (3) SNR-based template feature matching is performed. The invention further discloses a device implementing the method, comprising a power module, a display module, a memory module, a DSP/ARM digital processing module, a microphone, an A/D converter, and a USB interface. The method and device have the advantages of a wide range of application, high accuracy, low cost, convenient use, and strong adaptability.

Description

Anti-noise speech recognition method and device based on signal-to-noise-ratio-weighted template feature matching
Technical field
The present invention relates to speech signal processing technology, and in particular to an anti-noise speech recognition method and device based on signal-to-noise ratio (SNR) weighted template feature matching.
Background art
Speech recognition is very widely applied and touches almost every aspect of daily life, for example voice dialing systems, reservation systems, medical services, banking services, dictation machines, computer control, industrial control, and voice communication systems. Speech recognition technology is profoundly changing daily life in fields such as industry, household appliances, communications, medical care, and home services. Real environments now place ever higher demands on the noise robustness of speech recognition, so extracting feature vectors that are both robust and strongly discriminative is of great importance to a speech recognition system.
The features currently used for speech recognition are all based on the power spectrum of the speech signal, which represents the distribution of signal energy over the frequency domain. When external noise is present, this energy distribution also includes the energy of the noise. The corresponding feature vectors are therefore very sensitive to external noise, and speech recognition systems perform poorly in noisy environments.
Methods for reducing the sensitivity of feature vectors to external noise fall into two main groups: feature-based and model-based. Feature-based methods work at the front end of the speech recognition system and try to make the generated feature vectors as independent of noise as possible. Model-based methods work at the back end: a small amount of adaptation data from the test environment is used to adjust the model parameters, gradually transforming them toward the actual environment and thereby improving the recognition rate. Feature-based solutions include spectral subtraction and RASTA processing. Model-based methods include parallel model combination (PMC), adaptation based on vector Taylor series (VTS), and signal decomposition.
At present, two kinds of speech feature parameters are mainly extracted for speech recognition: linear prediction cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC). LPCC parameters represent speech effectively and are fast to compute, but they do not take into account how the human auditory system processes speech. The Mel frequency band division is an engineering approximation of human hearing characteristics, and MFCC simulates, to a certain extent, how the human ear processes speech.
However, neither MFCC nor LPCC, nor other existing speech recognition features, performs well in low-SNR environments. To overcome this weakness, the present invention first proposes a new feature that is more robust at low SNR, obtained by changing the correlation measure: the angle between two time-delayed signal vectors is used as the correlation measure. Because this angle is a nonlinear transformation of the scalar product that defines the conventional autocorrelation coefficient, the phase can emphasize spectral peaks in the frequency domain, and peaks are comparatively robust to noise. Then, since the conventional feature suits high SNR and the new feature suits low SNR, an SNR-weighted template matching calculation method is proposed. Finally, a corresponding device is proposed.
Summary of the invention
The primary object of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing an anti-noise speech recognition method based on SNR-weighted template feature matching that has a wide range of application and high accuracy.
Another object of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing a device that implements the anti-noise speech recognition method based on SNR-weighted template feature matching. The device runs on a DSP/ARM7 chip and can be implemented with the TI TMS320C6711 or the Samsung ARM7 S3C44B0.
The primary object of the present invention is achieved through the following technical solution: an anti-noise speech recognition method based on SNR-weighted template feature matching, comprising the following steps:
Step 1: Preprocess the input speech signal and obtain the phase coefficients.
The digitized speech signal $s[n]$ is divided into frames, and a Hamming window is applied to each frame. The signal is divided into $T$ frames,
$$\{s_0[n], s_1[n], \ldots, s_t[n], \ldots, s_{T-1}[n]\}$$
where
$$s_t[n] = \{s[Kt], s[Kt+1], \ldots, s[Kt+N-1]\}$$
$K$ is the frame shift, $N$ is the frame length, and $s_t[n]$ is the frame signal sequence at time $t$.
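The framing step can be sketched in a few lines of NumPy. This is a minimal illustration, not the patent's implementation: the function name `frame_signal` and its parameters are hypothetical, and the example values (8 kHz sampling, 20 ms frames, 10 ms shift) are the ones given later in Embodiment 1.

```python
import numpy as np

def frame_signal(s, frame_len, frame_shift):
    """Split s[n] into frames s_t[n] = s[K*t : K*t + N] and apply a Hamming window."""
    s = np.asarray(s, dtype=float)
    if len(s) < frame_len:
        raise ValueError("signal is shorter than one frame")
    num_frames = 1 + (len(s) - frame_len) // frame_shift   # T
    # Row t of the index matrix holds the sample indices K*t ... K*t + N - 1
    idx = frame_shift * np.arange(num_frames)[:, None] + np.arange(frame_len)[None, :]
    return s[idx] * np.hamming(frame_len)

# Example: 1 s of audio at 8 kHz, 20 ms frames (N = 160), 10 ms shift (K = 80)
frames = frame_signal(np.random.randn(8000), frame_len=160, frame_shift=80)
```

Trailing samples that do not fill a whole frame are dropped here; padding the last frame instead would be an equally reasonable choice.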
A speech signal is short-time stationary, so each frame can be regarded as stationary. Each frame is periodically extended to give $\tilde{s}_t[n]$, from which the autocorrelation function is obtained:
$$R[k] = \sum_{n=0}^{N-1} \tilde{s}_t[n]\,\tilde{s}_t[n+k], \quad k = 0, 1, \ldots, N-1.$$
As the formula shows, $R[k]$ is the dot product of two $N$-dimensional vectors,
$$x_0 = \{\tilde{s}_t[0], \tilde{s}_t[1], \ldots, \tilde{s}_t[N-1]\},$$
$$x_k = \{\tilde{s}_t[k], \ldots, \tilde{s}_t[N-1], \tilde{s}_t[0], \ldots, \tilde{s}_t[k-1]\},$$
$$R[k] = x_0^T x_k = \|x\|^2 \cos(\theta_k), \quad \text{(formula 1)}$$
where $\|x\|^2 = \|x_0\|^2 = \|x_k\|^2$ is the frame energy and $\theta_k$ is the angle between the vectors $x_0$ and $x_k$ in $N$-dimensional space.
Applying the nonlinear arccosine transformation to the normalized autocorrelation coefficient $R_n[k] = R[k] / \|x\|^2$ gives the phase coefficient:
$$P[k] = \theta_k = \cos^{-1}\!\left(\frac{R[k]}{\|x\|^2}\right), \quad \text{(formula 2)}$$
$P[k]$ takes values between 0 and $\pi$. Normalizing it to the range 0 to 1 gives the normalized phase autocorrelation function
$$P_n[k] = \frac{P[k]}{\pi} = \frac{\cos^{-1}(R_n[k])}{\pi}, \quad \text{(formula 3)}$$
$P_n[k]$ improves robustness at low SNR, but at high SNR, and especially for clean speech, it performs worse than $R_n[k]$.
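Formulas 1 to 3 can be illustrated with the following sketch. The function name `phase_autocorrelation` is hypothetical; the periodic extension is realized with a circular shift (`np.roll`), and the clipping step is an added numerical safeguard that is not part of the original text.

```python
import numpy as np

def phase_autocorrelation(frame):
    """Return (R_n, P_n): normalized circular autocorrelation and normalized phase autocorrelation."""
    x0 = np.asarray(frame, dtype=float)
    energy = np.dot(x0, x0)                       # ||x||^2, the frame energy
    if energy == 0.0:
        raise ValueError("frame has zero energy")
    # R[k] = x0 . x_k, where x_k is x0 circularly shifted by k samples (periodic extension)
    R = np.array([np.dot(x0, np.roll(x0, -k)) for k in range(len(x0))])
    R_n = np.clip(R / energy, -1.0, 1.0)          # guard against rounding outside [-1, 1]
    P_n = np.arccos(R_n) / np.pi                  # formula 2 followed by formula 3
    return R_n, P_n

# A frame that alternates in sign is perfectly anti-correlated at lag 1
R_n, P_n = phase_autocorrelation([1.0, -1.0, 1.0, -1.0])
```

The loop costs O(N²) per frame, which is fine for a sketch; an FFT-based circular correlation would give the same $R[k]$ faster.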
Step 2: Compute the feature of the input speech, namely the phase MFCC.
A DFT is applied to $P_n[k]$ to obtain the phase power spectrum $S_p[l]$:
$$S_p[l] = \sum_{k=0}^{N-1} P_n[k] \exp\!\left(-j\frac{2\pi}{N}kl\right), \quad \text{(formula 4)}$$
Here $S_p[l]$ is called the phase power spectrum, and the Mel-frequency cepstral coefficients derived from it are called the phase MFCC: the spectrum is filtered by a Mel-frequency-scale filter bank and the logarithm is then taken. Once the information in each frequency band has been separated out, the frequency-domain features are converted to the time domain with a discrete cosine transform (DCT), giving the phase MFCC parameters.
The phase MFCC parameters consist of $L$ static cepstral coefficients together with their first- and second-order derivatives, $3L$ dimensions in total.
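The chain DFT, Mel filter bank, logarithm, DCT described above might look roughly like the sketch below. Several details are assumptions rather than statements from the patent: the triangular filter shape, the common 2595·log10(1 + f/700) Mel mapping, the use of the spectrum magnitude, the small floor inside the logarithm, and the filter count of 20 in the example. The function names are hypothetical, and the first- and second-order derivatives are not computed here.

```python
import numpy as np

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale, shape (num_filters, n_fft // 2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), num_filters + 2))
    bin_freqs = np.arange(n_fft // 2 + 1) * sample_rate / n_fft
    bank = np.zeros((num_filters, len(bin_freqs)))
    for m in range(num_filters):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def cepstrum_from_sequence(seq, bank, num_ceps):
    """DFT -> Mel filtering -> log -> DCT-II, applied to R_n[k] (MFCC) or P_n[k] (phase MFCC)."""
    spectrum = np.abs(np.fft.rfft(seq))               # magnitude of S[l] from formula 4
    log_energy = np.log(bank @ spectrum + 1e-10)      # small floor avoids log(0)
    m = len(log_energy)
    # Unnormalized DCT-II basis: cos(pi * q * (n + 0.5) / m)
    dct_basis = np.cos(np.pi * np.outer(np.arange(num_ceps), np.arange(m) + 0.5) / m)
    return dct_basis @ log_energy

# Example: 13 static coefficients from a 160-point sequence at 8 kHz
bank = mel_filterbank(num_filters=20, n_fft=160, sample_rate=8000)
coeffs = cepstrum_from_sequence(np.full(160, 0.5), bank, num_ceps=13)
```

Feeding $R_n[k]$ through the same function gives the conventional branch, so both features share one code path and differ only in their input sequence.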
Step 3: Template feature matching based on SNR.
The reference database contains $j$ reference speech templates, each comprising a $3M$-dimensional MFCC feature and a $3L$-dimensional phase MFCC feature. The Euclidean distance between the test template and the $i$-th reference template for the $3M$-dimensional MFCC feature vector is $D_{Mi}$, and the Euclidean distance between the test template and the $i$-th reference template for the $3L$-dimensional phase MFCC feature vector is $P_{Li}$, $i = 0, 1, \ldots, j-1$.
It is known that the $3L$-dimensional phase MFCC feature vector is more robust at low SNR, whereas the $3M$-dimensional MFCC feature vector is more robust at high SNR, especially for clean speech.
Accordingly, the present invention adopts an SNR-weighted method: different weights are used under different SNR conditions to obtain the weighted distance $C_i$ that combines the template distances of the two feature vectors.
$$C_i = (1-w)\,D_{Mi} + w\,P_{Li}, \quad i = 0, 1, \ldots, j-1, \quad \text{(formula 5)}$$
Template matching is the search over the $j$ reference templates for the template that attains $\min\{C_i\}$, $i = 0, 1, \ldots, j-1$.
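The weighted search of formula 5 can be sketched as follows. The sketch assumes each template is a single fixed-length feature vector, since the text does not say how utterances of different lengths are aligned; `match_template` and its argument names are hypothetical.

```python
import numpy as np

def match_template(test_mfcc, test_pmfcc, ref_mfcc, ref_pmfcc, w):
    """Return (index of the best reference template, all weighted distances C_i)."""
    d = np.linalg.norm(np.asarray(ref_mfcc) - np.asarray(test_mfcc), axis=1)     # D_Mi
    p = np.linalg.norm(np.asarray(ref_pmfcc) - np.asarray(test_pmfcc), axis=1)   # P_Li
    c = (1.0 - w) * d + w * p                                                    # formula 5
    return int(np.argmin(c)), c

# Two toy reference templates with 2-dimensional features
ref_mfcc = [[0.0, 0.0], [3.0, 4.0]]
ref_pmfcc = [[1.0, 0.0], [0.0, 0.0]]
best_clean, _ = match_template([0.0, 0.0], [0.0, 0.0], ref_mfcc, ref_pmfcc, w=0.0)
best_noisy, _ = match_template([0.0, 0.0], [0.0, 0.0], ref_mfcc, ref_pmfcc, w=1.0)
```

With `w=0.0` only the MFCC distance counts and the first template wins; with `w=1.0` only the phase MFCC distance counts and the second one wins, which is the switching behaviour the weighting is meant to provide.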
$w$ is the weight of the phase MFCC template distance. Its value is determined by the signal-to-noise ratio (SNR), which can be obtained as follows:
$$\mathrm{SNR} = \log_{10}\!\left(\frac{\|Y\|^2}{\|N\|^2}\right) \cong \log_{10}\!\left(\frac{\|Y\|^2}{\overline{\|N\|^2}}\right), \quad \text{(formula 6)}$$
$$\|Y\|^2 = \|X\|^2 + \|N\|^2 \cong \|X\|^2 + \overline{\|N\|^2}, \quad \text{(formula 7)}$$
$\|Y\|^2$ is the frame energy of the speech in the actual environment, $\|N\|^2$ is the energy of the noise signal sampled in the actual environment, and $\overline{\|N\|^2}$ is the estimate of that noise energy.
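A minimal sketch of the SNR estimate in formula 6 is given below. It follows the formula as written, so the result is a plain base-10 logarithm with no factor of 10 and is not in decibels. How the noise energy estimate is formed is an assumption: here the average power of a speech-free noise recording is scaled to the frame length. The name `estimate_snr` is hypothetical.

```python
import numpy as np

def estimate_snr(noisy_frame, noise_samples):
    """SNR = log10(||Y||^2 / estimated noise frame energy), following formula 6 as written."""
    y = np.asarray(noisy_frame, dtype=float)
    noise = np.asarray(noise_samples, dtype=float)
    frame_energy = np.dot(y, y)                    # ||Y||^2
    # Estimate of ||N||^2: average noise power scaled to the frame length
    noise_energy = np.mean(noise ** 2) * len(y)
    return np.log10(frame_energy / noise_energy)

# A frame with 100 times the noise energy gives log10(100) = 2
snr = estimate_snr(10.0 * np.ones(160), np.ones(800))
```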
The value of $w$ is determined by the SNR:
$$w = f(\mathrm{SNR}), \quad \text{(formula 8)}$$
$f(\mathrm{SNR})$ expresses the relationship between the weight coefficient $w$ and the SNR. $f(\mathrm{SNR})$ takes values in $(0, 1)$ and is negatively correlated with the SNR; the relationship may be linear or nonlinear. It can be expressed in either of the following two ways:
Mode one:
$$w = f(\mathrm{SNR}) = \exp\!\left(-\frac{\mathrm{SNR}-\alpha}{\gamma}\right) \cdot u(\mathrm{SNR}-\alpha) + u(\alpha-\mathrm{SNR}), \quad \text{(formula 9)}$$
Mode two:
$$w = f(\mathrm{SNR}) = 1 - \frac{1}{1 + \exp\!\left[-\frac{\mathrm{SNR}-\beta}{\theta}\right]}, \quad \text{(formula 10)}$$
$u(\cdot)$ is the unit step function. $\alpha$, with range $(1, 5)$, is the SNR threshold: when the SNR is less than $\alpha$ the weight coefficient $w$ is 1; when the SNR is greater than $\alpha$, $w$ is negatively correlated with the SNR and decays exponentially, converging to 0 as the SNR grows. $\beta$, with range $(1, 10)$, is the critical SNR at which the conventional MFCC and the phase MFCC carry equal weight. $\gamma$ and $\theta$, each with range $(0.1, 1)$, control the rate of change: the larger the value, the slower the change.
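The two weighting modes can be sketched as below. The code implements the behaviour described in the text: in mode one, $w = 1$ up to the threshold $\alpha$ and then decays exponentially; in mode two, a logistic curve equals 0.5 at $\beta$. Dividing by $\gamma$ and $\theta$ is an interpretation chosen so that larger values give slower change, as the text states. The function names are hypothetical and the default parameter values are the ones used in the embodiments.

```python
import numpy as np

def weight_mode_one(snr, alpha=3.0, gamma=0.5):
    """w = 1 for SNR <= alpha, then exp(-(SNR - alpha) / gamma) above the threshold."""
    snr = np.asarray(snr, dtype=float)
    # max(SNR - alpha, 0) makes the exponent zero below the threshold, so w = 1 there
    return np.exp(-np.maximum(snr - alpha, 0.0) / gamma)

def weight_mode_two(snr, beta=3.0, theta=0.5):
    """w = 1 - 1 / (1 + exp(-(SNR - beta) / theta)); equals 0.5 at SNR = beta."""
    snr = np.asarray(snr, dtype=float)
    return 1.0 - 1.0 / (1.0 + np.exp(-(snr - beta) / theta))

snr_values = np.array([2.0, 3.0, 3.5, 5.0])
w1 = weight_mode_one(snr_values)
w2 = weight_mode_two(snr_values)
```

Mode one keeps the phase MFCC fully in charge until the threshold is crossed, while mode two blends the two features smoothly on both sides of $\beta$.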
The other object of the present invention is achieved through the following technical solution: a device implementing the anti-noise speech recognition method based on SNR-weighted template feature matching, comprising a power module, a display module, a memory module, a DSP/ARM digital signal processing module, a microphone, an A/D converter, and a USB interface. The memory module, the USB interface, the display module, the power module, and one end of the A/D converter are each electrically connected to the DSP/ARM digital signal processing module, and the microphone is electrically connected to the other end of the A/D converter. The microphone is used to input the test speech; the A/D converter digitizes the test speech; the DSP/ARM chip extracts the features and performs template matching; the memory module stores the reference database; the display module displays the result; and the USB interface connects to a computer.
The A/D converter is an ADC0832 chip, and the DSP/ARM digital signal processing module is a DSP/ARM7 chip.
The DSP/ARM7 chip is the TI TMS320C6711 or the Samsung ARM7 S3C44B0.
In addition to computing MFCC parameters from the conventional autocorrelation coefficients, the present invention replaces the autocorrelation coefficients with phase coefficients to obtain phase MFCC parameters, derives the corresponding feature vectors, and proposes an SNR-weighted template matching calculation method.
Compared with the prior art, the present invention has the following advantages and effects:
1. Wide range of application. The invention can be applied very widely and touches almost every aspect of daily life.
2. High accuracy. The invention exploits the fact that the phase MFCC is more robust at low SNR while the conventional MFCC is more robust at high SNR, especially for clean speech. It improves the distance measure used with the extracted features and thereby improves recognition accuracy, particularly at low SNR.
3. Low cost. A single ordinary DSP or ARM chip can perform all of the computation.
4. Easy to use. The device can be plugged into any equipment that has a USB interface and is plug and play.
5. Strong adaptability. The device has no special requirements for its operating environment and works normally in most environments.
Brief description of the drawings
Fig. 1 is a block diagram of the modules of the device of the invention.
Fig. 2 is a flowchart of preprocessing and feature extraction in the device of the invention.
Fig. 3 is a flowchart of the template matching module of the device of the invention.
Fig. 4 is a hardware structure diagram of the device of the invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to embodiments and the accompanying drawings, but embodiments of the invention are not limited thereto.
Embodiment 1
As shown in Fig. 1, the test speech first enters the preprocessing module and then the feature extraction module. The resulting feature vectors of the test speech, MFCC and PAC-MFCC (the phase MFCC), are passed to the template matching module, which matches them against the templates in the reference database by computing weighted distances (the flow of the template matching module is shown in Fig. 3). The template with the smallest weighted distance is the match, and the result is finally output to the display module.
The preprocessing and feature extraction flows are shown in Fig. 2. Preprocessing performs pre-emphasis, digitization, framing, and windowing. Feature extraction then extracts features from each test speech frame: the autocorrelation coefficients and phase coefficients are computed, transformed by FFT, passed through the Mel filter bank, and then subjected to a logarithmic transformation and a discrete cosine transform (DCT) to obtain the conventional MFCC and the phase MFCC. In addition, the noise energy is estimated in the actual environment while no test speech is present, and the corresponding environmental SNR is obtained.
The speech recognition device is implemented in the following steps:
Step 1: The test speech is digitized at a sampling frequency of 8 kHz and then pre-emphasized. Each frame is 20 ms long, the frame shift is 10 ms, and the window is a Hamming window.
Step 2: Each speech frame is analyzed: it is first periodically extended, and the normalized autocorrelation coefficients and phase coefficients are then obtained according to (formulas 1-3).
Step 3: An FFT is applied to the coefficients to obtain the corresponding power spectra. Both spectra are filtered by a 13th-order Mel-scale filter bank and then subjected to a logarithmic transformation and a DCT, giving 13 static MFCC cepstral coefficients and 13 static phase MFCC cepstral coefficients. The first- and second-order derivatives of both are taken, giving 39-dimensional MFCC parameters and 39-dimensional phase MFCC parameters as the feature vectors.
Step 4: While no test speech is present, the noise signal in the actual environment is collected and the noise energy is obtained. The SNR in the actual environment with test speech present is then estimated using (formula 6) and (formula 7).
Step 5: Compute the Euclidean distance $D_m$ between the test template and the reference template for the 39-dimensional MFCC feature vector, and the Euclidean distance $P_n$ between the test template and the reference template for the 39-dimensional phase MFCC feature vector.
Step 6: Compute the weight for the two feature-vector template distances according to (formula 8), and finally obtain the weighted distance $C$ according to (formula 5).
The weight is calculated as follows:
$$w = f(\mathrm{SNR}) = \exp\!\left(-\frac{\mathrm{SNR}-\alpha}{\gamma}\right) \cdot u(\mathrm{SNR}-\alpha) + u(\alpha-\mathrm{SNR}),$$
with parameters $\alpha = 3$ and $\gamma = 0.5$.
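As a quick illustrative check of these embodiment parameters (a worked example, not part of the original text): the 8 kHz sampling rate fixes the frame length and shift in samples, and, assuming mode one gives $w = 1$ below the threshold and exponential decay above it as described, a few sample SNR values give the following weights.

```latex
\begin{aligned}
N &= 8000 \times 0.020 = 160 \text{ samples}, \qquad K = 8000 \times 0.010 = 80 \text{ samples} \\
\mathrm{SNR} = 2 < \alpha &: \quad w = 1 \\
\mathrm{SNR} = 3.5 &: \quad w = e^{-(3.5 - 3)/0.5} = e^{-1} \approx 0.368 \\
\mathrm{SNR} = 5 &: \quad w = e^{-(5 - 3)/0.5} = e^{-4} \approx 0.018
\end{aligned}
```

With $\gamma = 0.5$ the phase MFCC therefore loses almost all of its weight within about two SNR units above the threshold.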
As shown in Fig. 4, a device implementing the anti-noise speech recognition method based on SNR-weighted template feature matching comprises a power module, a display module, a memory module, a DSP/ARM digital signal processing module, a microphone, an A/D converter, and a USB interface. The memory module, the USB interface, the display module, the power module, and one end of the A/D converter are each electrically connected to the DSP/ARM digital signal processing module, and the microphone is electrically connected to the other end of the A/D converter. The microphone is used to input the test speech; the A/D converter digitizes the test speech; the DSP/ARM chip extracts the features and performs template matching; the memory module stores the reference database; the display module displays the result; and the USB interface connects to a computer. The A/D converter is an ADC0832 chip, and the DSP/ARM digital signal processing module is a DSP/ARM7 chip. The DSP/ARM7 chip is the TI TMS320C6711 or the Samsung ARM7 S3C44B0.
Embodiment 2
This embodiment is the same as Embodiment 1 except for the following:
The weight is calculated as follows:
$$w = f(\mathrm{SNR}) = 1 - \frac{1}{1 + \exp\!\left[-\frac{\mathrm{SNR}-\beta}{\theta}\right]},$$
with parameters $\beta = 3$ and $\theta = 0.5$.
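For illustration only (a worked example, not part of the original text), and reading the exponent as $-(\mathrm{SNR}-\beta)/\theta$ so that a larger $\theta$ gives slower change, these parameters produce the following weights at a few sample SNR values.

```latex
\begin{aligned}
\mathrm{SNR} = 2 &: \quad w = 1 - \frac{1}{1 + e^{2}} \approx 0.881 \\
\mathrm{SNR} = 3 = \beta &: \quad w = 1 - \frac{1}{1 + e^{0}} = 0.5 \\
\mathrm{SNR} = 4 &: \quad w = 1 - \frac{1}{1 + e^{-2}} \approx 0.119
\end{aligned}
```

The curve is symmetric about $\beta$: the weights one SNR unit below and one unit above the critical value sum to 1.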
The embodiments above are preferred embodiments of the present invention, but embodiments of the invention are not limited to them. Any other change, modification, substitution, combination, or simplification that does not depart from the spirit and principle of the present invention shall be regarded as an equivalent replacement and falls within the scope of protection of the present invention.

Claims (9)

1. An anti-noise speech recognition method based on SNR-weighted template feature matching, characterized by comprising the following steps:
Step 1: preprocessing the input speech signal and obtaining the phase coefficients;
Step 2: computing the feature of the input speech, namely the phase MFCC;
Step 3: performing SNR-based template feature matching.
2. The anti-noise speech recognition method based on SNR-weighted template feature matching according to claim 1, characterized in that step 1 comprises the following steps:
Step A: the digitized speech signal $s[n]$ is divided into frames, a Hamming window is applied to each frame, and the signal is divided into $T$ frames:
$$\{s_0[n], s_1[n], \ldots, s_t[n], \ldots, s_{T-1}[n]\},$$
where:
$s_t[n] = \{s[Kt], s[Kt+1], \ldots, s[Kt+N-1]\}$, $K$ is the frame shift, $N$ is the frame length, and $s_t[n]$ is the frame signal sequence at time $t$;
Step B: each frame is periodically extended, and the autocorrelation function is obtained:
$$R[k] = \sum_{n=0}^{N-1} \tilde{s}_t[n]\,\tilde{s}_t[n+k], \quad k = 0, 1, \ldots, N-1;$$
from the expression of the autocorrelation function, $R[k]$ is the dot product of two $N$-dimensional vectors,
$$x_0 = \{\tilde{s}_t[0], \tilde{s}_t[1], \ldots, \tilde{s}_t[N-1]\},$$
$$x_k = \{\tilde{s}_t[k], \ldots, \tilde{s}_t[N-1], \tilde{s}_t[0], \ldots, \tilde{s}_t[k-1]\},$$
$$R[k] = x_0^T x_k = \|x\|^2 \cos(\theta_k),$$
where $\|x\|^2 = \|x_0\|^2 = \|x_k\|^2$ is the frame energy and $\theta_k$ is the angle between the vectors $x_0$ and $x_k$ in $N$-dimensional space;
Step C: the nonlinear arccosine transformation is applied to the normalized autocorrelation coefficient to obtain the phase coefficient:
$$P[k] = \theta_k = \cos^{-1}\!\left(\frac{R[k]}{\|x\|^2}\right),$$
$P[k]$ takes values between 0 and $\pi$; normalizing it to the range 0 to 1 gives the normalized phase autocorrelation function:
$$P_n[k] = \frac{P[k]}{\pi} = \frac{\cos^{-1}(R_n[k])}{\pi},$$
where $P_n[k]$ serves to improve robustness at low SNR.
3. The anti-noise speech recognition method based on SNR-weighted template feature matching according to claim 1, characterized in that step 2 comprises the following steps:
Step I: a DFT is applied to $P_n[k]$ to obtain the phase power spectrum $S_p[l]$:
$$S_p[l] = \sum_{k=0}^{N-1} P_n[k] \exp\!\left(-j\frac{2\pi}{N}kl\right),$$
where $S_p[l]$ denotes the phase power spectrum, and the Mel-frequency cepstral coefficients derived from it are called the phase MFCC, that is: the spectrum is filtered by a Mel-frequency-scale filter bank and the logarithm is then taken;
Step II: once the information in each frequency band has been separated out, the frequency-domain features are converted to the time domain with a discrete cosine transform, giving the phase MFCC parameters; the phase MFCC parameters consist of $L$ static cepstral coefficients together with their first- and second-order derivatives, $3L$ dimensions in total.
4. The anti-noise speech recognition method based on SNR-weighted template feature matching according to claim 1, characterized in that step 3 comprises the following steps:
Step ①: the reference database contains $j$ reference speech templates, each comprising a $3M$-dimensional MFCC feature vector and a $3L$-dimensional phase MFCC feature vector; the Euclidean distance between the test template and the $i$-th reference template for the $3M$-dimensional MFCC feature vector is $D_{Mi}$, and the Euclidean distance between the test template and the $i$-th reference template for the $3L$-dimensional phase MFCC feature vector is $P_{Li}$, $i = 0, 1, \ldots, j-1$;
Step ②: different weights are used under different SNR conditions to obtain the weighted distance $C_i$ that combines the template distances of the two feature vectors:
$$C_i = (1-w)\,D_{Mi} + w\,P_{Li}, \quad i = 0, 1, \ldots, j-1,$$
where $w$ is the weight of the phase MFCC template distance; template matching is the search over the $j$ reference templates for the template that attains $\min\{C_i\}$, $i = 0, 1, \ldots, j-1$;
the signal-to-noise ratio (SNR) can be obtained from the following formulas:
$$\mathrm{SNR} = \log_{10}\!\left(\frac{\|Y\|^2}{\|N\|^2}\right) \cong \log_{10}\!\left(\frac{\|Y\|^2}{\overline{\|N\|^2}}\right),$$
$$\|Y\|^2 = \|X\|^2 + \|N\|^2 \cong \|X\|^2 + \overline{\|N\|^2},$$
where $\|Y\|^2$ is the frame energy of the speech in the actual environment, $\|N\|^2$ is the energy of the noise signal sampled in the actual environment, and $\overline{\|N\|^2}$ is the estimate of that noise energy;
the value of $w$ is determined by the SNR:
$$w = f(\mathrm{SNR}),$$
where $f(\mathrm{SNR})$ expresses the relationship between the weight coefficient $w$ and the SNR, $f(\mathrm{SNR})$ takes values in $(0, 1)$, and $f(\mathrm{SNR})$ is negatively correlated with the SNR, either linearly or nonlinearly.
5. The anti-noise speech recognition method based on SNR-weighted template feature matching according to claim 4, characterized in that the relationship between $f(\mathrm{SNR})$ and the SNR is expressed as follows:
$$w = f(\mathrm{SNR}) = \exp\!\left(-\frac{\mathrm{SNR}-\alpha}{\gamma}\right) \cdot u(\mathrm{SNR}-\alpha) + u(\alpha-\mathrm{SNR}),$$
where $u(\cdot)$ is the unit step function; $\alpha$, with range $(1, 5)$, is the SNR threshold; when the SNR is less than $\alpha$ the weight coefficient $w$ is 1; when the SNR is greater than $\alpha$, $w$ is negatively correlated with the SNR and decays exponentially, converging to 0 as the SNR grows.
6. The anti-noise speech recognition method based on SNR-weighted template feature matching according to claim 4, characterized in that the relationship between $f(\mathrm{SNR})$ and the SNR is expressed as follows:
$$w = f(\mathrm{SNR}) = 1 - \frac{1}{1 + \exp\!\left[-\frac{\mathrm{SNR}-\beta}{\theta}\right]},$$
where $\beta$, with range $(1, 10)$, is the critical SNR at which the conventional MFCC and the phase MFCC carry equal weight; $\gamma$ and $\theta$, each with range $(0.1, 1)$, both control the rate of change, and the larger the value of $\gamma$ or $\theta$, the slower the change.
7. A device implementing the anti-noise speech recognition method based on SNR-weighted template feature matching according to claim 1, characterized by comprising: a power module, a display module, a memory module, a DSP/ARM digital signal processing module, a microphone, an A/D converter, and a USB interface; the memory module, the USB interface, the display module, the power module, and one end of the A/D converter are each electrically connected to the DSP/ARM digital signal processing module, and the microphone is electrically connected to the other end of the A/D converter; the microphone is used to input the test speech, the A/D converter digitizes the test speech, the DSP/ARM chip extracts the features and performs template matching, the memory module stores the reference database, the display module displays the result, and the USB interface connects to a computer.
8. The device according to claim 7, characterized in that the A/D converter is an ADC0832 chip and the DSP/ARM digital signal processing module is a DSP/ARM7 chip.
9. The device according to claim 8, characterized in that the DSP/ARM7 chip is the TI TMS320C6711 or the Samsung ARM7 S3C44B0.
CN201410040474.9A 2014-01-27 2014-01-27 Anti-noise voice identification method and device based on signal-to-noise ratio weighing template characteristic matching Expired - Fee Related CN103778914B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410040474.9A CN103778914B (en) 2014-01-27 2014-01-27 Anti-noise voice identification method and device based on signal-to-noise ratio weighing template characteristic matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410040474.9A CN103778914B (en) 2014-01-27 2014-01-27 Anti-noise voice identification method and device based on signal-to-noise ratio weighing template characteristic matching

Publications (2)

Publication Number Publication Date
CN103778914A true CN103778914A (en) 2014-05-07
CN103778914B CN103778914B (en) 2017-02-15

Family

ID=50571083

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410040474.9A Expired - Fee Related CN103778914B (en) 2014-01-27 2014-01-27 Anti-noise voice identification method and device based on signal-to-noise ratio weighing template characteristic matching

Country Status (1)

Country Link
CN (1) CN103778914B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030154076A1 (en) * 2002-02-13 2003-08-14 Thomas Kemp Method for recognizing speech/speaker using emotional change to govern unsupervised adaptation
CN1675684A (en) * 2002-08-09 2005-09-28 Motorola, Inc. (a Delaware corporation) Distributed speech recognition with back-end voice activity detection apparatus and method
CN102737629A (en) * 2011-11-11 2012-10-17 Southeast University Embedded speech emotion recognition method and device
CN102592589A (en) * 2012-02-23 2012-07-18 South China University of Technology Speech scoring method and device implemented through dynamically normalizing digital characteristics
CN202454260U (en) * 2012-02-23 2012-09-26 South China University of Technology Speech assessment device utilizing dynamic normalized digital features
CN103440872A (en) * 2013-08-15 2013-12-11 Dalian University of Technology Transient noise removal method

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106373559A (en) * 2016-09-08 2017-02-01 Hohai University Robust feature extraction method based on log-spectrum signal-to-noise ratio weighting
CN106373559B (en) * 2016-09-08 2019-12-10 Hohai University Robust feature extraction method based on log-spectrum signal-to-noise ratio weighting
CN108735229A (en) * 2018-06-12 2018-11-02 South China University of Technology Amplitude and phase joint compensation anti-noise voice enhancement method based on signal-to-noise ratio weighting, and implementation device
CN108735229B (en) * 2018-06-12 2020-06-19 South China University of Technology Amplitude and phase joint compensation anti-noise voice enhancement method based on signal-to-noise ratio weighting
CN117690439A (en) * 2024-01-31 2024-03-12 State Grid Anhui Electric Power Co., Ltd. Hefei Power Supply Company Speech recognition semantic understanding method and system based on marketing scene
CN117690439B (en) * 2024-01-31 2024-04-16 State Grid Anhui Electric Power Co., Ltd. Hefei Power Supply Company Speech recognition semantic understanding method and system based on marketing scene

Also Published As

Publication number Publication date
CN103778914B (en) 2017-02-15

Similar Documents

Publication Publication Date Title
CN103117059B (en) Voice signal characteristics extracting method based on tensor decomposition
CN111044814B (en) Method and system for identifying transformer direct-current magnetic bias abnormality
CN102436809B (en) Network speech recognition method in English oral language machine examination system
CN105788603A (en) Audio identification method and system based on empirical mode decomposition
CN108597496A (en) Voice generation method and device based on generative adversarial network
CN103065629A (en) Speech recognition system of humanoid robot
US8566084B2 (en) Speech processing based on time series of maximum values of cross-power spectrum phase between two consecutive speech frames
KR20090076683A (en) Method, apparatus for detecting signal and computer readable record-medium on which program for executing method thereof
Wanli et al. The research of feature extraction based on MFCC for speaker recognition
CN107293306B (en) Output-based objective speech quality assessment method
US20100094622A1 (en) Feature normalization for speech and audio processing
Ganapathy et al. Feature extraction using 2-d autoregressive models for speaker recognition.
CN102789779A (en) Speech recognition system and recognition method thereof
CN102723081A (en) Voice signal processing method, voice and voiceprint recognition method and device
CN112489625A (en) Voice emotion recognition method, system, mobile terminal and storage medium
CN103778914A (en) Anti-noise voice identification method and device based on signal-to-noise ratio weighing template characteristic matching
CN112863517A (en) Speech recognition method based on perceptual spectrum convergence rate
CN107919136B (en) Digital voice sampling frequency estimation method based on Gaussian mixture model
Pardede et al. Generalized-log spectral mean normalization for speech recognition
Singh et al. A comparative study of recognition of speech using improved MFCC algorithms and Rasta filters
Yue et al. Speaker age recognition based on isolated words by using SVM
CN102256201A (en) Automatic environmental identification method used for hearing aid
CN106920558A (en) Keyword recognition method and device
Liao Combining Evidence from Auditory, Instantaneous Frequency and Random Forest for Anti-Noise Speech Recognition
Cao et al. Voice activity detection algorithm based on entropy in noisy environment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170215

Termination date: 20220127
