CN102737645A - Algorithm for estimating pitch period of voice signal - Google Patents

Algorithm for estimating pitch period of voice signal

Info

Publication number
CN102737645A
Authority
CN
China
Prior art keywords
voice signal
pitch period
evaluation method
average magnitude
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012101969831A
Other languages
Chinese (zh)
Inventor
管晏
付斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Tianyu Information Industry Co Ltd
Original Assignee
Wuhan Tianyu Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Tianyu Information Industry Co Ltd

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses an algorithm for estimating the pitch period of a voice signal and relates to the field of voice signal processing. The algorithm comprises the following steps: 1, denoising a noisy voice signal with an adaptive filter; 2, computing the autocorrelation function and the circular average magnitude difference function of the denoised voice signal; and 3, obtaining a weighted squared feature through a formula in which α, β and γ are constants each greater than 1, R(k) is the autocorrelation function, and D(k) is the average magnitude difference function. With this algorithm, the pitch period can be detected effectively in low signal-to-noise-ratio environments, extraction errors are reduced, frequency-doubling and frequency-halving (octave) errors are reduced, the pitch estimation accuracy is improved when the amplitude or frequency of the voice signal changes markedly, and robustness is high.

Description

A pitch period estimation algorithm for voice signals
Technical field
The present invention relates to the field of voice signal processing, and specifically to a pitch period estimation algorithm for voice signals.
Background art
Speech signal analysis is the prerequisite and foundation of speech signal processing. Only once the parameters that characterize the essential features of a speech signal have been analyzed can they be used efficiently for processing such as speech synthesis, speech recognition, and speech compression coding. Among these parameters, the pitch period is one of the most important. The pitch period is the period of vocal cord vibration during voiced speech. Estimating it is called pitch detection, whose goal is to extract a pitch-period contour that matches the vocal cord vibration frequency exactly, or as closely as possible. Its role is critical.
Accurate pitch detection is difficult in practice for several reasons. A speech signal can be regarded as a dynamic, non-stationary random process. The speech waveform and the vocal cord vibration cover a wide and complex frequency range. The vocal tract is highly variable and its characteristics differ from person to person. The pitch range is very wide, and even the same person produces different pitch periods in different emotional states. The pitch period is also affected by the tone of the word being pronounced. In particular, the beginning and end of an utterance lack the periodicity of vocal cord vibration, and for some transition frames between voiced and unvoiced sounds it is hard to decide whether the frame is periodic or aperiodic. Even where the speech signal is quasi-periodic, its formant structure and noise sometimes affect the peaks and the zero-crossing rate, making it hard to locate the start and end of a pitch period accurately. Finally, the pitch varies over a large range, from 50 Hz for a low-pitched male voice to 500 Hz for a high-pitched female or child voice, a span of roughly three octaves. All of these factors make pitch period detection difficult.
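As a rough illustration of what the 50 Hz to 500 Hz range means for a detector, the sketch below converts those frequency bounds into a range of candidate lags in samples. The 8000 Hz sampling rate and the function name `pitch_lag_range` are assumptions made for the example; neither appears in the text.

```python
def pitch_lag_range(sample_rate_hz, f_min_hz=50.0, f_max_hz=500.0):
    """Return (shortest_lag, longest_lag) in samples for the given pitch range."""
    shortest = int(sample_rate_hz // f_max_hz)  # highest pitch -> shortest period
    longest = int(sample_rate_hz // f_min_hz)   # lowest pitch -> longest period
    return shortest, longest

# Assumed sampling rate of 8000 Hz (not stated in the text).
print(pitch_lag_range(8000))
```

Under these assumptions the analysis frame has to be at least as long as the longest lag, and usually a few periods longer, for the lowest pitch to be detectable.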
Among current pitch detection methods, the most classical are the ACF (Auto Correlation Function) method and the AMDF (Average Magnitude Difference Function) method. The ACF method computes the autocorrelation function of the speech signal and estimates the pitch from the large peaks that the ACF curve shows at integer multiples of the pitch period. As the signal-to-noise ratio falls, however, it commonly produces frequency-doubling or frequency-halving errors. The AMDF method computes the average magnitude difference function of the speech signal and estimates the pitch from the valleys that the AMDF curve shows at integer multiples of the pitch period. When the amplitude or frequency of the speech signal changes markedly, the accuracy of this method drops noticeably.
Summary of the invention
In view of the defects in the prior art, the object of the invention is to provide a pitch period estimation algorithm for voice signals that detects the pitch period effectively in low signal-to-noise-ratio environments, reduces extraction errors, reduces frequency-doubling and frequency-halving errors, improves pitch estimation accuracy when the amplitude or frequency of the voice signal changes markedly, and is more robust.
To achieve this object, the technical solution adopted by the invention is a pitch period estimation method for voice signals comprising the following steps: S1. denoise the noisy voice signal with an adaptive filter; S2. compute the autocorrelation function and the circular average magnitude difference function of the denoised voice signal; S3. obtain a weighted squared feature through a formula,
where α, β and γ are constants greater than 1, R(k) is the autocorrelation function, and D(k) is the average magnitude difference function.
On the basis of the above technical solution, the adaptive filter is a least mean square (LMS) adaptive filter.
On the basis of the above technical solution, in S2 the circular average magnitude difference function is
$$D(k) = \sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+k, N)) - S_\omega(n) \right|, \quad k = 0, 1, \ldots, N-1$$
where N is the length of the speech analysis frame, S_ω(n) is the windowed, denoised speech, and mod(n+k, N) denotes n+k modulo N, that is, the remainder of n+k divided by N.
On the basis of the above technical solution, when the circular average magnitude difference function is computed, every sample in the current windowed speech frame is used exactly once, and the number of difference terms in the sum is the same for every k.
On the basis of the above technical solution, the autocorrelation function of S_ω(n) is
$$R(k) = \sum_{n=0}^{N-k-1} S_\omega(n)\, S_\omega(n+k)$$
where N is the length of the speech analysis frame and k is the lag.
On the basis of the above technical solution, the autocorrelation function R(k) exhibits peaks at integer multiples of the pitch period, and the pitch is estimated from the first peak other than R(0).
On the basis of the above technical solution, the autocorrelation function exhibits a peak at the pitch period, and the average magnitude difference function exhibits a valley at the pitch period.
On the basis of the above technical solution, in S3,
$$K(k) = \frac{[\alpha R(k)]^2}{[D(k) + \beta\gamma]^2} = \frac{\left[\alpha \sum_{n=0}^{N-k-1} S_\omega(n)\, S_\omega(n+k)\right]^2}{\left[\sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+k, N)) - S_\omega(n) \right| + \beta\gamma\right]^2}$$
K(k) has the same period as R(k), the waveforms of R(k) and D(k) become sharper after weighting and squaring, and dividing one by the other yields the weighted squared feature.
The beneficial effects of the invention are as follows: the pitch period estimation algorithm effectively suppresses the influence of formants, detects the pitch period effectively in low signal-to-noise-ratio environments, locates the pitch period more accurately, reduces extraction errors, and improves pitch estimation accuracy, while keeping algorithmic complexity low.
Brief description of the drawings
Fig. 1 is a flowchart of the pitch period estimation algorithm for voice signals of the invention;
Fig. 2 is a schematic diagram of the LMS adaptive filter in an embodiment of the invention.
Embodiment
The invention is described in further detail below with reference to the accompanying drawings and an embodiment.
As shown in Fig. 1, the pitch period estimation algorithm for voice signals of the invention comprises the following steps:
S1. Denoise the noisy voice signal with an adaptive filter.
In this embodiment, the noisy voice signal is enhanced with an LMS (Least Mean Square) adaptive filter in order to extract an original speech signal that is as clean as possible. An LMS adaptive filter is a type of adaptive system with feedback: it automatically adjusts the filter parameters at the current instant using the parameters obtained at the previous instant, so that the mean square error between the filter output signal and the desired signal is minimized, thereby achieving optimal filtering. Other adaptive filters may of course be used in other embodiments.
Fig. 2 is a schematic diagram of the LMS adaptive filter used in this embodiment. X(n) denotes the input signal vector at time n, D(n) the desired signal, W(n) the weight vector of the adaptive filter, Y(n) the output signal, and E(n) the error signal. The input signal vector at time n is X(n) = [x(n), x(n-1), ..., x(n-M+1)]^T, where M is the order of the adaptive filter.
The output signal is Y(n) = X(n)^T W(n), and the error signal is E(n) = D(n) - Y(n). The weight vector of the adaptive filter is updated iteratively as:
W(n+1) = W(n) + μE(n)X(n)    (Formula 1)
In Formula 1, μ is the convergence factor of the adaptive filter: the weight vector at the next instant is obtained by adding to the current weight vector the input vector scaled by the error signal and by μ. The convergence factor μ is the step size, and the filtering result is very sensitive to its choice. μ governs the convergence speed of the algorithm: when μ is small, the algorithm converges slowly but the steady-state misadjustment is small; when μ is large, the algorithm converges quickly but the steady-state misadjustment is large. μ therefore has a decisive influence on the performance of the algorithm. The filter order M also directly affects the performance of the adaptive filter. When the mean square error E[E²(n)] reaches its minimum, the filter has adjusted itself to the optimal weight vector W(n) for the external environment, so that Y(n) best approximates D(n).
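The update in Formula 1 can be sketched in a few lines of plain Python. This is a minimal illustration, not the embodiment's implementation: the text does not say how the desired signal D(n) is obtained for noise reduction, so the toy check below uses a simple system-identification setup instead, and the function name `lms_filter`, the step size 0.05, and the 2-tap target system are all assumptions.

```python
import random

def lms_filter(x, d, order, mu):
    """Run an LMS adaptive filter over input x with desired signal d.

    Returns (y, e, w): filter outputs, error signal, final weight vector.
    """
    w = [0.0] * order                      # W(0): start from all-zero weights
    y, e = [], []
    for n in range(len(x)):
        # X(n) = [x(n), x(n-1), ..., x(n-M+1)], zero-padded before the start
        xn = [x[n - i] if n - i >= 0 else 0.0 for i in range(order)]
        yn = sum(wi * xi for wi, xi in zip(w, xn))        # Y(n) = X(n)^T W(n)
        en = d[n] - yn                                    # E(n) = D(n) - Y(n)
        w = [wi + mu * en * xi for wi, xi in zip(w, xn)]  # Formula 1
        y.append(yn)
        e.append(en)
    return y, e, w

# Toy check (assumed setup): learn a known 2-tap system
# d(n) = 0.5 x(n) - 0.3 x(n-1) from noiseless data.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = [0.5 * x[n] - 0.3 * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
_, e, w = lms_filter(x, d, order=2, mu=0.05)
print([round(wi, 3) for wi in w])
```

Rerunning the toy check with a smaller `mu` shows the slower convergence described above; with noisy data, a larger `mu` would leave a larger residual error after convergence.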
S2. Compute the ACF and the CAMDF (Circular Average Magnitude Difference Function) of the denoised voice signal.
The CAMDF is defined as:
$$D(k) = \sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+k, N)) - S_\omega(n) \right|, \quad k = 0, 1, \ldots, N-1 \qquad \text{(Formula 2)}$$
where S_ω(n) is the windowed, denoised speech, N is the length of the speech analysis frame, and mod(n+k, N) denotes n+k modulo N, that is, the remainder of n+k divided by N. D(0) = 0, and within its domain D(k) is symmetric about k = N/2, i.e. D(k) = D(N-k). In addition, for a strictly periodic signal with minimum period T, the CAMDF has the following properties:
D(aT) ≤ D(aT+b), where 0 ≤ aT+b ≤ N/2, 0 < b < T, a = 0, 1, 2, ...
k = aT is a local minimum of D(k), where 0 ≤ aT ≤ N/2, a = 0, 1, 2, ...
D(aT) ≤ D(aT+T), where 0 ≤ aT < aT+T ≤ N/2, a = 0, 1, 2, ...
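The symmetry D(k) = D(N-k) follows from re-indexing the circular sum. A short derivation, assuming 1 ≤ k ≤ N-1 and using the substitution m = mod(n+N-k, N), which is introduced here only for the argument:

```latex
\begin{aligned}
D(N-k) &= \sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+N-k,\,N)) - S_\omega(n) \right| \\
       &= \sum_{m=0}^{N-1} \left| S_\omega(m) - S_\omega(\operatorname{mod}(m+k,\,N)) \right|
          \qquad \bigl(m = \operatorname{mod}(n+N-k,\,N),\; n = \operatorname{mod}(m+k,\,N)\bigr) \\
       &= D(k).
\end{aligned}
```

The re-indexing is valid because the map from n to mod(n+N-k, N) is a permutation of {0, ..., N-1}, so the sum runs over the same terms in a different order.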
When the magnitude difference function D(k) is computed, every sample in the current windowed speech frame is used exactly once, and the number of difference terms in the sum is the same for every k. The symmetry of the function values and the property that the valleys of the CAMDF increase successively also help overcome pitch-period doubling errors and make pitch detection considerably easier. For example, the overall fluctuation trend is level, which makes valley detection easier; the estimated pitch period position can be located in a single pass, which simplifies the detection procedure; and because the same samples are used for every D(k), the magnitude difference function better reflects the differences between different values of k.
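A direct transcription of Formula 2 makes these properties easy to check numerically. The sketch below is an illustration only: the function name `camdf` and the toy frame are assumptions, and the frame is an unwindowed integer waveform whose length is an exact multiple of its period, so the circular wrap-around is seamless. With real windowed speech the valleys would be small but not exactly zero.

```python
def camdf(s):
    """Circular average magnitude difference function D(k), k = 0..N-1 (Formula 2)."""
    N = len(s)
    return [sum(abs(s[(n + k) % N] - s[n]) for n in range(N)) for k in range(N)]

# Toy frame (assumed): an integer waveform with period T = 8, four periods long (N = 32).
period = [0, 1, 2, 3, 2, 1, 0, -1]
frame = period * 4
D = camdf(frame)

print(D[0], D[8], D[16])   # values at multiples of the period
print(all(D[k] == D[len(frame) - k] for k in range(1, len(frame))))  # symmetry check
```

Every D(k) sums exactly N difference terms, which is the "same number of terms for every k" property described above.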
The ACF describes the degree of correlation between the values of a random signal at any two different instants, and the autocorrelation function of a periodic signal has the same period as the signal. The ACF R(k) of the windowed, denoised speech S_ω(n) is:
$$R(k) = \sum_{n=0}^{N-k-1} S_\omega(n)\, S_\omega(n+k) \qquad \text{(Formula 3)}$$
where N is the length of the speech analysis frame and k is the lag. The autocorrelation function R(k) of a voice signal exhibits peaks at integer multiples of the pitch period, and the pitch is usually estimated from the first peak other than R(0).
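Formula 3 and the peak search can be sketched as follows. This is a simplified illustration: instead of a true first-peak search it picks the largest R(k) inside a bounded lag range, which coincides with the first peak for the clean toy signal used here. The helper names `acf` and `peak_lag`, the lag bounds, and the sinusoidal test frame are assumptions.

```python
import math

def acf(s):
    """Short-time autocorrelation R(k), k = 0..N-1 (Formula 3)."""
    N = len(s)
    return [sum(s[n] * s[n + k] for n in range(N - k)) for k in range(N)]

def peak_lag(r, min_lag, max_lag):
    """Return the lag of the largest R(k) with min_lag <= k <= max_lag."""
    return max(range(min_lag, max_lag + 1), key=lambda k: r[k])

# Toy frame (assumed): a sinusoid with a period of 8 samples, N = 64.
frame = [math.sin(2 * math.pi * n / 8) for n in range(64)]
R = acf(frame)
print(peak_lag(R, min_lag=4, max_lag=32))
```

Because the sum in Formula 3 has fewer terms as k grows, peaks at higher multiples of the period are progressively smaller, which is why a bounded maximum search works on this clean example.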
S3. Obtain the ACF/CAMDF weighted squared feature K(k) from the following formula, where α, β and γ are constants greater than 1, R(k) is the autocorrelation function given by Formula 3, and D(k) is the circular average magnitude difference function given by Formula 2:
$$K(k) = \frac{[\alpha R(k)]^2}{[D(k) + \beta\gamma]^2} = \frac{\left[\alpha \sum_{n=0}^{N-k-1} S_\omega(n)\, S_\omega(n+k)\right]^2}{\left[\sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+k, N)) - S_\omega(n) \right| + \beta\gamma\right]^2} \qquad \text{(Formula 4)}$$
As step S2 shows, the ACF method looks for the position of the largest peak, whereas the CAMDF method looks for the position of the deepest valley: the ACF exhibits a peak at the pitch period, and the CAMDF exhibits a valley there. The sharper the first peak of R(k), or the more pronounced the global valley of D(k), the more accurate the pitch period estimate. Formula 4 shows that K(k) has the same period as R(k). Squaring the weighted R(k) sharpens its peaks, squaring the weighted D(k) makes its valleys more pronounced, and dividing the former by the latter yields a weighted squared feature whose peaks at integer multiples of the pitch period are especially prominent. Because the pitch period is located from a peak, this weighted squared feature locates the pitch period more accurately.
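The combination in Formula 4 can be sketched by dividing the squared, weighted ACF by the squared, offset CAMDF and taking the largest value in a lag range. This is a hedged illustration rather than the patented implementation: the constant term in the denominator is passed as a single hypothetical parameter `offset` standing in for the constant built from β and γ, and the function names, the values `alpha=2.0` and `offset=4.0`, the lag bounds, and the toy frame are all assumptions.

```python
import math

def weighted_square_feature(s, alpha, offset):
    """K(k) = [alpha * R(k)]^2 / [D(k) + offset]^2 for k = 0..N-1."""
    N = len(s)
    K = []
    for k in range(N):
        r = sum(s[n] * s[n + k] for n in range(N - k))         # ACF, Formula 3
        d = sum(abs(s[(n + k) % N] - s[n]) for n in range(N))  # CAMDF, Formula 2
        K.append((alpha * r) ** 2 / (d + offset) ** 2)
    return K

def estimate_pitch_period(s, min_lag, max_lag, alpha=2.0, offset=4.0):
    """Return the lag in [min_lag, max_lag] where K(k) is largest."""
    K = weighted_square_feature(s, alpha, offset)
    return max(range(min_lag, max_lag + 1), key=lambda k: K[k])

# Toy frame (assumed): a sinusoid with a period of 8 samples, N = 64.
frame = [math.sin(2 * math.pi * n / 8) for n in range(64)]
print(estimate_pitch_period(frame, min_lag=4, max_lag=32))
```

A strictly positive `offset` keeps the ratio finite where D(k) is zero or nearly zero, which is exactly where the peaks of K(k) are expected.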
The invention is not limited to the embodiment described above. A person of ordinary skill in the art may make improvements and refinements without departing from the principle of the invention, and such improvements and refinements are also regarded as falling within the scope of protection of the invention. Content not described in detail in this specification belongs to the prior art known to those skilled in the art.

Claims (8)

1. A pitch period estimation method for voice signals, characterized in that it comprises the following steps:
S1. denoising the noisy voice signal with an adaptive filter;
S2. computing the autocorrelation function and the circular average magnitude difference function of the denoised voice signal;
S3. obtaining a weighted squared feature through the formula
$$K(k) = \frac{[\alpha R(k)]^2}{[D(k) + \beta\gamma]^2}$$
where α, β and γ are constants greater than 1, R(k) is the autocorrelation function, and D(k) is the average magnitude difference function.
2. The pitch period estimation method for voice signals according to claim 1, characterized in that the adaptive filter is a least mean square (LMS) adaptive filter.
3. The pitch period estimation method for voice signals according to claim 1, characterized in that, in S2, the circular average magnitude difference function is
$$D(k) = \sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+k, N)) - S_\omega(n) \right|, \quad k = 0, 1, \ldots, N-1$$
where N is the length of the speech analysis frame, S_ω(n) is the windowed, denoised speech, and mod(n+k, N) denotes n+k modulo N, that is, the remainder of n+k divided by N.
4. The pitch period estimation method for voice signals according to claim 3, characterized in that, when the circular average magnitude difference function is computed, every sample in the current windowed speech frame is used exactly once, and the number of difference terms in the sum is the same for every k.
5. The pitch period estimation method for voice signals according to claim 3, characterized in that the autocorrelation function of S_ω(n) is
$$R(k) = \sum_{n=0}^{N-k-1} S_\omega(n)\, S_\omega(n+k)$$
where N is the length of the speech analysis frame and k is the lag.
6. The pitch period estimation method for voice signals according to claim 5, characterized in that the autocorrelation function R(k) exhibits peaks at integer multiples of the pitch period, and the pitch is estimated from the first peak other than R(0).
7. The pitch period estimation method for voice signals according to claim 5, characterized in that the autocorrelation function exhibits a peak at the pitch period, and the average magnitude difference function exhibits a valley at the pitch period.
8. The pitch period estimation method for voice signals according to claim 7, characterized in that, in S3,
$$K(k) = \frac{[\alpha R(k)]^2}{[D(k) + \beta\gamma]^2} = \frac{\left[\alpha \sum_{n=0}^{N-k-1} S_\omega(n)\, S_\omega(n+k)\right]^2}{\left[\sum_{n=0}^{N-1} \left| S_\omega(\operatorname{mod}(n+k, N)) - S_\omega(n) \right| + \beta\gamma\right]^2}$$
K(k) has the same period as R(k), the waveforms of R(k) and D(k) become sharper after weighting and squaring, and dividing one by the other yields the weighted squared feature.
CN2012101969831A 2012-06-15 2012-06-15 Algorithm for estimating pitch period of voice signal Pending CN102737645A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012101969831A CN102737645A (en) 2012-06-15 2012-06-15 Algorithm for estimating pitch period of voice signal

Publications (1)

Publication Number Publication Date
CN102737645A true CN102737645A (en) 2012-10-17

Family

ID=46993014

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012101969831A Pending CN102737645A (en) 2012-06-15 2012-06-15 Algorithm for estimating pitch period of voice signal

Country Status (1)

Country Link
CN (1) CN102737645A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1851806A (en) * 2006-05-30 2006-10-25 北京中星微电子有限公司 Adaptive microphone array system and its voice signal processing method
JP2008209547A (en) * 2007-02-26 2008-09-11 National Institute Of Advanced Industrial & Technology Pitch estimation device, pitch estimation method and program
CN101673550A (en) * 2008-09-09 2010-03-17 联芯科技有限公司 Spectral gain calculating method and device and noise suppression system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李晋: "改进的基音检测算法", 《计算机工程与应用》, vol. 03, no. 03, 19 January 2011 (2011-01-19) *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104217731A (en) * 2014-08-28 2014-12-17 东南大学 Quick solo music score recognizing method
CN105679331A (en) * 2015-12-30 2016-06-15 广东工业大学 Sound-breath signal separating and synthesizing method and system
CN106205638A (en) * 2016-06-16 2016-12-07 清华大学 A kind of double-deck fundamental tone feature extracting method towards audio event detection
CN106205638B (en) * 2016-06-16 2019-11-08 清华大学 A kind of double-deck fundamental tone feature extracting method towards audio event detection
CN108831504A (en) * 2018-06-13 2018-11-16 西安蜂语信息科技有限公司 Determination method, apparatus, computer equipment and the storage medium of pitch period
CN108831504B (en) * 2018-06-13 2020-12-04 西安蜂语信息科技有限公司 Method and device for determining pitch period, computer equipment and storage medium
CN110390953A (en) * 2019-07-25 2019-10-29 腾讯科技(深圳)有限公司 It utters long and high-pitched sounds detection method, device, terminal and the storage medium of voice signal
CN110390953B (en) * 2019-07-25 2023-11-17 腾讯科技(深圳)有限公司 Method, device, terminal and storage medium for detecting howling voice signal

Similar Documents

Publication Publication Date Title
Boersma Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound
Alku et al. Formant frequency estimation of high-pitched vowels using weighted linear prediction
EP3040991B1 (en) Voice activation detection method and device
US10510363B2 (en) Pitch detection algorithm based on PWVT
CN111128213B (en) Noise suppression method and system for processing in different frequency bands
US9454976B2 (en) Efficient discrimination of voiced and unvoiced sounds
US7272551B2 (en) Computational effectiveness enhancement of frequency domain pitch estimators
CN102737645A (en) Algorithm for estimating pitch period of voice signal
JP5992427B2 (en) Method and apparatus for estimating a pattern related to pitch and / or fundamental frequency in a signal
KR20100049601A (en) Cyclic signal processing method, cyclic signal conversion method, cyclic signal processing device, and cyclic signal analysis method
KR101762723B1 (en) Method and apparatus for detecting correctness of pitch period
CN103794222A (en) Method and apparatus for detecting voice fundamental tone frequency
CN102054470A (en) High-precision piano tuner and tuning method thereof
CN107371116A (en) A kind of detection method of uttering long and high-pitched sounds based on interframe spectrum flatness deviation
CN107564512A (en) Voice activity detection method and device
Bouzid et al. Voice source parameter measurement based on multi-scale analysis of electroglottographic signal
JP5325130B2 (en) LPC analysis device, LPC analysis method, speech analysis / synthesis device, speech analysis / synthesis method, and program
Jin et al. An improved speech endpoint detection based on spectral subtraction and adaptive sub-band spectral entropy
Li et al. A pitch estimation algorithm for speech in complex noise environments based on the radon transform
Upadhya Pitch detection in time and frequency domain
JP4760179B2 (en) Voice feature amount calculation apparatus and program
CN108830232B (en) Voice signal period segmentation method based on multi-scale nonlinear energy operator
Zhao et al. A New Pitch Estimation Method Based on AMDF.
Graf et al. Improved performance measures for voice activity detection
Wu et al. Speech endpoint detection based on EMD and improved spectral subtraction

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20121017