CN103474074A - Voice pitch period estimation method and device - Google Patents
Voice pitch period estimation method and device
- Publication number: CN103474074A
- Application number: CN201310409433.8
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention relates to a method and a device for estimating a voice pitch period. The device comprises a signal preprocessing unit, a normalized autocorrelation function calculating unit and a pitch period post-processing unit. The method comprises the following steps: S1, preprocessing the voice signal by removing the DC component, perceptual weighting and signal down-sampling; S2, calculating normalized autocorrelation function values of the preprocessed voice signal; S3, determining the maximum of the normalized autocorrelation function values within the pitch period search range, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the voice signal. The invention better overcomes frequency-doubling and frequency-halving errors in pitch period estimation, improves the noise robustness of the estimation method, reduces the computational complexity of the algorithm and improves the efficiency of the corresponding digital audio/voice coding. The invention is suitable for pitch search in various voice coding and decoding algorithms and has wide applicability.
Description
Technical field
The present invention relates to speech coding technology, and more particularly to a pitch period estimation method and device.
Background art
The pitch period is the period of vocal cord vibration during speech. Pitch period estimation is an important problem in speech coding, and its accuracy directly affects the coding quality and efficiency of a speech coder. Accurate pitch period analysis can effectively remove redundancy in speech, reduce the number of coded bits, and enable high-quality speech coding at low bit rates. However, owing to the particular nature of speech, accurate pitch period search faces the following difficulties:
(1) The voice signal varies in a very complex way; the glottal excitation waveform is not a perfectly periodic pulse train, and the period of the speech waveform is time-varying.
(2) The beginning and end of an utterance lack the periodicity of vocal cord vibration, and for some transitions between unvoiced and voiced sounds it is difficult to judge whether the signal is periodic or aperiodic, so the pitch period cannot be estimated.
(3) It is difficult to remove the influence of the vocal tract from the voice signal and extract only the information related to vocal cord vibration.
(4) It is difficult to define precisely the start and end of each pitch period within a voiced segment, which limits reliable pitch measurement. This is not only because the voice signal itself is quasi-periodic (that is, the pitch varies), but also because the waveform is affected by formants, noise and similar factors.
(5) In practical applications, background noise degrades pitch detection performance. This is particularly important in mobile communication environments, where the waveform often carries a high noise level.
(6) The wide range over which the pitch period varies also makes accurate pitch detection difficult.
At present, no general method can extract the voice pitch period accurately and reliably under all conditions. Traditional pitch detection methods can be divided into time-domain and frequency-domain methods. In the time domain, traditional pitch period algorithms include the pitch estimation algorithm based on the average magnitude difference function (AMDF) and the pitch detection algorithm based on the short-time autocorrelation function (ACF). For an introduction to these two algorithms, see the following reference:
Chu, Wai C. Speech coding algorithms: foundation and evolution of standardized coders. John Wiley & Sons, Inc., 2003, pp. 33-45.
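As a rough illustration of the two time-domain measures named above, the sketch below uses their usual textbook definitions (it is not taken from the cited reference, and the function names `acf` and `amdf` are illustrative). For a periodic signal the short-time ACF peaks at the period, while the AMDF dips there.

```python
import math

def acf(x, tau):
    """Short-time autocorrelation of frame x at lag tau (unnormalized)."""
    return sum(x[n] * x[n + tau] for n in range(len(x) - tau))

def amdf(x, tau):
    """Average magnitude difference of frame x at lag tau."""
    n_terms = len(x) - tau
    return sum(abs(x[n] - x[n + tau]) for n in range(n_terms)) / n_terms

# Toy frame: a sine with a period of 40 samples (assumed test signal).
frame = [math.sin(2.0 * math.pi * n / 40.0) for n in range(200)]
lags = range(20, 61)
acf_pitch = max(lags, key=lambda t: acf(frame, t))    # lag of the ACF peak
amdf_pitch = min(lags, key=lambda t: amdf(frame, t))  # lag of the AMDF valley
```

Both measures are equally strong at integer multiples of the true period, which is one source of the doubling and halving errors discussed in this background.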
In the frequency domain, Griffin and Lim proposed a frequency-domain pitch period estimation scheme (D. W. Griffin, J. S. Lim. Multiband Excitation Vocoder. IEEE Trans. ASSP, 1988, 36(8)) for the multi-band excitation (MBE) speech coding algorithm. This pitch period estimation algorithm uses a closed-loop analysis-by-synthesis method that matches the frequency-domain waveform of the signal to obtain the optimal pitch period estimate.
In practical applications, time-domain pitch search algorithms are widely used because they are simple and perform well. For example, the current speech coding standards G.729 and AMR-WB both adopt an improved time-domain short-time autocorrelation function (ACF) pitch detection algorithm (Bao Changchun. Fundamentals of Low Bit-Rate Digital Speech Coding. Beijing: Beijing University of Technology Press, 2001.2.). However, the time-domain ACF method is prone to "frequency-doubling" and "frequency-halving" errors, and the AMDF method cannot effectively track rapid changes in speech frequency. Frequency-domain methods generally use the cepstrum method, which substantially increases the amount of computation because of the logarithm operation and is also susceptible to noise.
Summary of the invention
The technical problem to be solved by the present invention is to address the above defects of the prior art by providing a low-complexity, efficient pitch period estimation method and device that better overcome frequency-doubling and frequency-halving errors in pitch period estimation and improve noise robustness.
The technical solution adopted by the present invention to solve this problem is a pitch period estimation method comprising the following steps:
S1, preprocessing the voice signal by removing the DC component, perceptual weighting and signal down-sampling;
S2, calculating normalized autocorrelation function values of the preprocessed voice signal using the following formula:
where ρ(τ) denotes the normalized autocorrelation function value, s(n) is the perceptually weighted voice signal, τ denotes the pitch period candidate being searched, and N is the length of one frame of the signal after down-sampling;
S3, determining the maximum of the normalized autocorrelation function values within the pitch period search range, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the voice signal.
In one embodiment, step S1 further comprises:
S11, resampling the voice signal to an internal sampling rate;
S12, high-pass filtering the resampled voice signal to remove the DC component;
S13, perceptually weighting the high-pass-filtered voice signal;
S14, low-pass filtering the perceptually weighted voice signal and down-sampling it by a factor of 2.
In one embodiment, the internal sampling rate is 12.8 kHz and the cutoff frequency of the high-pass filter is 50 Hz.
In one embodiment, step S3 further comprises:
S31, dividing the pitch period search range, according to the sampling rate of the voice signal, into a first interval, a second interval and a third interval, and obtaining the normalized autocorrelation function maximum of each interval and the corresponding pitch period candidate;
S32, selecting, according to a weight parameter, the normalized autocorrelation function maximum of the pitch period search range from the maxima of the three intervals, and taking the pitch period candidate corresponding to the selected maximum as the pitch period estimate of the voice signal.
In one embodiment, step S32 further comprises: judging whether the normalized autocorrelation function maximum of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, taking the pitch period candidate corresponding to the second-interval maximum as the pitch period estimate of the voice signal; otherwise, further judging whether the normalized autocorrelation function maximum of the third interval is greater than or equal to the product of the first-interval maximum and the weight parameter; if so, taking the pitch period candidate corresponding to the third-interval maximum as the pitch period estimate of the voice signal; otherwise, taking the pitch period candidate corresponding to the first-interval maximum as the pitch period estimate of the voice signal.
In one embodiment, the first, second and third intervals are [L_min, 39], [40, 79] and [80, L_max] respectively, where L_min denotes the start of the pitch period search range and L_max denotes its end.
To solve the technical problem, the present invention also provides a pitch period estimation device, comprising:
a signal preprocessing unit, which preprocesses the voice signal by removing the DC component, perceptual weighting and signal down-sampling;
a normalized autocorrelation function calculating unit, which calculates normalized autocorrelation function values of the preprocessed voice signal using the following formula:
where ρ(τ) denotes the normalized autocorrelation function value, s(n) is the perceptually weighted voice signal, τ denotes the pitch period candidate being searched, and N is the length of one frame of the signal after down-sampling;
a pitch period post-processing unit, which determines the maximum of the normalized autocorrelation function values within the pitch period search range and takes the pitch period candidate corresponding to that maximum as the pitch period estimate of the voice signal.
In one embodiment, the signal preprocessing unit resamples the voice signal to an internal sampling rate, then high-pass filters the resampled voice signal to remove the DC component, then perceptually weights the high-pass-filtered voice signal, and finally low-pass filters the perceptually weighted voice signal and down-samples it by a factor of 2.
In one embodiment, the pitch period post-processing unit divides the pitch period search range, according to the sampling rate of the voice signal, into a first interval, a second interval and a third interval, obtains the normalized autocorrelation function maximum of each interval and the corresponding pitch period candidate, selects, according to a weight parameter, the normalized autocorrelation function maximum of the pitch period search range from the maxima of the three intervals, and takes the pitch period candidate corresponding to the selected maximum as the pitch period estimate of the voice signal.
In one embodiment, the pitch period post-processing unit selects the normalized autocorrelation function maximum of the pitch period search range from the maxima of the three intervals according to the weight parameter as follows: it judges whether the normalized autocorrelation function maximum of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, it takes the pitch period candidate corresponding to the second-interval maximum as the pitch period estimate of the voice signal; otherwise, it further judges whether the normalized autocorrelation function maximum of the third interval is greater than or equal to the product of the first-interval maximum and the weight parameter; if so, it takes the pitch period candidate corresponding to the third-interval maximum as the pitch period estimate of the voice signal; otherwise, it takes the pitch period candidate corresponding to the first-interval maximum as the pitch period estimate of the voice signal.
The pitch period estimation method and device of the present invention are based on normalized autocorrelation function pitch detection and introduce pre-processing and post-processing techniques into pitch period estimation. They better overcome frequency-doubling and frequency-halving errors in pitch period estimation, improve the noise robustness of the estimation method, reduce the computational complexity of the algorithm, and improve the efficiency of the corresponding digital audio/voice coding. The invention can be applied to pitch search in various voice coding/decoding algorithms and has wide applicability.
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments, in which:
Fig. 1 is a flowchart of the pitch period estimation method according to one embodiment of the invention;
Fig. 2 is a flowchart of a specific embodiment of step 110 in Fig. 1;
Fig. 3 is a flowchart of a specific embodiment of step 130 in Fig. 1;
Fig. 4 is a logic block diagram of the pitch period estimation device according to one embodiment of the invention.
Detailed description
To make the purpose, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
Fig. 1 shows a flowchart of the pitch period estimation method 100 according to one embodiment of the invention. As shown in Fig. 1, the pitch period estimation method 100 comprises:
In step 110, the voice signal is preprocessed by removing the DC component, perceptual weighting and signal down-sampling.
In step 120, the normalized autocorrelation function values of the preprocessed voice signal are calculated. The present invention uses the following normalized autocorrelation function:
where ρ(τ) denotes the normalized autocorrelation function value, s(n) is the perceptually weighted voice signal, τ denotes the pitch period candidate being searched, and N is the length of one frame of the signal after down-sampling.
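The formula referred to in step 120 is not reproduced in this text. A commonly used form of the normalized autocorrelation that is consistent with the symbol definitions given for ρ(τ), s(n), τ and N is sketched below. The summation limits and the energy normalization in the denominator are assumptions, not taken from the source.

```latex
\rho(\tau) =
  \frac{\displaystyle\sum_{n=0}^{N-1} s(n)\, s(n-\tau)}
       {\sqrt{\displaystyle\sum_{n=0}^{N-1} s^{2}(n)\;
              \displaystyle\sum_{n=0}^{N-1} s^{2}(n-\tau)}}
```

With this normalization, ρ(τ) lies in [-1, 1] by the Cauchy-Schwarz inequality and equals 1 when the frame is an exact copy of the signal τ samples earlier, so values from different lags can be compared directly.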
In step 130, the maximum of the normalized autocorrelation function values within the pitch period search range is determined, and the pitch period candidate corresponding to that maximum is taken as the pitch period estimate of the voice signal.
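Steps 120 and 130 can be sketched as follows, in their simplest form without interval post-processing. This is a minimal sketch under stated assumptions: the normalization divides by the square root of the product of the two segment energies, the frame is the last `n_frame` samples of a buffer that also holds enough past samples, and the names `nacf` and `estimate_pitch` are illustrative rather than taken from the source.

```python
import math

def nacf(buf, tau, n_frame):
    """Normalized autocorrelation at lag tau (assumed energy normalization).

    buf holds past samples followed by the current frame of n_frame samples;
    it must contain at least n_frame + tau samples.
    """
    start = len(buf) - n_frame
    cur = buf[start:]                           # s(n),       n = 0 .. N-1
    past = buf[start - tau:len(buf) - tau]      # s(n - tau), n = 0 .. N-1
    num = sum(a * b for a, b in zip(cur, past))
    den = math.sqrt(sum(a * a for a in cur) * sum(b * b for b in past))
    return num / den if den > 0.0 else 0.0

def estimate_pitch(buf, n_frame, l_min, l_max):
    """Return (tau, rho) for the lag with the largest NACF in [l_min, l_max]."""
    best = max(range(l_min, l_max + 1), key=lambda t: nacf(buf, t, n_frame))
    return best, nacf(buf, best, n_frame)

# Toy input: a sine with a 50-sample period, 256-sample frame, 90 past samples.
signal = [math.sin(2.0 * math.pi * n / 50.0) for n in range(256 + 90)]
tau_hat, rho_hat = estimate_pitch(signal, 256, 20, 90)
```

The search range in the toy call is chosen so that only one multiple of the period falls inside it; a wider range would contain equally strong peaks at multiples of the period, which is the ambiguity that post-processing is meant to resolve.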
The present invention introduces signal pre-processing into pitch period estimation. Fig. 2 shows a flowchart of a specific embodiment of the signal preprocessing step 110 shown in Fig. 1. As shown in Fig. 2, the signal preprocessing step 110 further comprises:
In step 111, the voice signal is resampled to the internal sampling rate (Fs = 12.8 kHz).
In the following step 112, the resampled voice signal is high-pass filtered. The cutoff frequency of the high-pass filter may be 50 Hz; its purpose is to remove the DC component.
Then, in step 113, the high-pass-filtered voice signal is perceptually weighted.
In the final step 114, the perceptually weighted voice signal is low-pass filtered and down-sampled by a factor of 2, which limits the signal bandwidth to 3.2 kHz.
In a further preferred embodiment, a digital filter may also be added in the signal preprocessing step 110 to remove formants and high-frequency noise, so that the pitch period is estimated more accurately.
Before the pitch period search, the present invention preprocesses the input voice signal. This both filters out the high-frequency part that contributes nothing to pitch period estimation and reduces the computational complexity of the algorithm.
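The preprocessing chain of steps 112 to 114 can be sketched as below. Everything beyond the 50 Hz cutoff, the 12.8 kHz rate and the factor-of-2 down-sampling is an assumption made for illustration: the input is taken to be already resampled to 12.8 kHz (step 111 is omitted), the high-pass filter is a simple one-pole DC blocker, the low-pass filter is a 31-tap Hamming-windowed sinc, and the perceptual weighting is a pass-through placeholder because the source does not specify the weighting filter.

```python
import math

def dc_block(x, fs=12800.0, fc=50.0):
    """One-pole, one-zero high-pass filter; the pole radius is set from fc."""
    r = 1.0 - 2.0 * math.pi * fc / fs
    y, prev_x, prev_y = [], 0.0, 0.0
    for v in x:
        out = v - prev_x + r * prev_y
        y.append(out)
        prev_x, prev_y = v, out
    return y

def lowpass_decimate2(x, taps=31):
    """Low-pass at a quarter of the input rate, then keep every 2nd sample."""
    mid = taps // 2
    h = []
    for k in range(taps):
        m = k - mid
        ideal = 0.5 if m == 0 else math.sin(0.5 * math.pi * m) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * k / (taps - 1))
        h.append(ideal * window)
    gain = sum(h)
    h = [c / gain for c in h]                   # unity gain at DC
    out = []
    for n in range(0, len(x), 2):               # only compute kept samples
        acc = 0.0
        for k, c in enumerate(h):
            i = n + mid - k
            if 0 <= i < len(x):
                acc += c * x[i]
        out.append(acc)
    return out

def preprocess(x, weight=lambda s: s):
    """High-pass, perceptual weighting (placeholder), low-pass + 1/2 down-sampling."""
    return lowpass_decimate2(weight(dc_block(x)))
```

Halving the rate from 12.8 kHz to 6.4 kHz is what gives the 3.2 kHz bandwidth mentioned in step 114, and it also halves the number of samples in every correlation sum computed afterwards.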
The present invention also introduces pitch period post-processing into pitch period estimation. Fig. 3 shows a flowchart of a specific embodiment of the pitch period post-processing step 130 shown in Fig. 1. As shown in Fig. 3, the pitch period post-processing step 130 further comprises:
In step 131, the pitch period search range is divided, according to the sampling rate of the voice signal, into a first interval, a second interval and a third interval, and the normalized autocorrelation function maximum of each interval and the corresponding pitch period candidate are obtained.
In one embodiment, the pitch period search range is [L_min, L_max], where L_min denotes the start of the pitch period search range and L_max denotes its end. According to the sampling frequency of the voice signal described above, this search range can be divided into three intervals, namely a first interval [L_min, 39], a second interval [40, 79] and a third interval [80, L_max], so that the correct pitch period estimate can be determined among these three intervals. In a specific embodiment, L_min and L_max may be 0 and 256 respectively. For these three intervals, the maximum ρ(τ) value in each interval and the corresponding pitch period candidate τ are obtained, denoted ρ_max1, ρ_max2 and ρ_max3, and τ_1, τ_2 and τ_3.
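The per-interval search of step 131 can be sketched as follows. The sketch assumes the normalized autocorrelation values have already been computed and stored in a mapping from lag to value; the function name `interval_maxima` and its parameters are illustrative, and the interval edges 40 and 80 are the ones given in the embodiment.

```python
def interval_maxima(rho, l_min=0, l_max=256, edges=(40, 80)):
    """Return [(rho_max1, tau_1), (rho_max2, tau_2), (rho_max3, tau_3)].

    rho maps each candidate lag to its normalized autocorrelation value.
    Each interval is assumed to contain at least one lag present in rho.
    """
    bounds = [(l_min, edges[0] - 1),
              (edges[0], edges[1] - 1),
              (edges[1], l_max)]
    result = []
    for lo, hi in bounds:
        lags = [t for t in range(lo, hi + 1) if t in rho]
        best = max(lags, key=lambda t: rho[t])  # lag of the interval maximum
        result.append((rho[best], best))
    return result

# Toy values: a flat function with one peak placed in each interval.
rho_values = {t: 0.0 for t in range(0, 257)}
rho_values[35], rho_values[70], rho_values[140] = 0.8, 0.9, 0.7
maxima = interval_maxima(rho_values)
```

Keeping one candidate per interval means a true period and its double, which usually fall in different intervals, both survive to the selection stage instead of being decided by a single global maximum.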
In step 132, according to a weight parameter, the normalized autocorrelation function maximum of the pitch period search range is selected from the maxima of the three intervals, and the pitch period candidate corresponding to the selected maximum is taken as the pitch period estimate of the voice signal.
In one embodiment, with a selected weight parameter c (a value close to 1.0, for example 0.97), the optimal pitch period candidate τ_opt can be determined as follows:
First, judge whether the normalized autocorrelation function maximum ρ_max2 of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum ρ_max1 of the first interval and the weight parameter c. If so, the pitch period candidate τ_2 corresponding to ρ_max2 is taken as the pitch period estimate of the voice signal. Otherwise, further judge whether the normalized autocorrelation function maximum ρ_max3 of the third interval is greater than or equal to the product of ρ_max1 and the weight parameter c. If so, the pitch period candidate τ_3 corresponding to ρ_max3 is taken as the pitch period estimate of the voice signal; otherwise, the pitch period candidate τ_1 corresponding to ρ_max1 is taken as the pitch period estimate of the voice signal.
The corresponding mathematical notation is as follows:
Let τ_opt = τ_1, ρ_max = ρ_max1;
If ρ_max2 ≥ c·ρ_max, then ρ_max = ρ_max2, τ_opt = τ_2;
If ρ_max3 ≥ c·ρ_max, then ρ_max = ρ_max3, τ_opt = τ_3.
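The selection rule can be sketched as follows. The sketch follows the mathematical notation literally, so the third test compares against the running maximum, which by then may already be the second-interval value. The function name `select_pitch` and the input layout are illustrative assumptions.

```python
def select_pitch(maxima, c=0.97):
    """Pick (tau_opt, rho_max) from three (rho_max_k, tau_k) pairs.

    maxima lists the per-interval maxima in interval order; c is the
    weight parameter (close to 1.0, for example 0.97).
    """
    rho_max, tau_opt = maxima[0]                # start from the first interval
    for rho_k, tau_k in maxima[1:]:             # second, then third interval
        if rho_k >= c * rho_max:
            rho_max, tau_opt = rho_k, tau_k
    return tau_opt, rho_max

# Second interval wins: 0.88 >= 0.97 * 0.90, and 0.50 < 0.97 * 0.88.
choice_a = select_pitch([(0.90, 30), (0.88, 60), (0.50, 120)])
# No later interval comes within the weight of the first: tau_1 is kept.
choice_b = select_pitch([(0.90, 30), (0.60, 60), (0.50, 120)])
```

Because c is slightly below 1, a later interval is chosen even when its peak is marginally lower than the current one, so the rule leans toward the longer lag when two peaks are nearly equal.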
In a further preferred embodiment, the pitch period post-processing step 130 may also use the normalized autocorrelation function to judge the unvoiced/voiced character of the voice signal, so as to improve the accuracy of pitch period estimation.
Based on the pitch period estimation method described above, the present invention also provides a pitch period estimation device. Fig. 4 shows a logic block diagram of the pitch period estimation device 400 according to one embodiment of the invention. As shown in Fig. 4, the pitch period estimation device 400 comprises a signal preprocessing unit 410, a normalized autocorrelation function calculating unit 420 and a pitch period post-processing unit 430. The signal preprocessing unit 410 preprocesses the input voice signal by removing the DC component, perceptual weighting and signal down-sampling. The normalized autocorrelation function calculating unit 420 calculates normalized autocorrelation function values of the voice signal preprocessed by the signal preprocessing unit 410, using the following formula:
where ρ(τ) denotes the normalized autocorrelation function value, s(n) is the perceptually weighted voice signal, τ denotes the pitch period candidate being searched, and N is the length of one frame of the signal after down-sampling. The pitch period post-processing unit 430 determines the maximum of the normalized autocorrelation function values within the pitch period search range and takes the pitch period candidate corresponding to that maximum as the pitch period estimate of the voice signal.
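The three-unit structure of device 400 can be pictured as a small composition of interchangeable stages. The class below is only an architectural sketch: the class name, method name and callable signatures are hypothetical, and the stand-in units in the usage lines are trivial placeholders rather than the processing described in this document.

```python
class PitchEstimationDevice:
    """Chains a preprocessing unit, an NACF unit and a post-processing unit."""

    def __init__(self, preprocess, nacf, postprocess):
        self.preprocess = preprocess      # role of unit 410: signal -> signal
        self.nacf = nacf                  # role of unit 420: (signal, tau) -> rho
        self.postprocess = postprocess    # role of unit 430: {tau: rho} -> tau

    def estimate(self, speech, l_min, l_max):
        s = self.preprocess(speech)
        rho = {tau: self.nacf(s, tau) for tau in range(l_min, l_max + 1)}
        return self.postprocess(rho)

# Usage with stand-in units (placeholders chosen only to exercise the wiring).
device = PitchEstimationDevice(
    preprocess=lambda x: x,                          # pass-through
    nacf=lambda s, tau: -abs(tau - 57),              # fake curve peaking at 57
    postprocess=lambda rho: max(rho, key=rho.get),   # plain global maximum
)
pitch = device.estimate([0.0] * 10, 20, 143)
```

Passing the units in as callables keeps each stage replaceable, for example swapping the plain global maximum for a three-interval weighted selection without touching the other two units.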
In a specific embodiment, the signal preprocessing unit 410 first resamples the input voice signal to the internal sampling rate (Fs = 12.8 kHz) and then high-pass filters the resampled voice signal; the cutoff frequency of the filter may be 50 Hz, and its purpose is to remove the DC component. The unit then perceptually weights the high-pass-filtered voice signal, and finally low-pass filters the perceptually weighted voice signal and down-samples it by a factor of 2, which limits the signal bandwidth to 3.2 kHz. This both filters out the high-frequency part that contributes nothing to pitch period estimation and reduces the computational complexity of the algorithm.
In a specific embodiment, the pitch period post-processing unit 430 divides the pitch period search range, according to the sampling rate of the voice signal, into a first interval, a second interval and a third interval, for example a first interval [L_min, 39], a second interval [40, 79] and a third interval [80, L_max], where L_min denotes the start of the pitch period search range and L_max denotes its end. It then obtains the maximum ρ(τ) value in each interval and the corresponding pitch period candidate τ, denoted ρ_max1, ρ_max2 and ρ_max3, and τ_1, τ_2 and τ_3. According to a weight parameter c (a value close to 1.0, for example 0.97), the pitch period post-processing unit 430 can then determine the optimal pitch period candidate τ_opt as follows:
Let τ_opt = τ_1, ρ_max = ρ_max1;
If ρ_max2 ≥ c·ρ_max, then ρ_max = ρ_max2, τ_opt = τ_2;
If ρ_max3 ≥ c·ρ_max, then ρ_max = ρ_max3, τ_opt = τ_3.
The pitch period estimation method and device of the present invention are based on normalized autocorrelation function pitch detection and introduce pre-processing and post-processing techniques into pitch period estimation. They better overcome frequency-doubling and frequency-halving errors in pitch period estimation, improve the noise robustness of the estimation method, and improve the efficiency of the corresponding digital audio/voice coding. A performance comparison between the pitch search algorithm of the present invention and that of AMR-WB+ is given below:
1. Performance test method: the sequence-average signal-to-noise ratio (SNR) is calculated, defined as follows:
where N (N = 256) is the length of one frame of the voice signal, N_SF is the total number of frames in a voice sequence, x_w(n) is the original signal after perceptual weighting, and x̂_w(n) is the coded and decoded voice signal after perceptual weighting.
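The SNR formula itself is not reproduced in this text. Assuming it is the per-frame SNR in dB of the perceptually weighted signal, averaged over all frames of a sequence (an assumption suggested by the symbol definitions, not confirmed by the source), it could be computed as in the sketch below. The function name and the small `eps` guard against empty or perfectly reconstructed frames are illustrative additions.

```python
import math

def sequence_average_snr(x_w, x_hat_w, frame_len=256, eps=1e-12):
    """Average over frames of 10*log10(signal energy / error energy).

    x_w: perceptually weighted original signal.
    x_hat_w: perceptually weighted coded/decoded signal.
    Only complete frames of frame_len samples are used.
    """
    n_frames = len(x_w) // frame_len
    total_db = 0.0
    for i in range(n_frames):
        a, b = i * frame_len, (i + 1) * frame_len
        sig = sum(v * v for v in x_w[a:b])
        err = sum((v - u) ** 2 for v, u in zip(x_w[a:b], x_hat_w[a:b]))
        total_db += 10.0 * math.log10((sig + eps) / (err + eps))
    return total_db / n_frames

# Toy check: a constant 10 % amplitude error in every frame of two frames.
snr_db = sequence_average_snr([1.0] * 512, [0.9] * 512)
```

Averaging in the dB domain frame by frame weights quiet and loud frames equally, unlike a single SNR computed over the whole sequence.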
2. Test results
Table 1: Comparison of sequence-average SNR for the two algorithms (mono)
Table 2: Comparison of sequence-average SNR for the two algorithms (stereo)
3. Analysis of test results
(1) The test results show that the algorithm proposed by the present invention performs slightly better than the pitch period search algorithm of AMR-WB+, and that its computational complexity is comparable to (in fact slightly lower than) that of the AMR-WB+ algorithm.
(2) The results in Table 1 and Table 2 show that the two sequences es02 and s_cl_mt_2_org have relatively poor coding performance, while s_cl_ft_3_org has the best. Analysis of the sequences shows that es02 and s_cl_mt_2_org are middle-aged male voices and that s_cl_ft_3_org is a young female voice. Analysis of the algorithm shows that this is related to the choice of the parameter set in the algorithm of the present invention to prevent detection of a doubled period. This parameter is an empirical value, and the algorithm currently targets mainly female and child voices, whose pitch period varies over a wide range and changes rapidly. By comparison, the pitch period of a middle-aged male voice changes very smoothly and over a relatively small range.
(3) The tests also included some typical noisy speech sequences, s_no_ft_9_org, s_no_2t_1_org, s_no_2t_2_org, s_no_2t_3_org and s_no_ft_1_org, for example speech containing heavy background noise such as that recorded at an airport. The test results show that the noise robustness of the algorithm of the present invention is better than that of the AMR-WB+ algorithm.
Claims (10)
1. A pitch period estimation method, characterized by comprising the following steps:
S1, preprocessing the voice signal by removing the DC component, perceptual weighting and signal down-sampling;
S2, calculating normalized autocorrelation function values of the preprocessed voice signal using the following formula:
where ρ(τ) denotes the normalized autocorrelation function value, s(n) is the perceptually weighted voice signal, τ denotes the pitch period candidate being searched, and N is the length of one frame of the signal after down-sampling;
S3, determining the maximum of the normalized autocorrelation function values within the pitch period search range, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the voice signal.
2. The method according to claim 1, characterized in that step S1 further comprises:
S11, resampling the voice signal to an internal sampling rate;
S12, high-pass filtering the resampled voice signal to remove the DC component;
S13, perceptually weighting the high-pass-filtered voice signal;
S14, low-pass filtering the perceptually weighted voice signal and down-sampling it by a factor of 2.
3. The method according to claim 2, characterized in that the internal sampling rate is 12.8 kHz and the cutoff frequency of the high-pass filter is 50 Hz.
4. The method according to claim 1, characterized in that step S3 further comprises:
S31, dividing the pitch period search range, according to the sampling rate of the voice signal, into a first interval, a second interval and a third interval, and obtaining the normalized autocorrelation function maximum of each interval and the corresponding pitch period candidate;
S32, selecting, according to a weight parameter, the normalized autocorrelation function maximum of the pitch period search range from the maxima of the three intervals, and taking the pitch period candidate corresponding to the selected maximum as the pitch period estimate of the voice signal.
5. The method according to claim 4, characterized in that step S32 further comprises: judging whether the normalized autocorrelation function maximum of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, taking the pitch period candidate corresponding to the second-interval maximum as the pitch period estimate of the voice signal; otherwise, further judging whether the normalized autocorrelation function maximum of the third interval is greater than or equal to the product of the first-interval maximum and the weight parameter; if so, taking the pitch period candidate corresponding to the third-interval maximum as the pitch period estimate of the voice signal; otherwise, taking the pitch period candidate corresponding to the first-interval maximum as the pitch period estimate of the voice signal.
6. The method according to claim 5, characterized in that the first, second and third intervals are [L_min, 39], [40, 79] and [80, L_max] respectively, where L_min denotes the start of the pitch period search range and L_max denotes its end.
7. A pitch period estimation device, characterized by comprising:
a signal preprocessing unit, which preprocesses the voice signal by removing the DC component, perceptual weighting and signal down-sampling;
a normalized autocorrelation function calculating unit, which calculates normalized autocorrelation function values of the preprocessed voice signal using the following formula:
where ρ(τ) denotes the normalized autocorrelation function value, s(n) is the perceptually weighted voice signal, τ denotes the pitch period candidate being searched, and N is the length of one frame of the signal after down-sampling;
a pitch period post-processing unit, which determines the maximum of the normalized autocorrelation function values within the pitch period search range and takes the pitch period candidate corresponding to that maximum as the pitch period estimate of the voice signal.
8. device according to claim 7, it is characterized in that, described Signal Pretreatment unit is further to the voice signal inner sampling rate that resamples, then the voice signal resampled is carried out to high-pass filtering to remove DC component, subsequently the voice signal after high-pass filtering is carried out to perceptual weighting, finally the voice signal after perceptual weighting is carried out to low-pass filtering and 1/2 down-sampling.
9. The device according to claim 7, characterized in that the pitch period post-processing unit further divides the pitch period search range, according to the sampling rate of the voice signal, into a first interval, a second interval, and a third interval; obtains the normalized autocorrelation function maximum of each interval and its corresponding pitch period candidate value; selects, according to a weight parameter, the normalized autocorrelation function maximum of the pitch period search range from among the maxima of the three intervals; and determines the pitch period candidate value corresponding to the selected maximum as the pitch period estimated value of the voice signal.
10. The device according to claim 9, characterized in that the pitch period post-processing unit selects the normalized autocorrelation function maximum of the pitch period search range from among the maxima of the three intervals according to the weight parameter as follows: it judges whether the normalized autocorrelation function maximum of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, the pitch period candidate value corresponding to the maximum of the second interval is determined as the pitch period estimated value of the voice signal; otherwise, it further judges whether the normalized autocorrelation function maximum of the third interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, the pitch period candidate value corresponding to the maximum of the third interval is determined as the pitch period estimated value of the voice signal; otherwise, the pitch period candidate value corresponding to the maximum of the first interval is determined as the pitch period estimated value of the voice signal.
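The decision rule of claims 9 and 10 can be sketched in Python as below. The comparison order and the use of a single weight parameter follow the claim text; the function names, the dictionary of precomputed ρ values, the interval bounds, and any particular weight value are hypothetical and would have to be taken from the description.

```python
def interval_maximum(rho_by_tau, lo, hi):
    """Return (rho_max, tau) over the candidates lo..hi inclusive.
    rho_by_tau maps each candidate lag to its normalized autocorrelation value."""
    best_tau = max(range(lo, hi + 1), key=lambda tau: rho_by_tau[tau])
    return rho_by_tau[best_tau], best_tau


def select_pitch_candidate(first, second, third, weight):
    """Each argument is a (rho_max, tau) pair for one interval.
    The second interval wins if its maximum reaches weight * first maximum,
    otherwise the third interval is tested the same way,
    otherwise the first interval's candidate is kept."""
    rho1, tau1 = first
    rho2, tau2 = second
    rho3, tau3 = third
    if rho2 >= rho1 * weight:
        return tau2
    if rho3 >= rho1 * weight:
        return tau3
    return tau1
```

With a weight below 1, a candidate from the second or third interval can be chosen even when its correlation is slightly lower than the first interval's maximum, which biases the decision toward those intervals.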
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310409433.8A CN103474074B (en) | 2013-09-09 | 2013-09-09 | Pitch estimation method and apparatus |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103474074A (en) | 2013-12-25 |
CN103474074B (en) | 2016-05-11 |
Family
ID=49798895
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310409433.8A Active CN103474074B (en) | 2013-09-09 | 2013-09-09 | Pitch estimation method and apparatus |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103474074B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105185385A (en) * | 2015-08-11 | 2015-12-23 | 东莞市凡豆信息科技有限公司 | Voice fundamental tone frequency estimation method based on gender anticipation and multi-frequency-band parameter mapping |
CN106205638A (en) * | 2016-06-16 | 2016-12-07 | 清华大学 | A kind of double-deck fundamental tone feature extracting method towards audio event detection |
CN107039051A (en) * | 2016-02-03 | 2017-08-11 | 重庆工商职业学院 | Fundamental frequency detection method based on ant group optimization |
CN108830232A (en) * | 2018-06-21 | 2018-11-16 | 浙江中点人工智能科技有限公司 | Voice signal period segmentation method based on multi-scale nonlinear energy operator |
CN109119097A (en) * | 2018-10-30 | 2019-01-01 | Oppo广东移动通信有限公司 | Fundamental tone detecting method, device, storage medium and mobile terminal |
CN110168641A (en) * | 2016-10-04 | 2019-08-23 | 弗劳恩霍夫应用研究促进协会 | Device and method for determining pitch information |
CN110390953A (en) * | 2019-07-25 | 2019-10-29 | 腾讯科技(深圳)有限公司 | Method, device, terminal and storage medium for detecting howling voice signal |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108831504B (en) * | 2018-06-13 | 2020-12-04 | 西安蜂语信息科技有限公司 | Method and device for determining pitch period, computer equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4486900A (en) * | 1982-03-30 | 1984-12-04 | At&T Bell Laboratories | Real time pitch detection by stream processing |
US5127053A (en) * | 1990-12-24 | 1992-06-30 | General Electric Company | Low-complexity method for improving the performance of autocorrelation-based pitch detectors |
CN101149924A (en) * | 2006-09-18 | 2008-03-26 | 华为技术有限公司 | Method and device for implementing open-loop pitch search |
Non-Patent Citations (1)
Title |
---|
赵丹明: "基于归一化自相关函数的开环基音分析算法研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105185385A (en) * | 2015-08-11 | 2015-12-23 | 东莞市凡豆信息科技有限公司 | Voice fundamental tone frequency estimation method based on gender anticipation and multi-frequency-band parameter mapping |
CN107039051A (en) * | 2016-02-03 | 2017-08-11 | 重庆工商职业学院 | Fundamental frequency detection method based on ant group optimization |
CN106205638A (en) * | 2016-06-16 | 2016-12-07 | 清华大学 | A kind of double-deck fundamental tone feature extracting method towards audio event detection |
CN106205638B (en) * | 2016-06-16 | 2019-11-08 | 清华大学 | A kind of double-deck fundamental tone feature extracting method towards audio event detection |
CN110168641A (en) * | 2016-10-04 | 2019-08-23 | 弗劳恩霍夫应用研究促进协会 | Device and method for determining pitch information |
CN110168641B (en) * | 2016-10-04 | 2023-09-22 | 弗劳恩霍夫应用研究促进协会 | Apparatus and method for determining pitch information |
CN108830232A (en) * | 2018-06-21 | 2018-11-16 | 浙江中点人工智能科技有限公司 | Voice signal period segmentation method based on multi-scale nonlinear energy operator |
CN108830232B (en) * | 2018-06-21 | 2021-06-15 | 浙江中点人工智能科技有限公司 | Voice signal period segmentation method based on multi-scale nonlinear energy operator |
CN109119097A (en) * | 2018-10-30 | 2019-01-01 | Oppo广东移动通信有限公司 | Fundamental tone detecting method, device, storage medium and mobile terminal |
CN110390953A (en) * | 2019-07-25 | 2019-10-29 | 腾讯科技(深圳)有限公司 | Method, device, terminal and storage medium for detecting howling voice signal |
CN110390953B (en) * | 2019-07-25 | 2023-11-17 | 腾讯科技(深圳)有限公司 | Method, device, terminal and storage medium for detecting howling voice signal |
Also Published As
Publication number | Publication date |
---|---|
CN103474074B (en) | 2016-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103474074B (en) | Pitch estimation method and apparatus | |
CN103854662B (en) | Adaptive voice detection method based on multiple domain Combined estimator | |
Prasad et al. | Automatic segmentation of continuous speech using minimum phase group delay functions | |
US10510363B2 (en) | Pitch detection algorithm based on PWVT | |
CN102054480B (en) | Single-channel aliasing voice separation method based on fractional Fourier transform | |
CN103886871B (en) | Detection method of speech endpoint and device thereof | |
Bayya et al. | Spectro-temporal analysis of speech signals using zero-time windowing and group delay function | |
CN111128213B (en) | Noise suppression method and system for processing in different frequency bands | |
CN103440872B (en) | The denoising method of transient state noise | |
CN104021789A (en) | Self-adaption endpoint detection method using short-time time-frequency value | |
CN108305639B (en) | Speech emotion recognition method, computer-readable storage medium and terminal | |
CN103646649A (en) | High-efficiency voice detecting method | |
CN101154383B (en) | Method and device for noise suppression, phonetic feature extraction, speech recognition and training voice model | |
EP3739582A1 (en) | Voice detection | |
CN108682432B (en) | Speech emotion recognition device | |
CN104183245A (en) | Method and device for recommending music stars with tones similar to those of singers | |
Morales-Cordovilla et al. | A pitch based noise estimation technique for robust speech recognition with missing data | |
CN103996399B (en) | Speech detection method and system | |
CN100541609C (en) | A kind of method and apparatus of realizing open-loop pitch search | |
US10522160B2 (en) | Methods and apparatus to identify a source of speech captured at a wearable electronic device | |
CN101447183A (en) | Processing method of high-performance confidence level applied to speech recognition system | |
Jain et al. | Marginal energy density over the low frequency range as a feature for voiced/non-voiced detection in noisy speech signals | |
CN112116909A (en) | Voice recognition method, device and system | |
US6470311B1 (en) | Method and apparatus for determining pitch synchronous frames | |
CN101067929B (en) | Method for enhancing and extracting phonetic resonance hump trace utilizing formant |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20220513. Address after: 510530 No. 10, Nanxiang 2nd Road, Science City, Luogang District, Guangzhou, Guangdong. Patentee after: Guangdong Guangsheng research and Development Institute Co.,Ltd. Address before: 518057 6th floor, software building, No. 9, Gaoxin Zhongyi Road, high tech Zone, Nanshan District, Shenzhen, Guangdong Province. Patentee before: SHENZHEN RISING SOURCE TECHNOLOGY Co.,Ltd. |