CN103474074B - Pitch estimation method and apparatus - Google Patents
- Publication number
- CN103474074B
- Authority
- CN
- China
- Prior art keywords
- pitch period
- maximum
- normalized autocorrelation
- autocorrelation functions
- voice signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The present invention relates to a pitch period estimation method and apparatus. The apparatus comprises a signal preprocessing unit, a normalized autocorrelation function calculation unit, and a pitch period post-processing unit. The method comprises: S1, preprocessing a speech signal by removing the DC component, applying perceptual weighting, and down-sampling the signal; S2, calculating normalized autocorrelation function values of the preprocessed speech signal; S3, determining the maximum of the normalized autocorrelation function values within the pitch period search range, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal. The invention largely overcomes the frequency-doubling and half-frequency errors in pitch period estimation, improves the noise robustness of pitch period estimation, reduces the computational complexity of the algorithm, and improves the efficiency of the corresponding digital audio/speech coding. The invention is applicable to pitch search in a wide range of speech encoding/decoding algorithms.
Description
Technical field
The present invention relates to speech coding technology and, more particularly, to a pitch period estimation method and apparatus.
Background art
The pitch period is the period of vocal cord vibration during speech production. Pitch period estimation is an important problem in speech coding; its accuracy directly affects the coding quality and efficiency of a speech coder. Accurate pitch periodicity analysis can effectively remove redundancy from the speech signal, reduce the number of coded bits, and enable high-quality speech coding at low bit rates. However, owing to the particular nature of speech, an accurate pitch period search faces the following difficulties:
(1) The speech signal varies in a very complex way: the glottal excitation waveform is not a perfectly periodic pulse train, and the period of the speech waveform is time-varying.
(2) The beginning and end of an utterance lack the periodicity of vocal cord vibration, and for some transitional sounds between unvoiced and voiced speech it is hard to decide whether they are periodic or aperiodic, so the pitch period cannot be estimated.
(3) It is difficult to remove the influence of the vocal tract from the speech signal and directly extract information related only to vocal cord vibration.
(4) It is difficult to define precisely where each pitch period begins and ends within a voiced segment, which limits reliable pitch measurement. This is not only because the speech signal itself is quasi-periodic (that is, the pitch varies), but also because the waveform is affected by formants, noise, and so on.
(5) In practical applications, background noise degrades pitch detection performance. This matters especially in mobile communication environments, where the waveform often carries a high noise level.
(6) The wide range over which the pitch period varies also makes accurate pitch detection harder.
At present, no general method can extract the pitch period of speech accurately and reliably under all conditions. Traditional pitch detection methods can be divided into time-domain and frequency-domain methods. In the time domain, traditional pitch period algorithms include pitch estimation based on the average magnitude difference function (Average Magnitude Difference Function, AMDF) and pitch detection based on the short-time autocorrelation function (Autocorrelation Function, ACF). Both algorithms are introduced in the following reference:
Chu, Wai C. Speech Coding Algorithms: Foundation and Evolution of Standardized Coders. John Wiley & Sons, Inc., 2003, pp. 33-45.
In the frequency domain, Griffin and Lim proposed a frequency-domain pitch period estimation scheme (D. W. Griffin, J. S. Lim. Multiband Excitation Vocoder. IEEE Trans. ASSP, 1988, 36(8)) for the multi-band excitation (MBE) speech coding algorithm. This pitch period estimation algorithm uses a closed-loop analysis-by-synthesis approach that matches the frequency-domain waveform of the signal to obtain the optimal pitch period estimate.
In practical applications, time-domain pitch search algorithms are widely used because they are simple and perform well. For example, the current speech coding standards G.729 and AMR-WB both adopt an improved time-domain short-time autocorrelation function (ACF) pitch detection algorithm (Bao Changchun. Fundamentals of Low Bit Rate Digital Speech Coding. Beijing: Beijing University of Technology Press, 2001). However, the time-domain ACF method is prone to "frequency-doubling" and "half-frequency" errors, and the AMDF method cannot effectively track rapid changes in speech frequency. Frequency-domain methods generally use the cepstrum; the logarithm operation they introduce greatly increases the amount of computation, and they are susceptible to noise.
Summary of the invention
The technical problem to be solved by the present invention is to address the above defects of the prior art by providing a low-complexity, efficient pitch period estimation method and apparatus that largely overcomes the frequency-doubling and half-frequency errors in pitch period estimation and improves noise robustness.
The technical solution adopted by the present invention to solve this technical problem is a pitch period estimation method comprising the following steps:
S1: preprocessing the speech signal by removing the DC component, applying perceptual weighting, and down-sampling the signal;
S2: calculating the normalized autocorrelation function values of the preprocessed speech signal using the following formula:
where ρ(τ) is the normalized autocorrelation function value, s(n) is the perceptually weighted speech signal, τ is the pitch period candidate under search, and N is the length of one frame of the signal after down-sampling;
S3: determining the maximum of the normalized autocorrelation function values within the pitch period search range, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal.
In one embodiment, step S1 further comprises:
S11: resampling the speech signal to an internal sampling rate;
S12: high-pass filtering the resampled speech signal to remove the DC component;
S13: applying perceptual weighting to the high-pass filtered speech signal;
S14: low-pass filtering the perceptually weighted speech signal and down-sampling it by a factor of 2.
In one embodiment, the internal sampling rate is 12.8 kHz and the cutoff frequency of the high-pass filter is 50 Hz.
In one embodiment, step S3 further comprises:
S31: dividing the pitch period search range, according to the sampling rate of the speech signal, into a first interval, a second interval, and a third interval, and obtaining for each interval the maximum of the normalized autocorrelation function and the corresponding pitch period candidate;
S32: selecting, according to a given weight parameter, the normalized autocorrelation function maximum of the whole pitch period search range from the three interval maxima, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal.
In one embodiment, step S32 further comprises: judging whether the normalized autocorrelation function maximum of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, taking the pitch period candidate corresponding to the second-interval maximum as the pitch period estimate of the speech signal; otherwise, further judging whether the normalized autocorrelation function maximum of the third interval is greater than or equal to the product of the first-interval maximum and the weight parameter; if so, taking the pitch period candidate corresponding to the third-interval maximum as the pitch period estimate of the speech signal; otherwise, taking the pitch period candidate corresponding to the first-interval maximum as the pitch period estimate of the speech signal.
In one embodiment, the first, second, and third intervals are [L_min, 39], [40, 79], and [80, L_max] respectively, where L_min is the start value and L_max is the end value of the pitch period search range.
To solve its technical problem, the present invention also provides a pitch period estimation apparatus comprising:
a signal preprocessing unit, which preprocesses the speech signal by removing the DC component, applying perceptual weighting, and down-sampling the signal;
a normalized autocorrelation function calculation unit, which calculates the normalized autocorrelation function values of the preprocessed speech signal using the following formula:
where ρ(τ) is the normalized autocorrelation function value, s(n) is the perceptually weighted speech signal, τ is the pitch period candidate under search, and N is the length of one frame of the signal after down-sampling;
a pitch period post-processing unit, which determines the maximum of the normalized autocorrelation function values within the pitch period search range and takes the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal.
In one embodiment, the signal preprocessing unit further resamples the speech signal to an internal sampling rate, then high-pass filters the resampled speech signal to remove the DC component, then applies perceptual weighting to the high-pass filtered speech signal, and finally low-pass filters the perceptually weighted speech signal and down-samples it by a factor of 2.
In one embodiment, the pitch period post-processing unit further divides the pitch period search range, according to the sampling rate of the speech signal, into a first interval, a second interval, and a third interval, obtains for each interval the maximum of the normalized autocorrelation function and the corresponding pitch period candidate, selects, according to a given weight parameter, the normalized autocorrelation function maximum of the whole pitch period search range from the three interval maxima, and takes the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal.
In one embodiment, the pitch period post-processing unit selects the normalized autocorrelation function maximum of the pitch period search range from the three interval maxima according to the weight parameter as follows: it judges whether the normalized autocorrelation function maximum of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, the pitch period candidate corresponding to the second-interval maximum is taken as the pitch period estimate of the speech signal; otherwise, it further judges whether the normalized autocorrelation function maximum of the third interval is greater than or equal to the product of the first-interval maximum and the weight parameter; if so, the pitch period candidate corresponding to the third-interval maximum is taken as the pitch period estimate of the speech signal; otherwise, the pitch period candidate corresponding to the first-interval maximum is taken as the pitch period estimate of the speech signal.
The pitch period estimation method and apparatus of the present invention are based on normalized autocorrelation function pitch detection and introduce preprocessing and post-processing techniques into pitch period estimation. They largely overcome the frequency-doubling and half-frequency errors in pitch period estimation, improve the noise robustness of pitch period estimation, reduce the computational complexity of the algorithm, and improve the efficiency of the corresponding digital audio/speech coding. The invention is applicable to pitch search in a wide range of speech encoding/decoding algorithms.
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments. In the drawings:
Fig. 1 is a flowchart of the pitch period estimation method according to one embodiment of the invention;
Fig. 2 is a flowchart of a specific embodiment of step 110 in Fig. 1;
Fig. 3 is a flowchart of a specific embodiment of step 130 in Fig. 1;
Fig. 4 is a logic block diagram of the pitch period estimation apparatus according to one embodiment of the invention.
Detailed description of the invention
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
Fig. 1 shows a flowchart of the pitch period estimation method 100 according to one embodiment of the invention. As shown in Fig. 1, the pitch period estimation method 100 comprises:
In step 110, the speech signal is preprocessed by removing the DC component, applying perceptual weighting, and down-sampling the signal.
In step 120, the normalized autocorrelation function values of the preprocessed speech signal are calculated. The present invention uses the following normalized autocorrelation function:
where ρ(τ) is the normalized autocorrelation function value, s(n) is the perceptually weighted speech signal, τ is the pitch period candidate under search, and N is the length of one frame of the signal after down-sampling.
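As an illustration of step 120, the following minimal sketch computes a normalized autocorrelation value for one frame. It assumes a common textbook form, the lagged cross-product divided by the geometric mean of the two segment energies; the exact expression used by the invention may differ. The function name and the choice of summation limits are assumptions made for this example.

```python
import math

def normalized_autocorrelation(s, tau):
    """Return rho(tau) for one frame s (a list of floats).

    Assumed form (not taken from the patent text):
        sum(s[n] * s[n - tau]) / sqrt(sum(s[n]^2) * sum(s[n - tau]^2))
    with n running over tau..N-1 so every index stays inside the frame.
    """
    n_len = len(s)
    if not 0 < tau < n_len:
        raise ValueError("tau must satisfy 0 < tau < len(s)")
    num = sum(s[n] * s[n - tau] for n in range(tau, n_len))
    e_cur = sum(s[n] ** 2 for n in range(tau, n_len))
    e_lag = sum(s[n - tau] ** 2 for n in range(tau, n_len))
    den = math.sqrt(e_cur * e_lag)
    # A silent frame has no defined correlation; report 0 in that case.
    return num / den if den > 0.0 else 0.0
```

For a sinusoid with a period of 50 samples, this function returns a value close to 1 at τ = 50 and close to -1 at τ = 25, which is the behavior a pitch search relies on.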
In step 130, the maximum of the normalized autocorrelation function values within the pitch period search range is determined, and the pitch period candidate corresponding to that maximum is taken as the pitch period estimate of the speech signal.
The present invention introduces a signal preprocessing technique into pitch period estimation. Fig. 2 shows a flowchart of a specific embodiment of the signal preprocessing step 110 shown in Fig. 1. As shown in Fig. 2, the signal preprocessing step 110 further comprises:
In step 111, the speech signal is resampled to the internal sampling rate (Fs = 12.8 kHz).
In the following step 112, the resampled speech signal is high-pass filtered. The cutoff frequency of the high-pass filter may be 50 Hz; its purpose is to remove the DC component.
Then, in step 113, perceptual weighting is applied to the high-pass filtered speech signal.
In the final step 114, the perceptually weighted speech signal is low-pass filtered and down-sampled by a factor of 2, limiting the signal bandwidth to 3.2 kHz.
Further, in a preferred embodiment, numerical filtering may also be added in the signal preprocessing step 110 to remove formants and high-frequency noise, so that the pitch period is estimated more accurately.
By preprocessing the input speech signal before the pitch period search, the present invention both filters out the high-frequency part, which contributes nothing to pitch period estimation, and reduces the computational complexity of the algorithm.
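The preprocessing chain of steps 112 to 114 can be sketched as follows. This is only an illustration under stated assumptions: the input is taken to be already at 12.8 kHz (step 111 is omitted), the high-pass filter is a generic first-order DC blocker, the low-pass filter is a three-tap FIR, and perceptual weighting is left as a pass-through placeholder. None of these filter designs or coefficients come from the patent.

```python
import math

def dc_block(x, fs=12800.0, fc=50.0):
    """First-order high-pass: y[n] = x[n] - x[n-1] + a * y[n-1]."""
    a = 1.0 - 2.0 * math.pi * fc / fs  # illustrative pole for a ~50 Hz cutoff
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = xn - x_prev + a * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y

def lowpass_decimate_by_2(x, taps=(0.25, 0.5, 0.25)):
    """Short FIR low-pass (illustrative taps), then keep every second sample."""
    filtered = []
    for n in range(len(x)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * x[n - k]
        filtered.append(acc)
    return filtered[::2]

def preprocess(x, weight=lambda v: v):
    """Steps 112-114: high-pass, perceptual weighting (placeholder), low-pass + 1/2 down-sampling."""
    return lowpass_decimate_by_2(weight(dc_block(x)))
```

A real coder would replace the `weight` placeholder with an LPC-based perceptual weighting filter and use a sharper anti-aliasing filter; the sketch only shows the order of operations and the halving of the sample count.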
The present invention also introduces a pitch period post-processing technique into pitch period estimation. Fig. 3 shows a flowchart of a specific embodiment of the pitch period post-processing step 130 shown in Fig. 1. As shown in Fig. 3, the pitch period post-processing step 130 further comprises:
In step 131, the pitch period search range is divided, according to the sampling rate of the speech signal, into a first interval, a second interval, and a third interval, and the normalized autocorrelation function maximum and corresponding pitch period candidate are obtained for each interval.
In one embodiment, the pitch period search range is [L_min, L_max], where L_min is the start value and L_max is the end value of the pitch period search range. Based on the sampling frequency of the speech signal described above, the search range can be divided into three intervals, namely the first interval [L_min, 39], the second interval [40, 79], and the third interval [80, L_max], so that the correct pitch period estimate can be determined among these three intervals. In a specific embodiment, L_min and L_max may be 0 and 256 respectively. For these three intervals, the maximum ρ(τ) value of each interval and the corresponding pitch period candidate τ are obtained, denoted $\rho_{\max 1}$, $\rho_{\max 2}$, $\rho_{\max 3}$ and $\tau_1$, $\tau_2$, $\tau_3$.
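Step 131 can be sketched as below. The function takes a precomputed mapping from each candidate lag to its ρ(τ) value and returns the maximum and its lag for each of the three intervals. The function name and the dictionary representation are assumptions for illustration, and the sketch requires L_min ≤ 39 and L_max ≥ 80 so that no interval is empty.

```python
def interval_maxima(rho, l_min, l_max):
    """Return [(rho_max1, tau1), (rho_max2, tau2), (rho_max3, tau3)].

    rho maps every candidate lag tau in [l_min, l_max] to its rho(tau) value.
    """
    bounds = [(l_min, 39), (40, 79), (80, l_max)]
    result = []
    for lo, hi in bounds:
        # Lag with the largest rho in this interval (first one wins on ties).
        best_tau = max(range(lo, hi + 1), key=lambda t: rho[t])
        result.append((rho[best_tau], best_tau))
    return result
```

With a search range of 20 to 143 (values chosen only for this example) and peaks placed at lags 30, 60, and 120, the function returns those three lags together with their peak values.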
In step 132, according to a given weight parameter, the normalized autocorrelation function maximum of the whole pitch period search range is selected from the three interval maxima, and the pitch period candidate corresponding to that maximum is taken as the pitch period estimate of the speech signal.
In one embodiment, a weight parameter c is chosen (a value close to 1.0, for example 0.97), and the optimal pitch period candidate $\tau_{\mathrm{opt}}$ can be determined as follows:
First, judge whether the normalized autocorrelation function maximum of the second interval, $\rho_{\max 2}$, is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval, $\rho_{\max 1}$, and the weight parameter c. If so, the pitch period candidate $\tau_2$ corresponding to $\rho_{\max 2}$ is taken as the pitch period estimate of the speech signal. Otherwise, further judge whether the normalized autocorrelation function maximum of the third interval, $\rho_{\max 3}$, is greater than or equal to the product of $\rho_{\max 1}$ and the weight parameter c. If so, the pitch period candidate $\tau_3$ corresponding to $\rho_{\max 3}$ is taken as the pitch period estimate of the speech signal; otherwise, the pitch period candidate $\tau_1$ corresponding to $\rho_{\max 1}$ is taken as the pitch period estimate of the speech signal.
The corresponding mathematical expression is as follows:
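Let $\tau_{\mathrm{opt}} = \tau_1$, $\rho_{\max} = \rho_{\max 1}$;
If $\rho_{\max 2} \ge c\,\rho_{\max}$, then $\rho_{\max} = \rho_{\max 2}$, $\tau_{\mathrm{opt}} = \tau_2$;
If $\rho_{\max 3} \ge c\,\rho_{\max}$, then $\rho_{\max} = \rho_{\max 3}$, $\tau_{\mathrm{opt}} = \tau_3$.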
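The selection rule of step 132 can be sketched in two ways, shown side by side below. The first function follows the if/otherwise wording of the description, where both comparisons are made against the first-interval maximum. The second follows the three-line expression above literally, updating the running maximum after each step. Function names are hypothetical. The two variants can return different lags when both the second and third intervals pass their tests, so which reading is intended is left open here.

```python
def select_pitch_period(maxima, c=0.97):
    """If/otherwise rule: compare intervals 2 and 3 against c * rho_max1."""
    (rho1, tau1), (rho2, tau2), (rho3, tau3) = maxima
    if rho2 >= c * rho1:
        return tau2
    if rho3 >= c * rho1:
        return tau3
    return tau1

def select_pitch_period_sequential(maxima, c=0.97):
    """Sequential rule: the running maximum is updated after each accepted interval."""
    (rho_max, tau_opt) = maxima[0]
    for rho_i, tau_i in maxima[1:]:
        if rho_i >= c * rho_max:
            rho_max, tau_opt = rho_i, tau_i
    return tau_opt
```

In both variants a longer lag is accepted whenever its peak is at least c times the reference peak, so with c slightly below 1.0 the longer lag wins near-ties.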
Further, in a preferred embodiment, the pitch period post-processing step 130 may also use the normalized autocorrelation function to determine the unvoiced/voiced character of the speech signal, in order to improve the accuracy of pitch period estimation.
Based on the pitch period estimation method described above, the present invention also provides a pitch period estimation apparatus. Fig. 4 shows a logic block diagram of the pitch period estimation apparatus 400 according to one embodiment of the invention. As shown in Fig. 4, the pitch period estimation apparatus 400 comprises a signal preprocessing unit 410, a normalized autocorrelation function calculation unit 420, and a pitch period post-processing unit 430. The signal preprocessing unit 410 preprocesses the input speech signal by removing the DC component, applying perceptual weighting, and down-sampling the signal. The normalized autocorrelation function calculation unit 420 calculates the normalized autocorrelation function values of the speech signal preprocessed by the signal preprocessing unit 410 using the following formula:
where ρ(τ) is the normalized autocorrelation function value, s(n) is the perceptually weighted speech signal, τ is the pitch period candidate under search, and N is the length of one frame of the signal after down-sampling. The pitch period post-processing unit 430 determines the maximum of the normalized autocorrelation function values within the pitch period search range and takes the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal.
In a specific embodiment, the signal preprocessing unit 410 first resamples the input speech signal to the internal sampling rate (Fs = 12.8 kHz), then high-pass filters the resampled speech signal (the filter cutoff frequency may be 50 Hz, the purpose being to remove the DC component), then applies perceptual weighting to the high-pass filtered speech signal, and finally low-pass filters the perceptually weighted speech signal and down-samples it by a factor of 2, limiting the signal bandwidth to 3.2 kHz. This both filters out the high-frequency part, which contributes nothing to pitch period estimation, and reduces the computational complexity of the algorithm.
In a specific embodiment, the pitch period post-processing unit 430 divides the pitch period search range, according to the sampling rate of the speech signal, into a first interval, a second interval, and a third interval, for example the first interval [L_min, 39], the second interval [40, 79], and the third interval [80, L_max], where L_min is the start value and L_max is the end value of the pitch period search range. It then obtains the maximum ρ(τ) value of each interval and the corresponding pitch period candidate τ, denoted $\rho_{\max 1}$, $\rho_{\max 2}$, $\rho_{\max 3}$ and $\tau_1$, $\tau_2$, $\tau_3$. The pitch period post-processing unit 430 can also use a given weight parameter c (a value close to 1.0, for example 0.97) to determine the optimal pitch period candidate $\tau_{\mathrm{opt}}$ as follows:
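Let $\tau_{\mathrm{opt}} = \tau_1$, $\rho_{\max} = \rho_{\max 1}$;
If $\rho_{\max 2} \ge c\,\rho_{\max}$, then $\rho_{\max} = \rho_{\max 2}$, $\tau_{\mathrm{opt}} = \tau_2$;
If $\rho_{\max 3} \ge c\,\rho_{\max}$, then $\rho_{\max} = \rho_{\max 3}$, $\tau_{\mathrm{opt}} = \tau_3$.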
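The three-unit structure of apparatus 400 can be pictured as a simple composition of callables. The sketch below is a hypothetical illustration of that structure only: the class name, field names, and the choice of a lag-to-value dictionary as the interface between units 420 and 430 are assumptions, not part of the described apparatus.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class PitchEstimationDevice:
    """Chains the three units of Fig. 4 in order: 410 -> 420 -> 430."""
    preprocess: Callable[[List[float]], List[float]]         # unit 410
    compute_nacf: Callable[[List[float]], Dict[int, float]]  # unit 420
    postprocess: Callable[[Dict[int, float]], int]           # unit 430

    def estimate(self, speech: List[float]) -> int:
        frame = self.preprocess(speech)   # DC removal, weighting, down-sampling
        rho = self.compute_nacf(frame)    # rho(tau) for every candidate lag
        return self.postprocess(rho)      # pick the pitch period estimate
```

Each unit can then be swapped or tested in isolation, for example by passing stub functions for the units that are not under test.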
The pitch period estimation method and apparatus of the present invention are based on normalized autocorrelation function pitch detection and introduce preprocessing and post-processing techniques into pitch period estimation. They largely overcome the frequency-doubling and half-frequency errors in pitch period estimation, improve the noise robustness of pitch period estimation, and improve the efficiency of the corresponding digital audio/speech coding. The performance of the pitch search algorithm of the present invention is compared with that of AMR-WB+ below:
1. Performance test method: the sequence-average signal-to-noise ratio (SNR) is computed, defined as follows:
where N (N = 256) is the length of one frame of the speech signal, N_SF is the total number of frames in a speech sequence, x_w(n) is the original signal after perceptual weighting, and the remaining term is the encoded and decoded speech signal after perceptual weighting.
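A sequence-average SNR of this kind is often computed as the mean of per-frame SNR values in dB. The sketch below assumes that form, which may not match the definition used in the test exactly; the function name, the handling of incomplete trailing frames, and the small floor that guards the logarithm are choices made for this example.

```python
import math

def sequence_average_snr(x_w, x_w_hat, frame_len=256):
    """Mean over complete frames of 10*log10(signal energy / error energy)."""
    n_frames = len(x_w) // frame_len
    if n_frames == 0:
        raise ValueError("need at least one complete frame")
    total = 0.0
    for i in range(n_frames):
        lo, hi = i * frame_len, (i + 1) * frame_len
        sig = sum(v * v for v in x_w[lo:hi])
        err = sum((a - b) ** 2 for a, b in zip(x_w[lo:hi], x_w_hat[lo:hi]))
        # Floor both energies to avoid log(0) on silent or perfectly coded frames.
        total += 10.0 * math.log10(max(sig, 1e-12) / max(err, 1e-12))
    return total / n_frames
```

For instance, a weighted signal of constant amplitude 1.0 decoded as constant 0.9 has an error energy one hundredth of the signal energy in every frame, giving an average of about 20 dB.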
2. Test results
Table 1: Sequence-average SNR comparison of the two algorithms (mono)
Table 2: Sequence-average SNR comparison of the two algorithms (stereo)
3. Analysis of test results
(1) The test results show that the algorithm proposed by the present invention performs slightly better than the pitch period search algorithm of AMR-WB+, and that its computational complexity is comparable to (in fact slightly lower than) that of the AMR-WB+ algorithm.
(2) The results in Table 1 and Table 2 show that the two sequences es02 and s_cl_mt_2_org have the worst coding performance, while s_cl_ft_3_org has the best. Analysis of the sequences shows that es02 and s_cl_mt_2_org are middle-aged male voices, while s_cl_ft_3_org is a young female voice. Analysis of the algorithm shows that this is related to the choice of the parameter that the algorithm of the invention sets to prevent detection of a doubled period. This parameter is an empirical value, and the current algorithm mainly targets female and child voices, whose pitch period varies widely and rapidly. By comparison, the pitch period of a middle-aged male voice varies gently and over a relatively small range.
(3) The tests also included some typical noisy speech sequences, namely s_no_ft_9_org, s_no_2t_1_org, s_no_2t_2_org, s_no_2t_3_org, and s_no_ft_1_org, for example speech containing heavy background noise such as that of an airport. The test results show that the noise robustness of the algorithm of the present invention is better than that of the AMR-WB+ algorithm.
Claims (4)
1. A pitch period estimation method, characterized by comprising the following steps:
S1: preprocessing the speech signal by removing the DC component, applying perceptual weighting, and down-sampling the signal;
S2: calculating the normalized autocorrelation function values of the preprocessed speech signal using the following formula:
where ρ(τ) is the normalized autocorrelation function value, s(n) is the perceptually weighted speech signal, τ is the pitch period candidate under search, and N is the length of one frame of the signal after down-sampling;
S3: determining the maximum of the normalized autocorrelation function values within the pitch period search range, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal;
wherein step S1 further comprises:
S11: resampling the speech signal to an internal sampling rate;
S12: high-pass filtering the resampled speech signal to remove the DC component;
S13: applying perceptual weighting to the high-pass filtered speech signal;
S14: low-pass filtering the perceptually weighted speech signal and down-sampling it by a factor of 2;
and step S3 further comprises:
S31: dividing the pitch period search range, according to the sampling rate of the speech signal, into a first interval, a second interval, and a third interval, and obtaining for each interval the maximum of the normalized autocorrelation function and the corresponding pitch period candidate;
S32: selecting, according to a given weight parameter, the normalized autocorrelation function maximum of the whole pitch period search range from the three interval maxima, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal, which specifically comprises: judging whether the normalized autocorrelation function maximum of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, taking the pitch period candidate corresponding to the second-interval maximum as the pitch period estimate of the speech signal; otherwise, further judging whether the normalized autocorrelation function maximum of the third interval is greater than or equal to the product of the first-interval maximum and the weight parameter; if so, taking the pitch period candidate corresponding to the third-interval maximum as the pitch period estimate of the speech signal; otherwise, taking the pitch period candidate corresponding to the first-interval maximum as the pitch period estimate of the speech signal.
2. The method according to claim 1, characterized in that the internal sampling rate is 12.8 kHz and the cutoff frequency of the high-pass filter is 50 Hz.
3. The method according to claim 1, characterized in that the first, second, and third intervals are [L_min, 39], [40, 79], and [80, L_max] respectively, where L_min is the start value and L_max is the end value of the pitch period search range.
4. A pitch period estimation apparatus, characterized by comprising:
a signal preprocessing unit, which preprocesses the speech signal by removing the DC component, applying perceptual weighting, and down-sampling the signal;
a normalized autocorrelation function calculation unit, which calculates the normalized autocorrelation function values of the preprocessed speech signal using the following formula:
where ρ(τ) is the normalized autocorrelation function value, s(n) is the perceptually weighted speech signal, τ is the pitch period candidate under search, and N is the length of one frame of the signal after down-sampling;
a pitch period post-processing unit, which determines the maximum of the normalized autocorrelation function values within the pitch period search range and takes the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal;
wherein the signal preprocessing unit further resamples the speech signal to an internal sampling rate, then high-pass filters the resampled speech signal to remove the DC component, then applies perceptual weighting to the high-pass filtered speech signal, and finally low-pass filters the perceptually weighted speech signal and down-samples it by a factor of 2;
the pitch period post-processing unit further divides the pitch period search range, according to the sampling rate of the speech signal, into a first interval, a second interval, and a third interval, obtains for each interval the maximum of the normalized autocorrelation function and the corresponding pitch period candidate, selects, according to a given weight parameter, the normalized autocorrelation function maximum of the whole pitch period search range from the three interval maxima, and takes the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal;
wherein the pitch period post-processing unit selects the normalized autocorrelation function maximum of the pitch period search range from the three interval maxima according to the weight parameter as follows: it judges whether the normalized autocorrelation function maximum of the second interval is greater than or equal to the product of the normalized autocorrelation function maximum of the first interval and the weight parameter; if so, the pitch period candidate corresponding to the second-interval maximum is taken as the pitch period estimate of the speech signal; otherwise, it further judges whether the normalized autocorrelation function maximum of the third interval is greater than or equal to the product of the first-interval maximum and the weight parameter; if so, the pitch period candidate corresponding to the third-interval maximum is taken as the pitch period estimate of the speech signal; otherwise, the pitch period candidate corresponding to the first-interval maximum is taken as the pitch period estimate of the speech signal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310409433.8A CN103474074B (en) | 2013-09-09 | 2013-09-09 | Pitch estimation method and apparatus |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310409433.8A CN103474074B (en) | 2013-09-09 | 2013-09-09 | Pitch estimation method and apparatus |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103474074A | 2013-12-25 |
CN103474074B | 2016-05-11 |
Family
ID=49798895
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310409433.8A Active CN103474074B (en) | 2013-09-09 | 2013-09-09 | Pitch estimation method and apparatus |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103474074B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108831504A (en) * | 2018-06-13 | 2018-11-16 | 西安蜂语信息科技有限公司 | Determination method, apparatus, computer equipment and the storage medium of pitch period |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105185385B (en) * | 2015-08-11 | 2019-11-15 | 东莞市凡豆信息科技有限公司 | Voice fundamental frequency estimation method based on gender anticipation with the mapping of multiband parameter |
CN107039051B (en) * | 2016-02-03 | 2019-11-26 | 重庆工商职业学院 | Fundamental frequency detection method based on ant group optimization |
CN106205638B (en) * | 2016-06-16 | 2019-11-08 | 清华大学 | A kind of double-deck fundamental tone feature extracting method towards audio event detection |
EP3306609A1 (en) * | 2016-10-04 | 2018-04-11 | Fraunhofer Gesellschaft zur Förderung der Angewand | Apparatus and method for determining a pitch information |
CN108830232B (en) * | 2018-06-21 | 2021-06-15 | 浙江中点人工智能科技有限公司 | Voice signal period segmentation method based on multi-scale nonlinear energy operator |
CN109119097B (en) * | 2018-10-30 | 2021-06-08 | Oppo广东移动通信有限公司 | Pitch detection method, device, storage medium and mobile terminal |
CN110390953B (en) * | 2019-07-25 | 2023-11-17 | 腾讯科技(深圳)有限公司 | Method, device, terminal and storage medium for detecting howling voice signal |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4486900A (en) * | 1982-03-30 | 1984-12-04 | At&T Bell Laboratories | Real time pitch detection by stream processing |
US5127053A (en) * | 1990-12-24 | 1992-06-30 | General Electric Company | Low-complexity method for improving the performance of autocorrelation-based pitch detectors |
CN101149924A (en) * | 2006-09-18 | 2008-03-26 | 华为技术有限公司 | Method and device for implementing open-loop pitch search |
Non-Patent Citations (1)
Title |
---|
基于归一化自相关函数的开环基音分析算法研究;赵丹明;《中国优秀硕士学位论文全文数据库 信息科技辑》;20130315;1-52 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108831504A (en) * | 2018-06-13 | 2018-11-16 | 西安蜂语信息科技有限公司 | Determination method, apparatus, computer equipment and the storage medium of pitch period |
CN108831504B (en) * | 2018-06-13 | 2020-12-04 | 西安蜂语信息科技有限公司 | Method and device for determining pitch period, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN103474074A (en) | 2013-12-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103474074B (en) | Pitch estimation method and apparatus | |
CN103854662B (en) | Adaptive voice detection method based on multiple domain Combined estimator | |
CN111128213B (en) | Noise suppression method and system for processing in different frequency bands | |
US10510363B2 (en) | Pitch detection algorithm based on PWVT | |
CN102054480B (en) | Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT) | |
CN103440872B (en) | The denoising method of transient state noise | |
CN101625858B (en) | Method for extracting short-time energy frequency value in voice endpoint detection | |
CN104183245A (en) | Method and device for recommending music stars with tones similar to those of singers | |
CN103258543B (en) | Method for expanding artificial voice bandwidth | |
Mittal et al. | Study of characteristics of aperiodicity in Noh voices | |
Ding et al. | A DCT-based speech enhancement system with pitch synchronous analysis | |
Cabral et al. | Glottal spectral separation for parametric speech synthesis | |
CN105679312A (en) | Phonetic feature processing method of voiceprint identification in noise environment | |
CN110349598A (en) | A kind of end-point detecting method under low signal-to-noise ratio environment | |
CN101154383A (en) | Method and device for noise suppression, phonetic feature extraction, speech recognition and training voice model | |
CN104269180A (en) | Quasi-clean voice construction method for voice quality objective evaluation | |
CN103745729A (en) | Audio de-noising method and audio de-noising system | |
CN112116909A (en) | Voice recognition method, device and system | |
Patil et al. | Effectiveness of Teager energy operator for epoch detection from speech signals | |
CN101067929B (en) | Method for enhancing and extracting phonetic resonance hump trace utilizing formant | |
CN104658547A (en) | Method for expanding artificial voice bandwidth | |
Shannon et al. | MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognition. | |
Govind et al. | Epoch extraction in high pass filtered speech using hilbert envelope | |
Park et al. | Pitch detection based on signal-to-noise-ratio estimation and compensation for continuous speech signal | |
Graf et al. | Low-Complexity Pitch Estimation Based on Phase Differences Between Low-Resolution Spectra. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20220513. Address after: 510530 No. 10, Nanxiang 2nd Road, Science City, Luogang District, Guangzhou, Guangdong. Patentee after: Guangdong Guangsheng research and Development Institute Co.,Ltd. Address before: 518057 6th floor, software building, No. 9, Gaoxin Zhongyi Road, high tech Zone, Nanshan District, Shenzhen, Guangdong Province. Patentee before: SHENZHEN RISING SOURCE TECHNOLOGY Co.,Ltd. |