CN102231274B - Fundamental tone period estimated value correction method, fundamental tone estimation method and related apparatus - Google Patents

Publication number
CN102231274B
Authority
CN
China
Legal status
Expired - Fee Related
Application number
CN2011101182662A
Other languages
Chinese (zh)
Other versions
CN102231274A (en)
Inventor
党红强
刘贵忠
顿玉洁
杜正中
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd

Landscapes

  • Liquid Crystal Display Device Control (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The invention provides a pitch period estimate correction method comprising the following steps: when the modified circular AMDF (MCAMD) sequence maximum MA_max(i+1) of the current subframe in the current frequency region is greater than a weighted value of an intermediate variable MA_max, replacing the intermediate variables MA_max and T_opt with MA_max(i+1) and the delay corresponding to MA_max(i+1), respectively; if the ratio of the intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of an odd number of subframes preceding the current subframe is less than a correction factor r_1 or greater than a correction factor r_2, and the MCAMD sequence maximum MA_max0 of the current subframe in the neighborhood of T_pre_mid_o is greater than the product of the intermediate variable MA_max and an empirical factor ρ_2, correcting the intermediate variable T_opt with the delay T_0 corresponding to MA_max0; and median filtering the pitch period estimates of an even number of subframes preceding the current subframe together with the intermediate variable T_opt.

Description

Pitch period estimate correction method, pitch estimation method and related apparatus
Technical field
The present invention relates to the field of signal processing, and in particular to a pitch period estimate correction method, a pitch estimation method and related apparatus.
Background
In the field of speech signal processing, pitch originally refers to the periodicity caused by vocal cord vibration when a voiced sound is produced; the pitch period is the reciprocal of the vocal cord vibration frequency. Pitch has a similar meaning in the field of audio signal processing. In the time domain, the distinguishing feature of a periodic signal is the similarity of its waveform. Pitch detection algorithms that rely on waveform similarity determine the pitch period by comparing the original signal with shifted copies of itself: if the shift equals the pitch period, the two signals have maximum similarity (that is, maximum cross-correlation). In both speech and audio signal processing, pitch detection and pitch period estimation are important techniques, because extracting the fundamental frequency of a signal reveals how quickly the signal varies, characterizes the signal, and provides a necessary reference for further processing.
Many pitch detection and estimation algorithms exist. They are usually divided into time-domain methods and frequency-domain methods. Time-domain methods mainly include the autocorrelation function (ACF) method, the average magnitude difference function (AMDF) method and simple inverse filtering tracking (SIFT); frequency-domain methods mainly include the harmonic product spectrum method and the cepstrum method (CM). In terms of computational complexity, correlation methods require less computation than the other methods while offering comparable performance.
Among these algorithms, one prior-art method is pitch estimation based on the normalized cross-correlation function. The input signal is first preprocessed, including high-pass filtering, low-pass filtering, mean removal and numerical smoothing. The normalized cross-correlation sequence R(i) of the preprocessed signal is then computed, and the delay corresponding to the maximum of R(i) is taken as the pitch period estimate. Finally, a median filter smooths the pitch period estimates, mainly to remove outliers ("wild points") from the estimates, that is, frequency-doubling/frequency-halving errors.
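As an illustration only, the core of the prior-art procedure described above (computing R(i) and taking the delay of its maximum) might be sketched in Python as follows. The function names, the lag search bounds, the 8 kHz sampling rate and the sine test signal are assumptions introduced for this sketch and do not appear in the original text; preprocessing and median smoothing are omitted.

```python
import math

def normalized_cross_correlation(x, lag):
    """Normalized cross-correlation between x[n] and x[n + lag]."""
    n = len(x) - lag
    a, b = x[:n], x[lag:lag + n]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0 else 0.0

def estimate_pitch_lag(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] that maximizes R(lag)."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: normalized_cross_correlation(x, lag))

# Hypothetical test signal: a 100 Hz sine sampled at 8 kHz,
# so one period spans 80 samples.
fs = 8000
signal = [math.sin(2 * math.pi * 100 * n / fs) for n in range(800)]
lag = estimate_pitch_lag(signal, 20, 120)
```

For a strongly periodic signal, widening the search range so that it also covers multiples of the period exposes several near-equal maxima, which is the situation the background section identifies as the source of frequency-halving errors.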
In this prior art, the normalized cross-correlation function compares the current signal with the past signal; where the waveforms are similar, the function exhibits a peak. If the pitch period of the signal is small and the signal is strongly periodic, the cross-correlation sequence spans several periods, that is, it contains several local maxima. Because of the influence of harmonics, the global maximum among these is often not the local maximum at the first period. The normalized cross-correlation method is therefore prone to frequency-halving errors, which degrades the smoothing of the pitch period estimates by the median filter.
Summary of the invention
Embodiments of the invention provide a pitch period estimate correction method, a pitch estimation method and related apparatus, to solve the prior-art problem of poor results when smoothing pitch period estimates.
An embodiment of the invention provides a pitch period estimate correction method, comprising: comparing the modified circular AMDF (MCAMD) sequence maximum MA_max(i+1) of the current subframe in the current frequency region with a weighted value of a first intermediate variable MA_max; if MA_max(i+1) is greater than the weighted value of MA_max, replacing the first intermediate variable MA_max and a second intermediate variable T_opt with MA_max(i+1) and the delay T(i+1) corresponding to MA_max(i+1), respectively; repeating the comparison until the current frequency region is no longer within the fundamental frequency range; calculating the ratio of the second intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of an odd number of subframes preceding the current subframe; if the ratio is less than a first correction factor or greater than a second correction factor, and the MCAMD sequence maximum MA_max0 of the current subframe in the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and a second empirical factor, correcting the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0; and median filtering the pitch period estimates of an even number of subframes preceding the current subframe together with the second intermediate variable T_opt, and taking the resulting value as the pitch period estimate of the middle subframe. Here, the weighted value of the first intermediate variable MA_max is the product of MA_max and a first empirical factor; correcting T_opt with the delay T_0 corresponding to MA_max0 means setting the current value of T_opt to T_0; the middle subframe is the subframe whose subframe number is the median among the current subframe and the even number of subframes preceding it; and the even number is the largest even number smaller than the odd number.
An embodiment of the invention provides a pitch period estimate correction apparatus, comprising: a comparison module, configured to compare the modified circular AMDF (MCAMD) sequence maximum MA_max(i+1) of the current subframe in the current frequency region with a weighted value of a first intermediate variable MA_max, and, if MA_max(i+1) is greater than the weighted value of MA_max, to replace the first intermediate variable MA_max and a second intermediate variable T_opt with MA_max(i+1) and the delay T(i+1) corresponding to MA_max(i+1), respectively, repeating the comparison until the current frequency region is no longer within the fundamental frequency range; a correction module, configured to calculate the ratio of the second intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of an odd number of subframes preceding the current subframe, and, if the ratio is less than a first correction factor or greater than a second correction factor, and the MCAMD sequence maximum MA_max0 of the current subframe in the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and a second empirical factor, to correct the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0; and a median filtering module, configured to median filter the pitch period estimates of an even number of subframes preceding the current subframe together with the second intermediate variable T_opt, and to take the resulting value as the pitch period estimate of the middle subframe. Here, the weighted value of the first intermediate variable MA_max is the product of MA_max and a first empirical factor; correcting T_opt with the delay T_0 corresponding to MA_max0 means setting the current value of T_opt to T_0; the middle subframe is the subframe whose subframe number is the median among the current subframe and the even number of subframes preceding it; and the even number is the largest even number smaller than the odd number.
An embodiment of the invention provides a pitch estimation method, comprising: preprocessing a received signal, the signal comprising a speech signal or an audio signal; computing the normalized cross-correlation sequence of the preprocessed signal, and deriving the modified circular AMDF (MCAMD) sequence from the normalized cross-correlation sequence; and correcting the pitch delay estimate according to the delay corresponding to the maximum of the MCAMD sequence within the fundamental frequency range, and taking the corrected delay estimate as the pitch period estimate of the signal. Correcting the pitch delay estimate according to the delay corresponding to the maximum of the MCAMD sequence within the fundamental frequency range comprises: comparing the MCAMD sequence maximum MA_max(i+1) of the current subframe in the current frequency region with a weighted value of a first intermediate variable MA_max; if MA_max(i+1) is greater than the weighted value of MA_max, replacing the first intermediate variable MA_max and a second intermediate variable T_opt with MA_max(i+1) and the delay corresponding to MA_max(i+1), respectively; repeating the comparison until the current frequency region is no longer within the fundamental frequency range; calculating the ratio of the second intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of an odd number of subframes preceding the current subframe; if the ratio is less than a first correction factor or greater than a second correction factor, and the MCAMD sequence maximum MA_max0 of the current subframe in the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and a second empirical factor, correcting the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0; and median filtering the pitch period estimates of an even number of subframes preceding the current subframe together with the second intermediate variable T_opt, and taking the resulting value as the pitch period estimate of the middle subframe. Here, the weighted value of the first intermediate variable MA_max is the product of MA_max and a first empirical factor; correcting T_opt with the delay T_0 corresponding to MA_max0 means setting the current value of T_opt to T_0; the middle subframe is the subframe whose subframe number is the median among the current subframe and the even number of subframes preceding it; and the even number is the largest even number smaller than the odd number.
An embodiment of the invention provides a pitch estimation apparatus, comprising: a preprocessing module, configured to preprocess a received signal, the signal comprising a speech signal or an audio signal; a sequence derivation module, configured to compute the normalized cross-correlation sequence of the preprocessed signal and to derive the modified circular AMDF (MCAMD) sequence from the normalized cross-correlation sequence; and a correction module, configured to correct the pitch delay estimate according to the delay corresponding to the maximum of the MCAMD sequence within the fundamental frequency range, and to take the corrected delay estimate as the pitch period estimate of the signal. The correction module comprises: a comparison unit, configured to compare the MCAMD sequence maximum MA_max(i+1) of the current subframe in the current frequency region with a weighted value of a first intermediate variable MA_max, and, if MA_max(i+1) is greater than the weighted value of MA_max, to replace the first intermediate variable MA_max and a second intermediate variable T_opt with MA_max(i+1) and the delay T(i+1) corresponding to MA_max(i+1), respectively, repeating the comparison until the current frequency region is no longer within the fundamental frequency range; a correction unit, configured to calculate the ratio of the second intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of an odd number of subframes preceding the current subframe, and, if the ratio is less than a first correction factor or greater than a second correction factor, and the MCAMD sequence maximum MA_max0 of the current subframe in the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and a second empirical factor, to correct the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0; and a median filtering unit, configured to median filter the pitch period estimates of an even number of subframes preceding the current subframe together with the second intermediate variable T_opt, and to take the resulting value as the pitch period estimate of the middle subframe. Here, the weighted value of the first intermediate variable MA_max is the product of MA_max and a first empirical factor; correcting T_opt with the delay T_0 corresponding to MA_max0 means setting the current value of T_opt to T_0; the middle subframe is the subframe whose subframe number is the median among the current subframe and the even number of subframes preceding it; and the even number is the largest even number smaller than the odd number.
As the above embodiments show, the modified circular AMDF (MCAMD) sequence is derived from the normalized cross-correlation sequence, which has good periodicity. The MCAMD sequence maxima and their delays are processed by cyclically comparing the MCAMD sequence maxima in adjacent frequency regions, which finally determines the MCAMD sequence maximum of the current subframe and its delay. This reduces the probability of frequency-halving/frequency-doubling errors when a normalized cross-correlation algorithm estimates the pitch period. Median filtering the delay of the MCAMD sequence maximum of the current subframe together with the pitch period estimates of an even number of preceding subframes further reduces such errors, and thus further improves the accuracy of pitch period estimation.
Description of drawings
To describe the technical solutions of the embodiments of the invention more clearly, the accompanying drawings needed for the description of the prior art or the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the pitch period estimate correction method provided by an embodiment of the invention;
Fig. 2 is a schematic flowchart of the pitch period estimate correction method provided by another embodiment of the invention;
Fig. 3 is a schematic flowchart of the pitch estimation method provided by an embodiment of the invention;
Fig. 4 is a schematic diagram of the logical structure of a pitch period estimate correction apparatus provided by an embodiment of the invention;
Fig. 5 is a schematic diagram of the logical structure of a pitch estimation apparatus provided by an embodiment of the invention;
Fig. 6 is a schematic diagram of the logical structure of a pitch estimation apparatus provided by another embodiment of the invention;
Fig. 7 is a schematic diagram of the logical structure of a pitch estimation apparatus provided by another embodiment of the invention;
Fig. 8 is a schematic structural diagram of the pitch estimation apparatus as used in the time-alignment technique module of unified speech and audio coding;
Fig. 9 is a schematic diagram of the logical structure of a speech signal pitch estimation system provided by an embodiment of the invention.
Embodiment
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
Referring to Fig. 1, which is a schematic flowchart of the pitch period estimate correction method provided by an embodiment of the invention, the method mainly comprises the following steps:
S101: Compare the modified circular AMDF (MCAMD) sequence maximum MA_max(i+1) of the current subframe in the current frequency region with the weighted value of the first intermediate variable MA_max. If MA_max(i+1) is greater than the weighted value of MA_max, replace the first intermediate variable MA_max and the second intermediate variable T_opt with MA_max(i+1) and the delay T(i+1) corresponding to MA_max(i+1), respectively. Repeat the comparison until the current frequency region is no longer within the fundamental frequency range.
A frame of signal, for example a speech or audio signal, can be divided into 16 subframes, so in the embodiments of the invention the pitch period estimate can be corrected subframe by subframe. Furthermore, because the cross-correlation sequence has good periodicity, the embodiments compute the normalized cross-correlation sequence of the signal and then derive the MCAMD sequence maximum from it. Pitch estimation therefore need not consider the start time of the original signal; instead, the position of the first maximum of the normalized cross-correlation sequence can be used to estimate the pitch period.
For speech or audio signals, the fundamental frequency generally lies between 60 Hz and 4000 Hz, and the fundamental frequencies perceptible to the human ear are concentrated between 80 Hz and 2500 Hz. For pitch period estimation, the fundamental frequency range can be divided in advance into several frequency regions.
In one embodiment of the invention, the fundamental frequency range of 80 Hz to 2500 Hz is divided into a first frequency region [80, 240) Hz, a second frequency region [240, 720) Hz and a third frequency region [720, 2500) Hz. The MCAMD (Modified Circular Average Magnitude Difference) sequence maximum MA_max(i+1) of the current subframe in the current frequency region is then compared with the weighted value of the first intermediate variable MA_max; for example, the MCAMD sequence maximum MA_max(2) of the current subframe in the second frequency region [240, 720) Hz is compared with the weighted value of MA_max, and the MCAMD sequence maximum MA_max(3) of the current subframe in the third frequency region [720, 2500) Hz is compared with the weighted value of MA_max.
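Because the sequence maxima are searched over delays while the regions are stated in hertz, each frequency region has to be mapped to a range of delays. The following Python sketch shows one way this mapping might be done; the helper name lag_range, the use of integer sample delays and the 16 kHz sampling rate are assumptions made for illustration and are not stated in the original text.

```python
def lag_range(f_lo, f_hi, fs):
    """Integer delays (in samples) whose frequency fs / lag lies in [f_lo, f_hi).

    All arguments are assumed to be positive integers (Hz).
    """
    lag_min = fs // f_hi + 1   # smallest lag with fs / lag strictly below f_hi
    lag_max = fs // f_lo       # largest lag with fs / lag at least f_lo
    return lag_min, lag_max

# The three regions of this embodiment at an assumed 16 kHz sampling rate.
fs = 16000
regions = [(80, 240), (240, 720), (720, 2500)]
lag_ranges = [lag_range(lo, hi, fs) for lo, hi in regions]
```

Under these assumptions the three delay ranges are contiguous and do not overlap, so every candidate delay belongs to exactly one frequency region.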
The errors that occur most readily in pitch estimation are frequency-doubling/frequency-halving errors (the "wild points" in the pitch estimates). To avoid such errors from the outset, the frequencies corresponding to two adjacent MCAMD sequence maxima should, as far as possible, differ by a factor of two or one half, that is, stand in a doubling/halving relationship. For example, when the frequency corresponding to one MCAMD sequence maximum is 100 Hz, the frequency corresponding to an adjacent maximum should be 200 Hz or 50 Hz. In the embodiments of the invention, the fundamental frequency range of 80 Hz to 4000 Hz is divided into frequency regions as follows: the two endpoints of each region stand in a multiple relationship, and the regions are contiguous, that is, the right endpoint of one region (the larger of its two endpoints) coincides with the left endpoint of the next region (the smaller of its two endpoints). For example, the range 80 Hz to 4000 Hz is divided into a first region [80 Hz, 160 Hz), a second region [160 Hz, 320 Hz), a third region [320 Hz, 640 Hz), a fourth region [640 Hz, 1280 Hz) and a fifth region [1280 Hz, 4000 Hz). The frequencies corresponding to MCAMD sequence maxima in adjacent regions are then more likely to differ by a factor of two or one half, which makes the subsequent correction of the pitch estimate more accurate.
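The region boundaries of this example can be reproduced with a short Python sketch. The rule used here for the last region (it absorbs the remainder of the range once another full octave no longer fits) is an assumption inferred from the example [1280 Hz, 4000 Hz); the original text does not state a general rule, and the function name is hypothetical.

```python
def octave_regions(f_min, f_max):
    """Split [f_min, f_max) into contiguous half-open regions that double in width.

    Assumed rule: keep emitting [lo, 2*lo) while a further full octave
    [2*lo, 4*lo) would still fit below f_max; the final region takes the rest.
    """
    regions = []
    lo = f_min
    while 4 * lo <= f_max:
        regions.append((lo, 2 * lo))
        lo *= 2
    regions.append((lo, f_max))
    return regions

regions = octave_regions(80, 4000)
```

Each returned pair is read as a half-open interval, so the right endpoint of one region is the left endpoint of the next, as the text requires.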
If the MCAMD sequence maximum MA_max(i+1) of the current subframe in the current frequency region is greater than the weighted value of the first intermediate variable MA_max, then MA_max(i+1) replaces the first intermediate variable MA_max, the delay T(i+1) corresponding to MA_max(i+1) replaces the second intermediate variable T_opt, and the comparison is repeated until the current frequency region is no longer within the fundamental frequency range. For example, at the start of the algorithm, the first intermediate variable MA_max is set to the MCAMD sequence maximum MA_max(1) of the current subframe in the first frequency region [80 Hz, 160 Hz), and the second intermediate variable T_opt is set to the delay T(1) corresponding to MA_max(1). Let the first empirical factor be ρ_1. Suppose the MCAMD sequence maximum MA_max(2) of the current subframe in the second frequency region [160 Hz, 320 Hz) is greater than the product of MA_max and ρ_1, that is, MA_max(2) > MA_max × ρ_1; then set MA_max = MA_max(2) and T_opt = T(2). Next, the MCAMD sequence maximum MA_max(3) in the third frequency region [320 Hz, 640 Hz) is compared with MA_max × ρ_1; if MA_max(3) < MA_max × ρ_1, MA_max and T_opt remain unchanged. Next, the MCAMD sequence maximum MA_max(4) in the fourth frequency region [640 Hz, 1280 Hz) is compared with MA_max × ρ_1; if MA_max(4) > MA_max × ρ_1, set MA_max = MA_max(4) and T_opt = T(4). The comparison continues in this way until the MCAMD sequence maximum MA_max(5) of the current subframe in the fifth frequency region [1280 Hz, 4000 Hz) has been compared with the product of the current MA_max and ρ_1. When the comparisons are finished, the second intermediate variable T_opt holds a value, for example the delay T(4) corresponding to the MCAMD sequence maximum MA_max(4) of the current subframe in the fourth frequency region [640 Hz, 1280 Hz).
In the embodiments of the invention, the first empirical factor ρ_1 may be set to 0.95.
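The comparison loop of step S101 can be sketched in Python as follows. This is a minimal sketch that assumes the per-region MCAMD maxima and their delays have already been computed; the function name and the numeric values are invented for illustration, chosen so that the run mirrors the example above (region 2 replaces, region 3 does not, region 4 replaces, region 5 does not).

```python
def select_candidate(region_maxima, region_delays, rho1=0.95):
    """Step S101: walk the frequency regions from lowest to highest.

    region_maxima[i] is MA_max(i+1) and region_delays[i] is T(i+1).
    Returns the final (MA_max, T_opt).
    """
    ma_max = region_maxima[0]   # initialised from the first region
    t_opt = region_delays[0]
    for ma_i, t_i in zip(region_maxima[1:], region_delays[1:]):
        if ma_i > ma_max * rho1:   # exceeds the weighted value of MA_max
            ma_max, t_opt = ma_i, t_i
    return ma_max, t_opt

# Hypothetical maxima and delays (in samples) for five regions.
maxima = [0.80, 0.90, 0.70, 0.88, 0.50]
delays = [150, 75, 38, 19, 9]
ma_max, t_opt = select_candidate(maxima, delays)
```

Because ρ_1 is below 1, a higher-frequency region can take over even when its maximum is slightly smaller than the current MA_max, which biases the choice toward shorter delays.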
S102: Calculate the ratio of the second intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of an odd number of subframes preceding the current subframe. If the ratio is less than the first correction factor or greater than the second correction factor, and the MCAMD sequence maximum MA_max0 of the current subframe in the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and the second empirical factor, correct the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0.
Once T_opt has been assigned a value, for example T_opt = T(4), the ratio T_opt / T_pre_mid_o is calculated. Here, T_pre_mid_o is the median pitch period estimate of an odd number of subframes preceding the current subframe. In the embodiments of the invention, taking T_pre_mid_o as the median pitch period estimate of the 5 subframes preceding the current subframe gives good estimation results. For example, suppose the current subframe is subframe F_8 (the subscript "8" denotes the subframe number, likewise below), and denote the 5 subframes preceding F_8 as F_7, F_6, F_5, F_4 and F_3. The median pitch period estimate of the 5 subframes preceding F_8 is then the value obtained by median filtering the pitch period estimates of F_7, F_6, F_5, F_4 and F_3; it is also the pitch period estimate of the subframe whose number is the median among those 5 subframes, that is, the pitch period estimate of subframe F_5.
Denote the first correction factor, the second correction factor and the second empirical factor as r_1, r_2 and ρ_2, respectively. If the ratio of the second intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of the odd number of subframes preceding the current subframe is less than the first correction factor or greater than the second correction factor, that is, if T_opt / T_pre_mid_o < r_1 or T_opt / T_pre_mid_o > r_2, the MCAMD sequence maximum MA_max0 of the current subframe is searched for in the neighborhood of T_pre_mid_o, for example within 1 ms (millisecond) of T_pre_mid_o. If MA_max0 is greater than the product of the first intermediate variable MA_max and the second empirical factor, that is, MA_max0 > MA_max × ρ_2, the current value of the second intermediate variable T_opt is set to the delay T_0 corresponding to MA_max0.
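Step S102 can be sketched in Python as follows. The search in the neighborhood of T_pre_mid_o is represented by a hypothetical callable search_near that returns the pair (MA_max0, T_0); how that search is carried out on the MCAMD sequence is outside this sketch. The default factors are the values the embodiment gives for r_1, r_2 and ρ_2, and the numeric inputs are invented for illustration.

```python
import statistics

def correct_candidate(t_opt, ma_max, prev_odd_periods, search_near,
                      r1=0.75, r2=1.4, rho2=0.85):
    """Step S102: pull T_opt back toward the recent median when it looks
    like a frequency-doubling or frequency-halving error.

    prev_odd_periods: pitch period estimates of an odd number of preceding
    subframes (five in the embodiment).
    search_near: hypothetical callable mapping T_pre_mid_o to (MA_max0, T_0).
    """
    t_pre_mid = statistics.median(prev_odd_periods)   # T_pre_mid_o
    ratio = t_opt / t_pre_mid
    if ratio < r1 or ratio > r2:
        ma_max0, t0 = search_near(t_pre_mid)
        if ma_max0 > ma_max * rho2:
            return t0          # T_opt is set to T_0
    return t_opt               # otherwise T_opt is left unchanged

# Hypothetical history of five subframes and a stub neighborhood search.
history = [76, 75, 150, 74, 75]
stub_search = lambda t_mid: (0.80, 75)
corrected = correct_candidate(19, 0.88, history, stub_search)
unchanged = correct_candidate(74, 0.88, history, stub_search)
```

In the first call the ratio 19 / 75 is below r_1 and the neighborhood maximum 0.80 exceeds 0.88 × 0.85, so the delay near the median replaces T_opt; in the second call the ratio lies between r_1 and r_2 and nothing changes.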
S103: Median filter the pitch period estimates of an even number of subframes preceding the current subframe together with the second intermediate variable T_opt, and take the resulting value as the pitch period estimate of the middle subframe.
The middle subframe is the subframe whose subframe number is the median among the current subframe and the even number of subframes preceding it, and this even number is the largest even number smaller than the odd number in step S102.
For example, suppose the current subframe is subframe F_8 and the 4 subframes preceding F_8 are F_7, F_6, F_5 and F_4, whose pitch period estimates are T_7, T_6, T_5 and T_4, respectively. Median filtering is then performed on T_opt, T_7, T_6, T_5 and T_4, and the result is taken as the pitch period estimate of the middle subframe, namely subframe F_6.
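The median filtering of step S103 can be illustrated with a minimal sketch. The function name and the numeric lag values below are hypothetical and chosen only for illustration; the sketch assumes that pitch period estimates are expressed as delays in samples.

```python
from statistics import median


def middle_subframe_estimate(t_opt, previous_estimates):
    """Median of T_opt and the estimates of an even number of preceding subframes.

    previous_estimates: pitch period estimates of the preceding subframes,
    for example [T_7, T_6, T_5, T_4] when the current subframe is F_8.
    """
    if len(previous_estimates) % 2 != 0:
        raise ValueError("expected an even number of previous estimates")
    # An odd number of values in total, so the median is one of the inputs.
    return median([t_opt, *previous_estimates])


# Hypothetical delays (in samples): T_opt is a doubling error, T_7..T_4 are stable.
print(middle_subframe_estimate(160, [80, 82, 79, 81]))
```

Because the total number of values is odd, the result is always one of the input estimates, so an isolated outlier in T_opt is replaced by a neighboring value rather than averaged into the output.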
In the embodiments of the present invention, the first correction factor r_1, the second correction factor r_2 and the second empirical factor ρ_2 may be set to 0.75, 1.4 and 0.85, respectively.
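The correction of step S102 with the factor values given above can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function name, the representation of the MCAMD sequence as a list indexed by delay in samples, and the `radius` parameter (the half-width of the neighborhood, which the text gives only as "1 ms") are all hypothetical.

```python
def correct_t_opt(mcamd, t_opt, ma_max, t_pre_mid, radius,
                  r1=0.75, r2=1.4, rho2=0.85):
    """Return T_opt, corrected toward T_pre_mid_o when the ratio test fails.

    mcamd:     MCAMD values of the current subframe, indexed by delay (samples)
    t_opt:     second intermediate variable (delay, samples)
    ma_max:    first intermediate variable (MCAMD maximum found so far)
    t_pre_mid: median pitch period estimate of the preceding subframes (samples)
    radius:    assumed half-width of the neighborhood around t_pre_mid (samples)
    """
    ratio = t_opt / t_pre_mid
    if r1 <= ratio <= r2:
        return t_opt  # T_opt is consistent with recent history
    lo = max(0, t_pre_mid - radius)
    hi = min(len(mcamd) - 1, t_pre_mid + radius)
    # Delay T_0 of the largest MCAMD value near T_pre_mid_o.
    t0 = max(range(lo, hi + 1), key=lambda t: mcamd[t])
    ma_max0 = mcamd[t0]
    return t0 if ma_max0 > ma_max * rho2 else t_opt


# Hypothetical data: a strong peak at delay 160 and a slightly weaker one at 81.
values = [0.0] * 200
values[160], values[81] = 1.0, 0.9
print(correct_t_opt(values, t_opt=160, ma_max=1.0, t_pre_mid=80, radius=8))
```

In this example the ratio 160/80 exceeds r_2, and the neighboring peak is strong enough relative to MA_max × ρ_2, so the delay near the recent median is preferred over the doubled one.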
As can be seen from the above embodiment of the present invention, the modified circular AMDF sequence is derived from the normalized cross-correlation sequence, which has good periodicity. The MCAMD sequence maximum and its delay are processed by repeatedly comparing the MCAMD sequence maximum in each fundamental frequency region with the weighted value of the preset MA_max, which finally determines the MCAMD sequence maximum of the current subframe and its delay. This can reduce the probability of frequency-halving/frequency-doubling errors when the pitch period is estimated with the normalized cross-correlation sequence algorithm. In addition, median filtering the corrected pitch delay estimate obtained from the MCAMD sequence of the current subframe together with the pitch period estimates of the even number of subframes preceding the current subframe can further reduce the frequency-halving/frequency-doubling errors that occur during pitch period estimation, which further improves the accuracy of pitch period estimation.
With reference to the flow in Fig. 2, the following describes the technical solution when certain assumed conditions of the above embodiment are not satisfied, for example when, in step S101, the comparison between the MCAMD sequence maximum MA_max(i+1) of the current subframe in the current frequency region and the weighted value of MA_max shows that MA_max(i+1) is less than MA_max × ρ_1.
In the embodiment of Fig. 2, the fundamental frequency range is divided into 5 frequency regions by way of example. Those skilled in the art will understand that when the fundamental frequency range is divided into a different number of frequency regions, for example 3, the processing flow is similar, so that flow is not described again.
S201: Let MA_max = MA_max(1), T_opt = T(1) and i = 2. That is, the MCAMD sequence maximum in the first frequency region and the delay corresponding to that maximum are assigned, as preset initial values, to the first intermediate variable MA_max and the second intermediate variable T_opt, and the flow moves forward one frequency region (i = 2), namely to the second frequency region.
S202: Is MA_max(i) > MA_max × ρ_1? That is, determine whether the MCAMD sequence maximum of the current subframe in the new frequency region is greater than the product of the first empirical factor and MA_max. If so, the flow proceeds to step S203; otherwise, the flow proceeds to step S204.
S203: Let MA_max = MA_max(i) and T_opt = T(i). That is, when the condition of step S202 is satisfied, the first intermediate variable MA_max and the second intermediate variable T_opt are replaced by the maximum of the MCAMD sequence of the current subframe in the new frequency region and its delay, respectively.
S204: Let i = i + 1. That is, move forward one frequency region.
S205: Is i > 5? That is, determine whether the current frequency region is still within the fundamental frequency range, in other words whether it is one of the 5 frequency regions into which the fundamental frequency range is divided. If it is out of range (i > 5), the flow proceeds to step S206; otherwise, the above process is repeated, that is, the flow returns to step S202.
S206: Let r = T_opt/T_pre_mid_o. That is, compute the ratio of the second intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of the 5 subframes preceding the current subframe.
S207: Is r < r_1 or r > r_2? That is, determine whether the ratio of the second intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of the 5 subframes preceding the current subframe is less than the first correction factor r_1 or greater than the second correction factor r_2. If so, the flow proceeds to step S208; otherwise, it goes to step S211.
In the present embodiment, r_1 and r_2 may be set to 0.75 and 1.4, respectively.
S208: Find the MCAMD sequence maximum MA_max0 of the current subframe and its delay T_0 within the neighborhood of T_pre_mid_o, for example within 1 ms (millisecond) of T_pre_mid_o.
S209: Is MA_max0 > MA_max × ρ_2? That is, determine whether the MCAMD sequence maximum MA_max0 of the current subframe within the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and the second empirical factor ρ_2. If so, the flow proceeds to step S210; otherwise, it goes to step S211.
S210: Let T_opt = T_0. That is, the delay T_0 corresponding to the MCAMD sequence maximum MA_max0 of the current subframe within the neighborhood of T_pre_mid_o is used to correct the second intermediate variable T_opt.
S211: Perform median filtering on the pitch period estimates of the even number of subframes preceding the current subframe together with the second intermediate variable T_opt. The value obtained from the median filtering is taken as the pitch period estimate of the middle subframe. The middle subframe here is the subframe whose number is the median among the subframes formed by the current subframe and the even number of subframes preceding it. For example, if the current subframe is subframe F_8 and the 4 subframes preceding F_8 are F_7, F_6, F_5 and F_4, then the middle subframe is subframe F_6.
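The region scan in steps S201 to S205 can be sketched as a short loop. This is only an illustration under stated assumptions: the function name and the input representation (one pair of regional maximum and delay per frequency region) are hypothetical, and the value of the first empirical factor ρ_1 used in the example is a placeholder, since no value is assumed here.

```python
def scan_regions(region_maxima, rho1):
    """Scan the frequency regions in order and return (MA_max, T_opt).

    region_maxima: list of (MA_max(i), T(i)) pairs, one per frequency region,
                   ordered from the first region to the last
    rho1:          first empirical factor (weight applied to MA_max)
    """
    ma_max, t_opt = region_maxima[0]          # S201: initialise from region 1
    for ma_i, t_i in region_maxima[1:]:       # S204/S205: visit regions 2..5
        if ma_i > ma_max * rho1:              # S202: weighted comparison
            ma_max, t_opt = ma_i, t_i         # S203: replace both variables
    return ma_max, t_opt


# Hypothetical regional maxima and delays for 5 regions, with a placeholder rho1.
regions = [(1.0, 150), (0.8, 75), (0.95, 40), (0.5, 20), (0.2, 10)]
print(scan_regions(regions, rho1=0.9))
```

Note that each comparison uses the current value of MA_max, so a replacement in one region changes the threshold applied to all later regions.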
Refer to Fig. 3, which is a schematic flowchart of the pitch estimation method provided by an embodiment of the present invention. The method mainly comprises the following steps:
S301: Preprocess the received signal, where the signal comprises a speech signal or an audio signal.
Real signals, for example speech or audio signals, are usually mixed with background noise, harmonics, formant frequencies and the like. These noise and frequency components make the signal waveform very complex, which often causes errors in pitch detection or estimation. To remove these adverse factors as far as possible before pitch estimation, the embodiment provided by the present invention first preprocesses the received signal. Preprocessing includes classifying the frames of the signal as silent frames or non-silent frames. The characteristics of signals such as speech or audio change over time, and such signals remain relatively stable (stationary) only within a short time interval; this property is called the "short-time characteristic". Analysis and processing of speech or audio signals are generally based on this short-time characteristic, that is, the speech or audio signal stream is processed in segments or frames. Framing is generally implemented by weighting the speech or audio signal with a movable finite-length window, and the frames may be either contiguous or overlapping. Therefore, in the embodiment provided by the present invention, the short-time average energy can be used to classify frames as silent or non-silent (the window-weighted short-time average energy is equivalent to the output obtained by passing the squared signal through a linear filter): when the short-time energy of the signal is less than a certain threshold, the frame is judged to be a silent frame; otherwise, it is a non-silent frame.
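The silent/non-silent decision described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the rectangular default window and the threshold value used in the example are hypothetical, since the text only says that the short-time energy is compared with "a certain threshold".

```python
def is_silent_frame(frame, threshold, window=None):
    """Classify a frame as silent when its short-time average energy is low.

    frame:     list of samples of one frame
    threshold: energy threshold (hypothetical value chosen by the caller)
    window:    optional weighting window of the same length as the frame;
               a rectangular window is assumed when omitted
    """
    if window is None:
        window = [1.0] * len(frame)
    # Window-weighted short-time average energy of the frame.
    energy = sum((w * x) ** 2 for w, x in zip(window, frame)) / len(frame)
    return energy < threshold


# Hypothetical frames of 160 samples and a hypothetical threshold of 1e-4.
quiet = [0.001] * 160
loud = [0.5, -0.5] * 80
print(is_silent_frame(quiet, 1e-4), is_silent_frame(loud, 1e-4))
```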
Because a silent frame has no obvious period or no period at all, the pitch period estimated for a silent frame often deviates considerably from the actual pitch period, which affects the accuracy of pitch period estimation for the whole signal. For this reason, in the embodiments of the present invention, pitch period estimation and post-processing are not performed on silent frames. In addition, so that the pitch estimation method provided by the embodiments is suitable for time-warping technology, the pitch period of a silent frame is set to the pitch median of the previous subframe rather than to zero. The reason is that if the pitch period of a silent frame were set to zero, the degree of warping (a parameter required by time-warping technology) extracted from the pitch estimates would be very large at the junction between non-silent and silent frames, leading to erroneous warping and in turn degrading the performance of time-warping technology that uses the pitch estimation method provided by the embodiments.
Before the silent/non-silent frame decision, preprocessing also includes high-pass filtering, mean removal and numerical filtering. For example, a high-pass filter with a lower cutoff frequency of 50 Hz may be used to filter the signal in order to remove power-supply interference. Because the normalized cross-correlation sequence takes high values at all delays when the signal has a non-zero mean, the preprocessing provided by the embodiments of the present invention also includes mean removal. The signal after mean removal is s'(n) = s(n) - u, where u is given by:
Figure GDA00001950699200111
To smooth the signal, the preprocessing provided by the embodiments of the present invention also includes numerical filtering of the signal with a filter of a certain order. Practice shows that, for numerical filtering, filters of order lower than 5 give unsatisfactory results, while filters of order higher than 5 give little further improvement and introduce larger delay. To avoid delay while still obtaining a good filtering effect, the embodiment provided by the present invention may use a 5th-order numerical filter, whose filtering formula is:
Figure GDA00001950699200121
S302: Compute the normalized cross-correlation sequence of the preprocessed signal, and obtain the modified circular AMDF sequence of the normalized cross-correlation sequence from the resulting normalized cross-correlation sequence.
Because the cross-correlation sequence has good periodicity, in the embodiments of the present invention the preprocessed signal can be converted into a normalized cross-correlation sequence by computing the normalized cross-correlation function. For example, the signal is converted into the normalized cross-correlation sequence R(j) by the normalized cross-correlation function (NCCF, Normalized Cross-Correlation Function):
$$R(j) = \frac{\sum_{n=0}^{N-1} \tilde{s}(n)\,\tilde{s}(n-j)}{\sqrt{\sum_{n=0}^{N-1} \tilde{s}^{2}(n)\,\sum_{n=0}^{N-1} \tilde{s}^{2}(n-j)}}, \quad j = 0, \ldots, M$$
Here, M is the cross-correlation delay length and N is the cross-correlation sequence length.
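The NCCF above can be sketched directly from its definition. This is a minimal sketch under stated assumptions: the function name and argument layout are hypothetical, the denominator is taken to be the square root of the product of the two energies, and the samples at negative indices n - j are assumed to come from signal history stored before the current frame in the same buffer.

```python
import math


def nccf(x, start, n_len, max_lag):
    """Normalized cross-correlation R(j), j = 0..max_lag, of one frame.

    x:       buffer holding the preprocessed signal (history plus current frame)
    start:   index in x of the first sample of the frame, i.e. n = 0
    n_len:   number of samples summed over (N)
    max_lag: largest delay (M); x must hold max_lag samples before `start`
    """
    if start < max_lag or start + n_len > len(x):
        raise ValueError("buffer does not cover the frame and its history")
    frame = x[start:start + n_len]
    e0 = sum(v * v for v in frame)
    r = []
    for j in range(max_lag + 1):
        shifted = x[start - j:start - j + n_len]      # samples s(n - j)
        num = sum(a * b for a, b in zip(frame, shifted))
        den = math.sqrt(e0 * sum(v * v for v in shifted))
        r.append(num / den if den > 0 else 0.0)       # guard all-zero input
    return r


# Hypothetical test signal: a sinusoid with a period of 20 samples.
signal = [math.sin(2 * math.pi * n / 20) for n in range(200)]
r = nccf(signal, start=60, n_len=80, max_lag=40)
print(round(r[20], 6), round(r[10], 6))
```

For a periodic input, R(j) is close to 1 at delays equal to multiples of the period, which is the periodicity that the subsequent steps rely on.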
Then, for the resulting normalized cross-correlation (NCC, Normalized Cross-Correlation) sequence, its circular average magnitude difference function (CAMDF, Circular Average Magnitude Difference Function) is computed, which further converts the signal into the circular AMDF sequence A(j):
$$A(j) = \frac{1}{M} \sum_{n=0}^{M-1} \left| R\big((n+j) \bmod M\big) - R(n) \right|, \quad j = 0, \ldots, \frac{M}{2}$$
Further, the circular average magnitude difference (CAMD, Circular Average Magnitude Difference) sequence is modified to obtain the modified circular average magnitude difference (MCAMD, Modified Circular Average Magnitude Difference) sequence MA(j) of the NCC sequence:
$$MA(j) = A_{\max} - A(j),$$
Figure GDA00001950699200125
Here, A_max is the maximum value of the sequence A(j).
The delay corresponding to the maximum value of MA(j) is a candidate pitch period estimate for a subframe (if the signal is divided into several subframes for analysis).
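The conversion from the NCC sequence R(j) to the MCAMD sequence MA(j) can be sketched as follows. The function name is hypothetical, and the sketch assumes that R is supplied as a list R(0), ..., R(M) and that M/2 is taken as integer division when M is odd.

```python
import math


def mcamd(r):
    """Return MA(j) = A_max - A(j) for j = 0..M//2, given R(0..M)."""
    m = len(r) - 1
    if m < 1:
        raise ValueError("need at least two NCC values")
    # Circular AMDF of the NCC sequence: indices wrap around modulo M.
    a = [sum(abs(r[(n + j) % m] - r[n]) for n in range(m)) / m
         for j in range(m // 2 + 1)]
    a_max = max(a)
    return [a_max - v for v in a]


# Hypothetical NCC sequence with a period of 10 delays and M = 40.
ncc = [math.cos(2 * math.pi * j / 10) for j in range(41)]
ma = mcamd(ncc)
print([round(v, 3) for v in ma[:11]])
```

Since A(0) = 0 by definition, MA(0) always equals A_max; a peak search on MA(j) therefore only makes sense over delays inside the pitch range, which is consistent with the maximum being sought within frequency regions of the fundamental frequency range.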
As mentioned above, because a silent frame has no obvious period or no period at all, the pitch period estimated for a silent frame often deviates considerably from the actual pitch period, which affects the accuracy of pitch period estimation for the whole signal. Therefore, in the embodiments of the present invention, computing the normalized cross-correlation sequence of the preprocessed signal and obtaining the modified circular AMDF sequence from it specifically comprises: computing the normalized cross-correlation sequence of the preprocessed non-silent frame signal, and obtaining the modified circular AMDF sequence of the normalized cross-correlation sequence of the non-silent frame signal from the resulting normalized cross-correlation sequence.
S303: Revise the pitch delay estimate according to the delay corresponding to the maximum of the modified circular AMDF sequence within the fundamental frequency range, and take the revised delay estimate as the pitch period estimate of the signal.
Denote the MCAMD sequence maximum by MA_max(i). In the embodiments of the present invention, revising the pitch delay estimate according to the delay corresponding to the maximum of the modified circular AMDF sequence within the fundamental frequency range comprises steps S3031, S3032 and S3033:
S3031: Compare the MCAMD sequence maximum MA_max(i+1) of the current subframe in the current frequency region with the weighted value of the first intermediate variable MA_max. If MA_max(i+1) is greater than the weighted value of MA_max, replace the first intermediate variable MA_max and the second intermediate variable T_opt with MA_max(i+1) and the delay T(i+1) corresponding to MA_max(i+1), respectively. Repeat the comparison until the current frequency region is no longer within the fundamental frequency range.
The subframe here may be any one of the 16 subframes of a frame of signal, for example a speech signal or an audio signal, that has been divided into 16 subframes. The frequency regions are frequency intervals into which the fundamental frequency range is divided, for example five frequency regions dividing the fundamental frequency range of 80 Hz to 4000 Hz: the first frequency region [80, 160) Hz, the second frequency region [160, 320) Hz, the third frequency region [320, 640) Hz, the fourth frequency region [640, 1280) Hz and the fifth frequency region [1280, 4000) Hz.
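To search for a regional maximum, each frequency region has to be expressed as a range of delays. The sketch below shows one way to do this, assuming that a delay of T samples corresponds to the frequency fs/T; the sampling rate `FS`, the function name and the use of integer delays are assumptions made for illustration and are not stated in the text.

```python
# The five frequency regions listed above, as [low, high) intervals in Hz.
REGIONS_HZ = [(80, 160), (160, 320), (320, 640), (640, 1280), (1280, 4000)]

FS = 16000  # hypothetical sampling rate in Hz


def region_lag_range(f_lo, f_hi, fs):
    """Smallest and largest integer delay T (samples) with f_lo <= fs/T < f_hi.

    Returns None when no integer delay falls inside the region.
    """
    lags = [t for t in range(1, int(fs // f_lo) + 1) if f_lo <= fs / t < f_hi]
    return (min(lags), max(lags)) if lags else None


for lo, hi in REGIONS_HZ:
    print((lo, hi), region_lag_range(lo, hi, FS))
```

Higher-frequency regions map to shorter delays, so scanning the regions from the first to the fifth moves from long candidate pitch periods to short ones.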
For a more detailed description of step S3031, refer to the description of step S101 in the example of Fig. 1; details are not repeated here.
S3032: Compute the ratio of the second intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of the odd number of subframes preceding the current subframe. If this ratio is less than the first correction factor or greater than the second correction factor, and the MCAMD sequence maximum MA_max0 of the current subframe within the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and the second empirical factor, then the delay T_0 corresponding to MA_max0 is used to correct the second intermediate variable T_opt.
Once T_opt has been assigned a value, for example T_opt = T(4), T_opt/T_pre_mid_o is computed. Here, T_pre_mid_o is the median pitch period estimate of the odd number of subframes preceding the current subframe. In the embodiment provided by the present invention, taking T_pre_mid_o as the median pitch period estimate of the 5 subframes preceding the current subframe gives good estimation results. For example, suppose the current subframe is subframe F_8 (the subscript "8" of F_8 denotes the subframe number; the same applies below), and denote the 5 subframes preceding F_8 as F_7, F_6, F_5, F_4 and F_3 in turn. The median pitch period estimate of the 5 subframes preceding F_8 is then the value obtained by median filtering the pitch period estimates of subframes F_7, F_6, F_5, F_4 and F_3, which is also the pitch period estimate of the subframe whose number is the median among the 5 subframes preceding the current subframe, namely the pitch period estimate of subframe F_5.
In the present embodiment, the first correction factor r_1, the second correction factor r_2 and the second empirical factor ρ_2 may be set to 0.75, 1.4 and 0.85, respectively.
S3033: Perform median filtering on the pitch period estimates of the even number of subframes preceding the current subframe together with the second intermediate variable T_opt, and take the value obtained from the median filtering as the pitch period estimate of the middle subframe.
For a more detailed description of step S3033, refer to the description of step S103 in the example of Fig. 1; details are not repeated here.
As can be seen from the example of Fig. 3, the present invention obtains the modified circular AMDF sequence of the normalized cross-correlation sequence from the computed normalized cross-correlation sequence, then revises the pitch delay estimate according to the delay corresponding to the maximum of the modified circular AMDF sequence within the fundamental frequency range, and takes the revised delay estimate as the pitch period estimate of the signal. In general, the pitch estimation algorithm provided by the embodiments of the present invention, which is based on the normalized cross-correlation weighted modified circular AMDF function, has lower delay than prior-art time-domain pitch estimation algorithms, and has lower computational complexity than frequency-domain algorithms with comparable performance. Furthermore, because the modified circular AMDF sequence is derived from the normalized cross-correlation sequence, which has good periodicity, the probability of frequency-halving/frequency-doubling errors when the pitch period is estimated with the normalized cross-correlation sequence algorithm can be reduced. Revising the pitch delay estimate according to the delay corresponding to the maximum of the modified circular AMDF sequence within the fundamental frequency range (for example, median filtering the corrected pitch delay estimate obtained from the MCAMD sequence of the current subframe together with the pitch period estimates of the even number of subframes preceding the current subframe) can further reduce the frequency-halving/frequency-doubling errors that occur during pitch period estimation, which further improves the accuracy of pitch period estimation.
Refer to Fig. 4, which is a schematic diagram of the logical structure of a pitch period estimate correction apparatus provided by an embodiment of the present invention. For ease of description, only the parts relevant to the embodiment are shown. The functional modules/units of the pitch period estimate correction apparatus in the example of Fig. 4 may be software modules/units, hardware modules/units, or modules/units combining software and hardware. The apparatus comprises a comparison module 401, a correction module 402 and a median filtering module 403, where:
The comparison module 401 is configured to compare the MCAMD sequence maximum MA_max(i+1) of the current subframe in the current frequency region with the weighted value of the first intermediate variable MA_max; if MA_max(i+1) is greater than the weighted value of MA_max, to replace the first intermediate variable MA_max and the second intermediate variable T_opt with MA_max(i+1) and the delay T(i+1) corresponding to MA_max(i+1), respectively; and to repeat the comparison until the current frequency region is no longer within the fundamental frequency range. The weighted value of the first intermediate variable MA_max is the product of MA_max and the first empirical factor, and the frequency regions are frequency intervals into which the fundamental frequency range is divided;
The correction module 402 is configured to compute the ratio of the second intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of the odd number of subframes preceding the current subframe; if the ratio is less than the first correction factor or greater than the second correction factor, and the MCAMD sequence maximum MA_max0 of the current subframe within the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and the second empirical factor, the delay T_0 corresponding to MA_max0 is used to correct the second intermediate variable T_opt;
Using the delay T_0 corresponding to MA_max0 to correct the second intermediate variable T_opt means setting the current value of the second intermediate variable T_opt to the delay T_0 corresponding to MA_max0;
The median filtering module 403 is configured to perform median filtering on the pitch period estimates of the even number of subframes preceding the current subframe together with the second intermediate variable T_opt, and to take the value obtained from the median filtering as the pitch period estimate of the middle subframe;
The middle subframe is the subframe whose number is the median among the subframes formed by the current subframe and the even number of subframes preceding it, and the even number is the largest even number less than the odd number.
It should be noted that, in the above embodiment of the pitch period estimate correction apparatus, the division into functional modules is only an example. In practical applications, the above functions may be assigned to different functional modules as required, for example according to the configuration requirements of the corresponding hardware or the convenience of software implementation; that is, the internal structure of the pitch period estimate correction apparatus may be divided into different functional modules to perform all or part of the functions described above. Moreover, in practical applications, the functional modules in this embodiment may be implemented by corresponding hardware, or by corresponding hardware executing corresponding software. For example, the correction module may be hardware that computes the ratio of the second intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of the odd number of subframes preceding the current subframe, such as a ratio calculator, or may be a general-purpose processor or other hardware device that executes a corresponding computer program to perform this function. Similarly, the median filtering module may be hardware that performs median filtering on the pitch period estimates of the even number of subframes preceding the current subframe together with the second intermediate variable T_opt, such as a median filter, or may be a general-purpose processor or other hardware device that executes a corresponding computer program to perform this function.
In the pitch period estimate correction apparatus of the Fig. 4 example, the fundamental frequency range may be [80 Hz, 4000 Hz), and the frequency regions are the first frequency region [80 Hz, 160 Hz), the second frequency region [160 Hz, 320 Hz), the third frequency region [320 Hz, 640 Hz), the fourth frequency region [640 Hz, 1280 Hz) and the fifth frequency region [1280 Hz, 4000 Hz) divided within [80 Hz, 4000 Hz). In this case, the comparison module 401 is specifically configured to: preset the value of the first intermediate variable MA_max to the MCAMD sequence maximum MA_max(1) of the current subframe in [80 Hz, 160 Hz), and preset the value of the second intermediate variable T_opt to the delay T(1) corresponding to MA_max(1); compare the MCAMD sequence maximum MA_max(2) of the current subframe in [160 Hz, 320 Hz) with the weighted value of the first intermediate variable MA_max and update MA_max and T_opt; compare the MCAMD sequence maximum MA_max(3) of the current subframe in [320 Hz, 640 Hz) with the weighted value of MA_max and update MA_max and T_opt; compare the MCAMD sequence maximum MA_max(4) of the current subframe in [640 Hz, 1280 Hz) with the weighted value of MA_max and update MA_max and T_opt; and compare the MCAMD sequence maximum MA_max(5) of the current subframe in [1280 Hz, 4000 Hz) with the weighted value of MA_max and update MA_max and T_opt.
In the pitch period estimate correction apparatus of the Fig. 4 example, the median pitch period estimate T_pre_mid_o of the odd number of subframes preceding the current subframe may be the median pitch period estimate of the 5 subframes preceding the current subframe, and the condition that the MCAMD sequence maximum MA_max0 of the current subframe within the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and the second empirical factor may be the condition that the MCAMD sequence maximum MA_max0 of the current subframe within 1 millisecond of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and the second empirical factor. In this case, the correction module 402 is specifically configured to compute the ratio of the second intermediate variable T_opt to the pitch period estimate of the subframe whose number is the median among the 5 subframes preceding the current subframe; if the ratio is less than the first correction factor or greater than the second correction factor, and the MCAMD sequence maximum MA_max0 of the current subframe within 1 millisecond of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and the second empirical factor, the current value of the second intermediate variable T_opt is set to the delay T_0 corresponding to MA_max0.
In the pitch period estimate correction apparatus of the Fig. 4 example, the median filtering module 403 is specifically configured to perform median filtering on the pitch period estimates T_pre_e of the 4 subframes preceding the current subframe together with the second intermediate variable T_opt.
Refer to Fig. 5, which is a schematic diagram of the logical structure of a pitch estimation apparatus provided by an embodiment of the present invention. For ease of description, only the parts relevant to the embodiment are shown. The pitch estimation apparatus of the Fig. 5 example can be used in the time-warping module of unified speech and audio coding, and can also be used for pitch estimation of speech signals. It comprises a preprocessing module 501, a sequence obtaining module 502 and a correcting module 503, where:
The preprocessing module 501 is configured to preprocess the received signal, where the signal comprises a speech signal or an audio signal;
The sequence obtaining module 502 is configured to compute the normalized cross-correlation sequence of the preprocessed signal, and to obtain the modified circular AMDF sequence of the normalized cross-correlation sequence from the resulting normalized cross-correlation sequence;
The correcting module 503 is configured to revise the pitch delay estimate according to the delay corresponding to the maximum of the modified circular AMDF sequence within the fundamental frequency range, and to take the revised delay estimate as the pitch period estimate of the signal.
The preprocessing module 501 of the Fig. 5 example may further comprise a judging unit 601, as in the pitch estimation apparatus of the Fig. 6 example. The judging unit 601 is configured to classify the frames of the signal as silent frames or non-silent frames after the received signal has been preprocessed. In this case, the sequence obtaining module 502 is specifically configured to compute the normalized cross-correlation sequence of the preprocessed non-silent frame signal, and to obtain the modified circular AMDF sequence of the normalized cross-correlation sequence of the non-silent frame signal from the resulting normalized cross-correlation sequence.
The correcting module 503 of the Fig. 5 or Fig. 6 example may further comprise a comparing unit 701, a correcting unit 702 and a median filtering unit 703, as in the pitch estimation apparatus of the Fig. 7 example, where:
The comparing unit 701 is configured to compare the MCAMD sequence maximum MA_max(i+1) of the current subframe in the current frequency region with the weighted value of the first intermediate variable MA_max; if MA_max(i+1) is greater than the weighted value of MA_max, to replace the first intermediate variable MA_max and the second intermediate variable T_opt with MA_max(i+1) and the delay T(i+1) corresponding to MA_max(i+1), respectively; and to repeat the comparison until the current frequency region is no longer within the fundamental frequency range. The weighted value of the first intermediate variable MA_max is the product of MA_max and the first empirical factor, and the frequency regions are frequency intervals into which the fundamental frequency range is divided;
The correcting unit 702 is configured to compute the ratio of the second intermediate variable T_opt to the median pitch period estimate T_pre_mid_o of the odd number of subframes preceding the current subframe; if the ratio is less than the first correction factor or greater than the second correction factor, and the MCAMD sequence maximum MA_max0 of the current subframe within the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and the second empirical factor, the delay T_0 corresponding to MA_max0 is used to correct the second intermediate variable T_opt;
Using the delay T_0 corresponding to MA_max0 to correct the second intermediate variable T_opt means setting the current value of the second intermediate variable T_opt to the delay T_0 corresponding to MA_max0;
The median filtering unit 703 is configured to perform median filtering on the pitch period estimates of the even number of subframes preceding the current subframe together with the second intermediate variable T_opt, and to take the value obtained from the median filtering as the pitch period estimate of the middle subframe;
Subframe is that some subframe neutron frame numbers that the front even number subframe of described current subframe and described current subframe consist of are the subframe of intermediate value in the middle of described, and described even number is the maximum even number less than described odd number.
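The three units can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation of the embodiment: the list seq is assumed to hold the modified circular AMDF value at each integer lag (how that sequence is computed is not shown here), the sample rate fs is hypothetical, the region boundaries and the factors 0.95, 0.75, 1.4 and 0.85 are the example values given in claims 2 and 6, the neighborhood of T_pre_mid_o is read as plus or minus 1 ms, and the condition is grouped as (ratio out of range) and (MA_max0 large enough). All function names are invented for this sketch.

```python
from statistics import median

# Frequency regions of claim 2 (Hz) and the factors of claim 6.
REGIONS_HZ = [(80, 160), (160, 320), (320, 640), (640, 1280), (1280, 4000)]
K1, C1, C2, K2 = 0.95, 0.75, 1.4, 0.85


def lag_range(f_lo, f_hi, fs):
    """Integer lags (in samples) whose frequency fs/lag lies in [f_lo, f_hi)."""
    return fs // f_hi + 1, fs // f_lo


def peak_in(seq, lo, hi):
    """Return (maximum value, lag) of seq over the lags lo..hi inclusive."""
    lag = max(range(lo, hi + 1), key=lambda t: seq[t])
    return seq[lag], lag


def select_lag(seq, fs):
    """Comparing unit: walk the regions from low to high frequency."""
    ma_max, t_opt = peak_in(seq, *lag_range(*REGIONS_HZ[0], fs))
    for f_lo, f_hi in REGIONS_HZ[1:]:
        ma_i, t_i = peak_in(seq, *lag_range(f_lo, f_hi, fs))
        if ma_i > K1 * ma_max:  # compare against the weighted MA_max
            ma_max, t_opt = ma_i, t_i
    return ma_max, t_opt


def correct_lag(seq, fs, ma_max, t_opt, t_pre_mid):
    """Correcting unit: fall back to a peak near the previous median lag."""
    ratio = t_opt / t_pre_mid
    if ratio < C1 or ratio > C2:
        half = fs // 1000  # assumed neighborhood: +/- 1 ms in samples
        lo = max(1, t_pre_mid - half)
        hi = min(len(seq) - 1, t_pre_mid + half)
        ma_0, t_0 = peak_in(seq, lo, hi)
        if ma_0 > K2 * ma_max:
            return t_0
    return t_opt


def smooth_pitch(prev_even, t_opt):
    """Median filtering unit: median of the previous estimates and T_opt."""
    return median(list(prev_even) + [t_opt])
```

In this sketch, seq must cover lags up to fs // 80, and t_pre_mid would be the estimate of the middle one of the 5 preceding subframes. The value returned by smooth_pitch is assigned to the middle subframe of the window, not to the current subframe.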
It should be noted that the information exchange between the modules/units of the above apparatus, their execution processes, and related details are based on the same concept as the method embodiments of the present invention and bring the same technical effects. For specifics, refer to the description of the method embodiments; they are not repeated here.
Figure 8 is a schematic structural diagram of a time warping (TW, Time Warp) module for unified speech and audio coding that uses the pitch estimation apparatus of any of the examples in Figures 5 to 7. The time warping module of Figure 8 comprises: a psychoacoustic control module 801, a TW information encoder 803, a TW information decoder 804, a TW mapping model construction module 805, a bit output module 806, a TW resampling module 807, a windowing mapping module 808, a modified discrete cosine transform (MDCT, Modified Discrete Cosine Transform) module 809, and the pitch estimation apparatus 802 of any of the examples in Figures 5 to 7.
In the time warping module of Figure 8, if signal classification determines that the frequency-domain coding mode is to be used and the encoder adopts the TW mode, the signal to be encoded is fed to the psychoacoustic control module 801, the pitch estimation apparatus 802, and the TW resampling module 807. The psychoacoustic control module 801 extracts the psychoacoustic parameters of the signal, which assist the window type selection of the windowing mapping module 808. The pitch estimation apparatus 802 tracks the pitch variation of the signal and extracts pitch information, chiefly the pitch period estimate (in this embodiment, the pitch period can be used to calculate the TW mapping curve parameter information). The TW resampling module 807 remaps the signal to be encoded in the time domain to improve the concentration of its spectrum in the frequency domain. The TW information encoder 803 encodes and quantizes the input TW mapping curve parameter information and stores it in the bitstream, and the TW information decoder 804 decodes the encoded and quantized TW mapping curve parameter information. In this embodiment, the TW mapping model is built from the decoded TW mapping curve parameter information so that the mapping parameters at the encoder and the decoder are consistent; the decoder can therefore recover a mapping curve identical to the encoder's from this parameter information, avoiding errors introduced by quantization coding. The decoded TW mapping curve parameter information is input to the TW mapping model construction module 805, which builds the mapping curve from it and calculates the control information required for the time-frequency transform, such as the resampling operation and the window function. After the signal to be encoded is resampled by the TW resampling module 807 and windowed and mapped by the windowing mapping module 808, a TW-domain signal is obtained, which is sent to the modified discrete cosine transform module 809 for transformation, yielding the TW-MDCT coefficients.
During TW coding, each frame requires 16 pitch values to calculate the mapping curve. When the implementation uses a frame length of 1024 samples, one pitch value is output every 64 samples. The pitch tracking module is implemented as follows: a 128-sample rectangular window is shifted by 64 samples at a time to perform pitch period estimation, so that 16 pitch values are calculated per frame.
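The framing arithmetic above (1024 / 64 = 16 pitch values per frame) can be sketched as follows. The text does not say where the 16th window obtains its last 64 samples, since a 128-sample window starting at sample 960 extends past the end of a 1024-sample frame. This sketch assumes the buffer carries at least 64 samples of look-ahead from the next frame; that assumption, and the function name subframe_windows, are not taken from the source.

```python
FRAME_LEN = 1024  # samples per frame
WIN_LEN = 128     # rectangular analysis window
HOP = 64          # shift between consecutive windows


def subframe_windows(buf, frame_start=0):
    """Yield the 16 analysis windows of the frame beginning at frame_start.

    buf is assumed to hold the frame plus at least WIN_LEN - HOP samples
    of look-ahead, so that the last window is complete.
    """
    for k in range(FRAME_LEN // HOP):
        start = frame_start + k * HOP
        yield buf[start:start + WIN_LEN]
```

A rectangular window applies no weighting, so each slice can be passed directly to the pitch period estimation step.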
Figure 9 is a schematic diagram of the logical structure of a speech signal pitch estimation system in which the pitch estimation apparatus of any of the examples in Figures 5 to 7 is used for pitch estimation of a speech signal. For ease of explanation, only the parts relevant to the embodiments of the present invention are shown. The speech signal pitch estimation system of Figure 9 comprises a preprocessing module 901, an unvoiced/voiced decision module 902, a center clipping module 903, a pitch period zeroing module 904, a sequence obtaining module 905, and a correcting module 906, where the preprocessing module 901, the sequence obtaining module 905, and the correcting module 906 may respectively be the preprocessing module, the sequence obtaining module, and the correcting module of the pitch estimation apparatus of any of the examples in Figures 5 to 7:
The preprocessing module 901 is configured to apply high-pass and low-pass filtering, mean removal, numerical smoothing, and the like to the input speech signal;
The unvoiced/voiced decision module 902 is configured to determine, from the short-time energy and the short-time zero-crossing rate, whether the speech signal processed by the preprocessing module 901 is voiced or unvoiced;
The center clipping module 903 is configured to remove the low-energy portions of the voiced signal;
The pitch period zeroing module 904 is configured to set the pitch period of the unvoiced signal to zero;
The sequence obtaining module 905 is configured to calculate the normalized cross-correlation sequence of the signal processed by the center clipping module 903 and to obtain, from that normalized cross-correlation sequence, its modified circular AMDF sequence;
The correcting module 906 is configured to correct the pitch delay estimate according to the delay corresponding to the maximum value, within the fundamental frequency range, of the modified circular AMDF sequence obtained by the sequence obtaining module 905, and to take the corrected delay estimate as the pitch period estimate of the signal.
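The front-end decisions of modules 902 to 904 can be sketched in a few lines of Python. This is a generic textbook formulation, not the procedure of the embodiment: the text names short-time energy, short-time zero-crossing rate and center clipping but gives no thresholds, so energy_thresh, zcr_thresh and the clipping ratio below are hypothetical values, and all function names are invented for illustration. Frames are assumed to contain at least two samples.

```python
def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(v * v for v in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def is_voiced(frame, energy_thresh=0.01, zcr_thresh=0.3):
    """Voiced frames: high energy and low zero-crossing rate (assumed rule)."""
    return (short_time_energy(frame) > energy_thresh
            and zero_crossing_rate(frame) < zcr_thresh)


def center_clip(frame, ratio=0.3):
    """Zero samples below ratio * peak and shift the rest toward zero."""
    c = ratio * max(abs(v) for v in frame)
    return [v - c if v > c else (v + c if v < -c else 0.0) for v in frame]


def gate_pitch(frame, pitch):
    """Pitch period zeroing: unvoiced frames get a pitch period of 0."""
    return pitch if is_voiced(frame) else 0
```

In such a pipeline, only frames for which is_voiced returns True would be center-clipped and passed on to the sequence computation; the others are assigned a pitch period of zero.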
A person of ordinary skill in the art will appreciate that all or some of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may include read-only memory (ROM, Read Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, or the like.
The pitch period estimate correction method, the pitch estimation method, and the related apparatus provided by the embodiments of the present invention have been described in detail above. Specific examples have been used herein to explain the principles and embodiments of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. Meanwhile, a person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (14)

1. A pitch period estimate correction method, characterized in that the method comprises:
Comparing the maximum value MA_max(i+1) of the modified circular AMDF sequence of the current subframe within the current frequency region with the weighted value of a first intermediate variable MA_max; if MA_max(i+1) is greater than the weighted value of MA_max, replacing the first intermediate variable MA_max and a second intermediate variable T_opt with MA_max(i+1) and the delay T(i+1) corresponding to MA_max(i+1), respectively; and repeating the comparison until the current frequency region is not within the fundamental frequency range;
Calculating the ratio of the second intermediate variable T_opt to T_pre_mid_o, the pitch period estimation median of an odd number of subframes preceding the current subframe; and, if the ratio is less than a first correction factor or greater than a second correction factor, and the maximum value MA_max0 of the modified circular AMDF sequence of the current subframe within the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and a second empirical factor, correcting the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0;
Applying median filtering to the pitch period estimates of an even number of subframes preceding the current subframe together with the second intermediate variable T_opt, and taking the value obtained from the median filtering as the pitch period estimate of a middle subframe;
Wherein the weighted value of the first intermediate variable MA_max is the product of MA_max and a first empirical factor; correcting the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0 means setting the current value of T_opt to T_0; the middle subframe is the subframe whose number is the median among the subframes formed by the even number of subframes preceding the current subframe and the current subframe; and the even number is the largest even number smaller than the odd number.
2. the method for claim 1, it is characterized in that, described fundamental frequency scope be [80Hz, 4000Hz), described frequency field is at [80Hz, first frequency zone [the 80Hz that divides 4000Hz), 160Hz), second frequency zone [160Hz, 320Hz), the 3rd frequency field [320Hz, 640Hz), the 4th frequency field [640Hz, 1280Hz) and the 5th frequency field [1280Hz, 4000Hz);
Described more current subframe is revised circular AMDF sequence maximal value MA in current frequency field Max(i+1) with the first intermediate variable MA MaxWeighted value comprise:
Default the first intermediate variable MA MaxValue for current subframe [80Hz, 160Hz) in correction circular AMDF sequence maximal value MA Max(1), default the second intermediate variable T OptValue is described MA Max(1) corresponding delay T(1);
If current subframe [160Hz, 320Hz) in revise circular AMDF sequence maximal value MA Max(2) greater than described MA MaxWeighted value, then respectively with described MA Max(2) and described MA Max(2) corresponding delay T(2) substitute described MA MaxWith described T Opt
If current subframe [320Hz, 640Hz) in revise circular AMDF sequence maximal value MA Max(3) greater than described MA MaxWeighted value, then respectively with described MA Max(3) and described MA Max(3) corresponding delay T(3) substitute described MA MaxWith described T Opt
If current subframe [640Hz, 1280Hz) in revise circular AMDF sequence maximal value MA Max(4) greater than described MA MaxWeighted value, then respectively with described MA Max(4) and described MA Max(4) corresponding delay T(4) substitute described MA MaxWith described T Opt
If current subframe [1280Hz, 4000Hz) in revise circular AMDF sequence maximal value MA Max(5) greater than described MA MaxWeighted value, then respectively with described MA Max(5) and described MA Max(5) corresponding delay T(5) substitute described MA MaxWith described T Opt
3. the method for claim 1 is characterized in that, the pitch period of odd number subframe is estimated intermediate value T before the described current subframe Pre_mid_oFor the pitch period of front 5 subframes of described current subframe is estimated intermediate value T Pre_mid_o
The pitch period of front 5 subframes of described current subframe is estimated intermediate value T Pre_mid_oFor being numbered the pitch period estimated value of the subframe of intermediate value in front 5 subframes of described current subframe;
Described the second intermediate variable T of described calculating OptEstimate intermediate value T with the pitch period of odd number subframe before the described current subframe Pre_mid_oRatio be:
Calculate described the second intermediate variable T OptRatio with the pitch period estimated value of the subframe that is numbered intermediate value in front 5 subframes of described current subframe.
4. the method for claim 1 is characterized in that, described T Pre_mid_oThe correction circular AMDF sequence maximal value MA of described current subframe in the nearby sphere Max0Greater than described the first intermediate variable MA MaxWith the product of the second experience factor be:
Described T Pre_mid_oThe correction circular AMDF sequence maximal value MA of described current subframe in contiguous 1 millisecond of scope Max0Greater than described the first intermediate variable MA MaxProduct with the second experience factor.
Describedly use described MA Max0Corresponding delay T 0Proofread and correct the second intermediate variable T OptWith the second intermediate variable T OptCurrency be set to T Pre_mid_oThe correction circular AMDF sequence maximal value MA of described current subframe in the nearby sphere Max0Corresponding delay T 0
5. the method for claim 1 is characterized in that, described pitch period estimated value and the second intermediate variable T with even number subframe before the described current subframe OptCarrying out medium filtering comprises:
Pitch period estimated value and the second intermediate variable T with front 4 subframes of described current subframe OptCarry out medium filtering.
6. such as the described method of claim 1 to 5 any one, it is characterized in that described the first experience factor is 0.95, described the first correction factor is 0.75, and described the second correction factor is 1.4, and described the second experience factor is 0.85.
7. A pitch period estimate correction apparatus, characterized in that the apparatus comprises:
A comparison module, configured to compare the maximum value MA_max(i+1) of the modified circular AMDF sequence of the current subframe within the current frequency region with the weighted value of a first intermediate variable MA_max; if MA_max(i+1) is greater than the weighted value of MA_max, to replace the first intermediate variable MA_max and a second intermediate variable T_opt with MA_max(i+1) and the delay T(i+1) corresponding to MA_max(i+1), respectively; and to repeat the comparison until the current frequency region is not within the fundamental frequency range;
A correction module, configured to calculate the ratio of the second intermediate variable T_opt to T_pre_mid_o, the pitch period estimation median of an odd number of subframes preceding the current subframe; and, if the ratio is less than a first correction factor or greater than a second correction factor, and the maximum value MA_max0 of the modified circular AMDF sequence of the current subframe within the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and a second empirical factor, to correct the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0;
A median filtering module, configured to apply median filtering to the pitch period estimates of an even number of subframes preceding the current subframe together with the second intermediate variable T_opt, and to take the pitch period estimate obtained from the median filtering as the pitch period estimate of a middle subframe;
Wherein the weighted value of the first intermediate variable MA_max is the product of MA_max and a first empirical factor; correcting the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0 means setting the current value of T_opt to T_0; the middle subframe is the subframe whose number is the median among the subframes formed by the even number of subframes preceding the current subframe and the current subframe; and the even number is the largest even number smaller than the odd number.
8. The apparatus of claim 7, characterized in that the fundamental frequency range is [80 Hz, 4000 Hz), and the frequency regions are the following regions divided within [80 Hz, 4000 Hz): a first frequency region [80 Hz, 160 Hz), a second frequency region [160 Hz, 320 Hz), a third frequency region [320 Hz, 640 Hz), a fourth frequency region [640 Hz, 1280 Hz), and a fifth frequency region [1280 Hz, 4000 Hz);
The comparison module is specifically configured to:
Preset the value of the first intermediate variable MA_max to MA_max(1), the maximum value of the modified circular AMDF sequence of the current subframe within [80 Hz, 160 Hz), and preset the value of the second intermediate variable T_opt to the delay T(1) corresponding to MA_max(1);
If the maximum value MA_max(2) of the modified circular AMDF sequence of the current subframe within [160 Hz, 320 Hz) is greater than the weighted value of MA_max, replace MA_max and T_opt with MA_max(2) and the delay T(2) corresponding to MA_max(2), respectively;
If the maximum value MA_max(3) of the modified circular AMDF sequence of the current subframe within [320 Hz, 640 Hz) is greater than the weighted value of MA_max, replace MA_max and T_opt with MA_max(3) and the delay T(3) corresponding to MA_max(3), respectively;
If the maximum value MA_max(4) of the modified circular AMDF sequence of the current subframe within [640 Hz, 1280 Hz) is greater than the weighted value of MA_max, replace MA_max and T_opt with MA_max(4) and the delay T(4) corresponding to MA_max(4), respectively;
If the maximum value MA_max(5) of the modified circular AMDF sequence of the current subframe within [1280 Hz, 4000 Hz) is greater than the weighted value of MA_max, replace MA_max and T_opt with MA_max(5) and the delay T(5) corresponding to MA_max(5), respectively.
9. The apparatus of claim 7, characterized in that the pitch period estimation median T_pre_mid_o of the odd number of subframes preceding the current subframe is the pitch period estimation median T_pre_mid_o of the 5 subframes preceding the current subframe;
The pitch period estimation median T_pre_mid_o of the 5 subframes preceding the current subframe is the pitch period estimate of the subframe whose number is the median among the 5 subframes preceding the current subframe;
The maximum value MA_max0 of the modified circular AMDF sequence of the current subframe within the neighborhood of T_pre_mid_o being greater than the product of the first intermediate variable MA_max and the second empirical factor is the maximum value MA_max0 of the modified circular AMDF sequence of the current subframe within 1 millisecond of T_pre_mid_o being greater than the product of the first intermediate variable MA_max and the second empirical factor;
Correcting the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0 means setting the current value of T_opt to the delay T_0 corresponding to MA_max0;
The correction module is specifically configured to:
Calculate the ratio of the second intermediate variable T_opt to the pitch period estimate of the subframe whose number is the median among the 5 subframes preceding the current subframe;
If the ratio is less than the first correction factor or greater than the second correction factor, and the maximum value MA_max0 of the modified circular AMDF sequence of the current subframe within 1 millisecond of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and the second empirical factor, set the current value of the second intermediate variable T_opt to the delay T_0 corresponding to MA_max0.
10. The apparatus of claim 7, characterized in that the median filtering module is specifically configured to:
Apply median filtering to the pitch period estimates T_pre_e of the 4 subframes preceding the current subframe together with the second intermediate variable T_opt.
11. A pitch estimation method, characterized in that the method comprises:
Preprocessing a received signal, the signal comprising a speech signal or an audio signal;
Calculating the normalized cross-correlation sequence of the preprocessed signal, and obtaining, from that normalized cross-correlation sequence, its modified circular AMDF sequence;
Correcting the pitch delay estimate according to the delay corresponding to the maximum value of the modified circular AMDF sequence within the fundamental frequency range, and taking the corrected delay estimate as the pitch period estimate of the signal;
Correcting the pitch delay estimate according to the delay corresponding to the maximum value of the modified circular AMDF sequence within the fundamental frequency range comprises:
Comparing the maximum value MA_max(i+1) of the modified circular AMDF sequence of the current subframe within the current frequency region with the weighted value of a first intermediate variable MA_max; if MA_max(i+1) is greater than the weighted value of MA_max, replacing the first intermediate variable MA_max and a second intermediate variable T_opt with MA_max(i+1) and the delay corresponding to MA_max(i+1), respectively; and repeating the comparison until the current frequency region is not within the fundamental frequency range;
Calculating the ratio of the second intermediate variable T_opt to T_pre_mid_o, the pitch period estimation median of an odd number of subframes preceding the current subframe; and, if the ratio is less than a first correction factor or greater than a second correction factor, and the maximum value MA_max0 of the modified circular AMDF sequence of the current subframe within the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and a second empirical factor, correcting the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0;
Applying median filtering to the pitch period estimates of an even number of subframes preceding the current subframe together with the second intermediate variable T_opt, and taking the value obtained from the median filtering as the pitch period estimate of a middle subframe;
Wherein the weighted value of the first intermediate variable MA_max is the product of MA_max and a first empirical factor; correcting the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0 means setting the current value of T_opt to T_0; the middle subframe is the subframe whose number is the median among the subframes formed by the even number of subframes preceding the current subframe and the current subframe; and the even number is the largest even number smaller than the odd number.
12. The method of claim 11, characterized in that preprocessing the received signal comprises distinguishing silent frames from non-silent frames of the signal;
Calculating the normalized cross-correlation sequence of the preprocessed signal and obtaining the modified circular AMDF sequence of that normalized cross-correlation sequence is:
Calculating the normalized cross-correlation sequence of the preprocessed non-silent-frame signal, and obtaining, from the normalized cross-correlation sequence of the non-silent-frame signal, its modified circular AMDF sequence.
13. A pitch estimation apparatus, characterized in that the apparatus comprises:
A preprocessing module, configured to preprocess a received signal, the signal comprising a speech signal or an audio signal;
A sequence obtaining module, configured to calculate the normalized cross-correlation sequence of the preprocessed signal and to obtain, from that normalized cross-correlation sequence, its modified circular AMDF sequence;
A correcting module, configured to correct the pitch delay estimate according to the delay corresponding to the maximum value of the modified circular AMDF sequence within the fundamental frequency range, and to take the corrected delay estimate as the pitch period estimate of the signal;
The correcting module comprises:
A comparing unit, configured to compare the maximum value MA_max(i+1) of the modified circular AMDF sequence of the current subframe within the current frequency region with the weighted value of a first intermediate variable MA_max; if MA_max(i+1) is greater than the weighted value of MA_max, to replace the first intermediate variable MA_max and a second intermediate variable T_opt with MA_max(i+1) and the delay T(i+1) corresponding to MA_max(i+1), respectively; and to repeat the comparison until the current frequency region is not within the fundamental frequency range;
A correcting unit, configured to calculate the ratio of the second intermediate variable T_opt to T_pre_mid_o, the pitch period estimation median of an odd number of subframes preceding the current subframe; and, if the ratio is less than a first correction factor or greater than a second correction factor, and the maximum value MA_max0 of the modified circular AMDF sequence of the current subframe within the neighborhood of T_pre_mid_o is greater than the product of the first intermediate variable MA_max and a second empirical factor, to correct the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0;
A median filtering unit, configured to apply median filtering to the pitch period estimates of an even number of subframes preceding the current subframe together with the second intermediate variable T_opt, and to take the pitch period estimate obtained from the median filtering as the pitch period estimate of a middle subframe;
Wherein the weighted value of the first intermediate variable MA_max is the product of MA_max and a first empirical factor; correcting the second intermediate variable T_opt with the delay T_0 corresponding to MA_max0 means setting the current value of T_opt to T_0; the middle subframe is the subframe whose number is the median among the subframes formed by the even number of subframes preceding the current subframe and the current subframe; and the even number is the largest even number smaller than the odd number.
14. The apparatus of claim 13, characterized in that the preprocessing module further comprises a decision unit, configured to distinguish silent frames from non-silent frames of the signal when the received signal is preprocessed;
The sequence obtaining module is specifically configured to:
Calculate the normalized cross-correlation sequence of the preprocessed non-silent-frame signal, and obtain, from the normalized cross-correlation sequence of the non-silent-frame signal, its modified circular AMDF sequence.
CN2011101182662A 2011-05-09 2011-05-09 Fundamental tone period estimated value correction method, fundamental tone estimation method and related apparatus Expired - Fee Related CN102231274B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011101182662A CN102231274B (en) 2011-05-09 2011-05-09 Fundamental tone period estimated value correction method, fundamental tone estimation method and related apparatus


Publications (2)

Publication Number Publication Date
CN102231274A (en) 2011-11-02
CN102231274B (en) 2013-04-17

Family

ID=44843834


Country Status (1)

Country Link
CN (1) CN102231274B (en)



Similar Documents

Publication Publication Date Title
Serra et al. Spectral modeling synthesis: A sound analysis/synthesis system based on a deterministic plus stochastic decomposition
CN101183527B (en) Method and apparatus for encoding and decoding high frequency signal
CN101770776B (en) Coding method and device, decoding method and device for instantaneous signal and processing system
CN102231274B (en) Fundamental tone period estimated value correction method, fundamental tone estimation method and related apparatus
CN105359209A (en) Apparatus and method for improved signal fade out in different domains during error concealment
Shahnaz et al. Pitch estimation based on a harmonic sinusoidal autocorrelation model and a time-domain matching scheme
CN104969290A (en) Method and apparatus for controlling audio frame loss concealment
CN104538011A (en) Tone adjusting method and device and terminal device
CN104919524A (en) Method and apparatus for determining encoding mode, method and apparatus for encoding audio signals, and method and apparatus for decoding audio signals
Marafioti et al. Audio inpainting of music by means of neural networks
WO2012131438A1 (en) A low band bandwidth extender
CN104299614A (en) Decoding method and decoding device
CN103426441A (en) Method and device for detecting correctness of pitch period
CN103295580A (en) Method and device for suppressing noise of voice signals
CN115171709A (en) Voice coding method, voice decoding method, voice coding device, voice decoding device, computer equipment and storage medium
JP3558031B2 (en) Speech decoding device
Kumar et al. A new pitch detection scheme based on ACF and AMDF
CN105393303A (en) Speech signal processing device, speech signal processing method, and speech signal processing program
CN101609681A (en) Coding method, scrambler, coding/decoding method and demoder
Dressler Automatic transcription of the melody from polyphonic music
CN107507610B (en) Chinese tone recognition method based on vowel fundamental frequency information
CN116013343A (en) Speech enhancement method, electronic device and storage medium
CN100487790C (en) Method and device for selecting self-adapting codebook excitation signal
CN115881157A (en) Audio signal processing method and related equipment
Mallidi et al. Robust speaker recognition using spectro-temporal autoregressive models.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130417

Termination date: 20190509