CN100541609C - Method and apparatus for open-loop pitch search - Google Patents

Publication number
CN100541609C
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CNB2006101397038A
Other languages
Chinese (zh)
Other versions
CN101149924A (en)
Inventor
胡瑞敏
刘霖
杨玉红
张勇
王庭红
马付伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Wuhan University WHU
Original Assignee
Huawei Technologies Co Ltd
Wuhan University WHU
Application filed by Huawei Technologies Co Ltd and Wuhan University WHU
Priority to CNB2006101397038A
Publication of CN101149924A
Application granted
Publication of CN100541609C

Abstract

The invention discloses a method and an apparatus for open-loop pitch search. The apparatus comprises an autocorrelation function computing unit, a global pitch period reference computing unit, and a pitch period determining unit. The method comprises: computing the autocorrelation function of a speech signal; determining a current global pitch period reference from the autocorrelation results; and determining the pitch period of the speech signal according to the current global pitch period reference. The invention reduces the computational complexity and storage overhead of the open-loop pitch search.

Description

Method and apparatus for open-loop pitch search
Technical field
The present invention relates to speech coding technology, and in particular to a method and an apparatus for open-loop pitch search.
Background art
The pitch period is the period of vocal cord vibration during speech. Pitch period estimation is an important problem in speech coding: its accuracy directly affects the coding quality and efficiency of a speech coder. Accurate pitch analysis effectively removes redundancy from speech and reduces the number of coded bits, enabling high-quality speech coding at low bit rates.
Many pitch detection algorithms exist for determining the pitch period accurately. In the time domain, traditional algorithms include pitch estimation based on the average magnitude difference function (AMDF) and pitch detection based on the short-time autocorrelation function (ACF). In the frequency domain, one pitch estimation scheme is used in multi-band excitation (MBE) speech coding; it uses closed-loop analysis-by-synthesis and matches the frequency-domain waveform of the signal to obtain the optimal pitch period estimate.
In practice, time-domain pitch search algorithms are widely used because they are simple and perform well. For example, the current wideband speech coding standard AMR-WB+ uses an improved time-domain short-time autocorrelation function (ACF) pitch detection algorithm. AMR-WB+ searches for the pitch period with a weighted correlation function, and the procedure mainly comprises signal preprocessing, open-loop pitch search, and closed-loop pitch search.
In the open-loop pitch search, AMR-WB+ uses the traditional time-domain correlation function to obtain the pitch period. The correlation function is computed as:
$$T_0 = \arg\max_{\mathrm{delay}=\mathrm{lower}}^{\mathrm{upper}} \left( \mathrm{corr} = \sum_{n=0}^{63} s(n)\, s(n-\mathrm{delay}) \right) \tag{1.1}$$
In formula 1.1, corr is the correlation value, s(n) is the perceptually weighted speech signal, delay is the pitch period candidate under search, and $T_0$ is the delay at which the correlation function reaches its maximum. The correlation computed by formula 1.1 reflects the similarity between the sequence s(n), whose subframe length is 64 samples, and the sequence s(n-delay) delayed by delay samples. The $T_0$ that maximizes corr is the pitch period obtained by the open-loop pitch search.
Because speech is not a perfectly periodic signal, and because it is subject to interference such as vocal tract formants and external noise, the pitch period obtained directly from the above correlation function deviates from the true value, and two problems arise frequently: the period-multiple problem and the pitch smoothness problem.
Regarding the period-multiple problem: referring to Figure 1A, in the ideal case the input speech signal is strictly periodic, and for pitch period candidates of $T_0$, $2T_0$, and $3T_0$ the correlation values are all identical and equal to the maximum $\mathrm{corr}_{max}$. Referring to Figure 1B, however, because real speech is not strictly periodic, the true pitch period may be $T_0$ while the pitch period obtained from formula 1.1 is $3T_0$; this is the period-multiple problem.
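The ideal case of Figure 1A can be reproduced with a short Python sketch. It assumes a pure sine wave as the strictly periodic signal; the function name `corr`, the period of 32 samples, and the buffer layout are illustrative choices, not values from the text.

```python
import math

def corr(s, start, delay, length=64):
    """Unnormalized correlation of formula 1.1 over one 64-sample subframe."""
    return sum(s[start + n] * s[start + n - delay] for n in range(length))

T0 = 32                                   # hypothetical true pitch period, in samples
signal = [math.sin(2 * math.pi * n / T0) for n in range(256)]
start = 128                               # subframe start; leaves room for the longest delay

# For a strictly periodic signal the correlation is the same at T0, 2*T0 and 3*T0,
# so the arg max of formula 1.1 cannot tell the true period from its multiples.
values = {d: corr(signal, start, d) for d in (T0, 2 * T0, 3 * T0)}
half_period = corr(signal, start, T0 // 2)  # anti-phase delay gives a strongly negative value
```

With real speech the three values differ slightly, and whichever happens to be largest wins, which is how a multiple of the true period can be selected.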
Regarding the pitch smoothness problem: within voiced segments of speech, the pitch period changes only by a limited amount between adjacent frames. In general, the pitch period of two adjacent voiced frames changes by no more than 10%, and almost never by more than 25%. The pitch search must therefore take pitch smoothness into account, so that noise or other factors do not cause large changes in the pitch period between adjacent frames.
At present, AMR-WB+ addresses the period-multiple problem and the pitch smoothness problem by weighting the correlation function. That is, the correlation of formula 1.1 is multiplied by a weighting function:
$$\mathrm{corr\_w} = \sum_{n=0}^{63} s(n)\, s(n-\mathrm{delay})\, w(\mathrm{delay}) \tag{1.2}$$
$$T_0 = \arg\max_{\mathrm{delay}=\mathrm{lower}}^{\mathrm{upper}} \left( \mathrm{corr\_w}(\mathrm{delay}) \right) \tag{1.3}$$
Here s(n) is the preprocessed input speech signal, delay is the pitch period candidate under search, and w(delay) is a weighting function of delay; the weighted correlation is the correlation function of formula (1.1) multiplied by w(delay).
The weighting function w(delay) consists of two parts:
$$w(\mathrm{delay}) = w_l(\mathrm{delay})\, w_n(\mathrm{delay}) \tag{1.4}$$
where $w_l(\mathrm{delay})$, which addresses the period-multiple problem, is set to:
$$w_l(\mathrm{delay}) = cw(\mathrm{delay}) \tag{1.5}$$
and $w_n(\mathrm{delay})$ is set to:
$$w_n(\mathrm{delay}) = \begin{cases} cw(|T_{old} - \mathrm{delay}| + 98), & v > 0.8 \\ 1.0, & \text{otherwise} \end{cases} \tag{1.6}$$
Here cw(·) is the weighting function used above, $T_{old}$ is the mean pitch period of the past 5 frames, and v reflects a decision on the open-loop gain. v is set to:
$$v = \begin{cases} 1.0, & \mathrm{gain} > 0.6 \\ 0.9\,v, & \text{otherwise} \end{cases} \tag{1.7}$$
where gain is the open-loop gain, computed as:
$$\mathrm{gain} = \frac{\sum_{n=0}^{63} s(n)\, s(n-T_0)}{\sqrt{\sum_{n=0}^{63} s^2(n) \sum_{n=0}^{63} s^2(n-T_0)}} \tag{1.8}$$
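The interplay of formulas 1.6 to 1.8 can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the table `cw` is passed in by the caller because its contents are not given in the text, and the gain is written in the normalized form with a square root over the energy product, which is an assumption about the garbled formula 1.8.

```python
import math

def open_loop_gain(s, start, t0, length=64):
    """Open-loop gain of formula 1.8 (normalized form assumed)."""
    num = sum(s[start + n] * s[start + n - t0] for n in range(length))
    e_cur = sum(s[start + n] ** 2 for n in range(length))
    e_del = sum(s[start + n - t0] ** 2 for n in range(length))
    den = math.sqrt(e_cur * e_del)
    return num / den if den > 0 else 0.0

def update_v(v, gain):
    """Formula 1.7: reset v on a strongly periodic frame, otherwise let it decay."""
    return 1.0 if gain > 0.6 else 0.9 * v

def w_n(delay, t_old, v, cw):
    """Formula 1.6: favor delays near the recent pitch only while v is still high."""
    if v > 0.8:
        return cw[abs(t_old - delay) + 98]
    return 1.0
```

Starting from v = 1.0, two consecutive low-gain frames give v = 0.9 and then 0.81, so the smoothness weighting stays active; a third low-gain frame drops v to 0.729 and switches it off.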
As the above description shows, to address the period-multiple and pitch smoothness problems the prior art must weight every autocorrelation value that is computed. For real speech, however, not every pitch period candidate is affected by the period-multiple problem, and the candidates often include values far from the true pitch period. Weighting every autocorrelation value indiscriminately therefore increases the complexity of the open-loop pitch search needlessly.
In addition, the prior art computes a non-normalized autocorrelation function in the open-loop pitch search, as in formula 1.1: $\mathrm{corr} = \sum_{n=0}^{63} s(n)\, s(n-\mathrm{delay})$. The main purpose of the open-loop pitch search is to determine a narrower pitch period search range for the closed-loop pitch search. In the closed-loop pitch search, however, the criterion for the optimal delay is
$$T_k = \frac{\sum_{n=0}^{63} x(n)\, y_k(n)}{\sqrt{\sum_{n=0}^{63} y_k^2(n)}} \tag{1.9}$$
where x(n) is the target signal and $y_k(n)$ is the past excitation at a given delay k. Formula 1.9 embodies the minimum mean square error criterion between the delayed signal and the target signal. A non-normalized autocorrelation function in the open-loop pitch search therefore matches the minimum mean square error criterion of the subsequent closed-loop pitch search poorly, so the pitch period determined by the open-loop search deviates more and the efficiency of the open-loop pitch search is greatly reduced.
Summary of the invention
In view of this, the main object of the invention is to provide a method for open-loop pitch search, and another object is to provide an apparatus for open-loop pitch search, so as to reduce the complexity of the open-loop pitch search.
To achieve the above objects, the technical solution of the invention is as follows:
A method for open-loop pitch search, comprising:
computing the autocorrelation function of a speech signal;
determining a current global pitch period reference from the autocorrelation results;
determining the pitch period of the speech signal according to the current global pitch period reference.
The step of computing the autocorrelation function of the speech signal is specifically: computing the normalized autocorrelation function of the speech signal.
The step of computing the normalized autocorrelation function is specifically: computing
$$\mathrm{corr} = \frac{\sum_{n=0}^{M-1} s(n)\, s(n-\mathrm{delay})}{\sqrt{\sum_{n=0}^{M-1} s^2(n) \sum_{n=0}^{M-1} s^2(n-\mathrm{delay})}}$$
where M is the subframe length, corr is the normalized autocorrelation value, s(n) is the perceptually weighted speech signal, and delay is the pitch period candidate under search.
The step of determining the current global pitch period reference is specifically: obtaining a best pitch period candidate from the autocorrelation results; judging whether a reliable global pitch period reference can be determined in the current frame from the best pitch period candidate; if so, taking the best pitch period candidate as the current global pitch period reference; otherwise, judging whether the global pitch period reference determined in a preceding frame has become invalid; if it has, setting the current global pitch period reference to zero; if it has not, taking the global pitch period reference determined in the preceding frame as the current global pitch period reference.
After it is judged that the reference has not become invalid, and before the reference determined in the preceding frame is taken as the current global pitch period reference, the method further comprises: judging whether the autocorrelation maxima of two consecutive frames, the previous frame and the current frame, are less than a set threshold; if so, setting the current global pitch period reference to zero; otherwise, proceeding with the step of taking the reference determined in the preceding frame as the current global pitch period reference.
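The decision sequence above can be sketched as a small Python function. The function and parameter names and the threshold value 0.4 are hypothetical; the reliability test and the validity test of the previous reference are treated as inputs computed elsewhere.

```python
def update_global_reference(best_candidate, reliable, prev_reference,
                            prev_invalid, prev_max_corr, cur_max_corr,
                            corr_threshold=0.4):
    """Return the current global pitch period reference (0 means 'no reference')."""
    if reliable:
        # A reliable reference can be set from the current frame alone.
        return best_candidate
    if prev_invalid:
        # The reference carried over from a preceding frame is no longer usable.
        return 0
    if prev_max_corr < corr_threshold and cur_max_corr < corr_threshold:
        # Two consecutive weakly periodic frames: drop the reference.
        return 0
    # Otherwise keep the reference determined in the preceding frame.
    return prev_reference
```

Returning zero rather than a stale value lets later steps detect that no trustworthy reference exists.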
The step of obtaining the best pitch period candidate from the autocorrelation results is specifically: from the autocorrelation results, selecting the several largest autocorrelation values within the lower and upper bounds of the pitch search range, together with their corresponding pitch period candidates; weighting the selected sequence of autocorrelation values; and obtaining the best pitch period candidate from the weighted sequence.
The step of weighting the selected autocorrelation values is specifically: for each selected pitch period candidate, judging whether the difference between the candidate and the global pitch period reference determined in the preceding frame is less than a preset threshold; if so, applying a fixed weight to the autocorrelation value of that candidate; otherwise, applying no fixed weight to it.
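A minimal Python sketch of this fixed weighting follows. The candidate list is assumed to hold (delay, autocorrelation value) pairs; the name `weight_candidates`, the distance threshold of 5 samples, and the weight 1.2 are hypothetical, since the text gives no values.

```python
def weight_candidates(candidates, prev_reference, max_diff=5, weight=1.2):
    """Boost candidates that lie close to the previous global pitch period reference."""
    weighted = []
    for delay, value in candidates:
        if abs(delay - prev_reference) < max_diff:
            value = value * weight      # fixed weight for candidates near the reference
        weighted.append((delay, value))
    return weighted
```

Only the few selected candidates are touched, in contrast to the prior-art approach of weighting every delay in the search range.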
After the weighting, and before the best pitch period candidate is obtained, the method further comprises: applying period-multiple removal to the weighted sequence of autocorrelation values;
the best pitch period candidate is then obtained from the sequence of autocorrelation values after period-multiple removal.
The step of obtaining the best pitch period candidate through period-multiple removal is specifically: scaling the autocorrelation value of each selected pitch period candidate; obtaining a current provisional best pitch period candidate from the scaled autocorrelation values; judging whether the pitch period candidate corresponding to the current autocorrelation maximum is a multiple of the current provisional best candidate; if so, taking the provisional best candidate as the best pitch period candidate; otherwise, taking the candidate corresponding to the current autocorrelation maximum as the best pitch period candidate.
Before the scaling, the method further comprises: setting the pitch period candidate corresponding to the current autocorrelation maximum as the current provisional best pitch period candidate;
The step of obtaining the provisional best pitch period candidate from the scaled autocorrelation values is specifically: examining each selected pitch period candidate in turn; if the candidate under examination is smaller than the current provisional best candidate, and its autocorrelation value is greater than the pre-scaling autocorrelation maximum divided by the scaling factor, setting the candidate under examination as the current provisional best pitch period candidate.
The step of scaling the autocorrelation value of each selected pitch period candidate is specifically: for each selected candidate, judging whether the candidate is greater than a set threshold; if so, dividing its autocorrelation value by a larger scaling factor; otherwise, dividing it by a smaller scaling factor.
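The period-multiple removal described above can be sketched in Python under one possible reading of the text: every candidate value is divided by a delay-dependent scaling factor, and a shorter candidate replaces the provisional best when its scaled value exceeds the scaled maximum. That reading, the names, the tolerance used to recognize a multiple, and all numeric values are assumptions.

```python
def is_multiple(long_delay, short_delay, tol=2):
    """True if long_delay is roughly k * short_delay for some integer k >= 2."""
    k = round(long_delay / short_delay)
    return k >= 2 and abs(long_delay - k * short_delay) <= tol

def remove_multiples(candidates, delay_threshold=60,
                     small_factor=1.0, large_factor=1.2):
    """candidates: list of (delay, autocorrelation value). Returns the best delay."""
    def scaled(delay, value):
        factor = large_factor if delay > delay_threshold else small_factor
        return value / factor

    max_delay, max_value = max(candidates, key=lambda c: c[1])
    pending = max_delay                       # provisional best candidate
    bar = scaled(max_delay, max_value)        # scaled autocorrelation maximum
    for delay, value in candidates:
        if delay < pending and scaled(delay, value) > bar:
            pending = delay
    # Accept the shorter candidate only if the maximum sits at one of its multiples.
    return pending if is_multiple(max_delay, pending) else max_delay
```

For example, with candidates (40, 0.80) and (80, 0.85) the sketch returns 40, because 80 is a multiple of 40 and the shorter delay is nearly as strong; with (45, 0.75) and (80, 0.85) it keeps 80, because 80 is not a multiple of 45.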
The step of judging whether a reliable global pitch period reference can be determined in the current frame is specifically:
judging whether the pitch period candidate corresponding to the current autocorrelation maximum is a multiple of the best pitch period candidate, and whether the difference between the best pitch period candidate and the global pitch period reference determined in the preceding frame is less than a set threshold; if both conditions hold, determining that a reliable global pitch period reference can be determined in the current frame;
or, judging whether, in the current sequence of autocorrelation values, the difference between the autocorrelation maximum and any other autocorrelation value is greater than a set threshold; if so, determining that a reliable global pitch period reference can be determined in the current frame;
or, judging whether the current sequence of pitch period candidates contains one or more candidates that are multiples of the best pitch period candidate; if so, determining that a reliable global pitch period reference can be determined in the current frame;
or, judging whether the global pitch period reference determined in the preceding frame is a multiple of the currently determined best pitch period candidate, and whether the current autocorrelation maximum is greater than a set threshold; if both conditions hold, determining that a reliable global pitch period reference can be determined in the current frame.
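The four alternative conditions can be combined in a Python sketch such as the one below. It is an illustration under stated assumptions: the thresholds and names are hypothetical, "a multiple" is tested with a small tolerance, and the second condition is read as requiring the maximum to exceed every other selected value by the threshold.

```python
def is_multiple(long_delay, short_delay, tol=2):
    """True if long_delay is roughly k * short_delay for some integer k >= 2."""
    k = round(long_delay / short_delay)
    return k >= 2 and abs(long_delay - k * short_delay) <= tol

def is_reliable(candidates, best, prev_reference,
                diff_threshold=5, gap_threshold=0.2, corr_threshold=0.8):
    """candidates: list of (delay, autocorrelation value); best: best candidate delay."""
    max_delay, max_value = max(candidates, key=lambda c: c[1])
    others = [v for d, v in candidates if d != max_delay]

    cond1 = (is_multiple(max_delay, best)
             and abs(best - prev_reference) < diff_threshold)
    cond2 = bool(others) and all(max_value - v > gap_threshold for v in others)
    cond3 = any(is_multiple(d, best) for d, _ in candidates)
    cond4 = (prev_reference > 0 and is_multiple(prev_reference, best)
             and max_value > corr_threshold)
    return cond1 or cond2 or cond3 or cond4
```

Any single condition is enough, which matches the "or" structure of the alternatives above.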
The step of determining the pitch period of the speech signal according to the current global pitch period reference is specifically: judging whether the difference between the best pitch period candidate and the current global pitch period reference is less than a set threshold; if so, taking the best pitch period candidate as the pitch period of the speech signal;
or, judging whether the current autocorrelation maximum in the sequence of autocorrelation values is less than a set threshold; if so, taking the pitch period candidate corresponding to the current autocorrelation maximum as the pitch period of the speech signal.
Alternatively, the step of determining the pitch period of the speech signal according to the current global pitch period reference is specifically:
determining a pitch period determination reference value from the global pitch period reference; selecting, from the current sequence of pitch period candidates, the candidate with the smallest difference from the reference value; doubling the autocorrelation value of the selected candidate; then selecting the current autocorrelation maximum from the sequence of autocorrelation values and taking the corresponding pitch period candidate as the pitch period of the speech signal.
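The second alternative can be sketched in Python as follows. The names are hypothetical, and skipping the boost when the reference value is zero is an assumption of this sketch; the text does not say how a zero reference value is handled at this point.

```python
def determine_pitch(candidates, reference_value):
    """candidates: list of (delay, autocorrelation value). Returns the chosen delay."""
    values = [value for _, value in candidates]
    if reference_value != 0:
        # Find the candidate closest to the reference value and double its value.
        nearest = min(range(len(candidates)),
                      key=lambda i: abs(candidates[i][0] - reference_value))
        values[nearest] *= 2
    # The pitch period is the delay whose (possibly boosted) value is largest.
    winner = max(range(len(candidates)), key=lambda i: values[i])
    return candidates[winner][0]
```

Doubling is a strong bias: a candidate near the reference wins unless another candidate has more than twice its autocorrelation value.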
The step of determining the pitch period determination reference value from the global pitch period reference is specifically:
judging whether the determined global pitch period reference is non-zero; if so, taking the global pitch period reference as the pitch period determination reference value; otherwise, judging whether the last non-zero global pitch period reference has become invalid; if it has, setting the pitch period determination reference value to zero; if it has not, taking the last non-zero global pitch period reference as the pitch period determination reference value.
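As a brief Python sketch of this fallback, with hypothetical names and with the validity of the last non-zero reference supplied by the caller (this passage does not define how validity is tracked):

```python
def determination_reference(current_reference, last_nonzero_reference,
                            last_nonzero_invalid):
    """Choose the reference value used when picking the final pitch period."""
    if current_reference != 0:
        return current_reference          # a usable reference exists for this frame
    if last_nonzero_invalid:
        return 0                          # nothing trustworthy to fall back on
    return last_nonzero_reference         # reuse the most recent non-zero reference
```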
An apparatus for open-loop pitch search, comprising an autocorrelation function computing unit, a global pitch period reference computing unit, and a pitch period determining unit, wherein:
the autocorrelation function computing unit computes the autocorrelation function of the speech signal; specifically, it computes
$$\mathrm{corr} = \frac{\sum_{n=0}^{M-1} s(n)\, s(n-\mathrm{delay})}{\sqrt{\sum_{n=0}^{M-1} s^2(n) \sum_{n=0}^{M-1} s^2(n-\mathrm{delay})}}$$
to obtain the normalized autocorrelation function of the speech signal, where M is the subframe length, corr is the normalized autocorrelation value, s(n) is the perceptually weighted speech signal, and delay is the pitch period candidate under search, and outputs the autocorrelation results to the global pitch period reference computing unit;
the global pitch period reference computing unit determines the current global pitch period reference from the received autocorrelation results and outputs the determined current global pitch period reference to the pitch period determining unit;
the pitch period determining unit determines the pitch period of the speech signal according to the received current global pitch period reference.
The autocorrelation function computing unit computes the normalized autocorrelation function and outputs the normalized autocorrelation results to the global pitch period reference computing unit;
the global pitch period reference computing unit determines the current global pitch period reference from the normalized autocorrelation results.
The global pitch period reference computing unit weights the several larger autocorrelation values among those received, obtains the best pitch period candidate from the weighted sequence of autocorrelation values, and determines the current global pitch period reference from the best pitch period candidate.
The global pitch period reference computing unit applies period-multiple removal to the several larger autocorrelation values among those received, obtains the best pitch period candidate from the sequence of autocorrelation values after period-multiple removal, and determines the current global pitch period reference from the best pitch period candidate.
The invention therefore has the following advantages:
1. The autocorrelation function of the invention has the following two advantages:
(a) maximizing this autocorrelation value yields the same pitch period candidate as solving for the minimum mean square error between the original signal and the delayed signal; this candidate is statistically more accurate and is consistent with the integer pitch search in the closed-loop pitch search;
(b) the autocorrelation function is normalized, so the autocorrelation values can be analyzed by class in order to solve the period-multiple problem, to weight for pitch smoothness, and to judge how strongly periodic the speech is before the pitch period is finally determined.
2. The invention sets a global pitch period reference track as a measure of the global variation of the pitch period, thereby smoothing the pitch period.
3. Classified pitch period analysis is used, so the pitch period is determined adaptively to the signal.
4. After the autocorrelation values are computed, the invention does not apply the subsequent processing, such as period-multiple removal and pitch smoothness processing, to all of them; it selects only the subset of autocorrelation values most useful for determining the pitch period. This greatly simplifies the processing and reduces the computational complexity and storage overhead of the open-loop pitch search.
Description of drawings
Figure 1A is the correlation function curve of a strictly periodic signal.
Figure 1B is the correlation function curve of a real speech signal.
Fig. 2 is a structural diagram of the apparatus for open-loop pitch search according to the invention.
Fig. 3 is a flowchart of the open-loop pitch search in an embodiment of the invention.
Fig. 4 is a flowchart of determining the current global pitch period reference in an embodiment of the invention.
Fig. 5 is a flowchart of removing the period-multiple effect in an embodiment of the invention.
Embodiments
The invention proposes a method for open-loop pitch search whose core idea is: computing the autocorrelation function of a speech signal; determining a current global pitch period reference from the autocorrelation results; and determining the pitch period of the speech signal according to the current global pitch period reference.
Correspondingly, the invention also proposes an apparatus for open-loop pitch search. Fig. 2 is a structural diagram of this apparatus. Referring to Fig. 2, the apparatus comprises an autocorrelation function computing unit, a global pitch period reference computing unit, and a pitch period determining unit, wherein:
the autocorrelation function computing unit computes the autocorrelation function of the speech signal and outputs the autocorrelation results to the global pitch period reference computing unit;
the global pitch period reference computing unit determines the current global pitch period reference from the received autocorrelation results and outputs the determined current global pitch period reference to the pitch period determining unit;
the pitch period determining unit determines the pitch period of the speech signal according to the received current global pitch period reference.
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 3 is a flowchart of the open-loop pitch search in an embodiment of the invention. Referring to Fig. 2 and Fig. 3, the process of performing open-loop pitch search with the apparatus and method of the invention comprises the following steps:
Step 301: Define the normalized autocorrelation function.
So that the open-loop pitch search better matches the minimum mean square error criterion required by the subsequent closed-loop pitch search, step 301 preferably defines a normalized autocorrelation function, which can be expressed by formula 2.1:
$$\mathrm{corr} = \frac{\sum_{n=0}^{M-1} s(n)\, s(n-\mathrm{delay})}{\sqrt{\sum_{n=0}^{M-1} s^2(n) \sum_{n=0}^{M-1} s^2(n-\mathrm{delay})}} \tag{2.1}$$
where M is the subframe length, corr is the normalized correlation value, s(n) is the perceptually weighted speech signal, and delay is the pitch period candidate under search. Formula (2.1) is derived from the minimum mean square error criterion as follows:
The minimum mean square error between the input speech signal and the delayed speech signal is computed as:
$$\min\left( \mathrm{error} = \frac{\sum_{n=0}^{63} \left( s(n) - \mathrm{gain} \times s(n-\mathrm{delay}) \right)^2}{\sum_{n=0}^{63} s^2(n) \sum_{n=0}^{63} s^2(n-\mathrm{delay})} \right) \tag{2.2}$$
where error is the mean square error between the input speech signal and the delayed signal, gain is the open-loop gain, s(n) is the perceptually weighted speech signal, and delay is the pitch period candidate under search.
To find the minimum of formula (2.2), hold delay constant and solve for the best open-loop gain. Setting $\frac{d\,\mathrm{error}}{d\,\mathrm{gain}} = 0$ gives:
$$\mathrm{gain} = \frac{\sum_{n=0}^{63} s(n)\, s(n-\mathrm{delay})}{\sum_{n=0}^{63} s(n-\mathrm{delay})^2} \tag{2.3}$$
Substituting the result of formula (2.3) into formula (2.2) gives:
$$\min\left( 1 - \frac{\left( \sum_{n=0}^{63} s(n)\, s(n-\mathrm{delay}) \right)^2}{\sum_{n=0}^{63} s^2(n) \sum_{n=0}^{63} s^2(n-\mathrm{delay})} \right) \tag{2.4}$$
Formula (2.4) can be converted into:
[Formula (2.5): image C20061013970300175, not reproduced in this text]
Formula (2.5) is the decision criterion for the correlation function in this scheme. The criterion of the subsequent closed-loop pitch search, formula (1.9), is $T_k = \frac{\sum_{n=0}^{63} x(n)\, y_k(n)}{\sqrt{\sum_{n=0}^{63} y_k^2(n)}}$. Comparing it with formula (2.5) shows that the two are similar: both are normalized by energy in the denominator, which ensures that the open-loop search can provide a more refined search range for the closed-loop pitch search.
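The link between the optimal gain and the normalized correlation can be checked numerically with a short Python sketch. It uses a random test signal and hypothetical helper names, and it works with the unnormalized squared error: at the gain of formula (2.3), the residual energy equals the energy of s(n) times (1 - corr²), where corr is taken in the square-root-normalized form assumed for formula (2.1).

```python
import math
import random

def sums(s, start, delay, length=64):
    """Return (cross term, energy of s(n), energy of s(n - delay)) for one subframe."""
    cross = sum(s[start + n] * s[start + n - delay] for n in range(length))
    e_cur = sum(s[start + n] ** 2 for n in range(length))
    e_del = sum(s[start + n - delay] ** 2 for n in range(length))
    return cross, e_cur, e_del

def squared_error(s, start, delay, gain, length=64):
    """Sum of squared differences between s(n) and gain * s(n - delay)."""
    return sum((s[start + n] - gain * s[start + n - delay]) ** 2
               for n in range(length))

random.seed(0)
signal = [random.uniform(-1.0, 1.0) for _ in range(256)]
start, delay = 128, 40                       # hypothetical subframe start and candidate

cross, e_cur, e_del = sums(signal, start, delay)
best_gain = cross / e_del                    # formula (2.3)
corr = cross / math.sqrt(e_cur * e_del)      # formula (2.1), normalized form assumed
min_error = squared_error(signal, start, delay, best_gain)
```

Because the residual shrinks as corr² grows, picking the delay with the largest normalized correlation is equivalent to picking the delay with the smallest gain-optimized prediction error for a given subframe energy.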
Step 302: The autocorrelation function computing unit computes the normalized autocorrelation values of the speech signal according to formula 2.1.
Step 303: The autocorrelation function calculation unit selects several of the largest autocorrelation values, together with their corresponding pitch period candidates, within the lower and upper bounds of the pitch search range.
Here, to take the continuity of speech into account and to reduce computational complexity, Step 303 preferably does not keep all of the computed autocorrelation values and their pitch period candidates. It keeps only several of the largest autocorrelation values within the bounds of the pitch search range, together with their corresponding pitch period candidates. The number selected can be set according to the application, for example the 6 largest autocorrelation values and their corresponding pitch period candidates.
This step yields an autocorrelation value sequence of the selected length and the corresponding pitch period candidate sequence.
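The selection in Step 303 can be sketched as follows. The function name, the dictionary input and the example search bounds are assumptions made for illustration; only the default count of 6 comes from the text.

```python
def select_candidates(corr_by_delay, min_delay, max_delay, count=6):
    """Keep the `count` largest autocorrelation values whose delay lies
    within [min_delay, max_delay]. Returns two parallel lists, sorted by
    autocorrelation value in descending order."""
    in_range = [(corr, delay) for delay, corr in corr_by_delay.items()
                if min_delay <= delay <= max_delay]
    in_range.sort(key=lambda pair: pair[0], reverse=True)
    top = in_range[:count]
    corrs = [corr for corr, _ in top]
    delays = [delay for _, delay in top]
    return corrs, delays
```

Returning the two sequences in the same order keeps each autocorrelation value aligned with its pitch period candidate in the later steps.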
Step 304: The autocorrelation function calculation unit outputs the selected autocorrelation values and their corresponding pitch period candidates to the global pitch period reference calculation unit.
Step 305: The global pitch period reference calculation unit determines the current global pitch period reference from the received autocorrelation values and their corresponding pitch period candidates.
For speech signals, and for voiced segments in particular, the pitch period changes very little between adjacent frames, usually by no more than 10%. To exploit this, the present invention introduces the concept of a global pitch period reference (global_pitch). Its main purpose is to determine the pitch period of the current frame with reference to the pitch period of the previous frame or the previous several frames, thereby avoiding the disturbance that noise and vocal tract characteristics cause in pitch period estimation. In this scheme, the global pitch period reference is estimated from the autocorrelation value sequence sorted in descending order and its corresponding pitch period candidate sequence.
Fig. 4 is a flowchart of determining the current global pitch period reference in an embodiment of the present invention. Referring to Fig. 4, Step 305 mainly comprises:
Step 401: Obtain the best pitch period candidate from the autocorrelation results, that is, from the selected autocorrelation values and their corresponding pitch period candidates.
In Step 401, the process of obtaining the best pitch period candidate can address both the multiple-period problem and the pitch smoothness problem.
Step 401 is implemented as follows:
First, apply weighting to address pitch period smoothness.
The selected autocorrelation value sequence is first sorted in descending order. For each selected pitch period candidate, determine whether the difference between the candidate and the global pitch period reference determined for the previous frame is less than a preset threshold. If so, apply a fixed weight to the autocorrelation value of that candidate; otherwise, leave that autocorrelation value unweighted. After weighting, the autocorrelation value sequence can be sorted in descending order again, with the corresponding pitch period candidate sequence reordered in step.
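The smoothness weighting can be sketched as follows. The threshold `max_diff` and the weight `weight` are placeholder values chosen for illustration; the source only states that a preset threshold and a fixed weight are used.

```python
def weight_for_smoothness(corrs, delays, prev_global_pitch,
                          max_diff=5, weight=1.2):
    """Boost candidates close to the previous frame's global pitch period
    reference, then re-sort both sequences by weighted value (descending)."""
    pairs = []
    for corr, delay in zip(corrs, delays):
        if abs(delay - prev_global_pitch) < max_diff:
            corr *= weight  # fixed weighting for candidates near the reference
        pairs.append((corr, delay))
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return [c for c, _ in pairs], [d for _, d in pairs]
```

In this sketch a candidate that agrees with the previous reference can overtake a slightly stronger candidate elsewhere, which is how the weighting favors a smooth pitch track.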
Second, remove the influence of multiple periods.
Fig. 5 is a flowchart of removing the influence of multiple periods in an embodiment of the present invention. Referring to Fig. 5, removing the influence of multiple periods in Step 401 comprises the following steps:
Step 501: Set the pitch period candidate corresponding to the current autocorrelation maximum as the current tentative best pitch period candidate.
Step 502: Determine whether the current pitch period candidate sequence contains a candidate that has not yet been examined. If so, go to Step 503; otherwise, go to Step 508.
Step 503: Select a pitch period candidate that has not yet been examined and determine whether it is greater than a set threshold. If so, go to Step 504; otherwise, go to Step 505.
Step 504: Divide the autocorrelation value of this pitch period candidate by the larger scale factor (scale factor 1), then go to Step 506.
Step 505: Divide the autocorrelation value of this pitch period candidate by the smaller scale factor (scale factor 2).
Step 506: Determine whether the candidate under examination is smaller than the current tentative best pitch period candidate, and whether its autocorrelation value is greater than the pre-scaling autocorrelation maximum divided by the scale factor. If both conditions hold, go to Step 507; otherwise, return directly to Step 502.
Step 507: Set the candidate under examination as the current tentative best pitch period candidate, and return to Step 502.
Step 508: Determine whether the pitch period candidate corresponding to the current autocorrelation maximum is a multiple of the current tentative best pitch period candidate. If so, go to Step 509; otherwise, go to Step 510.
Step 509: Take the current tentative best pitch period candidate as the best pitch period candidate, and end the current procedure.
Step 510: Take the pitch period candidate corresponding to the current autocorrelation maximum as the best pitch period candidate.
At this point, the best pitch period candidate described in Step 401 has been obtained.
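One possible reading of Steps 501 to 510 is sketched below. The text leaves room for interpretation in how Steps 504 to 506 combine; this sketch assumes that the scale factor chosen in Step 504 or 505 sets an acceptance threshold (autocorrelation maximum divided by the factor) against which the candidate's own autocorrelation value is compared. The threshold, the two factors, the tolerance in `is_multiple` and all function names are illustrative assumptions.

```python
def is_multiple(long_delay, short_delay, tol=2):
    """True if long_delay is roughly k * short_delay for an integer k >= 2
    (the tolerance is an assumption; the source does not define one)."""
    if short_delay <= 0:
        return False
    k = round(long_delay / short_delay)
    return k >= 2 and abs(long_delay - k * short_delay) <= tol

def remove_multiples(corrs, delays, delay_threshold=60,
                     big_factor=1.3, small_factor=1.1):
    """corrs must be sorted in descending order, with delays aligned to it."""
    max_corr, max_delay = corrs[0], delays[0]
    tentative = max_delay                        # Step 501
    for corr, delay in zip(corrs, delays):       # Steps 502 and 503
        # Steps 504 and 505: longer candidates use the larger factor.
        factor = big_factor if delay > delay_threshold else small_factor
        # Step 506: shorter candidate with a sufficiently large value.
        if delay < tentative and corr > max_corr / factor:
            tentative = delay                    # Step 507
    # Steps 508 to 510: accept the shorter candidate only if the delay of
    # the maximum is a multiple of it.
    return tentative if is_multiple(max_delay, tentative) else max_delay
```

With candidates at delays 80 and 40 and values 0.9 and 0.85, this sketch returns 40, treating 80 as a doubled period; if the second value were only 0.5, it would keep 80.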
Step 402: Based on the best pitch period candidate obtained, determine whether a reliable global pitch period reference can be determined for the current frame. If so, go to Step 403; otherwise, go to Step 404.
Ways of determining whether a reliable global pitch period reference can be determined for the current frame include, but are not limited to, the following:
Mode 1: Determine whether the pitch period candidate corresponding to the current autocorrelation maximum is a multiple of the best pitch period candidate, and whether the difference between the best pitch period candidate and the global pitch period reference determined for the previous frame is less than a set threshold. If both conditions hold, a reliable global pitch period reference can be determined for the current frame.
Mode 2: Determine whether, in the current autocorrelation value sequence, the difference between the autocorrelation maximum and any other autocorrelation value is greater than a set threshold. If so, a reliable global pitch period reference can be determined for the current frame.
Mode 3: Determine whether the current pitch period candidate sequence contains one or more candidates that are multiples of the best pitch period candidate. If so, a reliable global pitch period reference can be determined for the current frame.
Mode 4: Determine whether the global pitch period reference determined for the previous frame is a multiple of the currently determined best pitch period candidate, and whether the current autocorrelation maximum is greater than a set threshold. If both conditions hold, a reliable global pitch period reference can be determined for the current frame.
In Step 402, if a reliable global pitch period reference can be determined for the current frame, a new global pitch period reference is determined, that is, Step 403 is performed. If not, the current frame continues to use the earlier reliable global pitch period reference, that is, Step 404 is performed.
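The four modes can be sketched as a single predicate. Combining them with a logical OR is an assumption based on the modes being listed as alternatives. Mode 2 is read here as the maximum exceeding every other value by the margin. All thresholds, names and the tolerance in `is_multiple` are illustrative.

```python
def is_multiple(long_delay, short_delay, tol=2):
    """True if long_delay is roughly k * short_delay for an integer k >= 2."""
    if short_delay <= 0:
        return False
    k = round(long_delay / short_delay)
    return k >= 2 and abs(long_delay - k * short_delay) <= tol

def reference_is_reliable(best, corrs, delays, prev_global,
                          near=5, margin=0.2, strong=0.7):
    """corrs is sorted in descending order; delays is aligned with it."""
    max_corr, max_delay = corrs[0], delays[0]
    # Mode 1: delay of the maximum is a multiple of best, and best is
    # close to the previous frame's reference.
    mode1 = is_multiple(max_delay, best) and abs(best - prev_global) < near
    # Mode 2: the maximum stands clear of every other value.
    mode2 = len(corrs) > 1 and all(max_corr - c > margin for c in corrs[1:])
    # Mode 3: some candidate is a multiple of best.
    mode3 = any(is_multiple(d, best) for d in delays)
    # Mode 4: the previous reference is a multiple of best and the
    # maximum is strong.
    mode4 = is_multiple(prev_global, best) and max_corr > strong
    return mode1 or mode2 or mode3 or mode4
```

A previous reference of 0 never counts as a multiple in this sketch, so Modes 1 and 4 cannot fire right after the reference has been reset.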
Step 403: Set the best pitch period candidate obtained as the current global pitch period reference, and end the current procedure.
Step 404: Determine whether the global pitch period reference determined for the previous frame has expired. If it has expired, go to Step 405; if it has not, go to Step 406.
Here, the number of frames for which a global pitch period reference track can be retained may be preset, for example three frames. Determining whether the global pitch period reference determined for the previous frame has expired then means determining whether the number of frames for which it has been retained exceeds three. If so, it has expired; otherwise, it has not.
If the global pitch period reference determined for the previous frame has expired, the speech segment is not voiced and does not have good pitch period continuity, so Step 405 is performed.
Step 405: Set the current global pitch period reference to 0, and end the current procedure.
Step 406: Determine whether the autocorrelation maxima of two consecutive frames, the previous frame and the current frame, are both less than a set threshold. If so, go to Step 405; otherwise, go to Step 407.
Step 407: Set the global pitch period reference determined for the previous frame as the current global pitch period reference.
At this point, the process of determining the current global pitch period reference described in Step 305 of Fig. 3 is complete.
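Steps 402 to 407 can be sketched as a small state update. The reliability decision of Step 402 is passed in as a boolean. The counter `frames_held`, the default of three frames (taken from the example in the text), the threshold `weak_corr` and the function name are assumptions made for illustration.

```python
def update_global_reference(best, reliable, prev_global, frames_held,
                            prev_max_corr, cur_max_corr,
                            max_hold=3, weak_corr=0.3):
    """Return (current_global_reference, frames_held) for the current frame.
    frames_held counts the frames for which the previous reference has
    been retained so far."""
    if reliable:
        return best, 0                       # Step 403: new reference
    if frames_held > max_hold:
        return 0, 0                          # Steps 404 and 405: expired
    if prev_max_corr < weak_corr and cur_max_corr < weak_corr:
        return 0, 0                          # Steps 406 and 405: two weak frames
    return prev_global, frames_held + 1      # Step 407: carry over
```

Returning the counter alongside the reference keeps the expiry bookkeeping in one place; a caller would feed both values back in on the next frame.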
Step 306: The global pitch period reference calculation unit outputs the determined current global pitch period reference to the pitch period determination unit.
Step 307: The pitch period determination unit makes the final determination of the pitch period of the speech signal according to the received current global pitch period reference.
When finally determining the pitch period, three cases are considered:
(1) Determine the pitch period according to whether the best pitch period candidate and the current global pitch period reference are close to each other.
In this case, the global pitch period reference may have been determined from the best pitch period candidate of the current frame. If the two are close, the decision on the best pitch period candidate is reliable, so the best pitch period candidate can be output directly as the pitch period. Closeness can also indicate that the best pitch period candidate determined for the current frame satisfies the pitch period smoothness criterion but that, because of noise, vocal tract interference and other disturbances, its autocorrelation value was too small to establish a reliable global pitch period reference, so the current frame carried over the global pitch period reference of the previous frame. Accordingly, Step 307 is implemented as follows:
Determine whether the difference between the best pitch period candidate obtained and the current global pitch period reference is less than a set threshold, that is, whether the two are close. If so, take the best pitch period candidate as the pitch period of the speech signal.
(2) Determine the pitch period according to the magnitude of the current autocorrelation maximum.
In this case, the speech segment is only weakly correlated and no clear pitch period can be identified. The current pitch period search then has no practical meaning; it merely gives the subsequent closed-loop pitch search a reference that removes as much long-term correlation as possible. The pitch period corresponding to the autocorrelation maximum can therefore be output directly as the pitch period search candidate. Accordingly, Step 307 is implemented as follows:
Determine whether the current autocorrelation maximum in the autocorrelation value sequence is less than a set threshold. If so, take the pitch period candidate corresponding to the current autocorrelation maximum as the pitch period of the speech signal.
(3) Determine the pitch period for the case in which it cannot be clearly identified.
In this case, a pitch period determination reference value (trkp) is introduced, which serves as a reference for the final determination of the pitch period.
First, determine the pitch period determination reference value (trkp) as follows: determine whether the determined global pitch period reference is nonzero. If so, take the global pitch period reference as the pitch period determination reference value. Otherwise, determine whether the last nonzero global pitch period reference has expired. If it has expired, set the pitch period determination reference value to zero; if it has not, take the last nonzero global pitch period reference as the pitch period determination reference value.
Then, make the final determination of the pitch period according to trkp: in the current pitch period candidate sequence, select the candidate whose difference from the pitch period determination reference value is smallest, and double the autocorrelation value of that candidate. Then select the current autocorrelation maximum from the autocorrelation value sequence, and take the pitch period candidate corresponding to that maximum as the pitch period of the speech signal.
At this point, the process of determining the pitch period in the open-loop pitch search is complete.
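The three cases of Step 307 can be sketched together. The order in which the cases are tried, the thresholds `near` and `weak_corr`, and the function names are assumptions; the text describes the cases without fixing how they are sequenced.

```python
def tracking_reference(global_pitch, last_nonzero_global, last_nonzero_expired):
    """Pitch period determination reference value (trkp)."""
    if global_pitch != 0:
        return global_pitch
    return 0 if last_nonzero_expired else last_nonzero_global

def determine_pitch(best, global_pitch, trkp, corrs, delays,
                    near=5, weak_corr=0.3):
    """corrs and delays are aligned sequences of the selected candidates."""
    # Case 1: best candidate is close to the current global reference.
    if abs(best - global_pitch) < near:
        return best
    # Case 2: weak correlation, so output the delay of the maximum.
    top = max(range(len(corrs)), key=lambda i: corrs[i])
    if corrs[top] < weak_corr:
        return delays[top]
    # Case 3: double the value of the candidate nearest to trkp,
    # then take the delay of the new maximum.
    nearest = min(range(len(delays)), key=lambda i: abs(delays[i] - trkp))
    boosted = list(corrs)
    boosted[nearest] *= 2.0
    winner = max(range(len(boosted)), key=lambda i: boosted[i])
    return delays[winner]
```

In case 3, doubling does not force the nearest candidate to win: a candidate far from trkp is still chosen if its autocorrelation value is more than twice as large.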
It should be noted that the method and apparatus of the present invention for open-loop pitch search can be applied in any audio codec.
In summary, the above is only a preferred embodiment of the present invention and is not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (16)

1. A method for open-loop pitch search, characterized in that the method comprises:
calculating $$\mathrm{corr} = \frac{\sum_{n=0}^{M-1} s(n)\,s(n-\mathrm{delay})}{\sqrt{\sum_{n=0}^{M-1} s^2(n)\,\sum_{n=0}^{M-1} s^2(n-\mathrm{delay})}}$$ to obtain the normalized autocorrelation function of the speech signal, where M is the subframe length, corr is the normalized autocorrelation value, s(n) is the perceptually weighted speech signal, and delay is the pitch period candidate being searched;
determining the current global pitch period reference from the autocorrelation results;
determining the pitch period of the speech signal according to the current global pitch period reference.
2. The method according to claim 1, characterized in that the step of determining the current global pitch period reference comprises: obtaining the best pitch period candidate from the autocorrelation results; determining, based on the best pitch period candidate obtained, whether a reliable global pitch period reference can be determined for the current frame; if so, setting the best pitch period candidate obtained as the current global pitch period reference; otherwise, determining whether the global pitch period reference determined for the previous frame has expired; if it has expired, setting the current global pitch period reference to zero; if it has not expired, setting the global pitch period reference determined for the previous frame as the current global pitch period reference.
3. The method according to claim 2, characterized in that, after determining that the reference has not expired and before setting the global pitch period reference determined for the previous frame as the current global pitch period reference, the method further comprises: determining whether the autocorrelation maxima of two consecutive frames, the previous frame and the current frame, are both less than a set threshold; if so, setting the current global pitch period reference to zero; otherwise, continuing with the step of setting the global pitch period reference determined for the previous frame as the current global pitch period reference.
4. The method according to claim 2, characterized in that the step of obtaining the best pitch period candidate from the autocorrelation results comprises: selecting, from the autocorrelation results, several of the largest autocorrelation values and their corresponding pitch period candidates within the lower and upper bounds of the pitch search range; weighting the selected autocorrelation value sequence; and obtaining the best pitch period candidate from the weighted autocorrelation value sequence.
5. The method according to claim 4, characterized in that the step of weighting the selected autocorrelation values comprises: for each selected pitch period candidate, determining whether the difference between the candidate and the global pitch period reference determined for the previous frame is less than a preset threshold; if so, applying a fixed weight to the autocorrelation value of that candidate; otherwise, leaving the autocorrelation value of that candidate unweighted.
6. The method according to claim 4, characterized in that, after the weighting and before obtaining the best pitch period candidate, the method further comprises: applying multiple-period removal to the weighted autocorrelation value sequence;
the best pitch period candidate then being obtained from the autocorrelation value sequence after multiple-period removal.
7. The method according to claim 6, characterized in that the step of applying multiple-period removal to obtain the best pitch period candidate comprises: scaling the autocorrelation value of each selected pitch period candidate; obtaining the current tentative best pitch period candidate from the scaled autocorrelation values; determining whether the pitch period candidate corresponding to the current autocorrelation maximum is a multiple of the current tentative best pitch period candidate; if so, taking the current tentative best pitch period candidate as the best pitch period candidate; otherwise, taking the pitch period candidate corresponding to the current autocorrelation maximum as the best pitch period candidate.
8. The method according to claim 7, characterized in that, before the scaling, the method further comprises: setting the pitch period candidate corresponding to the current autocorrelation maximum as the current tentative best pitch period candidate;
and the step of obtaining the tentative best pitch period candidate from the scaled autocorrelation values comprises: examining each selected pitch period candidate in turn; if the candidate under examination is smaller than the current tentative best pitch period candidate, and its autocorrelation value is greater than the pre-scaling autocorrelation maximum divided by the scale factor, setting the candidate under examination as the current tentative best pitch period candidate.
9. The method according to claim 7, characterized in that the step of scaling the autocorrelation value of each selected pitch period candidate comprises: for each selected pitch period candidate, determining whether the candidate is greater than a set threshold; if so, dividing the autocorrelation value of that candidate by the larger scale factor; otherwise, dividing the autocorrelation value of that candidate by the smaller scale factor.
10. The method according to claim 2, characterized in that the step of determining whether a reliable global pitch period reference can be determined for the current frame comprises:
determining whether the pitch period candidate corresponding to the current autocorrelation maximum is a multiple of the best pitch period candidate, and whether the difference between the best pitch period candidate obtained and the global pitch period reference determined for the previous frame is less than a set threshold; if both conditions hold, determining that a reliable global pitch period reference can be determined for the current frame;
or, determining whether, in the current autocorrelation value sequence, the difference between the autocorrelation maximum and any other autocorrelation value is greater than a set threshold; if so, determining that a reliable global pitch period reference can be determined for the current frame;
or, determining whether the current pitch period candidate sequence contains one or more candidates that are multiples of the best pitch period candidate; if so, determining that a reliable global pitch period reference can be determined for the current frame;
or, determining whether the global pitch period reference determined for the previous frame is a multiple of the currently determined best pitch period candidate, and whether the current autocorrelation maximum is greater than a set threshold; if both conditions hold, determining that a reliable global pitch period reference can be determined for the current frame.
11. The method according to any one of claims 1 to 10, characterized in that the step of determining the pitch period of the speech signal according to the current global pitch period reference comprises: determining whether the difference between the best pitch period candidate obtained and the current global pitch period reference is less than a set threshold; if so, taking the best pitch period candidate as the pitch period of the speech signal;
or, determining whether the current autocorrelation maximum in the autocorrelation value sequence is less than a set threshold; if so, taking the pitch period candidate corresponding to the current autocorrelation maximum as the pitch period of the speech signal.
12. The method according to any one of claims 1 to 10, characterized in that the step of determining the pitch period of the speech signal according to the current global pitch period reference comprises:
determining a pitch period determination reference value from the global pitch period reference; selecting, in the current pitch period candidate sequence, the candidate whose difference from the pitch period determination reference value is smallest; doubling the autocorrelation value of the selected candidate; selecting the current autocorrelation maximum from the autocorrelation value sequence; and taking the pitch period candidate corresponding to the selected autocorrelation maximum as the pitch period of the speech signal.
13. The method according to claim 12, characterized in that the step of determining the pitch period determination reference value from the global pitch period reference comprises:
determining whether the determined global pitch period reference is nonzero; if so, taking the global pitch period reference as the pitch period determination reference value; otherwise, determining whether the last nonzero global pitch period reference has expired; if it has expired, setting the pitch period determination reference value to zero; if it has not expired, taking the last nonzero global pitch period reference as the pitch period determination reference value.
14. An apparatus for open-loop pitch search, characterized in that the apparatus comprises an autocorrelation function calculation unit, a global pitch period reference calculation unit and a pitch period determination unit, wherein:
the autocorrelation function calculation unit calculates $$\mathrm{corr} = \frac{\sum_{n=0}^{M-1} s(n)\,s(n-\mathrm{delay})}{\sqrt{\sum_{n=0}^{M-1} s^2(n)\,\sum_{n=0}^{M-1} s^2(n-\mathrm{delay})}}$$ to obtain the normalized autocorrelation function of the speech signal, where M is the subframe length, corr is the normalized autocorrelation value, s(n) is the perceptually weighted speech signal, and delay is the pitch period candidate being searched, and outputs the autocorrelation results to the global pitch period reference calculation unit;
the global pitch period reference calculation unit determines the current global pitch period reference from the received autocorrelation results and outputs the determined current global pitch period reference to the pitch period determination unit;
the pitch period determination unit determines the pitch period of the speech signal according to the received current global pitch period reference.
15. The apparatus according to claim 14, characterized in that the global pitch period reference calculation unit weights several of the largest autocorrelation values among the received autocorrelation values, obtains the best pitch period candidate from the weighted autocorrelation value sequence, and determines the current global pitch period reference from the best pitch period candidate.
16. The apparatus according to claim 14, characterized in that the global pitch period reference calculation unit applies multiple-period removal to several of the largest autocorrelation values among the received autocorrelation values, obtains the best pitch period candidate from the autocorrelation value sequence after multiple-period removal, and determines the current global pitch period reference from the best pitch period candidate.
CNB2006101397038A 2006-09-18 2006-09-18 A kind of method and apparatus of realizing open-loop pitch search Active CN100541609C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006101397038A CN100541609C (en) 2006-09-18 2006-09-18 A kind of method and apparatus of realizing open-loop pitch search

Publications (2)

Publication Number Publication Date
CN101149924A CN101149924A (en) 2008-03-26
CN100541609C true CN100541609C (en) 2009-09-16

Family

ID=39250413

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006101397038A Active CN100541609C (en) 2006-09-18 2006-09-18 A kind of method and apparatus of realizing open-loop pitch search

Country Status (1)

Country Link
CN (1) CN100541609C (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant