CN101609684B

CN101609684B - Post-processing filter for decoding voice signal

Info

Publication number: CN101609684B
Application number: CN2008100392248A
Authority: CN
Inventors: 黄鹤云; 林福辉
Original assignee: Spreadtrum Communications Shanghai Co Ltd
Current assignee: Spreadtrum Communications Shanghai Co Ltd
Priority date: 2008-06-19
Filing date: 2008-06-19
Publication date: 2012-06-06
Anticipated expiration: 2028-06-19
Also published as: CN101609684A

Abstract

The invention provides a post-processing filter for a decoded speech signal that filters across all frequency bands to strengthen the pitch component of the decoded speech signal. The post-processing filter comprises a higher-order filter and a lower-order filter: the higher-order filter filters the front part of each frame of the speech signal, and the lower-order filter filters the rear part of each frame. The front part and the rear part of each frame are divided such that the delay caused by the higher-order filter is no greater than that caused by the lower-order filter. The filter suppresses inter-harmonic noise well while keeping a delay as low as that of the lower-order filter.

Description

Post-processing filter for a decoded speech signal

Technical field

The present invention relates to post-processing of decoded speech signals, and in particular to a post-processing filter for a decoded speech signal.

Background technology

Speech coding and decoding algorithms are usually based on a model of speech production: the vocal tract is modeled with linear prediction, and the source signal of the speech is represented by a code excitation. A common codec scheme is code-excited linear prediction, i.e. the CELP method. A speech source is produced in essentially three ways. The first is a periodic source signal produced by vowel phonemes, which can usually be modeled with harmonics or sinusoids. The second is a source signal resembling white noise, produced by unvoiced consonant phonemes. The third is a source signal intermediate between the two. Exploiting this property of speech, the CELP method builds the whole excitation signal from an adaptive codebook and a fixed codebook: the adaptive codebook mainly represents the harmonics in the speech excitation, while the fixed codebook represents the residual components.

Because the bit rate is limited, the decoded speech signal inevitably contains some distortion. Several current speech codec algorithms, including CELP, spend a relatively large number of bits on representing the periodicity of speech, so even at low bit rates the periodicity is preserved fairly completely. Components other than the harmonic components, however, are preserved poorly. The noise between the harmonic components can therefore be relatively strong, and it is necessary to suppress it. Many speech post-processing methods exist, including low-frequency emphasis of the fixed codebook, emphasis of the adaptive codebook, comb filtering of the whole excitation, and so on.

Existing post-processing filter designs mainly include the following typical implementations:

One method increases the proportion of the adaptive-codebook excitation in the total excitation. The decoded adaptive-codebook excitation is multiplied by a weight and added to the total excitation, and energy normalization is then applied.
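The adaptive-codebook emphasis just described can be sketched as follows. This is only an illustration under stated assumptions: the excitations are plain lists of samples, the total excitation is taken to be the sum of the adaptive and fixed excitations, and the function name and the weight value are hypothetical rather than taken from any particular codec.

```python
import math

def emphasize_adaptive(adaptive, fixed, weight):
    """Boost the adaptive-codebook share of the excitation, then restore the energy."""
    # Assumed total excitation: adaptive plus fixed contribution, sample by sample.
    total = [a + f for a, f in zip(adaptive, fixed)]
    # Add the weighted adaptive-codebook excitation to the total excitation.
    boosted = [t + weight * a for t, a in zip(total, adaptive)]
    # Energy normalization: rescale so the energy matches the original total.
    e_total = sum(x * x for x in total)
    e_boost = sum(x * x for x in boosted)
    if e_boost == 0.0:
        return boosted
    scale = math.sqrt(e_total / e_boost)
    return [scale * x for x in boosted]
```

After normalization the frame energy is unchanged, but the adaptive-codebook (periodic) part carries a larger share of it.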

Another method filters the fixed-codebook excitation: its high-frequency components are emphasized, and the result is added to the adaptive-codebook excitation to produce the total excitation.

A further classical method applies comb filtering to the signal in a certain frequency band rather than to the whole signal.

Summary of the invention

The problem to be solved by the present invention is to provide a post-processing filter for a decoded speech signal that filters across all frequency bands so as to strengthen the pitch component of the decoded speech signal.

The present invention proposes a post-processing filter for a decoded speech signal, comprising: a higher-order filter for filtering the front part of each frame of the speech signal; and a lower-order filter for filtering the rear part of each frame of the speech signal; wherein the front part and the rear part of each frame are divided such that the delay caused by the higher-order filter is not greater than the delay caused by the lower-order filter.

In the above filter, the front part and the rear part of each frame of the speech signal are divided according to the orders of the higher-order filter and the lower-order filter.

In the above filter, the order of the higher-order filter is l, where l is a positive integer, and the order of the lower-order filter is m, where m is a positive integer and l > m. For a speech signal frame of N points, the front part comprises the first (N-l+m) points and the rear part comprises the last (l-m) points.

In the above filter, the higher-order filter and the lower-order filter are finite impulse response filters.

In the above filter, the higher-order filter is a second-order filter and the lower-order filter is a first-order filter.

In the above filter, the higher-order filter satisfies:

h(n) = δ(n) + Gδ(n+T) + Gδ(n+2T);

and the lower-order filter satisfies:

g(n) = δ(n) + Gδ(n+T);

where T is the pitch period of the speech signal and G is the pitch gain of the speech signal.

In the above filter, for a speech signal frame of N points, the higher-order filter filters the first (N-T-1) points and the lower-order filter filters the last (T+1) points.

Unlike current post-processing techniques, the present invention filters across all frequency bands to strengthen the pitch component of the decoded speech signal. The proposed filter combines a higher-order filter with a lower-order filter, and therefore suppresses inter-harmonic noise well while keeping a delay as low as that of the lower-order filter.

Embodiment

The present invention proposes a post-processing filter for speech signals that performs pitch-enhancement filtering of the speech signal across all frequency bands.

As can be expected, the performance of a filter depends directly on its order, and a higher-order filter tends to filter better. A higher order, however, also brings a larger delay or requires more buffering. For this reason, embodiments of the invention propose a filter design method applied to speech post-processing.

Suppose there are two filters h and g, of orders l and m respectively with l > m. For each frame of the speech signal s(n), n = 0, 1, ..., N-1, the present invention adopts the following filter:

l(n) = \begin{cases} h(n), & n = 0, 1, \ldots, N-l+m-1 \\ g(n), & n = N-l+m, \ldots, N-1 \end{cases} \qquad (1)

Here the higher-order filter h(n) filters the front part of each frame of the speech signal, and the lower-order filter g(n) filters the rear part. The front part and the rear part are divided such that the delay caused by h(n) is not greater than the delay caused by g(n). For example, suppose the intrinsic delay of h(n) is T1 and that of g(n) is T2. Because the points processed by h(n) end earlier in the frame, the delay that h(n) contributes to the overall filtering is reduced by T3, while the delay caused by g(n) remains T2. The delay requirement of the filter is therefore met as long as T1-T3≤T2.
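The condition T1-T3≤T2 is simple arithmetic, as the short sketch below shows. It is illustrative only: it assumes that all delays are counted in samples and that T3 equals the number of points left to the lower-order filter at the end of the frame; the function names are hypothetical.

```python
def residual_delay(t1, t3):
    """Delay still caused by the higher-order filter when its region ends t3 points early."""
    return max(0, t1 - t3)

def meets_delay_requirement(t1, t2, t3):
    """True when T1 - T3 <= T2, i.e. the higher-order filter adds no extra delay."""
    return t1 - t3 <= t2

def smallest_rear_part(t1, t2):
    """Smallest T3 (assumed here to be the rear-part length) with T1 - T3 <= T2."""
    return max(0, t1 - t2)
```

For instance, with hypothetical delays T1 = 80 and T2 = 40 samples, the rear part must hold at least 40 points for the requirement to be met.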

Usually, the processing points are divided according to the orders of the filters h(n) and g(n). For example, the first N-l+m points are processed with the higher-order filter (order l) for better filtering, while the rear part, namely the last l-m points, is processed with the low-delay lower-order filter (order m) so that the delay stays small enough.

Note that any FIR filters may be used as h and g in expression (1); no limitation is imposed here. The filters h and g may have arbitrary orders l and m, provided that l > m.
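A minimal sketch of how the piecewise filter of expression (1) could be applied to one frame is given below. It is illustrative rather than the patent's implementation: the tap dictionaries (sample offset mapped to coefficient), the `history` and `lookahead` buffers, and all function names are assumptions introduced here, and the orders l and m are passed in explicitly.

```python
def apply_taps(signal, start, n, taps):
    """Weighted sum of the samples at the given offsets around frame index n."""
    # The caller must supply enough history so that start + n + offset is never negative.
    return sum(coeff * signal[start + n + off] for off, coeff in taps.items())

def piecewise_filter(history, frame, lookahead, h_taps, g_taps, l, m):
    """Filter the first N-l+m points with h_taps and the last l-m points with g_taps."""
    n_points = len(frame)
    split = n_points - l + m  # first index handled by the lower-order filter
    signal = list(history) + list(frame) + list(lookahead)
    start = len(history)      # position of frame[0] inside the extended signal
    out = []
    for n in range(n_points):
        taps = h_taps if n < split else g_taps
        out.append(apply_taps(signal, start, n, taps))
    return out
```

With a hypothetical three-tap h that reads one sample ahead and a two-tap g that reads only past samples, l = 2 and m = 1 leave the last point of the frame to g, so no sample beyond the frame is needed and `lookahead` can stay empty.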

A main embodiment is described below to illustrate the above filter. The embodiment is given only as an example and is not intended to limit the scope of the invention.

Considering the periodicity of speech, common post-processing filters are finite impulse response (FIR) filters constructed with the pitch period T and the pitch gain G as the main parameters. The first-order FIR filter is:

g(n)＝δ(n)+Gδ(n+T) (2)

where δ(n) is the unit impulse. The magnitude of its frequency response is:

|H(ω)| = \sqrt{1 + G^{2} + 2G\cos(ωT)} \qquad (3)
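Expression (3) can be checked numerically against a direct evaluation of a two-tap filter. The sketch below assumes only that the filter has taps 1 and G spaced T samples apart (the magnitude is the same whichever side of the current sample the second tap lies on); the function names and the example values of G and T are hypothetical.

```python
import cmath
import math

def first_order_gain_formula(omega, G, T):
    """Closed form of expression (3)."""
    return math.sqrt(1.0 + G * G + 2.0 * G * math.cos(omega * T))

def first_order_gain_direct(omega, G, T):
    """Magnitude of the frequency response of taps 1 and G spaced T samples apart."""
    return abs(1.0 + G * cmath.exp(1j * omega * T))

if __name__ == "__main__":
    G, T = 0.5, 40  # arbitrary example values
    for omega in (0.0, 0.01, 0.1, 1.0, 3.0):
        print(omega, first_order_gain_formula(omega, G, T),
              first_order_gain_direct(omega, G, T))
```

The gain is largest (1 + G) where ωT is a multiple of 2π, that is at the pitch harmonics, and smallest (|1 - G|) midway between them.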

The filter of expression (2) uses T samples from before the frame, which requires some buffering to store them.

The second-order FIR filter is:

h(n)＝δ(n)+Gδ(n+T)+Gδ(n+2T) (4)

Its frequency response is:

H(ω)＝1+2Gcos(ωT) (5)

The filter of expression (4) uses T samples from before the frame and also T samples after it, so it requires both some buffering and some delay. It can also be seen, however, that its suppression of the frequency response between harmonics (the bands near ω = π/T, 3π/T, ...; the smaller the gain, the better the suppression) is better than that of the first-order filter. The filter can therefore be designed as:

l(n) = \begin{cases} δ(n) + Gδ(n+T) + Gδ(n+2T), & n = 0, 1, \ldots, N-T-1 \\ δ(n) + Gδ(n+T), & n = N-T, \ldots, N-1 \end{cases} \qquad (6)

It can be seen that this filter l(n) suppresses inter-harmonic noise well while having a low delay T equal to that of the first-order filter.
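The difference in inter-harmonic suppression can be made concrete by evaluating the stated responses (3) and (5) midway between harmonics, at ω = π/T, 3π/T, and so on. The sketch below simply takes those two expressions as given; the function names and the example values G = 0.4 and T = 50 are assumptions chosen for illustration.

```python
import math

def gain_first_order(omega, G, T):
    """Expression (3): gain of the first-order filter."""
    return math.sqrt(1.0 + G * G + 2.0 * G * math.cos(omega * T))

def gain_second_order(omega, G, T):
    """Absolute value of expression (5): gain of the second-order filter."""
    return abs(1.0 + 2.0 * G * math.cos(omega * T))

def interharmonic_gains(G, T, count=3):
    """Gains of both filters at omega = (2k+1)*pi/T for k = 0 .. count-1."""
    rows = []
    for k in range(count):
        omega = (2 * k + 1) * math.pi / T
        rows.append((omega, gain_first_order(omega, G, T),
                     gain_second_order(omega, G, T)))
    return rows

if __name__ == "__main__":
    for omega, first, second in interharmonic_gains(0.4, 50):
        print(f"omega={omega:.4f}  first-order={first:.3f}  second-order={second:.3f}")
```

At these frequencies the two gains reduce to |1 - G| and |1 - 2G|, and the second is the smaller of the two whenever 0 < G < 2/3.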

In the above embodiment, the pitch period T may be obtained by any means. In one embodiment, the adaptive-codebook delay parameter is obtained from the speech decoder and used as the pitch period.

In the above embodiment, the pitch gain G may likewise be obtained by any means. In one embodiment, the adaptive-codebook gain obtained from the speech decoder, or one half of that gain, is used as the pitch gain.

In summary, according to the above embodiments of the invention, by using lower-order and higher-order filtering together, a filter can be designed whose delay equals that of lower-order filtering but whose quality is better than that of the lower-order filter and close to that of the higher-order filter.

Claims

1. A post-processing filter for a decoded speech signal, comprising:

a higher-order filter for filtering the front-part processing points of each frame of the speech signal;

a lower-order filter for filtering the rear-part processing points of each frame of the speech signal;

wherein the front-part processing points and the rear-part processing points of each frame of the speech signal are divided such that the delay caused by the higher-order filter is not greater than the delay caused by the lower-order filter.

2. The filter as claimed in claim 1, characterized in that the front-part processing points and the rear-part processing points of each frame of the speech signal are divided according to the orders of the higher-order filter and the lower-order filter.

3. The filter as claimed in claim 2, characterized in that the order of the higher-order filter is l, where l is a positive integer, and the order of the lower-order filter is m, where m is a positive integer and l > m; wherein, for a speech signal frame of N points, the front-part processing points comprise the first N-l+m points and the rear-part processing points comprise the last l-m points.

4. The filter as claimed in claim 1, characterized in that the higher-order filter and the lower-order filter are finite impulse response filters.

5. The filter as claimed in claim 4, characterized in that the higher-order filter is a second-order filter and the lower-order filter is a first-order filter.

6. The filter as claimed in claim 5, characterized in that the higher-order filter satisfies:

h(n) = δ(n) + Gδ(n+T) + Gδ(n+2T);

and the lower-order filter satisfies:

g(n) = δ(n) + Gδ(n+T);

where T is the pitch period of the speech signal, G is the pitch gain of the speech signal, and δ(n) is the unit impulse.

7. The filter as claimed in claim 6, characterized in that, for a speech signal frame of N points, the higher-order filter filters the first N-T-1 points and the lower-order filter filters the last T+1 points.