WO2007028280A1

WO2007028280A1 - Encoder and decoder for pre-echo control and method thereof

Info

Publication number: WO2007028280A1
Application number: PCT/CN2005/001435
Authority: WO
Inventors: Lei Wang; Xingde Pan
Original assignee: Beijing E-World Technology Co., Ltd.
Priority date: 2005-09-08
Filing date: 2005-09-08
Publication date: 2007-03-15
Also published as: CN101228574A

Abstract

The invention discloses an encoder for pre-echo control comprising a signal type analyzing module, a modified window function module, a time-domain to frequency-domain transformation module, a quantization and entropy coding module and a code stream multiplexing module, which are connected in sequence, and a psychoacoustic analyzing module which is connected to the quantization and entropy coding module. The invention also discloses an encoding method for pre-echo control, including the steps of: 1. determining whether the input audio frame is a transient frame and, if so, linearly modifying the window function values and multiplying the audio frame by the modified window; 2. performing time-domain to frequency-domain transformation on the windowed audio frame, quantizing and entropy encoding the frequency-domain coefficients, and then multiplexing the coded audio stream with the result of the signal type analysis. The invention further discloses a decoder for pre-echo control including a demultiplexing module, a dequantization and entropy decoding module, a frequency-domain to time-domain transformation module and a modified window function module, which are connected in sequence. The invention also discloses a decoding method for pre-echo control.

Description

FIELD OF THE INVENTION The present invention relates to an apparatus and method for encoding and decoding with pre-echo control, and more particularly to an audio encoding and decoding apparatus and method that control pre-echo using a modified window function.

BACKGROUND OF THE INVENTION Perceptual encoders are generally used to encode and compress audio information. A conventional perceptual encoder has a psychoacoustic module whose function is to identify the "irrelevant components" of an audio signal. Once these components have been identified, the quantization module processes them so that the coded audio reaches "perceptual transparency", that is, the coding has no influence on what the listener hears, or the influence is within an acceptable range. When the psychoacoustic module analyzes the "irrelevant components", it relies mainly on the masking phenomenon of the human ear. The masking phenomenon, shown in Fig. 1, is the inability of the ear to perceive one sound in the presence of another; the latter sound is the masking signal 3. Masking is further divided into simultaneous masking, pre-masking, and post-masking. Pre-masking 2 and post-masking 3 occur in the time domain, which places an additional requirement on the time domain behavior of a perceptual encoder: to achieve transparent coding quality, the quantization noise must also satisfy a time-dependent masking threshold. This requirement is not easy to meet in a practical perceptual encoder. By the time-frequency uncertainty principle, when a block transform is used to map the time domain audio signal into the frequency domain, the quantization error introduced by quantizing and coding the spectral coefficients spreads out in the time domain after reconstruction by the synthesis filter bank.

For commonly used filter bank designs, such as a modified discrete cosine transform (MDCT) filter bank with a window length of 2048 samples applied to a signal sampled at 48000 Hz, the quantization error spreads over about 42.7 ms after reconstruction by the synthesis filter bank. If the signal energy within the analysis window is concentrated in a very small part of the window, the quantization noise spreads into the part of the window before the signal appears. In extreme cases, the quantization noise in some time segments even exceeds the level of the original signal. This is called the "pre-echo" phenomenon, as shown in Fig. 2 and Fig. 3. Fig. 2 is the time domain waveform of an uncoded audio signal, and Fig. 3 is the time domain waveform of the audio signal reconstructed after coding. The portion circled by an ellipse in Fig. 3 is the pre-echo 5. According to the characteristics of the human ear, if the coding noise lasts only a short time before the signal onset, the pre-echo can be masked by pre-masking; otherwise the coding noise will be perceived by the human ear. To avoid this, the time domain characteristics of the quantization noise should be considered when designing the encoder so that the time domain masking condition is satisfied. The pre-echo phenomenon has always been a major difficulty in coding fast-changing type signals (such as castanets) at low bit rates.
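The 42.7 ms figure follows directly from the window length and the sampling rate. The short sketch below repeats that arithmetic; the function name is hypothetical, and the 256-sample case is the short window length cited for the prior-art encoder of FIG. 4.

```python
def window_duration_ms(window_length: int, sample_rate_hz: float) -> float:
    """Time span covered by one transform window, in milliseconds."""
    return 1000.0 * window_length / sample_rate_hz


# Long window from the text: 2048 samples at 48000 Hz.
long_ms = window_duration_ms(2048, 48000)
# Short window used by the prior-art window switching scheme: 256 samples.
short_ms = window_duration_ms(256, 48000)

print(f"{long_ms:.1f} ms, {short_ms:.1f} ms")  # prints "42.7 ms, 5.3 ms"
```

Quantization noise can spread over the whole window, so the shorter window confines it to a span roughly eight times smaller.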

To address the pre-echo phenomenon in audio coding, the prior art includes the following techniques. Bit reservoir control: the spectral coefficients of the filter bank windows that cover the fast-changing segment are coded with increased precision. This greatly increases the number of bits required to code fast-changing frames, so the method cannot be used directly in a fixed-rate encoder. In the MPEG-1 standard, the bit reservoir method uses the bits left over from previous frames when the bit demand peaks, thereby maintaining a constant average bit rate. In practice, however, a very fast-changing signal would require a bit reservoir too large to be realized.

Adaptive window switching: Adaptive window switching is used in many perceptual encoders. This method adaptively adjusts the size of the filter bank window according to the characteristics of the input signal; the steady-state or slowly varying parts are coded with a long window, and the fast-changing parts are coded with a short window. This approach increases the amount of encoder computation and complicates the encoder structure. Since different window lengths require different interpretations and normalizations of the psychoacoustic model, as well as different frequency band and noiseless coding structures, window switching significantly increases the complexity of the encoder structure. In addition, when overlap-add filter banks are used, the window switching decision requires additional buffering and look-ahead delay in the encoder, resulting in greater end-to-end delay. Finally, although the long and short windows have good time-frequency localization, the start and stop windows used for the transitions are coded less efficiently.

Filter bank switching: Filter bank switching controls the pre-echo by using different filter bank modes. Specifically, a cosine-modulated filter bank with high frequency resolution is used for slowly varying signals, and a wavelet filter bank is used for fast-changing signals. When switching between the two filter bank modes, it is difficult to ensure complete reconstruction of the transition block.

Temporal noise shaping (TNS): Temporal noise shaping determines the signal type after the signal has been transformed into frequency domain coefficients by the filter bank. If the signal is a fast-changing type signal, the frequency domain coefficients are not quantized directly. Instead, linear prediction is first applied to the frequency domain coefficients and the residual sequence is then quantized. The use of TNS increases the amount of side information, which affects the overall coding efficiency.

FIG. 4 is a structural block diagram of a prior art audio encoding device. The encoder includes a psychoacoustic analysis module 201, a window function module 202, a time-frequency mapping module 203, a quantization and entropy encoding module 204, and a code stream multiplexing module 205. The psychoacoustic analysis module 201 calculates the perceptual entropy and the masking threshold of the input audio signal and, according to the perceptual entropy, determines whether the audio signal frame is a fast-changing type signal or a slowly varying type signal. To prevent pre-echo and ensure encoding quality, the length of the analysis window function of the window function module 202 is determined by the signal type output by the psychoacoustic analysis module 201. Specifically, if the frame is a fast-changing type signal, a window 256 samples long, with higher time resolution and lower frequency resolution, is used to prevent pre-echo; if the frame is a slowly varying type signal, a window 2048 samples long, with lower time resolution and higher frequency resolution, is used to ensure encoding efficiency. The time-frequency mapping module 203 converts the time domain audio signal into frequency domain coefficients and outputs them to the quantization and entropy coding module 204; the quantization and entropy coding module 204, under the control of the masking threshold output by the psychoacoustic analysis module 201, quantizes and entropy encodes the frequency domain coefficients and outputs the result to the code stream multiplexing module 205; the code stream multiplexing module 205 multiplexes the received data to form the audio coded stream. Although this audio encoding device can prevent pre-echo, the window function module 202 uses windows of different lengths, which increases the structural complexity of the whole encoder.

In the literature ("Adaptive Transform Coding of Speech Signals", Rainer Zelinski, Peter Noll, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-25, No. 4, August 1977), the authors discuss a transform-based speech coding method, specifically speech coding by discrete cosine transform (DCT). The speech signal $x_i$ is divided into frames, with N samples per frame (N can be 128, 256, 512, 1024, etc.). A variance estimate is computed over one frame to obtain the standard deviation $\sigma$, where $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2$. The standard deviation is quantized to $\hat{\sigma}$ and transmitted as side information to the decoding side. Dividing the frame signal by the quantized standard deviation yields $v_i$, where $v_i = x_i / \hat{\sigma}$. A DCT is performed on the sequence $v_i$ to obtain $V$, which is quantized to obtain $\hat{V}$. The quantized sequence is transmitted to the decoding end, an inverse DCT is performed to obtain $\hat{v}_i$, and this sequence is finally multiplied by $\hat{\sigma}$ to obtain the reconstructed speech signal $\hat{x}_i$. If this method is applied directly to audio coding, it is powerless against the pre-echo problem that occurs in fast-changing signal frames. Dividing by the quantized standard deviation cannot change the character of the frame signal 6; that is, the frame signal is still non-stationary, as shown in FIG. 5, which is the time domain waveform of an audio signal to which the transform-based speech coding method is applied. The method can be improved by estimating one standard deviation $\sigma_1$ before the fast change point and another standard deviation $\sigma_2$ after the fast change point, quantized as $\hat{\sigma}_1$ and $\hat{\sigma}_2$, as shown in Fig. 6, which is the time domain waveform of an audio signal under the improved transform-based speech coding method. The signal 7 before the fast change point is then divided by $\hat{\sigma}_1$, and the signal 8 after the fast change point is divided by $\hat{\sigma}_2$, where $\sigma_1^2 = \frac{1}{n_0}\sum_{i=1}^{n_0} x_i^2$ and $\sigma_2^2 = \frac{1}{N-n_0}\sum_{i=n_0+1}^{N} x_i^2$ ($n_0$ is the location of the fast change point). After this processing the frame signal becomes a quasi-stationary signal, and applying the above processing to it can greatly reduce the pre-echo problem. However, the improved method is designed for speech coding. For audio coding, it has difficulty eliminating block effects, and its coding efficiency is very low.
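The difference between one gain per frame and two gains split at the fast change point can be illustrated with a toy frame. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the frame is treated as zero-mean so the RMS value serves as the standard deviation estimate, and quantization of the standard deviations, the DCT, and the coefficient quantization are all omitted.

```python
import math


def rms(samples):
    """Root-mean-square value, used as the standard deviation estimate
    of a zero-mean frame (an assumption of this sketch)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_single(frame):
    """One gain per frame: divide every sample by the frame-wide estimate."""
    sigma = rms(frame)
    return [s / sigma for s in frame], sigma


def normalize_split(frame, change_point):
    """Two gains per frame: one before and one after the fast change point."""
    sigma1 = rms(frame[:change_point])
    sigma2 = rms(frame[change_point:])
    out = [s / sigma1 for s in frame[:change_point]]
    out += [s / sigma2 for s in frame[change_point:]]
    return out, sigma1, sigma2


# Toy frame: a quiet start followed by a loud attack at sample 12.
frame = [0.01 * (-1) ** n for n in range(12)] + [1.0 * (-1) ** n for n in range(4)]

single, _ = normalize_single(frame)
split, s1, s2 = normalize_split(frame, 12)

# Level of the loud segment relative to the quiet segment after normalization.
ratio_single = rms(single[12:]) / rms(single[:12])
ratio_split = rms(split[12:]) / rms(split[:12])
print(ratio_single, ratio_split)
```

With a single gain the level step inside the frame is unchanged (a ratio of about 100 here), whereas the split normalization brings both segments to the same level, which is the quasi-stationary behavior described above.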

SUMMARY OF THE INVENTION A primary object of the present invention is to provide an encoding apparatus for controlling pre-echo, which has a simple structure and can effectively control a pre-echo phenomenon during audio encoding.

Another object of the present invention is to provide an encoding method for controlling pre-echo, which can effectively control the pre-echo phenomenon during audio encoding.

It is still another object of the present invention to provide a decoding apparatus for controlling pre-echo, which can decode a signal encoded by the encoding method disclosed by the present invention, such that the decoded signal completely reconstructs the original audio signal.

It is still another object of the present invention to provide a decoding method for controlling pre-echo, which can decode a signal encoded by the encoding method disclosed by the present invention, such that the decoded signal completely reconstructs the original audio signal.

To achieve the above object, the present invention provides an encoding apparatus for controlling pre-echo, which includes: a signal type analysis module, configured to determine the signal type of an input audio signal frame and to output a fast change point position and a quantized mutation intensity parameter;

a correction window function module, connected to the signal type analysis module, configured to modify the analysis window function, window the input audio signal frame, and output the windowed time domain audio signal; a time-frequency mapping module, connected to the correction window function module, configured to convert the windowed time domain audio signal into frequency domain coefficients; a psychoacoustic analysis module, configured to perform psychoacoustic processing on the input audio signal frame and to output the masking threshold parameters of the scale factor bands; a quantization and entropy coding module, connected to both the time-frequency mapping module and the psychoacoustic analysis module, configured to quantize and entropy encode the frequency domain coefficients output by the time-frequency mapping module according to the masking threshold parameters output by the psychoacoustic analysis module, and to output the encoded code stream;

a code stream multiplexing module, connected to the quantization and entropy coding module and the signal type analysis module, configured to multiplex the encoded code stream output by the quantization and entropy coding module with the result output by the signal type analysis module to form an audio coded stream.

In order to achieve the above further object, the present invention provides an encoding method for controlling pre-echo, which includes the following steps:

Step 1: The signal type analysis module determines whether the signal type of the input audio signal frame is a fast-changing type. If so, the signal type analysis module calculates the fast change point position and the mutation intensity parameter of the audio signal frame, quantizes the mutation intensity parameter to obtain the quantized mutation intensity, and step 2 is performed; otherwise, the correction window function module windows the audio signal frame with the original analysis window function to obtain a windowed time domain signal, and step 4 is performed;

Step 2: The correction window function module linearly transforms the analysis window function to obtain a modified analysis window function;

Step 3: The correction window function module windows the audio signal frame with the modified analysis window function to obtain a windowed time domain signal;

Step 4: The time-frequency mapping module performs time-frequency mapping on the windowed time domain signal to obtain frequency domain coefficients;

Step 5: The quantization and entropy coding module quantizes and entropy encodes the frequency domain coefficients according to the masking threshold parameters of the scale factor bands, obtained by the psychoacoustic module through psychoacoustic processing of the audio signal frame, to obtain the encoded audio code stream;

Step 6: The code stream multiplexing module multiplexes the encoded audio code stream and the result of the signal type analysis to obtain a compressed audio code stream.

In order to achieve the above other object, the present invention provides a decoding apparatus for controlling pre-echo, which includes:

a code stream demultiplexing module, configured to demultiplex the compressed audio code stream;

an inverse quantization and entropy decoding module, connected to the code stream demultiplexing module, configured to decode and inverse quantize the demultiplexed audio code stream and to output inverse quantized frequency domain coefficients; a frequency-time mapping module, connected to the inverse quantization and entropy decoding module, configured to transform the inverse quantized frequency domain coefficients into a time domain signal;

a correction window function module, connected to the frequency-time mapping module, configured to modify the synthesis window function and window the time domain signal.

In order to achieve the above further object, the present invention provides a decoding method for controlling pre-echo, which includes the following steps:

Step 1: The code stream demultiplexing module demultiplexes the input compressed audio code stream to obtain the demultiplexed audio code stream and side information;

Step 2: The inverse quantization and entropy decoding module performs entropy decoding and inverse quantization on the demultiplexed audio code stream to obtain inverse quantized frequency domain coefficients;

Step 3: The frequency-time mapping module performs frequency-time mapping on the inverse quantized frequency domain coefficients to obtain a time domain signal;

Step 4: The correction window function module determines, according to the demultiplexed side information, whether the signal type of the audio signal frame is a fast-changing type; if so, step 5 is performed; otherwise, step 6 is performed;

Step 5: The correction window function module linearly transforms the synthesis window function to obtain a modified synthesis window function, and then windows the time domain signal with the modified synthesis window function to obtain the reconstructed audio signal;

Step 6: The correction window function module windows the time domain signal with the original synthesis window function to obtain the reconstructed audio signal.

Therefore, the present invention has the following advantages: since the window function has a fixed window length, the structure of the encoding device is simplified, and the pre-echo phenomenon during audio encoding is controlled while complete reconstruction of the audio signal is ensured.

The invention will be further described in detail below with reference to the drawings and specific embodiments.

DRAWINGS Figure 1 is a masking characteristic diagram of the human ear.

Figure 2 is an uncoded audio signal time domain graph.

Figure 3 is a time domain graph of the encoded audio signal after reconstruction.

Figure 4 is a structural block diagram of a prior art audio encoding device.

Figure 5 is an audio signal time domain graph to which a transform-based speech encoding method is applied.

Figure 6 is an audio signal time domain graph to which an improved transform-based speech encoding method is applied.

Figure 7 is a structural block diagram of Embodiment 1 of the encoding apparatus for controlling pre-echo of the present invention.

Figure 8 is a flow chart of Embodiment 1 of the encoding method for controlling pre-echo of the present invention.

Figure 9 is a schematic illustration of the original analysis window and the original synthesis window of the encoding and decoding method for controlling pre-echo of the present invention.

Figure 10 is a schematic illustration of a modified analysis window of the encoding method for controlling pre-echo of the present invention.

Figure 11 is a schematic illustration of a modified synthesis window of the decoding method for controlling pre-echo of the present invention.

Figure 12 is a schematic illustration of the window function correction that satisfies the complete reconstruction condition in the encoding method for controlling pre-echo of the present invention.

Figure 13 is a diagram showing the modified analysis window function of the transition block of the encoding method for controlling pre-echo of the present invention.

Figure 14 is a diagram showing the modified synthesis window function of the transition block of the decoding method for controlling pre-echo of the present invention.

Figure 15 is a structural block diagram of Embodiment 2 of the encoding apparatus for controlling pre-echo of the present invention.

Figure 16 is a flow chart of Embodiment 2 of the encoding method for controlling pre-echo of the present invention.

Figure 17 is a structural block diagram of Embodiment 1 of the decoding apparatus for controlling pre-echo of the present invention.

Figure 18 is a flow chart of Embodiment 1 of the decoding method for controlling pre-echo of the present invention.

Figure 19 is a structural block diagram of Embodiment 2 of the decoding apparatus for controlling pre-echo of the present invention.

Figure 20 is a flow chart of Embodiment 2 of the decoding method for controlling pre-echo of the present invention.

DETAILED DESCRIPTION The present invention uses a modified window function (MWF) to control the pre-echo that appears in audio coding, so that the pre-echo phenomenon is controlled during audio coding while complete reconstruction of the audio signal is ensured.

Referring to FIG. 7, which is a structural block diagram of Embodiment 1 of the encoding apparatus for controlling pre-echo of the present invention, the apparatus is composed of the following functional modules: a signal type analysis module 301, configured to determine the signal type of the input audio signal frame and to output a fast change point position and a quantized mutation intensity parameter, wherein the signal type analysis module 301 includes a signal type analyzer, configured to determine whether the input audio frame signal is a slowly varying type signal or a fast-changing type signal; a fast change point locator, connected to the signal type analyzer, for calculating the position of the fast change point; a mutation intensity calculator, connected to the signal type analyzer, for calculating the mutation intensity of the signal; and a mutation intensity quantizer, connected to the mutation intensity calculator, for quantizing the calculated mutation intensity; a correction window function module 302, connected to the signal type analysis module 301, configured to modify the analysis window function, window the input audio signal frame, and output the windowed time domain audio signal, thereby improving the time resolution with which the fast-changing signal is encoded; a time-frequency mapping module 304, connected to the correction window function module 302, configured to convert the windowed time domain audio signal into frequency domain coefficients; a psychoacoustic analysis module 303, configured to perform psychoacoustic processing on the input audio signal frame and to output the masking threshold parameters of the scale factor bands; a quantization and entropy encoding module 305, connected to both the time-frequency mapping module 304 and the psychoacoustic analysis module 303, configured to quantize and entropy encode the frequency domain coefficients output by the time-frequency mapping module 304 according to the masking threshold parameters output by the psychoacoustic analysis module 303, and to output an encoded code stream; and a code stream multiplexing module 306, connected to the quantization and entropy encoding module 305 and the signal type analysis module 301, configured to multiplex the encoded code stream output by the quantization and entropy encoding module 305 with the result output by the signal type analysis module 301 to form an audio coded stream.

The time-frequency mapping module 304 is composed of a filter bank, which may be a discrete Fourier transform (DFT) filter bank, a discrete cosine transform (DCT) filter bank, a modified discrete cosine transform (MDCT) filter bank, a cosine-modulated filter bank, and so on. When a DFT filter bank, a DCT filter bank, or a cosine-modulated filter bank is used, the window length of the analysis window function is equal to the audio signal frame length, and the window function can be a Hanning window, a Hamming window, or a Blackman window; when an MDCT filter bank is used, the window length of the analysis window function is twice the audio signal frame length, and the window function can be any window function that satisfies the conditions of the modified discrete cosine transform.

In the quantization and entropy coding module 305, the quantizer is composed of a set of sub-quantizers, each of which quantizes the frequency domain coefficients of a local region according to the masking threshold of a specific time-frequency region output by the psychoacoustic analysis module 303; such a region is usually called a scale factor band. The quantizer can be a scalar quantizer or a vector quantizer, such as the Moving Picture Experts Group Advanced Audio Coding (MPEG AAC) nonlinear scalar quantizer or the MPEG TwinVQ vector quantizer.

Referring to FIG. 8, which is a flowchart of Embodiment 1 of the encoding method for controlling pre-echo of the present invention, the steps are as follows:

Step 21, the signal type analysis module 301 determines whether the signal type of the input audio signal frame is a fast change type signal, if yes, go to step 22, otherwise go to step 25;

Step 22: The signal type analysis module 301 calculates the fast change point position and the mutation intensity parameter of the audio signal frame, and quantizes the mutation intensity parameter to obtain the quantized mutation intensity;

Step 23: The correction window function module 302 proportionally reduces the values of the analysis window function after the fast change point position, by a factor equal to the quantized mutation intensity, to obtain the modified analysis window function;

Step 24, the correction window function module 302 uses the modified analysis window function to window the audio signal frame to obtain a windowed time domain signal, and step 26 is performed;

Step 25, the correction window function module 302 uses the original analysis window function to window the audio signal frame to obtain a windowed time domain signal, and step 26 is performed;

Step 26: The time-frequency mapping module 304 performs time-frequency mapping processing on the windowed time domain signal to obtain a frequency domain coefficient.

Step 27: The quantization and entropy coding module 305 quantizes and entropy encodes the frequency domain coefficients according to the masking threshold parameters of the scale factor bands, obtained by the psychoacoustic module 303 through psychoacoustic processing of the audio signal frame, to obtain the encoded audio code stream;

Step 28: The code stream multiplexing module 306 multiplexes the encoded audio code stream and the result of the signal type analysis to obtain a compressed audio code stream.

In step 21 of the above encoding method, while the signal type analysis module 301 determines the signal type of the input audio signal frame, the psychoacoustic module 303 performs psychoacoustic processing on the audio signal frame to obtain the masking threshold parameters of the scale factor bands. The psychoacoustic processing calculates a masking curve for the current frame signal according to the hearing characteristics of the human ear; from the masking curve, the masking threshold of a specific time-frequency region can be calculated and used to guide quantization of the current audio frame signal. The psychoacoustic model can be the first or second psychoacoustic model used by MPEG AAC. The signal type analysis module 301 determines the signal type of the frame signal on the basis of adaptive thresholds and waveform prediction, taking the pre-masking and post-masking effects into account. The specific steps are: decompose the input frame into multiple subframes, and search for the local maximum points of the absolute value of the PCM data in each subframe; select the absolute peak of each subframe from among its local maximum points; for a given subframe absolute peak, use the absolute peaks of several (typically 3) subframes preceding that subframe to predict a typical sample value for several (typically 4) subframes delayed forward relative to that subframe; calculate the difference and the ratio between the absolute peak of the subframe and the predicted typical sample value; if the difference and the ratio are both greater than the set thresholds, it is determined that the subframe contains an abrupt signal, and it is confirmed that the subframe has a local maximum peak point capable of masking the pre-echo through pre-masking; if, between the front end of that subframe and the point 2.5 ms before the masking peak point, there is a subframe whose absolute peak is sufficiently small, the frame signal is determined to be a fast-changing type signal, the subframe containing the abrupt signal is taken as the position of the fast change point, the ratio of the absolute peak of the subframe containing the abrupt signal to the largest absolute peak of all subframes before it is taken as the mutation intensity, and the mutation intensity is quantized. The quantization method may be rounding up, rounding down, rounding to the nearest integer, etc. If the difference and the ratio are not greater than the set thresholds, the above steps are repeated until the frame signal is determined to be a fast-changing type signal or the last subframe is reached; if the last subframe is reached without the frame signal being determined to be a fast-changing type signal, the frame signal is a slowly varying type signal.
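The overall shape of this decision can be sketched as follows. This is a deliberately simplified illustration, not the procedure of the signal type analysis module 301: the waveform prediction step is replaced by the largest earlier subframe peak, the masking check around the 2.5 ms point is omitted, and the function names, the thresholds, the `floor` guard, and the choice of the first sample of the subframe as the change point position are all assumptions of the sketch.

```python
def subframe_peaks(frame, subframe_len):
    """Absolute peak of each subframe of the PCM frame."""
    return [max(abs(s) for s in frame[i:i + subframe_len])
            for i in range(0, len(frame), subframe_len)]


def detect_fast_change(frame, subframe_len=4, ratio_thresh=4.0,
                       diff_thresh=0.1, floor=1e-4):
    """Return (is_fast, change_point, quantized_intensity).

    Simplification: the reference level is the largest earlier subframe
    peak rather than a waveform-predicted value, and no masking check
    is performed. All thresholds are hypothetical.
    """
    peaks = subframe_peaks(frame, subframe_len)
    for idx in range(1, len(peaks)):
        # Guard against division by zero after digital silence.
        reference = max(max(peaks[:idx]), floor)
        diff = peaks[idx] - reference
        ratio = peaks[idx] / reference
        if diff > diff_thresh and ratio > ratio_thresh:
            # Rounding to the nearest integer is one of the listed options.
            return True, idx * subframe_len, round(ratio)
    return False, None, None


# Toy frames: a quiet segment followed by an attack, and a steady signal.
attack = [0.1 * (-1) ** n for n in range(12)] + [0.5 * (-1) ** n for n in range(4)]
steady = [0.1 * (-1) ** n for n in range(16)]

attack_result = detect_fast_change(attack)
steady_result = detect_fast_change(steady)
print(attack_result, steady_result)
```

For the attack frame the sketch reports a fast change at sample 12 with a quantized intensity of 5, the ratio of the attack peak 0.5 to the earlier peak 0.1; the steady frame is classified as slowly varying.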

There are many methods for transforming time domain audio signals into frequency domain signals, such as the discrete Fourier transform (DFT), the discrete cosine transform (DCT), cosine-modulated filter banks, the modified discrete cosine transform (MDCT), and the wavelet transform. When an orthogonal transform method such as the DFT, the DCT, or a cosine-modulated filter bank is used, the window length of the analysis window function is equal to the length of the audio signal frame, and the window function can be a Hanning window, a Hamming window, or a Blackman window; when the MDCT is used, the window length of the analysis window function is twice the frame length of the audio signal, and the window function can be any window function that satisfies the conditions of the modified discrete cosine transform. The analysis window function is a fixed-length window function whose length is an integer greater than 1, preferably 2 to the power of N, where N is a natural number. For the selection of windows, see "Discrete-Time Signal Processing (2nd edition)", Xi'an Jiaotong University Press, A. V. Oppenheim, R. W. Schafer, J. R. Buck, translated by Liu Shuzhen and Huang Jianguo, 2001. Since time-frequency mapping and windowing are inseparable, the process of time-frequency mapping and windowing is illustrated below using the MDCT.

When the MDCT is used for the time-frequency mapping, the time domain signals of the M samples of the previous frame and the M samples of the current frame are first selected, and the 2M samples of the two frames are windowed by module 302, the analysis window being twice as long as the frame. The windowed signal is then transformed by the time-frequency mapping module 304 to obtain M frequency domain coefficients. With $h_k(n)$ denoting the impulse response of the MDCT analysis filter, which incorporates the window function $w(n)$, the MDCT is:

$$X(k) = \sum_{n=0}^{2M-1} x(n)\,h_k(n), \qquad 0 \le k \le M-1$$

where w(n) is the window function, x(n) is the input time domain audio signal of the MDCT, and X(k) is the output frequency domain signal of the MDCT. For the signal to be completely reconstructed, the window function w(n) of the MDCT must satisfy the following two conditions:

$$w(2M-1-n) = w(n) \quad\text{and}\quad w^2(n) + w^2(n+M) = 1$$

In practice, a Sine window, a KBD window, or the like can be selected as the window function. The Sine window is taken as an example to illustrate how the window is modified in module 302 to control the pre-echo. It should be noted that the present invention is not limited to the Sine window and the KBD window; any window function that satisfies the MDCT conditions can be corrected in this way to control the pre-echo.
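As a quick numerical check of the two window conditions, the sketch below builds a Sine window of length 2M and tests symmetry and power complementarity. The helper names `sine_window` and `satisfies_mdct_conditions` are illustrative assumptions; the half-sample offset in the Sine window is the usual MDCT convention and is assumed here.

```python
import numpy as np

def sine_window(M):
    """Sine window of length 2M: w(n) = sin(pi * (n + 0.5) / (2M))."""
    n = np.arange(2 * M)
    return np.sin(np.pi * (n + 0.5) / (2 * M))

def satisfies_mdct_conditions(w):
    """Check w(2M-1-n) = w(n) and w^2(n) + w^2(n+M) = 1."""
    M = len(w) // 2
    symmetric = np.allclose(w[::-1], w)
    power_complementary = np.allclose(w[:M] ** 2 + w[M:] ** 2, 1.0)
    return bool(symmetric and power_complementary)

ok_sine = satisfies_mdct_conditions(sine_window(8))
ok_rect = satisfies_mdct_conditions(np.ones(16))  # rectangular window: w^2 + w^2 = 2
```

The Sine window passes both conditions, while a rectangular window of the same length is symmetric but fails the second one.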

When the frame signal 9 is analyzed as a slowly varying type signal, the original analysis window function is left unchanged; Fig. 9 is a schematic diagram of the original analysis window function of the encoding and decoding method for controlling pre-echo of the present invention. If the frame signal 9 is a fast-changing type signal, the original analysis window function is corrected. The correction is a proportional reduction of the window function values after the fast change point, with the reduction factor equal to the quantized mutation intensity. The modified analysis window function is shown in Fig. 10, in which the fast change point is at the 1280th sample point 10 and the quantized mutation intensity is 5. To satisfy the condition of complete signal reconstruction, once the analysis window has been corrected in this way, the synthesis (integrated) window used in decoding must also be corrected. That correction is a proportional amplification of the window function values after the fast change point, with the amplification factor equal to the quantized mutation intensity. The modified synthesis window function is shown in Fig. 11, in which the fast change point is at the 1280th sample point 11 and the quantized mutation intensity is 5. The proof that the analysis window and the synthesis window still satisfy the complete reconstruction condition after the above correction is given below.

The literature (S. F. Cheung and J. S. Lim, "Incorporation of biorthogonality into lapped transforms for audio compression", Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Detroit, May 1997, pp. 3079-3082) proves that the bi-orthogonal complete reconstruction conditions of the cosine-modulated filter bank are:

$$\sum_{m=0}^{2K-1-2s} p_s(mM+n)\,p_a\big((m+2s)M+n\big) = \delta(s)$$

$$\sum_{m=0}^{2K-1-2s} (-1)^m\,p_s(mM+n)\,p_a\big((m+2s)M+(M-n-1)\big) = 0$$

where $p_s(n)$ and $p_a(n)$ are the synthesis window and the analysis window, respectively, M is the frame length, K is an integer greater than zero, s = 0, 1, ..., K-1, and n = 0, 1, ..., M-1.

If the cosine-modulated filter bank used is the MDCT, then K is equal to 1 and s is equal to 0. Substituting into the above formulas gives:

$$p_s^{i}(n)\,p_a^{i}(n) + p_s^{i-1}(M+n)\,p_a^{i-1}(M+n) = 1$$

$$p_s^{i}(n)\,p_a^{i}(M-n-1) - p_s^{i-1}(M+n)\,p_a^{i-1}(2M-n-1) = 0 \qquad (2)$$

where the superscript denotes the frame number: i-1 denotes the previous frame and i denotes the current frame.

FIG. 12 is a schematic diagram showing the window function correction in accordance with the complete reconstruction condition in the encoding method for controlling pre-echo according to the present invention. Fig. 12 shows the signal of four consecutive frames. The i-th frame has been judged to be a fast-changing frame, the position of the fast change point 12 is the point M+L, and the mutation intensity is scs, where scs is a real number. When the MDCT is performed on the i-th frame, the original analysis window $p_a(n)$ and the original synthesis window $p_s(n)$ are corrected as described above to obtain the modified analysis window $p_a'(n)$ and the modified synthesis window $p_s'(n)$:

$$p_a'(n) = \begin{cases} p_a(n), & 0 \le n < M+L \\ p_a(n)/scs, & M+L \le n < 2M \end{cases}$$

and

$$p_s'(n) = \begin{cases} p_s(n), & 0 \le n < M+L \\ p_s(n)\cdot scs, & M+L \le n < 2M \end{cases}$$

So, for the i-th frame:

$$p_s'^{\,i}(n)\,p_a'^{\,i}(n) + p_s^{i-1}(M+n)\,p_a^{i-1}(M+n) = p_s^{i}(n)\,p_a^{i}(n) + p_s^{i-1}(M+n)\,p_a^{i-1}(M+n) = 1$$

$$p_s'^{\,i}(n)\,p_a'^{\,i}(M-n-1) - p_s^{i-1}(M+n)\,p_a^{i-1}(2M-n-1) = p_s^{i}(n)\,p_a^{i}(M-n-1) - p_s^{i-1}(M+n)\,p_a^{i-1}(2M-n-1) = 0 \qquad (3)$$

It can be seen from equations (2) and (3) that when the MDCT is performed on the i-th frame, the corrected windows still allow complete reconstruction. Similarly, when the MDCT is performed on frames i+1 and i+2, the corrected windows still allow complete reconstruction. It can be further proved that as long as the original window is linearly transformed, the complete reconstruction property of the transform is not changed.
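The correction of the window pair can be illustrated with a short sketch. It scales the analysis window down and the synthesis window up after the fast change point and confirms only that their pointwise product is unchanged; it does not test the full reconstruction conditions. The function names `sine_window` and `modify_windows`, and the example values M = 1024, fast change point 1280, and intensity 5 (the latter two taken from Figs. 10 and 11), are assumptions for illustration.

```python
import numpy as np

def sine_window(M):
    """Sine window of length 2M, used here for both analysis and synthesis."""
    n = np.arange(2 * M)
    return np.sin(np.pi * (n + 0.5) / (2 * M))

def modify_windows(p_a, p_s, change_pos, scs):
    """Divide the analysis window and multiply the synthesis window by scs
    for all samples at or after change_pos (the point M+L in the text)."""
    p_a_mod = np.array(p_a, dtype=float)
    p_s_mod = np.array(p_s, dtype=float)
    p_a_mod[change_pos:] /= scs
    p_s_mod[change_pos:] *= scs
    return p_a_mod, p_s_mod

M = 1024                      # assumed frame length
w = sine_window(M)
p_a_mod, p_s_mod = modify_windows(w, w, change_pos=1280, scs=5.0)

# The pointwise product of the window pair is the same before and after correction.
product_unchanged = np.allclose(p_a_mod * p_s_mod, w * w)
```

Because the two scalings are reciprocal, `p_a_mod * p_s_mod` equals `w * w` at every sample, which is the term that appears in the first of the two reconstruction conditions.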

The abrupt change at the fast change point of the corrected window generates high frequency information, which reduces the coding efficiency. Therefore, in step 23, when the function values of the analysis window function after the fast change point position are reduced in equal proportion, a transition block can be added near the fast change point so that the value of the analysis window function changes slowly there. The transition block is added as follows. Assume that the window function is $w(n)$, where $0 \le n \le 2M-1$, the quantized mutation intensity is $s$, and the fast change point position is $L$. If no transition block is added, the modified window function is:

$$w'(n) = \begin{cases} w(n), & 0 \le n < L \\ w(n)/s, & L \le n \le 2M-1 \end{cases}$$

If a transition block is added, and the transition block consists of the $I$ sample points before and the $I-1$ sample points after the fast change point position, the modified window function is:

$$w'(n) = \begin{cases} w(n), & 0 \le n < L-I \\ w(n)/g(n), & L-I \le n \le L+I-1 \\ w(n)/s, & L+I \le n \le 2M-1 \end{cases}$$

where $g(n)$ is a linear function:

$$g(n) = \frac{(s-1)(n-L+I)}{2I} + 1$$

Fig. 13 is a schematic diagram of the modified analysis window function with the transition block in the encoding method for controlling pre-echo of the present invention. Corresponding to the modified analysis window function used in encoding, the transition block processing is also added in decoding; Fig. 14 is a schematic diagram of the modified synthesis window function with the transition block in the decoding method for controlling pre-echo of the present invention. The transition block 13 of Fig. 13 and the transition block 14 of Fig. 14 each have a length of 64 sample points. Since it has been proved that a linear transformation of the original window does not change the complete reconstruction property of the transform, the analysis window function or the synthesis window function can also be given an arbitrary linear transformation according to the signal type. The linear transformation of the transition segment, or the arbitrary linear transformation of the window function, can be applied to the analysis window or the synthesis window used in the encoding and decoding methods of all of the above embodiments.

In step 27, the quantization and entropy coding comprises two steps, nonlinear quantization and entropy coding, and the quantization may be scalar quantization or vector quantization. The scalar quantization may use the nonlinear scalar quantization of MPEG AAC, and the vector quantization may use the vector quantization of MPEG TwinVQ. The quantization may also use an audio coding method based on the criterion of minimizing the global noise-to-mask ratio together with entropy coding (patent application number 03146213.8). After quantization, entropy coding is used to further remove the statistical redundancy of the quantized coefficients and the side information, and the compressed audio code stream is finally obtained.

Fig. 15 is a block diagram showing the structure of a second embodiment of the encoding apparatus for controlling pre-echo of the present invention. This embodiment adds to Embodiment 1 of the encoding apparatus a sub-band analysis module 307, which is connected to the signal type analysis module 301 and performs sub-band analysis on the input audio signal. With sub-band analysis, the window function can be corrected differently for each frequency segment according to its mutation intensity and signal properties. For example, the low frequency band generally does not produce pre-echo, so the window function of the low frequency band can be left uncorrected. Window function correction can thus be controlled more flexibly for different frequency segments.

For the above encoding device, the present invention discloses a second embodiment of the encoding method for controlling pre-echo. FIG. 16 is a flowchart of Embodiment 2 of the encoding method for controlling pre-echo according to the present invention. The steps are as follows:

Step 40: the sub-band analysis module 307 performs sub-band analysis on the input audio signal frame; the sub-band analysis segments by frequency and divides the audio signal frame into multiple sub-band audio signals;

Step 41: the signal type analysis module 301 determines, for each sub-band audio signal frame, whether its signal type is a fast-changing type signal; if yes, step 42 is performed; otherwise, step 45 is performed;

Step 42: the signal type analysis module 301 calculates, for each sub-band audio signal frame, the fast change point position parameter and the mutation intensity parameter, and quantizes the mutation intensity parameter of each sub-band audio signal frame to obtain its quantized mutation intensity value;

Step 43: the correction window function module 302 reduces, for each sub-band, the function values of the analysis window function after the fast change point position in equal proportion, the reduction factor being equal to the quantized mutation intensity value of that sub-band audio signal frame, to obtain the modified analysis window function;

Step 44: the correction window function module 302 windows each sub-band audio signal frame with the modified analysis window function to obtain the windowed sub-band time domain signals, and step 46 is performed;

Step 45: the correction window function module 302 windows the sub-band audio signal frame with the original analysis window function to obtain the windowed time domain signal, and step 46 is performed;

Step 46: the time-frequency mapping module 304 performs time-frequency mapping on the windowed sub-band time domain signals to obtain frequency domain coefficients;

Step 47: the quantization and entropy coding module 305 integrates the frequency domain coefficients of the sub-bands;

Step 48: the quantization and entropy coding module 305 quantizes and entropy encodes the frequency domain coefficients according to the masking threshold parameters of the scale factor bands, obtained by the psychoacoustic module 303 through psychoacoustic processing of the audio signal frame, to obtain an encoded audio code stream;

Step 49: the code stream multiplexing module 306 multiplexes the encoded audio code stream and the result of the signal type analysis to obtain a compressed audio code stream.

In step 41 of the above encoding method, while the signal type analysis module 301 determines the signal type of the sub-band audio signal frames, the psychoacoustic module 303 performs psychoacoustic processing on the input audio signal frame to obtain the masking threshold parameters of the scale factor bands. The psychoacoustic processing calculates a masking curve for the current frame signal according to the hearing characteristics of the human ear. From the masking curve, the masking threshold of a specific time-frequency region can be calculated and used to guide the quantization of the current audio frame signal. The psychoacoustic model may be the first or second psychoacoustic model used by MPEG AAC.

The advantage of the second embodiment of the encoding method of the present invention is that the window function can be corrected differently for each frequency segment according to its mutation intensity and signal properties. For example, if the low frequency band does not produce pre-echo, the window function of the low frequency band can be left uncorrected, so that window function correction is controlled more flexibly for different frequency segments.

FIG. 17 is a structural block diagram of Embodiment 1 of a decoding apparatus for controlling pre-echo of the present invention. The device is composed of the following functional modules: a code stream demultiplexing module 401, configured to demultiplex the compressed audio code stream; an inverse quantization and entropy decoding module 402, connected to the code stream demultiplexing module 401, configured to decode and inverse quantize the demultiplexed audio code stream and output the inverse-quantized frequency domain coefficients; a frequency-time mapping module 403, connected to the inverse quantization and entropy decoding module 402, configured to transform the inverse-quantized frequency domain coefficients into a time domain signal, the frequency-time mapping module 403 being composed of a filter bank that is the inverse transform filter bank corresponding to the encoding device; and a correction window function module 404, connected to the frequency-time mapping module 403, configured to modify the synthesis window function and window the time domain signal.

FIG. 18 is a flowchart of Embodiment 1 of a decoding method for controlling pre-echo according to the present invention. The steps are as follows:

Step 31: the code stream demultiplexing module demultiplexes the input compressed audio code stream to obtain the demultiplexed audio code stream and side information;

Step 32: the inverse quantization and entropy decoding module performs inverse quantization and entropy decoding on the demultiplexed audio code stream to obtain the inverse-quantized frequency domain coefficients;

Step 33: the frequency-time mapping module performs frequency-time mapping on the inverse-quantized frequency domain coefficients to obtain a time domain signal;

Step 34: the correction window function module determines, according to the demultiplexed side information, whether the signal type of the audio signal frame is the fast-changing type; if yes, step 35 is performed; otherwise, step 37 is performed;

Step 35: the correction window function module amplifies the function values of the synthesis window function after the fast change point position in equal proportion, the amplification factor being equal to the quantized mutation intensity value, to obtain the modified synthesis window function;

Step 36: the correction window function module windows the time domain signal with the modified synthesis window function to obtain the reconstructed audio signal;

Step 37: the correction window function module windows the time domain signal with the original synthesis window function to obtain the reconstructed audio signal.

In step 33, the frequency-time mapping and window function correction applied to the frequency domain coefficients correspond to the time-frequency mapping and window function correction in the encoding method, and the corresponding inverse mapping and window function are selected according to the encoding control information in the compressed audio code stream. The frequency-time mapping can be implemented by the inverse discrete cosine transform (IDCT), the inverse discrete Fourier transform (IDFT), or the inverse modified discrete cosine transform (IMDCT).

The inverse modified discrete cosine transform (IMDCT) is taken as an example below to illustrate the frequency-time mapping. Since the frequency-time mapping and the windowing are inseparable, the frequency-time mapping and the window function correction are considered together here.

First, an IMDCT is performed on the inverse-quantized spectrum to obtain the transformed time domain signal $x_{i,n}$. The IMDCT is expressed as:

$$x_{i,n} = \frac{2}{N}\sum_{k=0}^{N/2-1} \mathrm{spec}_i(k)\,\cos\!\left(\frac{2\pi}{N}\,(n+n_0)\left(k+\frac{1}{2}\right)\right)$$

where $\mathrm{spec}_i(k)$ is the inverse-quantized spectrum; n is the sample index, with $0 \le n < N$; N is the length of the window; $n_0 = (N/2+1)/2$; i is the frame index; and k is the spectral index.

Secondly, the time domain signal obtained by the IMDCT is windowed in the time domain. To satisfy the complete reconstruction condition, the window function w(n) must satisfy the following two conditions:

$$w(2M-1-n) = w(n) \quad\text{and}\quad w^2(n) + w^2(n+M) = 1$$

Finally, the windowed time domain signals are overlap-added to obtain the time domain audio signal. Specifically, the first N/2 samples of the signal obtained after windowing are added to the last N/2 samples of the previous frame signal to obtain N/2 output time domain audio samples:

$$timeSam_{i,n} = preSam_{i,n} + preSam_{i-1,\,n+N/2}$$

where i is the frame index, n is the sample index with $0 \le n < N/2$, and N is the window length.
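The windowing, IMDCT, and overlap-add steps can be tried out with the following minimal sketch. It uses the IMDCT expression and the phase term n0 = (N/2+1)/2 given above; the forward transform with a factor of 2 is an assumed normalization (chosen so that it pairs with the 2/N factor of the inverse), the original unmodified Sine window is used for both analysis and synthesis, and all function names are illustrative.

```python
import numpy as np

def mdct_basis(N):
    """cos(2*pi/N * (n + n0) * (k + 1/2)), shape (N/2, N), with n0 = (N/2 + 1) / 2."""
    n = np.arange(N)
    k = np.arange(N // 2)
    n0 = (N / 2 + 1) / 2
    return np.cos(2 * np.pi / N * np.outer(k + 0.5, n + n0))

def sine_window(N):
    return np.sin(np.pi * (np.arange(N) + 0.5) / N)

def mdct(block, window):
    # Assumed forward normalization (factor 2), paired with 2/N in the inverse.
    return 2.0 * mdct_basis(len(block)) @ (window * block)

def imdct(spec, window):
    N = 2 * len(spec)
    x = (2.0 / N) * (mdct_basis(N).T @ spec)  # IMDCT as in the text
    return window * x                         # synthesis windowing

def roundtrip(x, M):
    """Analyse overlapping 2M-sample blocks (hop M), then overlap-add the outputs."""
    N = 2 * M
    w = sine_window(N)
    out = np.zeros(len(x))
    for start in range(0, len(x) - N + 1, M):
        out[start:start + N] += imdct(mdct(x[start:start + N], w), w)
    return out

rng = np.random.default_rng(0)
M = 16
x = rng.standard_normal(4 * M)
y = roundtrip(x, M)
```

Only samples covered by two overlapping blocks are reconstructed, so the first and last M samples of `y` are not expected to match `x`; the interior samples should.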

The correction of the synthesis window function, shown in Figures 9 and 11, has been described above. The correction applied to the original synthesis window function is a proportional amplification of the window function values after the fast change point, with the amplification factor equal to the quantized mutation intensity; the modified synthesis window function is shown in Fig. 11. It has been proved above that the synthesis window, together with the analysis window used in encoding, still satisfies the complete reconstruction condition after this correction.

The abrupt change at the fast change point of the corrected window generates high frequency information, which reduces the coding efficiency. Therefore, in step 35, corresponding to the correction made at the time of encoding, when the function values of the synthesis window function after the fast change point position are amplified in equal proportion, a transition block can be added near the fast change point so that the value of the window function changes slowly there. The transition block is added as follows. Assume that the window function is $w(n)$, where $0 \le n \le 2M-1$, the quantized mutation intensity is $s$, and the fast change point position is $L$. If no transition block is added, the corrected window function is:

$$w'(n) = \begin{cases} w(n), & 0 \le n < L \\ w(n)/s, & L \le n \le 2M-1 \end{cases}$$

If a transition block is added, and the transition block consists of the $I$ sample points before and the $I-1$ sample points after the fast change point position, the corrected window function is:

$$w'(n) = \begin{cases} w(n), & 0 \le n < L-I \\ w(n)/g(n), & L-I \le n \le L+I-1 \\ w(n)/s, & L+I \le n \le 2M-1 \end{cases}$$

where $g(n)$ is a linear function:

$$g(n) = \frac{(s-1)(n-L+I)}{2I} + 1$$

These expressions are written in the form applied to the analysis window at the encoder; for the synthesis window the divisions by $g(n)$ and $s$ are replaced by multiplications, so that the product of the two windows is unchanged. Fig. 14 is a schematic diagram of the modified synthesis window function with the transition block in the decoding method for controlling pre-echo of the present invention; the transition block 14 in Fig. 14 has a length of 64 sample points. Since it has been proved that a linear transformation of the original window does not change the complete reconstruction property of the transform, the analysis window function or the synthesis window function can also be given an arbitrary linear transformation according to the signal type. The linear transformation of the transition segment, or the arbitrary linear transformation of the window function, can be applied to the analysis window or the synthesis window used in the encoding and decoding methods of all of the above embodiments.
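A rough sketch of the transition block follows. It builds a per-sample gain that is 1 before the transition block, ramps linearly across it, and equals the quantized mutation intensity s afterwards; the analysis window is divided by this gain and the synthesis window multiplied by it. The ramp expression g(n) = 1 + (s-1)(n-L+I)/(2I) is an assumption reconstructed from the surrounding definitions, and the function name `transition_gain` and the values used (window length 2048, L = 1280, I = 32 for a 64-sample block, s = 5) are illustrative only.

```python
import numpy as np

def transition_gain(two_m, L, I, s):
    """Gain curve: 1 for n < L-I, assumed linear ramp g(n) for L-I <= n <= L+I-1,
    and s for n >= L+I."""
    n = np.arange(two_m)
    gain = np.ones(two_m)
    ramp = (n >= L - I) & (n <= L + I - 1)
    gain[ramp] = 1.0 + (s - 1.0) * (n[ramp] - L + I) / (2.0 * I)
    gain[n >= L + I] = s
    return gain

two_m, L, I, s = 2048, 1280, 32, 5.0
w = np.sin(np.pi * (np.arange(two_m) + 0.5) / two_m)  # Sine window of length 2M

gain = transition_gain(two_m, L, I, s)
w_analysis = w / gain    # encoder side: scaled down after the fast change point
w_synthesis = w * gain   # decoder side: scaled up by the same curve
```

With these values the gain passes through 3 at the fast change point itself and reaches 5 only after the 64-sample block, so neither window has a step discontinuity, while `w_analysis * w_synthesis` still equals `w * w`.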

Fig. 19 is a block diagram showing the configuration of the second embodiment of the decoding apparatus for controlling pre-echo of the present invention. This embodiment adds to Embodiment 1 of the decoding apparatus a sub-band synthesis module 405, which is connected to the correction window function module 404 and performs sub-band synthesis on the reconstructed sub-band time domain signals.

For the above decoding apparatus, the present invention discloses a second embodiment of the decoding method for controlling pre-echo. FIG. 20 is a flowchart of Embodiment 2 of the decoding method for controlling pre-echo of the present invention. The steps are as follows:

Step 51: the code stream demultiplexing module demultiplexes the input compressed audio code stream to obtain the demultiplexed audio code stream and side information;

Step 52: the inverse quantization and entropy decoding module divides the demultiplexed audio code stream into multiple paths according to frequency;

Step 53: the inverse quantization and entropy decoding module performs inverse quantization and entropy decoding on each path of the demultiplexed audio code stream to obtain the inverse-quantized frequency domain coefficients, and the frequency-time mapping module performs frequency-time mapping on the inverse-quantized frequency domain coefficients of each path to obtain the multi-path time domain signals;

Step 54: the correction window function module determines, for each path and according to the demultiplexed side information, whether the signal type of the audio signal frame is the fast-changing type; if yes, step 55 is performed; otherwise, step 57 is performed;

Step 55: the correction window function module amplifies, for each path, the function values of the synthesis window function after the fast change point position in equal proportion, the amplification factor being equal to the quantized mutation intensity value, to obtain the modified synthesis window function;

Step 56: the correction window function module windows the time domain signal of each path with the modified synthesis window function to obtain the multi-path reconstructed audio signals, and step 58 is performed;

Step 57: the correction window function module windows the time domain signal of each path with the original synthesis window function to obtain the multi-path reconstructed audio signals, and step 58 is performed;

Step 58: the sub-band synthesis module synthesizes the multi-path reconstructed audio signals.

It should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions may be modified or equivalently substituted, and any such change that does not depart from the spirit of the technical solutions of the present invention should be included in the scope of the claims of the present invention.

Claims

1. An encoding device for controlling pre-echo, comprising:

a signal type analysis module, configured to determine a signal type of the input audio signal frame, and output a fast change point position and a quantized mutation intensity parameter;

a correction window function module, coupled to the signal type analysis module, configured to modify the analysis window function, window the input audio signal frame, and output the windowed time domain audio signal;

a time-frequency mapping module, coupled to the correction window function module, configured to convert the windowed time domain audio signal into frequency domain coefficients;

a psychoacoustic analysis module, configured to perform psychoacoustic processing on the input audio signal frame, and output a masking threshold parameter of a scale factor band;

a quantization and entropy coding module, connected to the time-frequency mapping module and to the psychoacoustic analysis module, configured to quantize and entropy encode the frequency domain coefficients output by the time-frequency mapping module according to the masking threshold parameter output by the psychoacoustic analysis module, and to output the encoded code stream;

2. The apparatus for controlling pre-echo according to claim 1, wherein said signal type analysis module comprises:

a signal type analyzer, coupled to the correction window function module and the code stream multiplexing module, configured to determine whether the input audio frame signal is a slowly varying type signal or a fast-changing type signal;

a fast change point locator, coupled to the signal type analyzer, the correction window function module, and the code stream multiplexing module, configured to calculate the position of the fast change point;

a mutation intensity calculator, coupled to the signal type analyzer and the code stream multiplexing module, configured to calculate the mutation intensity of the signal;

a mutation intensity quantizer, coupled to the mutation intensity calculator, the correction window function module, and the code stream multiplexing module, configured to quantize the calculated mutation intensity of the signal.

3. The apparatus for controlling pre-echo in accordance with claim 1, wherein said time-frequency mapping module is constituted by a filter bank.

4. The apparatus for controlling pre-echo in accordance with claim 3, wherein said filter bank is a discrete Fourier transform filter bank or a discrete cosine transform filter bank; and a window length in the analysis window function is equal to a frame length.

5. The apparatus for controlling pre-echo according to claim 4, wherein the analysis window function is a Hanning window, a Hamming window or a Blackman window.

6. The apparatus for controlling pre-echo in accordance with claim 3, wherein said filter bank is a modified discrete cosine transform filter bank; and the window length in the analysis window function is twice the frame length.

7. The apparatus for controlling pre-echo in accordance with claim 6, wherein said analysis window function is a window function conforming to a condition of a modified discrete cosine transform.

8. The apparatus for controlling pre-echo in accordance with claim 1, wherein said signal type analysis module is further coupled to a subband analysis module for performing subband analysis on said input audio signal frame.

9. A coding method for controlling pre-echo, comprising the following steps:

Step 1: The signal type analysis module determines whether the signal type of the input audio signal frame is a fast-changing type signal; if so, the signal type analysis module calculates the fast change point position parameter and the mutation intensity parameter of the audio signal frame, quantizes the mutation intensity parameter to obtain a quantized mutation intensity value, and step 2 is then performed; otherwise, the correction window function module windows the audio signal frame with the original analysis window function to obtain the windowed time domain signal, and step 4 is then performed;

Step 2: The correction window function module linearly transforms the analysis window function to obtain a modified analysis window function;

Step 6: The code stream multiplexing module multiplexes the encoded audio code stream and the signal type analysis result to obtain a compressed audio code stream.

10. The method for encoding a pre-echo control according to claim 9, wherein, in step 1, the signal type analysis module determines whether the signal type of the input audio signal frame is a fast-changing type signal by performing, on the audio signal frame, forward and backward masking effect analysis based on an adaptive threshold and waveform prediction.

11. The method for encoding a pre-echo control according to claim 9, wherein in step 2, while the function values of the analysis window function after the fast change point position are reduced in equal proportion, a transition block is added near the fast change point so that the value of the analysis window function near the fast change point changes slowly.

12. The method of encoding a pre-echo control according to claim 9, wherein the linear transformation in step 2 is a reduction, in equal proportion, of the function values of the analysis window function after the fast change point position, the reduction factor being equal to the quantized mutation intensity value.

13. The method for encoding a pre-echo control according to claim 9, wherein in step 4, the time-frequency mapping process is a discrete Fourier transform or a discrete cosine transform, and the window length of the analysis window function is equal to the length of the audio signal frame.

14. The method for encoding a pre-echo control according to claim 13, wherein the analysis window function is a Hanning window, a Hamming window or a Blackman window.

15. The method for encoding a pre-echo control according to claim 9, wherein in step 4, the time-frequency mapping process is a modified discrete cosine transform, and the window length of the analysis window function is twice the length of the audio signal frame.

16. The method for encoding a pre-echo control according to claim 15, wherein the analysis window function is a window function conforming to the conditions of the modified discrete cosine transform.

17. The method of encoding pre-echo control according to claim 9, wherein prior to step 1, the sub-band analysis module first performs sub-band analysis on the audio signal frame to obtain multiple sub-band audio signals, and after step 4 the frequency domain coefficients of the multiple sub-bands are integrated.

18. The method for encoding a pre-echo control according to claim 17, wherein the sub-band analysis module performs the sub-band analysis on the audio signal frame by segmenting it according to frequency.

19. The method of encoding pre-echo in accordance with claim 9, wherein, in step 5, the psychoacoustic module performs psychoacoustic processing on the audio signal frame using the first or second psychoacoustic model of Moving Picture Experts Group Advanced Audio Coding (MPEG AAC).

20. The method for encoding a pre-echo control according to claim 9, wherein, in step 5, the quantization and entropy coding module quantizes the frequency domain coefficients using a scalar quantization method or a vector quantization method.

21. The method of encoding a pre-echo control according to claim 20, wherein said scalar quantization method is the nonlinear scalar quantization method employed by Moving Picture Experts Group Advanced Audio Coding (MPEG AAC).

22. The method of encoding a pre-echo control according to claim 20, wherein said vector quantization method is the vector quantization method used by Moving Picture Experts Group TwinVQ (MPEG TwinVQ).

23. The method for encoding a pre-echo control according to claim 9, wherein the analysis window function in step 2 is a fixed-length window function whose length is an integer greater than 1, preferably 2 to the power of N, where N is a natural number.

24. A decoding device for controlling pre-echo, comprising:

an inverse quantization and entropy decoding module, connected to the code stream demultiplexing module and configured to decode and inverse quantize the demultiplexed audio code stream and to output inverse quantized frequency domain coefficients;

a frequency time mapping module, coupled to the inverse quantization and entropy decoding module, configured to transform the inverse quantized frequency domain coefficients into a time domain signal;

25. The decoding device for controlling pre-echo according to claim 24, wherein the modified window function module is further connected to a sub-band synthesis module configured to perform sub-band synthesis on the multiple sub-band time domain signals.

26. The decoding device for controlling pre-echo according to claim 24, wherein said frequency time mapping module consists of a filter bank.

27. The decoding device for controlling pre-echo according to claim 26, wherein said filter bank is an inverse transform filter bank corresponding to that of the encoding device.

28. A decoding method for controlling pre-echo, comprising the following steps:

Step 1: The code stream demultiplexing module demultiplexes the input compressed audio code stream, and obtains the demultiplexed audio code stream and side information;

Step 5: The modified window function module linearly transforms the integrated window function to obtain a modified integrated window function, and then windows the time domain signal with the modified integrated window function to obtain the reconstructed audio signal;

29. The decoding method for controlling pre-echo according to claim 28, wherein, in step 5, the function values of the integrated window function after the fast change point position are amplified proportionally, and a transition block is added near the fast change point so that the integrated window function changes slowly near the fast change point.

30. The decoding method for controlling pre-echo according to claim 28, wherein said linear transformation in said step 5 amplifies the function values of said integrated window function after the fast change point position by an equal ratio, the amplification value being equal to the quantized value of the intensity of the abrupt change.

31. The decoding method for controlling pre-echo according to claim 28, wherein, before step 2, the inverse quantization and entropy decoding module segments the demultiplexed audio code stream by frequency, and, after step 5, sub-band synthesis is performed on the reconstructed audio signal.

32. The decoding method according to claim 28, wherein said integrated window function in said steps 4 and 5 is a corresponding window function selected based on the encoding control information in said compressed audio code stream.

33. The decoding method according to claim 28, wherein the frequency time mapping in step 3 is a corresponding inverse mapping selected according to the encoding control information in the compressed audio code stream.
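
For illustration only, the window modification recited in the decoding claims (proportional amplification of the window values after the fast change point, with a transition block near that point) might be sketched as follows. This is a minimal sketch and not the claimed implementation: the sine window, the function names `sine_window` and `modify_window`, the parameters `change_point`, `gain` and `transition_len`, and the linear shape of the ramp are all assumptions introduced here.

```python
import math


def sine_window(length):
    """Plain sine window, used only as a stand-in for the unmodified window."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def modify_window(window, change_point, gain, transition_len):
    """Scale the window values from change_point onward by gain.

    The transition_len samples just before change_point form a
    hypothetical transition block in which the scale ramps linearly
    from 1.0 toward gain, so the modified window changes gradually.
    """
    start = max(0, change_point - transition_len)
    modified = []
    for n, w in enumerate(window):
        if n < start:
            scale = 1.0  # untouched region before the transition block
        elif n < change_point:
            # linear ramp inside the assumed transition block
            frac = (n - start + 1) / (change_point - start + 1)
            scale = 1.0 + frac * (gain - 1.0)
        else:
            scale = gain  # proportional amplification after the change point
        modified.append(w * scale)
    return modified


# Window length 16 = 2**4; gain stands in for the quantized intensity value.
window = sine_window(16)
modified = modify_window(window, change_point=8, gain=4.0, transition_len=4)
```

Here `gain` plays the role of the quantized intensity of the abrupt change, and setting `transition_len` to 0 gives a hard step at the change point with no transition block.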