WO2005027094A1 - Method and device of multi-resolution vector quantization for audio encoding and decoding - Google Patents
- Publication number
- WO2005027094A1 (PCT/CN2003/000790)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- vector
- resolution
- time
- quantization
- frequency
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/038—Vector quantisation, e.g. TwinVQ audio
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0212—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
- G10L19/0216—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation using wavelet decomposition
Definitions
- the present invention relates to the field of signal processing and, in particular, to a coding method and device for implementing multi-resolution analysis and vector quantization of audio signals.
- Background Art:
- an audio coding method includes steps of psychoacoustic model calculation, time-frequency domain mapping, quantization, and encoding.
- the time-frequency domain mapping refers to mapping an audio input signal from the time domain to the frequency domain or the time-frequency domain.
- Time-frequency domain mapping, also called transformation or filtering, is a basic operation of audio signal coding that can improve coding efficiency. Through this operation, most of the information contained in the time-domain signal can be transformed into, or concentrated in, a subset of the frequency-domain or time-frequency-domain coefficients.
- a basic operation of a perceptual audio encoder is to map the input audio signal from the time domain to the frequency domain or the time-frequency domain. The basic idea is: decompose the signal into components in each frequency band; once the input signal is expressed in the frequency domain, the psychoacoustic model can be used to remove perceptually irrelevant information; then the components in each frequency band are grouped; finally, bits are allocated appropriately to express each group of frequency parameters.
- because the audio signal exhibits strong quasi-periodicity, this process can greatly reduce the data volume and improve coding efficiency.
- the commonly used time-frequency domain mapping methods are the discrete Fourier transform (DFT), the discrete cosine transform (DCT), quadrature mirror filtering (QMF), pseudo-QMF (PQMF), cosine-modulated filtering (CMF), the modified discrete cosine transform (MDCT), and the discrete wavelet (packet) transform (DW(P)T). However, these methods either use a single transform/filter configuration to compress and express an input signal frame, or use a filter bank or transform with a short time-domain analysis interval to express rapidly changing signals and thereby suppress pre-echo in the decoded signal.
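- For illustration only, the sketch below computes a textbook direct-form MDCT of one frame with a sine window; the exact transform/filter configuration and window are assumptions, not a prescription taken from this disclosure.

```python
import numpy as np

def mdct(frame, window=None):
    """Direct-form MDCT of one frame of 2N samples -> N coefficients.

    Textbook formulation with a sine window; illustrative only, not the
    specific filter bank prescribed by this disclosure.
    """
    n2 = len(frame)                      # 2N input samples
    n = n2 // 2                          # N output coefficients
    if window is None:
        window = np.sin(np.pi / n2 * (np.arange(n2) + 0.5))   # sine window
    xw = frame * window
    ns = np.arange(n2)
    ks = np.arange(n)
    # X[k] = sum_n xw[n] * cos(pi/N * (n + 0.5 + N/2) * (k + 0.5))
    basis = np.cos(np.pi / n * np.outer(ns + 0.5 + n / 2, ks + 0.5))
    return xw @ basis

coeffs = mdct(np.random.randn(2048))     # 1024 coefficients from a 2048-sample frame
```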
- the vector quantization technology can be used to improve the coding efficiency.
- the current audio coding method that applies vector quantization is the Transform-domain Weighted Interleave Vector Quantization (TWINVQ) coding method. After MDCT transformation of the signal, this method constructs the vectors to be quantized by interleaved selection of the spectral parameters and then uses efficient vector quantization to significantly improve the coded audio quality at lower bit rates.
- TWINVQ encoding method is a perceptually lossy encoding method.
- the TWINVQ encoding method needs further improvement.
- the TWINVQ encoding method uses coefficient interleaving; although this ensures statistical consistency among the vectors, it cannot effectively exploit the concentration of signal energy in local time-frequency regions, which limits further improvement of coding efficiency.
- because the MDCT is essentially an equal-bandwidth filter bank, the signal cannot be decomposed according to where its energy aggregates in the time-frequency plane, which also limits the efficiency of the TWINVQ coding method.
- the time-frequency plane therefore needs to be divided effectively, so that the distance between components within a class is as small as possible and the distance between classes is as large as possible; this is the problem of multi-resolution filtering of the signal.
- the vectors then need to be reorganized, selected, and quantized based on this effective time-frequency division so that the coding gain is maximized; this is the problem of multi-resolution vector quantization of the signal.
- the technical problem to be solved by the present invention is to provide a multi-resolution vector quantization audio encoding and decoding method and device that can adjust the time-frequency resolution for different types of input signals and effectively exploit the local agglomeration of the signal energy in the time-frequency domain when performing vector quantization, thereby improving coding efficiency.
- the multi-resolution vector quantization audio encoding method of the present invention includes: adaptively filtering an input audio signal to obtain time-frequency filter coefficients and output a filtered signal; performing vector division on the time-frequency plane of the filtered signal to obtain vector combinations; selecting vectors for vector quantization; performing vector quantization on the selected vectors and calculating the quantization residual; the quantization codebook information is transmitted to the audio decoder as side information of the encoder, and the quantization residual is quantized and encoded.
- the multi-resolution vector quantization audio decoding method of the present invention includes: demultiplexing the code stream to obtain the side information of the multi-resolution vector quantization, namely the energy of the selected points and the position information of the vector quantization; using inverse vector quantization to recover the normalized vectors according to this information, calculating the normalization factors, and reconstructing the quantized vectors on the original time-frequency plane; adding the reconstructed vectors to the residuals of the corresponding time-frequency coefficients according to the position information; and performing multi-resolution inverse filtering and frequency-to-time mapping to obtain the reconstructed audio signal.
- the multi-resolution vector quantized audio encoder of the present invention includes a time-frequency mapper, a multi-resolution filter, a multi-resolution vector quantizer, a psychoacoustic calculation module, and a quantization encoder;
- the time-frequency mapper receives the audio input signal, performs time-to-frequency domain mapping, and outputs the result to the multi-resolution filter;
- the multi-resolution filter is configured to perform adaptive filtering on the mapped signal and output the filtered signal to the psychoacoustic calculation module and the multi-resolution vector quantizer;
- the multi-resolution vector quantizer is configured to perform vector quantization on the filtered signal and calculate the quantization residual, pass the quantization codebook information to the audio decoder as side information, and output the quantization residual to the quantization encoder;
- the psychoacoustic calculation module is configured to calculate a masking threshold of the psychoacoustic model according to the input audio signal, and output to the quantization encoder, for controlling the noise allowed by the quantization;
- the multi-resolution vector quantization audio decoder of the present invention includes a decoding and inverse quantizer, a multi-resolution inverse vector quantizer, a multi-resolution inverse filter, and a frequency-time mapper. The decoding and inverse quantizer demultiplexes the code stream and performs entropy decoding and inverse quantization to obtain side information and encoded data, which are output to the multi-resolution inverse vector quantizer. The multi-resolution inverse vector quantizer performs the inverse vector quantization process, reconstructs the quantized vectors, adds the reconstructed vectors to the residual coefficients on the time-frequency plane, and outputs the result to the multi-resolution inverse filter. The multi-resolution inverse filter inverse-filters the sum of the reconstructed vectors and the residual coefficients and outputs it to the frequency-time mapper. The frequency-time mapper completes the mapping of the signal from frequency to time to obtain the final reconstructed audio signal.
- the audio encoding and decoding method and device based on the multi-resolution vector quantization (Multiresolution Vector Quantization, MRVQ for short) technology of the present invention can adaptively filter audio signals; through multi-resolution filtering they effectively exploit the concentration of signal energy in local time-frequency regions and can, according to the type of signal, adaptively adjust the time and frequency resolution; by reorganizing the filter coefficients, different organization strategies can be chosen according to the aggregation characteristics of the signal, making effective use of the results of the above multi-resolution time-frequency analysis. Quantizing these regions with vector quantization not only improves the coding efficiency but also makes it convenient to control and optimize the quantization accuracy.
- FIG. 1 is a flowchart of a multi-resolution vector quantization audio coding method according to the present invention
- FIG. 3 is a schematic diagram of a source encoding/decoding system based on a cosine modulation filter
- FIG. 4 is a schematic diagram of three aggregation modes of energy after multi-resolution filtering
- FIG. 6 is a schematic diagram of dividing a vector in three ways
- FIG. 8 is a schematic diagram of the region energy / maximum values
- FIG. 9 is a flowchart of another embodiment of multi-resolution vector quantization.
- FIG. 10 is a schematic structural diagram of a multi-resolution vector quantization audio encoder according to the present invention.
- FIG. 11 is a schematic structural diagram of a multi-resolution filter in an audio encoder
- FIG. 12 is a schematic structural diagram of a multi-resolution vector quantizer in an audio encoder
- FIG. 13 is a flowchart of the multi-resolution vector quantization audio decoding method of the present invention.
- FIG. 14 is a flowchart of multi-resolution inverse filtering
- FIG. 15 is a schematic structural diagram of a multi-resolution vector quantization audio decoder according to the present invention
- FIG. 16 is a schematic structural diagram of a multi-resolution inverse vector quantizer in an audio decoder
- FIG. 17 is a structural diagram of a multi-resolution inverse filter in an audio decoder.
- the flowchart shown in Figure 1 gives the overall technical solution of the audio coding method of the present invention.
- the input audio signal is first subjected to multi-resolution filtering; the filter coefficients are then reorganized, vectors are divided on the time-frequency plane, and the vectors to be quantized are further selected and determined. After the vectors are determined, each vector is quantized to obtain the corresponding vector quantization codebook and quantization residual.
- the vector quantization codebook is sent to the decoder as side information, and the quantization residual is quantized and encoded.
- the flowchart of multi-resolution filtering on the audio signal is shown in Figure 2.
- the input audio signal is decomposed into frames, and a transient measure is calculated for each signal frame.
- by comparing the value of the transient measure with a threshold, it is determined whether the current signal frame is a slowly changing signal or a fast changing signal.
- the filter structure is selected according to the signal frame type: if the frame is a slowly changing signal, equal-bandwidth cosine modulation filtering is performed to obtain the filter coefficients on the time-frequency plane, and the filtered signal is output.
- if it is a fast-changing signal, equal-bandwidth cosine modulation filtering is performed first to obtain the filter coefficients on the time-frequency plane, a wavelet transform is then used to perform multi-resolution analysis on the filter coefficients and adjust their time-frequency resolution, and finally the filtered signal is output.
- a series of fast-changing signal types can be further defined; that is, multiple thresholds can be used to subdivide the fast-changing signals, and different types of fast-changing signals use different wavelet transforms for multi-resolution analysis.
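- A minimal sketch of such a frame-type decision is shown below; the transient measure (peak-to-mean sub-block energy ratio) and the threshold values are illustrative assumptions, since the text only states that a transient measure is compared against one or more thresholds.

```python
import numpy as np

def classify_frame(frame, n_sub=16, thresholds=(4.0, 10.0)):
    """Classify a frame as 'slow', 'fast_I' or 'fast_II'.

    The transient measure (peak-to-mean sub-block energy ratio) and the
    threshold values are illustrative assumptions only.
    """
    sub = frame[: len(frame) // n_sub * n_sub].reshape(n_sub, -1)
    energy = (sub ** 2).sum(axis=1) + 1e-12      # short-term energies
    measure = energy.max() / energy.mean()       # transient measure
    if measure < thresholds[0]:
        return "slow"                            # equal-bandwidth filtering only
    return "fast_I" if measure < thresholds[1] else "fast_II"   # add wavelet analysis
```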
- the wavelet base can be fixed or adaptive.
- the filtering of slowly changing signals and fast changing signals is based on the technology of a cosine modulation filter bank.
- the cosine modulation filter bank includes two types of filtering: traditional cosine modulation filtering technology and modified discrete cosine transform MDCT technology.
- the source coding / decoding system based on cosine modulation filtering is shown in Figure 3.
- the input signal is decomposed into M subbands by the analysis filter bank, and the subband coefficients are quantized and entropy coded.
- at the decoder, the subband coefficients are recovered and passed through a synthesis filter bank to restore the audio signal.
- the cosine modulation filter banks represented by the formulas (F-1) and (F-2) are orthogonal filter banks.
- a symmetrical window is further specified
- the other form of filtering is the modified discrete cosine transform (MDCT), also known as TDAC (Time Domain Aliasing Cancellation).
- the cosine modulation filter bank has an impulse response of:
- the cosine modulation filter bank is a bi-orthogonal modulation filter bank.
- the analysis window and synthesis window of the cosine modulation filter bank can adopt any window form that satisfies the complete reconstruction condition of the filter bank, such as the SINE and KBD windows commonly used in audio coding.
- cosine modulation filter bank filtering can use the fast Fourier transform to improve calculation efficiency; refer to "A New Algorithm for the Implementation of Filter Banks based on 'Time Domain Aliasing Cancellation'" (P. Duhamel, Y. Mahieux and J. P. Petit, Proc. ICASSP, May 1991, pp. 2209-2212).
- wavelet transform technology is also a well-known technology in the field of signal processing.
- the signal after multi-resolution analysis and filtering has the property that its energy is redistributed and concentrated on the time-frequency plane, as shown in FIG. 4.
- for signals that are stationary in the time domain, such as sinusoidal signals, the energy in the time-frequency plane is concentrated in one frequency band along the time direction, as shown in FIG. 4a; for signals that vary rapidly in the time domain, in particular fast-changing signals with obvious pre-echo in audio coding such as castanets, the energy is mainly distributed along the frequency direction, that is, most of the energy is concentrated at a few time points, as shown in FIG. 4b; and for noise-like signals, whose frequency content spreads over a wide range, the energy accumulates in several modes, along the time direction, along the frequency direction, and over regions, as shown in FIG. 4c.
- the frequency resolution of the low-frequency portion is high, and the frequency resolution of the high-frequency portion is low; because the components that cause the pre-echo phenomenon are mainly in the middle and high frequencies, improving the coding quality of these components can effectively suppress the pre-echo.
- an important starting point of multi-resolution vector quantization is to target these important filter coefficients and to optimize the error introduced by their quantization; it is therefore particularly important to apply efficient coding strategies to these coefficients.
- from the above analysis it can be seen that the energy distribution of the signal after multi-resolution filtering follows strong regularities, so the important filter coefficients can be effectively reorganized and classified.
- vector quantization can effectively use this feature to combine coefficients.
- the regions on the time-frequency plane are organized into a matrix of one-dimensional vectors (a vector matrix).
- vector quantization is performed on all or part of the matrix elements of the vector matrix.
- the quantization information is transmitted to the decoder as side information of the encoder, and the quantization residual and the unquantized coefficients together form a residual signal for subsequent quantization and coding.
- FIG. 5 describes in detail the process of performing multi-resolution vector quantization on the audio signal after multi-resolution filtering.
- the process of multi-resolution vector quantization includes three sub-processes of vector division, vector selection, and vector quantization.
- the vectors can be combined and extracted in different ways for the time-frequency plane, as shown in Figs. 6-a, 6-b, and 6-c.
- the vectors are divided into 8 * 16 eight-dimensional vectors along the frequency direction, referred to as the type I vector organization for short.
- Figure 6-b is the result of dividing the vector according to the time direction.
- Figure 6-c is the result of organizing the vectors according to the time-frequency region.
- there are 16 * 8 eight-dimensional vectors in total, referred to as the type III vector organization; in this way, each division method yields 128 eight-dimensional vectors.
- the vector set obtained by the type I organization can be denoted {v_f}, the vector set obtained by the type II organization {v_t}, and the vector set obtained by the type III organization {v_t-f}.
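- The sketch below shows one way to build the three vector organizations from a 64 x 16 coefficient grid with 8-dimensional vectors; the 4 x 2 block shape used for the type III (time-frequency region) split is an assumption, as the exact region shape is not specified here.

```python
import numpy as np

def organize_vectors(tf, dim=8):
    """Build the type I / II / III vector organizations from a
    (frequency x time) grid, here 64 x 16 as in the example above.
    The 4 x 2 block shape for the type III regions is an assumption."""
    f_bins, t_slots = tf.shape
    # type I: 8 consecutive frequency bins per time slot -> 8*16 = 128 vectors
    v_f = tf.reshape(f_bins // dim, dim, t_slots).transpose(0, 2, 1).reshape(-1, dim)
    # type II: 8 consecutive time slots per frequency bin -> 64*2 = 128 vectors
    v_t = tf.reshape(f_bins, t_slots // dim, dim).reshape(-1, dim)
    # type III: 4x2 time-frequency regions of 8 coefficients -> 16*8 = 128 vectors
    v_tf = tf.reshape(f_bins // 4, 4, t_slots // 2, 2).transpose(0, 2, 1, 3).reshape(-1, dim)
    return v_f, v_t, v_tf

v_f, v_t, v_tf = organize_vectors(np.random.randn(64, 16))
```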
- the first method is to select all vectors on the entire time-frequency plane for quantization.
- All vectors refer to the vectors covering all the time-frequency grid points obtained according to a certain division.
- all vectors obtained by the type I vector organization, all vectors obtained by the type II vector organization, or all vectors obtained by the type III vector organization may be used; that is, all vectors of exactly one organization are selected.
- the quantization gain refers to the ratio of the signal energy before quantization to the quantization error energy.
- the vector organization having the larger gain value is selected.
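- A small sketch of this selection criterion, assuming that each organization has already been quantized so that its error is available:

```python
import numpy as np

def quantization_gain(original, quantized):
    """Quantization gain: energy before quantization / quantization error energy."""
    err = np.asarray(original) - np.asarray(quantized)
    return (np.asarray(original) ** 2).sum() / max((err ** 2).sum(), 1e-12)

def pick_organization(candidates):
    """candidates: {'I': (coeffs, quantized), 'II': ..., 'III': ...}.
    Returns the organization whose quantization gain is largest."""
    return max(candidates, key=lambda k: quantization_gain(*candidates[k]))
```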
- the second method is to select the most important vector for quantization.
- the most important vector may include a vector in the frequency direction, a vector in the time direction, or a vector in the time-frequency region.
- the side information also needs to include the serial numbers of these vectors.
- the specific method of selecting vectors is described below. After the vectors to be quantized are determined, vector quantization is performed; whether all vectors are selected for quantization or only the important vectors are selected, the basic unit is the quantization of a single vector.
- for a single D-dimensional vector, considering the trade-off between dynamic range and codebook size, the vector needs to be normalized before quantization, which yields a normalization factor.
- the normalization factor reflects the variation in the energy dynamic range across the different vectors.
- the normalized vector is then quantized, which includes quantization of the codebook index and quantization of the normalization factor; considering the bit-rate limitation and the coding gain, the fewer bits the quantization of the normalization factor occupies, the better.
- curve and surface fitting, multi-resolution decomposition, and prediction methods can be used to calculate the multi-resolution time-frequency coefficient envelope and obtain the normalization factor.
- FIG. 7 and FIG. 9 respectively show flowcharts of two specific embodiments of the multi-resolution vector quantization process.
- the embodiment shown in FIG. 7 selects vectors according to the energy and the variance of the components within each vector, uses a Taylor expansion to describe the multi-resolution time-frequency coefficient envelope and obtain the normalization factor, and then quantizes to achieve multi-resolution vector quantization.
- the embodiment shown in FIG. 9 selects a vector according to the coding gain, and calculates a multi-resolution time-frequency coefficient envelope using a spline curve fitting to obtain a normalization factor, and then quantizes to achieve multi-resolution vector quantization.
- vector organization is performed according to the frequency direction, the time direction, and the time-frequency region. If the number of frequency coefficients is N = 1024, the time-frequency multi-resolution filtering generates 64 * 16 grid points.
- the vector dimension is 8
- a vector in the form of an 8 * 16 matrix can be obtained by dividing by frequency
- a vector in the form of a 64 * 2 matrix can be obtained by dividing by time
- a vector in the form of a 16 * 8 matrix can be obtained according to the time-frequency region.
- the basis for selecting a vector is the energy of the vector and the variance of each component within the vector.
- the absolute values of the vector components are taken to exclude the influence of their signs.
- the ratio of each vector's energy to the total energy determines which vectors are selected and their number M; a typical value of M is an integer in the range 3-50. The first M vectors are then selected for vector quantization. If vectors covering the same region appear in the type I, type II, and type III vector organizations, they are ordered by variance. Through the above steps, the M vectors to be quantized are selected.
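- A sketch of such an energy-based selection is given below; the cumulative-energy ratio used to fix M is an assumed parameter (the text here only states that M is typically an integer between 3 and 50 and that the energy ratio drives the choice).

```python
import numpy as np

def select_vectors(vectors, energy_ratio=0.7, m_min=3, m_max=50):
    """Return the indices of the M most energetic vectors.

    M is the smallest count whose cumulative energy reaches `energy_ratio`
    of the total (an assumed parameter), clipped to the 3-50 range given
    in the text.  Vectors covering the same region would additionally be
    ordered by the variance of their absolute-valued components.
    """
    energy = (np.abs(vectors) ** 2).sum(axis=1)
    order = np.argsort(energy)[::-1]                      # descending energy
    cum = np.cumsum(energy[order]) / max(energy.sum(), 1e-12)
    m = int(np.searchsorted(cum, energy_ratio)) + 1
    m = int(np.clip(m, m_min, min(m_max, len(vectors))))
    return order[:m]
```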
- the quantization search process for each order difference is completed.
- the vector needs to be normalized twice.
- the global maximum absolute value is used in the first normalization, and the signal envelope is estimated from a finite number of points for the second normalization; the vector at the corresponding position is then normalized a second time using the estimated value. After the two normalizations, the dynamic range of the vectors is effectively controlled.
- the signal envelope estimation method is implemented by Taylor expansion, which will be described in detail later.
- vector quantization is performed according to the following steps: first, the parameters of the Taylor approximation formula are determined so that the Taylor formula can represent the approximate energy of any vector in the entire time-frequency plane, and the maximum energy or maximum absolute value is calculated; then the selected vectors are normalized for the first time; the energy approximation of each vector to be quantized is calculated with the Taylor formula and the second normalization is performed; finally, the normalized vectors are quantized according to the minimum-distortion criterion, and the quantization residuals are calculated.
- the above steps are described in detail below.
- the coefficient on each time-frequency grid point corresponds to a certain energy value.
- define the coefficient energy of a time-frequency grid point as the square of the coefficient or its absolute value; define the energy of a vector as the sum of the coefficient energies over all time-frequency grid points that make up the vector, or as the largest absolute value among these coefficients; and define the energy of a time-frequency plane region as the sum of the coefficient energies, or the largest absolute coefficient value, over all time-frequency grid points constituting the region.
- therefore, to obtain the energy of a vector, the energy sum or the maximum absolute value must be computed over all time-frequency grid-point coefficients contained in the vector. For the entire time-frequency plane, the division manners of FIG. 6-a, 6-b, and/or 6-c can be adopted, and the divided regions are numbered (1, 2, ..., N).
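- These definitions translate directly into code; the following sketch mirrors them, with the choice between squared coefficients and absolute values, and between sum and maximum, left as parameters.

```python
import numpy as np

def grid_energy(coeffs, squared=True):
    """Coefficient energy per time-frequency grid point: square or absolute value."""
    coeffs = np.asarray(coeffs)
    return coeffs ** 2 if squared else np.abs(coeffs)

def vector_or_region_energy(coeffs, use_sum=True):
    """Energy of a vector or region: sum of the grid-point energies, or the
    largest absolute coefficient value."""
    coeffs = np.asarray(coeffs)
    return grid_energy(coeffs).sum() if use_sum else np.abs(coeffs).max()
```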
- $f(x_0 + \Delta) \approx f(x_0) + f^{(1)}(x_0)\,\Delta + \frac{1}{2!} f^{(2)}(x_0)\,\Delta^2 + \frac{1}{3!} f^{(3)}(x_0)\,\Delta^3$  (1)
- the first-, second-, and third-order differences of this sequence can be calculated by regression; that is, DY, D²Y, and D³Y can be obtained from Y.
- the dots indicate the regions to be quantized, selected from all N regions, where N is the number of regions in the division of the entire time-frequency plane.
- the process of obtaining the normalization factor is as follows: a global gain factor Global_Gain is determined according to the total energy of the signal and is quantized and encoded with a logarithmic model. The vector is then normalized by Global_Gain, after which the local normalization factor Local_Gain at the current vector position is calculated according to the Taylor formula (1) and the current vector is normalized again. The overall normalization factor Gain of the current vector is therefore the product of the two normalization factors: Gain = Global_Gain * Local_Gain.
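- A sketch of this two-factor normalization and its inverse, assuming Global_Gain and Local_Gain have already been computed (from the total energy and from the Taylor/spline envelope, respectively):

```python
def normalize_vector(vec, global_gain, local_gain):
    """Two-stage normalization with Gain = Global_Gain * Local_Gain.
    Both factors are assumed to be available: Global_Gain from the total
    signal energy, Local_Gain from the envelope estimate at this position."""
    gain = global_gain * local_gain
    return [v / gain for v in vec], gain

def denormalize_vector(norm_vec, global_gain, local_gain):
    """Decoder side: the same two factors scale the reconstructed
    normalized vector back to the coefficient domain."""
    gain = global_gain * local_gain
    return [v * gain for v in norm_vec]
```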
- Local_Gain does not need to be quantized at the encoder; at the decoder, the local normalization factor Local_Gain can be obtained by the same process according to the Taylor formula (1), and multiplying the reconstructed normalized vector by Global_Gain and Local_Gain gives the reconstructed value of the current vector. Therefore, the side information that needs to be encoded at the encoder is the set of function values at the dots selected in FIG. 8 together with their first- and second-order difference values.
- the present invention uses vector quantization to encode them.
- the vector quantization process is described as follows:
- the function values f(x) of the preselected M regions constitute an M-dimensional vector y.
- the first-order and second-order differences corresponding to this vector are known and are denoted dy and d²y, respectively.
- the three vectors are quantized separately.
- a codebook corresponding to three vectors has been obtained by using a codebook training algorithm, and the quantization process is a process of searching for the best matching vector.
- the vector y corresponds to the zero-order approximation of the Taylor formula, and the distortion measure in the codebook search uses the Euclidean distance.
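- A minimal full-search sketch of this codeword selection under the Euclidean distortion measure (the codebook itself is assumed to be given, e.g. from prior training):

```python
import numpy as np

def vq_search(vec, codebook):
    """Full codebook search under the Euclidean distortion measure.
    `codebook` is an (L x D) array of codewords, assumed to come from
    prior training.  Returns the best index and the quantization residual."""
    dist = ((codebook - vec) ** 2).sum(axis=1)     # squared Euclidean distances
    idx = int(np.argmin(dist))
    return idx, vec - codebook[idx]                # residual goes to the next coding stage
```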
- the quantization of the first-order difference dy corresponds to the first-order approximation of Taylor's formula:
- the quantization of the first order difference first searches for a small number of codewords with the least distortion in the corresponding codebook according to the Euclidean distance.
- for each of these candidate codewords, the quantization distortion of each region in the small neighborhood is calculated using formula (3), and the total distortion sum is finally used as the distortion metric, that is:
- the above method can be easily extended to the case of two-dimensional time-frequency surfaces.
- FIG. 9 shows another specific embodiment of the multi-resolution vector quantization process.
- vector organization is performed according to the frequency direction, the time direction, and the time-frequency region. If not all vectors are quantized, the coding gain of each vector is calculated, and the M vectors with the largest coding gain are selected for vector quantization.
- the method for determining M is: after the vectors are sorted by energy in descending order, M is the number of vectors whose cumulative energy exceeds an empirical percentage threshold of the total energy (for example, 50%-90%). For more effective quantization, each vector is normalized twice: the first normalization uses the global maximum absolute value, and the second uses spline fitting to calculate the normalization value within the vector. After the two normalizations, the dynamic range of the vectors is effectively controlled.
- the entire time-frequency plane is re-divided and the regions are numbered (1, 2, ..., N).
- the m-th order B-spline function on the interval $[x_i, x_{i+m+1}]$ is defined recursively as:
- $N_{i,m}(x) = \frac{x - x_i}{x_{i+m} - x_i} N_{i,m-1}(x) + \frac{x_{i+m+1} - x}{x_{i+m+1} - x_{i+1}} N_{i+1,m-1}(x)$  (6)
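- The recursion in (6) can be evaluated as in the sketch below (Cox-de Boor form); the handling of coincident knots is simplified and the code is generic, not the disclosure's own fitting routine.

```python
def bspline_basis(i, m, x, knots):
    """Evaluate the B-spline basis function N_{i,m}(x) via the recursion (6)
    (Cox-de Boor form).  Coincident-knot handling is simplified; this is a
    generic sketch, not the disclosure's own fitting routine."""
    if m == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left_den = knots[i + m] - knots[i]
    right_den = knots[i + m + 1] - knots[i + 1]
    left = 0.0 if left_den == 0.0 else \
        (x - knots[i]) / left_den * bspline_basis(i, m - 1, x, knots)
    right = 0.0 if right_den == 0.0 else \
        (knots[i + m + 1] - x) / right_den * bspline_basis(i + 1, m - 1, x, knots)
    return left + right
```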
- any spline can be expressed as:
- the dots represent the regions to be encoded selected from all N regions, where N is obtained by dividing the entire time-frequency plane.
- the specific vector quantization process is as follows: at the encoder, the global gain factor Global_Gain is determined from the total energy of the signal and is quantized and encoded using a logarithmic model; the vectors to be quantized are then normalized by Global_Gain, after which the local normalization factor Local_Gain at the current vector position is calculated according to the fitting formula (7) and the current vector is normalized again. The overall normalization factor Gain of the current vector is therefore the product of the two factors: Gain = Global_Gain * Local_Gain.
- Local_Gain does not need to be quantized at the encoder; at the decoder, Local_Gain can be obtained by the same process according to the fitting formula (7), and multiplying the total gain by the reconstructed normalized vector yields the reconstructed value of the current vector. Therefore, when the spline curve fitting method is used, the side information that needs to be encoded at the encoder is the set of function values at the points selected in FIG. 8, and the present invention uses vector quantization to encode them.
- the process of vector quantization is described as follows:
- the function values f(x) of the M preselected regions form an M-dimensional vector y.
- the vector y can be further decomposed into several sub-vectors to control the vector size and improve the accuracy of the vector quantization; these sub-vectors are called selection point vectors.
- each vector y is quantized.
- the corresponding vector codebook can be obtained by using the codebook training algorithm.
- the quantization process is a process of searching for the best matching vector, and the searched codeword index is transmitted to the decoder as side information.
- the quantization error continues to the next quantization encoding process.
- the audio encoder shown in FIG. 10 includes a time-frequency mapper, a multi-resolution filter, a multi-resolution vector quantizer, a psychoacoustic calculation module, and a quantization encoder.
- the input audio signal to be encoded is divided into two paths: one passes through the time-frequency mapper and enters the multi-resolution filter for multi-resolution analysis, and the analysis result serves as the input for vector quantization and for adjusting the psychoacoustic calculation;
- the other way is to enter the psychoacoustic calculation module to estimate the psychoacoustic masking value of the current signal, which is used to control the perceptually irrelevant component of the quantization encoder;
- the multi-resolution vector quantizer uses the output of the multi-resolution filter to divide the coefficients into vectors and performs vector quantization on them.
- the quantization residual is quantized and entropy coded by a quantization encoder.
- FIG. 11 is a schematic structural diagram of a multi-resolution filter in the audio encoder shown in FIG. 10.
- the multi-resolution filter includes a transient metric calculation module, a plurality of equal-bandwidth cosine modulation filters, a plurality of multi-resolution analysis modules, and a time-frequency filter coefficient organization module; the number of multi-resolution analysis modules is one less than the number of equal-bandwidth cosine modulation filters.
- the working principle is as follows: after analysis by the transient metric calculation module, the input audio signal is classified as a slowly changing signal or a fast changing signal, and the fast changing signals can be further divided into type I and type II fast changing signals.
- slowly varying signals are input to an equal-bandwidth cosine modulation filter to obtain the required time-frequency filter coefficients; the various types of fast-varying signals are first filtered by the equal-bandwidth cosine modulation filter and then enter the corresponding multi-resolution analysis module, which performs a wavelet transform on the filter coefficients and adjusts their time-frequency resolution; finally, the time-frequency filter coefficient organization module outputs the filtered signal.
- the structure of the multi-resolution vector quantizer is shown in FIG. 12, and includes a vector organization module, a vector selection module, a global normalization module, a local normalization module, and a quantization module.
- the time-frequency plane coefficients output by the multi-resolution filter pass through the vector organization module, and are organized into a vector form according to different division strategies.
- the vector selection module selects the vectors to be quantized according to factors such as their energy and outputs them to the global normalization module.
- in the global normalization module, a first, global normalization is applied to all vectors using the global normalization factor; the local normalization factor of each vector is then calculated in the local normalization module, a second, local normalization is performed, and the result is output to the quantization module.
- in the quantization module, the twice-normalized vectors are quantized and the quantization residual is calculated as the output of the multi-resolution vector quantizer.
- the present invention also provides a multi-resolution vector quantization audio decoding method.
- the received code stream is first demultiplexed, entropy decoded, and inverse quantized to obtain the quantized global normalization factor and the quantization indices of the selected points.
- from the codebook, the energy of each selected point and its difference values of each order are calculated, and the position information of the vector quantization on the time-frequency plane is obtained from the code stream; the secondary normalization factor at the corresponding position is then obtained according to the Taylor formula or the spline curve fitting formula. Next, the normalized vector is obtained from the vector quantization index and multiplied by the above two normalization factors to reconstruct the quantized vector on the time-frequency plane. The reconstructed vector is added to the corresponding decoded and inverse-quantized coefficients of the time-frequency plane, and multi-resolution inverse filtering and frequency-to-time mapping are performed to complete decoding and obtain the reconstructed audio signal.
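- A decoder-side reconstruction sketch along these lines is shown below; the representation of the position side information as per-vector index arrays is an assumption.

```python
import numpy as np

def reconstruct_tf_plane(residual_tf, positions, norm_vectors, global_gain, local_gains):
    """Decoder-side sketch: scale each inverse-quantized normalized vector by
    Global_Gain * Local_Gain and add it to the residual coefficients at its
    recorded time-frequency position.  Representing `positions` as tuples of
    (frequency, time) index arrays is an assumption about the side information."""
    tf = residual_tf.copy()
    for pos, nv, lg in zip(positions, norm_vectors, local_gains):
        tf[pos] += np.asarray(nv) * (global_gain * lg)   # add reconstructed vector
    return tf    # input to multi-resolution inverse filtering and frequency-to-time mapping
```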
- Figure 14 illustrates the process of multi-resolution inverse filtering in the decoding method.
- the time-frequency coefficients of the reconstructed vectors are organized on the time-frequency plane, and the following filtering operations are performed according to the decoded signal type: if the signal is slowly changing, equal-bandwidth cosine modulation filtering is performed to obtain the pulse-code-modulated (PCM) output in the time domain; if it is fast changing, multi-resolution synthesis is performed first, followed by equal-bandwidth cosine modulation filtering, to obtain the PCM output in the time domain.
- fast-changing signals can be further subdivided into multiple types, and different types of fast-changing signals use different multi-resolution synthesis methods.
- the corresponding audio decoder is shown in FIG. 15, and specifically includes a decoding and inverse quantizer, a multi-resolution inverse vector quantizer, a multi-resolution inverse filter, and a frequency-time mapper.
- the decoding and inverse quantizer demultiplexes the received code stream, performs entropy decoding and inverse quantization, obtains side information of multi-resolution vector quantization, and outputs it to the multi-resolution inverse vector quantizer.
- the multi-resolution inverse vector quantizer reconstructs the quantized vector according to the inverse quantization result and the side information, and restores the value of the time-frequency plane.
- the multi-resolution inverse filter performs inverse filtering on the vector reconstructed by the multi-resolution inverse vector quantizer.
- the frequency-time mapper completes the frequency-to-time mapping to obtain the final reconstructed audio signal.
- the structure of the above multi-resolution inverse vector quantizer is shown in FIG. 16 and includes a demultiplexing module, an inverse quantization module, a normalized vector calculation module, a vector reconstruction module, and an addition module.
- the demultiplexing module demultiplexes the received code stream to obtain a normalization factor and a quantized index of a selected point.
- in the inverse quantization module, the energy envelope is obtained from the quantization indices, the vector quantization position information is obtained from the demultiplexing result, the selection point vectors are obtained by inverse quantization according to the normalization factor and the quantization indices, and the secondary normalization factor is calculated and output to the normalized vector calculation module.
- in the normalized vector calculation module, the inverse of the secondary normalization is applied to the selection point vectors to obtain the normalized vectors, which are output to the vector reconstruction module; there the normalized vectors are inverse-normalized according to the energy envelope to obtain the reconstructed vectors. The reconstructed vectors and the corresponding inverse-quantized residuals on the time-frequency plane are added in the addition module to obtain the inverse-quantized time-frequency coefficients, which serve as the input of the multi-resolution inverse filter.
- the structure of the multi-resolution inverse filter is shown in FIG. 17 and includes a time-frequency coefficient organization module, multiple multi-resolution synthesis modules, and multiple equal-bandwidth cosine modulation filters, where the number of multi-resolution synthesis modules is one less than the number of equal-bandwidth cosine modulation filters.
- after the reconstructed vectors are organized by the time-frequency coefficient organization module, the signal is classified as a slowly changing signal or a fast changing signal.
- the fast changing signals can be further subdivided into multiple types, such as types I, II, ..., K.
- a slowly changing signal is output to an equal-bandwidth cosine modulation filter for filtering to obtain the time-domain PCM output.
- the different fast-changing signal types are output to different multi-resolution synthesis modules for synthesis and then to an equal-bandwidth cosine modulation filter for filtering to obtain the time-domain PCM output.
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2003/000790 WO2005027094A1 (fr) | 2003-09-17 | 2003-09-17 | Procede et dispositif de quantification de vecteur multi-resolution multiple pour codage et decodage audio |
JP2005508847A JP2007506986A (ja) | 2003-09-17 | 2003-09-17 | マルチ解像度ベクトル量子化のオーディオcodec方法及びその装置 |
AU2003264322A AU2003264322A1 (en) | 2003-09-17 | 2003-09-17 | Method and device of multi-resolution vector quantilization for audio encoding and decoding |
EP03818611A EP1667109A4 (en) | 2003-09-17 | 2003-09-17 | METHOD AND DEVICE FOR QUANTIFYING MULTI-RESOLUTION VECTOR FOR AUDIO CODING AND DECODING |
US10/572,769 US20070067166A1 (en) | 2003-09-17 | 2003-09-17 | Method and device of multi-resolution vector quantilization for audio encoding and decoding |
CNA038270625A CN1839426A (zh) | 2003-09-17 | 2003-09-17 | 多分辨率矢量量化的音频编解码方法及装置 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2003/000790 WO2005027094A1 (fr) | 2003-09-17 | 2003-09-17 | Procede et dispositif de quantification de vecteur multi-resolution multiple pour codage et decodage audio |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2005027094A1 true WO2005027094A1 (fr) | 2005-03-24 |
Family
ID=34280738
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2003/000790 WO2005027094A1 (fr) | 2003-09-17 | 2003-09-17 | Procede et dispositif de quantification de vecteur multi-resolution multiple pour codage et decodage audio |
Country Status (6)
Country | Link |
---|---|
US (1) | US20070067166A1 (zh) |
EP (1) | EP1667109A4 (zh) |
JP (1) | JP2007506986A (zh) |
CN (1) | CN1839426A (zh) |
AU (1) | AU2003264322A1 (zh) |
WO (1) | WO2005027094A1 (zh) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009511966A (ja) * | 2005-10-12 | 2009-03-19 | フラウンホッファー−ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ | マルチチャンネル音声信号の時間的および空間的整形 |
JP2009512895A (ja) * | 2005-10-21 | 2009-03-26 | クゥアルコム・インコーポレイテッド | スペクトル・ダイナミックスに基づく信号コーディング及びデコーディング |
JP2009514034A (ja) * | 2005-10-31 | 2009-04-02 | エルジー エレクトロニクス インコーポレイティド | 信号処理方法及びその装置、並びにエンコード、デコード方法及びその装置 |
US8392176B2 (en) | 2006-04-10 | 2013-03-05 | Qualcomm Incorporated | Processing of excitation in audio coding and decoding |
US8428957B2 (en) | 2007-08-24 | 2013-04-23 | Qualcomm Incorporated | Spectral noise shaping in audio coding based on spectral dynamics in frequency sub-bands |
US9105264B2 (en) | 2009-07-31 | 2015-08-11 | Panasonic Intellectual Property Management Co., Ltd. | Coding apparatus and decoding apparatus |
CN109087654A (zh) * | 2014-03-24 | 2018-12-25 | 杜比国际公司 | 对高阶高保真立体声信号应用动态范围压缩的方法和设备 |
CN110310659A (zh) * | 2013-07-22 | 2019-10-08 | 弗劳恩霍夫应用研究促进协会 | 用重构频带能量信息值解码或编码音频信号的设备及方法 |
CN112071297A (zh) * | 2020-09-07 | 2020-12-11 | 西北工业大学 | 一种矢量声的自适应滤波方法 |
Families Citing this family (50)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW594674B (en) * | 2003-03-14 | 2004-06-21 | Mediatek Inc | Encoder and a encoding method capable of detecting audio signal transient |
JP4579930B2 (ja) * | 2004-01-30 | 2010-11-10 | フランス・テレコム | 次元ベクトルおよび可変解像度量子化 |
US8345890B2 (en) | 2006-01-05 | 2013-01-01 | Audience, Inc. | System and method for utilizing inter-microphone level differences for speech enhancement |
US8194880B2 (en) | 2006-01-30 | 2012-06-05 | Audience, Inc. | System and method for utilizing omni-directional microphones for speech enhancement |
US8204252B1 (en) | 2006-10-10 | 2012-06-19 | Audience, Inc. | System and method for providing close microphone adaptive array processing |
US8744844B2 (en) | 2007-07-06 | 2014-06-03 | Audience, Inc. | System and method for adaptive intelligent noise suppression |
US9185487B2 (en) | 2006-01-30 | 2015-11-10 | Audience, Inc. | System and method for providing noise suppression utilizing null processing noise subtraction |
US8150065B2 (en) | 2006-05-25 | 2012-04-03 | Audience, Inc. | System and method for processing an audio signal |
US8934641B2 (en) * | 2006-05-25 | 2015-01-13 | Audience, Inc. | Systems and methods for reconstructing decomposed audio signals |
US8849231B1 (en) | 2007-08-08 | 2014-09-30 | Audience, Inc. | System and method for adaptive power control |
US8204253B1 (en) | 2008-06-30 | 2012-06-19 | Audience, Inc. | Self calibration of audio device |
US8949120B1 (en) | 2006-05-25 | 2015-02-03 | Audience, Inc. | Adaptive noise cancelation |
US8259926B1 (en) | 2007-02-23 | 2012-09-04 | Audience, Inc. | System and method for 2-channel and 3-channel acoustic echo cancellation |
CN101308655B (zh) * | 2007-05-16 | 2011-07-06 | 展讯通信(上海)有限公司 | 一种音频编解码方法与装置 |
US8189766B1 (en) | 2007-07-26 | 2012-05-29 | Audience, Inc. | System and method for blind subband acoustic echo cancellation postfiltering |
US8180064B1 (en) | 2007-12-21 | 2012-05-15 | Audience, Inc. | System and method for providing voice equalization |
US8143620B1 (en) | 2007-12-21 | 2012-03-27 | Audience, Inc. | System and method for adaptive classification of audio sources |
US8194882B2 (en) | 2008-02-29 | 2012-06-05 | Audience, Inc. | System and method for providing single microphone noise suppression fallback |
US8355511B2 (en) | 2008-03-18 | 2013-01-15 | Audience, Inc. | System and method for envelope-based acoustic echo cancellation |
US8521530B1 (en) | 2008-06-30 | 2013-08-27 | Audience, Inc. | System and method for enhancing a monaural audio signal |
US20110135007A1 (en) * | 2008-06-30 | 2011-06-09 | Adriana Vasilache | Entropy-Coded Lattice Vector Quantization |
US8774423B1 (en) | 2008-06-30 | 2014-07-08 | Audience, Inc. | System and method for controlling adaptivity of signal modification using a phantom coefficient |
EP2144230A1 (en) | 2008-07-11 | 2010-01-13 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Low bitrate audio encoding/decoding scheme having cascaded switches |
MX2011003824A (es) * | 2008-10-08 | 2011-05-02 | Fraunhofer Ges Forschung | Esquema de codificacion/decodificacion de audio conmutado de resolucion multiple. |
CN101436406B (zh) * | 2008-12-22 | 2011-08-24 | 西安电子科技大学 | 音频编解码器 |
US9838784B2 (en) | 2009-12-02 | 2017-12-05 | Knowles Electronics, Llc | Directional audio capture |
US9008329B1 (en) | 2010-01-26 | 2015-04-14 | Audience, Inc. | Noise reduction using multi-feature cluster tracker |
US8718290B2 (en) | 2010-01-26 | 2014-05-06 | Audience, Inc. | Adaptive noise reduction using level cues |
US9378754B1 (en) | 2010-04-28 | 2016-06-28 | Knowles Electronics, Llc | Adaptive spatial classifier for multi-microphone systems |
US8400876B2 (en) * | 2010-09-30 | 2013-03-19 | Mitsubishi Electric Research Laboratories, Inc. | Method and system for sensing objects in a scene using transducer arrays and coherent wideband ultrasound pulses |
CN104620315B (zh) * | 2012-07-12 | 2018-04-13 | 诺基亚技术有限公司 | 一种矢量量化的方法及装置 |
FR3000328A1 (fr) * | 2012-12-21 | 2014-06-27 | France Telecom | Attenuation efficace de pre-echos dans un signal audionumerique |
EP2804176A1 (en) * | 2013-05-13 | 2014-11-19 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio object separation from mixture signal using object-specific time/frequency resolutions |
US9536540B2 (en) | 2013-07-19 | 2017-01-03 | Knowles Electronics, Llc | Speech signal separation and synthesis based on auditory scene analysis and speech modeling |
SG10201609218XA (en) | 2013-10-31 | 2016-12-29 | Fraunhofer Ges Forschung | Audio Decoder And Method For Providing A Decoded Audio Information Using An Error Concealment Modifying A Time Domain Excitation Signal |
PL3285254T3 (pl) | 2013-10-31 | 2019-09-30 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Dekoder audio i sposób dostarczania zdekodowanej informacji audio z wykorzystaniem ukrywania błędów na bazie sygnału wzbudzenia w dziedzinie czasu |
WO2015072883A1 (en) * | 2013-11-18 | 2015-05-21 | Baker Hughes Incorporated | Methods of transient em data compression |
KR102626320B1 (ko) | 2014-03-28 | 2024-01-17 | 삼성전자주식회사 | 선형예측계수 양자화방법 및 장치와 역양자화 방법 및 장치 |
CN112927703A (zh) | 2014-05-07 | 2021-06-08 | 三星电子株式会社 | 对线性预测系数量化的方法和装置及解量化的方法和装置 |
US9978388B2 (en) | 2014-09-12 | 2018-05-22 | Knowles Electronics, Llc | Systems and methods for restoration of speech components |
WO2016142002A1 (en) * | 2015-03-09 | 2016-09-15 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Audio encoder, audio decoder, method for encoding an audio signal and method for decoding an encoded audio signal |
US10063892B2 (en) * | 2015-12-10 | 2018-08-28 | Adobe Systems Incorporated | Residual entropy compression for cloud-based video applications |
GB2547877B (en) * | 2015-12-21 | 2019-08-14 | Graham Craven Peter | Lossless bandsplitting and bandjoining using allpass filters |
US9820042B1 (en) | 2016-05-02 | 2017-11-14 | Knowles Electronics, Llc | Stereo separation and directional suppression with omni-directional microphones |
WO2018201112A1 (en) * | 2017-04-28 | 2018-11-01 | Goodwin Michael M | Audio coder window sizes and time-frequency transformations |
US10891960B2 (en) * | 2017-09-11 | 2021-01-12 | Qualcomm Incorproated | Temporal offset estimation |
DE102017216972B4 (de) * | 2017-09-25 | 2019-11-21 | Carl Von Ossietzky Universität Oldenburg | Verfahren und Vorrichtung zur rechnergestützten Verarbeitung von Audiosignalen |
US11423313B1 (en) * | 2018-12-12 | 2022-08-23 | Amazon Technologies, Inc. | Configurable function approximation based on switching mapping table content |
CN115979261B (zh) * | 2023-03-17 | 2023-06-27 | 中国人民解放军火箭军工程大学 | 一种多惯导系统的轮转调度方法、系统、设备及介质 |
CN118296306B (zh) * | 2024-05-28 | 2024-09-06 | 小舟科技有限公司 | 基于分形维数增强的脑电信号处理方法、装置及设备 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5473727A (en) * | 1992-10-31 | 1995-12-05 | Sony Corporation | Voice encoding method and voice decoding method |
CN1222997A (zh) * | 1996-07-01 | 1999-07-14 | 松下电器产业株式会社 | 音频信号编码方法、解码方法,及音频信号编码装置、解码装置 |
CN1224523A (zh) * | 1997-05-15 | 1999-07-28 | 松下电器产业株式会社 | 音频信号编码装置和译码装置以及音频信号编码和译码方法 |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
IT1180126B (it) * | 1984-11-13 | 1987-09-23 | Cselt Centro Studi Lab Telecom | Procedimento e dispositivo per la codifica e decodifica del segnale vocale mediante tecniche di quantizzazione vettoriale |
IT1184023B (it) * | 1985-12-17 | 1987-10-22 | Cselt Centro Studi Lab Telecom | Procedimento e dispositivo per la codifica e decodifica del segnale vocale mediante analisi a sottobande e quantizzazione vettorariale con allocazione dinamica dei bit di codifica |
IT1195350B (it) * | 1986-10-21 | 1988-10-12 | Cselt Centro Studi Lab Telecom | Procedimento e dispositivo per la codifica e decodifica del segnale vocale mediante estrazione di para metri e tecniche di quantizzazione vettoriale |
JPH07212239A (ja) * | 1993-12-27 | 1995-08-11 | Hughes Aircraft Co | ラインスペクトル周波数のベクトル量子化方法および装置 |
TW321810B (zh) * | 1995-10-26 | 1997-12-01 | Sony Co Ltd | |
JP3353266B2 (ja) * | 1996-02-22 | 2002-12-03 | 日本電信電話株式会社 | 音響信号変換符号化方法 |
JP3849210B2 (ja) * | 1996-09-24 | 2006-11-22 | ヤマハ株式会社 | 音声符号化復号方式 |
US6363338B1 (en) * | 1999-04-12 | 2002-03-26 | Dolby Laboratories Licensing Corporation | Quantization in perceptual audio coders with compensation for synthesis filter noise spreading |
US6298322B1 (en) * | 1999-05-06 | 2001-10-02 | Eric Lindemann | Encoding and synthesis of tonal audio signals using dominant sinusoids and a vector-quantized residual tonal signal |
-
2003
- 2003-09-17 JP JP2005508847A patent/JP2007506986A/ja active Pending
- 2003-09-17 WO PCT/CN2003/000790 patent/WO2005027094A1/zh active Application Filing
- 2003-09-17 EP EP03818611A patent/EP1667109A4/en not_active Withdrawn
- 2003-09-17 US US10/572,769 patent/US20070067166A1/en not_active Abandoned
- 2003-09-17 CN CNA038270625A patent/CN1839426A/zh active Pending
- 2003-09-17 AU AU2003264322A patent/AU2003264322A1/en not_active Abandoned
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5473727A (en) * | 1992-10-31 | 1995-12-05 | Sony Corporation | Voice encoding method and voice decoding method |
CN1222997A (zh) * | 1996-07-01 | 1999-07-14 | 松下电器产业株式会社 | Audio signal encoding method, decoding method, and audio signal encoding device and decoding device |
CN1224523A (zh) * | 1997-05-15 | 1999-07-28 | 松下电器产业株式会社 | Audio signal encoding device and decoding device, and audio signal encoding and decoding methods |
Non-Patent Citations (2)
Title |
---|
PAN XINGDE ZHU XIAOMING A.H. ET AL.: "EAC audio coding technology", ELECTRONIC AUDIO TECHNOLOGY, February 2003 (2003-02-01), pages 11 - 15 * |
See also references of EP1667109A4 * |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009511966A (ja) * | 2005-10-12 | 2009-03-19 | フラウンホッファー−ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ | Temporal and spatial shaping of multi-channel audio signals |
US8644972B2 (en) | 2005-10-12 | 2014-02-04 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Temporal and spatial shaping of multi-channel audio signals |
US9361896B2 (en) | 2005-10-12 | 2016-06-07 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Temporal and spatial shaping of multi-channel audio signal |
JP2009512895A (ja) * | 2005-10-21 | 2009-03-26 | クゥアルコム・インコーポレイテッド | Signal coding and decoding based on spectral dynamics |
US8027242B2 (en) | 2005-10-21 | 2011-09-27 | Qualcomm Incorporated | Signal coding and decoding based on spectral dynamics |
JP2009514034A (ja) * | 2005-10-31 | 2009-04-02 | エルジー エレクトロニクス インコーポレイティド | Signal processing method and apparatus, and encoding and decoding method and apparatus therefor |
US8392176B2 (en) | 2006-04-10 | 2013-03-05 | Qualcomm Incorporated | Processing of excitation in audio coding and decoding |
US8428957B2 (en) | 2007-08-24 | 2013-04-23 | Qualcomm Incorporated | Spectral noise shaping in audio coding based on spectral dynamics in frequency sub-bands |
US9105264B2 (en) | 2009-07-31 | 2015-08-11 | Panasonic Intellectual Property Management Co., Ltd. | Coding apparatus and decoding apparatus |
CN110310659A (zh) * | 2013-07-22 | 2019-10-08 | 弗劳恩霍夫应用研究促进协会 | Apparatus and method for decoding or encoding an audio signal using energy information values for a reconstruction band |
US11735192B2 (en) | 2013-07-22 | 2023-08-22 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Audio encoder, audio decoder and related methods using two-channel processing within an intelligent gap filling framework |
US11769512B2 (en) | 2013-07-22 | 2023-09-26 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for decoding and encoding an audio signal using adaptive spectral tile selection |
US11769513B2 (en) | 2013-07-22 | 2023-09-26 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for decoding or encoding an audio signal using energy information values for a reconstruction band |
CN110310659B (zh) * | 2013-07-22 | 2023-10-24 | 弗劳恩霍夫应用研究促进协会 | Apparatus and method for decoding or encoding an audio signal using energy information values for a reconstruction band |
US11922956B2 (en) | 2013-07-22 | 2024-03-05 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for encoding or decoding an audio signal with intelligent gap filling in the spectral domain |
US11996106B2 (en) | 2013-07-22 | 2024-05-28 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E. V. | Apparatus and method for encoding and decoding an encoded audio signal using temporal noise/patch shaping |
CN109087654A (zh) * | 2014-03-24 | 2018-12-25 | 杜比国际公司 | Method and device for applying dynamic range compression to a Higher Order Ambisonics signal |
CN109087654B (zh) * | 2014-03-24 | 2023-04-21 | 杜比国际公司 | Method and device for applying dynamic range compression to a Higher Order Ambisonics signal |
US11838738B2 (en) | 2014-03-24 | 2023-12-05 | Dolby Laboratories Licensing Corporation | Method and device for applying Dynamic Range Compression to a Higher Order Ambisonics signal |
CN112071297A (zh) * | 2020-09-07 | 2020-12-11 | 西北工业大学 | An adaptive filtering method for vector sound |
CN112071297B (zh) * | 2020-09-07 | 2023-11-10 | 西北工业大学 | An adaptive filtering method for vector sound |
Also Published As
Publication number | Publication date |
---|---|
CN1839426A (zh) | 2006-09-27 |
EP1667109A4 (en) | 2007-10-03 |
US20070067166A1 (en) | 2007-03-22 |
AU2003264322A1 (en) | 2005-04-06 |
EP1667109A1 (en) | 2006-06-07 |
JP2007506986A (ja) | 2007-03-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2005027094A1 (fr) | Method and device for multiple multi-resolution vector quantization for audio encoding and decoding | |
US7275036B2 (en) | Apparatus and method for coding a time-discrete audio signal to obtain coded audio data and for decoding coded audio data | |
CA2608030C (en) | Scalable compressed audio bit stream and codec using a hierarchical filterbank and multichannel joint coding | |
CN100395817C (zh) | Encoding device, decoding device, and decoding method | |
US7343287B2 (en) | Method and apparatus for scalable encoding and method and apparatus for scalable decoding | |
KR101343267B1 (ko) | Method and apparatus for audio coding and decoding using frequency segmentation | |
US6182034B1 (en) | System and method for producing a fixed effort quantization step size with a binary search | |
US6029126A (en) | Scalable audio coder and decoder | |
CN102436819B (zh) | Wireless audio compression and decompression method, and audio encoder and audio decoder | |
US9037454B2 (en) | Efficient coding of overcomplete representations of audio using the modulated complex lapped transform (MCLT) | |
WO2005096274A1 (fr) | Improved audio encoding/decoding device and method | |
US7512539B2 (en) | Method and device for processing time-discrete audio sampled values | |
CN101223577A (zh) | Method and device for encoding/decoding a low bit rate audio signal | |
CN1264533A (zh) | Multi-channel low bit rate encoding and decoding method and device | |
CN101162584A (zh) | Method and apparatus for encoding and decoding an audio signal using bandwidth extension technology | |
KR20130047643A (ko) | Signal codec apparatus and method in a communication system | |
Kumar et al. | The optimized wavelet filters for speech compression | |
JPH10276095A (ja) | Encoder and decoder | |
JP3557164B2 (ja) | Audio signal encoding method and program storage medium for executing the method | |
WO2005096508A1 (fr) | Improved audio encoding and decoding equipment and associated method | |
James et al. | A comparative study of speech compression using different transform techniques | |
CN100538821C (zh) | Encoding and decoding method for fast-varying audio signals | |
Hosny et al. | Novel techniques for speech compression using wavelet transform | |
AU2011205144B2 (en) | Scalable compressed audio bit stream and codec using a hierarchical filterbank and multichannel joint coding | |
Mandridake et al. | Joint wavelet transform and vector quantization for speech coding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase | Ref document number: 03827062.5; Country of ref document: CN
AK | Designated states | Kind code of ref document: A1; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BY BZ CA CH CN CO CR CU CZ DE DM DZ EC EE EG ES FI GB GD GE GH HR HU ID IL IN IS JP KE KG KP KR KZ LK LR LS LT LU LV MA MD MG MK MW MX MZ NI NO NZ OM PG PH PL RO RU SC SD SE SG SK SL SY TJ TM TR TT TZ UA UG US UZ VC VN YU ZM
AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): GH GM KE LS MW MZ SD SL SZ UG ZM ZW AM AZ BY KG KZ RU TJ TM AT BE BG CH CY CZ DK EE ES FI FR GB GR HU IE IT LU NL PT RO SE SI SK TR BF BJ CF CI CM GA GN GQ GW ML MR NE SN TG
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
WWE | Wipo information: entry into national phase | Ref document number: 2003818611; Country of ref document: EP
WWE | Wipo information: entry into national phase | Ref document number: 2005508847; Country of ref document: JP
WWP | Wipo information: published in national office | Ref document number: 2003818611; Country of ref document: EP
WWE | Wipo information: entry into national phase | Ref document number: 2007067166; Country of ref document: US; Ref document number: 10572769; Country of ref document: US
WWP | Wipo information: published in national office | Ref document number: 10572769; Country of ref document: US