WO2011127757A1 - Layered audio coding and decoding method and system, and layered coding and decoding method for transient signals - Google Patents

Info

Publication number
WO2011127757A1
Authority
WO
WIPO (PCT)
Prior art keywords
core layer
coding
signal
subband
frequency domain
Prior art date
Application number
PCT/CN2011/070206
Inventor
彭科
陈国明
袁浩
江东平
黎家力
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司
Priority to RU2012136397/08A (patent RU2522020C1, ru)
Priority to EP11768369.8A (patent EP2528057B1, en)
Priority to US13/580,855 (patent US8874450B2, en)
Priority to BR112012021359-8A (patent BR112012021359B1, pt)
Publication of WO2011127757A1 (zh)
Priority to HK13106102.7A (patent HK1179402A1, zh)

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16: Vocoder architecture
    • G10L19/18: Vocoders using multiple modes
    • G10L19/24: Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
    • G10L19/02: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022: Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • G10L19/025: Detection of transients or attacks for time/frequency resolution switching

Definitions

  • The present invention relates to audio coding and decoding technology, and in particular to a layered audio coding and decoding method and system and a layered coding and decoding method for transient signals.
  • Background art
  • Layered (hierarchical) audio coding organizes the coded audio bitstream into layers, generally a core layer and several extension layers. When the bitstream of a higher layer (such as an extension layer) is not available, the decoder can still decode using only the bitstream of the lower layers (such as the core layer); the more layers are decoded, the better the sound quality.
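  • The layer-dropping behaviour described above can be illustrated with a minimal sketch. It is a generic successive-refinement example, not the quantizer of this invention: each layer scalar-quantizes the residual left by the previous layer, and the decoder sums whichever layers it received. The function names and step sizes are assumptions made for illustration.

```python
def encode_layers(values, steps):
    """Quantize values into successive layers.

    Layer 0 stands in for the core layer; every later layer codes the
    residual left by the layers before it (hypothetical scalar quantizer).
    """
    residual = list(values)
    layers = []
    for step in steps:
        indices = [round(r / step) for r in residual]
        layers.append(indices)
        residual = [r - i * step for r, i in zip(residual, indices)]
    return layers


def decode_layers(layers, steps, num_layers):
    """Reconstruct using only the first num_layers layers."""
    output = [0.0] * len(layers[0])
    for indices, step in zip(layers[:num_layers], steps[:num_layers]):
        output = [o + i * step for o, i in zip(output, indices)]
    return output


def max_error(values, decoded):
    """Largest absolute reconstruction error."""
    return max(abs(v - d) for v, d in zip(values, decoded))
```

  • Decoding with `num_layers=1` corresponds to receiving only the core layer; increasing `num_layers` never increases the reconstruction error in this sketch.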
  • Hierarchical coding technology has very important practical value for communication networks.
  • The transmission of data can be shared among different channels, and the packet loss rate of each channel may differ. In this case the data needs to be layered: the important part of the data is transmitted over a stable channel with a relatively low packet loss rate, and the secondary part is transmitted over an unstable channel with a relatively high packet loss rate. Packet loss on the unstable channel then causes only a relative degradation in sound quality, rather than making a frame of data impossible to decode at all.
  • The bandwidth of some communication networks is very unstable and differs from user to user, so a fixed bit rate cannot serve users with different bandwidths. A layered coding scheme allows each user to enjoy the best sound quality available under that user's bandwidth conditions.
  • The technical problem to be solved by the present invention is to provide an efficient layered audio coding and decoding method and system, and a layered coding and decoding method for transient signals, so as to improve the quality of layered audio coding and decoding.
  • The present invention provides a layered audio coding method, including: performing a transient decision on the audio signal of the current frame;
  • when the transient decision indicates a steady-state signal, directly performing a time-frequency transform on the windowed audio signal to obtain the total frequency domain coefficients; when the transient decision indicates a transient signal, dividing the audio signal into M sub-frames, performing a time-frequency transform on each sub-frame, forming the total frequency domain coefficients of the current frame from the M groups of frequency domain coefficients obtained, and rearranging the total frequency domain coefficients by coding subband in order from low frequency to high frequency, wherein the total frequency domain coefficients include core layer frequency domain coefficients and extension layer frequency domain coefficients, the coding subbands include core layer coding subbands and extension layer coding subbands, the core layer frequency domain coefficients constitute several core layer coding subbands, and the extension layer frequency domain coefficients constitute several extension layer coding subbands;
  • quantizing and encoding the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands, wherein, for a steady-state signal, the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands are quantized jointly; for a transient signal, the amplitude envelope values of the core layer coding subbands and of the extension layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and those of the extension layer coding subbands are each rearranged;
  • inverse quantizing the vector-quantized core layer frequency domain coefficients and taking the difference from the original frequency domain coefficients obtained by the time-frequency transform, to obtain the core layer residual signal;
  • multiplexing and packing the amplitude envelope coded bits of the core layer and extension layer coding subbands, the coded bits of the core layer frequency domain coefficients, and the coded bits of the extension layer coded signal, and transmitting them to the decoding end.
  • The present invention also provides a layered audio decoding method, the method including:
  • performing bit allocation on the core layer coding subbands and calculating from it the amplitude envelope quantization indices of the core layer residual signal, and performing bit allocation on the coding subbands of the extension layer coded signal according to the amplitude envelope quantization indices of the core layer residual signal and the amplitude envelope quantization indices of the extension layer coding subbands;
  • decoding the coded bits of the core layer frequency domain coefficients and the coded bits of the extension layer coded signal respectively, to obtain the core layer frequency domain coefficients and the extension layer coded signal;
  • if the transient decision information indicates a steady-state signal, directly performing an inverse time-frequency transform on the frequency domain coefficients of the entire bandwidth to obtain the output audio signal; if the transient decision information indicates a transient signal, rearranging the frequency domain coefficients of the entire bandwidth, dividing them into M groups, performing an inverse time-frequency transform on each group of frequency domain coefficients, and calculating the final audio signal from the M groups of time domain signals obtained.
  • The present invention also provides a layered audio coding method for a transient signal, the method comprising:
  • dividing the audio signal into M sub-frames and performing a time-frequency transform on each sub-frame, the M groups of frequency domain coefficients obtained forming the total frequency domain coefficients of the current frame, and rearranging the total frequency domain coefficients by coding subband in order from low frequency to high frequency, wherein the total frequency domain coefficients include core layer frequency domain coefficients and extension layer frequency domain coefficients, the coding subbands include core layer coding subbands and extension layer coding subbands, the core layer frequency domain coefficients constitute several core layer coding subbands, and the extension layer frequency domain coefficients constitute several extension layer coding subbands;
  • quantizing and encoding the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands to obtain the amplitude envelope quantization indices of the core layer coding subbands and the extension layer coding subbands and their coded bits, wherein the amplitude envelope values of the core layer coding subbands and of the extension layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and those of the extension layer coding subbands are each rearranged;
  • inverse quantizing the vector-quantized core layer frequency domain coefficients and taking the difference from the original frequency domain coefficients obtained by the time-frequency transform, to obtain the core layer residual signal;
  • multiplexing and packing the amplitude envelope coded bits of the core layer coding subbands and the extension layer coding subbands, the coded bits of the core layer frequency domain coefficients, and the coded bits of the extension layer coded signal, and transmitting them to the decoding end.
  • The present invention also provides a layered decoding method for a transient signal, the method comprising:
  • rearranging the amplitude envelope quantization indices of the core layer coding subbands and the extension layer coding subbands in order of frequency from low to high;
  • rearranging the extension layer coded signal in subband order and adding it to the core layer frequency domain coefficients, to obtain the frequency domain coefficients of the entire bandwidth;
  • rearranging the frequency domain coefficients of the entire bandwidth, dividing them into M groups, performing an inverse time-frequency transform on each group of frequency domain coefficients, and calculating the final audio signal from the M groups of time domain signals obtained.
  • The present invention also provides a layered audio coding system, the system comprising: a frequency domain coefficient generating unit, an amplitude envelope calculating unit, an amplitude envelope quantization and coding unit, a core layer bit allocation unit, a core layer frequency domain coefficient vector quantization and coding unit, and a bitstream multiplexer;
  • the system further comprising: a transient decision unit, an extension layer coded signal generating unit, a residual signal amplitude envelope generating unit, an extension layer bit allocation unit, and an extension layer coded signal vector quantization and coding unit;
  • the transient decision unit is configured to perform a transient decision on the audio signal of the current frame;
  • the frequency domain coefficient generating unit is connected to the transient decision unit and is configured to: when the transient decision indicates a steady-state signal, directly perform a time-frequency transform on the windowed audio signal to obtain the total frequency domain coefficients; when the transient decision indicates a transient signal, divide the audio signal into M sub-frames, perform a time-frequency transform on each sub-frame, form the total frequency domain coefficients of the current frame from the M groups of frequency domain coefficients obtained, and rearrange the total frequency domain coefficients by coding subband in order from low frequency to high frequency, wherein the total frequency domain coefficients include core layer frequency domain coefficients and extension layer frequency domain coefficients, the coding subbands include core layer coding subbands and extension layer coding subbands, the core layer frequency domain coefficients constitute several core layer coding subbands, and the extension layer frequency domain coefficients constitute several extension layer coding subbands;
  • the amplitude envelope calculating unit is connected to the frequency domain coefficient generating unit and is configured to calculate the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands;
  • the amplitude envelope quantization and coding unit is connected to the amplitude envelope calculating unit and the transient decision unit and is configured to: quantize and encode the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands to obtain the amplitude envelope quantization indices of the core layer coding subbands and the extension layer coding subbands and their coded bits; wherein, for a steady-state signal, the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands are quantized jointly; for a transient signal, the amplitude envelope values of the core layer coding subbands and of the extension layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and those of the extension layer coding subbands are each rearranged;
  • the core layer bit allocation unit is connected to the amplitude envelope quantization and coding unit and is configured to perform bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, obtaining the bit allocation numbers of the core layer coding subbands;
  • the core layer frequency domain coefficient vector quantization and coding unit is connected to the frequency domain coefficient generating unit, the amplitude envelope quantization and coding unit, and the core layer bit allocation unit, and is configured to: use the quantized amplitude envelope values of the core layer coding subbands, reconstructed from their amplitude envelope quantization indices, and the bit allocation numbers of the core layer coding subbands to normalize, vector quantize, and encode the frequency domain coefficients of the core layer coding subbands, obtaining the coded bits of the core layer frequency domain coefficients;
  • the extension layer coded signal generating unit is connected to the frequency domain coefficient generating unit and the core layer frequency domain coefficient vector quantization and coding unit, and is configured to generate the core layer residual signal and obtain the extension layer coded signal, which is composed of the core layer residual signal and the extension layer frequency domain coefficients;
  • the residual signal amplitude envelope generating unit is connected to the amplitude envelope quantization and coding unit and the core layer bit allocation unit, and is configured to obtain the amplitude envelope quantization indices of the core layer residual signal according to the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding core layer coding subbands;
  • the extension layer bit allocation unit is connected to the residual signal amplitude envelope generating unit and the amplitude envelope quantization and coding unit, and is configured to perform bit allocation on the coding subbands of the extension layer coded signal according to the amplitude envelope quantization indices of the core layer residual signal and the amplitude envelope quantization indices of the extension layer coding subbands, obtaining the bit allocation numbers of the coding subbands of the extension layer coded signal;
  • the extension layer coded signal vector quantization and coding unit is connected to the amplitude envelope quantization and coding unit, the extension layer bit allocation unit, the residual signal amplitude envelope generating unit, and the extension layer coded signal generating unit, and is configured to: use the quantized amplitude envelope values of the coding subbands of the extension layer coded signal, reconstructed from their amplitude envelope quantization indices, and the bit allocation numbers of the coding subbands of the extension layer coded signal to quantize and encode the extension layer coded signal, obtaining the coded bits of the extension layer coded signal.
  • The present invention also provides a layered audio decoding system, the system comprising: a bitstream demultiplexer, an amplitude envelope decoding unit, a core layer bit allocation unit, and a core layer decoding and inverse quantization unit;
  • the system further comprising: a residual signal amplitude envelope generating unit, an extension layer bit allocation unit, an extension layer coded signal decoding and inverse quantization unit, an entire-bandwidth frequency domain coefficient recovery unit, a noise filling unit, and an audio signal recovery unit;
  • the amplitude envelope decoding unit is connected to the bitstream demultiplexer and is configured to: decode the amplitude envelope coded bits of the core layer and extension layer coding subbands output by the bitstream demultiplexer, obtaining the amplitude envelope quantization indices of the core layer coding subbands and the extension layer coding subbands; and, if the transient decision information indicates a transient signal, rearrange the amplitude envelope quantization indices of the core layer coding subbands and the extension layer coding subbands in order of frequency from low to high;
  • the core layer bit allocation unit is connected to the amplitude envelope decoding unit and is configured to perform bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, obtaining the bit allocation numbers of the core layer coding subbands;
  • the core layer decoding and inverse quantization unit is connected to the bitstream demultiplexer, the amplitude envelope decoding unit, and the core layer bit allocation unit, and is configured to: calculate the quantized amplitude envelope values of the core layer coding subbands from their amplitude envelope quantization indices, and use the bit allocation numbers and quantized amplitude envelope values of the core layer coding subbands to decode, inverse quantize, and denormalize the core layer frequency domain coefficient coded bits output by the bitstream demultiplexer, obtaining the core layer frequency domain coefficients;
  • the residual signal amplitude envelope generating unit is connected to the amplitude envelope decoding unit and the core layer bit allocation unit, and is configured to: according to the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding core layer coding subbands, look up the correction value statistics table for the core layer residual signal amplitude envelope quantization index, obtaining the amplitude envelope quantization indices of the core layer residual signal;
  • the extension layer bit allocation unit is connected to the residual signal amplitude envelope generating unit and the amplitude envelope decoding unit, and is configured to perform bit allocation on the coding subbands of the extension layer coded signal according to the amplitude envelope quantization indices of the core layer residual signal and the amplitude envelope quantization indices of the extension layer coding subbands, obtaining the bit allocation numbers of the coding subbands of the extension layer coded signal;
  • the extension layer coded signal decoding and inverse quantization unit is connected to the bitstream demultiplexer, the amplitude envelope decoding unit, the extension layer bit allocation unit, and the residual signal amplitude envelope generating unit, and is configured to: calculate the quantized amplitude envelope values of the coding subbands of the extension layer coded signal from their amplitude envelope quantization indices, and use the bit allocation numbers and quantized amplitude envelope values of those coding subbands to decode and inverse quantize the coded bits of the extension layer coded signal, obtaining the extension layer coded signal;
  • the entire-bandwidth frequency domain coefficient recovery unit is connected to the core layer decoding and inverse quantization unit and the extension layer coded signal decoding and inverse quantization unit, and is configured to: reorder, in subband order, the extension layer coded signal output by the extension layer coded signal decoding and inverse quantization unit, and then sum it with the core layer frequency domain coefficients output by the core layer decoding and inverse quantization unit, obtaining the frequency domain coefficients of the entire bandwidth;
  • the noise filling unit is connected to the entire-bandwidth frequency domain coefficient recovery unit and the amplitude envelope decoding unit, and is configured to perform noise filling on the subbands to which no coding bits were allocated during encoding;
  • the audio signal recovery unit is connected to the noise filling unit and is configured to: if the transient decision information indicates a steady-state signal, directly perform an inverse time-frequency transform on the frequency domain coefficients of the entire bandwidth to obtain the output audio signal; if the transient decision information indicates a transient signal, rearrange the frequency domain coefficients of the entire bandwidth, divide them into M groups, perform an inverse time-frequency transform on each group of frequency domain coefficients, and calculate the final audio signal from the M groups of time domain signals obtained.
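  • Two of the decoder-side units described above, entire-bandwidth coefficient recovery and noise filling, can be sketched as follows. This is an illustration under stated assumptions, not the implementation of this invention: the extension layer coded signal is assumed to be already reordered so that the core layer residual comes first and the extension layer coefficients follow, and the noise level is assumed to be scaled by a per-subband amplitude envelope value. All function and parameter names are hypothetical.

```python
import random


def recover_full_band(core_coeffs, extension_signal):
    """Add the core layer residual to the core coefficients, then append
    the extension layer coefficients (assumed layout: residual first)."""
    n_core = len(core_coeffs)
    if len(extension_signal) < n_core:
        raise ValueError("extension signal shorter than core layer range")
    residual = extension_signal[:n_core]
    extension = extension_signal[n_core:]
    return [c + r for c, r in zip(core_coeffs, residual)] + list(extension)


def noise_fill(coeffs, subband_edges, bit_alloc, envelopes, rng):
    """Replace coefficients of subbands that received no bits with noise.

    subband_edges: list of (start, end) index pairs, end exclusive.
    bit_alloc:     bits allocated to each subband (0 means not coded).
    envelopes:     amplitude envelope value per subband (assumed scaling).
    rng:           a random.Random instance, passed in for repeatability.
    """
    filled = list(coeffs)
    for (start, end), bits, envelope in zip(subband_edges, bit_alloc, envelopes):
        if bits == 0:
            for k in range(start, end):
                filled[k] = envelope * rng.uniform(-1.0, 1.0)
    return filled
```

  • Passing the random generator explicitly keeps the sketch deterministic for testing; a real decoder would choose its own noise source and level.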
  • The present invention introduces a method for processing transient signal frames into the layered audio coding and decoding method: a time-frequency transform is performed on the transient signal frame, and the frequency domain coefficients obtained are rearranged separately within the core layer range and within the extension layer range, so that the same subsequent processing as for steady-state signal frames, such as bit allocation and frequency domain coefficient coding, can be applied. This improves the coding efficiency of transient signal frames and the quality of layered audio coding and decoding.
  • FIG. 1 is a schematic diagram of the layered audio coding method of the present invention.
  • FIG. 2 is a flow chart of an embodiment of the layered audio coding method of the present invention.
  • FIG. 3 is a flow chart of the method for correcting the bit allocation after vector quantization according to the present invention.
  • FIG. 4 is a schematic diagram of the layered coded bitstream of the present invention.
  • FIG. 5 is a schematic diagram of the relationship between layering by frequency band range and layering by code rate according to the present invention.
  • FIG. 6 is a schematic structural diagram of the layered audio coding system of the present invention.
  • FIG. 7 is a schematic diagram of the layered audio decoding method of the present invention.
  • FIG. 8 is a flow chart of an embodiment of the layered audio decoding method of the present invention.
  • FIG. 9 is a schematic structural diagram of the layered audio decoding system of the present invention.
  • The main idea of the layered audio coding and decoding method and system of the present invention is to introduce a method for processing transient signal frames into the layered audio coding and decoding method: a segmented time-frequency transform is performed on the transient signal frame, and the frequency domain coefficients obtained are rearranged separately within the core layer range and within the extension layer range, so that the same subsequent processing as for steady-state signal frames, such as bit allocation and frequency domain coefficient coding, can be applied. This improves the coding efficiency of transient signal frames and the quality of layered audio coding and decoding.
  • the layered audio coding method of the present invention includes the following steps:
  • Step 10: Perform a transient decision on the audio signal of the current frame;
  • Step 20: Process the audio signal according to the result of the transient decision to obtain the frequency domain coefficients of the core layer and the extension layer;
  • When the transient decision indicates a steady-state signal, the windowed audio signal is directly time-frequency transformed to obtain the total frequency domain coefficients. When the transient decision indicates a transient signal, the audio signal is divided into M sub-frames, a time-frequency transform is performed on each sub-frame, the M groups of frequency domain coefficients obtained form the total frequency domain coefficients of the current frame, and the total frequency domain coefficients are rearranged by coding subband in order from low frequency to high frequency. The total frequency domain coefficients include core layer frequency domain coefficients and extension layer frequency domain coefficients; the coding subbands include core layer coding subbands and extension layer coding subbands; the core layer frequency domain coefficients constitute several core layer coding subbands, and the extension layer frequency domain coefficients constitute several extension layer coding subbands.
  • The total frequency domain coefficients of the current frame are obtained as follows: the N-point time domain sampled signal x(n) of the current frame and the N-point time domain sampled signal of the previous frame are combined into a 2N-point time domain sampled signal, which is then windowed and subjected to time domain anti-aliasing processing to obtain an N-point time domain sampled signal;
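  • As a hedged illustration of the framing step above, the sketch below joins the previous and current N-point frames into a 2N-point block and applies a window. The sine window and the function names are assumptions, since the text here does not name a window; the subsequent time domain anti-aliasing step that reduces the block back to N points is not shown.

```python
import math


def sine_window(length):
    """Sine window of the given (even) length; an assumed choice of window."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def build_windowed_block(previous_frame, current_frame):
    """Concatenate two N-point frames into 2N points and apply the window."""
    if len(previous_frame) != len(current_frame):
        raise ValueError("frames must have the same length N")
    block = list(previous_frame) + list(current_frame)
    window = sine_window(len(block))
    return [sample * weight for sample, weight in zip(block, window)]
```

  • With this window, the squared weights at positions n and n + N sum to one, which is the usual condition for overlapping transform blocks to reconstruct the signal.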
  • When the frequency domain coefficients are rearranged, they are rearranged within the core layer and within the extension layer respectively, by coding subband in order from low frequency to high frequency.
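  • One possible reading of this rearrangement is sketched below. It assumes that each of the M sub-frame transforms yields a group of coefficients ordered from low to high frequency, that the first `core_per_group` coefficients of every group belong to the core layer, and that "low frequency to high frequency" means interleaving the groups so that coefficients of equal frequency sit next to each other. These assumptions and all names are illustrative; the exact subband layout is not given in this passage.

```python
def interleave(groups):
    """Order coefficients by frequency index first, then by group."""
    length = len(groups[0])
    return [group[k] for k in range(length) for group in groups]


def rearrange(groups, core_per_group):
    """Rearrange M groups separately within the core and extension ranges."""
    core = interleave([g[:core_per_group] for g in groups])
    extension = interleave([g[core_per_group:] for g in groups])
    return core + extension


def restore(coeffs, num_groups, core_per_group):
    """Inverse of rearrange: split the coefficients back into M groups."""
    core_len = num_groups * core_per_group
    core, extension = coeffs[:core_len], coeffs[core_len:]
    return [core[j::num_groups] + extension[j::num_groups]
            for j in range(num_groups)]
```

  • The `restore` function corresponds to the decoder-side step that rearranges the entire-bandwidth coefficients and divides them into M groups before the inverse transforms.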
  • Step 30: Quantize and encode the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands to obtain the amplitude envelope quantization indices of the core layer coding subbands and the extension layer coding subbands and their coded bits;
  • For a steady-state signal, the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands are quantized jointly. For a transient signal, the amplitude envelope values of the core layer coding subbands and of the extension layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and those of the extension layer coding subbands are each rearranged.
  • Rearranging the amplitude envelope quantization indices specifically includes:
  • arranging the amplitude envelope quantization indices of the coding subbands within the same sub-frame together in order of increasing or decreasing frequency, and, at each boundary between sub-frames, placing next to each other the two coding subbands that belong to the two sub-frames and represent the same frequency.
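  • The ordering described above can be read as a back-and-forth (serpentine) scan: one sub-frame is scanned upward in frequency and the next downward, so the two subbands meeting at each sub-frame boundary represent the same frequency. The sketch below implements that reading; the alternating direction and the function names are assumptions for illustration.

```python
def serpentine_order(indices_per_subframe):
    """Flatten per-sub-frame index lists, alternating scan direction.

    Even-numbered sub-frames are scanned low to high frequency and
    odd-numbered ones high to low, so neighbours at each boundary share
    the same frequency position.
    """
    ordered = []
    for number, indices in enumerate(indices_per_subframe):
        ordered.extend(indices if number % 2 == 0 else reversed(indices))
    return ordered


def serpentine_restore(ordered, num_subframes):
    """Inverse of serpentine_order for equally sized sub-frames."""
    size = len(ordered) // num_subframes
    chunks = [ordered[i * size:(i + 1) * size] for i in range(num_subframes)]
    return [chunk if number % 2 == 0 else chunk[::-1]
            for number, chunk in enumerate(chunks)]
```

  • The restore function mirrors the decoder-side rearrangement of the indices back into order of frequency from low to high within each sub-frame.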
  • The amplitude envelope quantization indices of the core layer coding subbands obtained by quantization are Huffman coded. If the total number of bits consumed by Huffman coding the amplitude envelope quantization indices of all core layer coding subbands is smaller than the total number of bits consumed by natural coding them, Huffman coding is used; otherwise natural coding is used. The amplitude envelope Huffman coding flag of the core layer coding subbands is set accordingly.
  • The amplitude envelope quantization indices of the extension layer coding subbands obtained by quantization are Huffman coded. If the total number of bits consumed by Huffman coding the amplitude envelope quantization indices of all extension layer coding subbands is smaller than the total number of bits consumed by natural coding them, Huffman coding is used; otherwise natural coding is used. The amplitude envelope Huffman coding flag of the extension layer coding subbands is set accordingly.
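  • The choice between Huffman coding and natural coding described above comes down to comparing two bit counts and recording a flag. A minimal sketch follows; the code-length table, the fixed width of the natural code, and the function name are hypothetical and not taken from this document.

```python
def choose_envelope_coding(indices, huffman_code_lengths, natural_bits):
    """Pick the cheaper of Huffman and natural (fixed-length) coding.

    indices:              quantization indices to be coded.
    huffman_code_lengths: hypothetical table mapping index -> code length.
    natural_bits:         bits per index under natural coding.
    Returns (use_huffman_flag, total_bits).
    """
    huffman_total = sum(huffman_code_lengths[i] for i in indices)
    natural_total = natural_bits * len(indices)
    # Huffman coding is used only when it is strictly cheaper.
    if huffman_total < natural_total:
        return True, huffman_total
    return False, natural_total
```

  • The returned flag plays the role of the Huffman coding flag that the encoder sets so the decoder knows which code was used.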
  • Step 40: Perform bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, and then quantize and encode the core layer frequency domain coefficients to obtain the coded bits of the core layer frequency domain coefficients;
  • The coded bits of the core layer frequency domain coefficients are obtained as follows:
  • the core layer frequency domain coefficients are quantized and encoded using the pyramid lattice vector quantization method and the spherical lattice vector quantization method to obtain the coded bits of the core layer frequency domain coefficients;
  • Huffman coding is performed on all quantization indices obtained by pyramid lattice vector quantization in the core layer. If the total number of bits consumed by Huffman coding all of these quantization indices is smaller than the total number of bits consumed by natural coding them, Huffman coding is used; the bits saved by Huffman coding, the bits remaining from the initial bit allocation, and the total bits saved by coding all coding subbands whose bit allocation per frequency domain coefficient is 1 or 2 are used to correct the bit allocation numbers of the core layer coding subbands, and the core layer coding subbands whose bit allocation numbers were corrected are vector quantized and Huffman coded again. Otherwise natural coding is used; the bits remaining from the initial bit allocation and the total bits saved by coding all coding subbands whose bit allocation per frequency domain coefficient is 1 or 2 are used to correct the bit allocation numbers of the core layer coding subbands, and the core layer coding subbands whose bit allocation numbers were corrected are vector quantized and naturally coded again.
  • Step 50: Perform inverse quantization on the vector-quantized frequency domain coefficients of the core layer, and compute the difference between the result and the original frequency domain coefficients obtained after the time-frequency transform to obtain the core layer residual signal;
  • Step 60: Calculate the amplitude envelope quantization index of the core layer residual signal according to the amplitude envelope quantization index of the core layer coding subband and the bit allocation number of the core layer coding subband;
  • The amplitude envelope quantization index of a core layer residual signal coding subband is calculated as follows: a correction value for the core layer residual signal amplitude envelope quantization index is calculated from the bit allocation number of the core layer coding subband; the amplitude envelope quantization index of the core layer coding subband and the correction value of the corresponding coding subband are then combined to obtain the core layer residual signal amplitude envelope quantization index.
  • the core layer residual signal amplitude envelope quantization index correction value of each coding subband is greater than or equal to 0, and does not decrease when the bit allocation number of the corresponding core layer coding subband increases;
  • when the bit allocation number of a core layer coding subband is 0, the core layer residual signal amplitude envelope quantization index correction value is 0; when the bit allocation number of a core layer coding subband is the defined maximum bit allocation number, the amplitude envelope value of the corresponding core layer residual signal is zero.
  • Step 70: Perform bit allocation on the coding subbands of the extension layer coded signal according to the amplitude envelope quantization index of the core layer residual signal and the amplitude envelope quantization index of the extension layer coding subbands, and then quantize and encode the extension layer coded signal to obtain its coded bits, wherein the extension layer coded signal is composed of the core layer residual signal and the extension layer frequency domain coefficients;
  • the method for obtaining the coded bits of the extended layer coded signal is:
  • the bit allocation number of the band is quantized and encoded using a tower type vector quantization method and a spherical type vector quantization method, respectively, to obtain coded bits of the enhancement layer coded signal.
  • the to-be quantized vector of the coded subband with the bit allocation number less than the classification threshold is quantized and encoded by the tower type vector quantization method.
  • the to-be quantized vector of the coded sub-band with the bit allocation number greater than the classification threshold is quantized and encoded by a spherical lattice vector quantization method;
  • the number of bit allocations is the number of bits to which a single coefficient in a coded subband is allocated.
  • the extension layer coded signal is composed of the core layer residual signal and the extension layer frequency domain coefficients.
  • the core layer residual signal is also composed of coefficients.
  • Huffman coding is performed on all the quantization indexes obtained by tower type lattice vector quantization in the extension layer. If the total number of bits these quantization indexes consume after Huffman coding is smaller than the total number of bits they consume under natural coding, Huffman coding is used, and the bits saved by Huffman coding, the bits remaining from the initial bit allocation, and the total number of bits saved in coding all coding subbands whose single frequency domain coefficient is allocated 1 or 2 bits are used to correct the bit allocation numbers of the coding subbands of the extension layer coded signal; the coding subbands whose bit allocation numbers were corrected are then vector quantized and Huffman coded again. Otherwise natural coding is used, and the bits remaining from the initial bit allocation and the total number of bits saved in coding all coding subbands whose single frequency domain coefficient is allocated 1 or 2 bits are used to correct the bit allocation numbers of the coding subbands of the extension layer coded signal; the coding subbands whose bit allocation numbers were corrected are then vector quantized and naturally coded again.
  • variable step bit allocation is performed on each coding subband according to the amplitude envelope quantization index of the coding subband;
  • the step size for allocating bits to a coding subband whose bit allocation number is 0 is 1 bit, and the importance is reduced by 1 after the allocation;
  • the step size for allocating additional bits to a coding subband whose bit allocation number is greater than 0 and less than the classification threshold is 0.5 bits, and the importance is reduced by 0.5 after the allocation;
  • the step size for allocating additional bits to a coding subband whose bit allocation number is greater than or equal to the classification threshold is 1 bit, and the importance is reduced by 1 after the allocation.
  • the most important coding subband is found among all coding subbands; if the number of bits allocated to that coding subband has already reached the maximum value that may be allocated, its importance is set to the minimum and its bit allocation number is no longer corrected; otherwise bit allocation correction is performed on that most important coding subband;
  • in the bit allocation correction process, 1 bit is allocated to a coding subband whose bit allocation number is 0, and its importance is reduced by 1 after the allocation; 0.5 bits are allocated to a coding subband whose bit allocation number is greater than 0 and less than 5, and its importance is reduced by 0.5 after the allocation; 1 bit is allocated to a coding subband whose bit allocation number is greater than 5, and its importance is reduced by 1 after the allocation.
  • the bit allocation correction iteration count, count, is incremented by 1 each time a bit allocation number is corrected; the bit allocation correction process ends when count reaches the preset upper limit or when the number of remaining bits available for correction is smaller than the number of bits required for a bit allocation correction.
  • Step 80 The amplitude envelope code bits of the core layer and the extension layer coded subband, the coded bits of the core layer frequency domain coefficients, and the coded bits of the extended layer coded signal are multiplexed and packetized, and then transmitted to the decoding end.
  • the side information bits of the core layer are written after the frame header of the code stream, the amplitude envelope coded bits of the core layer coding subbands are written into the bit stream multiplexer (MUX), and then the coded bits of the core layer frequency domain coefficients are written to the MUX;
  • the number of bits satisfying the code rate requirement is transmitted to the decoding end according to the required code rate.
  • FIG. 2 is a flow chart of a layered audio encoding method according to a first embodiment of the present invention.
  • the layered audio encoding method of the present invention is specifically described by taking an audio stream having a frame length of 20 ms and a sampling rate of 32 kHz as an example.
  • the method of the invention is equally applicable under other frame lengths and sample rates. As shown in Figure 2, the method includes:
  • the N-point time domain sampling signal x(n) of the current frame and the N-point time domain sampling signal x_old(n) of the previous frame are combined into a 2N-point time domain sampling signal, which can be represented by the following formula:
  • when the transient decision flag Flag_transient is 0, the current frame is a steady-state signal, and a type IV discrete cosine transform (DCT-IV) or another discrete cosine transform is performed directly on the time domain anti-aliasing signal to obtain the following frequency domain coefficients:
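The formula that followed this step did not survive extraction. As a rough illustration only, the sketch below evaluates the textbook type IV DCT directly from its definition. The function name `dct_iv` is hypothetical, and the scaling is an assumption: no normalization factor is applied, since the source's own scaling convention is not visible here.

```python
import math

def dct_iv(x):
    """Direct O(N^2) type IV DCT, unnormalized (assumed convention):
    X[k] = sum_n x[n] * cos(pi/N * (n + 1/2) * (k + 1/2))."""
    n_pts = len(x)
    return [
        sum(
            x[n] * math.cos(math.pi / n_pts * (n + 0.5) * (k + 0.5))
            for n in range(n_pts)
        )
        for k in range(n_pts)
    ]
```

With this unnormalized form, applying the transform twice returns the input scaled by N/2, which is a convenient sanity check. A real encoder would use a fast algorithm instead of the quadratic loop.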
  • when the transient decision flag Flag_transient is 1, the current frame is a transient signal, and the time domain anti-aliasing signal first needs to be symmetrically transformed to reduce spurious time domain and frequency domain responses.
  • a zero sequence of length N/8 is added to each end of the signal, and the lengthened signal is divided into four equal-length subframes that overlap each other.
  • Each sub-frame has a length of N/2 and overlaps each other at a ratio of 50%.
  • the two middle subframes are each windowed with a sine window of length N/2; for each of the two subframes at the ends, the inner half is windowed with a half sine window of length N/4.
  • N = 640 in this embodiment (the corresponding N can also be calculated for other frame lengths and sampling rates).
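The subframe layout described above can be checked with a short sketch. It covers only the zero padding and the segmentation into four overlapping subframes; windowing is left out because the exact window shapes for the end subframes are not fully recoverable from the text. The function name `split_transient_subframes` is hypothetical.

```python
def split_transient_subframes(signal):
    """Pad N/8 zeros at each end, then cut four subframes of length N/2
    that overlap by 50% (hop size N/4)."""
    n = len(signal)
    if n % 8 != 0:
        raise ValueError("frame length must be a multiple of 8")
    pad = [0.0] * (n // 8)
    extended = pad + list(signal) + pad      # total length 5N/4
    sub_len, hop = n // 2, n // 4
    return [extended[i * hop : i * hop + sub_len] for i in range(4)]
```

For N = 640 this gives four subframes of 320 samples each, and the padded length 5N/4 = 800 is exactly covered by three hops of 160 plus one subframe.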
  • amplitude envelope a frequency domain amplitude envelope of each coding sub-band
  • the coded sub-bands may be evenly divided or non-uniformly divided, and in this embodiment, are divided by non-uniform sub-bands.
  • This step can be implemented using the following substeps:
  • the frequency range to be coded is 0 to 13.6 kHz
  • the non-uniform subband division can be performed according to the perceptual characteristics of the human ear.
  • Table 1 and Table 2 show a specific division when the transient decision flag Flag_transient is 0 and 1, respectively.
  • the frequency range of the core layer is also specified: whether the transient decision flag Flag_transient is 0 or 1, the core layer has a frequency range of 0 to 7 kHz.
  • the subband division is performed on the four groups of frequency domain coefficients within the frequency band to be coded, and then the coding subbands within the core layer band and the coding subbands within the extension layer band are each rearranged from low frequency to high frequency.
  • when the frequency domain coefficients remaining in a group are not enough to constitute one subband (fewer than 16, as shown in Table 2), they are supplemented by frequency domain coefficients of the same or similar frequencies from the next group of frequency domain coefficients, as shown in Table 2.
  • the coded subbands in Table 2 are a specific result of the completion of the rearrangement.
  • the frequency domain coefficients constituting the core layer coding subbands are called the core layer frequency domain coefficients, and the frequency domain coefficients constituting the extension layer coding subbands are called the extension layer frequency domain coefficients. This can also be described as: the frequency domain coefficients are divided into core layer frequency domain coefficients and extension layer frequency domain coefficients, the core layer frequency domain coefficients are divided into several core layer coding subbands, and the extension layer frequency domain coefficients are divided into several extension layer coding subbands. It can be understood that the order in which the frequency domain coefficients are divided into layers (the core layer and the extension layer) and into coding subbands does not affect the implementation of the present invention.
  • Table 1 Example of subband division when the transient decision flag Flag transient is 0
  • LIndex(j) and HIndex(j) respectively indicate the starting frequency domain coefficient index and the ending frequency domain coefficient index of the jth coding subband; their specific values are shown in Table 1 (when the transient decision flag Flag_transient is 0) and Table 2 (when the transient decision flag Flag_transient is 1).
  • when the transient decision flag Flag_transient is 0, the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands are quantized together; when the transient decision flag Flag_transient is 1, the amplitude envelope values of the core layer coding subbands and of the extension layer coding subbands are quantized separately, and the amplitude envelope quantization indexes of the core layer coding subbands and those of the extension layer coding subbands are each rearranged.
  • ⌊ ⌋ denotes rounding down.
  • when the transient decision flag Flag_transient is 1, the amplitude envelope quantization indexes of the core layer coding subbands are rearranged so that the subsequent differential coding of these indexes is more efficient.
  • 6 bits are used to encode the amplitude envelope quantization index Th_q(0) of the first coding subband, that is, 6 bits are consumed.
  • the amplitude envelope can be modified as follows to ensure that the differential value ΔTh_q(j) lies within [-15, 16]:
  • the coded bits of the amplitude envelope quantization indexes of the core layer coding subbands (that is, the coded bits of the amplitude envelope of the first subband and of the amplitude envelope difference values) and the Huffman coding flag need to be transmitted to the MUX.
  • the amplitude envelope difference values ΔTh_q(j), j = 0, ..., L_core - 2, are Huffman coded, and the number of bits consumed in this case (called the Huffman coded bits) is calculated.
  • the amplitude envelope of the extension layer coding subbands is quantized according to the following formula to obtain the amplitude envelope quantization index of the extension layer coding subbands, that is, the output value of the quantizer:
  • Th_q(L_core) is the amplitude envelope quantization index of the first coding subband formed by the extension layer frequency domain coefficients; its range is limited to [-5, 34].
  • the amplitude envelope quantization indexes of the extension layer coding subbands are rearranged so that the subsequent differential coding of these indexes is more efficient. See Table 4 for an example.
  • 6 bits are used to encode the amplitude envelope quantization index Th_q(L_core) of the first coding subband formed by the extension layer frequency domain coefficients, that is, 6 bits are consumed.
  • the differential values between the amplitude envelope quantization indexes of the extension layer coding subbands formed by the extension layer frequency domain coefficients are calculated by the following formula:
  • the coded bits of the constructed amplitude envelope quantization index and the Huffman coded identification bits need to be transferred to the MUX.
  • This step can be implemented using the following substeps:
  • the number of bits available for core layer coding, bits_available_core, is extracted from the total number of bits available in a 20 ms frame; subtracting the number of bits of the core layer side information, bit_sides_core, and the number of bits consumed by the amplitude envelope quantization indexes of the core layer coding subbands, bits_Th_core, gives the number of remaining bits that can be used for coding the core layer frequency domain coefficients, bits_left_core, that is:
  • $bits\_left\_core = bits\_available\_core - bit\_sides\_core - bits\_Th\_core \quad (11)$
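As a worked instance of the budget in equation (11), the sketch below computes the per-frame bit count from a code rate and frame length and then subtracts the two overheads. The 32 kbps core rate and 20 ms frame come from this embodiment; the side information and envelope bit counts in the usage note are made-up illustrative numbers, and the function names are hypothetical.

```python
def bits_per_frame(rate_bps, frame_ms):
    """Bits available in one frame at the given code rate."""
    return rate_bps * frame_ms // 1000

def core_bits_left(bits_available_core, bit_sides_core, bits_th_core):
    """Equation (11): bits left for the core layer frequency domain coefficients."""
    left = bits_available_core - bit_sides_core - bits_th_core
    if left < 0:
        raise ValueError("side information and envelope exceed the core budget")
    return left
```

At 32 kbps and 20 ms, `bits_per_frame(32000, 20)` gives 640 bits; with an assumed 12 side information bits and 80 envelope bits, 548 bits remain for the coefficients.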
  • the side information includes the Huffman coding flags Flag_huff_rms_core and Flag_huff_PLVQ_core and the iteration count count_core.
  • Flag_huff_rms_core identifies whether Huffman coding is used for the amplitude envelope quantization indexes of the core layer coding subbands;
  • Flag_huff_PLVQ_core identifies whether Huffman coding is used for the vector quantization indexes of the core layer frequency domain coefficients, and count_core identifies the number of iterations of the core layer bit allocation correction (see the description in the subsequent steps).
  • the optimal bit value under the condition of the maximum quantization signal-to-noise ratio gain of each coding sub-band under the code rate distortion limit can be calculated:
  • the initial value of the core layer coding subband importance used to control bit allocation in the actual bit allocation can be obtained:
  • the scale factor related to the coding bit rate can be obtained by statistical analysis and usually lies between 0 and 1; in this embodiment its value is 0.7. The resulting value indicates the importance of the jth coding subband when performing bit allocation.
  • the bit allocation of the core layer is performed according to the importance of the core layer coding subband. Detailed description:
  • the core layer coding subband with the maximum importance is found among all coding subbands; assuming its number is j_k, the bit allocation number region_bit(j_k) of each frequency domain coefficient in that core layer coding subband is increased, and the importance of that subband is decreased.
  • bit allocation method in this step can be represented by the following pseudo code:
  • the remaining bits, fewer than 16, are allocated to core layer coding subbands that satisfy the requirement according to the following principle: bits are allocated to each frequency domain coefficient in core layer coding subbands whose bit allocation is 1, and the bit allocation ends when bit_left_core - bit_used_all < 8.
  • the bits that finally remain are recorded as the remaining bits of the initial core layer allocation, remain_bits_core.
  • the value of the above-mentioned classification threshold is greater than or equal to 2 and less than or equal to 8, which may be 5 in this embodiment.
  • MaxBit is the maximum number of bits that can be allocated to a single frequency domain coefficient in a core layer coding subband, in bits per frequency domain coefficient.
  • region_bit(j) is the number of bits allocated to a single frequency domain coefficient in the jth core layer coding subband, that is, the bit allocation number of a single frequency domain coefficient in that subband.
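The original pseudo code for this step was lost in extraction. The following is a minimal sketch of the variable-step allocation as the surrounding text describes it, not the patent's own listing. Assumptions: `importance`, `band_width` and the function name are hypothetical, the default `max_bit=9` is an illustrative value only, subband widths are even so that a 0.5-bit step costs a whole number of bits, and the loop simply stops when the next step cannot be afforded, which is simpler than the leftover-bit rule in the text.

```python
def allocate_bits(importance, band_width, bits_available, threshold=5, max_bit=9):
    """Greedy variable-step bit allocation over coding subbands."""
    rk = list(importance)                  # working copy of subband importance
    region_bit = [0.0] * len(rk)           # bits per coefficient, per subband
    bits_left = bits_available
    exhausted = float("-inf")              # marks subbands that reached max_bit
    while True:
        # most important subband; ties go to the lowest index
        jk = max(range(len(rk)), key=lambda j: rk[j])
        if rk[jk] == exhausted:
            break
        if region_bit[jk] >= max_bit:
            rk[jk] = exhausted
            continue
        # 1 bit from zero, 0.5 bit below the threshold, 1 bit at or above it
        step = 0.5 if 0 < region_bit[jk] < threshold else 1.0
        cost = int(step * band_width[jk])
        if cost > bits_left:
            break
        region_bit[jk] += step
        rk[jk] -= step                     # importance drops by the step size
        bits_left -= cost
    return region_bit, bits_left
```

For example, with importances `[3.0, 1.0]`, two 16-coefficient subbands and 48 bits, the first subband ends at 2.5 bits per coefficient, the second stays at 0, and 8 bits are left over.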
  • the number of bits allocated to coding subband j, region_bit(j), is compared with the classification threshold. If region_bit(j) is less than the classification threshold, the coding subband is called a low bit coding subband, and the vectors to be quantized in the low bit coding subband are quantized and encoded by the tower type lattice vector quantization method; if region_bit(j) is greater than or equal to the classification threshold, the coding subband is called a high bit coding subband, and the vectors to be quantized in the high bit coding subband are quantized and encoded by the spherical lattice vector quantization method. In this embodiment the threshold is 5 bits.
  • Z^8 represents the 8-dimensional integer space.
  • the basic method of mapping an 8-dimensional vector to (that is, quantizing it to) a D_8 lattice point is described below:
  • let x be an arbitrary real number; f(x) denotes rounding x to the nearer of its two adjacent integers, and w(x) denotes rounding x to the farther of its two adjacent integers.
  • for any vector X = (x_1, x_2, ..., x_8), f(X) = (f(x_1), f(x_2), ..., f(x_8)).
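The rest of the mapping procedure is missing from the extracted text. The sketch below shows the standard nearest-point search for the D_8 lattice (integer vectors whose coordinates sum to an even number), which uses exactly the two rounding operations defined above. It is offered as an assumption about what the step intends, not as the patent's own listing; the function names are hypothetical.

```python
import math

def f(x):
    """Round to the nearer adjacent integer (ties round up)."""
    return math.floor(x + 0.5)

def w(x):
    """Round to the farther of the two adjacent integers."""
    fx = f(x)
    return fx + 1 if x >= fx else fx - 1

def quantize_d8(vec):
    """Nearest D_8 lattice point: round every coordinate, and if the sum is
    odd, re-round the coordinate with the largest rounding error the other way."""
    rounded = [f(x) for x in vec]
    if sum(rounded) % 2 == 0:
        return rounded
    k = max(range(len(vec)), key=lambda i: abs(vec[i] - rounded[i]))
    rounded[k] = w(vec[k])
    return rounded
```

Flipping the worst-rounded coordinate changes the coordinate sum by exactly one, so the result always has an even sum while moving as little as possible from the input.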
  • the codebook number index and the energy scaling factor scale corresponding to the number of bits are looked up in Table 2, and the energy of the vector to be quantized is then regularized according to the following formula:
  • $\tilde{Y}^{m}_{j,scale} = (Y^{m}_{j} - a) \times scale(index) \quad (20)$
  • where $Y^{m}_{j}$ is the m-th normalized 8-dimensional vector to be quantized in coding subband j, $\tilde{Y}^{m}_{j,scale}$ is the 8-dimensional vector after energy regularization, and $a = (2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6})$.
  • $\tilde{Y}^{m}_{j} = f_{D_8}(\tilde{Y}^{m}_{j,scale}) \quad (21)$
  • $f_{D_8}(\cdot)$ represents the quantization operator that maps an 8-dimensional vector to a $D_8$ lattice point.
  • Ỹ_j^m = Ybak, temp_K = Kbak. At this point, Ỹ_j^m is the last D_8 lattice point whose energy does not exceed the maximum tower surface energy radius, and temp_K is the energy of that lattice point.
  • Step 1 According to the energy of the tower surface, mark the grid points on each tower surface.
  • the number of lattice points on the tower surface of dimension L with energy radius K is denoted N(L, K).
  • N(L, K) satisfies the following recursion:
  • $N(L, K) = N(L-1, K) + N(L-1, K-1) + N(L, K-1) \quad (L \ge 1, K \ge 1)$
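The recursion can be evaluated with memoization, as in the sketch below. The base cases are an assumption, since they are not visible in the extracted text: N(L, 0) = 1 (only the zero vector has energy 0) and N(0, K) = 0 for K >= 1. With these, N(L, K) counts the integer vectors of dimension L whose absolute values sum to K. The function name is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def pyramid_count(dim, radius):
    """N(L, K) from the recursion
    N(L, K) = N(L-1, K) + N(L-1, K-1) + N(L, K-1)."""
    if radius == 0:
        return 1          # assumed base case: only the zero vector
    if dim == 0:
        return 0          # assumed base case: no coordinates, nonzero energy
    return (
        pyramid_count(dim - 1, radius)
        + pyramid_count(dim - 1, radius - 1)
        + pyramid_count(dim, radius - 1)
    )
```

Small cases are easy to verify by hand: in 8 dimensions there are 16 vectors with energy 1 (a single +1 or -1 in any coordinate) and 128 with energy 2.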
  • Step 1.3: k Then stop searching, b is the label of Y, otherwise continue to step 1.2).
  • Step 2 Uniform labeling of grid points on all tower faces.
  • index_b(j, m) is the index of the D_8 lattice point in the codebook, that is, the index of the m-th 8-dimensional vector in coding subband j.
  • every 4 bits of the natural binary code of each vector quantization index form a group, and each group is Huffman coded.
  • the tower type vector quantization index for each 8-dimensional vector is encoded using 15 bits. Among the 15 bits, three sets of 4-bit bits and one set of 3-bit bits are respectively Huffman-encoded. Therefore, in all the encoded sub-bands in which the number of bits to which the single frequency domain coefficient is allocated is 2, the encoding of each 8-dimensional vector is saved by 1 bit.
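The grouping of an index into 4-bit symbols can be sketched as follows. The function name is hypothetical, and taking the groups from the least significant bits first is an assumption; the text does not state the bit order. Each returned pair is the symbol value and its width in bits, so a 15-bit index yields three 4-bit groups and one 3-bit group, as described above.

```python
def split_index_into_groups(index, total_bits, group_bits=4):
    """Split the natural binary code of a quantization index into groups,
    lowest bits first (assumed order). Returns (value, width) pairs."""
    groups = []
    remaining = total_bits
    while remaining > 0:
        width = min(group_bits, remaining)
        groups.append((index & ((1 << width) - 1), width))
        index >>= width
        remaining -= width
    return groups
```

Each group value would then be used to look up a codeword and its length in the Huffman table that matches the subband's bit allocation.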
  • plvq_codebook(j,k) = plvq_code(tmp+1);
  • plvq_codebook(j,k) and plvq_count(j,k) are respectively the codeword and the bit consumption in the Huffman codebook of the kth 8-dimensional vector of subband j; plvq_bit_count and plvq_code are looked up in Table 6.
  • n is in the range [0, region_bit(j) × 8/4 - 2], incremented with a step size of 1, and the following loop is performed:
  • plvq_codebook(j,k) = plvq_code(tmp+1);
  • plvq_count(j,k) and plvq_codebook(j,k) are respectively the Huffman bit consumption and the codeword of the kth 8-dimensional vector of subband j; plvq_bit_count and plvq_code are looked up in Table 6.
  • bit_used_huff_all = bit_used_huff_all + plvq_bit_count(tmp+1);
  • plvq_codebook(j,k) = plvq_code_r2_3(tmp+1);
  • plvq_count(j,k) and plvq_codebook(j,k) are respectively the Huffman bit consumption and the codeword of the kth 8-dimensional vector of subband j; plvq_bit_count_r2_3 and plvq_code_r2_3 are looked up in Table 7. The total number of bits consumed after Huffman coding is updated:
  • bit_used_huff_all = bit_used_huff_all + plvq_bit_count(tmp+1);
  • plvq_codebook(j,k) = plvq_code_r1_4(tmp+1);
  • plvq_count(j,k) and plvq_codebook(j,k) are respectively the Huffman bit consumption and the codeword of the kth 8-dimensional vector of subband j; plvq_bit_count_r1_4 and plvq_code_r1_4 are looked up in Table 8.
  • bit_used_huff_all = bit_used_huff_all + plvq_bit_count(tmp+1);
  • plvq_count(j,k) = plvq_bit_count_r1_3(tmp+1);
  • plvq_codebook(j,k) = plvq_code_r1_3(tmp+1);
  • plvq_count(j,k) and plvq_codebook(j,k) are respectively the Huffman bit consumption and the codeword of the kth 8-dimensional vector of subband j; the codebooks plvq_bit_count_r1_3 and plvq_code_r1_3 are looked up in Table 9.
  • bit_used_huff_all = bit_used_huff_all + plvq_bit_count(tmp+1);
  • bit_used_huff_all is compared with the total number of bits consumed by natural coding, bit_used_nohuff_all. If bit_used_huff_all < bit_used_nohuff_all, the Huffman coded quantization vector indexes are transmitted and the Huffman coding flag Flag_huff_PLVQ_core is set to 1; otherwise the quantization vector indexes are naturally coded directly and the Huffman coding flag Flag_huff_PLVQ_core is set to 0.
  • bit_used_nohuff_all is equal to the total number of bits allocated to all coding subbands in C (bit_band_used(j), j ∈ C) minus bit_saved_r1_r2_all.
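The decision rule above is a single comparison; the sketch below states it explicitly and also returns the number of bits Huffman coding saves, which feeds the later bit allocation correction. The function name and the returned tuple are hypothetical conveniences, not part of the source.

```python
def choose_index_coding(bit_used_huff_all, bit_used_nohuff_all):
    """Return (flag, bits_saved): flag is 1 when Huffman coding is strictly
    cheaper than natural coding, else 0 with no saving."""
    if bit_used_huff_all < bit_used_nohuff_all:
        return 1, bit_used_nohuff_all - bit_used_huff_all
    return 0, 0
```

Note the strict inequality: when both codings cost the same, natural coding is kept and the flag stays 0.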
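  • if the Huffman coding flag Flag_huff_PLVQ_core is 0, the bit allocation of the coding subbands is corrected using the bits remaining from the initial allocation and the bits saved in coding subbands whose single frequency domain coefficient is allocated 1 or 2 bits.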
  • the 8-dimensional lattice vector quantization based on the D_8 lattice is also used here.
  • scale(region_bit(j)) represents the energy scaling factor when the bit allocation number of a single frequency domain coefficient in the coding subband is region_bit(j); the correspondence can be found in Table 10.
  • if the zero vector condition is satisfied, the index vector k of the D_8 lattice point is obtained according to the index vector calculation formula; otherwise the backed-up small multiple value w is added to the vector, which is then quantized to a D_8 lattice point, and this is repeated until the zero vector condition is no longer satisfied; finally, the index vector k of the most recent D_8 lattice point that satisfied the zero vector condition is calculated according to the index vector calculation formula and output.
  • bit allocation correction process specifically includes the following steps:
  • diff_bit_count_core = remain_bits_core + bit_saved_r1_r2_all_core. If the Huffman coding flag Flag_huff_PLVQ_core is 1, then
  • Step 304: Determine whether diff_bit_count_core is greater than or equal to the number of bits consumed by correcting the bit allocation number of coding subband j_k (calculated according to natural coding if Flag_huff_PLVQ_core is 0, and according to Huffman coding if Flag_huff_PLVQ_core is 1). If yes, step 305 is performed: the bit allocation number region_bit(j_k) of coding subband j_k is corrected, the importance of the subband is reduced, the coding subband is vector quantized and naturally coded or Huffman coded again, and finally the value of diff_bit_count_core is updated; otherwise the bit allocation correction process ends;
  • in the bit allocation correction process, 1 bit is allocated to a coding subband whose bit allocation number is 0, and its importance is reduced by 1 after the allocation; 0.5 bits are allocated to a coding subband whose bit allocation number is greater than 0 and less than 5, and its importance is reduced by 0.5 after the allocation; 1 bit is allocated to a coding subband whose bit allocation number is greater than 5, and its importance is reduced by 1 after the allocation.
  • step 108 may also be performed after the bit allocation of the extended layer encoded signal is completed (step 110).
  • This step can be implemented using the following substeps:
  • the quantized index correction value can be set by the following rules:
  • for each bit allocation number region_bit(j), the differences between the subband amplitude envelope quantization index of the core layer and the subband amplitude envelope quantization index calculated directly from the residual signal are collected statistically, and the amplitude envelope quantization index correction value with the highest probability is taken, giving the statistical table shown in Table 11:
  • Th'_q(j) = Th_q(j) - diff(region_bit(j)), j = 0, ..., L_core - 1, where Th_q(j) is the amplitude envelope quantization index of coding subband j in the core layer.
  • if the bit allocation number of a coding subband in the core layer is 0, the coding subband amplitude envelope of the core layer residual signal does not need to be corrected, and the subband amplitude envelope value of the core layer residual signal is then the same as the coding subband amplitude envelope value of the core layer.
  • if the bit allocation number of the jth coding subband in the core layer equals the maximum bit allocation number, the quantized amplitude envelope value of the jth coding subband of the core layer residual signal is zero.
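The residual envelope rule can be sketched as a table lookup. Table 11 is not reproduced in this text, so `HYPOTHETICAL_DIFF` below contains placeholder values chosen only to satisfy the stated constraints (zero at a bit allocation of 0, non-negative, non-decreasing); they are not the patent's values. The function names and the use of `None` to mark a zero envelope are likewise assumptions.

```python
# Placeholder correction values, NOT the contents of Table 11.
HYPOTHETICAL_DIFF = {0: 0, 1: 1, 1.5: 2, 2: 2, 2.5: 3, 3: 3}

def check_diff_table(diff):
    """Check the constraints the text places on the correction values."""
    values = [diff[k] for k in sorted(diff)]
    return (
        diff.get(0, 0) == 0
        and all(v >= 0 for v in values)
        and all(a <= b for a, b in zip(values, values[1:]))
    )

def residual_envelope_indexes(th_q, region_bit, diff, max_bit):
    """Residual index per core subband: Th_q(j) - diff(region_bit(j)).
    None marks subbands at max_bit, whose residual envelope is zero."""
    result = []
    for th, bits in zip(th_q, region_bit):
        result.append(None if bits >= max_bit else th - diff[bits])
    return result
```

Because the decoder knows both the core envelope indexes and the core bit allocation, it can repeat this lookup itself, which is why no extra bits are needed for the residual envelope.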
  • Bit allocation for the coding subbands of the extension layer coded signal: the subband division of the extension layer is determined by Table 1 or Table 2.
  • the coded signals in subbands 0, ..., L_core - 1 are the core layer residual signal, and the coded signals in subbands L_core, ..., L - 1 are the frequency domain coefficients in the extension layer coding subbands.
  • Subbands 0 to L - 1 are also referred to as the coding subbands of the extension layer coded signal.
  • according to the calculated amplitude envelope quantization indexes of the core layer residual signal and the amplitude envelope quantization indexes of the extension layer coding subbands, the initial importance values of the coding subbands of the extension layer coded signal are calculated over the entire extension layer band using the same bit allocation scheme as the core layer, and bits are allocated to the coding subbands of the extension layer coded signal.
  • the extended layer band range is 0 to 13.6 kHz.
  • the total bit rate of the audio stream is 64 kbps, and the code rate of the core layer is 32 kbps.
  • the maximum bit rate of the extended layer is 64 kbps.
  • the total number of available bits in the extension layer is calculated based on the core layer code rate and the extension layer maximum code rate, and then the bit allocation is performed until the bits are completely consumed.
  • the vector composition, the vector quantization method, and the encoding method of the encoded signal in the extended layer are the same as the vector composition, the vector quantization method, and the encoding method of the frequency domain coefficients in the core layer, respectively.
  • the layered coded stream is constructed in the following manner: first, the side information of the core layer is written into the bit stream multiplexer MUX in the following order: Flag_transient, Flag_huff_rms_core, Flag_huff_PLVQ_core and count_core; then the amplitude envelope coded bits of the core layer coding subbands are written to the MUX, followed by the coded bits of the core layer frequency domain coefficients; then the side information of the extension layer is written to the MUX in the following order: the amplitude envelope Huffman coding flag Flag_huff_rms_ext of the extension layer coding subbands, the frequency domain coefficient Huffman coding flag Flag_huff_PLVQ_ext and the bit allocation correction iteration count count_ext; then the amplitude envelope coded bits of the extension layer coding subbands (L_core, ..., L - 1) are written to the MUX, and then the coded bits of the extension layer coded signal are written to the MUX.
  • the coded bits of the extension layer coded signal are written in order of the initial importance values of the coding subbands of the extension layer coded signal. That is, the coded bits of coding subbands with a larger initial importance value are written into the code stream first, and among coding subbands with the same importance, the lower frequency coding subband takes priority.
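The write order reduces to a sort with a two-part key, sketched below. It assumes subbands are indexed from low to high frequency, so a smaller index means a lower frequency; the function name is hypothetical.

```python
def write_order(initial_importance):
    """Subband indexes in write order: larger initial importance first,
    lower index (lower frequency) first among equal importance."""
    return sorted(
        range(len(initial_importance)),
        key=lambda j: (-initial_importance[j], j),
    )
```

For importances `[2.0, 5.0, 5.0, 1.0]` the order is subbands 1, 2, 0, 3. Writing in this order means that discarding bits from the tail of the stream removes the least important subbands first.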
  • since the amplitude envelope of the residual signal in the extension layer is calculated from the amplitude envelope and the bit allocation number of the core layer coding subbands, it does not need to be transmitted to the decoding end. This increases the coding accuracy within the core layer bandwidth without spending additional bits on the amplitude envelope values of the residual signal.
  • after the unnecessary bits at the back of the bit stream multiplexer are discarded according to the required code rate, the number of bits satisfying the code rate requirement is transmitted to the decoding end. That is, unnecessary bits are discarded in order of the importance of the coding subbands, from least to most important.
  • the coding frequency band ranges from 0 to 13.6 kHz, and the maximum code rate is 64 kbps.
  • the method of layering by code rate is as follows:
  • the frequency domain coefficients in the coding band range from 0 to 7 kHz are divided into core layers.
  • the maximum code rate corresponding to the core layer is 32 kbps, which is denoted as L0 layer;
  • the coding band of the extension layer ranges from 0 to 13.6 kHz, and its maximum code rate is 64 kbps, denoted as the L1_5 layer;
  • the code rate can be divided according to the number of discarded bits into the L1_1 layer, corresponding to 36 kbps, the L1_2 layer, corresponding to 40 kbps, the L1_3 layer, corresponding to 48 kbps, the L1_4 layer, corresponding to 56 kbps, and the L1_5 layer, corresponding to 64 kbps.
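Because the stream is written in order of importance, extracting a lower rate layer amounts to keeping a prefix of each frame. The sketch below illustrates this for the 20 ms frame of this embodiment. The layer names in `LAYER_RATES_KBPS` are one reading of the garbled labels in the text, and `truncate_frame` is a hypothetical helper operating on a list of bits.

```python
# Layer names are a reading of the garbled source labels (assumption).
LAYER_RATES_KBPS = {
    "L0": 32, "L1_1": 36, "L1_2": 40, "L1_3": 48, "L1_4": 56, "L1_5": 64,
}

def truncate_frame(frame_bits, layer, frame_ms=20):
    """Keep only the leading bits that fit the code rate of the given layer."""
    keep = LAYER_RATES_KBPS[layer] * frame_ms   # kbps x ms = bits
    return frame_bits[:keep]
```

A full 64 kbps frame holds 1280 bits; truncating it to the 36 kbps layer keeps the first 720 bits, which contain the whole 640-bit core layer plus the most important 80 bits of the extension layer.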
  • Figure 5 shows the relationship between layering according to frequency band range and layering according to code rate.
  • FIG. 6 is a schematic structural diagram of a layered audio coding system according to the present invention.
  • the system includes: a transient decision unit, a frequency domain coefficient generation unit, an amplitude envelope calculation unit, an amplitude envelope quantization and coding unit, a core layer bit allocation unit, a core layer frequency domain coefficient vector quantization and coding unit, an extension layer coded signal generation unit, a residual signal amplitude envelope generation unit, an extension layer bit allocation unit, an extension layer coded signal vector quantization and coding unit, and a bit stream multiplexer;
  • the transient decision unit is configured to perform a transient decision on the audio signal of the current frame;
  • the frequency domain coefficient generation unit is connected to the transient decision unit and is configured as follows: when the transient decision indicates a steady-state signal, it directly performs a time-frequency transform on the windowed audio signal to obtain the total frequency domain coefficients; when the transient decision indicates a transient signal, it divides the audio signal into M subframes, performs a time-frequency transform on each subframe, lets the M groups of transformed frequency domain coefficients form the total frequency domain coefficients of the current frame, and rearranges the total frequency domain coefficients so that the coding subbands run from low frequency to high frequency;
  • the total frequency domain coefficients include core layer frequency domain coefficients and extension layer frequency domain coefficients, and the coding subbands include core layer coding subbands and extension layer coding subbands;
  • the core layer frequency domain coefficients form a plurality of core layer coding subbands, and the extension layer frequency domain coefficients form a plurality of extension layer coding subbands;
  • the amplitude envelope calculation unit is connected to the frequency domain coefficient generation unit, and configured to calculate an amplitude envelope value of the core layer coding subband and the extension layer coding subband;
  • the amplitude envelope quantization and coding unit is connected to the amplitude envelope calculation unit and the transient decision unit, and is configured to quantize and encode the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands, obtaining their amplitude envelope quantization indices and the corresponding coded bits; for a steady-state signal, the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands are quantized jointly; for a transient signal, the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and of the extension layer coding subbands are each rearranged;
  • the core layer bit allocation unit is connected to the amplitude envelope quantization and coding unit, and is configured to allocate bits to the core layer coding subbands according to their amplitude envelope quantization indices, obtaining the bit allocation numbers of the core layer coding subbands;
  • the core layer frequency domain coefficient vector quantization and coding unit is connected to the frequency domain coefficient generation unit, the amplitude envelope quantization and coding unit, and the core layer bit allocation unit, and is configured to normalize, vector quantize, and encode the frequency domain coefficients of the core layer coding subbands, using the quantized amplitude envelope values reconstructed from the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the core layer coding subbands, to obtain the coded bits of the core layer frequency domain coefficients;
  • the extension layer coded signal generation unit is connected to the frequency domain coefficient generation unit and the core layer frequency domain coefficient vector quantization and coding unit, and is configured to generate the residual signal and to obtain the extension layer coded signal from the residual signal and the extension layer frequency domain coefficients;
  • the residual signal amplitude envelope generation unit is connected to the amplitude envelope quantization and coding unit and the core layer bit allocation unit, and is configured to obtain the amplitude envelope quantization indices of the core layer residual signal from the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding coding subbands;
  • the extension layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and the amplitude envelope quantization and coding unit, and is configured to allocate bits to the extension layer coded signal coding subbands according to the amplitude envelope quantization indices of the core layer residual signal and of the extension layer coding subbands, obtaining the bit allocation numbers of the extension layer coded signal coding subbands;
  • the extension layer coded signal vector quantization and coding unit is connected to the amplitude envelope quantization and coding unit, the extension layer bit allocation unit, the residual signal amplitude envelope generation unit, and the extension layer coded signal generation unit, and is configured to normalize, vector quantize, and encode the extension layer coded signal, using the quantized amplitude envelope values reconstructed from the amplitude envelope quantization indices of the extension layer coded signal coding subbands and their bit allocation numbers, to obtain the coded bits of the extension layer coded signal;
  • the bit stream multiplexer is connected to the amplitude envelope quantization and coding unit, the core layer frequency domain coefficient vector quantization and coding unit, and the extension layer coded signal vector quantization and coding unit, and is configured to multiplex the core layer side information bits, the coded bits of the amplitude envelopes of the core layer coding subbands, the coded bits of the core layer frequency domain coefficients, and the extension layer side information bits.
  • when the frequency domain coefficient generation unit obtains the total frequency domain coefficients of the current frame for a transient signal, it combines the N-point time domain sampled signal x(n) of the current frame with the N-point time domain sampled signal x_old(n) of the previous frame to form a 2N-point time domain sampled signal, applies windowing and time domain anti-aliasing processing to the 2N-point signal to obtain an N-point time domain sampled signal, and applies a symmetric transform to that time domain signal; a zero sequence is then added at each end of the signal, the lengthened signal is divided into M mutually overlapping subframes, and windowing, time domain anti-aliasing processing, and a time-frequency transform are applied to the time domain signal of each subframe, yielding M groups of frequency domain coefficients that form the total frequency domain coefficients of the current frame.
  • when the frequency domain coefficient generation unit rearranges the frequency domain coefficients, it does so separately within the core layer and within the extension layer, in order of coding subband from low frequency to high frequency.
  • the rearrangement of the amplitude envelope quantization indices by the amplitude envelope quantization and coding unit specifically means: the amplitude envelope quantization indices of the coding subbands within the same subframe are arranged together in order of increasing or decreasing frequency, and at each subframe junction the two coding subbands of the two adjacent subframes that represent the same frequency are placed next to each other.
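One way to read the rearrangement rule above is as a serpentine ordering: each subframe keeps its subbands together, and the frequency direction alternates from one subframe to the next so that the subbands meeting at a junction represent the same frequency. The sketch below follows that reading; the function name and the input layout (one list per subframe, each sorted by increasing frequency) are assumptions, not details given in the text.

```python
def rearrange_envelope_indices(indices_per_subframe):
    """Concatenate per-subframe index lists in serpentine frequency order.

    indices_per_subframe: list of M lists, each ordered from low to high
    frequency (assumed layout). Even-numbered subframes are kept ascending,
    odd-numbered ones are reversed, so neighbours at each subframe junction
    belong to the same frequency position.
    """
    out = []
    for m, sub in enumerate(indices_per_subframe):
        ordered = list(sub) if m % 2 == 0 else list(reversed(sub))
        out.extend(ordered)
    return out
```

Keeping same-frequency subbands adjacent tends to make neighbouring indices similar, which is the usual motivation for such an ordering before differential or entropy coding.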
  • the bit stream multiplexer multiplexes and packs according to the following code stream format:
  • first, the side information bits of the core layer are written after the frame header of the code stream, then the amplitude envelope coded bits of the core layer coding subbands are written into the bit stream multiplexer (MUX), and then the coded bits of the core layer frequency domain coefficients are written into the MUX;
  • then the side information bits of the extension layer, the amplitude envelope coded bits of the extension layer coding subbands, and the coded bits of the extension layer coded signal are written into the MUX;
  • finally, according to the required code rate, the number of bits that satisfies the code rate requirement is transmitted to the decoding end.
  • the side information of the core layer includes the transient decision flag bit, the Huffman coding flag bit for the amplitude envelopes of the core layer coding subbands, the Huffman coding flag bit for the core layer frequency domain coefficients, and the bits giving the number of core layer bit allocation correction iterations;
  • the side information of the extension layer includes the Huffman coding flag bit for the amplitude envelopes of the extension layer coding subbands, the Huffman coding flag bit for the extension layer coded signal, and the bits giving the number of extension layer bit allocation correction iterations.
  • the extension layer coded signal generating unit further includes a residual signal generating module and an extended layer coded signal synthesizing module;
  • the residual signal generating module is configured to inverse quantize the quantized value of the core layer frequency domain coefficient, and perform a difference calculation with the core layer frequency domain coefficient to obtain a core layer residual signal;
  • the extension layer coded signal synthesis module is configured to combine the core layer residual signal and the extension layer frequency domain coefficients in frequency band order to obtain the extension layer coded signal.
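The two modules above amount to a subtraction followed by a concatenation. A minimal sketch, assuming plain Python lists of coefficients already in frequency order and a hypothetical function name:

```python
def extension_layer_coded_signal(core_coeffs, core_dequantized, ext_coeffs):
    """Build the extension layer coded signal.

    core_coeffs:      original core layer frequency domain coefficients
    core_dequantized: inverse-quantized core layer coefficients
    ext_coeffs:       extension layer frequency domain coefficients
    The residual occupies the core band; the extension layer coefficients
    follow it in frequency order.
    """
    if len(core_coeffs) != len(core_dequantized):
        raise ValueError("core coefficient lists must have equal length")
    residual = [x - q for x, q in zip(core_coeffs, core_dequantized)]
    return residual + list(ext_coeffs)
```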
  • the residual signal amplitude envelope generating unit further includes a quantization index correction value acquisition module and a residual signal amplitude envelope quantization index calculation module;
  • the quantization index correction value acquisition module is configured to look up the correction value statistics table of the core layer residual signal amplitude envelope quantization index according to the bit allocation number of each core layer coding subband, obtaining the quantization index correction value of the corresponding residual signal coding subband.
  • the quantization index correction value of each coding subband is greater than or equal to 0 and does not decrease as the bit allocation number of the corresponding core layer coding subband increases. If the bit allocation number of a core layer coding subband is 0, the quantization index correction value of the core layer residual signal in that coding subband is 0. If the bit allocation number of a coding subband equals the defined maximum bit allocation number, the amplitude envelope value of the residual signal in that coding subband is zero;
  • the residual signal amplitude envelope quantization index calculation module is configured to subtract the quantization index correction value of the corresponding coding subband from the amplitude envelope quantization index of the core layer coding subband, obtaining the amplitude envelope quantization index of the core layer residual signal coding subband.
  • the bit stream multiplexer writes the coded bits of the extension layer coded signal into the code stream in descending order of the initial importance values of the extension layer coded signal coding subbands; among coding subbands of equal importance, the coded bits of the lower-frequency coding subband are written into the code stream first.
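The write order described above is a two-key sort. The sketch below assumes that the subband index grows with frequency and that the initial importance values are already available as a list; the function name is hypothetical.

```python
def write_order(importance):
    """Return subband indices in the order their coded bits are written.

    importance[j] is the initial importance of subband j (index j is assumed
    to increase with frequency). Larger importance comes first; ties go to
    the lower-frequency subband.
    """
    return sorted(range(len(importance)), key=lambda j: (-importance[j], j))
```

Because the least important subbands end up at the tail of the code stream, discarding bits from the end removes them first, which is what makes the rate truncation described earlier possible.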
  • the layered audio decoding method of the present invention is shown in FIG. 7 and includes the following steps:
  • Step 701: Demultiplex the bit stream transmitted by the encoding end and decode the amplitude envelope coded bits of the core layer coding subbands and the extension layer coding subbands to obtain their amplitude envelope quantization indices; if the transient decision information indicates a transient signal, additionally rearrange the amplitude envelope quantization indices of the core layer coding subbands and of the extension layer coding subbands, each in order of increasing frequency;
  • Step 702: Allocate bits to the core layer coding subbands according to their amplitude envelope quantization indices, derive from this the amplitude envelope quantization indices of the core layer residual signal, and allocate bits to the extension layer coded signal coding subbands according to the amplitude envelope quantization indices of the core layer residual signal and of the extension layer coding subbands;
  • the amplitude envelope quantization index of the residual signal is calculated as follows: the correction value statistics table of the core layer residual signal amplitude envelope quantization index is looked up according to the core layer bit allocation number to obtain the correction value; this correction value is then subtracted from the amplitude envelope quantization index of the corresponding core layer coding subband to obtain the amplitude envelope quantization index of the core layer residual signal;
  • the correction value of the core layer residual signal amplitude envelope quantization index of each coding subband is greater than or equal to 0 and does not decrease as the bit allocation number of the corresponding core layer coding subband increases;
  • when the bit allocation number of a core layer coding subband is 0, the correction value of the core layer residual signal amplitude envelope quantization index is 0; when the bit allocation number of a core layer coding subband equals the defined maximum bit allocation number, the amplitude envelope value of the corresponding core layer residual signal is zero.
  • Step 703: Decode the coded bits of the core layer frequency domain coefficients and of the extension layer coded signal according to the bit allocation numbers of the core layer and the extension layer, respectively, to obtain the core layer frequency domain coefficients and the extension layer coded signal; rearrange the extension layer coded signal in subband order and add it to the core layer frequency domain coefficients to obtain the frequency domain coefficients of the entire bandwidth;
  • Step 704: If the transient decision information indicates a steady-state signal, directly perform an inverse time-frequency transform on the frequency domain coefficients of the entire bandwidth to obtain the output audio signal; if the transient decision information indicates a transient signal, rearrange the frequency domain coefficients of the entire bandwidth, divide them into M groups of frequency domain coefficients, perform an inverse time-frequency transform on each group, and calculate the final audio signal from the M groups of time domain signals obtained.
  • the decoding order of the coded bits of the extension layer coded signal is determined by the initial importance values of the extension layer coded signal coding subbands: coding subbands of greater importance are decoded first, and if two coding subbands have the same importance, the lower-frequency one is decoded first. The number of decoded bits is counted during decoding, and decoding stops when the number of decoded bits reaches the required total number of bits.
  • FIG. 8 is a flow chart of an embodiment of a layered audio decoding method of the present invention. As shown in Figure 8, the method includes:
  • the side information is decoded first, and then the amplitude envelope coded bits of the core layer in the frame are Huffman decoded or directly decoded, according to the value of Flag huff core, to obtain the amplitude envelope quantization indices of the core layer coding subbands, $Th_q(j)$, $j = 0, \ldots, L_{core} - 1$.
  • 802: Calculate the initial importance values of the core layer coding subbands from their amplitude envelope quantization indices, and use the subband importance to allocate bits to the core layer coding subbands, obtaining the core layer bit allocation numbers.
  • the bit allocation method at the decoding end is exactly the same as that at the encoding end. During bit allocation, both the bit allocation step and the amount by which a coding subband's importance is reduced after an allocation are variable.
  • after the above allocation, a further count core rounds of bit allocation are applied to the core layer coding subbands, according to the core layer bit allocation correction iteration count (count core) from the encoding end and the importance of the core layer coding subbands; the bit allocation process then ends.
  • for a coding subband whose bit allocation number is 0, the bit allocation step is 1 bit and the importance is reduced by 1 after the allocation;
  • for a coding subband whose bit allocation number is greater than 0 and less than a certain threshold, the step for an additional allocation is 0.5 bit and the importance is reduced by 0.5 after the allocation;
  • for a coding subband whose bit allocation number is greater than or equal to the threshold, the step for an additional allocation is 1 bit and the importance is reduced by 1 after the allocation;
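The variable step sizes can be captured in a small helper. This sketch covers only the step rule itself, not the full allocation loop or the bit budget; the threshold is left as a parameter because the text only calls it "a certain threshold", and both function names are hypothetical.

```python
def allocation_steps(bits, threshold):
    """Return (bit_step, importance_step) for a subband holding `bits` bits.

    The threshold value is an assumption supplied by the caller.
    """
    if bits == 0:
        return (1.0, 1.0)
    if bits < threshold:
        return (0.5, 0.5)
    return (1.0, 1.0)


def allocate_once(bits, importance, j, threshold):
    """Give subband j one more allocation step, updating both lists in place."""
    bit_step, importance_step = allocation_steps(bits[j], threshold)
    bits[j] += bit_step
    importance[j] -= importance_step
```

Since the same rule runs at both ends, the decoder can reproduce the encoder's allocation from the envelope indices alone, with no per-subband bit counts in the stream.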
  • the low-bit coding subbands and the high-bit coding subbands are inverse quantized using pyramid lattice vector quantization inverse quantization and sphere lattice vector quantization inverse quantization, respectively;
  • Huffman decoding or direct natural decoding is applied to a low-bit coding subband to obtain its pyramid lattice vector quantization indices; all of these indices are then inverse quantized and denormalized to yield the frequency domain coefficients of the coding subband.
  • with natural decoding, the quantization index is calculated from its natural binary code value; if the natural binary code value of the quantization index equals "1111 111", one further bit is read: if that bit is 0, the quantization index value is 127, and if it is 1, the quantization index value is 128.
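The escape rule for the natural binary code can be sketched as follows. The bit stream is modelled as a string of '0' and '1' characters and the 7-bit field width is taken from the seven ones in "1111 111"; the function name and the string representation are assumptions of this sketch.

```python
def read_quantization_index(bits, pos=0):
    """Read one naturally coded quantization index starting at bits[pos].

    Returns (index_value, next_position). Seven bits are read; the all-ones
    pattern is an escape that consumes one more bit to choose 127 or 128.
    """
    if pos + 7 > len(bits):
        raise ValueError("not enough bits for a 7-bit index")
    value = int(bits[pos:pos + 7], 2)
    pos += 7
    if value == 127:
        if pos >= len(bits):
            raise ValueError("missing escape bit after 1111111")
        if bits[pos] == "1":
            value = 128
        pos += 1
    return value, pos
```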
  • Step 4 If b ⁇ xb+2*N(l-l,k-j), then
  • where $Y = (y_1, y_2, \ldots, y_8)$ is the lattice point.
  • $\bar{Y} = (Y + a) / \mathrm{scale}(index)$
  • where $a = (2^{-6}, 2^{-6}, \ldots, 2^{-6})$ and $\mathrm{scale}(index)$ is the scaling factor, which can be found in Table 5.
  • the coded bits of a high-bit coding subband are decoded directly (natural decoding) to obtain the m-th index vector k of that subband; the sphere lattice vector quantization inverse quantization of the index vector is the inverse of the quantization process.
  • the specific steps are as follows:
  • the extension layer coded signal consists of the core layer residual signal and the extension layer frequency domain coefficients; the initial importance values of the extension layer coded signal coding subbands are calculated from the amplitude envelope quantization indices of those coding subbands, and bits are allocated to them accordingly.
  • the calculation of the initial importance values of the coding subbands and the bit allocation method at the decoding end are the same as those at the encoding end.
  • the extension layer coded signal is then decoded.
  • the method of decoding and inverse quantizing the extension layer is the same as that of the core layer.
  • the order in which the extension layer coded signal coding subbands are decoded is determined by their initial importance values. If two coding subbands have the same importance, the lower-frequency coding subband is decoded first. The number of decoded bits is counted at the same time, and decoding stops when the number of decoded bits reaches the required total number of bits.
  • for example, the code rate sent from the encoding end to the decoding end is 64 kbps, but for network reasons the decoding end can only obtain the first 48 kbps of the code stream, or the decoding end only supports 48 kbps decoding; in either case the decoding end stops decoding once it has decoded up to 48 kbps.
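The stop rule can be sketched as a budgeted walk over the subbands in decoding order. The sketch assumes that each subband's bit count is known from the bit allocation and that a subband is decoded only if all of its bits fit within the budget; the text does not say how a partially received subband is handled, so that choice, like the function name, is an assumption.

```python
def subbands_to_decode(ordered_subbands, bit_budget):
    """Select the subbands that can be decoded within bit_budget.

    ordered_subbands: list of (subband_index, n_bits) already sorted in
    decoding order (most important first, low frequency first on ties).
    Returns (decoded_indices, bits_used).
    """
    decoded = []
    used = 0
    for index, n_bits in ordered_subbands:
        if used + n_bits > bit_budget:
            break  # the remaining bits were truncated or are not wanted
        decoded.append(index)
        used += n_bits
    return decoded, used
```

Subbands that are not reached keep zero coefficients at this stage and are natural candidates for the noise filling performed later in the decoder.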
  • the coded signal obtained by decoding the extension layer is rearranged by frequency, and the core layer frequency domain coefficients and the extension layer coded signal at the same frequency are added to obtain the output frequency domain coefficients.
  • the frequency domain coefficients are then rearranged for a transient signal: all frequency domain coefficients belonging to the L subbands in Table 2 are moved back to the positions given by their original frequency domain coefficient indices, and the frequency domain coefficients at indices not covered by Table 2 are set to zero.
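The summation of the two layers can be sketched as below, assuming the decoded extension layer coded signal has already been put in frequency order so that its first entries are the core band residual; the function name is hypothetical and the Table 2 index mapping for transient frames is not modelled.

```python
def full_band_coefficients(core_coeffs, ext_signal):
    """Add core layer coefficients to the frequency-ordered extension signal.

    ext_signal covers the whole bandwidth: its first len(core_coeffs) entries
    are the decoded core layer residual, the rest are extension layer
    frequency domain coefficients.
    """
    n_core = len(core_coeffs)
    if len(ext_signal) < n_core:
        raise ValueError("extension signal must cover the core band")
    out = list(ext_signal)
    for i in range(n_core):
        out[i] += core_coeffs[i]
    return out
```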
  • FIG. 9 is a schematic structural diagram of a layered audio decoding system according to the present invention.
  • the system includes: a bit stream demultiplexer (DeMUX), an amplitude envelope decoding unit, a core layer bit allocation unit, a core layer decoding and inverse quantization unit, a residual signal amplitude envelope generation unit, an extension layer bit allocation unit, an extension layer coded signal decoding and inverse quantization unit, a whole-bandwidth frequency domain coefficient recovery unit, a noise filling unit, and an audio signal recovery unit;
  • the amplitude envelope decoding unit is connected to the bit stream demultiplexer and is configured to decode the amplitude envelope coded bits of the core layer and extension layer coding subbands output by the bit stream demultiplexer, obtaining the amplitude envelope quantization indices of the core layer coding subbands and the extension layer coding subbands; if the transient decision information indicates a transient signal, the amplitude envelope quantization indices of the core layer coding subbands and of the extension layer coding subbands are additionally rearranged, each in order of increasing frequency;
  • the core layer bit allocation unit is connected to the amplitude envelope decoding unit and is configured to allocate bits to the core layer coding subbands according to their amplitude envelope quantization indices, obtaining the bit allocation numbers of the core layer coding subbands;
  • the core layer decoding and inverse quantization unit is connected to the bit stream demultiplexer, the amplitude envelope decoding unit, and the core layer bit allocation unit, and is configured to calculate the quantized amplitude envelope values of the core layer coding subbands from their amplitude envelope quantization indices, and to use the bit allocation numbers and quantized amplitude envelope values of the core layer coding subbands to decode, inverse quantize, and denormalize the core layer frequency domain coefficient coded bits output by the bit stream demultiplexer, obtaining the core layer frequency domain coefficients;
  • the residual signal amplitude envelope generation unit is connected to the amplitude envelope decoding unit and the core layer bit allocation unit, and is configured to look up the correction value statistics table of the core layer residual signal amplitude envelope quantization index according to the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding coding subbands, obtaining the amplitude envelope quantization indices of the core layer residual signal;
  • the extension layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and the amplitude envelope decoding unit, and is configured to allocate bits to the extension layer coded signal coding subbands according to the amplitude envelope quantization indices of the core layer residual signal and of the extension layer coding subbands, obtaining the bit allocation numbers of the extension layer coded signal coding subbands;
  • the extension layer coded signal decoding and inverse quantization unit is connected to the bit stream demultiplexer, the amplitude envelope decoding unit, the extension layer bit allocation unit, and the residual signal amplitude envelope generation unit, and is configured to calculate the quantized amplitude envelope values of the extension layer coded signal coding subbands from their amplitude envelope quantization indices, and to use the bit allocation numbers and quantized amplitude envelope values of those coding subbands to decode, inverse quantize, and denormalize the extension layer coded signal coded bits output by the bit stream demultiplexer, obtaining the extension layer coded signal;
  • the whole-bandwidth frequency domain coefficient recovery unit is connected to the core layer decoding and inverse quantization unit and the extension layer coded signal decoding and inverse quantization unit, and is configured to reorder, by coding subband, the extension layer coded signal output by the extension layer coded signal decoding and inverse quantization unit, and then add it to the core layer frequency domain coefficients output by the core layer decoding and inverse quantization unit, obtaining the frequency domain coefficients of the entire bandwidth;
  • the noise filling unit is connected to the whole-bandwidth frequency domain coefficient recovery unit and the amplitude envelope decoding unit, and is configured to perform noise filling on subbands to which no coded bits were allocated during encoding;
  • the audio signal recovery unit is connected to the noise filling unit; if the transient decision information indicates a steady-state signal, it directly performs an inverse time-frequency transform on the frequency domain coefficients of the entire bandwidth to obtain the output audio signal; if the transient decision information indicates a transient signal, it rearranges the frequency domain coefficients of the entire bandwidth, divides them into M groups of frequency domain coefficients, performs an inverse time-frequency transform on each group, and calculates the final audio signal from the M groups of time domain signals obtained.
  • the residual signal amplitude envelope generating unit further includes a quantization index correction value acquisition module and a residual signal amplitude envelope quantization index calculation module;
  • the quantization index correction value acquisition module is configured to look up the correction value statistics table of the core layer residual signal amplitude envelope quantization index according to the bit allocation number of each core layer coding subband, obtaining the quantization index correction value of the corresponding residual signal coding subband.
  • the quantization index correction value of each coding subband is greater than or equal to 0 and does not decrease as the bit allocation number of the corresponding core layer coding subband increases.
  • if the bit allocation number of a core layer coding subband is 0, the quantization index correction value of the core layer residual signal in that coding subband is 0; if the bit allocation number of a core layer coding subband equals the defined maximum bit allocation number, the amplitude envelope value of the residual signal in that coding subband is zero;
  • the residual signal amplitude envelope quantization index calculation module is configured to subtract the quantization index correction value of the corresponding coding subband from the amplitude envelope quantization index of the core layer coding subband, obtaining the amplitude envelope quantization index of the core layer residual signal coding subband.
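The residual envelope derivation performed by these two modules can be sketched with a table lookup and a subtraction. The correction table itself is not reproduced in this passage, so the caller must supply one; `check_correction_table` only verifies the properties the text states (non-negative, non-decreasing, zero at zero bits). Returning `None` to signal a zero-valued residual envelope is a convention of this sketch, and all names are hypothetical.

```python
def check_correction_table(table):
    """Verify the stated properties of a {bit_allocation: correction} table."""
    keys = sorted(table)
    values = [table[k] for k in keys]
    if 0 in table and table[0] != 0:
        return False
    if any(v < 0 for v in values):
        return False
    return all(a <= b for a, b in zip(values, values[1:]))


def residual_envelope_index(core_index, core_bits, table, max_bits):
    """Derive the residual envelope quantization index for one subband.

    Returns None when core_bits equals max_bits, meaning the residual
    amplitude envelope value is zero in that subband.
    """
    if core_bits == max_bits:
        return None
    return core_index - table[core_bits]
```

Because both inputs are already known at the decoder, the same derivation can be repeated there, which is why the residual envelope never needs to be transmitted.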
  • the order in which the extension layer coded signal decoding and inverse quantization unit decodes the extension layer coded signal coding subbands is determined by their initial importance values: coding subbands of greater importance are decoded first, and if two coding subbands have the same importance, the lower-frequency coding subband is decoded first. The number of decoded bits is counted during decoding, and decoding stops when the number of decoded bits reaches the required total number of bits.
  • the audio signal recovery unit rearranges the frequency domain coefficients of the entire bandwidth as follows: the frequency domain coefficients belonging to the same subframe are arranged by coding subband from low frequency to high frequency, giving M groups of frequency domain coefficients, and the M groups are then arranged in subframe order.
  • the audio signal recovery unit calculates the final audio signal from the M groups of time domain signals as follows: inverse time domain anti-aliasing processing is applied to each group, the M resulting signals are windowed, and the M windowed signals are overlapped and added to obtain an N-point time domain sampled signal; inverse time domain anti-aliasing processing and windowing are then applied to this time domain signal, and two adjacent frames are overlapped and added to obtain the final audio output signal.
  • the present invention also provides the following layered encoding and decoding methods for transient signals:
  • the layered audio coding method for the transient signal of the present invention includes:
  • the total frequency domain coefficients are rearranged in order of coding subband from low frequency to high frequency, wherein the total frequency domain coefficients include core layer frequency domain coefficients and extension layer frequency domain coefficients, and the coding subbands include core layer coding subbands and extension layer coding subbands;
  • the core layer frequency domain coefficients form a plurality of core layer coding subbands, and the extension layer frequency domain coefficients form a plurality of extension layer coding subbands;
  • C1: allocate bits to the core layer coding subbands according to their amplitude envelope quantization indices, and then quantize and encode the core layer frequency domain coefficients to obtain the coded bits of the core layer frequency domain coefficients;
  • the amplitude envelope coded bits of the core layer and extension layer coding subbands, the coded bits of the core layer frequency domain coefficients, and the coded bits of the extension layer coded signal are multiplexed and packed, and then transmitted to the decoding end.
  • in step A1, the total frequency-domain coefficients of the current frame are obtained as follows:
  • the N-point time-domain sampled signal x(n) of the current frame is combined with the N-point time-domain sampled signal x_old(n) of the previous frame to form a 2N-point time-domain sampled signal, and windowing and time-domain anti-aliasing are then applied to this 2N-point signal to obtain an N-point time-domain sampled signal;
  • in step A1, when the frequency-domain coefficients are rearranged, they are rearranged separately within the core layer and within the extension layer, by coding subband from low frequency to high frequency.
  • in step B1, rearranging the amplitude envelope quantization indices includes:
  • the amplitude envelope quantization indices of the coding subbands within the same subframe are arranged together in order of increasing or decreasing frequency, and at each subframe boundary the two coding subbands that belong to the two subframes and represent equivalent frequencies are placed next to each other.
  • in step F1, the multiplexing and packing follow this code stream format:
  • the side information bits of the core layer are written after the frame header of the code stream, the amplitude envelope coded bits of the core layer coding subbands are written into the bit stream multiplexer (MUX), and then the coded bits of the core layer frequency-domain coefficients are written into the MUX;
  • the side information bits of the extension layer, the amplitude envelope coded bits of the extension layer coding subbands, and the coded bits of the extension layer coded signal are then written into the MUX in that order;
  • the number of bits that satisfies the required code rate is transmitted to the decoding end.
  • the side information of the core layer includes the transient decision flag bit, the Huffman coding flag bit of the amplitude envelopes of the core layer coding subbands, the Huffman coding flag bit of the core layer frequency-domain coefficients, and the bits giving the number of core layer bit allocation correction iterations; the side information of the extension layer includes the Huffman coding flag bit of the amplitude envelopes of the extension layer coding subbands, the Huffman coding flag bit of the extension layer coded signal, and the bits giving the number of extension layer bit allocation correction iterations.
  • the layered decoding method for a transient signal of the present invention comprises:
  • Step A2: demultiplexing the bit stream transmitted by the encoding end, and decoding the amplitude envelope coded bits of the core layer coding subbands and the extension layer coding subbands to obtain the amplitude envelope quantization indices of the core layer coding subbands and the extension layer coding subbands;
  • the amplitude envelope quantization indices of the core layer coding subbands and of the extension layer coding subbands are each rearranged in order of increasing frequency;
  • Step B2: performing bit allocation on the core layer coding subbands according to the rearranged amplitude envelope quantization indices of the core layer coding subbands, and from this calculating the amplitude envelope quantization indices of the core layer residual signal;
  • Step C2: performing bit allocation on the coding subbands of the extension layer coded signal according to the amplitude envelope quantization indices of the core layer residual signal and the rearranged amplitude envelope quantization indices of the extension layer coding subbands;
  • Step D2: according to the bit allocation numbers of the core layer and the extension layer, decoding the coded bits of the core layer frequency-domain coefficients and the coded bits of the extension layer coded signal to obtain the core layer frequency-domain coefficients and the extension layer coded signal, rearranging the extension layer coded signal in subband order, and adding it to the core layer frequency-domain coefficients to obtain the frequency-domain coefficients of the entire bandwidth;
  • Step E2: rearranging the frequency-domain coefficients of the entire bandwidth, dividing them into M groups, performing an inverse time-frequency transform on each group of frequency-domain coefficients, and calculating the final audio signal from the M groups of transformed time-domain signals.
  • in step E2, the frequency-domain coefficients of the entire bandwidth are rearranged as follows: the frequency-domain coefficients belonging to the same subframe are arranged by coding subband from low frequency to high frequency, giving M groups of frequency-domain coefficients;
  • the M groups of frequency-domain coefficients are then arranged in subframe order.
  • in step E2, calculating the final audio signal from the M groups of transformed time-domain signals comprises: performing inverse time-domain anti-aliasing on each group, windowing the M resulting groups of signals, and overlapping and adding the M windowed groups to obtain an N-point time-domain sampled signal x_q(n); inverse time-domain anti-aliasing and windowing are then applied to this time-domain signal, and two adjacent frames are overlapped and added to obtain the final audio output signal.
  • the invention introduces a processing method for transient signal frames into a layered audio encoding and decoding method: a segmented time-frequency transform is performed on the transient signal frame, and the resulting frequency-domain coefficients are then rearranged separately within the core layer and within the extension layer,
  • so that the same subsequent processing (bit allocation, frequency-domain coefficient encoding and so on) can be applied as for steady-state signal frames, which improves the coding efficiency of transient signal frames and the quality of layered audio encoding and decoding.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Description

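Layered audio encoding and decoding method and system, and layered encoding and decoding method for transient signals
Technical Field
The present invention relates to audio encoding and decoding technology, and in particular to a layered audio encoding and decoding method and system and to a layered encoding and decoding method for transient signals.
Background Art
Layered audio coding organizes the code stream of the encoded audio in layers, generally one core layer and several extension layers. A decoder can decode only the code stream of the lower layers (for example the core layer) when the code stream of the higher layers (for example the extension layers) is not available; the more layers are decoded, the greater the improvement in sound quality.
Layered coding has considerable practical value for communication networks. On the one hand, data may be transmitted cooperatively over several channels whose packet loss rates differ. In that case the data usually needs to be layered: the important part is sent over a stable channel with a relatively low packet loss rate, and the less important part over an unstable channel with a relatively high packet loss rate, so that a packet lost on the unstable channel only causes a relative drop in sound quality rather than making an entire frame undecodable. On the other hand, the bandwidth of some communication networks (such as the Internet) is unstable and differs from user to user, so a single fixed code rate cannot serve users with different bandwidths, whereas a layered coding scheme lets each user obtain the best sound quality available under that user's own bandwidth.
Conventional layered audio coding schemes, such as the International Telecommunication Union (ITU) standards G.729.1 and G.VBR, apply no dedicated processing to transient signal frames. Their coding performance is therefore poor for signals with many transient components (such as percussion signals), especially at low and medium code rates.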
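Summary of the Invention
The technical problem to be solved by the present invention is to provide an efficient layered audio encoding and decoding method and system, and a layered encoding and decoding method for transient signals, so as to improve the quality of layered audio encoding and decoding.
To solve the above problem, the present invention provides a layered audio encoding method, including:
performing a transient decision on the audio signal of the current frame;
when the transient decision is a steady-state signal, directly performing a time-frequency transform on the windowed audio signal to obtain the total frequency-domain coefficients; when the transient decision is a transient signal, dividing the audio signal into M subframes, performing a time-frequency transform on each subframe so that the M resulting groups of frequency-domain coefficients form the total frequency-domain coefficients of the current frame, and rearranging the total frequency-domain coefficients by coding subband from low frequency to high frequency, wherein the total frequency-domain coefficients include core layer frequency-domain coefficients and extension layer frequency-domain coefficients, the coding subbands include core layer coding subbands and extension layer coding subbands, the core layer frequency-domain coefficients form a number of core layer coding subbands, and the extension layer frequency-domain coefficients form a number of extension layer coding subbands;
quantizing and encoding the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands to obtain their amplitude envelope quantization indices and the corresponding coded bits; wherein, for a steady-state signal, the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands are quantized jointly; for a transient signal, the amplitude envelope values of the core layer coding subbands and of the extension layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and of the extension layer coding subbands are each rearranged;
performing bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, and then quantizing and encoding the core layer frequency-domain coefficients to obtain the coded bits of the core layer frequency-domain coefficients;
inverse-quantizing the vector-quantized frequency-domain coefficients of the core layer and subtracting them from the original frequency-domain coefficients obtained from the time-frequency transform to obtain the core layer residual signal;
calculating the amplitude envelope quantization indices of the core layer residual signal from the amplitude envelope quantization indices and bit allocation numbers of the core layer coding subbands;
performing bit allocation on the coding subbands of the extension layer coded signal according to the amplitude envelope quantization indices of the core layer residual signal and the amplitude envelope quantization indices of the extension layer coding subbands, and then quantizing and encoding the extension layer coded signal to obtain its coded bits, wherein the extension layer coded signal consists of the core layer residual signal and the extension layer frequency-domain coefficients;
multiplexing and packing the amplitude envelope coded bits of the core layer and extension layer coding subbands, the coded bits of the core layer frequency-domain coefficients, and the coded bits of the extension layer coded signal, and transmitting them to the decoding end.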
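To solve the above problem, the present invention further provides a layered audio decoding method, including:
demultiplexing the bit stream transmitted by the encoding end, and decoding the amplitude envelope coded bits of the core layer coding subbands and the extension layer coding subbands to obtain the amplitude envelope quantization indices of the core layer coding subbands and the extension layer coding subbands; if the transient decision information indicates a transient signal, additionally rearranging the amplitude envelope quantization indices of the core layer coding subbands and of the extension layer coding subbands, each in order of increasing frequency;
performing bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, from this calculating the amplitude envelope quantization indices of the core layer residual signal, and performing bit allocation on the coding subbands of the extension layer coded signal according to the amplitude envelope quantization indices of the core layer residual signal and the amplitude envelope quantization indices of the extension layer coding subbands;
according to the bit allocation numbers of the core layer coding subbands and of the coding subbands of the extension layer coded signal, decoding the coded bits of the core layer frequency-domain coefficients and the coded bits of the extension layer coded signal to obtain the core layer frequency-domain coefficients and the extension layer coded signal, rearranging the extension layer coded signal in subband order, and adding it to the core layer frequency-domain coefficients to obtain the frequency-domain coefficients of the entire bandwidth;
if the transient decision information indicates a steady-state signal, directly performing an inverse time-frequency transform on the frequency-domain coefficients of the entire bandwidth to obtain the output audio signal; if the transient decision information indicates a transient signal, rearranging the frequency-domain coefficients of the entire bandwidth, dividing them into M groups of frequency-domain coefficients, performing an inverse time-frequency transform on each group, and calculating the final audio signal from the M groups of transformed time-domain signals.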
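To solve the above problem, the present invention further provides a layered audio encoding method for transient signals, including:
dividing the audio signal into M subframes, performing a time-frequency transform on each subframe so that the M resulting groups of frequency-domain coefficients form the total frequency-domain coefficients of the current frame, and rearranging the total frequency-domain coefficients by coding subband from low frequency to high frequency, wherein the total frequency-domain coefficients include core layer frequency-domain coefficients and extension layer frequency-domain coefficients, the coding subbands include core layer coding subbands and extension layer coding subbands, the core layer frequency-domain coefficients form a number of core layer coding subbands, and the extension layer frequency-domain coefficients form a number of extension layer coding subbands;
quantizing and encoding the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands to obtain their amplitude envelope quantization indices and the corresponding coded bits, wherein the amplitude envelope values of the core layer coding subbands and of the extension layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and of the extension layer coding subbands are each rearranged;
performing bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, and then quantizing and encoding the core layer frequency-domain coefficients to obtain the coded bits of the core layer frequency-domain coefficients;
inverse-quantizing the vector-quantized frequency-domain coefficients of the core layer and subtracting them from the original frequency-domain coefficients obtained from the time-frequency transform to obtain the core layer residual signal;
calculating the amplitude envelope quantization indices of the coding subbands of the core layer residual signal from the amplitude envelope quantization indices and bit allocation numbers of the core layer coding subbands;
performing bit allocation on the coding subbands of the extension layer coded signal according to the amplitude envelope quantization indices of the core layer residual signal and the amplitude envelope quantization indices of the extension layer coding subbands, and then quantizing and encoding the extension layer coded signal to obtain its coded bits, wherein the extension layer coded signal consists of the core layer residual signal and the extension layer frequency-domain coefficients;
multiplexing and packing the amplitude envelope coded bits of the core layer coding subbands and extension layer coding subbands, the coded bits of the core layer frequency-domain coefficients, and the coded bits of the extension layer coded signal, and transmitting them to the decoding end.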
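To solve the above problem, the present invention further provides a layered decoding method for transient signals, including:
demultiplexing the bit stream transmitted by the encoding end, decoding the amplitude envelope coded bits of the core layer coding subbands and the extension layer coding subbands to obtain the amplitude envelope quantization indices of the core layer coding subbands and the extension layer coding subbands, and rearranging the amplitude envelope quantization indices of the core layer coding subbands and of the extension layer coding subbands, each in order of increasing frequency;
performing bit allocation on the core layer coding subbands according to the rearranged amplitude envelope quantization indices of the core layer coding subbands, and from this calculating the amplitude envelope quantization indices of the core layer residual signal;
performing bit allocation on the extension layer coding subbands according to the amplitude envelope quantization indices of the core layer residual signal and the rearranged amplitude envelope quantization indices of the extension layer coding subbands;
according to the bit allocation numbers of the core layer coding subbands and of the coding subbands of the extension layer coded signal, decoding the coded bits of the core layer frequency-domain coefficients and the coded bits of the extension layer coded signal to obtain the core layer frequency-domain coefficients and the extension layer coded signal, rearranging the extension layer coded signal in subband order, and adding it to the core layer frequency-domain coefficients to obtain the frequency-domain coefficients of the entire bandwidth;
rearranging the frequency-domain coefficients of the entire bandwidth, dividing them into M groups, performing an inverse time-frequency transform on each group of frequency-domain coefficients, and calculating the final audio signal from the M groups of transformed time-domain signals.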
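To solve the above problem, the present invention further provides a layered audio encoding system, including:
a frequency-domain coefficient generation unit, an amplitude envelope calculation unit, an amplitude envelope quantization and encoding unit, a core layer bit allocation unit, a core layer frequency-domain coefficient vector quantization and encoding unit, and a bit stream multiplexer; the system further includes a transient decision unit, an extension layer coded signal generation unit, a residual signal amplitude envelope generation unit, an extension layer bit allocation unit, and an extension layer coded signal vector quantization and encoding unit; wherein:
the transient decision unit is configured to perform a transient decision on the audio signal of the current frame;
the frequency-domain coefficient generation unit is connected to the transient decision unit and is configured to: when the transient decision is a steady-state signal, directly perform a time-frequency transform on the windowed audio signal to obtain the total frequency-domain coefficients; when the transient decision is a transient signal, divide the audio signal into M subframes, perform a time-frequency transform on each subframe so that the M resulting groups of frequency-domain coefficients form the total frequency-domain coefficients of the current frame, and rearrange the total frequency-domain coefficients by coding subband from low frequency to high frequency, wherein the total frequency-domain coefficients include core layer frequency-domain coefficients and extension layer frequency-domain coefficients, the coding subbands include core layer coding subbands and extension layer coding subbands, the core layer frequency-domain coefficients form a number of core layer coding subbands, and the extension layer frequency-domain coefficients form a number of extension layer coding subbands;
the amplitude envelope calculation unit is connected to the frequency-domain coefficient generation unit and is configured to calculate the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands;
the amplitude envelope quantization and encoding unit is connected to the amplitude envelope calculation unit and the transient decision unit and is configured to quantize and encode the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands to obtain their amplitude envelope quantization indices and the corresponding coded bits; wherein, for a steady-state signal, the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands are quantized jointly; for a transient signal, they are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and of the extension layer coding subbands are each rearranged;
the core layer bit allocation unit is connected to the amplitude envelope quantization and encoding unit and is configured to perform bit allocation on the core layer coding subbands according to their amplitude envelope quantization indices to obtain the bit allocation numbers of the core layer coding subbands;
the core layer frequency-domain coefficient vector quantization and encoding unit is connected to the frequency-domain coefficient generation unit, the amplitude envelope quantization and encoding unit, and the core layer bit allocation unit and is configured to normalize, vector-quantize, and encode the frequency-domain coefficients of the core layer coding subbands, using the quantized amplitude envelope values reconstructed from the amplitude envelope quantization indices of the core layer coding subbands together with the bit allocation numbers of the core layer coding subbands, to obtain the coded bits of the core layer frequency-domain coefficients;
the extension layer coded signal generation unit is connected to the frequency-domain coefficient generation unit and the core layer frequency-domain coefficient vector quantization and encoding unit and is configured to generate the core layer residual signal and to obtain the extension layer coded signal consisting of the core layer residual signal and the extension layer frequency-domain coefficients;
the residual signal amplitude envelope generation unit is connected to the amplitude envelope quantization and encoding unit and the core layer bit allocation unit and is configured to obtain the amplitude envelope quantization indices of the core layer residual signal from the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding core layer coding subbands;
the extension layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and the amplitude envelope quantization and encoding unit and is configured to perform bit allocation on the coding subbands of the extension layer coded signal according to the amplitude envelope quantization indices of the core layer residual signal and the amplitude envelope quantization indices of the extension layer coding subbands, to obtain the bit allocation numbers of the coding subbands of the extension layer coded signal;
the extension layer coded signal vector quantization and encoding unit is connected to the amplitude envelope quantization and encoding unit, the extension layer bit allocation unit, the residual signal amplitude envelope generation unit, and the extension layer coded signal generation unit and is configured to normalize, vector-quantize, and encode the extension layer coded signal, using the quantized amplitude envelope values reconstructed from the amplitude envelope quantization indices of the coding subbands of the extension layer coded signal together with the bit allocation numbers of those coding subbands, to obtain the coded bits of the extension layer coded signal;
the bit stream multiplexer is connected to the amplitude envelope quantization and encoding unit, the core layer frequency-domain coefficient vector quantization and encoding unit, and the extension layer coded signal vector quantization and encoding unit and is configured to pack the core layer side information bits, the amplitude envelope coded bits of the core layer coding subbands, the coded bits of the core layer frequency-domain coefficients, the extension layer side information bits, the amplitude envelope coded bits of the extension layer coding subbands, and the coded bits of the extension layer coded signal.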
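To solve the above problem, the present invention further provides a layered audio decoding system, including: a bit stream demultiplexer, an amplitude envelope decoding unit, a core layer bit allocation unit, and a core layer decoding and inverse quantization unit; the system further includes a residual signal amplitude envelope generation unit, an extension layer bit allocation unit, an extension layer coded signal decoding and inverse quantization unit, a full-bandwidth frequency-domain coefficient recovery unit, a noise filling unit, and an audio signal recovery unit; wherein:
the amplitude envelope decoding unit is connected to the bit stream demultiplexer and is configured to decode the amplitude envelope coded bits of the core layer and extension layer coding subbands output by the bit stream demultiplexer to obtain the amplitude envelope quantization indices of the core layer coding subbands and the extension layer coding subbands; if the transient decision information indicates a transient signal, it additionally rearranges the amplitude envelope quantization indices of the core layer coding subbands and the extension layer coding subbands in order of increasing frequency;
the core layer bit allocation unit is connected to the amplitude envelope decoding unit and is configured to perform bit allocation on the core layer coding subbands according to their amplitude envelope quantization indices to obtain the bit allocation numbers of the core layer coding subbands;
the core layer decoding and inverse quantization unit is connected to the bit stream demultiplexer, the amplitude envelope decoding unit, and the core layer bit allocation unit and is configured to calculate the quantized amplitude envelope values of the core layer coding subbands from their amplitude envelope quantization indices, and to decode, inverse-quantize, and denormalize the coded bits of the core layer frequency-domain coefficients output by the bit stream demultiplexer, using the bit allocation numbers and quantized amplitude envelope values of the core layer coding subbands, to obtain the core layer frequency-domain coefficients;
the residual signal amplitude envelope generation unit is connected to the amplitude envelope decoding unit and the core layer bit allocation unit and is configured to look up the statistics table of correction values for the amplitude envelope quantization indices of the core layer residual signal, according to the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding core layer coding subbands, to obtain the amplitude envelope quantization indices of the core layer residual signal;
the extension layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and the amplitude envelope decoding unit and is configured to perform bit allocation on the coding subbands of the extension layer coded signal according to the amplitude envelope quantization indices of the core layer residual signal and the amplitude envelope quantization indices of the extension layer coding subbands, to obtain the bit allocation numbers of the coding subbands of the extension layer coded signal;
the extension layer coded signal decoding and inverse quantization unit is connected to the bit stream demultiplexer, the amplitude envelope decoding unit, the extension layer bit allocation unit, and the residual signal amplitude envelope generation unit and is configured to calculate the quantized amplitude envelope values of the coding subbands of the extension layer coded signal from their amplitude envelope quantization indices, and to decode, inverse-quantize, and denormalize the coded bits of the extension layer coded signal output by the bit stream demultiplexer, using the bit allocation numbers and quantized amplitude envelope values of those coding subbands, to obtain the extension layer coded signal;
the full-bandwidth frequency-domain coefficient recovery unit is connected to the core layer decoding and inverse quantization unit and the extension layer coded signal decoding and inverse quantization unit and is configured to reorder, by subband order, the extension layer coded signal output by the extension layer coded signal decoding and inverse quantization unit, and then add it to the core layer frequency-domain coefficients output by the core layer decoding and inverse quantization unit to obtain the full-bandwidth frequency-domain coefficients;
the noise filling unit is connected to the full-bandwidth frequency-domain coefficient recovery unit and the amplitude envelope decoding unit and is configured to perform noise filling on the subbands to which no coded bits were allocated during encoding;
the audio signal recovery unit is connected to the noise filling unit and is configured to: if the transient decision information indicates a steady-state signal, directly perform an inverse time-frequency transform on the frequency-domain coefficients of the entire bandwidth to obtain the output audio signal; if the transient decision information indicates a transient signal, rearrange the frequency-domain coefficients of the entire bandwidth, divide them into M groups of frequency-domain coefficients, perform an inverse time-frequency transform on each group, and calculate the final audio signal from the M groups of transformed time-domain signals.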
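In summary, the present invention introduces a processing method for transient signal frames into a layered audio encoding and decoding method: a segmented time-frequency transform is performed on the transient signal frame, and the resulting frequency-domain coefficients are then rearranged separately within the core layer and within the extension layer, so that the same subsequent coding steps (bit allocation, frequency-domain coefficient encoding and so on) can be applied as for steady-state signal frames. This improves the coding efficiency of transient signal frames and the quality of layered audio encoding and decoding.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the layered audio encoding method of the present invention;
FIG. 2 is a flowchart of an embodiment of the layered audio encoding method of the present invention;
FIG. 3 is a flowchart of the method for correcting the bit allocation after vector quantization according to the present invention;
FIG. 4 is a schematic diagram of the layered coded code stream of the present invention;
FIG. 5 is a schematic diagram of the relationship between layering by frequency band range and layering by code rate according to the present invention;
FIG. 6 is a schematic structural diagram of the layered audio encoding system of the present invention;
FIG. 7 is a schematic diagram of the layered audio decoding method of the present invention;
FIG. 8 is a flowchart of an embodiment of the layered audio decoding method of the present invention;
FIG. 9 is a schematic structural diagram of the layered audio decoding system of the present invention.
Preferred Embodiments of the Invention
The main idea of the layered audio encoding and decoding method and system of the present invention is to introduce a processing method for transient signal frames into a layered audio encoding and decoding method: a segmented time-frequency transform is performed on the transient signal frame, and the resulting frequency-domain coefficients are then rearranged separately within the core layer and within the extension layer, so that the same subsequent coding steps (bit allocation, frequency-domain coefficient encoding and so on) can be applied as for steady-state signal frames. This improves the coding efficiency of transient signal frames and the quality of layered audio encoding and decoding.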
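Encoding Method and System
As shown in FIG. 1, based on the above idea, the layered audio encoding method of the present invention includes the following steps:
Step 10: perform a transient decision on the audio signal of the current frame;
Step 20: process the audio signal according to the result of the transient decision to obtain the core layer and extension layer frequency-domain coefficients;
Specifically, when the transient decision is a steady-state signal, a time-frequency transform is performed directly on the windowed audio signal to obtain the total frequency-domain coefficients; when the transient decision is a transient signal, the audio signal is divided into M subframes, a time-frequency transform is performed on each subframe so that the M resulting groups of frequency-domain coefficients form the total frequency-domain coefficients of the current frame, and the total frequency-domain coefficients are rearranged by coding subband from low frequency to high frequency, wherein the total frequency-domain coefficients include core layer frequency-domain coefficients and extension layer frequency-domain coefficients, the coding subbands include core layer coding subbands and extension layer coding subbands, the core layer frequency-domain coefficients form a number of core layer coding subbands, and the extension layer frequency-domain coefficients form a number of extension layer coding subbands.
When the transient decision is a transient signal, the total frequency-domain coefficients of the current frame are obtained as follows:
the N-point time-domain sampled signal x(n) of the current frame is combined with the N-point time-domain sampled signal x_old(n) of the previous frame to form a 2N-point time-domain sampled signal, and windowing and time-domain anti-aliasing are then applied to this 2N-point signal to obtain an N-point time-domain sampled signal;
a symmetric transform is applied to this time-domain signal, a zero sequence is appended at each end of the signal, the lengthened signal is divided into M mutually overlapping subframes, and windowing, time-domain anti-aliasing, and a time-frequency transform are then applied to the time-domain signal of each subframe to obtain M groups of frequency-domain coefficients, which form the total frequency-domain coefficients of the current frame.
When the transient decision is a transient signal, the frequency-domain coefficients are rearranged separately within the core layer and within the extension layer, by coding subband from low frequency to high frequency.
Step 30: quantize and encode the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands to obtain their amplitude envelope quantization indices and the corresponding coded bits;
Specifically, for a steady-state signal, the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands are quantized jointly; for a transient signal, the amplitude envelope values of the core layer coding subbands and of the extension layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and of the extension layer coding subbands are each rearranged.
Rearranging the amplitude envelope quantization indices specifically includes:
arranging the amplitude envelope quantization indices of the coding subbands within the same subframe together in order of increasing or decreasing frequency, and, at each subframe boundary, placing next to each other the two coding subbands that belong to the two subframes and represent equivalent frequencies.
When the transient decision is a steady-state signal, Huffman coding is applied to the quantized amplitude envelope quantization indices of the core layer coding subbands. If the total number of bits consumed by Huffman coding the amplitude envelope quantization indices of all core layer coding subbands is smaller than the total number of bits consumed by natural coding, Huffman coding is used; otherwise natural coding is used, and the amplitude envelope Huffman coding flag of the core layer coding subbands is set accordingly. Likewise, Huffman coding is applied to the quantized amplitude envelope quantization indices of the extension layer coding subbands. If the total number of bits consumed by Huffman coding the amplitude envelope quantization indices of all extension layer coding subbands is smaller than the total number of bits consumed by natural coding, Huffman coding is used; otherwise natural coding is used, and the amplitude envelope Huffman coding flag of the extension layer coding subbands is set accordingly.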
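Step 40: perform bit allocation on the core layer coding subbands according to their amplitude envelope quantization indices, and then quantize and encode the core layer frequency-domain coefficients to obtain the coded bits of the core layer frequency-domain coefficients;
The coded bits of the core layer frequency-domain coefficients are obtained as follows:
the core layer frequency-domain coefficients are normalized by the quantized amplitude envelope values reconstructed from the amplitude envelope quantization indices of the core layer coding subbands, and are then quantized and encoded with either the pyramid lattice vector quantization method or the sphere lattice vector quantization method, depending on the bit allocation number of each coding subband, to obtain the coded bits of the core layer frequency-domain coefficients;
Huffman coding is applied to all quantization indices of the core layer obtained by pyramid lattice vector quantization;
if the total number of bits consumed by Huffman coding all quantization indices obtained by pyramid lattice vector quantization is smaller than the total number of bits consumed by natural coding of those indices, Huffman coding is used; the bit allocation numbers of the core layer coding subbands are then corrected using the bits saved by Huffman coding, the bits remaining after the initial bit allocation, and the total bits saved when encoding all coding subbands whose bit allocation per frequency-domain coefficient is 1 or 2, and the core layer coding subbands whose bit allocation numbers have been corrected are vector-quantized and Huffman-coded again; otherwise natural coding is used, the bit allocation numbers of the core layer coding subbands are corrected using the bits remaining after the initial bit allocation and the total bits saved when encoding all coding subbands whose bit allocation per frequency-domain coefficient is 1 or 2, and the core layer coding subbands whose bit allocation numbers have been corrected are vector-quantized and naturally coded again.
Step 50: inverse-quantize the vector-quantized frequency-domain coefficients of the core layer and subtract them from the original frequency-domain coefficients obtained from the time-frequency transform to obtain the core layer residual signal;
Step 60: calculate the amplitude envelope quantization indices of the core layer residual signal from the amplitude envelope quantization indices and bit allocation numbers of the core layer coding subbands;
The amplitude envelope quantization indices of the coding subbands of the core layer residual signal are calculated as follows:
a correction value for the amplitude envelope quantization index of the core layer residual signal is derived from the bit allocation number of the core layer coding subband; the amplitude envelope quantization index of the core layer residual signal is the amplitude envelope quantization index of the core layer coding subband minus the correction value for the corresponding coding subband.
The correction value of each coding subband is greater than or equal to 0 and does not decrease when the bit allocation number of the corresponding core layer coding subband increases;
when the bit allocation number of a core layer coding subband is 0, the correction value is 0; when the bit allocation number of a core layer coding subband equals the defined maximum bit allocation number, the amplitude envelope value of the corresponding core layer residual signal is zero.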
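The rule for the residual-signal envelope index can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the correction values below are hypothetical (the text refers to a statistics table whose contents are not reproduced here), and the names `HYPOTHETICAL_TABLE`, `correction`, and `residual_envelope_index` are invented for the example. The maximum of 9 bits per coefficient is the MaxBit value used later in the embodiment.

```python
import bisect

# Hypothetical (bits per coefficient, correction value) pairs. They only
# respect the stated properties: non-negative, non-decreasing, 0 at 0 bits.
HYPOTHETICAL_TABLE = [(0, 0), (1, 1), (2, 3), (3, 5), (5, 9), (8, 15)]


def correction(bits):
    """Correction value for a subband that received `bits` per coefficient."""
    keys = [b for b, _ in HYPOTHETICAL_TABLE]
    pos = bisect.bisect_right(keys, bits) - 1
    return HYPOTHETICAL_TABLE[pos][1]


def residual_envelope_index(core_index, bits, max_bit=9):
    """Envelope index of the core layer residual in one coding subband.

    Returns None when the subband already has the maximum allocation,
    because the residual envelope value is then treated as zero.
    """
    if bits >= max_bit:
        return None
    return core_index - correction(bits)
```

With no bits allocated the residual keeps the original envelope index, which matches the intuition that an uncoded subband leaves the whole signal in the residual.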
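Step 70: perform bit allocation on the coding subbands of the extension layer coded signal according to the amplitude envelope quantization indices of the core layer residual signal and the amplitude envelope quantization indices of the extension layer coding subbands, and then quantize and encode the extension layer coded signal to obtain its coded bits, wherein the extension layer coded signal consists of the core layer residual signal and the extension layer frequency-domain coefficients;
The coded bits of the extension layer coded signal are obtained as follows:
the extension layer coded signal is normalized by the quantized amplitude envelope values reconstructed from the amplitude envelope quantization indices of its coding subbands, and is then quantized and encoded with either the pyramid lattice vector quantization method or the sphere lattice vector quantization method, depending on the bit allocation number of each coding subband, to obtain the coded bits of the extension layer coded signal.
When the core layer frequency-domain coefficients and the extension layer coded signal are quantized and encoded, the vectors to be quantized in coding subbands whose bit allocation number is smaller than the classification threshold are quantized and encoded with the pyramid lattice vector quantization method, and the vectors to be quantized in coding subbands whose bit allocation number is greater than or equal to the classification threshold are quantized and encoded with the sphere lattice vector quantization method;
the bit allocation number is the number of bits allocated to a single coefficient in a coding subband.
Note that the extension layer coded signal consists of the core layer residual signal and the extension layer frequency-domain coefficients; in this sense the core layer residual signal is also made up of coefficients.
Huffman coding is applied to all quantization indices of the extension layer obtained by pyramid lattice vector quantization;
if the total number of bits consumed by Huffman coding all quantization indices obtained by pyramid lattice vector quantization is smaller than the total number of bits consumed by natural coding of those indices, Huffman coding is used; the bit allocation numbers of the coding subbands of the extension layer coded signal are then corrected using the bits saved by Huffman coding, the bits remaining after the initial bit allocation, and the total bits saved when encoding all coding subbands whose bit allocation per frequency-domain coefficient is 1 or 2, and the coding subbands whose bit allocation numbers have been corrected are vector-quantized and Huffman-coded again; otherwise natural coding is used, the bit allocation numbers of the coding subbands of the extension layer coded signal are corrected using the bits remaining after the initial bit allocation and the total bits saved when encoding all coding subbands whose bit allocation per frequency-domain coefficient is 1 or 2, and the coding subbands whose bit allocation numbers have been corrected are vector-quantized and naturally coded again.
When bits are allocated to the core layer coding subbands and to the coding subbands of the extension layer coded signal, a variable-step bit allocation is performed on each coding subband according to its amplitude envelope quantization index;
during bit allocation, the step for a coding subband whose bit allocation number is 0 is 1 bit, and its importance is then reduced by 1; the step for additional bits in a coding subband whose bit allocation number is greater than 0 and smaller than the classification threshold is 0.5 bit, and its importance is then reduced by 0.5; the step for additional bits in a coding subband whose bit allocation number is greater than or equal to the classification threshold is 1 bit, and its importance is then reduced by 1;
The bit allocation numbers of the coding subbands are corrected as follows:
calculate the number of bits available for correction;
find the coding subband with the greatest importance among all coding subbands; if the number of bits allocated to this coding subband has already reached the maximum value that can be allocated, set the importance of this coding subband to the lowest value and do not correct its bit allocation number any further; otherwise correct the bit allocation of this most important coding subband;
during the correction, a coding subband whose bit allocation number is 0 is given 1 bit and its importance is reduced by 1; a coding subband whose bit allocation number is greater than 0 and smaller than 5 is given 0.5 bit and its importance is reduced by 0.5; a coding subband whose bit allocation number is greater than or equal to 5 is given 1 bit and its importance is reduced by 1.
Each time a bit allocation number is corrected, the bit allocation correction iteration counter count is incremented by 1; the correction process ends when count reaches the preset upper limit or when the remaining bits available for correction are fewer than the bits required for one correction.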
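Step 80: multiplex and pack the amplitude envelope coded bits of the core layer and extension layer coding subbands, the coded bits of the core layer frequency-domain coefficients, and the coded bits of the extension layer coded signal, and transmit them to the decoding end.
The multiplexing and packing follow this code stream format:
first, the side information bits of the core layer are written after the frame header of the code stream, the amplitude envelope coded bits of the core layer coding subbands are written into the bit stream multiplexer (MUX), and then the coded bits of the core layer frequency-domain coefficients are written into the MUX;
next, the side information bits of the extension layer are written into the MUX, then the amplitude envelope coded bits of the coding subbands of the extension layer frequency-domain coefficients, and then the coded bits of the extension layer coded signal;
according to the required code rate, the number of bits that satisfies the code rate requirement is transmitted to the decoding end.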
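The field order above, followed by truncation to the required rate, can be sketched as below. This is a minimal illustration, not the codec's actual bit stream writer: bits are modelled as strings of '0' and '1', and the field names and the `multiplex` function are assumptions made for the example.

```python
# Order in which the fields are written after the frame header.
FIELD_ORDER = [
    "core_side_info",
    "core_envelope",
    "core_coefficients",
    "ext_side_info",
    "ext_envelope",
    "ext_signal",
]


def multiplex(frame_header, fields, bits_for_rate):
    """Concatenate the fields in layer order and keep only as many bits
    as the required code rate allows for one frame."""
    stream = frame_header + "".join(fields[name] for name in FIELD_ORDER)
    return stream[:bits_for_rate]


example = {
    "core_side_info": "101",
    "core_envelope": "0011",
    "core_coefficients": "111000",
    "ext_side_info": "01",
    "ext_envelope": "1100",
    "ext_signal": "10101010",
}
packed = multiplex("1", example, 16)
```

Because the core layer fields come first, truncating the stream at a lower rate removes extension layer bits before any core layer bits, which is what makes the stream layered.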
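The present invention is described in detail below with reference to the drawings and embodiments.
FIG. 2 is a flowchart of the layered audio encoding method according to the first embodiment of the present invention. This embodiment uses an audio stream with a frame length of 20 ms and a sampling rate of 32 kHz to illustrate the layered audio encoding method. The method applies equally to other frame lengths and sampling rates. As shown in FIG. 2, the method includes:
101: perform a transient decision on the audio stream with a frame length of 20 ms and a sampling rate of 32 kHz to determine whether the frame is a transient signal or a steady-state signal; when the frame is judged to be a transient signal, set the transient decision flag Flag_transient = 1; when the frame is judged to be a steady-state signal, set Flag_transient = 0. The transient decision may use various techniques, including more complex ones such as, but not limited to, the perceptual entropy method and the multi-stage decision method.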
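102: apply a time-frequency transform to the audio stream with a frame length of 20 ms and a sampling rate of 32 kHz to obtain the frequency-domain coefficients at N frequency-domain sample points. This step may be implemented as follows:
The N-point time-domain sampled signal x(n) of the current frame is combined with the N-point time-domain sampled signal x_old(n) of the previous frame to form a 2N-point time-domain sampled signal, which can be expressed as:
$$\bar{x}(n) = \begin{cases} x_{old}(n), & n = 0, 1, \ldots, N-1 \\ x(n-N), & n = N, N+1, \ldots, 2N-1 \end{cases} \qquad (1)$$
Windowing is applied to $\bar{x}(n)$ to obtain the windowed signal:
$$x_w(n) = h(n)\,\bar{x}(n) \qquad (2)$$
where h(n) is the window function, defined as:
$$h(n) = \sin\left[\left(n + \frac{1}{2}\right)\frac{\pi}{2N}\right], \quad n = 0, \ldots, 2N-1 \qquad (3)$$
The windowed 40 ms frame signal $x_w$ is converted to a signal $\tilde{x}$ of 20 ms frame length by time-domain anti-aliasing, as follows:
$$\tilde{x} = \begin{bmatrix} 0 & 0 & -J_{N/2} & -I_{N/2} \\ I_{N/2} & -J_{N/2} & 0 & 0 \end{bmatrix} x_w \qquad (4)$$
where $I_{N/2}$ is the identity matrix and $J_{N/2}$ is the anti-diagonal (exchange) matrix, both of order N/2:
$$I_{N/2} = \begin{bmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{bmatrix}_{(N/2)\times(N/2)}, \qquad J_{N/2} = \begin{bmatrix} 0 & & 1 \\ & \iddots & \\ 1 & & 0 \end{bmatrix}_{(N/2)\times(N/2)}$$
If the transient decision flag Flag_transient is 0, the current frame is a steady-state signal, and a type-IV discrete cosine transform (DCT-IV) or another type of discrete cosine transform is applied directly to the time-domain anti-aliased signal $\tilde{x}$ to obtain the frequency-domain coefficients:
[Equation (5) appears only as an image (imgf000017_0001) in the source: the DCT-IV of the anti-aliased signal, giving frequency-domain coefficients for k = 0, ..., N−1.]
If the transient decision flag Flag_transient is 1, the current frame is a transient signal. A symmetric transform is first applied to the time-domain anti-aliased signal $\tilde{x}$ to reduce spurious time-domain and frequency-domain responses. A zero sequence of length N/8 is then appended at each end of the signal, and the lengthened signal is divided into 4 mutually overlapping subframes of equal length. Each subframe has length N/2, and the subframes overlap by 50%. The two middle subframes are each windowed with a sine window of length N/2; for each of the two end subframes, half a sine window of length N/4 is applied to the inner half of the subframe. Time-domain anti-aliasing and DCT-IV are then applied to each windowed subframe signal, giving 4 groups of frequency-domain coefficients of length N/4, which form frequency-domain coefficients of total length N, indexed by k = 0, ..., N−1.
When the frame length is 20 ms and the sampling rate is 32 kHz, N = 640 (the corresponding N for other frame lengths and sampling rates can be calculated in the same way).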
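As a rough illustration of the steady-state path described above, the following Python sketch computes the frame length N, the sine window of equation (3), and a direct O(N²) DCT-IV. The function names are illustrative and not taken from the source. The time-domain anti-aliasing step is omitted, and the DCT-IV is written in its common unnormalized textbook form, which is an assumption because equation (5) is only available as an image.

```python
import math


def frame_length(frame_ms, sample_rate_hz):
    """Number of samples N in one frame (integer arithmetic)."""
    return frame_ms * sample_rate_hz // 1000


def sine_window(n_points):
    """h(n) = sin((n + 1/2) * pi / (2N)) for n = 0 .. 2N-1."""
    return [math.sin((n + 0.5) * math.pi / (2 * n_points))
            for n in range(2 * n_points)]


def dct_iv(x):
    """Unnormalized DCT-IV (assumed textbook form), direct O(N^2) sum."""
    size = len(x)
    return [sum(x[n] * math.cos(math.pi / size * (n + 0.5) * (k + 0.5))
                for n in range(size))
            for k in range(size)]


N = frame_length(20, 32000)   # 20 ms at 32 kHz
h = sine_window(N)            # 2N window samples
```

The window satisfies h(n)² + h(n+N)² = 1, the usual condition for perfect reconstruction with 50% overlap, and in this unnormalized form applying `dct_iv` twice returns the input scaled by N/2.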
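103: divide the N frequency-domain coefficients into several coding subbands and calculate the frequency-domain amplitude envelope (amplitude envelope for short) of each coding subband;
The coding subbands may be divided uniformly or non-uniformly; this embodiment uses a non-uniform subband division.
This step may be implemented with the following sub-steps:
103a: divide the frequency-domain coefficients within the frequency band to be encoded into L subbands (which may be called coding subbands);
In this embodiment, the frequency band to be encoded is 0 to 13.6 kHz, and it may be divided into non-uniform subbands according to the perceptual characteristics of the human ear. Table 1 and Table 2 give one specific division for the transient decision flag Flag_transient equal to 0 and 1, respectively.
In Table 1 and Table 2, the frequency-domain coefficients within the 0 to 13.6 kHz band are divided into 30 coding subbands, that is, L = 30, and the frequency-domain coefficients above 13.6 kHz are set to 0.
This embodiment also defines the frequency range of the core layer. When the transient decision flag Flag_transient is 0 or 1, subbands 0 to 17 of Table 1 or Table 2, respectively, are selected as the core layer subbands, so the number of core layer coding subbands is L_core = 18. The frequency band of the core layer is 0 to 7 kHz.
When the transient decision flag Flag_transient is 1, the 4 groups of frequency-domain coefficients within the frequency band to be encoded are divided into subbands, and the frequency-domain coefficients within the core layer band and within the extension layer band are then rearranged separately by coding subband from low frequency to high frequency. When the frequency-domain coefficients remaining in a group are not enough to form a subband (in Table 2, fewer than 16), they are supplemented with frequency-domain coefficients of the same or a nearby frequency from the next group, as for core layer subbands 16 and 17 in Table 2. The coding subbands in Table 2 are one specific result of this rearrangement.
The frequency-domain coefficients that make up the core layer coding subbands are called core layer frequency-domain coefficients, and those that make up the extension layer coding subbands are called extension layer frequency-domain coefficients. Equivalently, the frequency-domain coefficients are divided into core layer frequency-domain coefficients and extension layer frequency-domain coefficients, the core layer frequency-domain coefficients are divided into a number of core layer coding subbands, and the extension layer frequency-domain coefficients are divided into a number of extension layer coding subbands. The order in which the division into layers (core layer and extension layer) and the division into coding subbands are carried out does not affect the implementation of the present invention.
Table 1: Example subband division when the transient decision flag Flag_transient is 0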
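[The rows for subbands 0 to 16 appear only as an image (imgf000019_0001) in the source.]
Subband index   Start index (LIndex)   End index (HIndex)   Subband width
17              272                    287                  16
18              288                    303                  16
19              304                    319                  16
20              320                    335                  16
21              336                    351                  16
22              352                    367                  16
23              368                    383                  16
24              384                    399                  16
25              400                    415                  16
26              416                    447                  32
27              448                    479                  32
28              480                    511                  32
29              512                    543                  32
Table 2: Example subband division when the transient decision flag Flag_transient is 1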
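[The rows for subbands 0 to 17 appear only as an image (imgf000020_0001) in the source.]
Subband index   Start index (LIndex)   End index (HIndex)   Subband width
18              72                     87                   16
19              232                    247                  16
20              392                    407                  16
21              552                    567                  16
22              88                     103                  16
23              248                    263                  16
24              408                    423                  16
25              568                    583                  16
26              104                    135                  32
27              264                    295                  32
28              424                    455                  32
29              584                    615                  32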
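The extension layer rows of Table 2 follow a regular pattern: the same in-subframe frequency range is taken in turn from each of the 4 groups of N/4 = 160 coefficients, so consecutive subbands start 160 indices apart. The sketch below regenerates those rows from that observation. The function name and its parameters are illustrative, and the pattern is inferred from the visible rows only (the core layer rows are not available in the text).

```python
SUBFRAMES = 4
SUBFRAME_LEN = 160  # N / 4 coefficients per subframe, with N = 640


def interleaved_bands(offset, width, first_band):
    """Rows (band, start, end, width) covering one in-subframe frequency
    range, taken from each of the 4 subframes in turn."""
    rows = []
    for s in range(SUBFRAMES):
        start = s * SUBFRAME_LEN + offset
        rows.append((first_band + s, start, start + width - 1, width))
    return rows


rows_18_to_21 = interleaved_bands(72, 16, 18)
rows_26_to_29 = interleaved_bands(104, 32, 26)
```

Grouping the subbands this way keeps coefficients of similar frequency from different subframes adjacent, which is what lets the transient frame reuse the steady-state bit allocation and quantization stages.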
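103b: calculate the amplitude envelope value of each coding subband with the following formula:
[Equation (6) appears only as an image (imgf000021_0001) in the source.]
where LIndex(j) and HIndex(j) denote the start and end frequency-domain coefficient indices of the j-th coding subband; their values are given in Table 1 (when the transient decision flag Flag_transient is 0) and Table 2 (when Flag_transient is 1).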
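104: when the transient decision flag Flag_transient is 1, quantize and encode the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands to obtain their amplitude envelope quantization indices and amplitude envelope coded bits; the amplitude envelope coded bits of the core layer coding subbands and of the extension layer coding subbands need to be sent to the bit stream multiplexer (MUX);
When the transient decision flag Flag_transient is 0, the amplitude envelope values of the core layer coding subbands and the extension layer coding subbands are quantized jointly; when Flag_transient is 1, the amplitude envelope values of the core layer coding subbands and of the extension layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and of the extension layer coding subbands are each rearranged.
The quantization and encoding of the amplitude envelopes of the core layer coding subbands is described below:
The amplitude envelope of each coding subband is quantized with formula (7) to obtain the quantization index of the amplitude envelope of each coding subband, that is, the output value of the quantizer:
$$Th_q(j) = \lfloor 2\log_2 Th(j) \rfloor \qquad (7)$$
where $\lfloor\cdot\rfloor$ denotes rounding down. $Th_q(0)$ is the amplitude envelope quantization index of the first core layer coding subband, and its range is limited to [−5, 34]: when $Th_q(0) < -5$, set $Th_q(0) = -5$; when $Th_q(0) > 34$, set $Th_q(0) = 34$.
When the transient decision flag Flag_transient is 1, the amplitude envelope quantization indices of the core layer coding subbands are rearranged so that the differential coding of these indices described below is more efficient.
See Table 3 for a specific rearrangement example.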
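Table 3: Example of core layer amplitude envelope rearrangement
[The column headers and the rows before 13 appear only as an image (imgf000022_0003) in the source.]
13   5
14   12
15   14
16   4
17   13
The amplitude envelope quantization index $Th_q(0)$ of the first coding subband is encoded with 6 bits, that is, it consumes 6 bits.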
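The differences between the amplitude envelope quantization indices of the core layer coding subbands are calculated with the following formula:
$$\Delta Th_q(j) = Th_q(j+1) - Th_q(j), \quad j = 0, \ldots, L\_core - 2 \qquad (8)$$
The amplitude envelopes may be corrected as follows to keep $\Delta Th_q(j)$ within [−15, 16]:
if $\Delta Th_q(j) < -15$, set $\Delta Th_q(j) = -15$ and $Th_q(j) = Th_q(j+1) + 15$, for $j = L\_core - 2, \ldots, 0$;
if $\Delta Th_q(j) > 16$, set $\Delta Th_q(j) = 16$ and $Th_q(j+1) = Th_q(j) + 16$, for $j = 0, \ldots, L\_core - 2$.
Huffman coding is applied to $\Delta Th_q(j)$, $j = 0, \ldots, L\_core - 2$, and the number of bits consumed (the Huffman coded bits) is calculated. If the Huffman coded bits are greater than or equal to the number of fixed allocated bits (in this embodiment, greater than or equal to (L_core − 1) × 5), Huffman coding is not used for $\Delta Th_q(j)$, $j = 0, \ldots, L\_core - 2$, and the Huffman coding flag is set to Flag_huff_rms_core = 0; otherwise $\Delta Th_q(j)$, $j = 0, \ldots, L\_core - 2$, is Huffman coded and the flag is set to Flag_huff_rms_core = 1. The coded bits of the amplitude envelope quantization indices of the core layer coding subbands (that is, the coded bits of the amplitude envelope of the first subband and of the amplitude envelope differences) and the Huffman coding flag need to be sent to the MUX.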
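The envelope quantization, the range correction of the differences, and the choice between Huffman and fixed-length coding can be sketched as follows. This is a minimal illustration with invented function names. It takes already computed envelope values as input, and it takes the Huffman bit count as a given number because the Huffman table for envelope differences is not reproduced in the text.

```python
import math


def quantize_envelope(th):
    """Th_q(j) = floor(2 * log2(Th(j))); the first index is limited to [-5, 34]."""
    thq = [math.floor(2 * math.log2(v)) for v in th]
    thq[0] = min(34, max(-5, thq[0]))
    return thq


def limit_differences(thq):
    """Adjust the indices so every difference lies in [-15, 16]."""
    thq = list(thq)
    last = len(thq) - 1
    for j in range(last - 1, -1, -1):       # j = L-2, ..., 0
        if thq[j + 1] - thq[j] < -15:
            thq[j] = thq[j + 1] + 15
    for j in range(last):                   # j = 0, ..., L-2
        if thq[j + 1] - thq[j] > 16:
            thq[j + 1] = thq[j] + 16
    diffs = [thq[j + 1] - thq[j] for j in range(last)]
    return thq, diffs


def huffman_flag(huffman_bits, num_differences):
    """1 if Huffman coding beats 5 fixed bits per difference, else 0."""
    return 0 if huffman_bits >= num_differences * 5 else 1
```

The downward pass only raises lower-indexed values and the upward pass only lowers higher-indexed values, so the second pass cannot reintroduce a difference below −15.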
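The quantization and encoding of the amplitude envelopes of the extension layer coding subbands is described below:
When the transient decision flag Flag_transient is 0, Huffman coding is applied to the amplitude envelope differences $\Delta Th_q(j)$, $j = L\_core - 1, \ldots, L - 2$, and the number of bits consumed (the Huffman coded bits) is calculated. If the Huffman coded bits are greater than or equal to the number of fixed allocated bits (in this embodiment, greater than or equal to (L − L_core) × 5), Huffman coding is not used for $\Delta Th_q(j)$, $j = L\_core - 1, \ldots, L - 2$, and the Huffman coding flag is set to Flag_huff_rms_ext = 0; otherwise $\Delta Th_q(j)$, $j = L\_core - 1, \ldots, L - 2$, is Huffman coded and the flag is set to Flag_huff_rms_ext = 1.
When the transient decision flag Flag_transient is 1, the amplitude envelopes of the extension layer coding subbands are quantized with the following formula to obtain the quantization indices of the amplitude envelopes of the extension layer coding subbands, that is, the output values of the quantizer:
$$Th_q(j) = \lfloor 2\log_2 Th(j) \rfloor, \quad j = L\_core, \ldots, L - 1 \qquad (9)$$
where $Th_q(L\_core)$ is the amplitude envelope quantization index of the first coding subband formed by the extension layer frequency-domain coefficients, and its range is limited to [−5, 34]. The amplitude envelope quantization indices of the extension layer coding subbands are rearranged so that the differential coding of these indices described below is more efficient. See Table 4 for a specific rearrangement example.
Table 4: Example of extension layer coding subband amplitude envelope rearrangement
[Table 4 appears only as an image (imgf000024_0002) in the source.]
The amplitude envelope quantization index $Th_q(L\_core)$ of the first coding subband formed by the extension layer frequency-domain coefficients is encoded with 6 bits, that is, it consumes 6 bits. The differences between the amplitude envelope quantization indices of the extension layer coding subbands are calculated with the following formula:
$$\Delta Th_q(j) = Th_q(j+1) - Th_q(j), \quad j = L\_core, \ldots, L - 2 \qquad (10)$$
The amplitude envelopes may be corrected as follows to keep $\Delta Th_q(j)$ within [−15, 16]:
if $\Delta Th_q(j) < -15$, set $\Delta Th_q(j) = -15$ and $Th_q(j) = Th_q(j+1) + 15$, for $j = L - 2, \ldots, L\_core$;
if $\Delta Th_q(j) > 16$, set $\Delta Th_q(j) = 16$ and $Th_q(j+1) = Th_q(j) + 16$, for $j = L\_core, \ldots, L - 2$.
Huffman coding is then applied to $\Delta Th_q(j)$, $j = L\_core, \ldots, L - 2$, and the number of bits consumed (the Huffman coded bits) is calculated. If the Huffman coded bits are greater than or equal to the number of fixed allocated bits (in this embodiment, greater than or equal to (L − L_core − 1) × 5), Huffman coding is not used for $\Delta Th_q(j)$, $j = L\_core, \ldots, L - 2$, and the Huffman coding flag is set to Flag_huff_rms_ext = 0; otherwise $\Delta Th_q(j)$, $j = L\_core, \ldots, L - 2$, is Huffman coded and the flag is set to Flag_huff_rms_ext = 1. The coded bits of the amplitude envelope quantization indices formed by the extension layer frequency-domain coefficients and the Huffman coding flag need to be sent to the MUX.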
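105: calculate the initial importance of each core layer coding subband from rate-distortion theory and the amplitude envelope information of the core layer coding subbands, and allocate the core layer bits according to the importance of the core layer coding subbands.
This step may be implemented with the following sub-steps:
105a: calculate the average bit consumption of a single core layer frequency-domain coefficient:
From the total number of bits bits_available that a 20 ms frame can provide, take the number of bits bits_available_core used for core layer coding, and subtract the number of bits bit_sides_core consumed by the core layer side information and the number of bits bits_Th_core consumed by the amplitude envelope quantization indices of the core layer coding subbands. This gives the number of remaining bits bits_left_core available for encoding the core layer frequency-domain coefficients:
$$bits\_left\_core = bits\_available\_core - bit\_sides\_core - bits\_Th\_core \qquad (11)$$
The side information includes the bits of the Huffman coding flags Flag_huff_rms_core and Flag_huff_PLVQ_core and of the iteration count count_core. Flag_huff_rms_core indicates whether Huffman coding is used for the amplitude envelope quantization indices of the core layer coding subbands; Flag_huff_PLVQ_core indicates whether Huffman coding is used when vector-coding the core layer frequency-domain coefficients; and count_core gives the number of iterations of the core layer bit allocation correction (described in later steps).
The average bit consumption of a single core layer frequency-domain coefficient, R_core, is then calculated:
[Equation (12), which defines R_core, is missing from the source text.]
where L_core is the number of core layer coding subbands.
105b: calculate the optimal bit value under the maximum quantization signal-to-noise ratio gain according to rate-distortion theory:
Optimizing, by the Lagrange method, the rate distortion of independent Gaussian random variables gives the optimal bit value of each coding subband under the maximum quantization signal-to-noise ratio gain at this rate-distortion bound:
$$rr\_core(j) = [R\_core + R_{min}\_core(j)], \quad j = 0, \ldots, L\_core - 1 \qquad (13)$$
where
$$R_{min}\_core(j) = [Th_q(j) - mean\_Th_q\_core], \quad j = 0, \ldots, L\_core - 1 \qquad (14)$$
and
$$mean\_Th_q\_core = \frac{1}{HIndex(L\_core - 1) + 1} \sum_{i=0}^{L\_core - 1} Th_q(i)\,[HIndex(i) - LIndex(i) + 1] \qquad (15)$$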
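105c: calculate the initial importance of each core layer coding subband for bit allocation:
Using the above optimal bit value and a scale factor that reflects the perceptual characteristics of the human ear, the initial importance of each core layer coding subband, used to control the actual bit allocation, is obtained:
$$rk(j) = \alpha \times rr\_core(j) = \alpha\,[R\_core + R_{min}\_core(j)], \quad j = 0, \ldots, L\_core - 1 \qquad (16)$$
where $\alpha$ is a scale factor that depends on the coding rate and can be obtained by statistical analysis; usually $0 < \alpha < 1$, and in this embodiment $\alpha = 0.7$. $rk(j)$ denotes the importance of the j-th coding subband during bit allocation.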
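105d: allocate the core layer bits according to the importance of the core layer coding subbands, as follows:
First find the core layer coding subband with the largest rk(j); let its index be j_k. Then increase the bit allocation number region_bit(j_k) of each frequency-domain coefficient in this core layer coding subband and reduce the importance of this subband; at the same time calculate the total number of bits consumed by encoding this subband, bit_band_used(j_k). Finally calculate the sum of the bits consumed by all core layer coding subbands, sum(bit_band_used(j)), j = 0, ..., L_core − 1. Repeat this process until the sum of the bits consumed reaches the maximum value allowed by the available bits.
The bit allocation method of this step can be expressed by the following pseudocode:
Let region_bit(j) = 0, j = 0, 1, ..., L_core − 1;
For coding subbands 0, 1, ..., L_core − 1:
{
    Find j_k = argmax over j = 0, ..., L_core − 1 of [rk(j)];
    If region_bit(j_k) < classification threshold
    {
        If region_bit(j_k) = 0
            Let region_bit(j_k) = region_bit(j_k) + 1;
            Compute bit_band_used(j_k) = region_bit(j_k) * BandWidth(j_k);
            Let rk(j_k) = rk(j_k) − 1;
        Else if region_bit(j_k) >= 1
            Let region_bit(j_k) = region_bit(j_k) + 0.5;
            Compute bit_band_used(j_k) = region_bit(j_k) * BandWidth(j_k) * 0.5;
            Let rk(j_k) = rk(j_k) − 0.5;
    }
    Else if region_bit(j_k) >= classification threshold
    {
        Let region_bit(j_k) = region_bit(j_k) + 1;
        Update rk(j_k) depending on whether region_bit(j_k) < MaxBit [the update formula appears only as an image (imgf000027_0002) in the source];
        Compute bit_band_used(j_k) = region_bit(j_k) × BandWidth(j_k);
    }
    Compute bit_used_all = sum(bit_band_used(j)), j = 0, 1, ..., L_core − 1;
    If bit_used_all < bits_left_core − 16, go back and search for j_k again among the coding subbands, and repeat the bit allocation (where 16 is the maximum number of bits of a core layer coding subband);
    Otherwise, end the loop and output the current bit allocation numbers.
}
Finally, according to the importance of the subbands, the remaining bits (fewer than 16) are allocated to the core layer coding subbands that qualify, as follows: in core layer coding subbands whose bit allocation is 1, each frequency-domain coefficient is given 0.5 bit and the importance of the subband is reduced by 0.5, until bits_left_core − bit_used_all < 8, at which point the bit allocation ends. The bits that finally remain are recorded as the number of bits remaining after the initial core layer allocation, remain_bits_core.
The classification threshold may take a value from 2 to 8 inclusive; in this embodiment it may be 5.
MaxBit is the maximum bit allocation number that a single frequency-domain coefficient in a core layer coding subband can receive, in bits per frequency-domain coefficient. This embodiment uses MaxBit = 9. This value can be adjusted according to the coding rate of the codec. region_bit(j) is the number of bits allocated to a single frequency-domain coefficient in the j-th core layer coding subband, that is, the bit allocation number of a single frequency-domain coefficient in that subband.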
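The variable-step idea can be illustrated with a simplified greedy loop. This sketch is not the procedure above line for line: it is an assumption-laden variant in which a subband simply stops growing once its next step would exceed MaxBit or the bit budget, and the per-step cost is taken as the step size times the subband width. The function name and parameters are invented for the example.

```python
def allocate_bits(importance, widths, budget, threshold=5, max_bit=9):
    """Greedy variable-step allocation (simplified sketch).

    Step is 1 bit per coefficient for a subband at 0 bits or at/above the
    threshold, and 0.5 bit in between; importance drops by the same step.
    """
    rk = list(importance)
    bits = [0.0] * len(rk)
    used = 0.0
    active = set(range(len(rk)))
    while active:
        # Most important subband; ties go to the lowest index.
        j = max(active, key=lambda i: (rk[i], -i))
        step = 1.0 if bits[j] == 0 or bits[j] >= threshold else 0.5
        cost = step * widths[j]
        if bits[j] + step > max_bit or used + cost > budget:
            active.discard(j)   # this subband cannot grow any further
            continue
        bits[j] += step
        rk[j] -= step
        used += cost
    return bits, used


bits, used = allocate_bits([3.0, 1.0], [8, 8], 32)
```

Half-bit steps in the low range give finer control where each added bit matters most, while whole-bit steps above the threshold match the granularity of the sphere lattice quantizer used for high-bit subbands.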
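In addition, this step may also use $Th_q(j)$, or a floored expression in $\log_2[Th(j)]$ (illegible in the source), as the initial bit allocation importance of the core layer coding subbands, j = 0, ..., L_core − 1. The coding subbands referred to in steps 106 and 107 below are all core layer coding subbands.
106: normalize the frequency-domain coefficients in each core layer coding subband by the quantized amplitude envelope value reconstructed from the amplitude envelope quantization index of that subband, and then group the normalized frequency-domain coefficients into a number of vectors;
For all j = 0, ..., L_core − 1, all frequency-domain coefficients $X_j$ in coding subband j are normalized by the quantized amplitude envelope $2^{Th_q(j)/2}$ of that subband:
$$X_j^{normalized} = \frac{X_j}{2^{Th_q(j)/2}} \qquad (17)$$
Every 8 consecutive coefficients in a coding subband are grouped into one 8-dimensional vector. With the subband division of Table 1, the coefficients in coding subband j can be grouped exactly into Lattice_D8(j) 8-dimensional vectors. Each normalized, grouped 8-dimensional vector to be quantized can be written as $Y_j^m$, where m is the position of the 8-dimensional vector within the coding subband and ranges from 0 to Lattice_D8(j) − 1.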
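The normalization of equation (17) and the grouping into 8-dimensional vectors amount to a division and a slice. The sketch below assumes the subband width is a multiple of 8, as in the subband tables (widths of 16 and 32 are the ones visible in the text); the function name is illustrative.

```python
def normalize_and_group(coeffs, thq, dim=8):
    """Divide the subband coefficients by 2**(thq / 2) and split the
    result into consecutive vectors of `dim` coefficients."""
    scale = 2.0 ** (thq / 2.0)
    normalized = [c / scale for c in coeffs]
    return [normalized[i:i + dim] for i in range(0, len(normalized), dim)]


vectors = normalize_and_group([4.0] * 16, 4)
```

Since the decoder reconstructs the same 2^(Th_q/2) from the transmitted index, it can undo this scaling exactly after inverse quantization.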
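107: for all j = 0, ..., L_core − 1, check the number of bits region_bit(j) allocated to coding subband j. If region_bit(j) is smaller than the classification threshold, the coding subband is called a low-bit coding subband, and the vectors to be quantized in it are quantized and encoded with the pyramid lattice vector quantization method; if region_bit(j) is greater than or equal to the threshold, the coding subband is called a high-bit coding subband, and the vectors to be quantized in it are quantized and encoded with the sphere lattice vector quantization method. The threshold in this embodiment is 5 bits.
The pyramid lattice vector quantization and encoding method is described below:
Low-bit coding subbands are quantized with the pyramid lattice vector quantization method; the number of bits allocated to subband j then satisfies 1 <= region_bit(j) < 5.
The present invention uses 8-dimensional lattice vector quantization based on the $D_8$ lattice, which is defined as:
$$D_8 = \left\{ v = (v_1, v_2, \ldots, v_8)^T \in Z^8 \;\middle|\; \sum_{i=1}^{8} v_i \text{ is even} \right\} \qquad (18)$$
where $Z^8$ denotes the 8-dimensional integer space. The basic method of mapping (that is, quantizing) an 8-dimensional vector to a $D_8$ lattice point is as follows:
Let x be any real number. f(x) denotes rounding x to the nearer of its two neighbouring integers, and w(x) denotes rounding x to the farther of its two neighbouring integers. For any vector $X = (x_1, x_2, \ldots, x_8)$, define likewise $f(X) = (f(x_1), f(x_2), \ldots, f(x_8))$. In f(X), select the smallest index among the components with the largest absolute rounding error, and denote it k; then define $g(X) = (f(x_1), f(x_2), \ldots, w(x_k), \ldots, f(x_8))$. Exactly one of f(X) and g(X) is a $D_8$ lattice point, and the quantized value is then:
$$f_{D_8}(X) = \begin{cases} f(X), & \text{if } f(X) \in D_8 \\ g(X), & \text{if } g(X) \in D_8 \end{cases} \qquad (19)$$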
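The mapping to the nearest D8 lattice point described above translates directly into code. The sketch below is a minimal version with invented function names; it assumes that D8 is the set of integer vectors with an even coordinate sum and rounds exact half-integers upward, a tie-breaking choice the text does not specify.

```python
import math


def round_nearest(x):
    """f(x): nearest integer (exact halves round up, an assumption)."""
    return math.floor(x + 0.5)


def quantize_d8(x):
    """Nearest D8 lattice point: round every component, and if the sum is
    odd, re-round the worst component to its other neighbouring integer."""
    f = [round_nearest(v) for v in x]
    if sum(f) % 2 == 0:
        return f
    errors = [abs(v - fv) for v, fv in zip(x, f)]
    # Largest rounding error; the smallest index wins ties.
    k = max(range(len(x)), key=lambda i: (errors[i], -i))
    g = list(f)
    g[k] = f[k] + 1 if x[k] >= f[k] else f[k] - 1
    return g
```

Changing one component by ±1 flips the parity of the sum, so exactly one of the two candidates lies in D8, and picking the component with the largest rounding error keeps the added distortion as small as possible.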
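The specific steps for quantizing a vector to a $D_8$ lattice point and finding the index of the $D_8$ lattice point are as follows:
a: energy regularization of the vector to be quantized;
Before quantization, the energy of the vector to be quantized needs to be regularized. According to the number of bits region_bit(j) allocated to the coding subband j that contains the vector, the codebook number index and the energy scaling factor scale corresponding to that number of bits are looked up in Table 5, and the vector to be quantized is then regularized with the following formula:
$$\tilde{Y}_{j,scale}^m = (Y_j^m - a) \times scale(index) \qquad (20)$$
where $Y_j^m$ denotes the m-th normalized 8-dimensional vector to be quantized in coding subband j, $\tilde{Y}_{j,scale}^m$ denotes the 8-dimensional vector obtained by regularizing the energy of $Y_j^m$, and $a = (2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6})$.
Table 5: Correspondence between the number of pyramid lattice vector quantization bits and the codebook number, energy scaling factor, and maximum pyramid surface energy radius
[Table 5 appears only as an image (imgf000030_0001) in the source.]
b: lattice quantization of the regularized vector;
The energy-regularized 8-dimensional vector $\tilde{Y}_{j,scale}^m$ is quantized to a $D_8$ lattice point $\tilde{Y}_j^m$:
$$\tilde{Y}_j^m = f_{D_8}(\tilde{Y}_{j,scale}^m) \qquad (21)$$
where $f_{D_8}(\cdot)$ denotes the quantization operator that maps an 8-dimensional vector to a $D_8$ lattice point.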
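c: truncate the energy of $\tilde{Y}_{j,scale}^m$ according to the pyramid surface energy of the $D_8$ lattice point $\tilde{Y}_j^m$;
Calculate the energy of the $D_8$ lattice point $\tilde{Y}_j^m$ and compare it with the maximum pyramid surface energy radius LargeK(index) of the coding codebook. If it is not greater than the maximum pyramid surface energy radius, calculate the index of the lattice point in the codebook; otherwise truncate the energy of the regularized vector to be quantized $\tilde{Y}_{j,scale}^m$ of this coding subband until the energy of the quantized lattice point of the truncated vector is not greater than the maximum pyramid surface energy radius. Then keep adding a small amount of its own energy to the truncated vector until the energy of the $D_8$ lattice point it quantizes to exceeds the maximum pyramid surface energy radius, and take the last $D_8$ lattice point whose energy does not exceed the maximum pyramid surface energy radius as the quantized value of the vector. The process can be described by the following pseudocode:
Compute the pyramid surface energy of $\tilde{Y}_j^m$, that is, the sum of the absolute values of the components of the m-th vector in coding subband j:
temp_K = sum(|Ỹ_j^m|)
Ybak = Ỹ_j^m
Kbak = temp_K
If temp_K > LargeK(index)
{
    While temp_K > LargeK(index)
    {
        [The truncation and re-quantization statements appear only as images (imgf000031_0001, imgf000031_0002) in the source.]
    }
    [The definition of the small energy increment appears only as an image (imgf000031_0003) in the source.]
    Ybak = Ỹ_j^m
    Kbak = temp_K
    While temp_K <= LargeK(index)
    {
        Ybak = Ỹ_j^m
        Kbak = temp_K
        [The statements that add the energy increment and re-quantize appear only as an image (imgf000032_0001) in the source.]
        temp_K = sum(|Ỹ_j^m|)
    }
}
Ỹ_j^m = Ybak
temp_K = Kbak
At this point $\tilde{Y}_j^m$ is the last $D_8$ lattice point whose energy does not exceed the maximum pyramid surface energy radius, and temp_K is the energy of that lattice point.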
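d: generate the quantization index of the $D_8$ lattice point $\tilde{Y}_j^m$ in the codebook;
The index of the $D_8$ lattice point $\tilde{Y}_j^m$ in the codebook is calculated with the following steps:
Step 1: label the lattice points on each pyramid surface according to the pyramid surface energy.
For an integer lattice $Z^L$ of dimension L, the pyramid surface with energy radius K is defined as:
$$S(L, K) = \left\{ Y = (y_1, y_2, \ldots, y_L) \in Z^L \;\middle|\; \sum_{i=1}^{L} |y_i| = K \right\} \qquad (22)$$
Let N(L, K) be the number of lattice points in S(L, K). For the integer lattice $Z^L$, N(L, K) satisfies the following recurrence:
$$N(L, 0) = 1 \;(L \ge 0), \qquad N(0, K) = 0 \;(K \ge 1)$$
$$N(L, K) = N(L-1, K) + N(L-1, K-1) + N(L, K-1) \quad (L \ge 1, K \ge 1)$$
An integer lattice point $Y = (y_1, y_2, \ldots, y_L) \in Z^L$ on the pyramid surface with energy radius K is identified by a number b in [0, 1, ..., N(L, K) − 1], and b is called the label of the lattice point. The label b is found as follows:
Step 1.1: let b = 0, i = 1, k = K, l = L, and use the above recurrence to calculate N(m, n) (m <= L, n <= K). [A definition that follows here appears only as an image (imgf000033_0001) in the source.]
Step 1.2: [the update of b for the different cases of $y_i$, including the case $|y_i| > 1$, appears only as images (imgf000033_0001, imgf000033_0002) in the source.]
Step 1.3: [the update of k, l and i and the stopping condition appear only as an image (imgf000033_0003) in the source]; when the stopping condition holds, the search stops and b is the label of Y; otherwise continue with Step 1.2.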
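The recurrence for N(L, K) used in Step 1 can be evaluated with a memoized function. The sketch below uses invented names and adds a brute-force counter, feasible only for small dimensions, purely to check the recurrence against a direct enumeration. The labelling of individual lattice points is not implemented because its formulas are not available in the text.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def pyramid_count(dim, radius):
    """N(L, K): number of integer points with sum(|y_i|) == K in L dims."""
    if radius == 0:
        return 1          # only the origin
    if dim == 0:
        return 0          # no coordinates left but energy remains
    return (pyramid_count(dim - 1, radius)
            + pyramid_count(dim - 1, radius - 1)
            + pyramid_count(dim, radius - 1))


def brute_force_count(dim, radius):
    """Direct enumeration, for checking small cases only."""
    axis = range(-radius, radius + 1)
    return sum(1 for y in product(axis, repeat=dim)
               if sum(abs(v) for v in y) == radius)
```

For dimension 8 the counts grow quickly with K, which is why the codebook size, and therefore the number of index bits, is tied to the maximum pyramid surface energy radius.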
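Step 2: label the lattice points on all pyramid surfaces uniformly.
From the number of lattice points on each pyramid surface and the label of each lattice point on its own pyramid surface, calculate the label of each lattice point across all pyramid surfaces:
[Equation (23) is missing from the source text.]
where kk is an even number. The resulting index_b(j, m) is the index of the $D_8$ lattice point $\tilde{Y}_j^m$ in the codebook, that is, the index of the m-th 8-dimensional vector in coding subband j.
e: repeat steps a to d until index generation is complete for every 8-dimensional vector of every coding subband whose number of coded bits is greater than 0;
f: the pyramid lattice vector quantization method gives the vector quantization index index_b(j, k) of each 8-dimensional vector in each coding subband, where k denotes the k-th 8-dimensional vector of coding subband j. Huffman coding is applied to the quantization index index_b(j, k) in the following cases:
1) In all coding subbands whose bit allocation per frequency-domain coefficient is greater than 1 and smaller than 5, excluding 2, the natural binary code of each vector quantization index is split into groups of 4 bits, and each group is Huffman coded.
2) In all coding subbands whose bit allocation per frequency-domain coefficient is 2, the pyramid lattice vector quantization index of each 8-dimensional vector is encoded with 15 bits. Within the 15 bits, 3 groups of 4 bits and 1 group of 3 bits are each Huffman coded. Therefore, in all coding subbands whose bit allocation per frequency-domain coefficient is 2, the encoding of each 8-dimensional vector saves 1 bit.
3) When the bit allocation per frequency-domain coefficient of a coding subband is 1: if the quantization index is smaller than 127, the quantization index is encoded with 7 bits, which are split into 1 group of 3 bits and 1 group of 4 bits, and the two groups are each Huffman coded; if the quantization index equals 127, its natural binary code is "1111 1110", the first 7 ones are split into 1 group of 3 bits and 1 group of 4 bits, and the two groups are each Huffman coded; if the quantization index equals 128, its natural binary code is "1111 1111", the first 7 ones are split into 1 group of 3 bits and 1 group of 4 bits, and the two groups are each Huffman coded.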
对量化索引进行霍夫曼编码的方法可用如下伪代码描述:
在所有的 region—biti ) =1.5和 2〈region bit(j"}〈5的编码子带内
n在 [0 , region bitij) x 8/4 - 1]的范围内, 步长为 1递增, 做如下循环:
将 index bij c)右移 4*"位,
计算 "ifex— b( , :)低 4比特位 i ;?, ^LtJt tmp = md(index b(j k), 15) 计算 tmp在码本中的码字及其比特消耗数:
plvq codebookij i) = plvq code(tmp+ 1 );
plvq countij i) = plvq bit count(tmp+ 1 );
其中 plvq codebook(j,k), 和 plvq count(j,k、分 _/子带第 :个 8维矢量 的霍夫曼编码码本中的码字和比特消耗数; plvq bit count和 plvq code根据 表 6查找。
更新釆用霍夫曼编码后的比特消耗总数:
bit—used uff— all = bit—used uff— all + plvq bit _count{tmp+ 1 ); 在 region—biti) =2的编码子带内
n在 [0, region bitij) x 8/4 - 2]的范围内, 步长为 1递增, 做如下循环:
将 index bij c)右移 4*"位,
计算 "ifex— b(, :)低 4比特位 i ;?, ^LtJt tmp = md(index b(jk), 15) 计算 tmp在码本中的码字及其比特消耗:
plvq countij i) = plvq bit count (tmp+1);
plvq codebookij i) = plvq code (tmp+1);
其中 plvq countij i), 和 plvq—codebook(j,k)^^ ^ ·子带第 :个 8维矢量 的霍夫曼比特消耗数和码字; plvq bit count和 plvq code根据表 6查找。
更新釆用霍夫曼编码后的比特消耗总数:
bit—used uff— all = bit—used uff— all + plvq bit _count{tmp+ 1 );
下面需要处理一个 3比特情况:
在 index— b(j ,ΐή右移 [region— bit(f) x 8/4 - 2]*4位后,
计算 "ifex—b(, )低 3比特位 i/w?, 也就 tmp = and(index—b(JJc), 7) 计算 tmp在码本中的码字及其比特消耗:
plvq countij i) = plvq bit count—r2— 3 (tmp+1);
plvq codebookij i) = plvq code—r2— 3 (tmp+1);
其中 plvq countijji), 和 plvq—codebook(j,k)^^ ^ 7·子带第 :个 8维矢量 的霍夫曼比特消耗数和码字; plvq bit count r2 3和 plvq code r 2 3根据表 7 查找。 更新釆用霍夫曼编码后的比特消耗总数:
bit—used— huff—all = bit—used— huff—all + plvq bit _count{tmp+ 1 );
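In coding subbands with region_bit(j) = 1:
If index_b(j,k) < 127:
compute the low 4 bits of index_b(j,k) as tmp, that is, tmp = and(index_b(j,k), 15), then look up the codeword of tmp in the codebook and its bit consumption:
plvq_count(j,k) = plvq_bit_count_r1_4(tmp+1);
plvq_codebook(j,k) = plvq_code_r1_4(tmp+1);
where plvq_count(j,k) and plvq_codebook(j,k) are, respectively, the Huffman bit consumption and the codeword of the k-th 8-dimensional vector of subband j; plvq_bit_count_r1_4 and plvq_code_r1_4 are looked up in Table 8.
Update the total number of bits consumed with Huffman coding:
bit_used_huff_all = bit_used_huff_all + plvq_bit_count_r1_4(tmp+1);
A 3-bit case must then be handled:
Shift index_b(j,k) right by 4 bits,
compute the low 3 bits of index_b(j,k) as tmp, that is, tmp = and(index_b(j,k), 7), then look up the codeword of tmp in the codebook and its bit consumption:
plvq_count(j,k) = plvq_bit_count_r1_3(tmp+1);
plvq_codebook(j,k) = plvq_code_r1_3(tmp+1);
where plvq_count(j,k) and plvq_codebook(j,k) are, respectively, the Huffman bit consumption and the codeword of the k-th 8-dimensional vector of subband j; the codebooks plvq_bit_count_r1_3 and plvq_code_r1_3 are looked up in Table 9.
Update the total number of bits consumed with Huffman coding:
bit_used_huff_all = bit_used_huff_all + plvq_bit_count_r1_3(tmp+1);
If index_b(j,k) = 127:
Figure imgf000037_0001
For the first three "1"s and the last four "1"s, look up the Huffman code tables Table 9 and Table 8 respectively; the computation is the same as in the case index_b(j,k) < 127 above.
Update the total number of bits consumed with Huffman coding: 8 bits are needed in total.
If index_b(j,k) = 128:
Figure imgf000037_0002
For the first three "1"s and the last four "1"s, look up the Huffman code tables Table 9 and Table 8 respectively; the computation is the same as in the case index_b(j,k) < 127 above.
Update the total number of bits consumed with Huffman coding: 8 bits are needed in total.
Therefore, in all coding subbands in which 1 bit is allocated to each frequency-domain coefficient, the coding of each 8-dimensional vector saves 1 bit whenever index_b(j,k) < 127.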
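The bit accounting for the region_bit(j) = 1 case can be sketched as follows. This is a minimal illustration under stated assumptions, not the codec's reference implementation: the function name `huff_bits_r1` is hypothetical, the bit counts are copied from Table 8 and Table 9, and Python's 0-based indexing replaces the `tmp+1` lookups used in the text.

```python
# Bit counts copied from Table 8 (plvq_bit_count_r1_4) and Table 9 (plvq_bit_count_r1_3).
PLVQ_BIT_COUNT_R1_4 = [3, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
PLVQ_BIT_COUNT_R1_3 = [2, 3, 3, 4, 4, 3, 3, 3]


def huff_bits_r1(index_b):
    """Huffman bit cost of one 8-dimensional vector index when region_bit(j) = 1."""
    if not 0 <= index_b <= 128:
        raise ValueError("index_b must lie in [0, 128]")
    if index_b < 127:
        low4 = index_b & 15          # tmp = and(index_b, 15), looked up in Table 8
        high3 = (index_b >> 4) & 7   # after the 4-bit shift: tmp = and(index_b, 7), Table 9
        return PLVQ_BIT_COUNT_R1_4[low4] + PLVQ_BIT_COUNT_R1_3[high3]
    # 127 and 128 share seven leading ones and need one extra bit to tell them apart.
    return PLVQ_BIT_COUNT_R1_4[15] + PLVQ_BIT_COUNT_R1_3[7] + 1
```

With these tables the two escape values cost 8 bits each, which agrees with the "8 bits in total" stated above.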
Table 6 Huffman code table for pyramid vector quantization
tmp   plvq_bit_count   plvq_code
0     2                0
1     4                6
2     4                1
3     4                5
4     4                3
5     4                7
6     4                13
7     4                10
8     4                11
9     5                30
10    5                25
11    5                18
12    5                9
13    5                14
14    5                2
15    4                15
Table 7 Huffman code table for pyramid vector quantization
tmp   plvq_bit_count_r2_3   plvq_code_r2_3
0     1                     0
1     4                     1
2     4                     15
3     5                     25
4     3                     3
5     3                     5
6     4                     7
7     5                     9
Table 8 Huffman code table for pyramid vector quantization
tmp   plvq_bit_count_r1_4   plvq_code_r1_4
0     3                     7
1     5                     13
2     5                     29
3     4                     14
4     4                     3
5     4                     6
6     4                     1
7     4                     0
8     4                     8
9     4                     12
10    4                     4
11    4                     10
12    4                     9
13    4                     5
14    4                     11
15    4                     2
Table 9 Huffman code table for pyramid vector quantization
tmp   plvq_bit_count_r1_3   plvq_code_r1_3
0     2                     1
1     3                     0
2     3                     2
3     4                     7
4     4                     15
5     3                     6
6     3                     4
7     3                     3
g: Determine whether Huffman coding saves bits;
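Denote the set of all low-bit coding subbands by C. Compute the bits saved in 2) and 3) of step f above when coding all subbands in which 1 or 2 bits are allocated to each frequency-domain coefficient; this quantity is called the number of hard-saved bits, bit_saved_r1_r2_all_core. Compute the total number of bits, bit_used_huff_all, consumed by the Huffman-coded quantization vector indices of the 8-dimensional vectors of all subbands in C. Compare bit_used_huff_all with the total number of bits needed for natural coding, bit_used_nohuff_all. If bit_used_huff_all < bit_used_nohuff_all, transmit the Huffman-coded quantization vector indices and set the Huffman coding flag Flag_huff_PLVQ_core to 1; otherwise apply natural coding directly to the quantization vector indices and set Flag_huff_PLVQ_core to 0.
Here bit_used_nohuff_all equals the total number of bits allocated to all coding subbands in C, sum(bit_band_used(j), j ∈ C), minus bit_saved_r1_r2_all.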
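h: Correction of the bit allocation;
If the Huffman coding flag Flag_huff_PLVQ_core is 0, the bit allocation of the coding subbands is corrected using the number of bits left over from the initial allocation, remain_bits_core, and the number of hard-saved bits, bit_saved_r1_r2_all_core. If Flag_huff_PLVQ_core is 1, the correction uses remain_bits_core, bit_saved_r1_r2_all_core and the bits saved by Huffman coding.
The spherical lattice vector quantization and coding method is described below:
High-bit coding subbands are quantized with spherical lattice vector quantization; the number of bits allocated to subband j then satisfies 5 <= region_bit(j) <= 9.
8-dimensional lattice vector quantization based on the D8 lattice is used here as well.
a: According to the number of bits region_bit(j) allocated to each frequency-domain coefficient in coding subband j, apply the following energy regularization to the m-th normalized vector to be quantized, $Y_j^m$, of that subband:
$\hat{Y}_j^m = \beta\,(Y_j^m - a)$ (24)
where $a = (2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6})$,
$\beta = 2^{region\_bit(j)} / scale(region\_bit(j))$,
and scale(region_bit(j)) denotes the energy scaling factor used when region_bit(j) bits are allocated to each frequency-domain coefficient of the coding subband; the correspondence is given in Table 10.
Table 10 Correspondence between the bit allocation number and the energy scaling factor for spherical lattice vector quantization
Figure imgf000041_0002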
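b: Generate the index vector of the D8 lattice point
Map the energy-scaled m-th vector to be quantized of coding subband j, $\hat{Y}_j^m$, onto the D8 lattice point $\tilde{Y}_j^m$:
$\tilde{Y}_j^m = f_{D_8}(\hat{Y}_j^m)$ (25)
Determine whether $f_{D_8}(\tilde{Y}_j^m / 2^{region\_bit(j)})$ is the zero vector, that is, whether all of its components are zero. If so, the zero-vector condition is said to be satisfied; otherwise it is not satisfied.
If the zero-vector condition is satisfied, the index vector is obtained from the following index vector generation formula:
$k = (\tilde{Y}_j^m G^{-1}) \bmod 2^{region\_bit(j)}$ (26)
and the index vector k of the D8 lattice point $\tilde{Y}_j^m$ is output, where G is the generator matrix of the D8 lattice:
G =
2 0 0 0 0 0 0 0
1 1 0 0 0 0 0 0
1 0 1 0 0 0 0 0
1 0 0 1 0 0 0 0
1 0 0 0 1 0 0 0
1 0 0 0 0 1 0 0
1 0 0 0 0 0 1 0
1 0 0 0 0 0 0 1
If the zero-vector condition is not satisfied, divide the vector $\hat{Y}_j^m$ by 2 repeatedly until the zero-vector condition $f_{D_8}(\tilde{Y}_j^m / 2^{region\_bit(j)}) = 0$ holds, and back up a small multiple of $\hat{Y}_j^m$ itself as w. Then add the backed-up small multiple w to the reduced vector $\hat{Y}_j^m$, quantize it to a D8 lattice point again, and check the zero-vector condition. If the condition is no longer satisfied, compute the index vector k of the most recent D8 lattice point that satisfied it, using the index vector formula; otherwise keep adding w to $\hat{Y}_j^m$ and quantizing to a D8 lattice point until the zero-vector condition is no longer satisfied, and then compute the index vector k of the most recent D8 lattice point that satisfied it. Output the index vector k of that D8 lattice point. The procedure can also be described by the following pseudocode, in which Y_hat stands for $\hat{Y}_j^m$ and Y_tilde for $\tilde{Y}_j^m$:
temp_D = f_D8(Y_tilde / 2^region_bit(j))
Ybak = Y_tilde
Dbak = temp_D
While temp_D ≠ 0
{
    Y_hat = Y_hat / 2
    Y_tilde = f_D8(Y_hat)
    temp_D = f_D8(Y_tilde / 2^region_bit(j))
}
w = Y_hat / 16
Ybak = Y_tilde
Dbak = temp_D
While temp_D = 0
{
    Ybak = Y_tilde
    Dbak = temp_D
    Y_hat = Y_hat + w
    Y_tilde = f_D8(Y_hat)
    temp_D = f_D8(Y_tilde / 2^region_bit(j))
}
Y_tilde = Ybak
k = (Y_tilde G^-1) mod 2^region_bit(j)
c: Code the vector quantization indices of the high-bit coding subbands; the number of bits allocated to subband j then satisfies 5 <= region_bit(j) <= 9.
Following the spherical lattice vector quantization method, quantizing the 8-dimensional vectors of coding subbands whose bit allocation number is 5 to 9 yields the index vector k = {k1, k2, k3, k4, k5, k6, k7, k8}. Each component of k is coded with natural coding using the number of bits allocated to each frequency-domain coefficient, which gives the coded bits of the vector.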
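The index generation of formula (26) can be illustrated with a short sketch. It rests on several assumptions: `f_d8` is implemented as the usual nearest-point search for D8 (round every coordinate, and if the coordinate sum is odd, re-round the coordinate with the largest rounding error the other way), the function names are hypothetical, and the product with G^-1 is written out by hand from the generator matrix given above rather than computed with a matrix library.

```python
import math


def f_d8(x):
    """Nearest D8 lattice point: integer coordinates whose sum is even."""
    r = [math.floor(v + 0.5) for v in x]
    if sum(r) % 2 != 0:
        # Re-round the coordinate with the largest rounding error in the other direction.
        i = max(range(len(x)), key=lambda n: abs(x[n] - r[n]))
        r[i] += 1 if x[i] > r[i] else -1
    return r


def zero_vector_condition(y_tilde, region_bit):
    """True when f_D8(y_tilde / 2^region_bit) is the zero vector."""
    scaled = [v / 2 ** region_bit for v in y_tilde]
    return all(c == 0 for c in f_d8(scaled))


def index_vector(y_tilde, region_bit):
    """k = (y_tilde * G^-1) mod 2^region_bit for the generator matrix G of the text."""
    # y = k*G means y[0] = 2*k[0] + sum(k[1:]) and y[i] = k[i] for i >= 1,
    # so k[0] = (y[0] - sum(y[1:])) / 2, which is an integer for any D8 point.
    k0 = (y_tilde[0] - sum(y_tilde[1:])) // 2
    modulus = 2 ** region_bit
    return [k0 % modulus] + [v % modulus for v in y_tilde[1:]]
```

Each component of the resulting k fits in region_bit(j) bits, which is why natural coding of the index vector uses exactly 8 × region_bit(j) bits per vector.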
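As shown in Fig. 3, the bit allocation correction flow comprises the following steps:
301: Compute the number of bits available for bit allocation correction, diff_bit_count_core. If the Huffman coding flag Flag_huff_PLVQ_core is 0, then
diff_bit_count_core = remain_bits_core + bit_saved_r1_r2_all_core
If the Huffman coding flag Flag_huff_PLVQ_core is 1, then
diff_bit_count_core = remain_bits_core + bit_saved_r1_r2_all_core + (bit_used_nohuff_all − bit_used_huff_all)
Set count = 0;
302: If diff_bit_count_core is greater than zero, find the maximum among the values rk(j) (j = 0, ..., L_core − 1)
Figure imgf000044_0001
expressed by the formula:
$j_k = \arg\max_{j=0,\dots,L\_core-1} rk(j)$
303: Determine whether region_bit(j_k) + 1 is less than or equal to 9. If so, go to the next step; otherwise set the importance of the coding subband j_k to the lowest value (for example rk(j_k) = −100), which means that the bit allocation number of this subband needs no further correction, and jump to step 302;
304: Determine whether diff_bit_count_core is greater than or equal to the number of additional bits needed to correct the bit allocation number of coding subband j_k (computed for natural coding if Flag_huff_PLVQ_core is 0, and for Huffman coding if Flag_huff_PLVQ_core is 1). If so, carry out step 305: correct the bit allocation number region_bit(j_k) of coding subband j_k, decrease the subband importance rk(j_k), apply vector quantization and natural or Huffman coding to subband j_k again, and finally update diff_bit_count_core. Otherwise the bit allocation correction flow ends;
305: During bit allocation correction, a coding subband whose bit allocation number is 0 is allocated 1 bit and its importance is decreased by 1; a coding subband whose bit allocation number is greater than 0 and less than 5 is allocated 0.5 bit and its importance is decreased by 0.5; a coding subband whose bit allocation number is greater than or equal to 5 is allocated 1 bit and its importance is decreased by 1.
306: Set count = count + 1 and determine whether count is less than or equal to Maxcount. If so, jump to step 302; otherwise the bit allocation correction flow ends.
Here Maxcount is the upper limit on the number of loop iterations. It is determined by the coded bitstream and its sampling rate. In this embodiment, Maxcount= is used if the Huffman coding flag Flag_huff_PLVQ_core is 0, and Maxcount=31 if the flag is 1.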
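Steps 301 to 306 can be sketched as a loop. This is a simplified illustration under loud assumptions: only the natural-coding cost is modelled (allocation step times the number of coefficients in the subband), re-quantization is omitted, `band_size` and the function name are hypothetical, and negative infinity replaces the example sentinel −100 so that the loop can stop once every subband is saturated.

```python
def correct_bit_allocation(region_bit, rk, band_size, diff_bit_count, max_count):
    """Greedy bit allocation correction following steps 302 to 306."""
    region_bit, rk = list(region_bit), list(rk)
    count = 0
    while diff_bit_count > 0 and count <= max_count:
        jk = max(range(len(rk)), key=lambda j: rk[j])   # step 302: most important subband
        if rk[jk] == float("-inf"):
            break                                       # every subband already saturated
        if region_bit[jk] + 1 > 9:                      # step 303: cap at 9 bits
            rk[jk] = float("-inf")
            continue
        step = 0.5 if 0 < region_bit[jk] < 5 else 1     # step 305: allocation step size
        cost = step * band_size[jk]                     # assumed natural-coding cost
        if diff_bit_count < cost:                       # step 304: not enough bits left
            break
        region_bit[jk] += step
        rk[jk] -= step
        diff_bit_count -= cost
        count += 1                                      # step 306
    return region_bit, diff_bit_count
```

Because the decoder only receives the iteration count (count_core), it can replay the same loop for a fixed number of iterations and arrive at the same allocation.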
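108: Dequantize the vector-quantized frequency-domain coefficients of the core layer and subtract them from the original frequency-domain coefficients obtained by the time-frequency transform, which gives the core layer residual signal; the core layer residual signal and the extended layer frequency-domain coefficients together form the extended layer coding signal;
It will be understood that the step of forming the extended layer coding signal (step 108) may also be performed after the bit allocation of the extended layer coding signal (step 110) has been completed.
109: Divide the core layer residual signal into the same subbands as the frequency-domain coefficients, and compute the amplitude envelope quantization indices of the coding subbands of the core layer residual signal from the amplitude envelope quantization indices of the core layer coding subbands and the core layer bit allocation numbers (that is, region_bit(j), j = 0, ..., L_core − 1).
This step can be implemented with the following sub-steps:
109a: Using the number of bits allocated to each frequency-domain coefficient in the core layer coding subbands, region_bit(j), j = 0, ..., L_core − 1, look up the statistical table of correction values for the core layer residual signal amplitude envelope quantization index, which gives the correction values diff(region_bit(j)), j = 0, ..., L_core − 1;
where region_bit(j) = 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, j = 0, ..., L_core − 1, and the amplitude envelope quantization index correction values can be set according to the following rules:
diff(region_bit(j)) >= 0; and
when region_bit(j) > 0, diff(region_bit(j)) does not decrease as region_bit(j) increases.
To obtain better coding and decoding results, statistics can be gathered on the difference between the subband amplitude envelope quantization index computed for each bit allocation number region_bit(j) and the subband amplitude envelope quantization index computed directly from the residual signal, which gives the table of most probable correction values shown in Table 11:
Table 11 Statistical table of amplitude envelope quantization index correction values
region_bit diff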
1 1
1.5 2
2 3
2.5 4
3 5
3.5 5
4 6
4.5 7
5 7
6 9
7 10
8 12
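109b: From the amplitude envelope quantization index of coding subband j in the core layer and the quantization index correction values in Table 11, compute the amplitude envelope quantization index of the j-th subband of the core layer residual signal:
$Th'_q(j) = Th_q(j) - diff(region\_bit(j))$, j = 0, ..., L_core − 1
where $Th_q(j)$ is the amplitude envelope quantization index of coding subband j in the core layer.
Note that when the bit allocation number of a core layer coding subband is 0, the coding subband amplitude envelope of the core layer residual signal needs no correction; the subband amplitude envelope value of the core layer residual signal is then the same as the coding subband amplitude envelope value of the core layer.
In addition, when the bit allocation number of a core layer coding subband is region_bit(j) = 9, the quantized amplitude envelope value of the j-th coding subband of the core layer residual signal is set to zero.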
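Sub-steps 109a and 109b amount to a table lookup followed by a subtraction, which the following sketch illustrates. The correction values are copied from Table 11; the function name is hypothetical, and returning `None` to signal "envelope value forced to zero" for region_bit(j) = 9 is a choice made for this illustration only.

```python
# Correction values diff(region_bit) copied from Table 11.
DIFF_TABLE = {1: 1, 1.5: 2, 2: 3, 2.5: 4, 3: 5, 3.5: 5,
              4: 6, 4.5: 7, 5: 7, 6: 9, 7: 10, 8: 12}


def residual_envelope_index(th_q, region_bit):
    """Amplitude envelope quantization index of the core layer residual in one subband."""
    if region_bit == 0:
        return th_q           # no bits spent in the core layer: envelope is unchanged
    if region_bit == 9:
        return None           # maximum allocation: residual envelope value is set to zero
    return th_q - DIFF_TABLE[region_bit]
```

Since both inputs are known to the decoder, the same function can be evaluated there, so the residual envelope never has to be transmitted.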
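110: Allocate bits to the coding subbands of the extended layer coding signal in the extended layer:
The extended layer subband division is determined by Table 1 or Table 2. The coding signal in subbands 0, ..., L_core − 1 is the core layer residual signal, and the coding signal in subbands L_core, ..., L − 1 consists of the frequency-domain coefficients of the extended layer coding subbands. Subbands 0 to L − 1 are also called the coding subbands of the extended layer coding signal.
From the computed amplitude envelope quantization indices of the core layer residual signal, the amplitude envelope quantization indices of the extended layer coding subbands and the number of bits available to the extended layer, compute the initial importance values of the coding subbands of the extended layer coding signal over the whole extended layer frequency range, using the same bit allocation scheme as in the core layer, and allocate bits to the coding subbands of the extended layer coding signal.
In this embodiment the extended layer frequency range is 0 ~ 13.6 kHz. The total bit rate of the audio stream is 64 kbps and the bit rate of the core layer is 32 kbps, so the maximum bit rate of the extended layer is 64 kbps. The total number of bits available in the extended layer is computed from the core layer bit rate and the maximum extended layer bit rate, and bits are then allocated until they are completely consumed.
111: According to the amplitude envelope quantization indices of the coding subbands of the extended layer coding signal and the corresponding bit allocation numbers, normalize, vector-quantize and code the extended layer coding signal to obtain its coded bits. The vector construction, the vector quantization method and the coding method for the coding signal in the extended layer are the same as those for the frequency-domain coefficients in the core layer.
112: Construct the hierarchical coded bitstream, building the bit-rate layers according to the bit rate.
As shown in Fig. 4, the hierarchical coded bitstream is constructed as follows. First the side information of the core layer is written into the bitstream multiplexer MUX in the following order: Flag_transient, Flag_huff_rms_core, Flag_huff_PLVQ_core and count_core. Next the amplitude envelope coded bits of the core layer coding subbands are written into the MUX, followed by the coded bits of the core layer frequency-domain coefficients. Then the side information of the extended layer is written into the MUX in the following order: the amplitude envelope Huffman coding flag of the extended layer coding subbands Flag_huff_rms_ext, the frequency-domain coefficient Huffman coding flag Flag_huff_PLVQ_ext and the number of bit allocation correction iterations count_ext. Next the amplitude envelope coded bits of the extended layer coding subbands (L_core, ..., L − 1) are written into the MUX, followed by the coded bits of the extended layer coding signal. Finally the hierarchical bitstream written in this order is transmitted to the decoder;
The coded bits of the extended layer coding signal are written in the order given by the initial importance values of the coding subbands of the extended layer coding signal. That is, the coded bits of coding subbands with a larger initial importance are written into the bitstream first, and among coding subbands of equal importance the lower-frequency subband comes first.
Because the amplitude envelope of the residual signal in the extended layer is computed from the amplitude envelopes and bit allocation numbers of the core layer coding subbands, it does not have to be transmitted to the decoder. This increases the coding precision within the core layer bandwidth without spending extra bits on the amplitude envelope values of the residual signal.
According to the bit rate required for transmission, the unnecessary bits at the end of the bitstream multiplexer output are discarded, and the number of bits that meets the bit-rate requirement is transmitted to the decoder. In other words, unnecessary bits are discarded in order of increasing coding subband importance.
In this embodiment the coded frequency range is 0 ~ 13.6 kHz and the maximum bit rate is 64 kbps. Layering by bit rate proceeds as follows:
The frequency-domain coefficients in the coded frequency range 0 ~ 7 kHz are assigned to the core layer, whose maximum bit rate is 32 kbps; this is denoted layer L0. The coded frequency range of the extended layer is 0 ~ 13.6 kHz and its maximum bit rate is 64 kbps; this is denoted layer L1_5;
Before transmission to the decoder, the bit rate can be divided, according to the number of bits discarded, into layer L1_1 corresponding to 36 kbps, layer L1_2 corresponding to 40 kbps, layer L1_3 corresponding to 48 kbps, layer L1_4 corresponding to 56 kbps, and layer L1_5 corresponding to 64 kbps.
Fig. 5 shows the relationship between layering by frequency range and layering by bit rate.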
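Fig. 6 is a structural diagram of the hierarchical audio coding system of the present invention. As shown in Fig. 6, the system comprises: a transient decision unit, a frequency-domain coefficient generation unit, an amplitude envelope calculation unit, an amplitude envelope quantization and coding unit, a core layer bit allocation unit, a core layer frequency-domain coefficient vector quantization and coding unit, an extended layer coding signal generation unit, a residual signal amplitude envelope generation unit, an extended layer bit allocation unit, an extended layer coding signal vector quantization and coding unit, and a bitstream multiplexer; wherein:
the transient decision unit is used to perform a transient decision on the audio signal of the current frame;
the frequency-domain coefficient generation unit is connected to the transient decision unit. When the transient decision indicates a steady-state signal, it is used to apply the time-frequency transform directly to the windowed audio signal to obtain the total frequency-domain coefficients. When the transient decision indicates a transient signal, it is used to divide the audio signal into M subframes, apply the time-frequency transform to each subframe, form the total frequency-domain coefficients of the current frame from the M groups of frequency-domain coefficients so obtained, and rearrange the total frequency-domain coefficients in order of coding subbands from low to high frequency. The total frequency-domain coefficients comprise core layer frequency-domain coefficients and extended layer frequency-domain coefficients, the coding subbands comprise core layer coding subbands and extended layer coding subbands, the core layer frequency-domain coefficients form several core layer coding subbands, and the extended layer frequency-domain coefficients form several extended layer coding subbands;
the amplitude envelope calculation unit is connected to the frequency-domain coefficient generation unit and is used to compute the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands;
the amplitude envelope quantization and coding unit is connected to the amplitude envelope calculation unit and the transient decision unit, and is used to quantize and code the amplitude envelope values of the core layer and extended layer coding subbands, giving the amplitude envelope quantization indices of the core layer and extended layer coding subbands and their coded bits. For a steady-state signal the amplitude envelope values of the core layer and extended layer coding subbands are quantized jointly; for a transient signal they are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands are each rearranged;
the core layer bit allocation unit is connected to the amplitude envelope quantization and coding unit and is used to allocate bits to the core layer coding subbands according to their amplitude envelope quantization indices, giving the bit allocation numbers of the core layer coding subbands;
the core layer frequency-domain coefficient vector quantization and coding unit is connected to the frequency-domain coefficient generation unit, the amplitude envelope quantization and coding unit and the core layer bit allocation unit. It is used to normalize, vector-quantize and code the frequency-domain coefficients of the core layer coding subbands, using the bit allocation numbers and the quantized amplitude envelope values of the core layer coding subbands reconstructed from their amplitude envelope quantization indices, giving the coded bits of the core layer frequency-domain coefficients;
the extended layer coding signal generation unit is connected to the frequency-domain coefficient generation unit and the core layer frequency-domain coefficient vector quantization and coding unit, and is used to generate the residual signal and obtain the extended layer coding signal formed by the residual signal and the extended layer frequency-domain coefficients;
the residual signal amplitude envelope generation unit is connected to the amplitude envelope quantization and coding unit and the core layer bit allocation unit, and is used to obtain the amplitude envelope quantization indices of the core layer residual signal from the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding coding subbands;
the extended layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and the amplitude envelope quantization and coding unit, and is used to allocate bits to the extended layer coding subbands according to the amplitude envelope quantization indices of the core layer residual signal and of the extended layer coding subbands, giving the bit allocation numbers of the extended layer coding subbands;
the extended layer coding signal vector quantization and coding unit is connected to the amplitude envelope quantization and coding unit, the extended layer bit allocation unit, the residual signal amplitude envelope generation unit and the extended layer coding signal generation unit. It is used to normalize, vector-quantize and code the extended layer coding signal, using the bit allocation numbers and the quantized amplitude envelope values of the coding subbands of the extended layer coding signal reconstructed from their amplitude envelope quantization indices, giving the coded bits of the extended layer coding signal;
the bitstream multiplexer is connected to the amplitude envelope quantization and coding unit, the core layer frequency-domain coefficient vector quantization and coding unit and the extended layer coding signal vector quantization and coding unit, and is used to pack the core layer side information bits, the amplitude envelope coded bits of the core layer coding subbands, the coded bits of the core layer frequency-domain coefficients, the extended layer side information bits, the amplitude envelope coded bits of the extended layer coding subbands and the coded bits of the extended layer coding signal.
When obtaining the total frequency-domain coefficients of the current frame, the frequency-domain coefficient generation unit is used to combine the N-point time-domain sampled signal x(n) of the current frame with the N-point time-domain sampled signal $x_{old}(n)$ of the previous frame into a 2N-point time-domain sampled signal $\bar{x}(n)$, and to apply windowing and time-domain anti-aliasing to $\bar{x}(n)$ to obtain an N-point time-domain sampled signal $\tilde{x}(n)$; and to apply a symmetric transformation to the time-domain signal $\tilde{x}(n)$, append a sequence of zeros at each end of the signal, divide the lengthened signal into M mutually overlapping subframes, and apply windowing, time-domain anti-aliasing and the time-frequency transform to the time-domain signal of each subframe, giving M groups of frequency-domain coefficients that form the total frequency-domain coefficients of the current frame.
When rearranging the frequency-domain coefficients, the frequency-domain coefficient generation unit rearranges them separately within the core layer and within the extended layer, in order of coding subbands from low to high frequency.
The rearrangement of the amplitude envelope quantization indices by the amplitude envelope quantization and coding unit means: the amplitude envelope quantization indices of the coding subbands within the same subframe are arranged together in order of increasing or decreasing frequency, and at a subframe boundary the junction is made with two coding subbands that belong to the two subframes and represent the same frequency.
The bitstream multiplexer multiplexes and packs according to the following bitstream format:
first the side information bits of the core layer are written after the frame header of the bitstream, the amplitude envelope coded bits of the core layer coding subbands are written into the bitstream multiplexer MUX, and then the coded bits of the core layer frequency-domain coefficients are written into the MUX;
then the side information bits of the extended layer are written into the MUX, followed by the amplitude envelope coded bits of the extended layer frequency-domain coefficient coding subbands, and then the coded bits of the extended layer coding signal;
according to the required bit rate, the number of bits that meets the bit-rate requirement is transmitted to the decoder.
The side information of the core layer comprises the transient decision flag bit, the Huffman coding flag bit for the amplitude envelopes of the core layer coding subbands, the Huffman coding flag bit for the core layer frequency-domain coefficients, and the bits giving the number of core layer bit allocation correction iterations;
the side information of the extended layer comprises the Huffman coding flag bit for the amplitude envelopes of the extended layer coding subbands, the Huffman coding flag bit for the extended layer coding signal, and the bits giving the number of extended layer bit allocation correction iterations.
The extended layer coding signal generation unit further comprises a residual signal generation module and an extended layer coding signal synthesis module;
the residual signal generation module is used to dequantize the quantized values of the core layer frequency-domain coefficients and subtract them from the core layer frequency-domain coefficients to obtain the core layer residual signal;
the extended layer coding signal synthesis module is used to combine the core layer residual signal and the extended layer frequency-domain coefficients in frequency-band order to obtain the extended layer coding signal.
The residual signal amplitude envelope generation unit further comprises a quantization index correction value acquisition module and a residual signal amplitude envelope quantization index calculation module;
the quantization index correction value acquisition module is used to look up the statistical table of correction values for the core layer residual signal amplitude envelope quantization index according to the bit allocation numbers of the core layer coding subbands, giving the quantization index correction values of the residual signal coding subbands. The correction value of each coding subband is greater than or equal to 0 and does not decrease as the bit allocation number of the corresponding core layer coding subband increases. If the bit allocation number of a core layer coding subband is 0, the quantization index correction value of the core layer residual signal in that subband is 0; if the bit allocation number of a subband equals the defined maximum bit allocation number, the amplitude envelope value of the residual signal in that subband is zero;
the residual signal amplitude envelope quantization index calculation module is used to subtract the quantization index correction value of the corresponding coding subband from the amplitude envelope quantization index of the core layer coding subband, giving the amplitude envelope quantization index of the coding subband of the core layer residual signal.
The bitstream multiplexer writes the coded bits of the extended layer coding signal into the bitstream in order of decreasing initial importance of the coding subbands of the extended layer coding signal; among coding subbands of equal importance, the coded bits of the lower-frequency subband are written first.
For the specific functions of the units (modules) in Fig. 6, see the description of the flow shown in Fig. 2.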
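Decoding method and system
Based on the idea of the present invention, the hierarchical audio decoding method of the present invention, as shown in Fig. 7, comprises the following steps:
Step 701: Demultiplex the bitstream transmitted from the encoder and decode the amplitude envelope coded bits of the core layer and extended layer coding subbands, giving the amplitude envelope quantization indices of the core layer and extended layer coding subbands. If the transient decision information indicates a transient signal, additionally rearrange the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands, each in order of increasing frequency;
Step 702: Allocate bits to the core layer coding subbands according to their amplitude envelope quantization indices, derive from this the amplitude envelope quantization indices of the core layer residual signal, and allocate bits to the coding subbands of the extended layer coding signal according to the amplitude envelope quantization indices of the core layer residual signal and of the extended layer coding subbands;
The amplitude envelope quantization index of the residual signal is computed as follows: according to the core layer bit allocation numbers, look up the statistical table of correction values for the core layer residual signal amplitude envelope quantization index to obtain the correction values; subtract the correction value of the corresponding coding subband from the amplitude envelope quantization index of the core layer coding subband, giving the amplitude envelope quantization index of the core layer residual signal;
the correction value of each coding subband is greater than or equal to 0 and does not decrease as the bit allocation number of the corresponding core layer coding subband increases;
when the bit allocation number of a core layer coding subband is 0, the correction value is 0; when the bit allocation number of a core layer coding subband equals the defined maximum bit allocation number, the amplitude envelope value of the corresponding core layer residual signal is zero.
Step 703: According to the bit allocation numbers of the core layer and the extended layer, decode the coded bits of the core layer frequency-domain coefficients and of the extended layer coding signal, giving the core layer frequency-domain coefficients and the extended layer coding signal; rearrange the extended layer coding signal in subband order and add it to the core layer frequency-domain coefficients, giving the frequency-domain coefficients of the full bandwidth;
Step 704: If the transient decision information indicates a steady-state signal, apply the inverse time-frequency transform directly to the full-bandwidth frequency-domain coefficients to obtain the output audio signal. If the transient decision information indicates a transient signal, rearrange the full-bandwidth frequency-domain coefficients, divide them into M groups, apply the inverse time-frequency transform to each group, and compute the final audio signal from the M groups of time-domain signals so obtained.
The coded bits of the extended layer coding signal are decoded in the following order:
In the extended layer, the decoding order of the coded bits is determined by the initial importance values of the corresponding coding subbands of the extended layer coding signal. Coding subbands of higher importance are decoded first; if two coding subbands have the same importance, the lower-frequency subband is decoded first. The number of decoded bits is counted during decoding, and decoding stops when the number of decoded bits reaches the total number of bits required.
Fig. 8 is a flow chart of an embodiment of the hierarchical audio decoding method of the present invention. As shown in Fig. 8, the method comprises:
801: Extract the coded bits of one frame from the hierarchical bitstream transmitted from the encoder (that is, from the bitstream demultiplexer DeMUX);
After the coded bits have been extracted, first decode the side information, then apply Huffman decoding or direct decoding to the amplitude envelope coded bits of the core layer in this frame according to the value of Flag_huff_rms_core, giving the amplitude envelope quantization indices of the core layer coding subbands $Th_q(j)$, j = 0, ..., L_core − 1.
802: Compute the initial importance values of the core layer coding subbands from their amplitude envelope quantization indices and allocate bits to the core layer coding subbands using the subband importance, giving the core layer bit allocation numbers. The bit allocation method at the decoder is identical to that at the encoder. During bit allocation, the bit allocation step and the step by which the subband importance is decreased after allocation are variable.
After this bit allocation has been completed, allocate bits to the core layer coding subbands a further count_core times, according to the value count_core of the core layer bit allocation correction count from the encoder and the importance of the core layer coding subbands; the whole bit allocation process then ends.
During bit allocation, the step for allocating bits to a coding subband whose bit allocation number is 0 is 1 bit, and the importance is decreased by 1 after allocation. For a coding subband whose bit allocation number is greater than 0 and less than a certain threshold, the step for additional allocation is 0.5 bit and the importance is decreased by 0.5. For a coding subband whose bit allocation number is greater than or equal to the threshold, the step for additional allocation is 1 and the importance is decreased by 1;
803: Using the bit allocation numbers and the quantized amplitude envelope values of the core layer coding subbands, and according to Flag_huff_PLVQ_core, decode, dequantize and denormalize the coded bits of the core layer frequency-domain coefficients to obtain the core layer frequency-domain coefficients.
804: When decoding and dequantizing the coded bits of the core layer frequency-domain coefficients, divide the core layer coding subbands into low-bit and high-bit coding subbands according to their bit allocation numbers, and dequantize them with the inverse of pyramid lattice vector quantization and the inverse of spherical lattice vector quantization respectively;
According to the core layer side information, apply Huffman decoding or direct natural decoding to the low-bit coding subbands to obtain their pyramid lattice vector quantization indices, then dequantize and denormalize all these indices to obtain the frequency-domain coefficients of the subband. The inverse pyramid lattice vector quantization process is described below:
a: For all j = 0, ..., L_core − 1: if Flag_huff_PLVQ_core = 0, decode directly to obtain the index index_b(j,m) of the m-th vector quantization of low-bit coding subband j; if Flag_huff_PLVQ_core = 1, obtain index_b(j,m) using the Huffman code table that corresponds to the number of bits allocated to each frequency-domain coefficient of the coding subband;
When the number of bits allocated to each frequency-domain coefficient of the coding subband is 1: if the natural binary code value of the quantization index is less than "1111 111", compute the quantization index from the natural binary code value; if the natural binary code value equals "1111 111", read one more bit. If that bit is 0 the quantization index is 127; if it is 1 the quantization index is 128.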
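The escape rule for region_bit(j) = 1 can be sketched as follows. The function name and the bit-iterator interface are hypothetical, and reading the 7-bit natural binary code most significant bit first is an assumption, since the text does not state the bit order.

```python
def read_index_r1(bits):
    """Read one naturally coded quantization index for region_bit(j) = 1.

    `bits` is an iterator yielding 0 or 1; the 7-bit code is assumed MSB first.
    """
    value = 0
    for _ in range(7):
        value = (value << 1) | next(bits)
    if value < 127:
        return value
    # "1111 111" is an escape prefix: the next bit selects 127 (0) or 128 (1).
    return 127 + next(bits)
```

For example, `read_index_r1(iter([0, 0, 0, 0, 1, 0, 1]))` consumes seven bits, while an index of 127 or 128 consumes eight, matching the one-bit saving described for the encoder.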
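b: The inverse pyramid lattice vector quantization of this quantization index is in fact the inverse of the vector quantization process. It proceeds as follows:
1) Determine the energy pyramid surface on which the vector quantization index lies and its label on that surface:
Search the pyramid surface energies from 2 to LargeK(region_bit(j)) for a kk that satisfies the inequality
N(8,kk) <= index_b(j,m) < N(8,kk+2).
If such a kk is found, K = kk is the energy of the pyramid surface containing the D8 lattice point that corresponds to the quantization index index_b(j,m), and b = index_b(j,m) − N(8,kk) is the index label of that D8 lattice point on its pyramid surface;
if no such kk is found, the pyramid surface energy of the D8 lattice point corresponding to index_b(j,m) is K = 0 and the index label is b = 0;
2) The D8 lattice point vector Y = (y1, y2, y3, y4, y5, y6, y7, y8) with pyramid surface energy K and index label b is found by the following steps:
Step 1: Set Y = (0,0,0,0,0,0,0,0), xb = 0, i = 1, k = K, l = 8;
Step 2: If b = xb, then yi = 0; jump to step 6;
Step 3: If b < xb + N(l−1,k), then yi = 0; jump to step 5;
otherwise, xb = xb + N(l−1,k); set j = 1;
Step 4: If b < xb + 2*N(l−1,k−j), then
if xb <= b < xb + N(l−1,k−j), then yi = j;
if b >= xb + N(l−1,k−j), then yi = −j, xb = xb + N(l−1,k−j);
otherwise xb = xb + 2*N(l−1,k−j), j = j+1; repeat this step;
Step 5: Update k = k − |yi|, l = l − 1, i = i + 1; if k > 0, jump to step 2;
Step 6: If k > 0, then y8 = k − |yi|; Y = (y1, y2, ..., y8) is the required lattice point.
3) Apply inverse energy regularization to the D8 lattice point found, giving
$\bar{Y}_j^m = (Y + a) / scale(index)$
where $a = (2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6})$ and scale(index) is the scaling factor, which can be found in Table 5.
4) Denormalize $\bar{Y}_j^m$ to obtain the frequency-domain coefficients of the m-th vector of coding subband j recovered at the decoder:
$\bar{X}_j^m = 2^{Th_q(j)/2} \cdot \bar{Y}_j^m$
where $Th_q(j)$ is the amplitude envelope quantization index of the j-th coding subband.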
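Steps 1 to 6 of item 2) above can be sketched in code. Two assumptions are made and should be checked against the definition of N earlier in the document: `n_points(l, k)` is taken to be the number of l-dimensional integer vectors whose absolute values sum to k, computed with the standard recurrence, and the function names are hypothetical. The search for the pyramid surface in item 1) and the later scaling steps are not shown.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def n_points(l, k):
    """Number of l-dimensional integer vectors whose absolute values sum to k (assumed N(l,k))."""
    if k == 0:
        return 1
    if l == 0:
        return 0
    return n_points(l - 1, k) + n_points(l - 1, k - 1) + n_points(l, k - 1)


def decode_pyramid_point(b, K, dim=8):
    """Recover the vector with pyramid energy K and label b, following steps 1 to 6."""
    if not 0 <= b < n_points(dim, K):
        raise ValueError("label b out of range for this pyramid surface")
    y = [0] * dim
    xb, k, l = 0, K, dim                       # step 1
    for i in range(dim):
        if k <= 0:
            break
        if b == xb:                            # step 2, then step 6:
            y[dim - 1] = k                     # remaining energy goes to the last component
            break
        if b < xb + n_points(l - 1, k):        # step 3: this component is zero
            y[i] = 0
        else:
            xb += n_points(l - 1, k)
            j = 1
            while b >= xb + 2 * n_points(l - 1, k - j):   # step 4: find the magnitude j
                xb += 2 * n_points(l - 1, k - j)
                j += 1
            if b < xb + n_points(l - 1, k - j):
                y[i] = j
            else:
                y[i] = -j
                xb += n_points(l - 1, k - j)
        k -= abs(y[i])                         # step 5
        l -= 1
    return y
```

A quick sanity check of such a routine is that the labels 0 to N(8,K) − 1 produce N(8,K) distinct vectors, each with absolute values summing to K.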
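The coded bits of a high-bit coding subband are decoded directly with natural decoding to obtain the m-th index vector k of high-bit coding subband j. The inverse spherical lattice vector quantization of this index vector is in fact the inverse of the quantization process; the steps are as follows:
a: Compute x = k*G and ytemp = x/(2^(region_bit(j))), where k is the index vector of the vector quantization, region_bit(j) is the number of bits allocated to each frequency-domain coefficient in coding subband j, and G is the generator matrix of the D8 lattice:
G =
2 0 0 0 0 0 0 0
1 1 0 0 0 0 0 0
1 0 1 0 0 0 0 0
1 0 0 1 0 0 0 0
1 0 0 0 1 0 0 0
1 0 0 0 0 1 0 0
1 0 0 0 0 0 1 0
1 0 0 0 0 0 0 1
b: Compute y = x − f_D8(ytemp)*(2^(region_bit(j)));
c: Apply inverse energy regularization to the D8 lattice point found, giving
$\bar{Y}_j^m = y \cdot scale(region\_bit(j)) / 2^{region\_bit(j)} + a$,
where $a = (2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6})$ and scale(region_bit(j)) is the scaling factor, which can be found in Table 10.
d: Denormalize $\bar{Y}_j^m$ to obtain the frequency-domain coefficients of the m-th vector of coding subband j recovered at the decoder:
$\bar{X}_j^m = 2^{Th_q(j)/2} \cdot \bar{Y}_j^m$
where $Th_q(j)$ is the amplitude envelope quantization index of the j-th coding subband.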
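Steps a and b of the inverse spherical lattice vector quantization can be sketched as below. The assumptions are the same as for any sketch of this kind: `f_d8` is implemented as the usual nearest-point search for D8, the function names are hypothetical, and the product k*G is written out by hand from the generator matrix. Steps c and d are omitted because the Table 10 scaling factors are not reproduced in the text.

```python
import math


def f_d8(x):
    """Nearest D8 lattice point: integer coordinates whose sum is even."""
    r = [math.floor(v + 0.5) for v in x]
    if sum(r) % 2 != 0:
        # Re-round the coordinate with the largest rounding error in the other direction.
        i = max(range(len(x)), key=lambda n: abs(x[n] - r[n]))
        r[i] += 1 if x[i] > r[i] else -1
    return r


def decode_index_vector(k, region_bit):
    """Recover the D8 lattice point y from the index vector k (steps a and b)."""
    modulus = 2 ** region_bit
    # Step a: x = k*G, i.e. x[0] = 2*k[0] + sum(k[1:]) and x[i] = k[i] for i >= 1.
    x = [2 * k[0] + sum(k[1:])] + list(k[1:])
    ytemp = [v / modulus for v in x]
    # Step b: remove the multiple of 2^region_bit introduced by the modulo operation.
    z = f_d8(ytemp)
    return [xv - zv * modulus for xv, zv in zip(x, z)]
```

For instance, the index vector (29, 1, 2, 0, 0, 0, 0, 0) with region_bit(j) = 5 maps back to the lattice point (−3, 1, 2, 0, 0, 0, 0, 0).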
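805: Compute the subband amplitude envelope quantization indices of the core layer residual signal from the amplitude envelope quantization indices and the bit allocation numbers of the core layer coding subbands; the computation at the decoder is identical to that at the encoder;
According to the value of Flag_huff_rms_ext, apply Huffman decoding or direct decoding to the amplitude envelope coded bits of the extended layer coding subbands, giving their amplitude envelope quantization indices $Th_q(j)$, j = L_core, ..., L − 1.
806: The extended layer coding signal consists of the core layer residual signal and the extended layer frequency-domain coefficients. Compute the initial importance values of the coding subbands of the extended layer coding signal from their amplitude envelope quantization indices, and use these values to allocate bits to the coding subbands of the extended layer coding signal, giving their bit allocation numbers;
The computation of the initial subband importance and the bit allocation method at the decoder are the same as at the encoder.
807: Compute the extended layer coding signal:
Using the bit allocation numbers of the extended layer coding signal, decode and dequantize the coded bits of the coding signal, then denormalize the dequantized data using the quantized amplitude envelope values of the coding subbands of the extended layer coding signal, giving the extended layer coding signal.
The decoding and dequantization methods of the extended layer are the same as those of the core layer.
In this step the decoding order of the coding subbands of the extended layer coding signal is determined by their initial importance values. If two coding subbands have the same importance, the lower-frequency subband is decoded first. The number of decoded bits is counted at the same time, and decoding stops when the number of decoded bits reaches the total number of bits required.
For example, the bit rate sent from the encoder to the decoder is 64 kbps, but because of network conditions the decoder can only obtain the first 48 kbps of the bitstream, or the decoder only supports decoding at 48 kbps; the decoder therefore stops decoding when it reaches 48 kbps.
808: Rearrange the coding signal decoded in the extended layer by frequency, and add the core layer frequency-domain coefficients and the extended layer coding signal at the same frequency to obtain the output frequency-domain coefficients.
809: Apply noise filling to subbands that were allocated no coded bits during coding or that were lost during transmission.
810: When the transient decision flag Flag_transient is 1, rearrange the frequency-domain coefficients, that is, put all frequency-domain coefficients of the L subbands in Table 2 back in the positions given by their original frequency-domain coefficient indices; frequency-domain coefficients whose indices are not mentioned in Table 2 are set to 0.
811: Apply the inverse time-frequency transform to the frequency-domain coefficients to obtain the final audio output signal. The steps are as follows:
When the transient decision flag Flag_transient is 0, apply an inverse DCT-IV transform of length N to the N-point frequency-domain coefficients, giving $x_q(n)$, n = 0, ..., N − 1.
When the transient decision flag Flag_transient is 1, first divide the N-point frequency-domain coefficients into 4 groups of equal length, apply an inverse DCT-IV transform of length N/4 and inverse time-domain anti-aliasing to each group, then window the 4 resulting signals (with the same window structure as at the encoder), and overlap-add the 4 windowed signals, giving $x_q(n)$, n = 0, ..., N − 1.
Apply inverse time-domain anti-aliasing and windowing (with the same window structure as at the encoder) to $x_q(n)$, n = 0, ..., N − 1. Overlap-add two adjacent frames to obtain the final audio output signal.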
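The importance-ordered decoding of step 807, including early termination when only part of the bitstream is available, can be sketched as follows. The function name and the per-subband bit counts are hypothetical, subband indices are assumed to increase with frequency, and stopping at whole-subband granularity is an assumption made for this illustration.

```python
def subbands_to_decode(initial_importance, bits_per_subband, available_bits):
    """Return the subbands decoded before the bit budget runs out.

    Subbands are visited in order of decreasing initial importance; ties are
    broken in favour of the lower-frequency (lower-index) subband.
    """
    order = sorted(range(len(initial_importance)),
                   key=lambda j: (-initial_importance[j], j))
    decoded, used = [], 0
    for j in order:
        if used + bits_per_subband[j] > available_bits:
            break                      # budget reached: stop decoding
        decoded.append(j)
        used += bits_per_subband[j]
    return decoded
```

Because the encoder writes the extended layer bits in the same order, truncating the bitstream (for example from 64 kbps to 48 kbps) removes the least important subbands first.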
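Fig. 9 is a structural diagram of the hierarchical audio decoding system of the present invention. As shown in Fig. 9, the system comprises: a bitstream demultiplexer (DeMUX), an amplitude envelope decoding unit for the core layer coding subbands, a core layer bit allocation unit, a core layer decoding and dequantization unit, a residual signal amplitude envelope generation unit, an extended layer bit allocation unit, an extended layer coding signal decoding and dequantization unit, a full-bandwidth frequency-domain coefficient recovery unit, a noise filling unit, and an audio signal recovery unit; wherein:
the amplitude envelope decoding unit is connected to the bitstream demultiplexer and is used to decode the amplitude envelope coded bits of the core layer and extended layer coding subbands output by the bitstream demultiplexer, giving the amplitude envelope quantization indices of the core layer and extended layer coding subbands; if the transient decision information indicates a transient signal, it additionally rearranges the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands, each in order of increasing frequency;
the core layer bit allocation unit is connected to the amplitude envelope decoding unit and is used to allocate bits to the core layer coding subbands according to their amplitude envelope quantization indices, giving the bit allocation numbers of the core layer coding subbands;
the core layer decoding and dequantization unit is connected to the bitstream demultiplexer, the amplitude envelope decoding unit and the core layer bit allocation unit. It is used to compute the quantized amplitude envelope values of the core layer coding subbands from their amplitude envelope quantization indices, and to decode, dequantize and denormalize the coded bits of the core layer frequency-domain coefficients output by the bitstream demultiplexer using the bit allocation numbers and the quantized amplitude envelope values of the core layer coding subbands, giving the core layer frequency-domain coefficients;
the residual signal amplitude envelope generation unit is connected to the amplitude envelope decoding unit and the core layer bit allocation unit, and is used to look up the statistical table of correction values for the core layer residual signal amplitude envelope quantization index, according to the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding coding subbands, giving the amplitude envelope quantization indices of the core layer residual signal;
the extended layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and the amplitude envelope decoding unit, and is used to allocate bits to the coding subbands of the extended layer coding signal according to the amplitude envelope quantization indices of the core layer residual signal and of the extended layer coding subbands, giving the bit allocation numbers of the coding subbands of the extended layer coding signal;
the extended layer coding signal decoding and dequantization unit is connected to the bitstream demultiplexer, the amplitude envelope decoding unit, the extended layer bit allocation unit and the residual signal amplitude envelope generation unit. It is used to compute the quantized amplitude envelope values of the coding subbands of the extended layer coding signal from their amplitude envelope quantization indices, and to decode, dequantize and denormalize the coded bits of the extended layer coding signal output by the bitstream demultiplexer using the bit allocation numbers and the quantized amplitude envelope values of those subbands, giving the extended layer coding signal;
the full-bandwidth frequency-domain coefficient recovery unit is connected to the core layer decoding and dequantization unit and the extended layer coding signal decoding and dequantization unit, and is used to reorder the extended layer coding signal output by the extended layer coding signal decoding and dequantization unit in coding subband order and then add it to the core layer frequency-domain coefficients output by the core layer decoding and dequantization unit, giving the full-bandwidth frequency-domain coefficients;
the noise filling unit is connected to the full-bandwidth frequency-domain coefficient recovery unit and the amplitude envelope decoding unit, and is used to apply noise filling to subbands that were allocated no coded bits during coding;
the audio signal recovery unit is connected to the noise filling unit. If the transient decision information indicates a steady-state signal, it is used to apply the inverse time-frequency transform directly to the full-bandwidth frequency-domain coefficients to obtain the output audio signal. If the transient decision information indicates a transient signal, it is used to rearrange the full-bandwidth frequency-domain coefficients, divide them into M groups, apply the inverse time-frequency transform to each group, and compute the final audio signal from the M groups of time-domain signals so obtained.
The residual signal amplitude envelope generation unit further comprises a quantization index correction value acquisition module and a residual signal amplitude envelope quantization index calculation module;
the quantization index correction value acquisition module is used to look up the statistical table of correction values for the core layer residual signal amplitude envelope quantization index according to the bit allocation numbers of the core layer coding subbands, giving the quantization index correction values of the residual signal coding subbands. The correction value of each coding subband is greater than or equal to 0 and does not decrease as the bit allocation number of the corresponding core layer coding subband increases. If the bit allocation number of a core layer coding subband is 0, the quantization index correction value of the core layer residual signal in that subband is 0; if the bit allocation number of a core layer coding subband equals the defined maximum bit allocation number, the amplitude envelope value of the residual signal in that subband is zero;
the residual signal amplitude envelope quantization index calculation module is used to subtract the quantization index correction value of the corresponding coding subband from the amplitude envelope quantization index of the core layer coding subband, giving the amplitude envelope quantization index of the coding subband of the core layer residual signal.
The order in which the extended layer coding signal decoding and dequantization unit decodes the coding subbands of the extended layer coding signal is determined by their initial importance values. Coding subbands of higher importance are decoded first; if two coding subbands have the same importance, the lower-frequency subband is decoded first. The number of decoded bits is counted during decoding, and decoding stops when the number of decoded bits reaches the total number of bits required.
The rearrangement of the full-bandwidth frequency-domain coefficients by the audio signal recovery unit means arranging the frequency-domain coefficients that belong to the same subframe in order of coding subbands from low to high frequency, which gives M groups of frequency-domain coefficients, and then arranging the M groups in subframe order.
If the transient decision information indicates a transient signal, the audio signal recovery unit computes the final audio signal from the M groups of time-domain signals as follows: apply inverse time-domain anti-aliasing to each group, window the M resulting signals, and overlap-add the M windowed signals to obtain the N-point time-domain sampled signal $x_q(n)$; apply inverse time-domain anti-aliasing and windowing to the time-domain signal $x_q(n)$, and overlap-add two adjacent frames to obtain the final audio output signal.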
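The present invention further provides the following hierarchical coding and decoding methods for transient signals:
The hierarchical audio coding method for transient signals of the present invention comprises:
A1. Divide the audio signal into M subframes, apply the time-frequency transform to each subframe, form the total frequency-domain coefficients of the current frame from the M groups of frequency-domain coefficients so obtained, and rearrange the total frequency-domain coefficients in order of coding subbands from low to high frequency, where the total frequency-domain coefficients comprise core layer frequency-domain coefficients and extended layer frequency-domain coefficients, the coding subbands comprise core layer coding subbands and extended layer coding subbands, the core layer frequency-domain coefficients form several core layer coding subbands, and the extended layer frequency-domain coefficients form several extended layer coding subbands;
B1. Quantize and code the amplitude envelope values of the core layer and extended layer coding subbands, giving the amplitude envelope quantization indices of the core layer and extended layer coding subbands and their coded bits, where the amplitude envelope values of the core layer coding subbands and of the extended layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands are each rearranged;
C1. Allocate bits to the core layer coding subbands according to their amplitude envelope quantization indices, then quantize and code the core layer frequency-domain coefficients to obtain their coded bits;
D1. Dequantize the vector-quantized frequency-domain coefficients of the core layer and subtract them from the original frequency-domain coefficients obtained by the time-frequency transform, giving the core layer residual signal;
E1. Compute the amplitude envelope quantization indices of the coding subbands of the core layer residual signal from the amplitude envelope quantization indices and the bit allocation numbers of the core layer coding subbands;
F1. Allocate bits to the coding subbands of the extended layer coding signal according to the amplitude envelope quantization indices of the core layer residual signal and of the extended layer coding subbands, then quantize and code the extended layer coding signal to obtain its coded bits, where the extended layer coding signal consists of the core layer residual signal and the extended layer frequency-domain coefficients;
G1. Multiplex and pack the amplitude envelope coded bits of the core layer and extended layer coding subbands, the coded bits of the core layer frequency-domain coefficients and the coded bits of the extended layer coding signal, and transmit them to the decoder.
In step A1, the total frequency-domain coefficients of the current frame are obtained as follows:
combine the N-point time-domain sampled signal x(n) of the current frame with the N-point time-domain sampled signal $x_{old}(n)$ of the previous frame into a 2N-point time-domain sampled signal $\bar{x}(n)$, then apply windowing and time-domain anti-aliasing to $\bar{x}(n)$ to obtain an N-point time-domain sampled signal $\tilde{x}(n)$;
apply a symmetric transformation to the time-domain signal $\tilde{x}(n)$, append a sequence of zeros at each end of the signal, divide the lengthened signal into M mutually overlapping subframes, and apply windowing, time-domain anti-aliasing and the time-frequency transform to the time-domain signal of each subframe, giving M groups of frequency-domain coefficients that form the total frequency-domain coefficients of the current frame.
In step A1, when the frequency-domain coefficients are rearranged, they are rearranged separately within the core layer and within the extended layer, in order of coding subbands from low to high frequency.
In step B1, rearranging the amplitude envelope quantization indices comprises:
arranging the amplitude envelope quantization indices of the coding subbands within the same subframe together in order of increasing or decreasing frequency, the junction at a subframe boundary being made with two coding subbands that belong to the two subframes and represent the same frequency.
In step G1, multiplexing and packing follow this bitstream format:
first the side information bits of the core layer are written after the frame header of the bitstream, the amplitude envelope coded bits of the core layer coding subbands are written into the bitstream multiplexer MUX, and then the coded bits of the core layer frequency-domain coefficients are written into the MUX;
then the side information bits of the extended layer are written into the MUX, followed by the amplitude envelope coded bits of the extended layer frequency-domain coefficient coding subbands, and then the coded bits of the extended layer coding signal;
according to the required bit rate, the number of bits that meets the bit-rate requirement is transmitted to the decoder.
The side information of the core layer comprises the transient decision flag bit, the Huffman coding flag bit for the amplitude envelopes of the core layer coding subbands, the Huffman coding flag bit for the core layer frequency-domain coefficients, and the bits giving the number of core layer bit allocation correction iterations; the side information of the extended layer comprises the Huffman coding flag bit for the amplitude envelopes of the extended layer coding subbands, the Huffman coding flag bit for the extended layer coding signal, and the bits giving the number of extended layer bit allocation correction iterations.
The hierarchical decoding method for transient signals of the present invention comprises:
Step A2. Demultiplex the bitstream transmitted from the encoder and decode the amplitude envelope coded bits of the core layer and extended layer coding subbands, giving their amplitude envelope quantization indices; rearrange the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands, each in order of increasing frequency;
Step B2. Allocate bits to the core layer coding subbands according to their rearranged amplitude envelope quantization indices, and compute from this the amplitude envelope quantization indices of the core layer residual signal;
Step C2. Allocate bits to the coding subbands of the extended layer coding signal according to the amplitude envelope quantization indices of the core layer residual signal and the rearranged amplitude envelope quantization indices of the extended layer coding subbands;
Step D2. According to the bit allocation numbers of the core layer and the extended layer, decode the coded bits of the core layer frequency-domain coefficients and of the extended layer coding signal, giving the core layer frequency-domain coefficients and the extended layer coding signal; rearrange the extended layer coding signal in subband order and add it to the core layer frequency-domain coefficients, giving the full-bandwidth frequency-domain coefficients;
Step E2. Rearrange the full-bandwidth frequency-domain coefficients, divide them into M groups, apply the inverse time-frequency transform to each group, and compute the final audio signal from the M groups of time-domain signals so obtained.
In step E2, rearranging the full-bandwidth frequency-domain coefficients means arranging the frequency-domain coefficients that belong to the same subframe in order of coding subbands from low to high frequency, which gives M groups of frequency-domain coefficients, and then arranging the M groups in subframe order.
In step E2, computing the final audio signal from the M groups of time-domain signals comprises: applying inverse time-domain anti-aliasing to each group, windowing the M resulting signals, and overlap-adding the M windowed signals to obtain the N-point time-domain sampled signal $x_q(n)$; applying inverse time-domain anti-aliasing and windowing to the time-domain signal $x_q(n)$, and overlap-adding two adjacent frames to obtain the final audio output signal.
Industrial applicability
The present invention introduces a processing method for transient signal frames into the hierarchical audio coding and decoding method: a transient signal frame is transformed segment by segment, and the resulting frequency-domain coefficients are rearranged separately within the core layer and within the extended layer, so that the subsequent coding steps such as bit allocation and frequency-domain coefficient coding are the same as for steady-state signal frames. This improves the coding efficiency for transient signal frames and the quality of hierarchical audio coding and decoding.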

Claims

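1. A hierarchical audio coding method, comprising:
performing a transient decision on the audio signal of the current frame;
when the transient decision indicates a steady-state signal, applying a time-frequency transform directly to the windowed audio signal to obtain the total frequency-domain coefficients; when the transient decision indicates a transient signal, dividing the audio signal into M subframes, applying the time-frequency transform to each subframe, the M groups of frequency-domain coefficients so obtained forming the total frequency-domain coefficients of the current frame, and rearranging the total frequency-domain coefficients in order of coding subbands from low to high frequency, wherein the total frequency-domain coefficients comprise core layer frequency-domain coefficients and extended layer frequency-domain coefficients, the coding subbands comprise core layer coding subbands and extended layer coding subbands, the core layer frequency-domain coefficients form several core layer coding subbands, and the extended layer frequency-domain coefficients form several extended layer coding subbands;
quantizing and coding the amplitude envelope values of the core layer and extended layer coding subbands to obtain the amplitude envelope quantization indices of the core layer and extended layer coding subbands and their coded bits; wherein, for a steady-state signal, the amplitude envelope values of the core layer and extended layer coding subbands are quantized jointly; for a transient signal, the amplitude envelope values of the core layer coding subbands and of the extended layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands are each rearranged;
allocating bits to the core layer coding subbands according to their amplitude envelope quantization indices, then quantizing and coding the core layer frequency-domain coefficients to obtain the coded bits of the core layer frequency-domain coefficients;
dequantizing the vector-quantized frequency-domain coefficients of the core layer and subtracting them from the original frequency-domain coefficients obtained by the time-frequency transform to obtain the core layer residual signal;
computing the amplitude envelope quantization indices of the core layer residual signal from the amplitude envelope quantization indices and the bit allocation numbers of the core layer coding subbands;
allocating bits to the coding subbands of the extended layer coding signal according to the amplitude envelope quantization indices of the core layer residual signal and of the extended layer coding subbands, then quantizing and coding the extended layer coding signal to obtain its coded bits, wherein the extended layer coding signal consists of the core layer residual signal and the extended layer frequency-domain coefficients;
multiplexing and packing the amplitude envelope coded bits of the core layer and extended layer coding subbands, the coded bits of the core layer frequency-domain coefficients and the coded bits of the extended layer coding signal, and transmitting them to the decoder.
2. The method of claim 1, wherein, when the transient decision indicates a transient signal, the step of forming the total frequency-domain coefficients of the current frame comprises:
combining the N-point time-domain sampled signal x(n) of the current frame with the N-point time-domain sampled signal $x_{old}(n)$ of the previous frame into a 2N-point time-domain sampled signal $\bar{x}(n)$, then applying windowing and time-domain anti-aliasing to $\bar{x}(n)$ to obtain an N-point time-domain sampled signal $\tilde{x}(n)$;
applying a symmetric transformation to the time-domain signal $\tilde{x}(n)$, appending a sequence of zeros at each end of the signal, dividing the lengthened signal into M mutually overlapping subframes, and applying windowing, time-domain anti-aliasing and the time-frequency transform to the time-domain signal of each subframe to obtain M groups of frequency-domain coefficients, which form the total frequency-domain coefficients of the current frame.
3. The method of claim 1, wherein, when the transient decision indicates a transient signal and the frequency-domain coefficients are rearranged, the frequency-domain coefficients are rearranged separately within the core layer and within the extended layer, in order of coding subbands from low to high frequency.
4. The method of claim 1, wherein rearranging the amplitude envelope quantization indices comprises:
arranging the amplitude envelope quantization indices of the coding subbands within the same subframe together in order of increasing or decreasing frequency, the junction at a subframe boundary being made with two coding subbands that belong to the two subframes and represent the same frequency.
5. The method of claim 1, further comprising, when the transient decision indicates a steady-state signal:
applying Huffman coding to the quantized amplitude envelope quantization indices of the core layer coding subbands; if the total number of bits consumed by the Huffman-coded amplitude envelope quantization indices of all core layer coding subbands is less than the total number of bits consumed by their natural coding, using Huffman coding, and otherwise using natural coding; and setting the amplitude envelope Huffman coding flag information of the core layer coding subbands;
applying Huffman coding to the quantized amplitude envelope quantization indices of the extended layer coding subbands; if the total number of bits consumed by the Huffman-coded amplitude envelope quantization indices of all extended layer coding subbands is less than the total number of bits consumed by their natural coding, using Huffman coding, and otherwise using natural coding; and setting the amplitude envelope Huffman coding flag information of the extended layer coding subbands.
6. The method of claim 1, further comprising computing the amplitude envelope quantization indices of the coding subbands of the core layer residual signal as follows:
deriving the correction values of the core layer residual signal amplitude envelope quantization index from the bit allocation numbers of the core layer coding subbands; subtracting the correction value of the corresponding coding subband from the amplitude envelope quantization index of the core layer coding subband to obtain the amplitude envelope quantization index of the core layer residual signal;
the correction value of each coding subband being greater than or equal to 0 and not decreasing as the bit allocation number of the corresponding core layer coding subband increases;
when the bit allocation number of a core layer coding subband is 0, the correction value being 0, and when the bit allocation number of a core layer coding subband equals the defined maximum bit allocation number, the amplitude envelope value of the corresponding core layer residual signal being zero.
7. The method of claim 1, wherein quantizing and coding the core layer frequency-domain coefficients comprises:
applying Huffman coding to all quantization indices of the core layer obtained with pyramid lattice vector quantization;
if the total number of bits consumed by all quantization indices obtained with pyramid lattice vector quantization after Huffman coding is less than the total number of bits consumed by their natural coding, using Huffman coding, correcting the bit allocation numbers of the coding subbands using the bits saved by Huffman coding, the bits left over from the initial bit allocation and the total number of bits saved in coding all coding subbands in which 1 or 2 bits are allocated to each frequency-domain coefficient, and applying vector quantization and Huffman coding again to the coding subbands whose bit allocation numbers were corrected; otherwise using natural coding, correcting the bit allocation numbers of the coding subbands using the bits left over from the initial bit allocation and the total number of bits saved in coding all coding subbands in which 1 or 2 bits are allocated to each frequency-domain coefficient, and applying vector quantization and natural coding again to the coding subbands whose bit allocation numbers were corrected;
and wherein quantizing and coding the extended layer coding signal comprises:
applying Huffman coding to all quantization indices of the extended layer obtained with pyramid lattice vector quantization;
if the total number of bits consumed by all quantization indices obtained with pyramid lattice vector quantization after Huffman coding is less than the total number of bits consumed by their natural coding, using Huffman coding, correcting the bit allocation numbers of the coding subbands using the bits saved by Huffman coding, the bits left over from the initial bit allocation and the total number of bits saved in coding all coding subbands in which 1 or 2 bits are allocated to each frequency-domain coefficient, and applying vector quantization and Huffman coding again to the coding subbands whose bit allocation numbers were corrected; otherwise using natural coding, correcting the bit allocation numbers of the coding subbands using the bits left over from the initial bit allocation and the total number of bits saved in coding all coding subbands in which 1 or 2 bits are allocated to each frequency-domain coefficient, and applying vector quantization and natural coding again to the coding subbands whose bit allocation numbers were corrected.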
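8. A hierarchical audio decoding method, comprising:
demultiplexing the bitstream transmitted from the encoder and decoding the amplitude envelope coded bits of the core layer and extended layer coding subbands to obtain the amplitude envelope quantization indices of the core layer and extended layer coding subbands; if the transient decision information indicates a transient signal, additionally rearranging the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands, each in order of increasing frequency;
allocating bits to the core layer coding subbands according to their amplitude envelope quantization indices, computing from this the amplitude envelope quantization indices of the core layer residual signal, and allocating bits to the coding subbands of the extended layer coding signal according to the amplitude envelope quantization indices of the core layer residual signal and of the extended layer coding subbands;
according to the bit allocation numbers of the core layer coding subbands and of the coding subbands of the extended layer coding signal, decoding the coded bits of the core layer frequency-domain coefficients and of the extended layer coding signal to obtain the core layer frequency-domain coefficients and the extended layer coding signal, rearranging the extended layer coding signal in subband order, and adding it to the core layer frequency-domain coefficients to obtain the full-bandwidth frequency-domain coefficients;
if the transient decision information indicates a steady-state signal, applying the inverse time-frequency transform directly to the full-bandwidth frequency-domain coefficients to obtain the output audio signal; if the transient decision information indicates a transient signal, rearranging the full-bandwidth frequency-domain coefficients, dividing them into M groups, applying the inverse time-frequency transform to each group, and computing the final audio signal from the M groups of time-domain signals so obtained.
9. The method of claim 8, wherein the step of computing the amplitude envelope quantization indices of the core layer residual signal comprises: deriving the correction values of the core layer residual signal amplitude envelope quantization index from the bit allocation numbers of the core layer coding subbands; subtracting the correction value of the corresponding coding subband from the amplitude envelope quantization index of the core layer coding subband to obtain the amplitude envelope quantization index of the core layer residual signal;
the correction value of each coding subband being greater than or equal to 0 and not decreasing as the bit allocation number of the corresponding core layer coding subband increases;
when the bit allocation number of a core layer coding subband is 0, the correction value being 0, and when the bit allocation number of a core layer coding subband equals the defined maximum bit allocation number, the amplitude envelope value of the corresponding core layer residual signal being zero.
10. The method of claim 8, wherein, if the transient decision information indicates a transient signal, rearranging the full-bandwidth frequency-domain coefficients comprises: arranging the frequency-domain coefficients that belong to the same subframe in order of coding subbands from low to high frequency to obtain M groups of frequency-domain coefficients, and then arranging the M groups in subframe order.
11. The method of claim 8, wherein, if the transient decision information indicates a transient signal, computing the final audio signal from the M groups of time-domain signals comprises: applying inverse time-domain anti-aliasing to each group of time-domain signals, windowing the M resulting signals, and overlap-adding the M windowed signals to obtain the N-point time-domain sampled signal $x_q(n)$;
applying inverse time-domain anti-aliasing and windowing to the time-domain signal $x_q(n)$, and overlap-adding two adjacent frames to obtain the final audio output signal.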
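12. A hierarchical audio coding method for transient signals, comprising:
dividing the audio signal into M subframes, applying a time-frequency transform to each subframe, the M groups of frequency-domain coefficients so obtained forming the total frequency-domain coefficients of the current frame, and rearranging the total frequency-domain coefficients in order of coding subbands from low to high frequency, wherein the total frequency-domain coefficients comprise core layer frequency-domain coefficients and extended layer frequency-domain coefficients, the coding subbands comprise core layer coding subbands and extended layer coding subbands, the core layer frequency-domain coefficients form several core layer coding subbands, and the extended layer frequency-domain coefficients form several extended layer coding subbands;
quantizing and coding the amplitude envelope values of the core layer and extended layer coding subbands to obtain the amplitude envelope quantization indices of the core layer and extended layer coding subbands and their coded bits, wherein the amplitude envelope values of the core layer coding subbands and of the extended layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands are each rearranged;
allocating bits to the core layer coding subbands according to their amplitude envelope quantization indices, then quantizing and coding the core layer frequency-domain coefficients to obtain the coded bits of the core layer frequency-domain coefficients;
dequantizing the vector-quantized frequency-domain coefficients of the core layer and subtracting them from the original frequency-domain coefficients obtained by the time-frequency transform to obtain the core layer residual signal;
computing the amplitude envelope quantization indices of the coding subbands of the core layer residual signal from the amplitude envelope quantization indices and the bit allocation numbers of the core layer coding subbands;
allocating bits to the coding subbands of the extended layer coding signal according to the amplitude envelope quantization indices of the core layer residual signal and of the extended layer coding subbands, then quantizing and coding the extended layer coding signal to obtain its coded bits, wherein the extended layer coding signal consists of the core layer residual signal and the extended layer frequency-domain coefficients;
multiplexing and packing the amplitude envelope coded bits of the core layer and extended layer coding subbands, the coded bits of the core layer frequency-domain coefficients and the coded bits of the extended layer coding signal, and transmitting them to the decoder.
13. The method of claim 12, wherein the step of forming the total frequency-domain coefficients of the current frame comprises:
combining the N-point time-domain sampled signal x(n) of the current frame with the N-point time-domain sampled signal $x_{old}(n)$ of the previous frame into a 2N-point time-domain sampled signal $\bar{x}(n)$, then applying windowing and time-domain anti-aliasing to $\bar{x}(n)$ to obtain an N-point time-domain sampled signal $\tilde{x}(n)$;
applying a symmetric transformation to the time-domain signal $\tilde{x}(n)$, appending a sequence of zeros at each end of the signal, dividing the lengthened signal into M mutually overlapping subframes, and applying windowing, time-domain anti-aliasing and the time-frequency transform to the time-domain signal of each subframe to obtain M groups of frequency-domain coefficients, which form the total frequency-domain coefficients of the current frame.
14. The method of claim 12, wherein the frequency-domain coefficients are rearranged separately within the core layer and within the extended layer, in order of coding subbands from low to high frequency.
15. The method of claim 12, wherein rearranging the amplitude envelope quantization indices comprises:
arranging the amplitude envelope quantization indices of the coding subbands within the same subframe together in order of increasing or decreasing frequency, the junction at a subframe boundary being made with two coding subbands that belong to the two subframes and represent the same frequency.
16. A hierarchical decoding method for transient signals, comprising:
demultiplexing the bitstream transmitted from the encoder and decoding the amplitude envelope coded bits of the core layer and extended layer coding subbands to obtain their amplitude envelope quantization indices, and rearranging the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands, each in order of increasing frequency;
allocating bits to the core layer coding subbands according to their rearranged amplitude envelope quantization indices, and computing from this the amplitude envelope quantization indices of the core layer residual signal;
allocating bits to the extended layer coding subbands according to the amplitude envelope quantization indices of the core layer residual signal and the rearranged amplitude envelope quantization indices of the extended layer coding subbands;
according to the bit allocation numbers of the core layer coding subbands and of the coding subbands of the extended layer coding signal, decoding the coded bits of the core layer frequency-domain coefficients and of the extended layer coding signal to obtain the core layer frequency-domain coefficients and the extended layer coding signal, rearranging the extended layer coding signal in subband order, and adding it to the core layer frequency-domain coefficients to obtain the full-bandwidth frequency-domain coefficients;
rearranging the full-bandwidth frequency-domain coefficients, dividing them into M groups, applying the inverse time-frequency transform to each group, and computing the final audio signal from the M groups of time-domain signals so obtained.
17. The method of claim 16, wherein the step of rearranging the full-bandwidth frequency-domain coefficients comprises: arranging the frequency-domain coefficients that belong to the same subframe in order of coding subbands from low to high frequency to obtain M groups of frequency-domain coefficients, and then arranging the M groups in subframe order.
18. The method of claim 16, wherein computing the final audio signal from the M groups of time-domain signals comprises: applying inverse time-domain anti-aliasing to each group, windowing the M resulting signals, and overlap-adding the M windowed signals to obtain the N-point time-domain sampled signal $x_q(n)$;
applying inverse time-domain anti-aliasing and windowing to the time-domain signal $x_q(n)$, and overlap-adding two adjacent frames to obtain the final audio output signal.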
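19. A hierarchical audio coding system, comprising:
a frequency-domain coefficient generation unit, an amplitude envelope calculation unit, an amplitude envelope quantization and coding unit, a core layer bit allocation unit, a core layer frequency-domain coefficient vector quantization and coding unit and a bitstream multiplexer; the system further comprising: a transient decision unit, an extended layer coding signal generation unit, a residual signal amplitude envelope generation unit, an extended layer bit allocation unit and an extended layer coding signal vector quantization and coding unit; wherein:
the transient decision unit is configured to perform a transient decision on the audio signal of the current frame;
the frequency-domain coefficient generation unit is connected to the transient decision unit and is configured to: when the transient decision indicates a steady-state signal, apply a time-frequency transform directly to the windowed audio signal to obtain the total frequency-domain coefficients; when the transient decision indicates a transient signal, divide the audio signal into M subframes, apply the time-frequency transform to each subframe, form the total frequency-domain coefficients of the current frame from the M groups of frequency-domain coefficients so obtained, and rearrange the total frequency-domain coefficients in order of coding subbands from low to high frequency, wherein the total frequency-domain coefficients comprise core layer frequency-domain coefficients and extended layer frequency-domain coefficients, the coding subbands comprise core layer coding subbands and extended layer coding subbands, the core layer frequency-domain coefficients form several core layer coding subbands, and the extended layer frequency-domain coefficients form several extended layer coding subbands;
the amplitude envelope calculation unit is connected to the frequency-domain coefficient generation unit and is configured to compute the amplitude envelope values of the core layer and extended layer coding subbands;
the amplitude envelope quantization and coding unit is connected to the amplitude envelope calculation unit and the transient decision unit and is configured to quantize and code the amplitude envelope values of the core layer and extended layer coding subbands to obtain the amplitude envelope quantization indices of the core layer and extended layer coding subbands and their coded bits; wherein, for a steady-state signal, the amplitude envelope values of the core layer and extended layer coding subbands are quantized jointly; for a transient signal, they are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands are each rearranged;
the core layer bit allocation unit is connected to the amplitude envelope quantization and coding unit and is configured to allocate bits to the core layer coding subbands according to their amplitude envelope quantization indices to obtain the bit allocation numbers of the core layer coding subbands;
the core layer frequency-domain coefficient vector quantization and coding unit is connected to the frequency-domain coefficient generation unit, the amplitude envelope quantization and coding unit and the core layer bit allocation unit, and is configured to normalize, vector-quantize and code the frequency-domain coefficients of the core layer coding subbands using the quantized amplitude envelope values of the core layer coding subbands reconstructed from their amplitude envelope quantization indices and the bit allocation numbers of the core layer coding subbands, to obtain the coded bits of the core layer frequency-domain coefficients;
the extended layer coding signal generation unit is connected to the frequency-domain coefficient generation unit and the core layer frequency-domain coefficient vector quantization and coding unit, and is configured to generate the core layer residual signal and obtain the extended layer coding signal formed by the core layer residual signal and the extended layer frequency-domain coefficients;
the residual signal amplitude envelope generation unit is connected to the amplitude envelope quantization and coding unit and the core layer bit allocation unit, and is configured to obtain the amplitude envelope quantization indices of the core layer residual signal from the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding core layer coding subbands;
the extended layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and the amplitude envelope quantization and coding unit, and is configured to allocate bits to the coding subbands of the extended layer coding signal according to the amplitude envelope quantization indices of the core layer residual signal and of the extended layer coding subbands, to obtain the bit allocation numbers of the coding subbands of the extended layer coding signal;
the extended layer coding signal vector quantization and coding unit is connected to the amplitude envelope quantization and coding unit, the extended layer bit allocation unit, the residual signal amplitude envelope generation unit and the extended layer coding signal generation unit, and is configured to normalize, vector-quantize and code the extended layer coding signal using the quantized amplitude envelope values of the coding subbands of the extended layer coding signal reconstructed from their amplitude envelope quantization indices and the bit allocation numbers of those subbands, to obtain the coded bits of the extended layer coding signal;
the bitstream multiplexer is connected to the amplitude envelope quantization and coding unit, the core layer frequency-domain coefficient vector quantization and coding unit and the extended layer coding signal vector quantization and coding unit, and is configured to pack the core layer side information bits, the amplitude envelope coded bits of the core layer coding subbands, the coded bits of the core layer frequency-domain coefficients, the extended layer side information bits, the amplitude envelope coded bits of the extended layer coding subbands and the coded bits of the extended layer coding signal.
20. The system of claim 19, wherein
the extended layer coding signal generation unit further comprises a residual signal generation module and an extended layer coding signal synthesis module;
the residual signal generation module is configured to dequantize the quantized values of the core layer frequency-domain coefficients and subtract them from the core layer frequency-domain coefficients to obtain the core layer residual signal;
the extended layer coding signal synthesis module is configured to combine the core layer residual signal and the extended layer frequency-domain coefficients in frequency-band order to obtain the extended layer coding signal.
21. The system of claim 19, wherein
the residual signal amplitude envelope generation unit further comprises a quantization index correction value acquisition module and a residual signal amplitude envelope quantization index calculation module;
the quantization index correction value acquisition module is configured to derive the quantization index correction values of the residual signal coding subbands from the bit allocation numbers of the core layer coding subbands, the correction value of each coding subband being greater than or equal to 0 and not decreasing as the bit allocation number of the corresponding core layer coding subband increases; if the bit allocation number of a core layer coding subband is 0, the quantization index correction value of the core layer residual signal in that subband is 0; if the bit allocation number of a core layer coding subband equals the defined maximum bit allocation number, the amplitude envelope value of the core layer residual signal in that subband is zero;
the residual signal amplitude envelope quantization index calculation module is configured to subtract the quantization index correction value of the corresponding coding subband from the amplitude envelope quantization index of the core layer coding subband to obtain the amplitude envelope quantization index of the coding subband of the core layer residual signal.
22. The system of claim 19, wherein the bitstream multiplexer is further configured to write the coded bits of the extended layer coding signal into the bitstream in order of decreasing initial importance of the coding subbands of the extended layer coding signal, the coded bits of the lower-frequency subband being written first among coding subbands of equal importance.
23. The system of claim 19, wherein the frequency-domain coefficient generation unit is configured, when obtaining the total frequency-domain coefficients of the current frame, to combine the N-point time-domain sampled signal x(n) of the current frame with the N-point time-domain sampled signal $x_{old}(n)$ of the previous frame into a 2N-point time-domain sampled signal $\bar{x}(n)$, and apply windowing and time-domain anti-aliasing to $\bar{x}(n)$ to obtain an N-point time-domain sampled signal $\tilde{x}(n)$; and to apply a symmetric transformation to the time-domain signal $\tilde{x}(n)$, append a sequence of zeros at each end of the signal, divide the lengthened signal into M mutually overlapping subframes, and apply windowing, time-domain anti-aliasing and the time-frequency transform to the time-domain signal of each subframe to obtain M groups of frequency-domain coefficients, which form the total frequency-domain coefficients of the current frame.
24. The system of claim 19, wherein the frequency-domain coefficient generation unit is further configured, when rearranging the frequency-domain coefficients, to rearrange them separately within the core layer and within the extended layer, in order of coding subbands from low to high frequency.
25. The system of claim 19, wherein the amplitude envelope quantization and coding unit is configured such that rearranging the amplitude envelope quantization indices means: arranging the amplitude envelope quantization indices of the coding subbands within the same subframe together in order of increasing or decreasing frequency, the junction at a subframe boundary being made with two coding subbands that belong to the two subframes and represent the same frequency.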
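26. A hierarchical audio decoding system, comprising: a bitstream demultiplexer, an amplitude envelope decoding unit, a core layer bit allocation unit and a core layer decoding and dequantization unit; the system further comprising: a residual signal amplitude envelope generation unit, an extended layer bit allocation unit, an extended layer coding signal decoding and dequantization unit, a full-bandwidth frequency-domain coefficient recovery unit, a noise filling unit and an audio signal recovery unit; wherein:
the amplitude envelope decoding unit is connected to the bitstream demultiplexer and is configured to decode the amplitude envelope coded bits of the core layer and extended layer coding subbands output by the bitstream demultiplexer to obtain the amplitude envelope quantization indices of the core layer and extended layer coding subbands; if the transient decision information indicates a transient signal, it additionally rearranges the amplitude envelope quantization indices of the core layer and extended layer coding subbands in order of increasing frequency;
the core layer bit allocation unit is connected to the amplitude envelope decoding unit and is configured to allocate bits to the core layer coding subbands according to their amplitude envelope quantization indices to obtain the bit allocation numbers of the core layer coding subbands;
the core layer decoding and dequantization unit is connected to the bitstream demultiplexer, the amplitude envelope decoding unit and the core layer bit allocation unit, and is configured to compute the quantized amplitude envelope values of the core layer coding subbands from their amplitude envelope quantization indices, and to decode, dequantize and denormalize the coded bits of the core layer frequency-domain coefficients output by the bitstream demultiplexer using the bit allocation numbers and the quantized amplitude envelope values of the core layer coding subbands, to obtain the core layer frequency-domain coefficients;
the residual signal amplitude envelope generation unit is connected to the amplitude envelope decoding unit and the core layer bit allocation unit, and is configured to look up the statistical table of correction values for the core layer residual signal amplitude envelope quantization index, according to the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding core layer coding subbands, to obtain the amplitude envelope quantization indices of the core layer residual signal;
the extended layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and the amplitude envelope decoding unit, and is configured to allocate bits to the coding subbands of the extended layer coding signal according to the amplitude envelope quantization indices of the core layer residual signal and of the extended layer coding subbands, to obtain the bit allocation numbers of the coding subbands of the extended layer coding signal;
the extended layer coding signal decoding and dequantization unit is connected to the bitstream demultiplexer, the amplitude envelope decoding unit, the extended layer bit allocation unit and the residual signal amplitude envelope generation unit, and is configured to compute the quantized amplitude envelope values of the coding subbands of the extended layer coding signal from their amplitude envelope quantization indices, and to decode, dequantize and denormalize the coded bits of the extended layer coding signal output by the bitstream demultiplexer using the bit allocation numbers and the quantized amplitude envelope values of those subbands, to obtain the extended layer coding signal;
the full-bandwidth frequency-domain coefficient recovery unit is connected to the core layer decoding and dequantization unit and the extended layer coding signal decoding and dequantization unit, and is configured to reorder the extended layer coding signal output by the extended layer coding signal decoding and dequantization unit in subband order and then add it to the core layer frequency-domain coefficients output by the core layer decoding and dequantization unit, to obtain the full-bandwidth frequency-domain coefficients;
the noise filling unit is connected to the full-bandwidth frequency-domain coefficient recovery unit and the amplitude envelope decoding unit, and is configured to apply noise filling to subbands that were allocated no coded bits during coding;
the audio signal recovery unit is connected to the noise filling unit and is configured to: if the transient decision information indicates a steady-state signal, apply the inverse time-frequency transform directly to the full-bandwidth frequency-domain coefficients to obtain the output audio signal; if the transient decision information indicates a transient signal, rearrange the full-bandwidth frequency-domain coefficients, divide them into M groups, apply the inverse time-frequency transform to each group, and compute the final audio signal from the M groups of time-domain signals so obtained.
27. The system of claim 26, wherein
the residual signal amplitude envelope generation unit further comprises a quantization index correction value acquisition module and a residual signal amplitude envelope quantization index calculation module;
the quantization index correction value acquisition module is configured to derive the quantization index correction values of the residual signal coding subbands from the bit allocation numbers of the core layer coding subbands, the correction value of each coding subband being greater than or equal to 0 and not decreasing as the bit allocation number of the corresponding core layer coding subband increases; if the bit allocation number of a core layer coding subband is 0, the quantization index correction value of the core layer residual signal in that subband is 0; if the bit allocation number of a core layer coding subband equals the defined maximum bit allocation number, the amplitude envelope value of the core layer residual signal in that subband is zero;
the residual signal amplitude envelope quantization index calculation module is configured to subtract the quantization index correction value of the corresponding coding subband from the amplitude envelope quantization index of the core layer coding subband to obtain the amplitude envelope quantization index of the coding subband of the core layer residual signal.
28. The system of claim 26, wherein
the extended layer coding signal decoding and dequantization unit is further configured such that the decoding order of the coding subbands of the extended layer coding signal is determined by their initial importance values: coding subbands of higher importance are decoded first; if two coding subbands have the same importance, the lower-frequency subband is decoded first; the number of decoded bits is counted during decoding, and decoding stops when the number of decoded bits reaches the total number of bits required.
29. The system of claim 26, wherein the audio signal recovery unit is configured such that rearranging the full-bandwidth frequency-domain coefficients means: arranging the frequency-domain coefficients that belong to the same subframe in order of coding subbands from low to high frequency to obtain M groups of frequency-domain coefficients, and then arranging the M groups in subframe order.
30. The system of claim 26, wherein the audio signal recovery unit is configured such that, if the transient decision information indicates a transient signal, computing the final audio signal from the M groups of time-domain signals comprises: applying inverse time-domain anti-aliasing to each group of time-domain signals, windowing the M resulting signals, and overlap-adding the M windowed signals to obtain the N-point time-domain sampled signal $x_q(n)$; applying inverse time-domain anti-aliasing and windowing to the time-domain signal $x_q(n)$, and overlap-adding two adjacent frames to obtain the final audio output signal.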
PCT/CN2011/070206 2010-04-13 2011-01-12 可分层音频编解码方法和系统及瞬态信号可分层编解码方法 WO2011127757A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
RU2012136397/08A RU2522020C1 (ru) 2010-04-13 2011-01-12 Hierarchical audio frequency encoding and decoding method and system, hierarchical frequency encoding and decoding method for transient signal
EP11768369.8A EP2528057B1 (en) 2010-04-13 2011-01-12 Hierarchical frequency encoding and decoding method for transient signal and system
US13/580,855 US8874450B2 (en) 2010-04-13 2011-01-12 Hierarchical audio frequency encoding and decoding method and system, hierarchical frequency encoding and decoding method for transient signal
BR112012021359-8A BR112012021359B1 (pt) 2010-04-13 2011-01-12 Hierarchical audio encoding method, hierarchical audio decoding method, hierarchical audio encoding method for transient signals, hierarchical decoding method for transient signals, and hierarchical audio encoding system
HK13106102.7A HK1179402A1 (zh) 2010-04-13 2013-05-23 Hierarchical frequency encoding and decoding method for transient signal and system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
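CN2010101455311A CN102222505B (zh) 2010-04-13 2010-04-13 Hierarchical audio encoding and decoding method and system, and hierarchical encoding and decoding method for transient signals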
CN201010145531.1 2010-04-13

Publications (1)

Publication Number Publication Date
WO2011127757A1 (zh) 2011-10-20

Family

ID=44779039

Family Applications (1)

Application Number Title Priority Date Filing Date
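PCT/CN2011/070206 WO2011127757A1 (zh) 2010-04-13 2011-01-12 Hierarchical audio encoding and decoding method and system, and hierarchical encoding and decoding method for transient signals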

Country Status (7)

Country Link
US (1) US8874450B2 (zh)
EP (1) EP2528057B1 (zh)
CN (1) CN102222505B (zh)
BR (1) BR112012021359B1 (zh)
HK (1) HK1179402A1 (zh)
RU (1) RU2522020C1 (zh)
WO (1) WO2011127757A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9721574B2 (en) 2013-02-05 2017-08-01 Telefonaktiebolaget L M Ericsson (Publ) Concealing a lost audio frame by adjusting spectrum magnitude of a substitute audio frame based on a transient condition of a previously reconstructed audio signal
CN110232929A (zh) * 2013-02-20 2019-09-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Decoder and method for decoding an audio signal

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4322161A3 (en) * 2011-04-20 2024-05-01 Panasonic Holdings Corporation Device and method for execution of huffman coding
TWI576829B (zh) * 2011-05-13 2017-04-01 Samsung Electronics Co., Ltd. Bit allocation apparatus
JP5807453B2 (ja) * 2011-08-30 2015-11-10 Fujitsu Limited Encoding method, encoding device, and encoding program
EP2717262A1 (en) * 2012-10-05 2014-04-09 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Encoder, decoder and methods for signal-dependent zoom-transform in spatial audio object coding
CN103854653B (zh) 2012-12-06 2016-12-28 Huawei Technologies Co., Ltd. Method and device for signal decoding
US9560386B2 (en) * 2013-02-21 2017-01-31 Mozilla Corporation Pyramid vector quantization for video coding
US9665541B2 (en) 2013-04-25 2017-05-30 Mozilla Corporation Encoding video data using reversible integer approximations of orthonormal transforms
WO2015081699A1 (zh) 2013-12-02 2015-06-11 Huawei Technologies Co., Ltd. Encoding method and apparatus
KR102185478B1 (ko) * 2014-02-28 2020-12-02 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Decoding device, encoding device, decoding method, and encoding method
CN111312278B (zh) 2014-03-03 2023-08-15 Samsung Electronics Co., Ltd. Method and apparatus for high-frequency decoding for bandwidth extension
WO2015162500A2 (ko) * 2014-03-24 2015-10-29 Samsung Electronics Co., Ltd. High-band encoding method and device, and high-band decoding method and device
HUE042095T2 (hu) * 2014-07-28 2019-06-28 Ericsson Telefon Ab L M Pyramid vector quantizer shape search
FR3024581A1 (fr) * 2014-07-29 2016-02-05 Orange Determination of a coding budget for an LPD/FD transition frame
EP2988300A1 (en) 2014-08-18 2016-02-24 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Switching of sampling rates at audio processing devices
EP2993665A1 (en) * 2014-09-02 2016-03-09 Thomson Licensing Method and apparatus for coding or decoding subband configuration data for subband groups
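WO2016035731A1 (ja) * 2014-09-04 2016-03-10 Sony Corporation Transmission device, transmission method, reception device, and reception method
JPWO2016052191A1 (ja) * 2014-09-30 2017-07-20 Sony Corporation Transmission device, transmission method, reception device, and reception method
KR102362788B1 (ko) * 2015-01-08 2022-02-15 Electronics and Telecommunications Research Institute Apparatus for generating a broadcast signal frame using layered division multiplexing and method for generating a broadcast signal frame
WO2016111567A1 (ko) 2015-01-08 2016-07-14 Electronics and Telecommunications Research Institute Apparatus for generating a broadcast signal frame using layered division multiplexing and method for generating a broadcast signal frame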
EP3182411A1 (en) * 2015-12-14 2017-06-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for processing an encoded audio signal
US10210871B2 (en) * 2016-03-18 2019-02-19 Qualcomm Incorporated Audio processing for temporally mismatched signals
CN116343804A (zh) * 2016-12-16 2023-06-27 Telefonaktiebolaget LM Ericsson (publ) Method, encoder and decoder for processing envelope representation coefficients
US10586546B2 (en) 2018-04-26 2020-03-10 Qualcomm Incorporated Inversely enumerated pyramid vector quantizers for efficient rate adaptation in audio coding
US10573331B2 (en) * 2018-05-01 2020-02-25 Qualcomm Incorporated Cooperative pyramid vector quantizers for scalable audio coding
US10734006B2 (en) 2018-06-01 2020-08-04 Qualcomm Incorporated Audio coding based on audio pattern recognition
CN109036457B (zh) * 2018-09-10 2021-10-08 Guangzhou Kugou Computer Technology Co., Ltd. Method and apparatus for restoring an audio signal
WO2020253941A1 (en) * 2019-06-17 2020-12-24 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder with a signal-dependent number and precision control, audio decoder, and related methods and computer programs
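CN113129910A (zh) * 2019-12-31 2021-07-16 Huawei Technologies Co., Ltd. Encoding and decoding method and encoding and decoding apparatus for audio signals
CN115691521A (zh) * 2021-07-29 2023-02-03 Huawei Technologies Co., Ltd. Method and apparatus for encoding and decoding an audio signal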

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6418408B1 (en) * 1999-04-05 2002-07-09 Hughes Electronics Corporation Frequency domain interpolative speech codec system
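CN1849649A (zh) * 2003-09-09 2006-10-18 Koninklijke Philips Electronics N.V. Encoding of transient audio signal components
CN101206860A (zh) * 2006-12-20 2008-06-25 Huawei Technologies Co., Ltd. Hierarchical audio encoding and decoding method and apparatus
CN101414864A (zh) * 2008-12-08 2009-04-22 Huawei Technologies Co., Ltd. Method and apparatus for multi-antenna layered precoding
CN101622667A (zh) * 2007-03-02 2010-01-06 Telefonaktiebolaget LM Ericsson (publ) Postfilter for layered codecs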

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5502789A (en) * 1990-03-07 1996-03-26 Sony Corporation Apparatus for encoding digital data with reduction of perceptible noise
CN1062963C (zh) * 1990-04-12 2001-03-07 Dolby Laboratories Licensing Corporation Decoder and encoder for producing high-quality sound signals
US5388181A (en) * 1990-05-29 1995-02-07 Anderson; David J. Digital audio compression system
US5956674A (en) * 1995-12-01 1999-09-21 Digital Theater Systems, Inc. Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels
US5886276A (en) * 1997-01-16 1999-03-23 The Board Of Trustees Of The Leland Stanford Junior University System and method for multiresolution scalable audio signal encoding
KR100335609B1 (ko) * 1997-11-20 2002-10-04 Samsung Electronics Co., Ltd. Bit-rate scalable audio encoding/decoding method and apparatus
EP1047047B1 (en) * 1999-03-23 2005-02-02 Nippon Telegraph and Telephone Corporation Audio signal coding and decoding methods and apparatus and recording media with programs therefor
US6260017B1 (en) * 1999-05-07 2001-07-10 Qualcomm Inc. Multipulse interpolative coding of transition speech frames
US6931373B1 (en) * 2001-02-13 2005-08-16 Hughes Electronics Corporation Prototype waveform phase modeling for a frequency domain interpolative speech codec system
AU2002307533B2 (en) * 2001-05-10 2008-01-31 Dolby Laboratories Licensing Corporation Improving transient performance of low bit rate audio coding systems by reducing pre-noise
US7003454B2 (en) * 2001-05-16 2006-02-21 Nokia Corporation Method and system for line spectral frequency vector quantization in speech codec
US7328150B2 (en) * 2002-09-04 2008-02-05 Microsoft Corporation Innovations in pure lossless audio compression
FI119533B (fi) * 2004-04-15 2008-12-15 Nokia Corp Coding of audio signals
US7895034B2 (en) * 2004-09-17 2011-02-22 Digital Rise Technology Co., Ltd. Audio encoding system
US7386445B2 (en) * 2005-01-18 2008-06-10 Nokia Corporation Compensation of transient effects in transform coding
US7961890B2 (en) * 2005-04-15 2011-06-14 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung, E.V. Multi-channel hierarchical audio coding with compact side information
WO2007063913A1 (ja) * 2005-11-30 2007-06-07 Matsushita Electric Industrial Co., Ltd. Subband coding apparatus and subband coding method
US8417532B2 (en) * 2006-10-18 2013-04-09 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
CA2698039C (en) * 2007-08-27 2016-05-17 Telefonaktiebolaget Lm Ericsson (Publ) Low-complexity spectral analysis/synthesis using selectable time resolution
TWI346465B (en) * 2007-09-04 2011-08-01 Univ Nat Central Configurable common filterbank processor applicable for various audio video standards and processing method thereof
US8290782B2 (en) * 2008-07-24 2012-10-16 Dts, Inc. Compression of audio scale-factors by two-dimensional transformation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2528057A4 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9721574B2 (en) 2013-02-05 2017-08-01 Telefonaktiebolaget L M Ericsson (Publ) Concealing a lost audio frame by adjusting spectrum magnitude of a substitute audio frame based on a transient condition of a previously reconstructed audio signal
US10332528B2 (en) 2013-02-05 2019-06-25 Telefonaktiebolaget Lm Ericsson (Publ) Method and apparatus for controlling audio frame loss concealment
US10559314B2 (en) 2013-02-05 2020-02-11 Telefonaktiebolaget L M Ericsson (Publ) Method and apparatus for controlling audio frame loss concealment
US11437047B2 (en) 2013-02-05 2022-09-06 Telefonaktiebolaget L M Ericsson (Publ) Method and apparatus for controlling audio frame loss concealment
CN110232929A (zh) * 2013-02-20 2019-09-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Decoder and method for decoding an audio signal
US11621008B2 (en) 2013-02-20 2023-04-04 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for encoding or decoding an audio signal using a transient-location dependent overlap
CN110232929B (zh) * 2013-02-20 2023-06-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Decoder and method for decoding an audio signal
US11682408B2 (en) 2013-02-20 2023-06-20 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for generating an encoded signal or for decoding an encoded audio signal using a multi overlap portion

Also Published As

Publication number Publication date
RU2522020C1 (ru) 2014-07-10
HK1179402A1 (zh) 2013-09-27
CN102222505A (zh) 2011-10-19
RU2012136397A (ru) 2014-05-20
US20120323582A1 (en) 2012-12-20
BR112012021359A2 (pt) 2017-08-15
EP2528057A4 (en) 2014-08-06
EP2528057A1 (en) 2012-11-28
US8874450B2 (en) 2014-10-28
CN102222505B (zh) 2012-12-19
BR112012021359B1 (pt) 2020-12-15
EP2528057B1 (en) 2016-04-06

Similar Documents

Publication Publication Date Title
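WO2011127757A1 (zh) Hierarchical audio encoding and decoding method and system, and hierarchical encoding and decoding method for transient signals
WO2011063694A1 (zh) Hierarchical audio encoding and decoding method and system
WO2011063594A1 (zh) Lattice vector quantization audio encoding and decoding method and system
JP4224021B2 (ja) Method and system for multi-rate lattice vector quantization of a signal
JP6600054B2 (ja) Method, encoder, decoder, and mobile device
TWI671736B (zh) Apparatus for coding the envelope of a signal and apparatus for decoding the same
JP2019191594A (ja) Speech/audio encoding device, speech/audio decoding device, speech/audio encoding method, and speech/audio decoding method
JP5331249B2 (ja) Encoding method, decoding method, device, program, and recording medium
WO2012004998A1 (ja) Device and method for efficiently encoding quantization parameters of spectral coefficient coding
CN105957533B (zh) Speech compression method, speech decompression method, audio encoder, and audio decoder
KR20160003264A (ko) Signal encoding and decoding method and device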
CA2803276A1 (en) Encoding method, decoding method, encoding device, decoding device, program, and recording medium
KR20160098597A (ko) Signal codec apparatus and method in a communication system
BRPI0317954B1 (pt) Variable rate audio coding and decoding process

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application; Ref document number: 11768369; Country of ref document: EP; Kind code of ref document: A1
WWE WIPO information: entry into national phase; Ref document number: 13580855; Country of ref document: US; Ref document number: 2011768369; Country of ref document: EP
NENP Non-entry into the national phase; Ref country code: DE
ENP Entry into the national phase; Ref document number: 2012136397; Country of ref document: RU; Kind code of ref document: A
WWE WIPO information: entry into national phase; Ref document number: A20121237; Country of ref document: BY
REG Reference to national code; Ref country code: BR; Ref legal event code: B01A; Ref document number: 112012021359; Country of ref document: BR
REG Reference to national code; Ref country code: BR; Ref legal event code: B01E; Ref document number: 112012021359; Country of ref document: BR
ENP Entry into the national phase; Ref document number: 112012021359; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20120824