CN101206860A - Method and apparatus for encoding and decoding layered audio - Google Patents


Info

Publication number
CN101206860A
Authority
CN
China
Prior art keywords
subband
enhancement layer
module
core layer
signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2006101678915A
Other languages
Chinese (zh)
Inventor
万华林
张军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CNA2006101678915A (CN101206860A)
Priority to PCT/CN2007/071154 (WO2008074251A1)
Publication of CN101206860A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0212 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/24 Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a layered audio encoding and decoding method and apparatus. In embodiments of the invention, the encoder transforms the input signal into MLT (Modulated Lapped Transform) coefficients, divides them into a core layer signal and an enhancement layer signal according to an auditory perception model, and produces multiplexed and packed coded data. The decoder weights the importance of each enhancement layer subband according to the auditory perception model, applies the inverse MLT to the recovered core layer and enhancement layer MLT coefficients, and outputs the decoded stream. Compared with prior layered codecs, applying the MLT to the input signal and weighting the importance of each enhancement layer subband according to the auditory perception model improves encoding and decoding quality and solves the prior art's inability to handle input signals with high sampling rates effectively.

Description

A layered audio encoding and decoding method and apparatus
Technical field
The present invention relates to encoding and decoding technology, and in particular to a layered audio encoding and decoding method and apparatus.
Background art
With the rapid development of multimedia technology, audio codecs are used ever more widely in digital audio broadcasting, high-quality audio transmission over the Internet, digital cinema, and similar applications.
An important feature of an audio codec system is its ability to adapt to different application environments. Layered audio coding was developed to meet this need. Layering means that the audio signal is organized in layers: the signal is divided into a low-quality part, which is the core layer, and a high-quality part, which is the enhancement layer, and the low-quality part can be decoded without any information from the high-quality part. Layering is especially useful when the transmission channel cannot guarantee the full bandwidth needed to carry the complete signal. For example, when several users access the same audio over different communication links, a user on a high-speed link can play surround sound encoded at 384 kbit/s in real time, while a user with only a 56 kbit/s modem cannot. Once the audio signal is layered, users with high bandwidth enjoy high-quality audio, while a user connected at 56 kbit/s can download the core layer and listen to lower-quality audio.
Fig. 1a is a schematic structural diagram of a layered audio encoding apparatus in the prior art. The apparatus comprises a quadrature mirror filterbank (QMF) 101, a QMF 102, a code-excited linear prediction (CELP) encoding module 103, a CELP decoding module 104, an adder 105, a modified discrete cosine transform (MDCT) module 106, an MDCT module 107, a time domain aliasing cancellation (TDAC) encoding module 108, a time domain bandwidth extension (TDBWE) module 109, and a bitstream multiplexing and packing module 110.
QMF 101 filters the input pulse-code modulation (PCM) signal and outputs the core layer signal.
The input to QMF 101 is a PCM signal sampled at 16,000 Hz.
QMF 102 filters the input PCM signal and outputs the enhancement layer signal.
After filtering by QMF 101 and QMF 102, the PCM signal is divided into a core layer signal and an enhancement layer signal.
The CELP encoding module 103 CELP-encodes the core layer signal from QMF 101 and sends the coded data to the CELP decoding module 104 and to the bitstream multiplexing and packing module 110.
The CELP decoding module 104 CELP-decodes the coded data from the CELP encoding module 103 and sends the result to the adder 105.
The adder 105 subtracts the signal from the CELP decoding module 104 from the core layer signal from QMF 101 and sends the difference to the MDCT module 106.
The MDCT module 106 transforms the signal from the adder 105 from the time domain to the frequency domain and sends the resulting MDCT coefficients to the TDAC encoding module 108.
The MDCT module 107 transforms the enhancement layer signal from QMF 102 from the time domain to the frequency domain and sends the resulting enhancement layer MDCT coefficients to the TDAC encoding module 108.
The TDAC encoding module 108 TDAC-encodes the MDCT coefficients from the MDCT module 106 and the enhancement layer MDCT coefficients from the MDCT module 107, and sends the coded data to the bitstream multiplexing and packing module 110.
During TDAC encoding, the MDCT coefficients covering 0 to 7,000 Hz are divided into 18 subbands, the envelope value of each subband is calculated, coding bits are allocated to each subband according to the size of its envelope value, and each subband is quantized and encoded with its allocated number of bits.
The TDBWE module 109 extracts high-frequency parameters from the enhancement layer signal from QMF 102 and sends them to the bitstream multiplexing and packing module 110.
The bitstream multiplexing and packing module 110 multiplexes and packs the coded data from the CELP encoding module 103, the coded data from the TDAC encoding module 108, and the data from the TDBWE module 109.
During packing, the coded data are arranged in descending order of subband envelope value.
Fig. 1b is a schematic structural diagram of the prior-art layered audio decoding apparatus corresponding to Fig. 1a. The apparatus comprises a bitstream demultiplexing module 120, a CELP decoding module 121, a TDAC decoding module 122, a TDBWE decoding module 123, an adder 124, an inverse MDCT module 125, an inverse MDCT module 126, a QMF 127, a QMF 128, and an adder 129.
The bitstream demultiplexing module 120 demultiplexes the received coded data, sends the resulting core layer coded data to the CELP decoding module 121, and sends the data of the other layers to the TDAC decoding module 122 and the TDBWE decoding module 123.
The CELP decoding module 121 decodes the received core layer coded data and sends the result to the adder 124.
The TDAC decoding module 122 decodes the received coded data and sends the result to the inverse MDCT module 125 and the inverse MDCT module 126.
The TDBWE decoding module 123 decodes the received coded data and sends the result to QMF 128.
The inverse MDCT module 125 converts the received frequency-domain signal to a time-domain signal and sends it to the adder 124.
The inverse MDCT module 126 converts the received frequency-domain signal to a time-domain signal and sends it to QMF 128.
The adder 124 adds the core layer decoded data from the CELP decoding module 121 and the data from the inverse MDCT module 125, and sends the sum to QMF 127.
QMF 127 upsamples the received signal to obtain the core layer signal.
QMF 128 upsamples the received signal to obtain the enhancement layer signal.
The adder 129 adds the core layer signal from QMF 127 and the enhancement layer signal from QMF 128 to obtain the decompressed PCM stream.
The existing layered encoding and decoding method has the following shortcomings:
1) In general, the human auditory system perceives sound in the 20 Hz to 20,000 Hz range; the upper limit depends on the individual's auditory system and on the sound intensity, and the ordinary listener is most sensitive to sound between 2,000 Hz and 8,000 Hz. The prior art handles input signals sampled at 16,000 Hz, allocates coding bits according to the size of each subband's envelope value, and places the coded data of subbands with large envelope values first as lower-layer information, which is workable at that sampling rate. For input signals sampled at 32,000 Hz, 44,100 Hz, or 48,000 Hz, however, this approach has serious defects. For example, a subband near 16,000 Hz may have a large envelope value yet still lie below the threshold of audibility, that is, the ear is insensitive to it; allocating more bits to this subband leaves the truly important subbands without enough bits and degrades coding quality. The approach may also place subbands to which the ear is sensitive late in the bitstream because their envelope values are small, so that they are discarded first when network conditions are poor, which degrades the listening experience. In other words, the prior-art layered audio codec cannot handle input signals with high sampling frequencies effectively.
2) The QMF used in the prior art increases both the complexity and the delay of the codec. The CELP coding applied to the core layer signal is designed for the characteristics of speech and is not well suited to other types of low-frequency signal, which degrades the codec's performance.
Summary of the invention
In view of this, one object of the embodiments of the invention is to provide a layered audio encoding apparatus that effectively improves encoding quality.
Another object of the embodiments of the invention is to provide a layered audio decoding apparatus that effectively improves decoding quality.
A further object of the embodiments of the invention is to provide a layered audio encoding and decoding method that effectively improves encoding and decoding quality.
To achieve these objects, the technical solution of the invention is implemented as follows:
A layered audio encoding apparatus, comprising: a layering module based on an auditory perception model, the auditory perception model, a subband envelope calculation and encoding module, a core layer encoding module, an enhancement layer encoding module, and a bitstream multiplexing and packing module;
The layering module based on the auditory perception model applies the modulated lapped transform (MLT, Modulated Lapped Transform) to the input signal to obtain MLT coefficients and then divides them into a core layer signal and an enhancement layer signal according to the auditory perception model;
The auditory perception model provides the basis for layering in the layering module based on the auditory perception model and the basis for subband importance weighting in the enhancement layer encoding module;
The subband envelope calculation and encoding module calculates the envelope value of each subband of the core layer signal and the enhancement layer signal received from the layering module based on the auditory perception model, sends the core layer signal and the envelope value of each core layer subband to the core layer encoding module, and sends the enhancement layer signal and the envelope value of each enhancement layer subband to the enhancement layer encoding module; it also encodes the subband envelope values and sends the coded data to the bitstream multiplexing and packing module;
The core layer encoding module encodes the input core layer signal according to the envelope value of each core layer subband and sends the result to the bitstream multiplexing and packing module;
The enhancement layer encoding module encodes the input enhancement layer signal according to the auditory perception model and the envelope value of each enhancement layer subband and sends the result to the bitstream multiplexing and packing module;
The bitstream multiplexing and packing module multiplexes and packs the coded data of each core layer subband from the core layer encoding module, the coded data of each enhancement layer subband from the enhancement layer encoding module, and the subband envelope coded data from the subband envelope calculation and encoding module.
A layered audio decoding apparatus, comprising: a bitstream demultiplexing module, a subband envelope decoding module, a core layer decoding module, an enhancement layer decoding module, an auditory perception model, and an MLT coefficient reconstruction and inverse transform module;
The bitstream demultiplexing module decomposes the received coded data into subband envelope coded data, core layer coded data, and enhancement layer coded data, and sends them to the subband envelope decoding module;
The subband envelope decoding module decodes the subband envelope coded data to obtain the envelope value of each subband, sends the core layer coded data and the envelope value of each core layer subband to the core layer decoding module, and sends the enhancement layer coded data and the envelope value of each enhancement layer subband to the enhancement layer decoding module;
The core layer decoding module decodes the input core layer coded data according to the envelope value of each core layer subband, obtains the decompressed MLT coefficients of each core layer subband, and sends them to the MLT coefficient reconstruction and inverse transform module;
The enhancement layer decoding module decodes the input enhancement layer coded data according to the auditory perception model and the envelope value of each enhancement layer subband, obtains the decompressed MLT coefficients of each enhancement layer subband, and sends the MLT coefficients and the envelope value of each enhancement layer subband to the MLT coefficient reconstruction and inverse transform module;
The auditory perception model provides the basis for subband importance weighting in the enhancement layer decoding module;
The MLT coefficient reconstruction and inverse transform module applies the inverse transform to the MLT coefficients of each core layer subband and each enhancement layer subband to obtain the decompressed output signal.
A layered audio encoding and decoding method, comprising:
A. applying the MLT to the input signal, dividing the result into a core layer signal and an enhancement layer signal according to an auditory perception model, and obtaining the coded data of each subband envelope value from the core layer signal and the enhancement layer signal;
B. obtaining the coded data of each core layer subband according to the core layer signal and the envelope value of each core layer subband; obtaining the coded data of each enhancement layer subband according to the enhancement layer signal, the auditory perception model, and the envelope value of each enhancement layer subband; and multiplexing and packing the subband envelope coded data obtained in step A together with the coded data of each core layer subband and each enhancement layer subband, and sending the result to the decoder.
As the above shows, the layered audio codec scheme of the embodiments of the invention applies the MLT to the input signal, obtains the multiplexed and packed data according to the auditory perception model, and sends it to the decoder. This improves encoding and decoding quality and solves the prior art's inability to handle input signals with high sampling rates effectively.
Description of drawings
Fig. 1a is a schematic structural diagram of a layered audio encoding apparatus in the prior art;
Fig. 1b is a schematic structural diagram of the prior-art layered audio decoding apparatus corresponding to Fig. 1a;
Fig. 2 is a schematic structural diagram of the layered encoding apparatus of an embodiment of the invention;
Fig. 3 is a schematic structural diagram of the audio bitstream after multiplexing and packing in Fig. 2;
Fig. 4 is a schematic structural diagram of the layered decoding apparatus of an embodiment of the invention;
Fig. 5 is a flowchart of the layered encoding method of an embodiment of the invention;
Fig. 6 is a flowchart of the layered decoding method of an embodiment of the invention.
Embodiment
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in more detail below with reference to the embodiments and the accompanying drawings.
Fig. 2 is a schematic structural diagram of the layered encoding apparatus of an embodiment of the invention. It comprises a layering module 210 based on the auditory perception model, an auditory perception model 220, a subband envelope calculation and encoding module 230, a core layer encoding module 240, an enhancement layer encoding module 250, and a bitstream multiplexing and packing module 260.
The layering module 210 based on the auditory perception model applies the MLT to the PCM signal to obtain MLT coefficients and then, according to the auditory perception model 220, divides them into a core layer signal and an enhancement layer signal. It comprises an MLT module 211, a subband division module 212, and a band importance layering module 213.
The MLT module 211 applies the MLT to the input PCM signal to obtain MLT coefficients.
The subband division module 212 divides the MLT coefficients of each frame into multiple equal-width subbands, or divides them into multiple unequal-width subbands according to the auditory perception model 220.
For the unequal-width division, the MLT coefficients are divided according to the auditory perception model 220 so that the bandwidth of a subband depends on its position in the spectrum: low-frequency MLT coefficients are grouped into narrower subbands and high-frequency MLT coefficients into wider subbands.
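The unequal-width division can be illustrated with a minimal sketch. The patent does not specify the subband widths, so the doubling rule below, the function name `unequal_subband_edges`, and the parameters `base_width` and `bands_per_step` are all assumptions made for illustration only.

```python
def unequal_subband_edges(num_coeffs, base_width=8, bands_per_step=4):
    """Return (start, end) coefficient index pairs covering 0..num_coeffs.

    Hypothetical rule: subbands start `base_width` coefficients wide and the
    width doubles after every `bands_per_step` subbands, so low frequencies
    get narrow subbands and high frequencies get wide ones.
    """
    edges = []
    start = 0
    width = base_width
    while start < num_coeffs:
        end = min(start + width, num_coeffs)  # last subband may be truncated
        edges.append((start, end))
        start = end
        if len(edges) % bands_per_step == 0:
            width *= 2
    return edges


# Example: a frame of 320 MLT coefficients (frame size chosen arbitrarily).
edges = unequal_subband_edges(320)
widths = [end - start for start, end in edges]
```

A real auditory perception model would more likely follow a critical-band scale; the doubling rule is used here only because it is easy to verify by hand.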
The band importance layering module 213 divides the MLT coefficients, already split into multiple subbands, into a core layer signal containing the signals to which the ear is sensitive and an enhancement layer signal containing the signals to which the ear is less sensitive, according to the auditory perception model 220.
Here, according to the auditory perception model 220, the MLT coefficients in the frequency range to which the ear is sensitive are assigned to the core layer signal, and those in the range to which the ear is less sensitive are assigned to the enhancement layer signal. For example, if the auditory perception model indicates that the ear is sensitive to signals between 2,000 Hz and 8,000 Hz, the MLT coefficients between 0 Hz and 8,000 Hz can be assigned to the core layer signal and those above 8,000 Hz to the enhancement layer signal. The core layer signal and the enhancement layer signal each comprise multiple subbands.
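The 8,000 Hz example can be sketched as follows. The mapping from coefficient index to frequency (each of the `num_coeffs` coefficients spanning `sample_rate_hz / (2 * num_coeffs)` Hz) is the usual property of a lapped transform with that many coefficients per frame; the function name, the frame size of 640 coefficients, and the uniform 20-coefficient subbands are assumptions chosen for illustration.

```python
def split_layers(subband_edges, sample_rate_hz, num_coeffs, cutoff_hz=8000.0):
    """Assign each subband to the core or the enhancement layer.

    A subband goes to the core layer when its upper edge does not exceed
    `cutoff_hz`; everything above goes to the enhancement layer.
    """
    hz_per_coeff = sample_rate_hz / (2.0 * num_coeffs)
    core, enhancement = [], []
    for start, end in subband_edges:
        upper_edge_hz = end * hz_per_coeff
        if upper_edge_hz <= cutoff_hz:
            core.append((start, end))
        else:
            enhancement.append((start, end))
    return core, enhancement


# Assumed setup: 32,000 Hz input, 640 coefficients per frame (25 Hz each),
# uniform subbands of 20 coefficients (500 Hz each).
uniform_edges = [(k, k + 20) for k in range(0, 640, 20)]
core, enhancement = split_layers(uniform_edges, 32000, 640)
```

With these assumed numbers the first 16 subbands (0 to 8,000 Hz) form the core layer and the remaining 16 form the enhancement layer.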
The auditory perception model 220 provides the basis for the unequal-width division of the MLT coefficients in the subband division module 212, for the subband layering in the band importance layering module 213, and for the subband importance weighting in the subband importance weighting module 251.
The subband envelope calculation and encoding module 230 calculates the envelope value of each subband of the core layer signal and the enhancement layer signal received from the band importance layering module 213, sends the core layer signal and the envelope value of each core layer subband to the core layer encoding module 240, and sends the enhancement layer signal and the envelope value of each enhancement layer subband to the enhancement layer encoding module 250. It also encodes the subband envelope values and sends the coded data to the bitstream multiplexing and packing module 260.
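The text does not define how an envelope value is computed. A common choice in transform codecs is the root-mean-square (RMS) amplitude of the coefficients in a subband; the sketch below assumes that definition, and the function name `subband_envelopes` is hypothetical.

```python
import math


def subband_envelopes(coeffs, subband_edges):
    """Return one envelope value per subband.

    Assumption: the envelope is the RMS of the MLT coefficients in the
    subband. The source only says that an envelope value is calculated.
    """
    envelopes = []
    for start, end in subband_edges:
        band = coeffs[start:end]
        energy = sum(c * c for c in band)
        envelopes.append(math.sqrt(energy / len(band)))
    return envelopes


# Toy frame: one loud subband of four coefficients and one silent subband.
coeffs = [2.0, -2.0, 2.0, -2.0, 0.0, 0.0]
envelopes = subband_envelopes(coeffs, [(0, 4), (4, 6)])
```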
The core layer encoding module 240 encodes the input core layer signal according to the envelope value of each core layer subband received from the subband envelope calculation and encoding module 230 and sends the result to the bitstream multiplexing and packing module 260. It comprises a subband bit allocation module 241 and a quantization and encoding module 242.
The subband bit allocation module 241 receives the core layer signal and the envelope value of each core layer subband from the subband envelope calculation and encoding module 230, allocates a number of bits to each subband according to its envelope value, and sends the bit count of each subband together with the core layer signal to the quantization and encoding module 242.
The core layer signal comprises multiple subbands, that is, MLT coefficients that have been divided into multiple subbands.
The quantization and encoding module 242 quantizes and encodes each subband of the input core layer signal according to the number of bits allocated to it and sends the coded data of each core layer subband to the bitstream multiplexing and packing module 260.
The enhancement layer encoding module 250 encodes the input enhancement layer signal according to the auditory perception model 220 and the envelope value of each enhancement layer subband received from the subband envelope calculation and encoding module 230, and sends the result to the bitstream multiplexing and packing module 260. It comprises a subband importance weighting module 251, a subband bit allocation module 252, and a quantization and encoding module 253.
The subband importance weighting module 251 receives the enhancement layer signal and the envelope value of each enhancement layer subband from the subband envelope calculation and encoding module 230, computes a weighted importance for each enhancement layer subband from its envelope value and the auditory perception model 220, and sends the weighted importance of each subband together with the enhancement layer signal to the subband bit allocation module 252.
Because the enhancement layer signal covers higher frequencies and a wider band, the importance of a subband depends not only on its envelope value but also on the ear's sensitivity to it. The invention therefore weights the enhancement layer signal according to the auditory perception model 220: for a subband to which the ear is sensitive, the weighted importance is the product of the subband's envelope value and a larger weight; for a subband to which the ear is less sensitive, it is the product of the envelope value and a smaller weight. In other words, in the prior art the importance of each enhancement layer subband is determined by the envelope value alone, whereas in the invention it is determined jointly by the envelope value and the ear's sensitivity.
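The weighting rule above amounts to multiplying each envelope value by a perceptual weight. The sketch below compares envelope-only ranking with weighted ranking; the two subbands, their envelope values, and the weights are invented for illustration, since the patent gives no numeric weights.

```python
def weighted_importance(envelopes, ear_weights):
    """Importance of each enhancement layer subband = envelope * weight."""
    if len(envelopes) != len(ear_weights):
        raise ValueError("one weight per subband is required")
    return [env * w for env, w in zip(envelopes, ear_weights)]


def rank_by(values):
    """Subband indices sorted from most to least important."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


# Hypothetical subbands: index 0 near 9 kHz, index 1 near 16 kHz.
envelopes = [5.0, 8.0]
ear_weights = [1.0, 0.25]  # assumed: the ear is less sensitive near 16 kHz

importance = weighted_importance(envelopes, ear_weights)
envelope_only_order = rank_by(envelopes)   # prior-art style ranking
weighted_order = rank_by(importance)       # ranking with perceptual weights
```

With these assumed numbers the louder 16 kHz subband ranks first by envelope alone but second once the perceptual weight is applied, which is the reordering the paragraph describes.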
The subband bit allocation module 252 receives the weighted importance of each enhancement layer subband and the enhancement layer signal from the subband importance weighting module 251, allocates a number of bits to each enhancement layer subband according to its weighted importance, and sends the bit count of each subband together with the enhancement layer signal to the quantization and encoding module 253.
Based on the weighted importance of each enhancement layer subband, subbands of greater importance are allocated more bits and subbands of lesser importance are allocated fewer bits.
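The patent states only that more important subbands receive more bits, not the exact rule. One simple rule consistent with that statement is proportional allocation with largest-remainder rounding, sketched below; the function name `allocate_bits` and the rule itself are assumptions, and importances are assumed to be non-negative.

```python
def allocate_bits(importances, total_bits):
    """Split `total_bits` across subbands in proportion to importance.

    Assumed rule (not from the source): proportional shares are rounded
    down, then leftover bits go to the largest fractional remainders so
    the allocation always sums to `total_bits`.
    """
    total = sum(importances)
    if total <= 0:
        return [0] * len(importances)
    shares = [total_bits * imp / total for imp in importances]
    bits = [int(share) for share in shares]
    leftover = total_bits - sum(bits)
    by_remainder = sorted(range(len(shares)),
                          key=lambda i: shares[i] - bits[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        bits[i] += 1
    return bits


exact = allocate_bits([5.0, 2.0, 1.0], 80)
rounded = allocate_bits([1.0, 1.0, 1.0], 10)
```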
The quantization and encoding module 253 receives the bit count of each enhancement layer subband and the enhancement layer signal from the subband bit allocation module 252, quantizes and encodes each enhancement layer subband according to its bit count, and sends the coded data of each enhancement layer subband to the bitstream multiplexing and packing module 260.
The bitstream multiplexing and packing module 260 multiplexes and packs the coded data of each core layer subband from the quantization and encoding module 242, the coded data of each enhancement layer subband from the quantization and encoding module 253, and the subband envelope coded data from the subband envelope calculation and encoding module 230.
Here, the subband envelope coded data from the subband envelope calculation and encoding module 230 comprise the envelope values corresponding to each core layer subband and the envelope values corresponding to each enhancement layer subband.
Fig. 3 is a schematic structural diagram of the audio bitstream after multiplexing and packing in Fig. 2. It comprises a core part and an enhancement part. The core part comprises the frame header, the coded data of the subband envelope values, and the core layer coded data. The core layer coded data, shown as layer 0 coded data in the figure, consist of the coded data of each core layer subband arranged in order of increasing frequency. The enhancement part consists of the enhancement layer coded data, shown in the figure as layer 1 coded data through layer N coded data. The coded data of the enhancement layer subbands are inserted into the bitstream in descending order of importance. Before the coded data of a given enhancement layer subband are inserted, the sum of the bits already used in the current frame and the bits of that subband is calculated and compared with the total number of bits available for the frame. If the sum is less than or equal to the total, the subband's coded data are inserted, the used bit count is updated to the sum, and insertion continues with the next subband. Otherwise insertion stops and the remaining available bits are filled with a preset value such as "1" or "0"; that is, the coded data of that subband and of all less important subbands are discarded.
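The insertion procedure for the enhancement part can be sketched as a greedy loop. Bitstrings are modeled as Python strings of "0" and "1" for readability; the function name `pack_enhancement`, the tuple layout, and the example frame size are assumptions, while the stop-at-first-overflow rule and the padding follow the description above.

```python
def pack_enhancement(subbands, frame_bits, used_bits, fill_bit="0"):
    """Pack enhancement layer subbands into the remaining frame bits.

    subbands   -- list of (importance, bitstring) pairs, in any order
    frame_bits -- total bits available for the frame
    used_bits  -- bits already taken by header, envelopes and core layer
    Returns (enhancement part including padding, bits used before padding).
    """
    packed = []
    ordered = sorted(subbands, key=lambda s: s[0], reverse=True)
    for _importance, payload in ordered:
        if used_bits + len(payload) > frame_bits:
            # Stop here: this subband and every less important one is dropped,
            # even if a later, smaller subband would still have fitted.
            break
        packed.append(payload)
        used_bits += len(payload)
    padding = fill_bit * (frame_bits - used_bits)
    return "".join(packed) + padding, used_bits


# Assumed frame: 20 bits in total, 12 already used by the core part.
subbands = [(2.0, "1111"), (5.0, "101010"), (1.0, "11")]
stream, used = pack_enhancement(subbands, frame_bits=20, used_bits=12)
```

In this example only the most important subband fits; the 2-bit subband is dropped along with the 4-bit one because insertion stops at the first subband that overflows, and the two remaining bits are padded.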
Referring to Fig. 4, which is a structural diagram of the layered decoding device according to an embodiment of the present invention, the device comprises a bitstream demultiplexing module 410, a subband envelope decoding module 420, a core layer decoding module 430, an enhancement layer decoding module 440, an auditory perception model 450, and an MLT coefficient reconstruction and inverse transform module 460.
The bitstream demultiplexing module 410 demultiplexes the received encoded data into subband envelope encoded data, core layer encoded data and enhancement layer encoded data, and sends them to the subband envelope decoding module 420.
The core layer encoded data is a single block made up of the encoded data of multiple core layer subbands, and the enhancement layer encoded data is a single block made up of the encoded data of multiple enhancement layer subbands.
The subband envelope decoding module 420 receives the core layer encoded data, the subband envelope encoded data and the enhancement layer encoded data from the bitstream demultiplexing module 410, and decodes the subband envelope encoded data to obtain the envelope value of each subband. It then sends the core layer encoded data and the envelope values of the core layer subbands to the core layer decoding module 430, and sends the enhancement layer encoded data and the envelope values of the enhancement layer subbands to the enhancement layer decoding module 440.
The subband envelope values obtained by decoding the subband envelope encoded data comprise the envelope values of the core layer subbands and those of the enhancement layer subbands.
The core layer decoding module 430 receives the core layer encoded data and the envelope values of the core layer subbands from the subband envelope decoding module 420, decodes the core layer encoded data according to those envelope values to obtain the decoded MLT coefficients of each core layer subband, and sends them to the MLT coefficient reconstruction and inverse transform module 460. It comprises a subband bit allocation module 431, a subband data extraction module 432, and an inverse quantization and decoding module 433.
The subband bit allocation module 431 receives the core layer encoded data and the envelope values of the core layer subbands from the subband envelope decoding module 420, allocates a number of bits to each subband according to those envelope values, and sends the bit count of each core layer subband together with the core layer encoded data to the subband data extraction module 432.
The subband data extraction module 432 receives the bit count of each core layer subband and the core layer encoded data from the subband bit allocation module 431, extracts the encoded data of each subband from the core layer encoded data according to the number of bits each subband occupies, and sends the encoded data of each core layer subband to the inverse quantization and decoding module 433.
The core layer encoded data received from the subband bit allocation module 431 is a single block containing the encoded data of multiple core layer subbands; the subband data extraction module 432 outputs it as the separate encoded data of each core layer subband.
The inverse quantization and decoding module 433 receives the encoded data of each core layer subband from the subband data extraction module 432, inverse-quantizes and decodes it to obtain the decoded MLT coefficients of each core layer subband, and sends them to the MLT coefficient reconstruction and inverse transform module 460.
The enhancement layer decoding module 440 receives the enhancement layer encoded data and the envelope values of the enhancement layer subbands from the subband envelope decoding module 420, decodes the enhancement layer encoded data according to those envelope values and the auditory perception model 450 to obtain the decoded MLT coefficients of each enhancement layer subband, and sends the MLT coefficients and envelope values of each enhancement layer subband to the MLT coefficient reconstruction and inverse transform module 460. It comprises a subband importance weighting module 441, a subband bit allocation module 442, a subband data extraction module 443, and an inverse quantization and decoding module 444.
The subband importance weighting module 441 receives the enhancement layer encoded data and the envelope values of the enhancement layer subbands from the subband envelope decoding module 420, computes a weighted importance for each subband of the enhancement layer encoded data according to those envelope values and the auditory perception model 450, and sends the importance weighting result of each subband, the enhancement layer encoded data and the envelope values of the enhancement layer subbands to the subband bit allocation module 442.
Because the enhancement layer signal is higher in frequency and covers a wider band, the importance of the signal depends not only on the envelope value but also on the sensitivity of the human ear to the signal. The present invention therefore weights the enhancement layer signal according to the auditory perception model 450: for subbands to which the human ear is sensitive, the importance weighting result is the product of the subband's envelope value and a larger weight; for subbands to which the human ear is less sensitive, it is the product of the subband's envelope value and a smaller weight. The larger the resulting value, the greater the importance; the smaller the value, the lower the importance.
In other words, in the prior art the importance of an enhancement layer subband is determined by the envelope value alone, whereas in the present invention it is determined jointly by the envelope value and the sensitivity of the human ear.
The subband bit allocation module 442 receives the importance weighting result of each subband of the enhancement layer encoded data, the enhancement layer encoded data and the envelope values of the enhancement layer subbands from the subband importance weighting module 441, allocates a number of bits to the encoded data of each enhancement layer subband according to the importance weighting results, and sends the importance weighting results, the bit count of each subband's encoded data, the enhancement layer encoded data and the envelope values of the enhancement layer subbands to the subband data extraction module 443.
The subband data extraction module 443 receives the importance weighting results, the bit count of each enhancement layer subband's encoded data, the enhancement layer encoded data and the envelope values of the enhancement layer subbands from the subband bit allocation module 442. In descending order of subband importance, and according to the number of bits each enhancement layer subband occupies, it extracts the encoded data of each subband from the enhancement layer encoded data, and sends the encoded data and envelope values of each enhancement layer subband to the inverse quantization and decoding module 444.
The enhancement layer encoded data received from the subband bit allocation module 442 is a single block containing the encoded data of multiple enhancement layer subbands; the subband data extraction module 443 outputs it as the separate encoded data of each enhancement layer subband.
The encoded data of each subband is extracted from the enhancement layer encoded data in descending order of subband importance, according to the number of bits the corresponding subband occupies. Before each extraction, the number of bits already extracted from the current frame's bitstream is added to the number of bits occupied by the encoded data of the subband about to be extracted, and the sum is compared with the total number of bits in the frame's bitstream. If the sum exceeds the total, extraction stops. Otherwise the subband's encoded data is extracted, the number of extracted bits is updated to that sum, and extraction continues with the next subband of the enhancement layer encoded data.
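The extraction loop just described mirrors the encoder-side insertion. The sketch below is illustrative only: the function name `extract_enhancement_layer`, the dictionary arguments and the string-of-characters bit representation are assumptions, and the importance values and bit counts are taken as already computed from the decoded envelope values.

```python
def extract_enhancement_layer(frame_bits, importance, bit_counts, extracted_bits):
    """Pull enhancement layer subbands out of one frame, most important first.

    frame_bits:     the whole frame as a string of '0'/'1' characters (assumed)
    importance:     {subband_index: weighted importance}
    bit_counts:     {subband_index: bits allocated to that subband}
    extracted_bits: bits already consumed (header, envelopes, core layer)
    """
    total_bits = len(frame_bits)
    order = sorted(importance, key=importance.get, reverse=True)
    subband_data = {}
    for index in order:
        needed = bit_counts[index]
        # Stop as soon as the next subband would run past the end of the frame.
        if extracted_bits + needed > total_bits:
            break
        subband_data[index] = frame_bits[extracted_bits:extracted_bits + needed]
        extracted_bits += needed
    return subband_data
```

Because the decoder derives the same importance order and bit counts from the envelope values as the encoder, it stops at the same subband at which the encoder stopped inserting, so the fill bits at the end of the frame are never interpreted as subband data.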
The inverse quantization and decoding module 444 receives the encoded data and envelope values of each enhancement layer subband from the subband data extraction module 443, inverse-quantizes and decodes the encoded data to obtain the decoded MLT coefficients of each enhancement layer subband, and sends the MLT coefficients and envelope values of each enhancement layer subband to the MLT coefficient reconstruction and inverse transform module 460.
The auditory perception model 450 provides the basis for the subband importance weighting in the enhancement layer decoding module. If the data of some less important enhancement layer subbands has been dropped to adapt to network conditions, it also provides the MLT coefficient reconstruction module 461 with the basis for reconstructing the enhancement layer MLT coefficients lost during encoding or transmission.
The MLT coefficient reconstruction and inverse transform module 460 receives the MLT coefficients of each core layer subband from the inverse quantization and decoding module 433, and the MLT coefficients and envelope values of each enhancement layer subband from the inverse quantization and decoding module 444. It applies the inverse transform to the MLT coefficients of the core layer and enhancement layer subbands to obtain the decoded PCM signal. It comprises an MLT coefficient reconstruction module 461 and an inverse MLT module 462.
The MLT coefficient reconstruction module 461 receives the MLT coefficients of each core layer subband from the inverse quantization and decoding module 433, and the MLT coefficients and envelope values of each enhancement layer subband from the inverse quantization and decoding module 444. It rearranges the MLT coefficients of the core layer and enhancement layer subbands in frequency band order, compensating any missing coefficients according to the envelope values of the enhancement layer subbands, and sends the result to the inverse MLT module 462.
The rearranged MLT coefficients form a single block containing both the core layer MLT coefficients and the enhancement layer MLT coefficients.
The MLT coefficients of the core layer and enhancement layer subbands are arranged in order of frequency from low to high. Among the enhancement layer subbands, the data of some less important subbands may have been dropped during encoding or transmission to adapt to network conditions; for example, the encoded data of some less important enhancement layer subbands may be discarded during multiplexing and packing in the bitstream multiplexing and packing module 260. In that case, once the rearranged MLT coefficients have been obtained, the missing enhancement layer MLT coefficients can be compensated from the envelope values of the enhancement layer subbands. The compensation method is as follows: the sign of each MLT coefficient is chosen at random and may be positive or negative, and the envelope value of the corresponding subband multiplied by a proportionality constant is used as the amplitude of the MLT coefficient. The proportionality constant is determined according to the auditory perception model 450: it is larger for subband signals to which the human ear is more sensitive and smaller for signals to which the human ear is less sensitive.
The inverse MLT module 462 receives the MLT coefficients from the MLT coefficient reconstruction module 461 and applies the inverse MLT to them to obtain the decoded PCM signal.
Referring to Fig. 5, which is a flowchart of the layered encoding method according to an embodiment of the present invention. In this embodiment the input is a PCM signal sampled at 48 kHz, the frame length is 20 ms, the delay is 40 ms, and the bit rate ranges from 32 to 64 kbit/s, of which the core layer bit rate is 32 kbit/s and the layering step is 0.8 kbit/s. The method comprises the following steps:
Step 501: apply the MLT to the PCM signal to transform it into MLT coefficients.
At a 48 kHz sampling rate each 20 ms frame contains 960 samples, so the input to each MLT is the most recent 1920 samples x(n), 0 ≤ n < 1920, where x(0) is the oldest sample.
The MLT outputs 960 MLT coefficients mlt(m), 0 ≤ m < 960.
The MLT is given by the following formula:
$$\mathrm{mlt}(m) = \sum_{n=0}^{1919} \sqrt{\frac{2}{960}}\, \sin\!\left(\frac{\pi}{1920}(n+0.5)\right) \cos\!\left(\frac{\pi}{960}(n-479.5)(m+0.5)\right) x(n), \quad 0 \le m < 960$$
The MLT can be decomposed into a window, overlap and add operation followed by a type-IV discrete cosine transform (DCT). The window, overlap and add operation is given by:
$$v(n) = w(479-n)\,x(479-n) + w(480+n)\,x(480+n), \quad 0 \le n \le 479$$
$$v(n+480) = w(959-n)\,x(960+n) - w(n)\,x(1919-n), \quad 0 \le n \le 479$$
where:
$$w(n) = \sin\!\left(\frac{\pi}{1920}(n+0.5)\right), \quad 0 \le n < 960$$
Combining v(n) with the type-IV DCT gives the following expression for the MLT coefficients:
$$\mathrm{mlt}(m) = \sum_{n=0}^{959} \sqrt{\frac{2}{960}}\, \cos\!\left(\frac{\pi}{960}(n+0.5)(m+0.5)\right) v(n), \quad 0 \le m < 960$$
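The equivalence between the defining sum over 1920 samples and the window, overlap and add step followed by a type-IV DCT can be checked numerically. The sketch below is a pure-Python illustration, not an optimized transform. The function names and the frame-size parameter `n` are hypothetical (the embodiment uses n = 960), and the normalization is taken to be sqrt(2/n), which is an assumption where the extracted formula is ambiguous.

```python
import math

def mlt_direct(x, n=960):
    """Evaluate the defining sum over 2n input samples (x[0] is the oldest)."""
    scale = math.sqrt(2.0 / n)
    coeffs = []
    for m in range(n):
        acc = 0.0
        for j in range(2 * n):
            window = math.sin(math.pi / (2 * n) * (j + 0.5))
            # For n = 960 the offset n/2 - 0.5 is the 479.5 of the formula.
            basis = math.cos(math.pi / n * (j - (n / 2 - 0.5)) * (m + 0.5))
            acc += scale * window * basis * x[j]
        coeffs.append(acc)
    return coeffs

def mlt_folded(x, n=960):
    """Window, overlap and add into v(n), then apply a type-IV DCT."""
    half = n // 2
    w = [math.sin(math.pi / (2 * n) * (i + 0.5)) for i in range(n)]
    v = [0.0] * n
    for i in range(half):
        v[i] = w[half - 1 - i] * x[half - 1 - i] + w[half + i] * x[half + i]
        v[i + half] = w[n - 1 - i] * x[n + i] - w[i] * x[2 * n - 1 - i]
    scale = math.sqrt(2.0 / n)
    return [
        scale * sum(math.cos(math.pi / n * (i + 0.5) * (m + 0.5)) * v[i]
                    for i in range(n))
        for m in range(n)
    ]
```

Both functions should agree to within floating-point rounding for any even `n`. A small value such as `n=8` keeps the check fast, since the direct form costs on the order of 2n² cosine evaluations.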
Step 502: divide the MLT coefficients of each frame into multiple equally spaced subbands or multiple unequally spaced subbands.
Here, the MLT coefficients in the 0 to 20 kHz band are divided at equal intervals into 40 subbands; each subband is 500 Hz wide and contains 20 MLT coefficients.
Step 503: according to the auditory perception model, divide the MLT coefficients into a core layer signal containing the sensitive signal and an enhancement layer signal containing the less sensitive signal.
According to the auditory perception model, the human ear is sensitive to signals in the 2 kHz to 8 kHz range. The 0 to 8 kHz range, that is, subbands 0 to 15, is therefore assigned to the core layer signal and allocated a bit rate of 32 kbit/s, while subbands 16 to 39 are assigned to the enhancement layer signal, which takes the remaining 32 kbit/s.
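The figures used in steps 502 and 503 follow from simple arithmetic, sketched below. The constant and function names are hypothetical. The 25 Hz coefficient spacing is derived here from the Nyquist bandwidth (24 kHz over 960 coefficients), which is an inference consistent with the stated 500 Hz subbands of 20 coefficients rather than a figure given in the text.

```python
SAMPLE_RATE_HZ = 48_000
FRAME_MS = 20
COEFFS_PER_FRAME = 960
COEFFS_PER_SUBBAND = 20
NUM_SUBBANDS = 40
CORE_SUBBANDS = 16  # subbands 0..15 form the core layer

# 24 kHz of bandwidth spread over 960 coefficients.
HZ_PER_COEFF = (SAMPLE_RATE_HZ / 2) / COEFFS_PER_FRAME
SUBBAND_WIDTH_HZ = HZ_PER_COEFF * COEFFS_PER_SUBBAND

def subband_range_hz(r):
    """Lower and upper frequency edge of subband r."""
    return r * SUBBAND_WIDTH_HZ, (r + 1) * SUBBAND_WIDTH_HZ

def layer_of(r):
    """Layer assignment used in this embodiment."""
    return "core" if r < CORE_SUBBANDS else "enhancement"

def bits_per_frame(rate_kbps):
    """kbit/s multiplied by the frame length in ms gives bits per frame."""
    return round(rate_kbps * FRAME_MS)
```

Under these assumptions the 32 kbit/s core layer corresponds to 640 bits per 20 ms frame, and each 0.8 kbit/s layering step adds 16 bits per frame.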
Step 504: from the core layer signal and the enhancement layer signal, calculate the envelope value of each subband of both signals and encode each subband envelope value to obtain the encoded data of each subband envelope value, then perform step 505 and step 507.
The subband envelope value is defined as the root mean square (RMS) of the MLT coefficients in the subband, calculated as:
$$\mathrm{rms}(r) = \sqrt{\frac{1}{20} \sum_{n=0}^{19} \mathrm{mlt}(20r+n)\,\mathrm{mlt}(20r+n)}, \quad 0 \le r < 40$$
After the envelope value of each subband has been calculated, each subband envelope value is encoded with a variable length code (VLC) or another coding method to obtain the encoded data of each subband envelope value.
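The envelope calculation of step 504 can be sketched in a few lines. The function name `subband_envelopes` and its default parameters are illustrative assumptions; the envelope coding itself (VLC or otherwise) is not shown.

```python
import math

def subband_envelopes(mlt, num_subbands=40, width=20):
    """RMS of the MLT coefficients in each subband: rms(r) for 0 <= r < num_subbands."""
    envelopes = []
    for r in range(num_subbands):
        band = mlt[width * r: width * (r + 1)]
        envelopes.append(math.sqrt(sum(c * c for c in band) / width))
    return envelopes
```

Only the first 800 of the 960 coefficients (0 to 20 kHz) are covered by the 40 subbands, so coefficients above index 799 are ignored here.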
Step 505: allocate a number of bits to each subband of the core layer signal according to the envelope values of the core layer subbands.
The bit allocation algorithm of G.722.1 or G.729EV may be used to allocate bits to each subband signal of the core layer.
Step 506: quantize and encode each subband signal of the core layer signal according to the number of bits allocated to it, obtaining the encoded data of each core layer subband, then perform step 510.
Step 507: compute a weighted importance for each subband of the enhancement layer signal according to the auditory perception model and the envelope values of the enhancement layer subbands.
Because the enhancement layer signal is higher in frequency and covers a wider band, the importance of the signal depends not only on the envelope value but also on the sensitivity of the human ear to sound. The present invention therefore weights the enhancement layer signal according to the auditory perception model: for subbands to which the human ear is sensitive, the importance weighting result is the product of the subband's rms(r) and a larger weight; for subbands to which the human ear is less sensitive, it is the product of the subband's rms(r) and a smaller weight. In other words, the importance of each subband signal of the enhancement layer signal is determined by the envelope value and the sensitivity of the human ear.
The subband importance weighting can be expressed simply as:
$$\mathrm{ip}(r) = \begin{cases} 1.67\,\mathrm{rms}(16+r), & 0 \le r < 4 \\ 1.33\,\mathrm{rms}(16+r), & 4 \le r < 12 \\ \mathrm{rms}(16+r), & 12 \le r < 24 \end{cases}$$
The magnitude of ip(r) represents the importance of each subband signal of the enhancement layer signal.
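The piecewise weighting of step 507 translates directly into a lookup. In this sketch the function name `weighted_importance` and the list-based weight table are assumptions; the three weights and their ranges are those given in the text.

```python
# Weights for enhancement layer subbands r = 0..23 (absolute subbands 16..39).
IMPORTANCE_WEIGHTS = [1.67] * 4 + [1.33] * 8 + [1.0] * 12

def weighted_importance(rms):
    """Return ip(r) for r = 0..23 from the 40 subband envelope values."""
    return [rms[16 + r] * IMPORTANCE_WEIGHTS[r] for r in range(24)]

def importance_order(rms):
    """Absolute subband indices (16..39) sorted from most to least important."""
    ip = weighted_importance(rms)
    return [16 + r for r in sorted(range(24), key=lambda r: ip[r], reverse=True)]
```

Because the decoder has the same envelope values and weights, it can recompute the same ordering without any side information, which is what steps 606 to 608 rely on.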
Step 508: allocate a number of bits to each subband signal according to the calculated importance weighting results of the enhancement layer subbands.
Bits are allocated to each subband signal of the enhancement layer signal according to the weighted importance calculated in step 507: more bits go to subband signals of high importance and fewer bits to subband signals of low importance.
Step 509: quantize and encode each subband signal of the enhancement layer signal according to the number of bits allocated to it, obtaining the encoded data of each enhancement layer subband.
Step 510: multiplex and pack the encoded data of each subband envelope value, of each core layer subband and of each enhancement layer subband, then send the result to the decoder.
Fig. 3 shows the structure of the audio bitstream after multiplexing and packing. The multiplexing and packing method is the same as that described for the bitstream multiplexing and packing module 260.
Referring to Fig. 6, which is a flowchart of the layered decoding method according to an embodiment of the present invention, this embodiment decodes the bitstream produced by the encoding of Fig. 5 and comprises the following steps:
Step 601: demultiplex the encoded data sent by the encoder into core layer encoded data, subband envelope encoded data and enhancement layer encoded data.
The core layer encoded data is a single block made up of the encoded data of multiple core layer subbands, and the enhancement layer encoded data is a single block made up of the encoded data of multiple enhancement layer subbands.
Step 602: decode the subband envelope encoded data to obtain the envelope value of each subband, then perform step 603 and step 606.
The subband envelope values obtained by decoding the subband envelope encoded data comprise the envelope values of the core layer subbands and those of the enhancement layer subbands.
Step 603: allocate a number of bits to each subband of the core layer encoded data according to the envelope values of the core layer subbands.
Step 604: extract the encoded data of each subband from the core layer encoded data according to the number of bits each subband occupies.
The core layer encoded data is a single block made up of the encoded data of multiple core layer subbands; extraction separates it into the encoded data of each core layer subband.
Step 605: inverse-quantize and decode the extracted encoded data of each core layer subband to obtain the decoded MLT coefficients of each core layer subband, then perform step 610.
Step 606: compute a weighted importance for each subband of the enhancement layer encoded data according to the auditory perception model and the envelope values of the enhancement layer subbands.
Step 607: allocate a number of bits to the encoded data of each enhancement layer subband according to the importance of each subband of the enhancement layer encoded data.
Step 608: in descending order of subband importance, and according to the number of bits each enhancement layer subband occupies, extract the encoded data of each subband from the enhancement layer encoded data.
The method of this step is the same as that described for the subband data extraction module 443 and is not repeated here.
Step 609: inverse-quantize and decode the extracted encoded data of each enhancement layer subband to obtain the decoded enhancement layer MLT coefficients.
The enhancement layer encoded data is inverse-quantized and decoded by reversing the quantization and encoding of the encoding flow, yielding the 20 MLT coefficients of each subband.
Step 610: rearrange the MLT coefficients of the core layer and enhancement layer subbands in frequency order.
The MLT coefficients of the core layer and enhancement layer subbands are arranged in order of frequency from low to high. Among the enhancement layer subbands, the data of some less important subbands may have been dropped during encoding or transmission to adapt to network conditions; for example, the encoded data of some less important enhancement layer subbands may be discarded during multiplexing and packing in the encoding flow. The lost enhancement layer MLT coefficients can be reconstructed from the envelope values of the enhancement layer subbands. The reconstruction method is as follows: the sign of each MLT coefficient is chosen at random and may be positive or negative, and the envelope value of the subband multiplied by a proportionality constant is used as the amplitude of the MLT coefficient. The proportionality constant is determined according to the auditory perception model: it is larger for subband signals to which the human ear is more sensitive and smaller for signals to which the human ear is less sensitive. Table 1 lists the proportionality constant for each subband in this embodiment.
Subband index    Proportionality constant for MLT coefficient reconstruction
16-19            0.85
20-27            0.75
28-39            0.70
Table 1: Proportionality constants for MLT coefficient reconstruction
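The reconstruction rule and the constants of Table 1 can be sketched as follows. The function names and the optional `rng` argument are hypothetical, and Python's `random` module stands in for whatever random sign source a real decoder would use.

```python
import random

# (subband index range, proportionality constant) pairs from Table 1.
RECONSTRUCTION_SCALE = [
    (range(16, 20), 0.85),
    (range(20, 28), 0.75),
    (range(28, 40), 0.70),
]

def reconstruction_scale(subband):
    """Look up the Table 1 constant for an enhancement layer subband."""
    for band_range, scale in RECONSTRUCTION_SCALE:
        if subband in band_range:
            return scale
    raise ValueError("subband %d is not an enhancement layer subband" % subband)

def reconstruct_lost_subband(subband, envelope, width=20, rng=None):
    """Fill a lost subband with coefficients of random sign and fixed amplitude."""
    rng = rng or random.Random()
    amplitude = envelope * reconstruction_scale(subband)
    return [amplitude if rng.random() < 0.5 else -amplitude for _ in range(width)]
```

Every reconstructed coefficient has the same magnitude, so the RMS of the filled subband equals the decoded envelope value scaled by the Table 1 constant.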
Step 611: apply the inverse MLT to the MLT coefficients of the core layer and enhancement layer subbands to obtain the decoded PCM signal.
Each inverse MLT operation processes 960 MLT coefficients and produces 960 time-domain audio samples. The inverse MLT can be decomposed into a type-IV DCT followed by a window, overlap and add operation. The type-IV DCT is:
$$u(n) = \sum_{m=0}^{959} \sqrt{\frac{2}{960}}\, \cos\!\left(\frac{\pi}{960}(m+0.5)(n+0.5)\right) \mathrm{mlt}(m), \quad 0 \le n < 960$$
The window, overlap and add operation uses half of the DCT output samples of the current frame and half of the DCT output samples of the previous frame:
$$y(n) = w(n)\,u(479-n) + w(959-n)\,u_{\mathrm{old}}(n), \quad 0 \le n \le 479$$
$$y(n+480) = w(480+n)\,u(n) - w(479-n)\,u_{\mathrm{old}}(479-n), \quad 0 \le n \le 479$$
where:
$$w(n) = \sin\!\left(\frac{\pi}{1920}(n+0.5)\right), \quad 0 \le n \le 959$$
The unused half of u(n) is stored as u_old for use by the next frame:
$$u_{\mathrm{old}}(n) = u(n+480), \quad 0 \le n \le 479$$
y(n) is the output PCM signal.
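The inverse MLT equations can be exercised together with the forward transform to confirm that overlap and add reconstructs the input with a one-frame delay. The sketch below is a pure-Python illustration with hypothetical function names. The frame size is inferred from the input length so that a small frame can be used for checking, and the sqrt(2/n) normalization is assumed.

```python
import math

def dct_iv(values):
    """Type-IV DCT with sqrt(2/n) scaling (its own inverse under this scaling)."""
    n = len(values)
    scale = math.sqrt(2.0 / n)
    return [
        scale * sum(math.cos(math.pi / n * (k + 0.5) * (i + 0.5)) * values[k]
                    for k in range(n))
        for i in range(n)
    ]

def window(n):
    return [math.sin(math.pi / (2 * n) * (i + 0.5)) for i in range(n)]

def mlt_frame(x):
    """Forward MLT of 2n samples: previous frame followed by current frame."""
    n = len(x) // 2
    half, w = n // 2, window(n)
    v = [0.0] * n
    for i in range(half):
        v[i] = w[half - 1 - i] * x[half - 1 - i] + w[half + i] * x[half + i]
        v[i + half] = w[n - 1 - i] * x[n + i] - w[i] * x[2 * n - 1 - i]
    return dct_iv(v)

def imlt_frame(mlt, u_old):
    """Inverse MLT of one frame; returns (y, new u_old for the next frame)."""
    n = len(mlt)
    half, w = n // 2, window(n)
    u = dct_iv(mlt)
    y = [0.0] * n
    for i in range(half):
        y[i] = w[i] * u[half - 1 - i] + w[n - 1 - i] * u_old[i]
        y[i + half] = w[half + i] * u[i] - w[half - 1 - i] * u_old[half - 1 - i]
    # The unused half of u is kept for the next frame.
    return y, u[half:]
```

In use, the decoder starts with `u_old` set to zeros of length n/2 and passes the returned state into the next call. Under these assumptions the output for a given frame equals the input of the frame before it, so the first output frame is silence.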
As the above embodiments show, the encoding scheme of the embodiments of the present invention transforms the input signal into MLT coefficients, divides them into a core layer signal and an enhancement layer signal according to the auditory perception model, and then produces the multiplexed and packed encoded data from the core layer signal, the enhancement layer signal and the auditory perception model. During decoding, the importance of each enhancement layer subband is weighted according to the auditory perception model, and the inverse MLT is then applied to the resulting core layer and enhancement layer MLT coefficients to output the decoded signal. Compared with existing layered encoding and decoding techniques, the embodiments of the present invention apply the MLT to the input signal and weight the importance of each enhancement layer subband according to the auditory perception model. This improves encoding and decoding quality and solves the problem that the prior art cannot effectively handle input signals with a high sampling rate. In addition, the present invention does not use QMF or CELP coding, which reduces encoding and decoding complexity and improves the encoding and decoding result.
The specific embodiments described above further explain the purpose, technical solution and beneficial effects of the present invention. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (20)

1. A layered audio encoding device, characterized in that the device comprises: a layering module based on an auditory perception model, an auditory perception model, a subband envelope calculation and encoding module, a core layer encoding module, an enhancement layer encoding module, and a bitstream multiplexing and packing module;
the layering module based on the auditory perception model applies a modulated lapped transform (MLT) to the input signal to transform it into MLT coefficients and then, according to the auditory perception model, divides them into a core layer signal and an enhancement layer signal;
the auditory perception model provides the basis for layering in the layering module based on the auditory perception model and the basis for subband importance weighting in the enhancement layer encoding module;
the subband envelope calculation and encoding module calculates the envelope value of each subband of the core layer signal and the enhancement layer signal input by the layering module based on the auditory perception model, sends the core layer signal and the envelope values of its subbands to the core layer encoding module, and sends the enhancement layer signal and the envelope values of its subbands to the enhancement layer encoding module; it also encodes each subband envelope value and sends the encoded data to the bitstream multiplexing and packing module;
the core layer encoding module encodes the input core layer signal according to the input envelope values of the core layer subbands and sends the result to the bitstream multiplexing and packing module;
the enhancement layer encoding module encodes the input enhancement layer signal according to the auditory perception model and the input envelope values of the enhancement layer subbands and sends the result to the bitstream multiplexing and packing module;
the bitstream multiplexing and packing module multiplexes and packs the encoded data of each core layer subband input by the core layer encoding module, the encoded data of each enhancement layer subband input by the enhancement layer encoding module, and the subband envelope encoded data input by the subband envelope calculation and encoding module.
2. The device as claimed in claim 1, characterized in that the layering module based on the auditory perception model comprises: an MLT module, a subband division module and a band importance layering module;
the MLT module applies the MLT to the input signal to transform it into MLT coefficients;
the subband division module divides the MLT coefficients of each frame into multiple equally spaced subbands;
the band importance layering module divides the MLT coefficients, already divided into multiple subbands, into a core layer signal and an enhancement layer signal according to the auditory perception model.
3. The device as claimed in claim 1, characterized in that the auditory perception model provides the subband division module with the basis for dividing the MLT coefficients at unequal intervals;
the layering module based on the auditory perception model comprises: an MLT module, a subband division module and a band importance layering module;
the MLT module applies the MLT to the input signal to transform it into MLT coefficients;
the subband division module divides the MLT coefficients of each frame into multiple unequally spaced subbands according to the auditory perception model;
the band importance layering module divides the MLT coefficients, already divided into multiple subbands, into a core layer signal and an enhancement layer signal according to the auditory perception model.
4. The device as claimed in claim 1, characterized in that the core layer encoding module comprises a subband bit allocation module and a quantization and encoding module;
the subband bit allocation module receives the core layer signal and the envelope values of the core layer subbands input by the subband envelope calculation and encoding module, allocates a number of bits to each subband according to the envelope values of the core layer subbands, and sends the bit count of each subband signal and the core layer signal to the quantization and encoding module;
the quantization and encoding module quantizes and encodes each subband signal of the input core layer signal according to the number of bits allocated to each core layer subband, and sends the encoded data of each core layer subband to the bitstream multiplexing and packing module.
5. The device as claimed in any one of claims 1 to 4, characterized in that the enhancement layer encoding module comprises a subband importance weighting module, a subband bit allocation module and a quantization and encoding module;
the subband importance weighting module receives the enhancement layer signal and the envelope values of the enhancement layer subbands input by the subband envelope calculation and encoding module, computes a weighted importance for each subband of the enhancement layer signal according to the input envelope values and the auditory perception model, and sends the importance weighting result of each enhancement layer subband and the enhancement layer signal to the subband bit allocation module;
the subband bit allocation module allocates a number of bits to each subband signal according to the importance weighting result of each subband of the enhancement layer signal, and sends the bit count of each subband signal and the enhancement layer signal to the quantization and encoding module;
the quantization and encoding module quantizes and encodes each subband signal of the enhancement layer signal according to the number of bits allocated to each enhancement layer subband signal, and sends the encoded data of each enhancement layer subband to the bitstream multiplexing and packing module.
6. A layered audio decoding device, characterized in that the device comprises: a bitstream demultiplexing module, a subband envelope decoding module, a core layer decoding module, an enhancement layer decoding module, an auditory perception model, and a modulated lapped transform (MLT) coefficient reconstruction and inverse transform module;
The bitstream demultiplexing module decomposes the received coded data into subband envelope coded data, core layer coded data, and enhancement layer coded data, and sends them to the subband envelope decoding module;
The subband envelope decoding module decodes the subband envelope coded data to obtain the envelope value of each subband, then sends the core layer coded data and the envelope value of each core layer subband to the core layer decoding module, and sends the enhancement layer coded data and the envelope value of each enhancement layer subband to the enhancement layer decoding module;
The core layer decoding module decodes the input core layer coded data according to the input envelope value of each core layer subband, obtains the decompressed MLT coefficients of each core layer subband, and sends them to the MLT coefficient reconstruction and inverse transform module;
The enhancement layer decoding module decodes the input enhancement layer coded data according to the auditory perception model and the input envelope value of each enhancement layer subband, obtains the decompressed MLT coefficients of each enhancement layer subband, and sends the MLT coefficients and the envelope value of each enhancement layer subband to the MLT coefficient reconstruction and inverse transform module;
The auditory perception model provides the basis for the subband importance weighting performed by the enhancement layer decoding module;
The MLT coefficient reconstruction and inverse transform module applies the inverse transform to the MLT coefficients of each core layer subband and of each enhancement layer subband to obtain the decompressed output signal.
7. The device according to claim 6, characterized in that the core layer decoding module comprises a subband bit allocation module, a subband data extraction module, and an inverse quantization and decoding module;
The subband bit allocation module receives the core layer coded data and the envelope value of each core layer subband from the subband envelope decoding module, allocates a number of bits to each subband according to the envelope value of each core layer subband, and sends the bit count information of each core layer subband, together with the core layer coded data, to the subband data extraction module;
The subband data extraction module extracts the coded data of each subband from the core layer coded data according to the number of bits occupied by each core layer subband, and sends the coded data of each core layer subband to the inverse quantization and decoding module;
The inverse quantization and decoding module inversely quantizes and decodes the coded data of each core layer subband, obtains the decompressed MLT coefficients of each core layer subband, and sends them to the MLT coefficient reconstruction and inverse transform module.
8. The device according to claim 6 or 7, characterized in that the enhancement layer decoding module comprises a subband importance weighting module, a subband bit allocation module, a subband data extraction module, and an inverse quantization and decoding module;
The subband importance weighting module receives the enhancement layer coded data and the envelope value of each enhancement layer subband from the subband envelope decoding module, computes a weighted importance for each subband of the enhancement layer coded data according to the envelope values and the auditory perception model, and sends the computed importance weighting result of each subband, the enhancement layer coded data, and the envelope value of each enhancement layer subband to the subband bit allocation module;
The subband bit allocation module allocates a number of bits to the coded data of each subband of the enhancement layer coded data according to the importance weighting result of each subband, and sends the importance weighting result of each subband, the bit count information of each subband's coded data, the enhancement layer coded data, and the envelope value of each enhancement layer subband to the subband data extraction module;
The subband data extraction module extracts the coded data of each subband of the enhancement layer coded data in descending order of subband importance, according to the number of bits occupied by each corresponding subband, and sends the coded data and the envelope value of each enhancement layer subband to the inverse quantization and decoding module;
The inverse quantization and decoding module inversely quantizes and decodes the input coded data of each enhancement layer subband, obtains the decompressed MLT coefficients of each enhancement layer subband, and sends these MLT coefficients and the input envelope value of each enhancement layer subband to the MLT coefficient reconstruction and inverse transform module.
9. The device according to claim 8, characterized in that the MLT coefficient reconstruction and inverse transform module comprises an MLT coefficient reconstruction module and an MLT inverse transform module;
The MLT coefficient reconstruction module, according to the input envelope value of each enhancement layer subband, rearranges the MLT coefficients of each core layer and enhancement layer subband in frequency band order and sends them to the MLT inverse transform module;
The MLT inverse transform module applies the inverse MLT to the MLT coefficients of each core layer and enhancement layer subband to obtain the decompressed output signal.
10. The device according to claim 9, characterized in that the auditory perception model provides the basis on which the MLT coefficient reconstruction module compensates for lost enhancement layer MLT coefficients;
The MLT coefficient reconstruction module compensates for the lost enhancement layer MLT coefficients according to the auditory perception model.
11. A layered audio encoding and decoding method, characterized in that the method comprises:
A. after applying the modulated lapped transform (MLT) to the input signal, dividing it into a core layer signal and an enhancement layer signal according to an auditory perception model, and obtaining the coded data of each subband envelope value from the core layer signal and the enhancement layer signal;
B. obtaining the coded data of each core layer subband according to the core layer signal and the envelope value of each subband of the core layer signal; obtaining the coded data of each enhancement layer subband according to the enhancement layer signal, the auditory perception model, and the envelope value of each subband of the enhancement layer signal; and multiplexing and packing together the coded data of each subband envelope value obtained in step A, the coded data of each core layer subband, and the coded data of each enhancement layer subband, then sending the result to the decoding end.
12. The method according to claim 11, characterized in that, after the MLT is applied to the input signal in step A, the method further comprises: dividing each frame of MLT coefficients obtained from the MLT into a plurality of equal-width subbands, or dividing each frame of MLT coefficients obtained from the MLT into a plurality of unequal-width subbands according to the auditory perception model.
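The subband division in claim 12 can be illustrated with a minimal sketch. The function names, the list-of-floats representation of a frame, and the explicit `edges` boundary list are assumptions made for illustration; the claim does not say how the auditory perception model chooses the unequal boundaries.

```python
def split_equal_subbands(coeffs, width):
    """Split one frame of MLT coefficients into consecutive subbands of `width` coefficients."""
    if width <= 0 or len(coeffs) % width != 0:
        raise ValueError("frame length must be a positive multiple of the subband width")
    return [coeffs[i:i + width] for i in range(0, len(coeffs), width)]


def split_unequal_subbands(coeffs, edges):
    """Split one frame at the given boundaries.

    `edges` is a hypothetical ascending list of coefficient indices that starts
    at 0 and ends at len(coeffs); a perceptual model would supply it in practice.
    """
    if not edges or edges[0] != 0 or edges[-1] != len(coeffs):
        raise ValueError("edges must start at 0 and end at the frame length")
    return [coeffs[a:b] for a, b in zip(edges, edges[1:])]
```

Narrow low-frequency bands and wider high-frequency bands are a common choice for the unequal case, but that choice is not stated in the claims.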
13. The method according to claim 12, characterized in that the coded data of each subband envelope value in step A is obtained as follows:
The envelope value of each subband of the core layer signal and of the enhancement layer signal is calculated, and each subband envelope value is encoded to obtain the coded data of each subband envelope value;
The coded data of each core layer subband in step B is obtained as follows:
A number of bits is allocated to each subband of the core layer signal according to the envelope value of each subband of the core layer signal;
Each subband signal of the core layer signal is quantized and encoded according to the number of bits of each subband of the core layer signal, to obtain the coded data of each core layer subband;
The coded data of each enhancement layer subband in step B is obtained as follows:
A weighted importance is computed for each subband of the enhancement layer signal according to the auditory perception model and the envelope value of each subband of the enhancement layer signal;
A number of bits is allocated to each subband signal according to the computed importance weighting result of each subband of the enhancement layer signal;
Each subband signal of the enhancement layer signal is quantized and encoded according to the number of bits of each enhancement layer subband signal, to obtain the coded data of each enhancement layer subband.
14. The method according to any one of claims 11 to 13, characterized in that the multiplexing and packing of step B comprises:
Placing the coded data of each subband envelope value after the frame header of the bitstream, placing the coded data of each core layer subband after the coded data of the subband envelope values, and placing the coded data of each enhancement layer subband after the coded data of the core layer subbands.
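The frame layout described in claim 14 amounts to a fixed concatenation order. The sketch below shows only that order; representing each field as a string of '0' and '1' characters, and the name `pack_frame`, are illustrative assumptions and not part of the claims.

```python
def pack_frame(header, envelope_bits, core_subbands, enhancement_subbands):
    """Concatenate one frame: header, envelope data, core layer, enhancement layer.

    `header` and `envelope_bits` are bit strings; `core_subbands` and
    `enhancement_subbands` are lists of per-subband bit strings, already in
    the order in which they should appear in the bitstream.
    """
    return (header
            + envelope_bits
            + "".join(core_subbands)
            + "".join(enhancement_subbands))
```

Because the enhancement layer comes last, truncating the tail of a frame would remove enhancement data before it touches the envelope or core layer data.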
15. The method according to claim 14, characterized in that the enhancement layer coded data is inserted as follows:
The coded data of each enhancement layer subband is inserted into the bitstream in descending order of subband importance. Before the coded data of a given enhancement layer subband is inserted, the sum of the number of bits already used in the bitstream of the current frame and the number of bits of that subband is computed and compared with the total number of bits available for the current frame. If the sum is less than or equal to the total number of bits, the coded data of the current subband is inserted into the bitstream, the number of used bits is updated to the sum of the previously used bits and the bits of that subband's coded data, and insertion continues with the coded data of the next subband; otherwise, insertion of subband coded data stops.
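The insertion rule of claim 15 is a greedy loop over subbands sorted by importance, bounded by the frame's bit budget. A minimal sketch follows; the `(importance, payload)` tuple layout and the bit-string payloads are assumptions made for illustration.

```python
def insert_enhancement_subbands(subbands, used_bits, total_bits):
    """Insert enhancement subbands in descending importance until one does not fit.

    subbands   : list of (importance, payload) pairs, payload being a '0'/'1' string
    used_bits  : bits already used in this frame (header, envelopes, core layer)
    total_bits : total bits available for this frame
    Returns the inserted payloads in bitstream order and the updated used-bit count.
    """
    inserted = []
    for _importance, payload in sorted(subbands, key=lambda s: s[0], reverse=True):
        if used_bits + len(payload) <= total_bits:
            inserted.append(payload)
            used_bits += len(payload)
        else:
            # The claim stops at the first subband that exceeds the budget;
            # it does not try smaller, less important subbands afterwards.
            break
    return inserted, used_bits
```

For example, with 10 bits already used and a 16-bit frame, subbands of importance 0.9 (3 bits), 0.5 (2 bits), and 0.2 (4 bits) yield the first two payloads and a used-bit count of 15.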
16. The method according to claim 11, characterized in that the method further comprises, after step B:
a. demultiplexing the packed data transmitted by the encoding end, computing, according to the auditory perception model, the importance of each subband of the demultiplexed enhancement layer coded data, and obtaining the MLT coefficients of each core layer and enhancement layer subband;
b. rearranging the MLT coefficients of each core layer and enhancement layer subband in frequency band order, applying the inverse MLT to the MLT coefficients, and outputting the decompressed bitstream.
17. The method according to claim 16, characterized in that, after the packed data is demultiplexed in step a, the method further comprises:
Decoding the coded data of each subband envelope value obtained by demultiplexing, to obtain the envelope value of each subband;
In step a, the MLT coefficients of each core layer subband are obtained as follows:
A number of bits is allocated to each subband of the demultiplexed core layer coded data according to the envelope value of each subband of the core layer coded data;
The coded data of each subband of the core layer coded data is extracted according to the number of bits occupied by each subband of the core layer coded data;
The extracted coded data of each core layer subband is inversely quantized and decoded to obtain the decompressed MLT coefficients of each core layer subband;
In step a, the MLT coefficients of each enhancement layer subband are obtained according to the auditory perception model as follows:
A weighted importance is computed for each subband of the enhancement layer coded data according to the auditory perception model and the envelope value of each enhancement layer subband;
A number of bits is allocated to the coded data of each enhancement layer subband according to the importance of each subband of the enhancement layer coded data;
The coded data of each subband of the enhancement layer coded data is extracted in descending order of subband importance, according to the number of bits occupied by each enhancement layer subband;
The extracted coded data of each enhancement layer subband is inversely quantized and decoded to obtain the decompressed enhancement layer MLT coefficients.
18. The method according to claim 13 or 17, characterized in that the weighted importance of each subband is computed by multiplying the envelope value of each enhancement layer subband by a weighting value to obtain the importance weighting result of each enhancement layer subband, the weighting value being determined according to the auditory perception model.
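The weighting in claim 18 is a per-subband product of envelope value and perceptual weight. The sketch below assumes the weights are already available as a list; how the auditory perception model derives them is not specified in the claims, and the function names are hypothetical.

```python
def weight_importance(envelopes, weights):
    """Return envelope[i] * weight[i] for each enhancement layer subband."""
    if len(envelopes) != len(weights):
        raise ValueError("one weight is required per subband")
    return [e * w for e, w in zip(envelopes, weights)]


def importance_order(importance):
    """Return subband indices sorted from most to least important."""
    return sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
```

Since encoder and decoder both hold the decoded envelope values and the same perceptual model, both can derive the same ordering without transmitting it, which is what lets the decoder in claim 17 repeat the encoder's importance calculation.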
19. The method according to claim 18, characterized in that the coded data of each subband of the enhancement layer coded data is extracted as follows:
The sum of the number of bits already extracted from the bitstream of the current frame and the number of bits occupied by the coded data of the next enhancement layer subband to be extracted is computed and compared with the total number of bits of the bitstream of the current frame. If the sum is greater than the total number of bits, extraction stops; otherwise the coded data of that subband is extracted, the number of extracted bits is updated to the sum of the previously extracted bits and the bits occupied by that subband's coded data, and extraction continues with the coded data of the next subband of the enhancement layer coded data.
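The extraction rule of claim 19 mirrors the encoder-side insertion loop. A minimal sketch under illustrative assumptions: the frame is a string of '0' and '1' characters, `order` lists subband indices from most to least important, and `bits_per_subband` holds the allocation that the decoder has recomputed for each subband.

```python
def extract_enhancement_subbands(stream, extracted_bits, order, bits_per_subband, total_bits):
    """Extract enhancement subbands in importance order until the frame is exhausted.

    stream           : bit string for the whole frame
    extracted_bits   : bits already consumed (header, envelopes, core layer)
    order            : subband indices, most important first
    bits_per_subband : bit count allocated to each subband, indexed by subband
    total_bits       : total number of bits in this frame
    Returns a dict {subband index: payload} and the updated extracted-bit count.
    """
    payloads = {}
    for idx in order:
        n = bits_per_subband[idx]
        if extracted_bits + n > total_bits:
            break  # this subband and all less important ones were not transmitted
        payloads[idx] = stream[extracted_bits:extracted_bits + n]
        extracted_bits += n
    return payloads, extracted_bits
```

Subbands missing from the returned dictionary are the ones a decoder would treat as lost.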
20. The method according to claim 19, characterized in that, when enhancement layer subband data of lower importance is lost during encoding or transmission, step b further comprises, after rearranging the MLT coefficients of each core layer and enhancement layer subband in frequency band order, compensating the lost enhancement layer MLT coefficients as follows:
The sign of each MLT coefficient is chosen at random, and the envelope value multiplied by a proportionality constant is used as the amplitude of the MLT coefficient, the proportionality constant being determined according to the auditory perception model.
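The compensation in claim 20 fills a lost subband with coefficients of random sign and envelope-scaled amplitude. The sketch below is one possible reading; the parameter names, the per-subband `width`, and the optional seeded generator are assumptions added for illustration, and the value of the proportionality constant is left to the caller because the claims only say it comes from the auditory perception model.

```python
import random


def compensate_lost_subband(envelope, width, scale, rng=None):
    """Synthesize MLT coefficients for a lost enhancement layer subband.

    envelope : decoded envelope value of the subband
    width    : number of MLT coefficients in the subband
    scale    : proportionality constant (hypothetical value supplied by the caller)
    rng      : optional random.Random instance, useful for reproducible tests
    """
    rng = rng or random.Random()
    amplitude = scale * envelope
    # Each coefficient gets the same magnitude and an independently chosen sign.
    return [rng.choice((-1.0, 1.0)) * amplitude for _ in range(width)]
```

This works even when the subband payload is missing because the envelope values are carried at the front of the frame, ahead of the core and enhancement layer data.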
CNA2006101678915A 2006-12-20 2006-12-20 Method and apparatus for encoding and decoding layered audio Pending CN101206860A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CNA2006101678915A CN101206860A (en) 2006-12-20 2006-12-20 Method and apparatus for encoding and decoding layered audio
PCT/CN2007/071154 WO2008074251A1 (en) 2006-12-20 2007-11-29 A hierarchical coding decoding method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA2006101678915A CN101206860A (en) 2006-12-20 2006-12-20 Method and apparatus for encoding and decoding layered audio

Publications (1)

Publication Number Publication Date
CN101206860A true CN101206860A (en) 2008-06-25

Family

ID=39536002

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2006101678915A Pending CN101206860A (en) 2006-12-20 2006-12-20 Method and apparatus for encoding and decoding layered audio

Country Status (2)

Country Link
CN (1) CN101206860A (en)
WO (1) WO2008074251A1 (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010075777A1 (en) * 2008-12-30 2010-07-08 华为技术有限公司 Method, device and system for signal encoding and decoding
CN101923859A (en) * 2009-06-11 2010-12-22 索尼公司 Voice data receiving apparatus, reception method, and voice data sending and receiving system
WO2011063694A1 (en) * 2009-11-27 2011-06-03 中兴通讯股份有限公司 Hierarchical audio coding, decoding method and system
CN102222505A (en) * 2010-04-13 2011-10-19 中兴通讯股份有限公司 Hierarchical audio coding and decoding methods and systems and transient signal hierarchical coding and decoding methods
WO2013023595A1 (en) * 2011-08-17 2013-02-21 北京泰美世纪科技有限公司 Method and apparatus for frequency synchronization and receiving of digital audio broadcast signal
WO2013189030A1 (en) * 2012-06-19 2013-12-27 深圳广晟信源技术有限公司 Monophonic or stereo audio coding method
CN103489450A (en) * 2013-04-07 2014-01-01 杭州微纳科技有限公司 Wireless audio compression and decompression method based on time domain aliasing elimination and equipment thereof
WO2014005327A1 (en) * 2012-07-06 2014-01-09 深圳广晟信源技术有限公司 Method for encoding multichannel digital audio
WO2015081699A1 (en) * 2013-12-02 2015-06-11 华为技术有限公司 Encoding method and apparatus
CN105957533A (en) * 2016-04-22 2016-09-21 杭州微纳科技股份有限公司 Speech compression method, speech decompression method, audio encoder, and audio decoder
CN106605263A (en) * 2014-07-29 2017-04-26 奥兰吉公司 Determining a budget for LPD/FD transition frame encoding
CN107180637A (en) * 2012-05-14 2017-09-19 杜比国际公司 Method and apparatus for compressing and decompressing a Higher Order Ambisonics signal representation
CN109036457A (en) * 2018-09-10 2018-12-18 广州酷狗计算机科技有限公司 Method and apparatus for restoring audio signal
CN110797004A (en) * 2018-08-01 2020-02-14 百度在线网络技术(北京)有限公司 Data transmission method and device

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2348504B1 (en) 2009-03-27 2014-01-08 Huawei Technologies Co., Ltd. Encoding and decoding method and device
CN111402907B (en) * 2020-03-13 2023-04-18 大连理工大学 G.722.1-based multi-description speech coding method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1183685C (en) * 1998-05-27 2005-01-05 微软公司 System and method for entropy encoding quantized transform coefficients of a signal
DE60214599T2 (en) * 2002-03-12 2007-09-13 Nokia Corp. SCALABLE AUDIO CODING
US7395210B2 (en) * 2002-11-21 2008-07-01 Microsoft Corporation Progressive to lossless embedded audio coder (PLEAC) with multiple factorization reversible transform
EP1619664B1 (en) * 2003-04-30 2012-01-25 Panasonic Corporation Speech coding apparatus, speech decoding apparatus and methods thereof
US8446947B2 (en) * 2003-10-10 2013-05-21 Agency For Science, Technology And Research Method for encoding a digital signal into a scalable bitstream; method for decoding a scalable bitstream
CN101138174B (en) * 2005-03-14 2013-04-24 松下电器产业株式会社 Scalable decoder and scalable decoding method

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8380526B2 (en) 2008-12-30 2013-02-19 Huawei Technologies Co., Ltd. Method, device and system for enhancement layer signal encoding and decoding
WO2010075777A1 (en) * 2008-12-30 2010-07-08 华为技术有限公司 Method, device and system for signal encoding and decoding
CN101923859A (en) * 2009-06-11 2010-12-22 索尼公司 Voice data receiving apparatus, reception method, and voice data sending and receiving system
WO2011063694A1 (en) * 2009-11-27 2011-06-03 中兴通讯股份有限公司 Hierarchical audio coding, decoding method and system
US8694325B2 (en) 2009-11-27 2014-04-08 Zte Corporation Hierarchical audio coding, decoding method and system
CN102081927B (en) * 2009-11-27 2012-07-18 中兴通讯股份有限公司 Layering audio coding and decoding method and system
CN102222505B (en) * 2010-04-13 2012-12-19 中兴通讯股份有限公司 Hierarchical audio coding and decoding methods and systems and transient signal hierarchical coding and decoding methods
WO2011127757A1 (en) * 2010-04-13 2011-10-20 中兴通讯股份有限公司 Hierarchical audio frequency encoding and decoding method and system, hierarchical frequency encoding and decoding method for transient signal
CN102222505A (en) * 2010-04-13 2011-10-19 中兴通讯股份有限公司 Hierarchical audio coding and decoding methods and systems and transient signal hierarchical coding and decoding methods
RU2522020C1 (en) * 2010-04-13 2014-07-10 ЗетТиИ Корпорейшн Hierarchical audio frequency encoding and decoding method and system, hierarchical frequency encoding and decoding method for transient signal
US8874450B2 (en) 2010-04-13 2014-10-28 Zte Corporation Hierarchical audio frequency encoding and decoding method and system, hierarchical frequency encoding and decoding method for transient signal
WO2013023595A1 (en) * 2011-08-17 2013-02-21 北京泰美世纪科技有限公司 Method and apparatus for frequency synchronization and receiving of digital audio broadcast signal
CN107180637A (en) * 2012-05-14 2017-09-19 杜比国际公司 Method and apparatus for compressing and decompressing a Higher Order Ambisonics signal representation
US11234091B2 (en) 2012-05-14 2022-01-25 Dolby Laboratories Licensing Corporation Method and apparatus for compressing and decompressing a Higher Order Ambisonics signal representation
US11792591B2 (en) 2012-05-14 2023-10-17 Dolby Laboratories Licensing Corporation Method and apparatus for compressing and decompressing a higher order Ambisonics signal representation
CN104170007A (en) * 2012-06-19 2014-11-26 深圳广晟信源技术有限公司 Monophonic or stereo audio coding method
WO2013189030A1 (en) * 2012-06-19 2013-12-27 深圳广晟信源技术有限公司 Monophonic or stereo audio coding method
CN104170007B (en) * 2012-06-19 2017-09-26 深圳广晟信源技术有限公司 Monophonic or stereo audio coding method
WO2014005327A1 (en) * 2012-07-06 2014-01-09 深圳广晟信源技术有限公司 Method for encoding multichannel digital audio
CN103489450A (en) * 2013-04-07 2014-01-01 杭州微纳科技有限公司 Wireless audio compression and decompression method based on time domain aliasing elimination and equipment thereof
WO2015081699A1 (en) * 2013-12-02 2015-06-11 华为技术有限公司 Encoding method and apparatus
US9754594B2 (en) 2013-12-02 2017-09-05 Huawei Technologies Co., Ltd. Encoding method and apparatus
RU2636697C1 (en) * 2013-12-02 2017-11-27 Хуавэй Текнолоджиз Ко., Лтд. Device and method for coding
US10347257B2 (en) 2013-12-02 2019-07-09 Huawei Technologies Co., Ltd. Encoding method and apparatus
US11289102B2 (en) 2013-12-02 2022-03-29 Huawei Technologies Co., Ltd. Encoding method and apparatus
CN106605263B (en) * 2014-07-29 2020-11-27 奥兰吉公司 Determining budget for encoding LPD/FD transition frames
CN106605263A (en) * 2014-07-29 2017-04-26 奥兰吉公司 Determining a budget for LPD/FD transition frame encoding
CN105957533B (en) * 2016-04-22 2020-11-10 杭州微纳科技股份有限公司 Voice compression method, voice decompression method, audio encoder and audio decoder
CN105957533A (en) * 2016-04-22 2016-09-21 杭州微纳科技股份有限公司 Speech compression method, speech decompression method, audio encoder, and audio decoder
CN110797004A (en) * 2018-08-01 2020-02-14 百度在线网络技术(北京)有限公司 Data transmission method and device
CN109036457A (en) * 2018-09-10 2018-12-18 广州酷狗计算机科技有限公司 Method and apparatus for restoring audio signal
CN109036457B (en) * 2018-09-10 2021-10-08 广州酷狗计算机科技有限公司 Method and apparatus for restoring audio signal

Also Published As

Publication number Publication date
WO2008074251A1 (en) 2008-06-26

Similar Documents

Publication Publication Date Title
CN101206860A (en) Method and apparatus for encoding and decoding layered audio
CN101615396B (en) Voice encoding device and voice decoding device
CN1973319B (en) Method and apparatus to encode and decode multi-channel audio signals
CN100454389C (en) Sound encoding apparatus and sound encoding method
CN101140759B (en) Band-width spreading method and system for voice or audio signal
CN101501763B (en) Audio codec post-filter
CN101836251B (en) Scalable speech and audio encoding using combinatorial encoding of MDCT spectrum
CN101622661B (en) Advanced encoding / decoding of audio digital signals
CN103106902B (en) Low bit-rate audio signal coding/decoding method
CN103415884B (en) Device and method for execution of huffman coding
CN101779236A (en) Temporal masking in audio coding based on spectral dynamics in frequency sub-bands
CN102436819B (en) Wireless audio compression and decompression methods, audio coder and audio decoder
CN105070293A (en) Audio bandwidth extension coding and decoding method and device based on deep neutral network
WO2009029557A1 (en) Spectral noise shaping in audio coding based on spectral dynamics in frequency sub-bands
CN1918630B (en) Method and device for quantizing an information signal
CN104025190A (en) Energy lossless-encoding method and apparatus, audio encoding method and apparatus, energy lossless-decoding method and apparatus, and audio decoding method and apparatus
CN101527138A (en) Coding method and decoding method for ultra wide band expansion, coder and decoder as well as system for ultra wide band expansion
CN105280190A (en) Bandwidth extension encoding and decoding method and device
CN101436406B (en) Audio encoder and decoder
CN105957533B (en) Voice compression method, voice decompression method, audio encoder and audio decoder
CN1318904A (en) Practical sound coder based on wavelet conversion
CN101527139A (en) Audio encoding and decoding method and device thereof
Venkateswaran et al. An Efficient Time Domain Speech Compression Algorithm Based on LPC and Sub-Band Coding Techniques.
CN103474079A (en) Voice encoding method
Dhubkarya et al. HIGH QUALITY AUDIO CODING AT LOW BIT RATE USING WAVELET AND WAVELET PACKET TRANSFORM.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20080625