EP2023339B1 - Low-delay audio coder - Google Patents


Publication number
EP2023339B1
EP2023339B1 EP07113397A EP07113397A EP2023339B1 EP 2023339 B1 EP2023339 B1 EP 2023339B1 EP 07113397 A EP07113397 A EP 07113397A EP 07113397 A EP07113397 A EP 07113397A EP 2023339 B1 EP2023339 B1 EP 2023339B1
Authority
EP
European Patent Office
Prior art keywords
model
distribution model
distribution
signal
combined
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
EP07113397A
Other languages
German (de)
English (en)
Other versions
EP2023339A1 (fr)
Inventor
Willem Bastiaan Kleijn
Li Minyue
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Global IP Solutions GIPS AB
Global IP Solutions Inc
Original Assignee
Global IP Solutions GIPS AB
Global IP Solutions Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Global IP Solutions GIPS AB, Global IP Solutions Inc
Priority to EP07113397A (patent EP2023339B1)
Priority to DE602007008717T (patent DE602007008717D1)
Priority to AT07113397T (patent ATE479182T1)
Priority to PCT/EP2008/057970 (patent WO2009015944A1)
Priority to US12/671,631 (patent US8463615B2)
Publication of EP2023339A1
Application granted
Publication of EP2023339B1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/0017 Lossless audio signal coding; Perfect reconstruction of coded audio signal by transmission of coding error
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R25/00 Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception
    • H04R25/55 Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception using an external connection, either wireless or wired
    • H04R25/552 Binaural
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2225/00 Details of deaf aids covered by H04R25/00, not provided for in any of its subgroups
    • H04R2225/55 Communication between hearing aids and external devices via a network for data exchange

Definitions

  • the present invention relates generally to methods and devices for encoding and decoding audio signals.
  • the present invention relates to coders and decoders for reducing bit rate variations during the encoding and decoding procedures of speech signals.
  • Coding of a digital audio signal is commonly based on the use of a signal model to reduce bit rate (also called “rate” in the following) and maintain high signal quality.
  • a signal model enables the transformation of data to new data that are more amenable to coding or the definition of a distribution of the digital audio signal, which distribution can be used in coding.
  • the signal model may be used for linear prediction, which removes dependencies among samples of the digital audio signal (a method called linear predictive encoding).
  • the signal model may be used to provide a probability distribution of a signal segment of the digital audio signal to a quantizer, thereby facilitating the computation of the quantizer which operates either directly on the signal or on a unitary transform of the signal (method called adaptive encoding).
  • Delay is an important factor in many applications of coding of audio signals.
  • the delay is particularly critical.
  • backward signal analysis backward adaptive encoding
  • signal reconstruction in the following.
  • Coding methods are commonly divided into two classes, namely variable-rate coding, which corresponds to constrained-entropy quantization, and fixed-rate coding, which corresponds to constrained-resolution quantization.
  • the behaviour of these two coding methods can be analysed for the so-called high-rate case, which is often considered to be a good approximation of the low-rate case.
  • a constrained-resolution quantizer minimizes the distortion under a fixed-rate constraint, which, at high rate, results generally in non-uniform cell sizes.
  • a constrained-entropy quantizer minimizes the distortion under an average rate (the quantization index entropy) constraint.
  • the instant rate varies over time, which, at high-rate, generally results in an uncountable set of quantization cells of uniform size and shape while redundancy removal is left to lossless coding.
  • constrained-entropy quantization provides a (nearly) constant distortion, which is especially beneficial when the signal model or probabilistic signal model is not optimal.
  • a non-optimal probabilistic signal model leads also to an increase in bit rate in the case of constrained-entropy coding.
  • constrained-resolution quantization leads to an increased distortion while keeping a constant rate when the probabilistic signal model is not optimal.
  • speech and audio signals display so-called transitions, at which the optimal probabilistic signal model would change abruptly. If the model is not updated immediately at a transition, the quality of the encoding degrades in the constrained-resolution case (increased distortion) while the bit rate increases in the constrained-entropy case.
  • the problem at transitions is particularly significant when the probabilistic signal model is updated by a backward signal analysis.
  • the problem at transitions leads to error propagation since the signal reconstruction is inaccurate because the signal model is inaccurate, and the signal model is inaccurate because the signal reconstruction is inaccurate. Thus, it takes a relatively long time for the coder to retrieve a good signal quality.
  • With constrained-entropy quantization, there is little error propagation, but the bit rate increases significantly at abrupt transitions (resulting in bit rate peaks).
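  • The rate penalty of a mismatched model in the constrained-entropy case can be made concrete with the ideal code length of -log2 q(i) bits for an index i coded under a model q. The following is a minimal sketch, assuming a discrete index distribution; the two distributions and the function name are hypothetical illustrations and do not come from the patent.

```python
import math


def average_code_length(true_probs, model_probs):
    """Average ideal code length in bits per index when indices drawn from
    true_probs are coded with lengths -log2(model_probs[i])."""
    return -sum(p * math.log2(q)
                for p, q in zip(true_probs, model_probs) if p > 0)


# Hypothetical index distribution just after a transition.
true_probs = [0.5, 0.25, 0.125, 0.125]
# Hypothetical stale model that was fitted before the transition.
stale_model = [0.125, 0.125, 0.25, 0.5]

matched = average_code_length(true_probs, true_probs)      # 1.75 bits (the entropy)
mismatched = average_code_length(true_probs, stale_model)  # 2.625 bits
```

  • In this toy case the stale model costs 0.875 bit per index more than the matched model, which is the kind of rate peak that a late model update produces at a transition.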
  • US patent application 2007/0016418 discloses using either a fixed or an adaptive entropy coding.
  • An object of the present invention is to wholly or partly overcome the above disadvantages and drawbacks of the prior art and to provide improved methods and devices for encoding and decoding audio signals.
  • the present invention provides methods and apparatus that reduce bit rate variation, such as bit rate peaks, when coding an input signal based on variable-rate quantization, while maintaining a high average compression rate.
  • the methods and apparatus provided by the present invention reduce the propagation of errors caused by packet loss or channel errors, in particular in audio coding of an input signal based on fixed-rate quantization, while maintaining a high average compression rate.
  • a method for encoding an input signal is provided in accordance with appended claim 1.
  • an apparatus for encoding an input signal is provided in accordance with appended claim 16.
  • a method for decoding a bit stream of coded data is provided in accordance with appended claim 36.
  • an apparatus for decoding a bit stream of coded data is provided in accordance with appended claim 46.
  • a computer readable medium is provided in accordance with appended claim 58.
  • a computer readable medium is provided in accordance with appended claim 59.
  • An advantage of the present invention is to remove bit rate peaks associated with transitions in audio coding for constrained-entropy encoding without increasing the average bit rate significantly.
  • the present invention is based on an insight that the rate increases at transitions because of the non-optimality of the probabilistic signal model obtained with backward adaptation (or backward adaptive encoding).
  • quantizers are designed based on a probabilistic signal model, their performance varies with the accuracy of the model.
  • the optimal model for a given distortion is the model that provides the lowest bit rate.
  • the probabilistic signal model used in backward adaptive encoding is generally not the probabilistic signal model leading to the lowest bit rate, which results in significant rate peaks at transitions.
  • the present invention is advantageous since flexibility is introduced in the determination of the probabilistic signal model using a low rate of side information.
  • This flexibility is introduced by encoding a current signal segment of the input signal using a combined distribution model obtained by adding at least one first distribution model and at least one fixed distribution model, each of which is assigned a weighting coefficient.
  • the first distribution model is associated with model parameters extracted from a reconstructed signal generated from past signal segments of the input signal.
  • the probabilistic signal model or combined distribution model used to encode the current signal segment takes into account past signal segments of the input signal and is also based on other signal models.
  • weighting coefficients assigned to the first and the fixed distribution models may be selected to minimize an estimated code length for the current signal segment.
  • the probabilistic model or combined distribution model comprises a sum of probability distributions, which is also referred to as a sum of distribution models, each multiplied by a coefficient. At least one of the distribution models is obtained based on the past coded signal. Good or optimal values for the coefficients may be computed by a modeller.
  • the probabilistic model is preferably based on at least one of the following: i) a distribution model generated based on a reconstructed signal (which can be available at both the encoder and the decoder), ii) information stored at both the encoder and the decoder (for example a fixed distribution model characteristic of the input signal), and iii) transmitted information.
  • the combined distribution model or probabilistic model may be created by combining, in a manner specified in information transmitted from the encoder to the decoder, a distribution based on a reconstructed signal and one or more fixed distribution models known at both the encoder and the decoder.
  • the combined distribution model may be a mixture model further including at least one adaptive distribution model selected in response to the model parameters extracted from the reconstructed signal, to which adaptive distribution model a weighting factor is assigned. This is advantageous since one more component is included in the combined distribution model, thereby increasing the flexibility of the signal model.
  • the combined distribution model is selected from a plurality of combined distribution models in response to a code length of a subsegment of the current signal segment and a code length used for describing the distribution model of the reconstructed signal.
  • the plurality of combined distribution models may be obtained by varying the values of a set of weighting coefficients associated with a particular signal model.
  • the proposed signal representation, i.e. the combined distribution model, decreases the code length for the signal segments or blocks near transitions for backward adaptive encoding and may also decrease the average rate because the probabilistic signal model is closer to optimal.
  • the information concerning the values of the weighting coefficients may be transmitted as side information in the form of one or more quantization indices.
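  • The selection of weighting coefficients by estimated code length can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: both component models are taken to be zero-mean Gaussians, the two weights sum to one, the weights come from a small hypothetical codebook whose index is sent with a fixed-length code, and the cell probability is approximated by density times step size (a high-rate approximation). All names are hypothetical.

```python
import math


def gaussian_pdf(x, std):
    z = x / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def estimated_code_length(samples, weight, adaptive_std, fixed_std, step):
    """Approximate bits needed for the samples under the mixture
    weight * adaptive + (1 - weight) * fixed, for a uniform quantizer."""
    bits = 0.0
    for x in samples:
        density = (weight * gaussian_pdf(x, adaptive_std)
                   + (1.0 - weight) * gaussian_pdf(x, fixed_std))
        density = max(density, 1e-300)  # guard against underflow to zero
        bits += -math.log2(density * step)
    return bits


def select_weight_index(samples, codebook, adaptive_std, fixed_std, step):
    """Return (index, total_bits) of the codebook weight with the lowest
    estimated code length, including the side information for the index."""
    index_bits = math.log2(len(codebook))  # fixed-length model index
    costs = [estimated_code_length(samples, w, adaptive_std, fixed_std, step)
             + index_bits for w in codebook]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]


codebook = [0.0, 0.3, 0.7, 1.0]           # weight of the adaptive model
transition = [2.0, -1.5, 1.0, -2.5]       # loud block, adaptive model still "quiet"
steady = [0.05, -0.1, 0.02, 0.08]         # block that matches the adaptive model
at_transition = select_weight_index(transition, codebook, 0.1, 2.0, 0.1)
in_steady_state = select_weight_index(steady, codebook, 0.1, 2.0, 0.1)
```

  • In this toy setting the search picks the fixed model (index 0) at the transition and the backward-adapted model (index 3) in the steady state, so only the two-bit index has to be sent as side information.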
  • the information about the combined distribution model may be transmitted in the form of a model index, which will then be used at a decoder or apparatus for decoding the transmitted data or stored at the encoder.
  • the weighting coefficients may be biased for minimizing the propagation of errors caused by packet loss and channel errors.
  • the weighting coefficient assigned to the first distribution model may be biased towards a value of zero or compared to a threshold value below which it is set to zero.
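  • One possible reading of this biasing is sketched below. The threshold and shrink factor are hypothetical parameters chosen for illustration, and the function combines the two alternatives (thresholding and shrinking toward zero) that the text lists separately.

```python
def bias_adaptive_weight(weight, threshold=0.2, shrink=0.9):
    """Bias the weight of the backward-adapted (first) distribution model.

    Weights below `threshold` are set to zero; all other weights are
    multiplied by `shrink`, which pulls them toward zero.
    """
    if weight < threshold:
        return 0.0
    return weight * shrink
```

  • A zero weight means the segment is coded with the fixed model alone, which presumably is what limits error propagation, since the fixed model does not depend on a possibly corrupted reconstructed signal.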
  • An advantage of the present invention is to provide methods and devices for encoding and decoding audio signals that exhibit low delay, a low average bit rate and low rate variations.
  • the present invention is suitable for both constrained-resolution quantization and constrained-entropy quantization.
  • the invention has broad applications for audio coding, in particular coding based on variable bit rate. It is applicable to low delay audio coding, where backward model adaptation is often selected to reduce the bit rate. Low delay coding is applicable in, for example, a scenario where the listener perceives an audio signal both through an acoustic path and through a communication network or for inter-ear communication for hearing aids, where delay affects spatial perception.
  • Fig. 1 shows an apparatus or system 10 for encoding an input signal 120, such as a digital audio signal or speech signal.
  • the input signal 120 is processed on a segment-by-segment (block-by-block) basis.
  • a signal model suitable for encoding a current signal segment of the input signal 120 in an encoder 119 is provided by a modeller 113, also called probabilistic modeller 113 in the following.
  • the signal model output from the modeller 113 is also called probabilistic model or combined distribution model in the following and corresponds to a probabilistic model of the joint distribution of the signal samples or segments.
  • the modeller 113 obtains the combined distribution model by adding at least one first distribution model and at least one fixed distribution model, each of the distribution models being multiplied by a weighting coefficient.
  • the first distribution model is associated with model parameters extracted by an extracting means 118 from a reconstructed signal 121, which reconstructed signal 121 is the output of the signal quantizer 104 processed optionally by a reconstructing means or post-processing means 117 to approximate past segments of the input signal 120.
  • the modeller 113 obtains the combined distribution model by combining at least one first distribution model based on the reconstructed signal 121 and one or more fixed distribution models. Examples of a reconstructing means 117 and an extracting means 118 will be described in more detail with reference to Fig. 2 . The structure of the modeller 113 will be explained in more detail with reference to Fig. 5 .
  • the encoding of the current segment of the input signal 120 is performed at the encoder 119 which uses the combined distribution model output from the modeller 113.
  • the encoded signal or sequence of coded data output by the encoder 119 is provided to a multiplexer 116, which generates a bit stream 124.
  • information about the combined distribution model is also provided to the multiplexer 116 and included in the bit stream 124.
  • the input signal 120 may be pre-processed by a pre-processing means 125, which addresses perceptual and blocking (segmentation) effects.
  • the pre-processing means 125 will be explained in more detail with reference to Fig. 2 .
  • the pre-processing means 125 and the post-processing means 117 form a matching pair. If no pre-processing means and post-processing means are used, the output of the quantizer 104 is the quantized speech signal itself.
  • the encoder 119 includes a quantizer 104 and a first codeword generator 109.
  • the quantizer 104 generates indices and the first codeword generator 109 converts a sequence of these indices into codewords. Each codeword may correspond to one or more indices.
  • the quantizer 104 can be either a constrained-resolution quantizer, a constrained-entropy quantizer or any other kind of quantizer. For the purpose of illustration, a constrained-resolution quantizer and a constrained-entropy quantizer are discussed. In the case of constrained-resolution quantization, the number of allowed reconstruction (dequantized) points is fixed and the quantizer 104 is dependent on the combined distribution model, i.e. the quantizer 104 operates using the combined distribution model.
  • the first codeword generator 109 generates one codeword per index, and all codewords have the same length in bits.
  • all quantization cells have a fixed size, thereby facilitating the quantization.
  • the size of the quantization cells can be scaled with the variance of the combined distribution model created by the modeller 113 in order to scale the expected distortion with the input signal 120 or can be fixed in order to obtain a fixed distortion.
  • the first codeword generator 109 operates using the combined distribution model and generates codewords of unequal length or codewords that describe many indices.
  • the probability of the indices is estimated based on the combined distribution model provided by the modeller 113 in order to generate codewords having minimal average length per index.
  • the first codeword generator 109 is set to achieve an encoding having an average rate that is close to the entropy of the indices (which corresponds to a method called entropy coding, also called lossless coding), for which the well-known Huffman or arithmetic coding techniques can be used.
  • the weighting coefficients assigned to each of the distribution models are selected by the modeller 113 to minimize a code length or estimated code length corresponding to the current signal segment.
  • the manner of combining the distribution model based on the reconstructed signal 121 of the input signal 120 with the fixed distribution model characteristic of the input signal 120 is specified by a model index 123.
  • information about the combined distribution model, such as the weighting coefficients assigned to each of the distribution models (the first and fixed distribution models), is specified in the model index 123.
  • the model index 123 may be encoded in a second codeword generator 100 and provided to the multiplexer 116 to be included in the bit stream 124. If the lossless coding is used for the first codeword generator 109, it is then preferable to use the same technique for the second codeword generator 100.
  • the bit stream 124 includes the encoded signal or sequence of coded data and the information about the combined distribution model used to encode the current signal segment, i.e. the model index 123.
  • the bit stream 124 may then be transmitted to a decoder 30, which will be described with reference to Fig. 3 , or stored at the apparatus 10 for encoding.
  • the model index may be transmitted as side information in the form of a coded model index specifying at least the weighting coefficients.
  • Fig. 2 shows a system or apparatus 20 for encoding an input signal 120, such as a digital audio signal or speech signal, which apparatus 20 is equivalent to the apparatus 10 described with reference to Fig. 1 except that examples of a pre-processing means 125, a reconstructing means 117 and an extracting means 118 are illustrated in more detail.
  • the apparatus 20, as well as the apparatus 10, may be used as a backward adaptive, variable rate, low delay audio coder.
  • the apparatus 20 for encoding operates also on a block-by-block basis.
  • the input signal 120 or digital audio signal 120 may be sampled at 16000 Hz, and a typical block size would be 0.25 ms, or 4 samples.
  • the processing steps of the encoder may be summarized as: (1) perceptual weighting, (2) two-stage decorrelation, (3) constrained-entropy quantization, and (4) entropy coding.
  • the extracting means 118 includes a linear predictive (LP) analyzer 110 performing a linear predictive analysis (equivalent to a particular estimation method of autoregressive model parameters) of the most recent segment of a reconstructed signal 121 generated from past segments of the input signal 120 in the reconstructing means 117.
  • the prediction order may be set to 32, thereby capturing some of the spectral fine-structure of the input signal 120.
  • it is preferable for the LP analyzer 110 to operate on the reconstructed signal 121 because no delay is required for the analysis.
  • a signal similar to the reconstructed signal 121 can also be available at a decoder, such as the decoders 30 or 40 that will be described with reference to Figs.
  • the reconstructed signal 121 which is input to the LP analyzer 110 may be first windowed using an asymmetric window as defined in ITU-T Recommendation G.728.
  • the autocorrelation function for the windowed signal is computed and the predictor coefficients may be computed using e.g. the well-known split Levinson algorithm.
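  • The step from autocorrelation values to predictor coefficients can be sketched as follows. Note the substitution: this sketch uses the classic Levinson-Durbin recursion rather than the split Levinson algorithm named above (both solve the same normal equations), and it omits the asymmetric windowing. The function names are hypothetical.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation values r[0..max_lag] of a finite signal block."""
    n = len(signal)
    return [sum(signal[t] * signal[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (predictor coefficients a_1..a_order, prediction error power)
    such that x[n] is predicted by sum_i a_i * x[n - i]."""
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[i] * r[m - i] for i in range(1, m))
        k = acc / error  # reflection coefficient; assumes error > 0
        new_a = a[:]
        new_a[m] = k
        for i in range(1, m):
            new_a[i] = a[i] - k * a[m - i]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:], error


# Autocorrelation of an ideal first-order process with coefficient 0.5.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], 2)
```

  • For this input the recursion returns a first coefficient of 0.5 and a second coefficient of 0, as expected for a first-order process.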
  • Let $A(z)$ denote the transfer function of the prediction-error filter corresponding to the set of prediction coefficients extracted by the LP analyzer 110. That is,
  • $A(z) = 1 - a_1 z^{-1} - \cdots - a_k z^{-k}$,
  • where $a_1, \ldots, a_k$ are the predictor coefficients and $k$ is the predictor order, which is advantageously set to 32.
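  • Applying the prediction-error filter, which subtracts from each sample the weighted sum of the k preceding samples, can be sketched as below. The sketch assumes a zero signal history before the block; the function name is hypothetical.

```python
def prediction_error(signal, predictor_coeffs):
    """Filter the signal with A(z) = 1 - a_1 z^-1 - ... - a_k z^-k,
    treating samples before the start of the block as zero."""
    residual = []
    for n, x in enumerate(signal):
        prediction = sum(a * signal[n - i]
                         for i, a in enumerate(predictor_coeffs, start=1)
                         if n - i >= 0)
        residual.append(x - prediction)
    return residual


# A decaying signal that a first-order predictor with a_1 = 0.5 predicts exactly.
residual = prediction_error([1.0, 0.5, 0.25, 0.125], [0.5])
```

  • The residual here is 1.0 followed by zeros: once the first sample is known, the dependency between samples has been removed entirely.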
  • the operation of the pre-processing means 125 is now described in more detail.
  • the signal i.e. the current signal segment
  • the filtered signal segment may then be corrected by a first correcting means or adder 114 that subtracts a (closed-loop) zero-input response that is described in more detail below, transformed in a transformer 102 and normalized by a normalization means 103.
  • the normalized signal segment may be quantized in the quantizer 104 of the encoder 119 before it enters the reconstructing means 117. It is to be noted that the first correcting means 114 and the normalization means 103 are optional elements of the pre-processing means 125.
  • the perceptual weighting filter 101 transforms the digital audio signal 120 from a signal domain to a "perceptual" domain, in which minimizing the squared error of quantization approximates minimizing the perceptual distortion.
  • This filter is computed in perceptual weighting adaptation 111.
  • these scalars $\gamma_1$ and $\gamma_2$ may be set to 0.9 and 0.7, respectively.
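  • The defining equation of the weighting filter did not survive in this text. The sketch below therefore assumes the form commonly used in speech coding, W(z) = A(z/g1) / A(z/g2), where g1 and g2 are the two scalars with the values 0.9 and 0.7 given above. Under that assumption each predictor coefficient a_i is simply scaled by the i-th power of the scalar. The function names are hypothetical.

```python
def bandwidth_expand(predictor_coeffs, gamma):
    """Coefficients of A(z / gamma): a_i is replaced by a_i * gamma**i."""
    return [a * gamma ** i for i, a in enumerate(predictor_coeffs, start=1)]


def weighting_filter_coeffs(predictor_coeffs, gamma1=0.9, gamma2=0.7):
    """Numerator and denominator coefficient sets of the ASSUMED filter
    W(z) = A(z / gamma1) / A(z / gamma2)."""
    numerator = bandwidth_expand(predictor_coeffs, gamma1)
    denominator = bandwidth_expand(predictor_coeffs, gamma2)
    return numerator, denominator


num, den = weighting_filter_coeffs([1.0, -0.5])
```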
  • the next two processing steps of the pre-processing means 125 shown in Fig. 2 are a prediction of the segment and a transform of the segment, which both aim at decorrelation, thereby forming a two-stage decorrelation.
  • a first stage is based on linear prediction and a second stage is based on a unitary transform.
  • An advantage provided by linear prediction is the possibility to remove long-range correlations independently of the block length.
  • a transform cannot remove correlations over separations longer than the block length.
  • long blocks imply long delay.
  • An advantage of transform coding, when based on a unitary transform, is that the shape of the quantization cells is not affected by the transform.
  • the prediction step is carried out by a linear predictor or response computer 107 and the first correcting means or adder 114.
  • the linear prediction of the perceptually weighted signal from the past reconstructed perceptually weighted signal by the linear predictor 107 corresponds to the computation of the zero-input response 122.
  • the zero-input response is the zero input response of a cascade of the inverse of the prediction-error filter and the perceptual weighting filter (see equation (1)): W ( z )/ A ( z ).
  • the first correcting means or adder 114 then performs a subtraction of zero-input response 122 for the current signal block or segment. The subtraction of the zero-input response is aimed at removing correlations between adjacent signal blocks (segments).
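  • The zero-input response and its subtraction can be sketched as follows. This is a simplification: it computes the response of the all-pole filter 1/A(z) only and leaves out the perceptual weighting filter of the cascade described above. The function names are hypothetical.

```python
def zero_input_response(past_outputs, predictor_coeffs, block_length):
    """Output of 1/A(z) over the next block when the input is zero.

    past_outputs holds previous filter outputs, most recent last.
    """
    history = list(past_outputs)
    response = []
    for _ in range(block_length):
        y = sum(a * history[-i]
                for i, a in enumerate(predictor_coeffs, start=1)
                if i <= len(history))
        response.append(y)
        history.append(y)
    return response


def subtract_response(block, response):
    """Remove the part of the block that the filter memory already predicts."""
    return [x - z for x, z in zip(block, response)]


zir = zero_input_response([1.0], [0.5], 4)
corrected = subtract_response([1.0, 1.0], zir)
```

  • The response depends only on the filter memory, which the decoder can reproduce from its own reconstructed signal, so the same values can be added back during reconstruction.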
  • $H = U \Sigma V^{T}$, where $U$ and $V$ are unitary matrices, and $\Sigma$ is a diagonal matrix. This operation is performed in the SVD 112.
  • the matrix U forms a model-based Karhunen-Loève transform (KLT) for the signal x .
  • KLT is enacted by multiplying the transpose of U on x .
  • For variable-rate (constrained-entropy) coding, it is preferable to use uniform quantization, which is optimal in the high-rate limit.
  • For any particular average rate, a fixed scalar quantizer with uniform quantization step size may be used. The selection of scalar quantization is preferable since, asymptotically with increasing rate, the performance loss will not be more than 0.25 bit per sample relative to infinite-dimension vector quantization.
  • either the average rate or the average distortion may be set as a constraint.
  • the distortion may be set to a constant value equal to an average distortion.
  • the average distortion is determined by the step size of the uniform scalar quantizer, which facilitates usage of the apparatus for encoding since one simply selects a step size.
  • the average distortion is 1/12 of the squared step size.
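  • The factor 1/12 follows from the usual high-rate assumption, not stated explicitly above, that the quantization error $e$ is uniformly distributed over one cell of width $\Delta$:

```latex
D = \mathrm{E}\left[e^2\right]
  = \frac{1}{\Delta} \int_{-\Delta/2}^{\Delta/2} e^2 \, de
  = \frac{1}{\Delta} \cdot \frac{2}{3} \left(\frac{\Delta}{2}\right)^{3}
  = \frac{\Delta^2}{12}
```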
  • the average-rate constraint requires that the combined distribution model is accurate.
  • it is preferable to use a distortion constraint. Varying the value of the distortion constraint and measuring the resulting average rate over a range of distortions allows the selection of a desired bit rate with a certain numerical precision (distortion).
  • the first codeword generator 109 may be an entropy coder based on an arithmetic coding method.
  • the entropy coder receives the probability density of the symbols, i.e. the combined distribution model, from the probabilistic modeller 113, the quantized signal values and the quantization step size from the quantizer 104. It is preferable to use an arithmetic coding since it is possible to compute the codeword of a single quantized signal vector s using the combined distribution model without the need of computing other codewords. Thus, if the distribution changes, it is not necessary to update the entire set of all possible codewords in the method of the present invention. This contrasts with Huffman coding where it is most natural to compute the entire set of codewords and store them in a table.
  • a cumulative probability function or cumulative distribution is used.
  • the cumulative probability function of each transformed sample suffices for this purpose.
  • the quantization values are ordered and the ordering normally coincides with the index values, which are normally selected to be positive consecutive integers.
  • the cumulative distribution is the sum of the probabilities of the quantization values having an index less than or equal to m. If the model probability function is selected to be of a simple form, as is generally the case, then the summation can be replaced by an analytic integration, thereby reducing the computational effort.
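  • Replacing the summation by an analytic integration can be sketched for one simple case. The sketch assumes a zero-mean Gaussian model and a uniform scalar quantizer with signed indices (the text above uses positive consecutive integers); the function names are hypothetical.

```python
import math


def gaussian_cdf(x, std):
    """Cumulative distribution of a zero-mean Gaussian, in closed form."""
    return 0.5 * (1.0 + math.erf(x / (std * math.sqrt(2.0))))


def cell_probability(index, step, std):
    """Probability that a sample falls in the uniform quantization cell
    centred at index * step, obtained as a difference of two CDF values."""
    lower = (index - 0.5) * step
    upper = (index + 0.5) * step
    return gaussian_cdf(upper, std) - gaussian_cdf(lower, std)


def ideal_code_length(index, step, std):
    """Bits an ideal entropy coder would spend on this index."""
    return -math.log2(cell_probability(index, step, std))
```

  • An arithmetic coder needs only the two CDF values of the cell being coded, so no table over all indices has to be built. For indices far in the tail the probability can underflow to zero, so a practical coder would need a floor on the cell probability.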
  • the arithmetic coding method can be generalized to the vector quantization case, which usually is associated with a truncation of the region of support.
  • the arithmetic coder buffer depth can be bounded using standard methods (e.g., a non-existing source symbol is introduced to enact a flushing of the buffer).
  • the output of the first codeword generator 109 and the model index 123 output from the second codeword generator 100 are multiplexed in the multiplexer 116 into a bit stream 124.
  • This bit stream 124 may be transmitted to a receiver, such as a decoder, or stored at the apparatus 10 or 20 for encoding.
  • the multiplexing should be done in such a way that the decoder is able to distinguish between the bits describing the model and the bits describing the data.
  • If the signal samples and the model index each have a fixed codeword length, this is a simple alternation of sets of codewords for a set of signal samples with codewords for a model index.
  • For arithmetic coding, this is most conveniently done by combining the first codeword generator 109 and the second codeword generator 100 into a single codeword generator and interlacing the parameters to be encoded as input to the combined codeword generator.
  • Alternatively, signal segments are coded by the arithmetic code as a single codeword (i.e., with an end-of-sequence termination) by the first codeword generator 109, alternated with the corresponding independent encoding of a set of model indices (also with an end-of-sequence termination) by the second codeword generator 100.
  • In a third method, a fixed-length code is used for the model index and arithmetic coding is used for the signal samples, and each fixed-length codeword for the model index is inserted as soon as the encoding of a corresponding signal segment of samples is completed, in the sense that the signal segment of samples can be decoded from the bitstream.
  • the third method results in an arithmetic code for the signal samples that is interlaced with model index samples, without requiring additional bits for separating the bitstreams containing information for the dequantizer 204 and the modeller 213.
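  • The simplest case above, in which both the model index and the signal samples use fixed-length codewords, can be sketched as a plain alternation. The bit widths, the ordering (model index first, so that the decoder can build the model before reading the samples) and the function names are assumptions made for illustration.

```python
def multiplex(segments, model_bits, sample_bits):
    """Build a bit string from (model_index, [sample_index, ...]) pairs.

    All indices must be non-negative and fit in their bit width.
    """
    stream = []
    for model_index, sample_indices in segments:
        stream.append(format(model_index, f"0{model_bits}b"))
        stream.extend(format(s, f"0{sample_bits}b") for s in sample_indices)
    return "".join(stream)


def demultiplex(bits, model_bits, sample_bits, samples_per_segment):
    """Invert multiplex() given the same fixed layout."""
    segment_bits = model_bits + sample_bits * samples_per_segment
    segments = []
    for start in range(0, len(bits), segment_bits):
        chunk = bits[start:start + segment_bits]
        model_index = int(chunk[:model_bits], 2)
        samples = [int(chunk[model_bits + i * sample_bits:
                             model_bits + (i + 1) * sample_bits], 2)
                   for i in range(samples_per_segment)]
        segments.append((model_index, samples))
    return segments


segments = [(2, [5, 0, 7, 1]), (1, [3, 3, 4, 6])]
bit_stream = multiplex(segments, model_bits=2, sample_bits=3)
recovered = demultiplex(bit_stream, 2, 3, samples_per_segment=4)
```

  • Because every field has a known length, the decoder can separate model bits from data bits by position alone, with no separator bits.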
  • the reconstructed signal 121 is formed by processing the quantized segments produced by the quantizer 104 in the reconstructing means 117, which reconstructing means 117 includes components performing the inverse operations of the components of the pre-processing means 125.
  • the reconstructing means 117 may include a denormalization means 105 for performing a denormalization of the signal segment, an inverse transformer 106 for applying an inverse transform to the denormalized signal segment, a second correcting means or adder 115 that adds back the zero-input response to the inversely transformed signal segment, and an inverse weighting filter 108 for applying an inverse filter to the corrected signal segment.
  • the reconstruction operators may also be updated from the reconstructed signal 121. It is to be noted that the normalization means and the correcting means are optional components of the reconstructing means 117.
  • a decoder or apparatus 30 for decoding will now be described in accordance with an embodiment of the present invention.
  • Fig. 3 shows a decoder or apparatus 30 for decoding a bit stream 124 of coded data which may be received from the coder or apparatus 10 or 20 for encoding described with reference to Figs. 1 or 2 , respectively.
  • the bit stream is received by a demultiplexer 214 that splits the bit stream into information about a combined distribution model and a bit stream corresponding to a current sequence of coded data, i.e. quantization indices for a current signal segment of the input signal 120, pre-processed by the pre-processing means 125 such as described with reference to Figs. 1 and 2 .
  • the current sequence of coded data is provided to a decoder 219, which uses a combined distribution model provided by a modeller 213 in order to output a sequence of decoded data.
  • the quantization indices input in the decoder 219 specify quantized subsegments.
  • the modeller 213 obtains the combined distribution model by adding at least one first distribution model with which model parameters are associated and at least one fixed distribution model.
  • the model parameters are extracted by an extracting means 218 from an existing part of a reconstructed signal 221 which corresponds to past sequences of the bit stream 124.
  • the reconstructed signal 221 is generated by a reconstructing means 217 which will be described in more detail with reference to Fig. 4 in the following.
  • The information about the combined distribution model, which may be received in the form of a model index, includes at least weighting coefficients and is provided to the modeller 213.
  • The modeller 213 can then assign the weighting coefficients to the corresponding distribution models (the first and fixed distribution models) in accordance with the model index 223 to obtain the combined distribution model.
  • the extracting means 218 allows the probabilistic modeller 213 to create a combined distribution model in a similar manner as the extracting means 118 described with reference to Figs. 1 or 2 .
  • the decoder 219 includes a first codeword interpreter 209, which outputs quantization indices, and a dequantizer 204, which outputs the sequence of decoded data, i.e. the quantized current signal segment.
  • the dequantizer computes the quantized data from the quantization indices.
  • the reconstructing means 217 performs the inverse process of the pre-processing means 125 described with reference to Figs. 1 or 2 on a segment-by-segment basis, thereby rendering a reconstructed signal 221 in response to the sequence of decoded data provided by the dequantizer 204.
  • The reconstructing means 217 can then output a part of the reconstructed signal 221 from the current sequence of decoded data, so that the reconstructed signal 221 is continuously updated.
  • a second codeword interpreter 200 may be arranged between the demultiplexer 214 and the modeller 213 in order to decode the coded model index or coded information about the combined distribution model and provide this information or model index to the modeller 213.
  • the model index specifies information about the combined distribution model and in particular a set of weighting coefficients.
  • the modeller provides a combined distribution model 424 to the first codeword interpreter 209 and/or to the dequantizer 204.
  • the combined distribution model specifies the set of reconstruction points used in the dequantizer 204.
  • the first codeword interpreter 209 provides the index for a particular point and this point is then determined in the dequantizer 204.
  • the set of reconstruction points of the constrained-resolution quantizer is spaced with a spacing that is the inverse of the local density of reconstruction points as computed by standard high-rate quantization theory based on the combined distribution model 424 provided by the modeller 213.
  • the index information is used to determine the correct quantization index in the first codeword interpreter 209 using the combined distribution model provided by the modeller 213.
  • This quantization index is then used in the dequantizer 204 to select one of the reconstruction points of the uniform constrained-entropy quantizer.
  • the reconstruction points of the dequantizer 204 are identical to the reconstruction points of the quantizer 104, and it could be considered that the dequantizer 204 is identical to a component of the quantizer 104.
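The relation between the quantizer 104 and the dequantizer 204 for the uniform case can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `quantize` and `dequantize` and the fixed `step` parameter are assumptions introduced here, and the reconstruction points are assumed to lie on integer multiples of the step size.

```python
import math

def quantize(value, step):
    """Encoder side: index of the nearest reconstruction point (hypothetical helper)."""
    return math.floor(value / step + 0.5)

def dequantize(index, step):
    """Decoder side: reconstruction point selected by the quantization index."""
    return index * step

# The encoder and the decoder share the same reconstruction points,
# so the decoder recovers exactly the value the encoder quantized to.
step = 0.25
index = quantize(0.37, step)
reconstructed = dequantize(index, step)
```

Because both sides derive the points from the same step size, only the index has to be transmitted.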
  • Fig. 4 shows a system or apparatus 40 for decoding a bit stream 124 of coded data, which apparatus 40 is equivalent to the apparatus 30 described with reference to Fig. 3 except that examples of a reconstructing means 217 and an extracting means 218 are illustrated in more detail.
  • The reconstructing means 217 is equivalent to the reconstructing means 117 described with reference to Fig. 2 and may include a denormalization means 205, an inverse transformer 206 such as an inverse KLT transformer 206, a correcting means or adder 215, a response computer 207 and an inverse weighting filter 208.
  • the extracting means 218 is equivalent to the extracting means 118 described with reference to Fig. 2 and may include a LP analyser 210, a perceptual weighting adaptation means 211 and an SVD 212.
  • An example of a modeller 113 of the apparatus 10 or 20 for encoding, as described with reference to Figs. 1 or 2, will now be described with reference to Fig. 5.
  • the probabilistic modeller 113 determines a probabilistic model or combined distribution model for the quantization indices.
  • The probabilistic model is based on the autoregressive signal model corresponding to the linear prediction coefficients estimated by the LP analyzer 110 and the perceptual weighting computed in the perceptual weighting adaptation means 111.
  • the entropy coder 109 can define the code words that are to be transmitted or stored.
  • the optimal description length used to describe the current signal segment with a particular probabilistic model can be estimated via a summation of the code length of the quantized signal and the length used for describing the model.
  • The resulting length, called the description length in the following, can be used as a means for selecting the model.
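The description length described above can be sketched as the sum of two terms. This is a minimal sketch under stated assumptions: the function name `description_length_bits` is hypothetical, the code length of each quantized sample is taken as its ideal value of minus the base-2 logarithm of its model probability, and the model cost is passed in directly as a number of bits.

```python
import math

def description_length_bits(sample_probabilities, model_bits):
    """Ideal code length of the quantized samples plus the bits describing the model.

    sample_probabilities: probability that the candidate model assigns to each
    quantized sample of the current segment (assumed to be strictly positive).
    model_bits: number of bits used to describe the model itself.
    """
    if any(p <= 0.0 for p in sample_probabilities):
        raise ValueError("every sample needs a strictly positive probability")
    signal_bits = sum(-math.log2(p) for p in sample_probabilities)
    return signal_bits + model_bits

# Three samples with probabilities 1/2, 1/4, 1/4 and a 3-bit model description.
total = description_length_bits([0.5, 0.25, 0.25], 3)
```

A model that fits the segment better assigns higher probabilities and therefore yields a shorter description length, which is what makes the quantity usable as a selection criterion.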
  • Equation (8) clearly illustrates the effect of reverse waterfilling, i.e. a component p Si
  • the probability density model used in the present invention is a mixture (weighted sum) of a backward adapted probability density and one or more other component probability densities.
  • Each joint probability density model is a mixture model resulting in a combined distribution model.
  • The distribution models may share the same mixture components, wherein only the weights or weighting coefficients of the components vary, as illustrated in the following equation: $$p(s \mid M_i) = \sum_{k=1}^{K} w_{ik}\, p_k(s)$$ where $p_k(s)$ denotes the $k$-th component probability density.
  • Since $p(s \mid M_i)$ represents a probability distribution, the sum of the weights or weighting coefficients is equal to unity.
  • the set of weights or weighting coefficients forms a probability distribution for the component probability densities.
  • two or three component probability densities may be used.
  • The combined distribution model is obtained by adding at least one first distribution model, with which the model parameters extracted from the reconstructed signal 121 are associated, and at least one fixed distribution model. A weighting coefficient is assigned to each of these distribution models and multiplies it. The sum of these weighted distribution models is the combined distribution model.
  • The combined distribution model is obtained by adding at least one first Gaussian distribution model generated in the first distribution generator 303 based on the autoregressive model parameters extracted from the reconstructed signal 121, at least one fixed uniform distribution model generated in the second distribution generator 301 and at least one adaptive uniform distribution model generated in the adaptive distribution generator 302, selected in response to the extracted autoregressive model parameters.
  • Weighting coefficients are assigned to the corresponding distribution models, and each model is multiplied by its coefficient before the summation.
  • any arbitrary number of component probability densities may be used.
  • A quantized version of the weighting coefficients, or a weight vector representing the weighting coefficients, is transmitted or stored together with the sequence of coded data.
  • a constrained-entropy quantization procedure may be used to quantize the weight vectors in order to optimize performance.
  • Since the quantized weight vectors have a low bit rate, it is reasonable to use a constrained-resolution quantizer for the weight vectors even when constrained-entropy coding is used for the signal segments. In this case the number L ( M i ) in equation (8) is fixed.
  • three component distribution densities generated in a first 303, a second 301 and a third 302 generator, are weighted and summed before the resulting mixture density function, i.e. the combined distribution model, is used to estimate the description length in a description length estimator 305.
  • the estimator 305 receives a segment of the preprocessed quantized signal 321 from the codeword generator 109, comprising the set of scalars s j for equation (8).
  • the first generator 303 may generate a Gaussian distribution model obtained from the model parameters through the SVD operator 112.
  • the model parameters are associated with the Gaussian model and may represent the variance of the Gaussian distribution.
  • the second generator 301 may generate a fixed distribution model, which may be a uniform distribution with a range that equals the range of the digital representation of the input signal 120.
  • the third generator 302 may generate an adaptive distribution model selected in response to the model parameters extracted from the reconstructed signal 121.
  • The distribution model generated by the third generator 302 may be an adaptive uniform distribution with a range corresponding to 12 times the standard deviation of the corresponding Gaussian distribution generated by the first generator 303.
  • the uniform distribution components remove precision problems associated with the Gaussian density.
  • one of the distribution models is adapted for large deviation and one of the other models is adapted for small deviation.
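The three-component mixture described above can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the source: all densities are taken to be zero-mean and centred on zero, the function names are hypothetical, and the fixed range of 65536 used in the example corresponds to an assumed 16-bit input representation.

```python
import math

def gaussian_pdf(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma (first generator)."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def uniform_pdf(x, width):
    """Uniform density of total width `width`, assumed centred on zero."""
    return 1.0 / width if abs(x) <= width / 2.0 else 0.0

def combined_pdf(x, weights, sigma, fixed_width):
    """Weighted sum of the Gaussian, fixed uniform and adaptive uniform densities."""
    w_gauss, w_fixed, w_adaptive = weights
    if abs(w_gauss + w_fixed + w_adaptive - 1.0) > 1e-9:
        raise ValueError("weighting coefficients must sum to one")
    return (w_gauss * gaussian_pdf(x, sigma)
            + w_fixed * uniform_pdf(x, fixed_width)
            # Adaptive uniform: range of 12 standard deviations, as in the text.
            + w_adaptive * uniform_pdf(x, 12.0 * sigma))

# Far in the tail the Gaussian underflows to zero in floating point,
# but the fixed uniform component keeps the mixture strictly positive.
tail_density = combined_pdf(100.0, (0.9, 0.05, 0.05), 1.0, 65536.0)
```

The tail example shows why the uniform components help with the precision problems of the Gaussian density: a zero probability would correspond to an infinite code length.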
  • The weight vectors and codewords are assigned to the distribution models by a weight codebook 304.
  • the probabilistic modeller 113 searches through every entry or set of values of weighting coefficients of the weight codebook 304 and selects the set of weighting coefficients leading to the shortest description length. Then, the combined distribution model 324 which corresponds to the sum of the different distribution models generated by the generators 301-303, each of the model being multiplied by its respective weighting coefficient, is sent to the entropy coder 109.
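The exhaustive search over the weight codebook 304 can be sketched as below. This is an illustrative sketch only: the function name `select_weights`, the data layout (one tuple of component probabilities per quantized sample) and the fixed index cost of `ceil(log2(codebook size))` bits are assumptions, the last one matching a constrained-resolution coding of the weight index.

```python
import math

def select_weights(component_probs, codebook):
    """Return (best_index, best_bits) over all weight vectors in the codebook.

    component_probs: for each quantized sample, a tuple holding the probability
    each component model assigns to it.
    codebook: list of weight tuples, each summing to one.
    """
    index_bits = math.ceil(math.log2(len(codebook))) if len(codebook) > 1 else 0
    best_index, best_bits = None, math.inf
    for index, weights in enumerate(codebook):
        bits = float(index_bits)
        for probs in component_probs:
            mixture = sum(w * p for w, p in zip(weights, probs))
            # A zero mixture probability means the sample cannot be coded.
            bits += -math.log2(mixture) if mixture > 0.0 else math.inf
        if bits < best_bits:
            best_index, best_bits = index, bits
    return best_index, best_bits

codebook = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
best_index, best_bits = select_weights([(0.5, 0.125), (0.5, 0.125)], codebook)
```

Only `best_index` would need to be transmitted as side information, since the decoder holds the same codebook.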
  • the probabilistic modeller 213 receives the model index 223 and generates the combined distribution model 424 used by the first codeword interpreter 209 and the dequantizer 204.
  • the modeller 213 is equivalent to the modeller 113 described with reference to Fig. 5 except that the modeller 213 of the apparatus for decoding does not include a description length estimator.
  • the modeller 213 includes a first generator 403 for generating a first Gaussian distribution model based on the autoregressive model parameters, a second generator 401 for generating a fixed distribution model and may further include a third generator 402 for generating an adaptive uniform distribution model selected in response to the autoregressive model parameters. These model parameters are extracted by the extracting means 218 from the reconstructed signal 221 generated by the reconstructing means 217.
  • the first distribution model 403 may be a Gaussian distribution model and the extracted model parameters provided by the extracting means 218 are parameters of the Gaussian distribution model.
  • the fixed distribution model may be a uniform signal model, which is characteristic of the input signal 120.
  • Weighting coefficients are assigned to each of these distribution models in accordance with the model index 223 decoded by the second codeword interpreter 200.
  • Backward adaptive encoding makes it possible to reduce the bit rate.
  • However, this type of encoding may present poor robustness against channel errors in the form of bit errors and/or packet loss.
  • One of the reasons may be that the reconstructed signal segment is used for analysis. This type of error will be referred to as error propagation through analysis in the following.
  • Another reason may be that the subtraction of the zero-input response propagates past signal errors. This type of error decays if the filters are stable and will be referred to as error propagation through filtering in the following.
  • The set of weighting coefficients $\{w_{i1}, \ldots, w_{iK}\}$ determines whether the mixture probabilistic model, i.e. the combined distribution model with weight index i, is dependent on the backward adaptation probabilistic density, i.e. the distribution model generated by the first generator 403. If the weighting coefficient for a probabilistic density is zero for a time segment longer than the window length of the backward adaptive analysis, then the error propagation through analysis is stopped.
  • The threshold values can be adapted, either in real time or off-line, such that a desired level of robustness is achieved. It is noted that, as the quality of the reconstructed signal 121 does not vary with the combined distribution model used (only the rate does), the bias can be applied during both background and foreground signals.
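One way to picture the threshold-based bias is the sketch below. It is an assumption-laden illustration rather than the method of the patent: the function name `bias_weights` is hypothetical, and renormalizing the remaining weights so that they still sum to one is a choice made here for the example (in a codebook-based system the bias would instead steer the selection towards entries whose backward-adaptive weight is zero).

```python
def bias_weights(weights, backward_index, threshold):
    """Zero the backward-adaptive weight when it falls below the threshold.

    weights: weighting coefficients summing to one.
    backward_index: position of the backward-adapted component in `weights`.
    threshold: value below which that weight is set to zero.
    """
    weights = list(weights)
    if weights[backward_index] < threshold:
        weights[backward_index] = 0.0
        total = sum(weights)
        if total <= 0.0:
            raise ValueError("no remaining component carries any weight")
        # Renormalization is an assumption of this sketch.
        weights = [w / total for w in weights]
    return tuple(weights)

# The backward-adaptive component (index 0) is dropped, so a corrupted
# reconstructed signal no longer influences the model for this segment.
robust = bias_weights((0.1, 0.6, 0.3), 0, 0.2)
```

A higher threshold trades bit rate for robustness, since the backward-adapted density is then used less often.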
  • A plurality of fixed probabilistic signal models (distribution models) that are commonly seen in the input signal 120 may be introduced as components of the combined distribution model in addition to the fixed distribution model generated by the second generators 301 and 401.
  • Error propagation through filtering is generally a lesser problem.
  • Most common methods used to estimate autoregressive model parameters through linear-predictive analysis lead to stable filters, which implies that errors in the contributions of the zero-input response decay without additional effort.
  • If a channel is particularly poor, it can be ensured that the zero-input response decays more rapidly, e.g. by considering the zero-input response as a summation of responses to previous individual blocks. For each block the response can then be windowed, so that it has finite support and therefore does not ring beyond a small number of samples. When this is done consistently at the encoder and the decoder, error propagation through filtering is significantly diminished.
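The windowing of a zero-input response can be sketched for a single block as follows. This is a simplified illustration: the function names are hypothetical, the filter is assumed to be a plain autoregressive recursion, and the linear taper is an assumption, since the text does not specify the window shape.

```python
def zero_input_response(ar_coeffs, past_outputs, length):
    """Ringing of an autoregressive filter when no new input is applied.

    Uses y[n] = sum_k ar_coeffs[k] * y[n-1-k]; past_outputs holds the filter
    memory, most recent sample last, with at least len(ar_coeffs) entries.
    """
    history = list(past_outputs)
    response = []
    for _ in range(length):
        y = sum(c * history[-1 - k] for k, c in enumerate(ar_coeffs))
        response.append(y)
        history.append(y)
    return response

def windowed_response(response, support):
    """Taper the response linearly to zero so it vanishes after `support` samples."""
    return [y * max(0.0, 1.0 - n / support) for n, y in enumerate(response)]

ringing = zero_input_response([0.5], [1.0], 4)   # decays but never reaches zero
limited = windowed_response(ringing, 2)          # forced to zero after 2 samples
```

The encoder and the decoder would have to apply the same window for the subtraction and the addition of the response to cancel.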
  • a computer readable medium having computer executable instructions for carrying out, when run on a processing unit, each of the steps of the method for encoding described above is provided, and a computer readable medium having computer executable instructions for carrying out, when run on a processing unit, each of the steps of the method for decoding described above is provided.

Claims (59)

  1. A method for encoding an input signal (120), said method including the steps of:
    generating a reconstructed signal (121) from past coded signal segments of said input signal (120);
    extracting model parameters from said reconstructed signal (121);
    adding at least one first distribution model with which the extracted model parameters are associated and at least one fixed distribution model, wherein weighting coefficients are assigned to each of these distribution models, to obtain a combined distribution model;
    encoding a current signal segment of said input signal (120) into a sequence of coded data using said combined distribution model; and
    generating a bit stream (124) including said sequence of coded data and information about said combined distribution model corresponding to said current signal segment.
  2. The method of claim 1, wherein the information about said combined distribution model is coded as side information in the form of a model index specifying at least said weighting coefficients.
  3. The method of claim 1 or 2, wherein the weighting coefficients are selected to minimize an estimated code length of said current signal segment.
  4. The method of any one of the preceding claims, wherein the encoding step includes the steps of:
    quantizing said current signal segment using said combined distribution model; and
    coding the quantized current signal segment into said sequence of coded data.
  5. The method of any one of claims 1 to 3, wherein the encoding step includes the steps of:
    quantizing said current signal segment; and
    coding the quantized current signal segment into said sequence of coded data using said combined distribution model.
  6. The method of claim 4 or 5, wherein the quantization cell size used for the step of quantizing a particular set of samples is constant.
  7. The method of any one of the preceding claims, wherein the fixed distribution model is a uniform distribution model.
  8. The method of any one of the preceding claims, wherein the first distribution model is a Gaussian distribution model and the extracted model parameters are parameters of said Gaussian distribution model.
  9. The method of any one of the preceding claims, wherein said combined distribution model is a mixture model further including at least one adaptive distribution model selected in response to the extracted model parameters, to which adaptive distribution model a weighting factor is assigned, and which weighted adaptive distribution model is added to the first model and to the weighted fixed distribution model to obtain the combined distribution model.
  10. The method of any one of the preceding claims, wherein the combined distribution model is selected from a plurality of combined distribution models in response to a code length of a subsegment of said current signal segment and a code length used to describe the distribution model of said reconstructed signal.
  11. The method of any one of the preceding claims, wherein, prior to the step of generating a reconstructed signal, the method includes the steps of:
    applying a perceptual filter to a signal segment of said input signal (120);
    applying a transform to the filtered signal segment; and
    quantizing the filtered and transformed signal segment.
  12. The method of claim 11, wherein the step of generating a reconstructed signal includes the steps of:
    applying an inverse transform to the quantized signal segment; and
    applying an inverse weighting filter to the inversely transformed signal segment.
  13. The method of any one of the preceding claims, wherein the weighting coefficients are biased to minimize error propagation.
  14. The method of any one of the preceding claims, wherein the weighting coefficient assigned to the first distribution model is biased towards a value of zero to minimize error propagation.
  15. The method of any one of claims 1 to 13, wherein the weighting coefficient assigned to the first distribution model is compared with a threshold value below which the weighting coefficient is set to zero.
  16. An apparatus for encoding an input signal (120), said apparatus including:
    a reconstructing means (117) for generating a reconstructed signal (121) from past coded signal segments of said input signal (120);
    an extracting means (118) for extracting model parameters from said reconstructed signal (121);
    a modeller (113) adapted to add at least one first distribution model generated by at least one first distribution generator (303) with said model parameters and at least one fixed distribution model generated by at least one second distribution generator (301), wherein a weight codebook (304) assigns the weighting coefficients to each of these distribution models, to obtain a combined distribution model;
    a coder (119) for encoding a current signal segment of said input signal (120) into a sequence of coded data using the combined distribution model; and
    a multiplexer (116) receiving the information about the combined distribution model from the modeller (113) and the sequence of coded data from the coder (119) to generate a bit stream (124) corresponding to said current signal segment.
  17. The apparatus of claim 16, wherein a second codeword generator (100) codes information about the combined distribution model as side information in the form of a model index specifying at least said weighting coefficients.
  18. The apparatus of claim 16 or 17, wherein said weight codebook (304) selects the weighting coefficients to minimize a code length estimated by an estimator (305).
  19. The apparatus of any one of claims 16 to 18, wherein the coder (119) includes:
    a quantizer (104) for quantizing said current signal segment using said combined distribution model; and
    a first codeword generator (109) for coding the quantized current signal segment into said sequence of coded data.
  20. The apparatus of any one of claims 16 to 18, wherein the coder (119) includes:
    a quantizer (104) for quantizing said current signal segment; and
    a first codeword generator (109) for coding the quantized current signal segment into said sequence of coded data using said combined distribution model.
  21. The apparatus of claim 19 or 20, wherein the quantizer (104) is a scalar quantizer.
  22. The apparatus of any one of claims 19 to 21, wherein the quantization cell size of said quantizer (104) is constant for a particular set of samples.
  23. The apparatus of any one of claims 16 to 22, wherein the fixed distribution model of the second distribution generator (301) is a uniform distribution model.
  24. The apparatus of any one of claims 16 to 23, wherein the first distribution model of the first distribution generator (303) is a Gaussian distribution model and the extracted model parameters are parameters of said Gaussian distribution model.
  25. The apparatus of any one of claims 16 to 24, wherein the modeller (113) further includes at least one adaptive distribution generator (302) for generating an adaptive distribution model selected in response to the extracted model parameters, wherein said weight codebook (304) assigns a weighting coefficient to said adaptive distribution model, and wherein said modeller (113) obtains the combined distribution model by adding, each of the distribution models being multiplied by its corresponding weighting coefficient, said adaptive distribution model to said first and fixed distribution models.
  26. The apparatus of any one of claims 16 to 25, wherein the modeller (113) selects the combined distribution model from a plurality of combined distribution models in response to a code length of a subsegment of said current signal segment and a code length used to describe the distribution model of said reconstructed signal (121).
  27. The apparatus of any one of claims 19 to 26, wherein, before being subjected to the reconstructing means (117), the input signal (120) is subjected to:
    a perceptual weighting filter (101) for filtering a signal segment;
    a transformer (102) for applying a transform to the filtered signal segment; and
    the quantizer (104) of the coder (119) for quantizing the transformed signal segment.
  28. The apparatus of claim 27, wherein the reconstructing means (117) includes:
    an inverse transformer (106) for applying an inverse transform to the quantized signal segment; and
    an inverse weighting filter (108) for applying an inverse weighting filter to the inversely transformed signal segment.
  29. The apparatus of claim 28, further including:
    a first correcting means (114) arranged between said perceptual weighting filter (101) and said transformer (102) for performing a subtraction of a zero-input response from the filtered signal segment; and
    a second correcting means (115) arranged between said inverse transformer (106) and the inverse weighting filter (108) for performing an addition of a zero-input response to the inversely transformed signal segment.
  30. The apparatus of claim 28 or 29, further including:
    a normalization means (103) arranged between said transformer (102) and said quantizer (104) for performing a normalization of the transformed signal segment; and
    a denormalization means (105) arranged between said quantizer (104) and said inverse transformer (106) for performing a denormalization of the inversely transformed signal segment.
  31. The apparatus of claim 29 or 30, further including a response computer (107) for providing a zero-input response to the correcting means (114, 115).
  32. The apparatus of any one of claims 16 to 31, wherein said extracting means (118) includes a linear predictive analyzer (110).
  33. The apparatus of any one of claims 16 to 32, wherein said modeller (113) biases the weighting coefficients to minimize error propagation.
  34. The apparatus of any one of claims 16 to 33, wherein said modeller (113) biases the selection of the weighting coefficients of the distribution models that are based on the past reconstructed signals towards a value of zero to minimize error propagation.
  35. The apparatus of any one of claims 16 to 34, wherein said modeller (113) compares the weighting coefficient of the first distribution model with a threshold value below which it sets the weighting coefficient to zero.
  36. A method for decoding a bit stream (124) of coded data, said method including the steps of:
    extracting from said bit stream (124) a current sequence of coded data and a coded model index (223) including information about a combined distribution model, which information includes weighting coefficients;
    extracting model parameters from an existing part of a reconstructed signal (221) corresponding to past sequences of said bit stream (124);
    adding at least one first distribution model with which said model parameters are associated and at least one fixed distribution model, wherein the weighting coefficients are assigned to the corresponding distribution models in accordance with the model index (223), to obtain a combined distribution model;
    decoding said current sequence of coded data into a current sequence of decoded data using said combined distribution model; and
    generating a part of the reconstructed signal (221) from said current sequence of decoded data.
  37. The method of claim 36, wherein the model index is received as side information.
  38. The method of claim 36 or 37, wherein the fixed distribution model is a uniform distribution model.
  39. The method of any one of claims 36 to 38, wherein the first distribution model is a Gaussian distribution model.
  40. The method of any one of claims 36 to 39, wherein the combined distribution model is a mixture model including at least one adaptive distribution model selected in response to said model parameters, to which adaptive distribution model a weighting factor is assigned in accordance with said model index (223), and which weighted adaptive distribution model is added to the first distribution model and to the weighted fixed distribution model to obtain the combined distribution model.
  41. The method of any one of claims 36 to 40, wherein the decoding step includes the steps of:
    interpreting a codeword of the coded data; and
    dequantizing the decoded data on the basis of said codeword.
  42. The method of any one of claims 36 to 41, further including a step of interpreting a codeword of the coded model index to extract the model index.
  43. The method of any one of claims 41 or 42, wherein the step of generating a reconstructed signal includes the steps of:
    applying an inverse transform to the dequantized data; and
    applying an inverse weighting filter to the inversely transformed data.
  44. The method of claim 43, wherein, between the dequantizing step and the step of applying an inverse transform, the step of generating a reconstructed signal further includes the step of:
    performing a denormalization of the dequantized data.
  45. The method of claim 43 or 44, wherein, between the step of applying an inverse transform and the step of applying an inverse weighting filter, the step of generating a reconstructed signal further includes the step of:
    correcting the data by performing an addition of the zero-input response to the inversely transformed data.
  46. Appareil de décodage d'un flux binaire (124) de données codées, ledit appareil incluant :
    un démultiplexeur (214) pour démultiplexer ledit flux binaire (124) en une séquence actuelle de données codées et un index de modèle (223) incluant une information concernant un modèle de distribution mixte, laquelle information inclut des coefficients de pondération ;
    un moyen d'extraction (218) pour extraire des paramètres de modèle d'une partie existante d'un signal reconstruit (221) correspondant aux séquences anciennes dudit flux binaire (124) ;
    un modélisateur (213) adapté afin d'ajouter au moins un premier modèle de distribution généré avec les paramètres de modèle extraits par au moins un premier générateur (403) et au moins un modèle de distribution fixe généré par au moins un second générateur (401), dans lequel un livre de code de pondération (404) affecte les coefficients de pondération aux modèles de distribution conformément audit index de modèle (223), pour obtenir un modèle de distribution mixte ;
    un décodeur (219) pour décoder ladite séquence actuelle de données codées en une séquence actuelle de données décodées en utilisant ledit modèle de distribution mixte ; et
    un moyen de reconstruction (217) pour générer une partie du signal reconstruit (221) d'après ladite séquence actuelle de données décodées.
  47. The apparatus according to claim 46, wherein the demultiplexer (214) receives the coded model index (223) as side information.
  48. The apparatus according to claim 46 or 47, wherein the fixed distribution model is a uniform distribution model.
  49. The apparatus according to any one of claims 46 to 48, wherein the first distribution model is a Gaussian distribution model and the extracted model parameters are parameters of the Gaussian distribution model.
  50. The apparatus according to any one of claims 46 to 49, wherein said modeller (213) further includes at least one third generator (402) for generating at least one adaptive distribution model with the extracted model parameters, wherein said weight codebook assigns a weighting coefficient to said adaptive distribution model in accordance with said model index (223), and wherein said modeller (213) obtains the combined distribution model by adding said adaptive distribution model to said first and fixed distribution models, each distribution model being multiplied by its corresponding weighting coefficient.
  51. The apparatus according to any one of claims 46 to 50, wherein said decoder (219) includes a first codeword interpreter (209) and a dequantizer (204) for decoding the current sequence of coded data.
  52. The apparatus according to any one of claims 46 to 51, further including a second codeword interpreter (200) for interpreting a codeword corresponding to the coded model index.
  53. The apparatus according to claim 51 or 52, wherein said reconstructing means (217) includes:
    an inverse transformer (206) for applying an inverse transform to the dequantized data; and
    an inverse weighting filter (208) for applying inverse weighting to the inverse-transformed data.
  54. The apparatus according to claim 53, wherein denormalization means (205) is arranged between said dequantizer (204) and said inverse transformer (206) for denormalizing the dequantized data.
  55. The apparatus according to claim 53 or 54, wherein correcting means (215) is arranged between said inverse transformer (206) and said inverse weighting filter (208) for adding a zero-input response to the inverse-transformed data.
  56. The apparatus according to claim 55, further including a linear predictor (207) for providing the zero-input response to said correcting means (215).
  57. The apparatus according to any one of claims 46 to 56, wherein said extracting means (218) includes a linear predictive analyzer (210).
  58. A computer-readable medium comprising executable instructions for carrying out each of the steps of the method according to any one of claims 1 to 15 when executed on a processing unit.
  59. A computer-readable medium comprising executable instructions for carrying out each of the steps of the method according to any one of claims 36 to 45 when executed on a processing unit.
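
Illustrative note (not part of the claims): the weighted sum of distribution models recited in claims 46 to 50 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The function names, the contents of the weight codebook, and the numeric parameters are all hypothetical. Only the structure is taken from the claims: a Gaussian model built from extracted parameters, a fixed uniform model, and weights selected by a model index.

```python
import math
from typing import Callable, Sequence

Density = Callable[[float], float]


def gaussian_model(mean: float, std: float) -> Density:
    """First distribution model: a Gaussian built from extracted parameters."""
    def pdf(x: float) -> float:
        z = (x - mean) / std
        return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))
    return pdf


def uniform_model(low: float, high: float) -> Density:
    """Fixed distribution model: uniform on [low, high]."""
    height = 1.0 / (high - low)
    def pdf(x: float) -> float:
        return height if low <= x <= high else 0.0
    return pdf


# Hypothetical weight codebook: one row per model index, one weight per
# distribution model. Each row sums to 1 so the result stays a density.
WEIGHT_CODEBOOK = [
    (1.0, 0.0),
    (0.75, 0.25),
    (0.5, 0.5),
]


def combined_model(models: Sequence[Density], model_index: int) -> Density:
    """Add the models, each multiplied by the weight its index selects."""
    weights = WEIGHT_CODEBOOK[model_index]
    if len(weights) != len(models):
        raise ValueError("one weight per distribution model is required")
    def pdf(x: float) -> float:
        return sum(w * m(x) for w, m in zip(weights, models))
    return pdf


# Usage: the decoder would rebuild the same density from the model index
# it receives and from parameters extracted from the reconstructed signal.
models = [gaussian_model(0.0, 1.0), uniform_model(-4.0, 4.0)]
pdf = combined_model(models, model_index=1)
```

In this sketch, supporting the adaptive distribution model of claim 50 would only require a third entry in `models` and a third weight in each codebook row.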
EP07113397A 2007-07-30 2007-07-30 Low-delay audio coder Active EP2023339B1 (fr)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP07113397A EP2023339B1 (fr) 2007-07-30 2007-07-30 Low-delay audio coder
DE602007008717T DE602007008717D1 (de) 2007-07-30 2007-07-30 Low-delay audio decoder
AT07113397T ATE479182T1 (de) 2007-07-30 2007-07-30 Low-delay audio decoder
PCT/EP2008/057970 WO2009015944A1 (fr) 2007-07-30 2008-06-23 Low-delay audio coder
US12/671,631 US8463615B2 (en) 2007-07-30 2008-06-23 Low-delay audio coder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP07113397A EP2023339B1 (fr) 2007-07-30 2007-07-30 Low-delay audio coder

Publications (2)

Publication Number Publication Date
EP2023339A1 (fr) 2009-02-11
EP2023339B1 (fr) 2010-08-25

Family

ID=38820333

Family Applications (1)

Application Number Title Priority Date Filing Date
EP07113397A Active EP2023339B1 (fr) 2007-07-30 2007-07-30 Low-delay audio coder

Country Status (4)

Country Link
EP (1) EP2023339B1 (fr)
AT (1) ATE479182T1 (fr)
DE (1) DE602007008717D1 (fr)
WO (1) WO2009015944A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011044898A1 (fr) 2009-10-15 2011-04-21 Widex A/S Hearing aid with audio codec and method

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9031255B2 (en) * 2012-06-15 2015-05-12 Sonos, Inc. Systems, methods, apparatus, and articles of manufacture to provide low-latency audio
US9769586B2 (en) 2013-05-29 2017-09-19 Qualcomm Incorporated Performing order reduction with respect to higher order ambisonic coefficients
US10770087B2 (en) 2014-05-16 2020-09-08 Qualcomm Incorporated Selecting codebooks for coding vectors decomposed from higher-order ambisonic audio signals
CN112767956B (zh) * 2021-04-09 2021-07-16 腾讯科技(深圳)有限公司 Audio encoding method, apparatus, computer device, and medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1355298B1 (fr) * 1993-06-10 2007-02-21 Oki Electric Industry Company, Limited Code-excited linear predictive coder and decoder
US6894628B2 (en) * 2003-07-17 2005-05-17 Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. Apparatus and methods for entropy-encoding or entropy-decoding using an initialization of context variables
US7599840B2 (en) * 2005-07-15 2009-10-06 Microsoft Corporation Selectively using multiple entropy models in adaptive coding and decoding

Also Published As

Publication number Publication date
EP2023339A1 (fr) 2009-02-11
DE602007008717D1 (de) 2010-10-07
ATE479182T1 (de) 2010-09-15
WO2009015944A1 (fr) 2009-02-05

Similar Documents

Publication Publication Date Title
US8463615B2 (en) Low-delay audio coder
US6721700B1 (en) Audio coding method and apparatus
US8463604B2 (en) Speech encoding utilizing independent manipulation of signal and noise spectrum
US6401062B1 (en) Apparatus for encoding and apparatus for decoding speech and musical signals
US7756350B2 (en) Lossless encoding and decoding of digital data
US8396706B2 (en) Speech coding
EP2301022B1 (fr) Device and method for quantizing LPC filters with multiple references
US8301441B2 (en) Speech coding
US6353808B1 (en) Apparatus and method for encoding a signal as well as apparatus and method for decoding a signal
EP1806737A1 (fr) Sound encoder and sound encoding method
KR20080049116A (ko) Audio coding
EP2270774B1 (fr) Lossless multi-channel audio codec
KR100408911B1 (ko) Method and apparatus for generating and encoding line spectral square roots
EP2023339B1 (fr) Low-delay audio coder
EP0390975B1 (fr) Encoder capable of improving speech quality by means of a pair of pulse-producing units
EP1921752B1 (fr) Adaptive arithmetic encoding and decoding of digital data
US20110137661A1 (en) Quantizing device, encoding device, quantizing method, and encoding method
EP1293968A2 (fr) Excitation quantization in a noise-feedback coding system using correlation techniques
Kırbız et al. Perceptual coding-based informed source separation
EP3008725A1 (fr) Apparatus and method for encoding, processing, and decoding an audio signal envelope by splitting the audio signal envelope using distribution quantization and coding
JP3099876B2 (ja) Multi-channel audio signal encoding method, decoding method therefor, and encoding and decoding apparatus using the same
Averbuch et al. Speech compression using wavelet packet and vector quantizer with 8-msec delay
Hernandez-Gomez et al. High-quality vector adaptive transform coding at 4.8 kb/s
JPH0749700A (ja) CELP-type speech decoder
JPH04243300A (ja) Speech coding system

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20070730

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA HR MK RS

AKX Designation fees paid

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REF Corresponds to:

Ref document number: 602007008717

Country of ref document: DE

Date of ref document: 20101007

Kind code of ref document: P

REG Reference to a national code

Ref country code: NL

Ref legal event code: VDEP

Effective date: 20100825

LTIE Lt: invalidation of european patent or patent extension

Effective date: 20100825

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20101125

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20101227

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20101225

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20101126

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

RAP2 Party data changed (patent owner data changed or rights of a patent transferred)

Owner name: GLOBAL IP SOLUTIONS (GIPS) AB

Owner name: GLOBAL IP SOLUTIONS, INC.

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20101206

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20110526

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602007008717

Country of ref document: DE

Effective date: 20110526

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110731

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

REG Reference to a national code

Ref country code: IE

Ref legal event code: MM4A

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110731

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110731

REG Reference to a national code

Ref country code: DE

Ref legal event code: R082

Ref document number: 602007008717

Country of ref document: DE

REG Reference to a national code

Ref country code: FR

Ref legal event code: TP

Owner name: GOOGLE INC., US

Effective date: 20120626

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110730

REG Reference to a national code

Ref country code: GB

Ref legal event code: 732E

Free format text: REGISTERED BETWEEN 20120712 AND 20120718

REG Reference to a national code

Ref country code: GB

Ref legal event code: 732E

Free format text: REGISTERED BETWEEN 20120809 AND 20120815

REG Reference to a national code

Ref country code: DE

Ref legal event code: R081

Ref document number: 602007008717

Country of ref document: DE

Owner name: GOOGLE, INC., MOUNTAIN VIEW, US

Free format text: FORMER OWNER: GLOBAL IP SOLUTIONS, INC., GLOBAL IP SOLUTIONS (GIPS) AB, , SE

Effective date: 20120725

Ref country code: DE

Ref legal event code: R081

Ref document number: 602007008717

Country of ref document: DE

Owner name: GOOGLE, INC., MOUNTAIN VIEW, US

Free format text: FORMER OWNERS: GLOBAL IP SOLUTIONS, INC., SAN FRANCISCO, CALIF., US; GLOBAL IP SOLUTIONS (GIPS) AB, STOCKHOLM, SE

Effective date: 20120725

Ref country code: DE

Ref legal event code: R081

Ref document number: 602007008717

Country of ref document: DE

Owner name: GOOGLE LLC (N.D.GES.D. STAATES DELAWARE), MOUN, US

Free format text: FORMER OWNERS: GLOBAL IP SOLUTIONS, INC., SAN FRANCISCO, CALIF., US; GLOBAL IP SOLUTIONS (GIPS) AB, STOCKHOLM, SE

Effective date: 20120725

REG Reference to a national code

Ref country code: GB

Ref legal event code: 732E

Free format text: REGISTERED BETWEEN 20120823 AND 20120829

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110730

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100825

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 9

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20150717

Year of fee payment: 9

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20160801

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20170331

REG Reference to a national code

Ref country code: DE

Ref legal event code: R081

Ref document number: 602007008717

Country of ref document: DE

Owner name: GOOGLE LLC (N.D.GES.D. STAATES DELAWARE), MOUN, US

Free format text: FORMER OWNER: GOOGLE, INC., MOUNTAIN VIEW, CALIF., US

P01 Opt-out of the competence of the unified patent court (upc) registered

Effective date: 20230516

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20230727

Year of fee payment: 17

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20230727

Year of fee payment: 17