EP2652735A1

EP2652735A1 - Improved encoding of an improvement stage in a hierarchical encoder

Info

Publication number: EP2652735A1
Application number: EP11811097.2A
Authority: EP
Inventors: Balazs Kovesi; Stéphane RAGOT; Alain Le Guyader
Original assignee: Orange SA
Current assignee: Orange SA
Priority date: 2010-12-16
Filing date: 2011-12-13
Publication date: 2013-10-23
Anticipated expiration: 2031-12-13
Also published as: EP2652735B1; WO2012080649A1; CN103370740B; JP2014501395A; JP5923517B2; US20130268268A1; KR20140005201A; FR2969360A1; CN103370740A

Abstract

The invention relates to a method for encoding a digital audio input signal (x(n)) in a hierarchical encoder including a B-bit core encoding stage and at least one current improvement encoding stage k, the stages preceding stage k outputting quantization indices that are concatenated to form the indices of the preceding nested encoder (I^B+k-1). The method comprises the steps of obtaining (303) possible quantization values (d_i ^B+k (n)) for the current improvement stage k by determining absolute reconstruction levels of the current stage k alone on the basis of the indices of the preceding nested encoder (I^B+k-1), and quantizing (306) the input signal of the hierarchical encoder, with or without perceptual weighting processing (x(n) or x'(n)), from said possible quantization values (d_i ^B+k (n)) to form a scalar quantization index for stage k (I_enh ^B+k(n)) and a quantized signal (x ^B+k (n)) corresponding to one of the possible quantization values. The invention also relates to a hierarchical encoder implementing the above-described encoding method.

Description

The present invention relates to the field of coding of digital signals.

The coding according to the invention is particularly suitable for the transmission and / or storage of digital signals such as audio-frequency signals (speech, music or other).

The present invention relates more particularly to waveform coding such as PCM ("Pulse Code Modulation") coding, or adaptive waveform coding of the ADPCM ("Adaptive Differential Pulse Code Modulation") type, in particular nested-code coding that delivers quantization indices forming a scalable bit stream.

The general principle of nested code ADPCM coding / decoding specified by ITU-T Recommendation G.722 or ITU-T G.727 is as described with reference to Figures 1 and 2.

FIG. 1 thus represents a nested-code encoder of the ADPCM type (e.g. G.722 low band, G.727) operating between B and B+K bits per sample; note that the case of non-scalable ADPCM coding (e.g. G.726, G.722 high band) corresponds to K = 0, where B is a fixed value that can be chosen from among different possible rates.

It comprises:

a prediction module 110 giving the prediction of the signal $x_P^B(n)$ from the preceding samples of the quantized error signal $e_Q^B(n') = y^B_{I^B}(n')\,v(n')$, $n' = n-1, \ldots, n-N_Z$, where $v(n')$ is the quantization scale factor, and from the reconstructed signal $r^B(n')$, $n' = n-1, \ldots, n-N_P$, where n is the current instant;

a subtraction module 120 which subtracts from the input signal x(n) its prediction $x_P^B(n)$ to obtain a prediction error signal denoted e(n);

a quantization module 130 ($Q^{B+K}$) of the error signal, which receives the error signal e(n) as input and gives quantization indices $I^{B+K}(n)$ consisting of B+K bits. The quantization module $Q^{B+K}$ uses nested codes, that is to say it comprises a B-bit "core" quantizer and quantizers at B+k bits, k = 1, ..., K, which are nested on the "core" quantizer.

In the case of ITU-T G.722 low-band coding, the decision levels and reconstruction levels of the quantizers $Q^B$, $Q^{B+1}$, $Q^{B+2}$ for B = 4 and K = 0, 1 or 2 are defined by Tables IV and VI of the overview article describing the G.722 standard by X. Maitre, "7 kHz audio coding within 64 kbit/s", IEEE Journal on Selected Areas in Communications, Vol. 6, February 1988.

The quantization index I ^{B + K} (n) of B + K bits at the output of the quantization module Q ^{B + K} is transmitted via the transmission channel 140 to the decoder as described with reference to FIG. 2.

The encoder also includes:

a module 150 for eliminating the K least significant bits of the index I ^{B + K} (n) to give a low bit rate index I ^B (n) on B bits;

an inverse quantization module 121 for outputting a quantized error signal $e_Q^B(n) = y^B_{I^B}(n)\,v(n)$ on B bits;

an adaptation module 170 (Q_Adapt) of the quantizers and inverse quantizers, which provides a level control parameter v(n), also called the scale factor, for the next instant;

an addition module 180 which adds the prediction $x_P^B(n)$ to the quantized error signal to give the low bit rate reconstructed signal $r^B(n)$;

an adaptation module 190 (P_Adapt) of the prediction module, operating from the B-bit quantized error signal $e_Q^B(n')$ and from the signal $e_Q^B(n')$ filtered by $1 + P_Z(z)$.

It may be noted that in FIG. 1 the dashed portion referenced 155 represents the low bit rate local decoder, which contains the predictors 165 and 175 and the inverse quantizer 121. This local decoder thus makes it possible to adapt the inverse quantizer at 170 from the low bit rate index $I^B(n)$ and to adapt the predictors 165 and 175 from the reconstructed low bit rate data.
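The role of module 150 can be illustrated with a minimal Python sketch. It assumes that the (B+K)-bit index is held as a plain integer whose K least significant bits are the improvement bits; the function name `core_index` is hypothetical and is not taken from the G.722 or G.727 specifications.

```python
def core_index(index_bk: int, k_bits: int) -> int:
    """Drop the k_bits least significant bits of a (B+K)-bit index.

    The result is the B-bit low bit rate index used by the local decoder.
    """
    if k_bits < 0:
        raise ValueError("k_bits must be non-negative")
    return index_bk >> k_bits


# Example with B = 4 and K = 2: the 6-bit index 0b101101 keeps its 4 leading bits.
low_rate_index = core_index(0b101101, 2)
```

Because the improvement bits are simply discarded, the local decoder and a remote decoder that receives only B bits operate on the same index.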

This part is found identically in the nested-code ADPCM decoder, as described with reference to Figure 2.

The nested-code ADPCM decoder of FIG. 2 receives as input the indices $I'^{B+K}$ from the transmission channel 140, a version of $I^{B+K}$ possibly corrupted by bit errors, and performs inverse quantization with the inverse quantization module 210 ($(Q^B)^{-1}$), operating at B bits per sample, to obtain the signal $e'^B_Q(n) = y^B_{I'^B}(n)\,v'(n)$. The symbol "'" indicates a value decoded from the received bits, possibly different from that used by the encoder because of transmission errors.

The output signal $r'^B(n)$ for B bits will be equal to the sum of the prediction of the signal and the output of the B-bit inverse quantizer. This part 255 of the decoder is identical to the low bit rate local decoder 155 of FIG. 1.

With the bit rate indicator "mode" and the selector 220, the decoder can improve the restored signal.

Indeed, if mode indicates that B+1 bits have been received, the output will be equal to the sum of the prediction $x_P^B(n)$ and the output of the (B+1)-bit inverse quantizer 230, $y^{B+1}_{I'^{B+1}}(n)\,v'(n)$.

If mode indicates that B+2 bits have been received, the output will be equal to the sum of the prediction $x_P^B(n)$ and the output of the (B+2)-bit inverse quantizer 240, $y^{B+2}_{I'^{B+2}}(n)\,v'(n)$.
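The selection between the inverse quantizers can be sketched as follows. This is only an illustration under stated assumptions: the tables in `TOY_LEVELS` are invented values for B = 1 and K = 2 (they are not the G.722 tables), and the function `decode_sample` and its parameters are hypothetical names.

```python
# Toy nested reconstruction tables (hypothetical values, NOT the G.722 tables).
# The table at b+1 bits is obtained by splitting each level of the table at b bits.
TOY_LEVELS = {
    1: [-2.0, 2.0],
    2: [-3.0, -1.0, 1.0, 3.0],
    3: [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5],
}


def decode_sample(index, received_bits, mode_bits, prediction, scale,
                  levels=TOY_LEVELS):
    """Reconstruct one sample from the mode_bits most significant bits.

    index:         received index, received_bits wide
    mode_bits:     number of bits actually used for decoding (B, B+1 or B+2)
    prediction:    predicted sample x_P(n)
    scale:         scale factor v'(n)
    """
    if not 1 <= mode_bits <= received_bits:
        raise ValueError("mode_bits must lie between 1 and received_bits")
    truncated = index >> (received_bits - mode_bits)
    return prediction + levels[mode_bits][truncated] * scale


# The same 3-bit index decoded at the three possible rates.
outputs = [decode_sample(0b101, 3, m, prediction=10.0, scale=2.0) for m in (1, 2, 3)]
```

In this sketch, each additional bit moves the reconstructed value within the interval selected by the previous bits, which is the nested-code property described above.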

Using z-transform notation, one can write for this looped structure:

$R^{B+k}(z) = X(z) + Q^{B+k}(z)$

defining the quantization noise at B+k bits $Q^{B+k}(z)$ by:

$Q^{B+k}(z) = E_Q^{B+k}(z) - E(z)$
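These two relations can be checked in a few lines. The derivation assumes the loop structure of FIG. 1, in which the reconstructed signal is the prediction plus the quantized error and the prediction error is the input minus the prediction. The symbols $X_P^B(z)$ and $E_Q^{B+k}(z)$ denote the z-transforms of the prediction and of the quantized error at B+k bits.

```latex
\begin{aligned}
R^{B+k}(z) &= X_P^B(z) + E_Q^{B+k}(z), \\
E(z)       &= X(z) - X_P^B(z), \\
R^{B+k}(z) - X(z) &= E_Q^{B+k}(z) - E(z) = Q^{B+k}(z).
\end{aligned}
```

The prediction cancels out, so the reconstruction error equals the quantization error made on the prediction error signal.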

The ITU-T G.722 nested-code ADPCM coder (hereinafter referred to as G.722) codes wideband signals, which are defined with a minimum bandwidth of [50-7000 Hz] and sampled at 16 kHz. G.722 coding is an ADPCM coding of each of the two sub-bands of the signal, [0-4000 Hz] and [4000-8000 Hz], obtained by decomposing the signal with quadrature mirror filters. The low band is coded by a nested-code ADPCM coder at 6, 5 and 4 bits per sample, while the high band is coded by an ADPCM coder at 2 bits per sample. The total bit rate will be 64, 56 or 48 kbit/s depending on the number of bits used for decoding the low band.
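The three bit rates follow from simple arithmetic, sketched below. The sketch assumes that each of the two sub-bands is sampled at 8 kHz, that is, half of the 16 kHz input rate after the two-band split; the constant and function names are illustrative only.

```python
SUBBAND_RATE_HZ = 8000   # assumed: each sub-band runs at half of 16 kHz
HIGH_BAND_BITS = 2       # bits per sample in the high band


def total_bit_rate(low_band_bits: int) -> int:
    """Total bit rate in bit/s for a given number of low-band bits per sample."""
    return (low_band_bits + HIGH_BAND_BITS) * SUBBAND_RATE_HZ


# Low band decoded with 6, 5 or 4 bits per sample.
rates = [total_bit_rate(bits) for bits in (6, 5, 4)]
```

Under this assumption the list `rates` holds 64000, 56000 and 48000 bit/s, matching the three modes.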

This coding was first developed for use in ISDN (Integrated Services Digital Network). It has recently been deployed in high-quality voice over IP telephony applications.

For a quantizer with a large number of levels, the quantization noise spectrum will be relatively flat. However, in the frequency zones where the signal has low energy, the noise may have a level comparable to or higher than that of the signal and is therefore not necessarily masked. It can then become audible in these regions. Coding noise shaping is therefore necessary. In an encoder such as G.722, coding noise shaping suited to nested-code coding is furthermore desirable.

In general, the purpose of coding noise shaping is to obtain a quantization noise whose spectral envelope follows the short-term masking threshold; this principle is often simplified so that the noise spectrum approximately follows the signal spectrum, providing a more homogeneous signal-to-noise ratio so that the noise remains inaudible even in the lower-energy areas of the signal. A noise shaping technique for PCM ("Pulse Code Modulation") coding is described in ITU-T Recommendation G.711.1, "Wideband embedded extension for G.711 pulse code modulation", or in "G.711.1: A wideband extension to ITU-T G.711", Y. Hiwasaki, S. Sasaki, H. Ohmuro, T. Mori, J. Seong, M. S. Lee, B. Kovesi, S. Ragot, J.-L. Garcia, C. Marro, L. M., J. Xu, V. Malenovsky, J. Lapierre, R. Lefebvre, EUSIPCO, Lausanne, 2008.

This recommendation thus describes coding with coding noise shaping for the core bit rate coding. A perceptual filter for coding noise shaping is calculated from past decoded signals obtained from an inverse core quantizer. A local core bit rate decoder thus makes it possible to calculate the noise shaping filter. The decoder can therefore calculate this same noise shaping filter from the decoded core bit rate signals.

A quantizer delivering improvement bits is used at the encoder.

The decoder, receiving the core bit stream and the improvement bits, calculates the coding noise shaping filter in the same way as the coder from the decoded core bit rate signal and applies this filter to the output signal of the inverse quantizer of the improvement bits, the shaped high bit rate signal being obtained by adding the filtered signal to the decoded core signal.

The shaping of the noise thus improves the perceptual quality of the core bit rate signal. It offers only a limited improvement in quality for the improvement bits. Indeed, coding noise shaping is not carried out for the coding of the improvement bits, the input of the quantizer being the same for the core quantization as for the improvement quantization.

The decoder must then remove a resulting parasitic component by a matched filtering, when the improvement bits are decoded in addition to the core bits.

The additional calculation of a decoder filter increases the complexity of the decoder. This technique is not used in existing standard scalable decoders of the G.722 or G.727 decoder type. There is therefore a need to improve the quality of the signals regardless of the bit rate while remaining compatible with standard scalable existing decoders.

A solution that does not require the decoder to perform complementary signal processing is described in the patent application WO 2010/058117. In this application, the signal received at the decoder can be decoded by a standard decoder capable of decoding the core bit rate signal and the nested bit rates without requiring the calculation of noise shaping or of a correction term.

This document describes that for an improvement stage of a hierarchical coder, quantization is performed by minimizing a quadratic error criterion in a perceptually filtered domain.

For this, a coding noise shaping filter is defined and applied to a given error signal from at least one reconstructed signal of a preceding coding stage. The method also requires the calculation of the reconstructed signal of the current improvement stage in anticipation of a next coding stage.

In addition, improvement terms are calculated and stored for the current improvement stage. This entails significant complexity and significant storage of improvement terms or of reconstructed signal samples of previous stages.

This solution is therefore not optimal from a complexity point of view.

There is therefore a need to improve the state-of-the-art methods for improvement coding and for shaping the improvement coding noise, while remaining compatible with existing hierarchical decoders. The present invention improves the situation.

To this end, it proposes a method for encoding an input digital audio signal (x(n)) in a hierarchical coder comprising a B-bit core coding stage and at least one current improvement coding stage k, the core coding and the coding of the improvement stages preceding the current stage k delivering quantization indices which are concatenated to form the indices of the preceding nested encoder ($I^{B+k-1}$). The method is such that it comprises the following steps:

obtaining possible quantization values for the current improvement stage k from the absolute reconstruction levels of the current stage k alone and from the indices of the preceding nested encoder;

quantization of the input signal of the hierarchical coder, which has or has not undergone perceptual weighting processing, from the said possible quantization values to form a quantization index of the stage k and a quantized signal corresponding to one of the possible quantization values.

Thus, the quantization of the improvement stage determines the quantization index bit or bits which are directly concatenated with the indices of the preceding stages. Unlike the state-of-the-art methods, there is no computation of an improvement signal or improvement terms.

In addition, the input signal of the quantization is either directly the input signal of the hierarchical coder, or the same input signal having directly undergone perceptual weighting processing. It is not a difference signal between the input signal and a reconstructed signal of the previous coding stages, as in the state-of-the-art techniques.

The complexity in terms of computing load is therefore reduced.

In addition, unlike state-of-the-art methods, stored quantization values are not differential values. Thus, it is not useful to memorize the quantization values used for reconstruction in the previous stages to form a quantization dictionary of the improvement stage.

On the other hand, unlike state-of-the-art methods, it is not necessary to construct and store a differential dictionary, since the improvement stage directly uses absolute levels ($y^{B+k}$) stored by the existing hierarchical encoder and decoder. Thus, the invention avoids the duplication of dictionaries that can be encountered in state-of-the-art methods, where a differential dictionary is used at the encoder and an absolute dictionary at the decoder.

The memory required for storing the dictionaries and for the quantization operations at the encoder and inverse quantization operations at the decoder is therefore reduced.

Finally, directly obtaining the quantization values of the improvement stage without computing any difference gives better agreement between the values obtained at the encoder and those obtained at the decoder when working, for example, in finite precision.

The various particular embodiments mentioned below may be added independently or in combination with each other, to the steps of the method defined above.

In a particular embodiment, the input signal has undergone perceptual weighting processing using a predetermined weighting filter to provide a modified input signal, prior to the quantization step, and the method further includes a step of adapting the weighting filter memories from the quantized signal of the current enhancement coding stage. This perceptual weighting processing applied directly to the input signal of the hierarchical coder for the enhancement coding of the stage k also reduces the complexity in terms of computational load compared to state-of-the-art techniques which performed this perceptual weighting processing on a difference signal between the input signal and a reconstructed signal of the previous coding stages.

Thus, the encoding method described also allows existing decoders to decode the signal without any modification or additional processing, while benefiting from the improvement of the signal through effective coding noise shaping.

In a particular embodiment, the possible quantization values for the improvement stage k further contain a scale factor and a prediction value from the adaptive type core coding.

This makes it possible to adapt the quantization values with respect to the values defined in the core coding.

In an alternative embodiment, the modified input signal to be quantized at the improvement stage k is the perceptually weighted input signal from which a prediction value derived from the adaptive type core coding is subtracted.

This also makes it possible to adapt the quantization values with respect to the values defined in the core coding, but by applying this adaptation at the input of the quantizer rather than to each quantization value. This is advantageous in the case where the improvement is carried out on several bits.

In particular, the perceptual weighting processing is performed by prediction filters forming an ARMA-type filter.

The shaping of the improvement coding noise is then of good quality.

The present invention also relates to a hierarchical coder of an input digital audio signal, comprising a B-bit core coding stage and at least one current improvement coding stage k, the core coding and the coding of the improvement stages preceding the current stage k delivering quantization indices which are concatenated to form the indices of the preceding nested encoder. The encoder is such that it comprises:

a module for obtaining possible quantization values for the current improvement stage k by determining absolute reconstruction levels of the single current stage k from the indices of the preceding nested encoder;

a quantization module of the input signal of the hierarchical coder, which has or has not undergone perceptual weighting processing, operating from the said possible quantization values to form a quantization index of the stage k and a quantized signal corresponding to one of the possible quantization values.

The hierarchical coder further comprises a perceptual weighting pre-processing module using a predetermined weighting filter to give a modified input signal of the quantization module, and a module for adapting the weighting filter memories from the quantized signal of the current improvement coding stage.

The hierarchical coder provides the same advantages as those of the method it implements.

It also relates to a computer program comprising code instructions for implementing the steps of the encoding method according to the invention, when these instructions are executed by a processor.

The invention finally relates to a storage means readable by a processor storing a computer program as described.

Other features and advantages of the invention will appear more clearly on reading the following description, given solely by way of nonlimiting example, and with reference to the appended drawings, in which:

FIG. 1 illustrates a coder of the ADPCM type with nested codes according to the state of the art and as previously described;

FIG. 2 illustrates a decoder of the ADPCM type with nested codes according to the state of the art and as described above;

FIG. 3 illustrates a general embodiment of the coding method according to the invention and an encoder according to the invention;

FIG. 4 illustrates a first particular embodiment of the coding method and an encoder according to the invention;

FIG. 5 illustrates a second particular embodiment of the coding method and an encoder according to the invention;

FIG. 6 illustrates a third particular embodiment of the coding method and an encoder according to the invention;

FIG. 7 illustrates an alternative general embodiment of the coding method and an encoder according to the invention;

FIG. 7b illustrates another alternative general embodiment of the coding method and an encoder according to the invention;

FIG. 8 illustrates an exemplary embodiment of the core coding of an encoder according to the invention;

FIG. 9 illustrates an example of quantization reconstruction levels used in the state of the art; and

FIG. 10 illustrates a hardware embodiment of an encoder according to the invention.

With reference to FIG. 3, an encoder as well as a coding method according to one embodiment of the invention is described.

It will be recalled here that the case considered is that of a nested-code encoder, or hierarchical encoder, in which a B-bit core coding and at least one improvement stage of rank k are provided. The core coding and the improvement stages preceding the coding of the improvement stage k, as represented at 306, deliver scalar quantization indices multiplexed in the index $I^{B+k-1}(n)$ of B+k-1 bits per sample.

In the exemplary embodiments described below, to simplify the presentation, the improvement stage (of rank k) is presented as producing one additional bit per sample. In this case, the coding in each improvement stage involves selecting one of two possible values. As will appear later, the "absolute dictionary" (absolute in the sense of non-differential levels), corresponding to all the quantization values that can be produced by the improvement stage of rank k, is of size $2^{B+k}$, or sometimes slightly less than $2^{B+k}$, as for example in the G.722 coder, which has only 60 possible levels instead of 64 in the 6-bit low-band quantizer. Hierarchical coding implies a binary tree structure of the "absolute dictionary", which explains why one improvement bit suffices to perform the coding given the B+k-1 bits of the preceding stages.

FIG. 9 is an extract from Table VI of the aforementioned article by X. Maitre and represents the first 4 levels of the B-bit core quantizer for B = 4 bits and the levels of the quantizers at B+1 and B+2 bits for the coding of the low band of a G.722 encoder, as well as the output values of the state-of-the-art improvement quantizer for B+2 bits.

As illustrated in this figure, the quantizer nested at B + 1 = 5 bits is obtained by "splitting" the quantizer levels at B = 4 bits. The quantizer nested at B + 2 = 6 bits is obtained by "splitting" the quantizer levels at B + 1 = 5 bits. The duplication of the reconstruction levels is in fact a consequence of the low band hierarchical coding constraint which is implemented in G.722 in the form of a scalar quantization dictionary (at 4, 5 or 6 bits per sample ) structured in a tree.
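The tree structure produced by this splitting can be sketched with integer indices. The sketch assumes the usual binary-tree numbering in which the children of index I at b bits are the indices 2I and 2I+1 at b+1 bits; the function `child_indices` and its `extra_bits` parameter are hypothetical names introduced for illustration.

```python
def child_indices(parent_index: int, extra_bits: int = 1) -> list[int]:
    """Indices reachable in the finer table from one index of the coarser table.

    With extra_bits = 1, a parent index I has the two children 2I and 2I + 1.
    """
    if extra_bits < 0:
        raise ValueError("extra_bits must be non-negative")
    base = parent_index << extra_bits
    return [base + j for j in range(1 << extra_bits)]


# Level 3 of a 4-bit table splits into levels 6 and 7 of the 5-bit table,
# which in turn split into levels 12 to 15 of the 6-bit table.
five_bit_children = child_indices(3)
six_bit_children = child_indices(3, extra_bits=2)
```

Only the children of the index already chosen by the preceding stages need to be examined, so one improvement bit selects among two candidates.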

In the state of the art, the values $enh^{B+k}_{2I^{B+k-1}+j}$ designating quantization reconstruction levels for an improvement stage k are defined by the difference between:

the values $y^{B+k}_{2I^{B+k-1}+j}$ designating the quantization reconstruction levels of a nested quantizer at B+k bits (where B denotes the number of bits of the core coding), and

the values $y^{B+k-1}_{I^{B+k-1}}$ designating the quantization reconstruction levels of a nested quantizer at B+k-1 bits, the reconstruction levels of the nested quantizer at B+k bits being defined by splitting the reconstruction levels of the nested quantizer at B+k-1 bits.

With the invention, the differential reconstruction levels $enh^{B+k}_{2I^{B+k-1}+j}$, listed on the right in the dotted boxes, do not have to be calculated or stored. According to the invention, only the absolute reconstruction levels $y^{B+k}$ of the stage k are calculated and stored.

These absolute reconstruction levels $y^{B+k}$ of stage k are used at the encoder in the same way as at the decoder, in the sense that the reconstructed signal can be obtained, in the general case of ADPCM coding, from these absolute reconstruction levels $y^{B+k}$ by multiplying by the scale factor v(n) and adding the prediction signal $x_P^B(n)$, as already presented with reference to Figure 2, which represents the standard nested-code ADPCM decoder. Since these levels are already defined and stored in the decoder, the encoder does not add any quantization tables to the codec (encoder + decoder).

The coding of the improvement stage according to the invention is very easily generalizable for cases where the improvement stage adds several bits per sample. In this case, the size of the dictionary D _k (n) used in the improvement stage, as defined later, is simply 2 ^U where U> 1 is the number of bits per sample of the improvement stage.

The encoder shown in FIG. 3 is a nested-code coder, or hierarchical coder, in which a B-bit core coding and at least one improvement stage of rank k are provided. The core coding and the improvement stages preceding the coding of the improvement stage k, as represented at 306, deliver scalar quantization indices which are concatenated to form the indices of the preceding nested encoder $I^{B+k-1}(n)$.

Figure 3 simply illustrates a PCM / ADPCM coding module 302 representing the embedded coding preceding the enhancement coding at 306.

The core encoding of the preceding nested encoding may optionally be performed using the masking filter determined at 301 in order to shape the "core" coding noise. An example of this type of core coding is described later with reference to FIG. 8.

This module 302 thus delivers the indices $I^{B+k-1}(n)$ of the nested encoder as well as the prediction signal $x_P^B(n)$ and the scale factor v(n) in the case of a predictive ADPCM coding similar to that described with reference to FIG. 1. In the case of a PCM coding, the module 302 simply delivers the nested quantization indices $I^{B+k-1}(n)$. Moreover, it may be noted that PCM coding is a special case of ADPCM coding, obtained by taking $x_P^B(n) = 0$ and v(n) = 1.

Knowledge of the nested quantization indices $I^{B+k-1}(n)$ and of the absolute reconstruction levels $y^{B+k}$, as well as, where appropriate, of the prediction signal $x_P^B(n)$ and the scale factor v(n), makes it possible to determine the quantization values $D_k(n) = \{d_1^{B+k}(n), d_2^{B+k}(n)\}$ for the current improvement stage k in the module 303 for constructing the dictionary of quantization values. This dictionary $D_k(n)$ is used by the quantizer referred to herein as the "improvement quantizer" for the improvement stage of rank k.

Thus, according to the preferred embodiment, the quantization values of the dictionary are defined as follows, in the case of the ADPCM coding:

$d_1^{B+k}(n) = x_P^B(n) + y^{B+k}_{2I^{B+k-1}}\,v(n)$ and $d_2^{B+k}(n) = x_P^B(n) + y^{B+k}_{2I^{B+k-1}+1}\,v(n)$,

where the values $y^{B+k}_{2I^{B+k-1}+j}$, with j = 0 or 1, represent two possible quantization values of a nested quantizer of B+k bits, predefined and stored at the encoder and the decoder. The values $y^{B+k}$ can be seen as arising from a "splitting" of the dictionary $y^{B+k-1}$ of the previous stage k-1.

Note that the two elements of the dictionary $D_k(n)$ depend on $I^{B+k-1}$. In fact, this dictionary is a subset of the "absolute dictionary" defined as:

$\bigcup_{I^{B+k-1}} D_k(n) = \bigcup_{I^{B+k-1}} \{d_1^{B+k}(n), d_2^{B+k}(n)\}$

The "absolute dictionary" is a tree-structured dictionary. The index $I^{B+k-1}$ determines which branches of the tree are to be taken into account in order to determine the possible quantization values of the stage k ($D_k(n)$). The scale factor v(n) is determined by the core stage of the ADPCM coding as illustrated in FIG. 1; the improvement stage therefore uses this same scale factor to scale the code words of the quantization dictionary.
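The construction of the two-element dictionary can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the table `toy_levels` holds invented values rather than G.722 levels, and the function name `build_dictionary` and its parameters are hypothetical.

```python
def build_dictionary(prev_index, levels_bk, prediction=0.0, scale=1.0):
    """Return the two candidate values of stage k for one sample.

    prev_index: index of the preceding nested encoder (B+k-1 bits)
    levels_bk:  absolute reconstruction levels of the (B+k)-bit quantizer
    prediction: prediction signal from the core stage
    scale:      scale factor from the core stage
    The defaults (prediction 0, scale 1) correspond to the PCM special case.
    """
    return [prediction + levels_bk[2 * prev_index + j] * scale for j in (0, 1)]


toy_levels = [-3.0, -1.0, 1.0, 3.0]   # hypothetical 2-bit absolute table

adpcm_dictionary = build_dictionary(1, toy_levels, prediction=10.0, scale=2.0)
pcm_dictionary = build_dictionary(1, toy_levels)
```

The dictionary is rebuilt for every sample because the previous index, the prediction and the scale factor all vary with n, while the absolute table itself stays fixed.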

In one embodiment of the invention, the coder of FIG. 3 does not include modules 301 and 310, that is, no coding noise shaping processing is provided. Thus, it is the input signal x(n) itself that is quantized by the quantization module 306.

In a particular embodiment, the encoder further comprises a module 301 for calculating a masking filter and for determining the weighting filter W(z) or a predictive version $W_{PRED}(z)$ described later. The masking or weighting filter is determined here from the input signal x(n) but could equally be determined from a decoded signal, for example from the decoded signal of the preceding nested encoder $x^{B+k-1}(n)$. The masking filter can be determined or adapted sample by sample or by block of samples.

Indeed, the encoder according to the invention performs a shaping of the coding noise of the improvement stage by using a quantization in the domain weighted by the filter W (z), that is to say by minimizing the quantization noise energy filtered by W (z).

This weighting filter is used at 311 by the filtering module and more generally by the perceptual weighting module 310 of the input signal x(n). This pre-processing is applied directly to the input signal x(n) and not to an error signal, as could be the case in state-of-the-art techniques.

This pre-processing module 310 delivers a modified signal x'(n) at the input of the improvement quantizer 307.

The quantization module 307 of the improvement stage k delivers a quantization index $I_{enh}^{B+k}(n)$ which will be concatenated with the indices of the preceding nested encoding ($I^{B+k-1}$) to form the indices of the current nested encoding ($I^{B+k}$), by a module not shown here.

The quantization module 307 of the improvement stage k chooses between the two values $d_1^{B+k}(n)$ and $d_2^{B+k}(n)$ of the adaptive dictionary $D_k(n)$.

It receives as input the signal x'(n) and gives as output, through the local decoding module 308, the quantized value $x^{B+k}(n)$ (where $x^{B+k}(n)$ is equal to $d_1^{B+k}(n)$ or $d_2^{B+k}(n)$) that minimizes the squared error between x'(n) and $x^{B+k}(n)$. The adaptive dictionary $D_k(n)$ therefore directly contains the quantized output value of the stage k.
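The choice made by the improvement quantizer, followed by the concatenation of the improvement bit, can be sketched as below. The function name `quantize_stage` is hypothetical, and the sketch assumes a one-bit improvement stage whose bit is appended as the least significant bit of the nested index.

```python
def quantize_stage(target, dictionary, prev_index):
    """Pick the dictionary value closest to target and extend the nested index.

    target:     signal to quantize, x(n) or its weighted version x'(n)
    dictionary: the two candidate values of stage k for this sample
    prev_index: index of the preceding nested encoder
    Returns (improvement bit, quantized value, new nested index).
    """
    errors = [(target - value) ** 2 for value in dictionary]
    bit = errors.index(min(errors))          # squared-error minimization
    new_index = (prev_index << 1) | bit      # concatenate the improvement bit
    return bit, dictionary[bit], new_index


bit, quantized, index = quantize_stage(14.5, [12.0, 16.0], prev_index=1)
```

No difference signal is formed here: the quantized value is read directly from the dictionary, which is the property the text above relies on.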

The module 308 gives the quantized value of the input signal by inverse quantization of the index $I_{enh}^{B+k}(n)$. At the decoder, the same value is obtained simply by directly using the inverse quantizer of the stage k and the concatenated index: $x^{B+k}(n) = x_P^B(n) + y^{B+k}_{I^{B+k}}\,v(n)$.

This quantized signal is used to update the memories of the weighting filter W(z) of the improvement stage so as to obtain memories corresponding to an input $x(n) - x^{B+k}(n)$. Typically, the current value of the decoded signal $x^{B+k}(n)$ is subtracted from the most recent memory (or memories, in the case of an ARMA-type filter).

Thus, the quantization of the signal x(n) is done in the weighted domain, which means that the squared error between x(n) and $x^{B+k}(n)$ is minimized after filtering by the filter W(z). The quantization noise of the improvement stage is therefore shaped by a filter 1/W(z) to make this noise less audible. The energy of the weighted quantization noise is thus minimized.

The general embodiment of the block 310 given in FIG. 3 shows the general case where W(z) is an infinite impulse response (IIR) filter or a finite impulse response (FIR) filter. The signal x'(n) is obtained by filtering x(n) by W(z) and then, when the quantized value $x^{B+k}(n)$ is known, the memories of the filter W(z) are updated as if the filtering had been done on the signal $x(n) - x^{B+k}(n)$.
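The filtering followed by the memory update can be illustrated for the FIR case. This is a sketch under assumptions, not the patented implementation: it takes a hypothetical weighting filter W(z) = 1 + a_1 z^-1 + ... + a_N z^-N, keeps in memory the past values of the difference between the input and the quantized signal, and uses invented dictionaries and coefficients. The function name `encode_stage` is also hypothetical.

```python
def encode_stage(samples, coeffs, dictionaries):
    """Quantize each sample in the domain weighted by an FIR filter.

    samples:      input samples x(n)
    coeffs:       coefficients a_1..a_N of the assumed FIR weighting filter
    dictionaries: one list of candidate values per sample
    Returns (chosen indices, quantized samples).
    """
    memory = [0.0] * len(coeffs)   # past x - quantized values, most recent first
    indices, quantized = [], []
    for x, candidates in zip(samples, dictionaries):
        # Weighted target: current input plus filtered past quantization errors.
        target = x + sum(a * m for a, m in zip(coeffs, memory))
        errors = [(target - c) ** 2 for c in candidates]
        j = errors.index(min(errors))
        # Update the memory as if the filter had run on x(n) minus the quantized value.
        memory = ([x - candidates[j]] + memory)[:len(coeffs)]
        indices.append(j)
        quantized.append(candidates[j])
    return indices, quantized


toy_dictionaries = [[0.0, 2.0], [0.0, 2.0]]        # hypothetical candidates
shaped = encode_stage([0.8, 0.9], [0.5], toy_dictionaries)
unshaped = encode_stage([0.8, 0.9], [], toy_dictionaries)
```

With the weighting active, the error made on the first sample is fed back into the second decision, so `shaped` and `unshaped` select different values for the second sample. An empty coefficient list reduces the sketch to direct quantization of x(n).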

The dotted arrow represents the update of the filter memories.

Thus, the steps implemented in the encoder as illustrated in FIG. 3 are also represented. We find in fact the following steps:

obtaining at 303 possible quantization values ($d_i^{B+k}(n)$) for the current improvement stage k by determining absolute reconstruction levels of the current stage k alone from the indices of the preceding nested encoder ($I^{B+k-1}$);

quantization at 306 of the input signal of the hierarchical coder, which has or has not undergone perceptual weighting processing (x(n) or x'(n)), from the said possible quantization values ($d_i^{B+k}(n)$) to form a quantization index of the stage k ($I_{enh}^{B+k}(n)$) and a quantized signal ($x^{B+k}(n)$) corresponding to one of the possible quantization values.

In the case shown in FIG. 3, the input signal has undergone perceptual weighting processing using a weighting filter predetermined at 301 to give a modified input signal x'(n), before the quantization step at 306.

FIG. 3 also represents the step of adapting the memories of the weighting filter at 311 from the quantized signal ($x^{B+k}(n)$) of the current improvement coding stage.

Figures 4, 5 and 6 now describe particular embodiments of the pretreatment block 310.

The blocks 301, 302, 303, 306, 307 and 308 then remain identical to those described with reference to FIG.

FIG. 4 represents a first embodiment of the pretreatment block 310 with a finite impulse response (FIR) filter W(z) = A'(z). In this embodiment, the filter memory contains only the past input samples of the signal x(n) - x^{B+k}(n), noted:

b^{B+k}(n'), n' = n - 1, ..., n - N_D,

N_D being the order of the perceptual filter W(z).

In 302, the input signal x(n) is encoded by the PCM/ADPCM coding module 302, with or without shaping of the coding noise of the nested encoder of B+k-1 bits.

In 303, an adaptive dictionary D_k is constructed according to the prediction values x_P^B(n), the scale factor v(n) of the core stage in the case of adaptive coding of the ADPCM type, and the coding indices I^{B+k-1}(n), as explained with reference to Figure 3. In the particular embodiment where a single improvement bit is provided in the improvement stage k, the adaptive dictionary D_k comprises the two following terms: d_1^{B+k}(n) = x_P^B(n) + y_{2I^{B+k-1}}^{B+k} v(n) and d_2^{B+k}(n) = x_P^B(n) + y_{2I^{B+k-1}+1}^{B+k} v(n).
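The construction of the two candidate values at 303 can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names, the layout of `levels` (absolute reconstruction levels of stage k, stored so that the two levels refining a preceding index I sit at positions 2I and 2I+1) and the concatenation of the improvement bit by a left shift are hypothetical choices, not taken from the text.

```python
def enhancement_dictionary(x_pred, v, levels, prev_index):
    """Candidate values for a 1-bit improvement stage (sketch of step 303).

    x_pred     -- predicted value of the core stage for the current sample
    v          -- scale factor of the core stage (use 1.0 for non-adaptive coding)
    levels     -- absolute reconstruction levels of stage k (assumed layout:
                  the children of index I are levels[2*I] and levels[2*I + 1])
    prev_index -- index of the preceding nested encoder
    """
    return [x_pred + levels[2 * prev_index + j] * v for j in (0, 1)]


def concatenate_index(prev_index, enh_bit):
    """Append the improvement bit to the preceding nested index (assumed bit order)."""
    return (prev_index << 1) | enh_bit
```

With four hypothetical levels `[-3, -1, 1, 3]`, a preceding index of 1, a predicted value of 10 and a scale factor of 2, the two candidates are the levels 1 and 3 scaled by 2 and offset by the prediction.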

In this embodiment, step 301 comprises the calculation of the masking filter and of the weighting filter W(z), as well as of its predictive version W_PRED(z), that is to say a version whose calculations use only past samples.

Recall here the definition of a predictive filter:

Let us take as an example the case of a filtering of a signal x(n) by the non-recursive filter with all-zero transfer function (also called FIR, for Finite Impulse Response) A(z) of order 4, A(z) = 1 + Σ_{i=1}^{4} a_i z^{-i}, resulting in a signal y(n). In the domain of the z-transform, the equation

Y (z) = A (z) X (z)

corresponds to the difference equation

y (n) = a ₀ x (n) + a ₁ x (n - 1) + a ₂ x (n - 2) + a ₃ x (n - 3) + a ₄ x (n - 4)

This expression of y (n) can be divided into two parts:

- the first depends only on the present input x(n): a₀ x(n). Most often, and in the cases of interest in this document, a₀ = 1;

- the second depends only on the past inputs x(n - i), i > 0: a₁ x(n - 1) + a₂ x(n - 2) + a₃ x(n - 3) + a₄ x(n - 4). It will therefore be considered as the predictive part of the filtering, by analogy with linear prediction, where it represents the prediction of x(n) from the previous samples. This second part corresponds, for sampling instant n, to the "zero input response" (ZIR), or "ringing", which is in fact a generalized prediction. The z-transform of this component is:

Y_PRED(z) = (A(z) - 1) X(z) = H_{A,PRED}(z) X(z), with H_{A,PRED}(z) = A(z) - 1

Similarly, for the filtering of a signal x(n) by an all-pole recursive filter, resulting in a signal y(n), the transfer function gives:

Y(z) = X(z) / B(z)

with as difference equation:

y(n) = x(n) - b₁ y(n - 1) - b₂ y(n - 2) - b₃ y(n - 3) - b₄ y(n - 4)

The innovation part is x(n); the predictive part is -b₁ y(n - 1) - b₂ y(n - 2) - b₃ y(n - 3) - b₄ y(n - 4), of z-transform:

Y_PRED(z) = -(B(z) - 1) Y(z) = (1 - B(z)) Y(z).

The same holds for the case where the filter contains both zeros and poles (ARMA filter, for AutoRegressive Moving Average):

Y(z) = (A(z) / B(z)) X(z)

with as difference equation (in this example A(z) and B(z) are of order 4):

y(n) = x(n) + Σ_{i=1}^{4} a_i x(n - i) - Σ_{i=1}^{4} b_i y(n - i)

The innovation part is x(n); the predictive part is Σ_{i=1}^{4} a_i x(n - i) - Σ_{i=1}^{4} b_i y(n - i), of z-transform:

Y_PRED(z) = (A(z) - 1) X(z) - (B(z) - 1) Y(z).

In the following, generally H _PRED (z) denotes a filter whose coefficient for its current input x (n) is zero.
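The split of a filtering into an innovation part and a predictive part can be illustrated by the short Python sketch below. It is an illustration only: the function name and the list-based memories are assumptions, and the coefficient lists hold a_1..a_N and b_1..b_N (the coefficient applied to the current input is taken equal to 1).

```python
def arma_filter(a, b, x):
    """Filter x by A(z)/B(z), computing each output as innovation + prediction.

    a -- coefficients a_1..a_N of A(z) = 1 + sum_i a_i z^-i (empty list: all-pole)
    b -- coefficients b_1..b_N of B(z) = 1 + sum_i b_i z^-i (empty list: FIR)
    x -- input samples
    """
    x_mem = [0.0] * len(a)  # x(n-1), x(n-2), ...
    y_mem = [0.0] * len(b)  # y(n-1), y(n-2), ...
    out = []
    for x_now in x:
        # Predictive part: uses past samples only (zero coefficient for x(n)).
        y_pred = (sum(ai * xi for ai, xi in zip(a, x_mem))
                  - sum(bi * yi for bi, yi in zip(b, y_mem)))
        # Innovation part x(n) plus predictive part gives the filter output.
        y_now = x_now + y_pred
        out.append(y_now)
        x_mem = ([x_now] + x_mem)[:len(a)]
        y_mem = ([y_now] + y_mem)[:len(b)]
    return out
```

Passing an empty `b` gives the FIR case, and an empty `a` gives the all-pole case described above.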

The all-pole recursive filters 1/B(z) or the ARMA filters A(z)/B(z) are the so-called IIR filters (for Infinite Impulse Response).

In the present case, in FIG. 4, using the decomposition of a filtering into innovation and predictive parts, the term whose energy is to be minimized is then:

(x(n) + x_PRED(n)) - (x^{B+k}(n) + x_PRED^{B+k}(n))

The signal to be quantized by the enhancement quantizer of the stage k is therefore

x'(n) = x(n) + x_PRED(n) - x_PRED^{B+k}(n)

where x_PRED(n) and x_PRED^{B+k}(n) are obtained by filtering x(n) and x^{B+k}(n) by the prediction filter W_PRED(z). These two filterings can be combined into one; the input of the common filter W_PRED(z) will then be b^{B+k}(n) = x(n) - x^{B+k}(n) (for example by updating the filter memory). At the end of the filtering, we obtain:

b_{w,PRED}^{B+k}(n) = x_PRED(n) - x_PRED^{B+k}(n).

The pretreatment module 310 implements the step of calculating a prediction b_{w,PRED}^{B+k}(n) by filtering, by W_PRED(z) at 404, the past samples of the signal b^{B+k}(n) = x(n) - x^{B+k}(n), n = -1, -2, ..., -N_D, obtained in 409.

This prediction b_{w,PRED}^{B+k}(n) is added to the input signal x(n) at 405 to obtain the modified input signal x'(n) of the quantizer of the improvement stage k.

The quantization of x'(n) takes place at 306 by the quantization module of the improvement stage k, to give the quantization index I_enh^{B+k}(n) of the improvement stage k and the decoded signal x^{B+k}(n) of the stage k. The module 307 gives the index of the codeword I_enh^{B+k}(n) (1 bit in the illustrative example) of the adaptive dictionary D_k which minimizes the squared error between x'(n) and the quantization values d_1^{B+k}(n) and d_2^{B+k}(n). This index is to be concatenated with the index of the preceding nested encoder I^{B+k-1} to obtain at the decoder the index of the codeword of the stage k, I^{B+k}. The module 308 gives the quantized value of the input signal by inverse quantization of the index I_enh^{B+k}(n):

x^{B+k}(n) = d_i^{B+k}(n), with i the dictionary entry designated by I_enh^{B+k}(n).

At the decoder, the same value is obtained simply by directly using the inverse quantization of the stage k and the concatenated index, to obtain: x^{B+k}(n) = x_P^B(n) + y_{I^{B+k}}^{B+k} v(n).

At 409, a step of calculating the coding noise b^{B+k}(n) of the encoder including the stage k is performed by subtracting the synthesized signal x^{B+k}(n) of the stage k from the input signal x(n), for the current sample (n = 0).

The preprocessing operations of the block 310 thus make it possible to shape the improvement coding noise of the stage k by performing a perceptual weighting of the input signal x(n). It is the input signal itself that is perceptually weighted, and not an error signal as is the case in state-of-the-art methods.
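The chain of operations 404, 405, 307, 308 and 409 of FIG. 4 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name is illustrative, `w_pred` holds the coefficients w_1..w_ND of W_PRED(z) = W(z) - 1, and `dictionary_fn` is a hypothetical callback standing in for block 303 that returns the candidate values for sample n.

```python
def enhance_stage_fir(x, w_pred, dictionary_fn):
    """Improvement stage with FIR perceptual weighting of the input (sketch of FIG. 4).

    x             -- input samples
    w_pred        -- coefficients of the predictive weighting filter W_PRED(z)
    dictionary_fn -- hypothetical callback: n -> list of candidate values
    Returns the selected dictionary indices and the decoded samples.
    """
    noise_mem = [0.0] * len(w_pred)  # past coding noise b(n-1), ..., b(n-N_D)
    indices, decoded = [], []
    for n, x_now in enumerate(x):
        # 404: prediction obtained by filtering the past coding noise.
        b_pred = sum(w * b for w, b in zip(w_pred, noise_mem))
        # 405: modified input of the improvement quantizer.
        x_mod = x_now + b_pred
        # 307: candidate minimizing the squared error with the modified input.
        candidates = dictionary_fn(n)
        idx = min(range(len(candidates)),
                  key=lambda i: (x_mod - candidates[i]) ** 2)
        # 308: quantized value.
        x_dec = candidates[idx]
        # 409: coding noise of the current sample, pushed into the memory.
        noise_mem = ([x_now - x_dec] + noise_mem)[:len(w_pred)]
        indices.append(idx)
        decoded.append(x_dec)
    return indices, decoded
```

With a constant two-entry dictionary `[0.0, 1.0]` and the input `[0.6, 0.6]`, a zero predictor selects the upper candidate twice, whereas a one-tap predictor of 0.5 makes the second decision depend on the noise of the first sample.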

FIG. 5 illustrates another embodiment of the preprocessing module using, in this embodiment, an ARMA type filtering (for AutoRegressive Moving Average) of transfer function:

W(z) = (1 - P_D(z)) / (1 - P_N(z))

The sequence of operations according to FIG. 5 is as follows:

- Calculation at 301 of the masking filter and determination of the weighting filter W(z) = (1 - P_D(z)) / (1 - P_N(z));

- Encoding at 302 of the input signal x(n) by a nested PCM/ADPCM encoder of B+k-1 bits, possibly with coding noise shaping using the masking filter determined at 301;

- Determination at 303 of the adaptive dictionary D_k as a function of the prediction values x_P^B(n) and of the scale factor v(n) (in the case of ADPCM coding) of the core stage, and of the quantization indices I^{B+k-1}(n) (d_1^{B+k}(n) = x_P^B(n) + y_{2I^{B+k-1}}^{B+k} v(n) and d_2^{B+k}(n) = x_P^B(n) + y_{2I^{B+k-1}+1}^{B+k} v(n)).

These steps are equivalent to those described with reference to FIG.

The pretreatment module 310 comprises a step of calculating, at 512, a prediction signal b_{w,pred}^{B+k}(n) of the filtered quantization noise b_w^{B+k}(n), by adding the prediction calculated at 510 from the past samples of the filtered reconstructed noise, Σ_{m≥1} p_N(m) b_w^{B+k}(n - m), and by subtracting the prediction calculated at 511 from the reconstructed noise, Σ_{m≥1} p_D(m) b^{B+k}(n - m).

At 505, a step of adding the prediction signal b_{w,pred}^{B+k}(n) to the signal x(n) is performed to give the modified signal x'(n).

The step of quantizing the modified signal x'(n) is performed by the quantization module 306, in the same manner as that explained with reference to FIGS. 3 and 4.

Thus, the quantization of the block 306 outputs the index I_enh^{B+k}(n) and the decoded signal x^{B+k}(n) of the stage k.

In 509, a step of subtracting the reconstructed signal x ^{B + k} (n) from the signal x (n) is performed to give the reconstructed noise b ^{B + k} (n).

In 513, a step of adding the prediction signal b_{w,pred}^{B+k}(n) to the signal b^{B+k}(n) is performed to give the filtered reconstructed noise b_w^{B+k}(n).

All the steps carried out at 505, 509, 510, 511, 512 and 513 by the modules of the preprocessing block 310 make it possible to shape the coding noise of the improvement coding stage k. This noise shaping is carried out by two prediction filters which together constitute an ARMA filter, which brings better precision of the noise shaping.
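The two predictions of steps 510 and 511 and their combination at 512 and 513 can be sketched as below. It is an illustrative sketch only: the function names are assumptions, and `p_n` and `p_d` are taken to hold the coefficients p_N(1), p_N(2), ... and p_D(1), p_D(2), ... of the predictors P_N(z) and P_D(z). The second function runs the recursion on a noise sequence, which amounts to filtering that noise by W(z) = (1 - P_D(z)) / (1 - P_N(z)).

```python
def arma_noise_prediction(p_n, p_d, bw_mem, b_mem):
    """Steps 510-512: prediction of the filtered noise from past samples.

    bw_mem -- past filtered noise samples, most recent first
    b_mem  -- past (unfiltered) reconstructed noise samples, most recent first
    """
    pred_510 = sum(c * v for c, v in zip(p_n, bw_mem))
    pred_511 = sum(c * v for c, v in zip(p_d, b_mem))
    return pred_510 - pred_511


def filter_noise(p_n, p_d, noise):
    """Run steps 510-513 sample by sample on a given noise sequence."""
    bw_mem = [0.0] * len(p_n)
    b_mem = [0.0] * len(p_d)
    out = []
    for b_now in noise:
        pred = arma_noise_prediction(p_n, p_d, bw_mem, b_mem)
        bw_now = b_now + pred  # 513: filtered reconstructed noise
        out.append(bw_now)
        bw_mem = ([bw_now] + bw_mem)[:len(p_n)]
        b_mem = ([b_now] + b_mem)[:len(p_d)]
    return out
```

Feeding a unit impulse as noise returns the impulse response of the resulting ARMA filter, which is a convenient way to check the signs of the two predictions.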

FIG. 6 illustrates yet another embodiment of the preprocessing block 310, where the difference lies in the way in which the filtered reconstructed noise b_w^{B+k}(n) is calculated. The filtered reconstructed noise b_w^{B+k}(n) is obtained here by subtracting the reconstructed signal x^{B+k}(n) from the signal x'(n) at 614.

In FIGS. 5 and 6 described above, it is also possible to update the memories of the weighting filters from the filtered reconstructed noise signal b_w^{B+k}(n) for the past samples.

FIG. 7 illustrates an alternative embodiment for the quantization step 306 of the signal x '(n) by differently processing the predicted signal x ^B (n) from the core coding. This embodiment is presented with the example of pretreatment block 310 shown in FIG. 3, but may of course be integrated with pretreatment blocks described in FIGS. 4, 5 and 6. The sequence of operations according to FIG. 7 is as follows:

- Calculation at 301 of the masking filter and determination of the weighting filter W(z) or of its predictive version W_PRED(z);

- Encoding at 302 of the input signal x(n) by a nested PCM/ADPCM encoder of B+k-1 bits, possibly with coding noise shaping using the masking filter determined at 301;

- Determination at 701 of the adaptive dictionary D_k' as a function of the scale factor v(n) of the core stage (in the case of ADPCM coding) and of the quantization indices I^{B+k-1}(n) of the nested encoding preceding the stage k (d_1^{B+k}'(n) = y_{2I^{B+k-1}}^{B+k} v(n) and d_2^{B+k}'(n) = y_{2I^{B+k-1}+1}^{B+k} v(n));

- Filtering of the signal x(n) by W(z) at 311 to obtain the modified input signal x'(n) of the enhancement quantizer with, for filter memories, values corresponding to an input signal x(n) - x^{B+k}(n);

- Quantization of x'(n) at 706 to give the index I_enh^{B+k}(n) and the decoded signal x^{B+k}(n) of the stage k.

In this embodiment, the predicted signal x_P^B(n) of the core stage is subtracted from the signal x'(n) (module 702) to obtain the modified signal x"(n) = x'(n) - x_P^B(n).

The module 707 gives the index of the code word I_enh^{B+k}(n) (1 bit in the illustrative example) of the adaptive dictionary D_k' which minimizes the quadratic error between x"(n) and the code words d_1^{B+k}'(n) and d_2^{B+k}'(n). This index is to be concatenated with the index of the preceding nested encoding I^{B+k-1} in order to obtain at the decoder the index of the current nested encoding I^{B+k} comprising the stage k.

The module 708 gives the quantized value x̃"(n) of the signal x"(n) by inverse quantization of the index I_enh^{B+k}(n): x̃"(n) = d_i^{B+k}'(n), with i the dictionary entry designated by I_enh^{B+k}(n). The module 703 calculates the quantized signal of stage k by adding the predicted signal to the quantizer output signal: x^{B+k}(n) = x_P^B(n) + x̃"(n).

Finally, a step of updating the memories of the filter W(z) is performed at 311, to obtain memories that correspond to an input x(n) - x^{B+k}(n). Typically, the current value of the decoded signal x^{B+k}(n) is subtracted from the most recent memory (or memories, in the case of an ARMA-type filter).

The solution in FIG. 7 is equivalent in terms of quality and storage to that of FIG. 3, but requires less calculation in the case where the improvement stage uses more than one bit. Indeed, instead of adding the predicted value x_P^B(n) to all the code words (more than 2), a single subtraction is performed before quantization and a single addition is performed to recover the quantized value x^{B+k}(n). The complexity is thus reduced.
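The equivalence between the two ways of handling the predicted value can be checked with the following Python sketch. The function names and the `scaled_levels` argument (reconstruction levels of stage k already multiplied by the scale factor) are illustrative assumptions.

```python
def quantize_fig3(x_mod, x_pred, scaled_levels):
    """Add the predicted value to every code word, then search (FIG. 3 style)."""
    candidates = [x_pred + y for y in scaled_levels]
    idx = min(range(len(candidates)),
              key=lambda i: (x_mod - candidates[i]) ** 2)
    return idx, candidates[idx]


def quantize_fig7(x_mod, x_pred, scaled_levels):
    """Subtract the predicted value once, search, then add it back (FIG. 7 style)."""
    x_res = x_mod - x_pred                                  # module 702
    idx = min(range(len(scaled_levels)),
              key=lambda i: (x_res - scaled_levels[i]) ** 2)  # module 707
    return idx, x_pred + scaled_levels[idx]                 # modules 708 and 703
```

Both functions select the same index and the same quantized value; the second performs one subtraction and one addition instead of one addition per code word.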

Another alternative embodiment is illustrated in FIG. 7b. Here, the adaptive dictionary D_k" is constructed by subtracting the reconstruction levels of the stage k, weighted where appropriate by the scale factor v(n), from the modified input signal x'(n). In this case, it is the prediction signal x_P^B(n) that is quantized by minimizing the quadratic error. The decoded signal x^{B+k}(n) for the update of the memories is then obtained in the following way: x^{B+k}(n) = x'(n) + x_P^B(n) - d_i^{B+k}"(n).
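This variant can be sketched in Python in the same spirit. The function name and the `scaled_levels` argument (reconstruction levels of stage k already multiplied by the scale factor) are assumptions made for illustration.

```python
def quantize_fig7b(x_mod, x_pred, scaled_levels):
    """Quantize the predicted value against levels subtracted from the input.

    x_mod         -- modified (weighted) input sample
    x_pred        -- predicted value of the core stage
    scaled_levels -- reconstruction levels of stage k times the scale factor
    """
    # Dictionary built by subtracting each scaled level from the modified input.
    dictionary = [x_mod - y for y in scaled_levels]
    # The predicted value is quantized by minimizing the squared error.
    idx = min(range(len(dictionary)),
              key=lambda i: (x_pred - dictionary[i]) ** 2)
    # Decoded signal used for the update of the filter memories.
    x_dec = x_mod + x_pred - dictionary[idx]
    return idx, x_dec
```

The decoded value equals the predicted value plus the selected scaled level, up to floating-point rounding, so the result matches a direct search on the levels.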

Figure 8 details a possible embodiment of a core coding with noise shaping. The module 801 calculates the coefficients of the noise shaping filter, of the form 1/A(z/γ) or A(z/γ₁)/A(z/γ₂). The module 802 calculates the coding error q_w(n) = x̃(n) - x(n) of the previous sampling instants n - 1, n - 2, .... This error is filtered by a prediction filter H_PRED(z) to obtain the prediction signal q_{w,pred}(n). The filter H(z) corresponding to H_PRED(z) may be equal, for example, to

At time n, this predicted value is subtracted from the signal to be coded in order to obtain the modified signal to be coded x'(n) = x(n) - q_{w,pred}(n).

The difference between the output and the input of the PCM/ADPCM encoder - PCM/ADPCM decoder chain, q(n) = x̃(n) - x'(n), can be considered in the short term as white noise when these encoders use a quantizer with a large number of levels and the input signal is assumed stationary.

Take the example where H(z) = A(z/γ). The input signal of the standard PCM/ADPCM coding chain is modified by the subtraction of the contribution (H(z) - 1)(X̃(z) - X(z)). As a result, the coding noise of the complete chain, q_G(n) = x̃(n) - x(n), will be shaped by the filter 1/H(z). Here is the proof in equations:

X̃(z) = X'(z) + Q(z) = X(z) - (H(z) - 1)(X̃(z) - X(z)) + Q(z)

= X̃(z) - H(z) X̃(z) + H(z) X(z) + Q(z)

hence H(z) X̃(z) = H(z) X(z) + Q(z), and therefore X̃(z) = X(z) + Q(z) / H(z).

In fact, the filter H_PRED(z) = H(z) - 1 has a zero coefficient at z^0 (for instant n); it is therefore a predictor acting on q_w(n) = x̃(n) - x(n), which is known only at the end of the PCM/ADPCM processing, when the decoded value x̃(n) is known.

The sequence of operations in Figure 8 is as follows:

- Calculation at 801 of the masking filter and determination of the filter H(z). Note that the filter H(z) can also be determined from the decoded signal x̃(n);

- Calculation at 803 of the prediction q_{w,pred}(n), ([H(z) - 1] Q_w(z)), from the values q_w(n) = x̃(n) - x(n) of the previous sampling instants n - 1, n - 2, ...;

- Subtraction at 804 of the prediction q_{w,pred}(n) from x(n) to obtain the modified signal x'(n);

- Encoding/decoding at 805-806 of the modified signal x'(n) by a standard PCM/ADPCM encoder/decoder. The local decoder can be a standard local decoder of the PCM/ADPCM type of the G.711, G.721, G.726, G.722 or G.727 standards;

- Calculation at 802 of the filtered coding noise q_w(n) by subtracting the input signal x(n) from the output signal x̃(n).

The circled portion 807 can be seen and implemented as a noise shaping pretreatment that modifies the input of the standard encoder/decoder chain.
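The noise feedback loop of Figure 8 can be sketched in Python as below. This is a simplified sketch under stated assumptions: a uniform scalar quantizer of step `step` stands in for the standard PCM/ADPCM encoder/decoder chain, `h` holds the coefficients h_1..h_N of H(z) = 1 + Σ h_i z^{-i}, and all names are illustrative.

```python
def noise_feedback_codec(x, h, step):
    """Noise shaping pretreatment around a stand-in quantizer (sketch of FIG. 8).

    x    -- input samples
    h    -- coefficients h_1..h_N of H(z) (the coefficient for instant n is 1)
    step -- step of the uniform quantizer standing in for PCM/ADPCM
    Returns the decoded samples and the quantizer noise q(n).
    """
    qw_mem = [0.0] * len(h)  # past values of decoded minus input
    decoded, q = [], []
    for x_now in x:
        # 803: prediction from the coding noise of the previous instants.
        qw_pred = sum(c * v for c, v in zip(h, qw_mem))
        # 804: modified signal to be coded.
        x_mod = x_now - qw_pred
        # 805-806: stand-in encoder/decoder chain.
        x_dec = step * round(x_mod / step)
        q.append(x_dec - x_mod)
        # 802: coding noise of the complete chain, stored for later predictions.
        qw_mem = ([x_dec - x_now] + qw_mem)[:len(h)]
        decoded.append(x_dec)
    return decoded, q
```

Filtering the overall coding noise (decoded minus input) by H(z) gives back the quantizer noise q(n), which is the time-domain form of the relation derived above.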

An exemplary embodiment of an encoder according to the invention is now described with reference to FIG.

Materially, an encoder 900 as described according to the various embodiments above, in the sense of the invention, typically comprises a processor μP cooperating with a memory block BM including a storage and/or working memory, as well as the aforesaid buffer memory MEM as a means for storing, for example, a dictionary of quantization reconstruction levels or any other data necessary for the implementation of the coding method as described with reference to Figures 3, 4, 5, 6 and 7. This encoder receives as input successive frames of the digital signal x(n) and delivers concatenated quantization indices I^{B+K}.

The memory block BM may comprise a computer program comprising the code instructions for implementing the steps of the method according to the invention when these instructions are executed by a processor μP of the encoder, and in particular the steps of obtaining possible quantization values for the current improvement stage k by determining absolute reconstruction levels of the single current stage k from the indices of the preceding nested encoder, and of quantization of the input signal of the hierarchical coder, which has or has not undergone perceptual weighting processing (x(n) or x'(n)), from said possible quantization values, to form a quantization index of the stage k and a quantized signal corresponding to one of the possible quantization values.

In a more general manner, a means of storage, readable by a computer or a processor, integrated or not integrated with the encoder, possibly removable, stores a computer program implementing a coding method according to the invention.

Figures 3 to 7 may for example illustrate the algorithm of such a computer program.

Claims

1. A method of encoding an input digital audio signal (x(n)) in a hierarchical encoder comprising a B-bit core coding stage and at least one current enhancement coding stage k, the core coding and the coding of the improvement stages preceding the current stage k delivering quantization indices which are concatenated to form the indices of the preceding nested encoder (I^{B+k-1}), the method being characterized in that it comprises the following steps:

obtaining (303) possible quantization values (d_i^{B+k}(n)) for the current improvement stage k from the absolute reconstruction levels (y^{B+k}) of the single current stage k and from the indices of the preceding nested encoder (I^{B+k-1});

quantization (306) of the input signal of the hierarchical coder, which has or has not undergone perceptual weighting processing (x(n) or x'(n)), from said possible quantization values (d_i^{B+k}(n)), to form a quantization index of the stage k (I_enh^{B+k}(n)) and a quantized signal (x^{B+k}(n)) corresponding to one of the possible quantization values.

2. The method according to claim 1, characterized in that the input signal has undergone perceptual weighting processing using a predetermined weighting filter to give a modified input signal x'(n), before the quantization step (306), and in that it further comprises a step of adapting (311) the weighting filter memories from the quantized signal (x^{B+k}(n)) of the current enhancement coding stage.

3. Method according to claim 1, characterized in that the possible quantization values for the improvement stage k further contain a scale factor and a prediction value from the adaptive type core coding.

4. Method according to claim 2, characterized in that the modified input signal (x"(n)) to be quantized at the improvement stage k is the perceptually weighted input signal from which a prediction value from the adaptive type core coding is subtracted.

5. Method according to claim 1 to 4, characterized in that the perceptual weighting treatment is carried out by prediction filters forming an ARMA type filter.

6. Hierarchical coder of an input digital audio signal (x(n)), comprising a B-bit core coding stage and at least one current improvement coding stage k, the core coding and the coding of the improvement stages preceding the current stage k delivering quantization indices which are concatenated to form the indices of the preceding nested encoder (I^{B+k-1}), the encoder being characterized in that it comprises:

a module for obtaining (303) possible quantization values (d_i^{B+k}(n)) for the current improvement stage k by determining absolute reconstruction levels of the single current stage k from the indices of the preceding nested encoder (I^{B+k-1});

a quantization module (306) of the input signal of the hierarchical coder, which has or has not undergone perceptual weighting processing (x(n) or x'(n)), from said possible quantization values (d_i^{B+k}(n)), to form a quantization index of the stage k (I_enh^{B+k}(n)) and a quantized signal (x^{B+k}(n)) corresponding to one of the possible quantization values.

7. Hierarchical coder according to claim 6, characterized in that it further comprises a perceptual weighting pre-processing module (310) using a predetermined weighting filter to provide a modified input signal (x'(n)) at the input of the quantization module (306), and an adaptation module (311) of the weighting filter memories from the quantized signal (x^{B+k}(n)) of the current enhancement coding stage.

8. Computer program comprising code instructions for implementing the steps of the coding method according to one of claims 1 to 5, when these instructions are executed by a processor.