EP1692689A1

EP1692689A1 - Optimized multiple coding method

Info

Publication number: EP1692689A1
Application number: EP04805538A
Authority: EP
Inventors: David Virette; Claude Lamblin; Abdellatif Benjelloun Touimi
Original assignee: France Telecom SA
Current assignee: Orange SA
Priority date: 2003-12-10
Filing date: 2004-11-24
Publication date: 2006-08-23
Anticipated expiration: 2024-11-24
Also published as: EP1692689B1; CN1890714B; KR101175651B1; WO2005066938A1; US7792679B2; JP2007515677A; JP4879748B2; US20070150271A1; ZA200604623B; PL1692689T3; FR2867649A1; CN1890714A; DE602004023115D1; KR20060131782A; ATE442646T1; ES2333020T3

Abstract

The invention relates to the compression coding of digital signals such as multimedia signals (audio or video), and more particularly to a method for multiple coding, wherein several encoders, each comprising a series of functional blocks, receive an input signal in parallel. According to the invention, a) the functional blocks (BF10, ..., BFnN) forming each encoder are identified, along with one or several functions carried out by each block, b) functions which are common to various encoders are itemized and c) said common functions are carried out once and for all for at least a part of all the encoders within at least one same calculation module (BF1CC, ..., BFnCC).

Description

Optimized multiple coding method

The present invention relates to the encoding / decoding of digital signals, in applications for transmission or storage of multimedia signals such as audio signals (speech and / or sounds) or video.

To provide mobility and continuity, modern and innovative multimedia communication services must be able to operate under a wide variety of conditions. The dynamism of the multimedia communication sector, the heterogeneity of networks, access and terminals have led to a proliferation of compression formats.

The present invention lies in the context of optimizing "multiple coding" techniques, which are used whenever a digital signal, or a portion of this signal, is coded according to several coding techniques. This multiple coding may or may not be performed simultaneously (in one pass). The processing can be carried out on the same signal, or possibly on versions derived from the same signal (for example with different bandwidths). "Multiple coding" is thus distinguished from "transcoding", where each coder compresses a version resulting from the decoding of the signal compressed by the preceding coder.

Multiple coding arises, for example, when the same content is coded in several formats and then transmitted to terminals that do not support the same coding formats. For a real-time broadcast, the processing must be done simultaneously. For access to a database, the codings can be carried out one after another, offline. In these examples, the multiple coding makes it possible to code the same signal in different formats by using several coders (or possibly several rates or several modes of the same coder), each coder operating independently of the others.

Another use of multiple coding occurs in coding structures where multiple coders compete to code a signal segment, with only one coder ultimately selected for coding that segment. The choice of the selected coder can be made at the end of the processing of this segment, or even later (by delayed decision). In what follows, this type of structure will be referred to as "multi-mode coding" (with reference to the selection of a coding "mode"). In these multi-mode structures, several coders sharing a "common past" are required to code the same signal portion. The coding techniques used may be different or derived from a single coding structure. However, they will not be completely independent unless they are "memory-free" techniques. Indeed, in the (common) case of coding techniques implementing recursive processing, the processing of a given signal segment depends on the way in which this signal has been coded in the past. There is therefore a certain dependence between the coders, since a coder has to take into account in its memories the output of another coder.

In these different contexts, the notion of "multiple coding" has been introduced as well as the conditions of use of such techniques. However, the complexity of implementation can be prohibitive.

For example, in the case of content servers that broadcast the same content in several formats adapted to the access conditions, networks and terminals of different clients, this operation becomes extremely complex as the number of desired formats increases. For a real-time broadcast, the resources of the system quickly become the limiting factor, since the different formats are coded in parallel. The second use case mentioned concerns multi-mode coding applications, which select one coder from a set for each portion of signal analyzed. The selection requires the definition of a criterion, the most common one aiming to optimize the rate-distortion trade-off. Since the signal is analyzed over successive time segments, several codings are evaluated for each segment. The coding with the lowest bit rate for a given quality is then selected, or the coding with the best quality for a given bit rate. Constraints other than rate/distortion can also be used.
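By way of illustration only, the two selection rules just mentioned can be sketched in Python as follows. The record type `Candidate`, the function names and the numerical values are hypothetical and are not taken from any coder; distortion is expressed in arbitrary units where lower is better.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Result of coding one segment with one coder (hypothetical record)."""
    name: str
    bit_rate: float    # kbit/s
    distortion: float  # arbitrary units, lower is better


def lowest_rate_for_quality(candidates, max_distortion):
    """Return the cheapest candidate meeting the quality target, or None."""
    ok = [c for c in candidates if c.distortion <= max_distortion]
    return min(ok, key=lambda c: c.bit_rate) if ok else None


def best_quality_for_rate(candidates, max_bit_rate):
    """Return the least distorted candidate within the rate budget, or None."""
    ok = [c for c in candidates if c.bit_rate <= max_bit_rate]
    return min(ok, key=lambda c: c.distortion) if ok else None


# Made-up results for one segment coded by three modes.
segment_results = [
    Candidate("mode_16k", 16.0, 0.30),
    Candidate("mode_24k", 24.0, 0.18),
    Candidate("mode_32k", 32.0, 0.10),
]

by_quality = lowest_rate_for_quality(segment_results, max_distortion=0.20)
by_rate = best_quality_for_rate(segment_results, max_bit_rate=16.0)
```

Both rules require every candidate coding of the segment to be available before the choice is made, which is the source of the complexity discussed here.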

In general, in such structures, the coding selection is performed a priori by an analysis of the signal over the segment under consideration (selection according to the characteristics of the signal). However, it is difficult to produce a robust classification of the signal for this selection. This has led to proposals for selecting the optimal mode after all the modes have been encoded, at the cost, however, of high complexity.

Intermediate methods combining the two approaches have been proposed to reduce the cost of calculation. These strategies, however, are suboptimal and prove to be less efficient than the exploration of all modes. The exploration of all the modes or a large part of the modes constitutes a multiple coding application which presents a potentially high complexity, hardly compatible a priori with the real-time coding for example.

Currently, most multiple coding and transcoding operations do not take into account the interactions between formats, or between a format and its content. Some multi-mode coding techniques have been proposed, but the decision on the mode used is generally made a priori, either from the signal (by classification, as for example in the SMV coder, for "Selectable Mode Vocoder") or according to the network conditions (as for example in AMR coders, for "Adaptive Multi-Rate").

In the documents: "An overview of variable rate speech coding for cellular networks", Gersho, A .; Paksoy, E .; Wireless Communications, 1992. Conference Proceedings, 1992 IEEE International Conference on Selected Topics, 25-26 Jun 1992 Page (s): 172 -175,

"A variable rate speech coding algorithm for cellular networks", Paksoy, E .; Gersho, A .; Speech Coding for Telecommunications, 1993. Proceedings, IEEE Workshop 1993, Page (s): 109-110,

"Variable rate speech coding for multiple access wireless networks", Paksoy E .; Gersho A .; Electrotechnical Conference, 1994, Proceedings, 7th Mediterranean, 12-14 Apr 1994 Page (s): 47 -50 vol.1,

several selection modes are presented, in particular a decision controlled by the source and a decision controlled by the network.

In the case of a decision controlled by the source, the decision a priori is made from a classification of the input signal. There are many methods of signal classification.

In the case of a decision controlled by the network, it is simpler to produce a multi-mode encoder whose bit rate is chosen by an external module rather than by the source. The simplest method is to develop a family of fixed-rate coders, each with a different bit rate, and to switch between these rates to obtain the desired current mode. Some work has also been presented on the possibility of combining several criteria to select a priori the mode to be used, particularly in the documents:

"Variable-rate for basic speech service in UMTS" Berruto, E .; Sereno, D .; Vehicular Technology Conference, 1993 IEEE 43rd, 18-20 May 1993 Page (s): 520 -523

"A VR-CELP codec implementation for CDMA mobile communications" Cellario, L .; Sereno, D .; Giani, M .; Blocher, P .; Hellwig, K .; Acoustics, Speech, and Signal Processing, 1994, ICASSP-94, 1994 IEEE International Conference, Volume: 1, 19-22 Apr 1994 Page (s): I/281 -I/284 vol.1.

All the multi-mode coding algorithms with a priori selection of the coding mode suffer from the same drawback, related in particular to the lack of robustness of the a priori classification.

This is why techniques using a posteriori decision of the coding mode have been proposed. For example in the document:

"Finite state CELP for variable rate speech coding" Vaseghi, S.V; Acoustics, Speech, and Signal Processing, 1990, ICASSP-90, 1990 International Conference, 3-6 Apr 1990 Page (s): 37-40 vol.1,

the encoder can switch between different modes by optimizing an objective quality measurement. The decision is therefore made a posteriori, based on the characteristics of the input signal, the target rate/SQNR ratio (SQNR standing for "Signal to Quantization Noise Ratio") and the current state of the encoder. Such a coding scheme improves quality. However, since the different codings are performed in parallel, the resulting complexity of this type of system is prohibitive. Other techniques combining an a priori decision and a closed-loop refinement have been proposed. In the document:

"Multimode variable bit rate speech coding: an efficient paradigm for high-quality low-rate representation of speech." Das, A. DeJaco, A. Manjunath, S. Ananthapadmanabhan, A. Huang, J. Choy, E. Acoustics, Speech, and Signal Processing, 1999. ICASSP '99 Proceedings, 1999 IEEE International Conference, Volume: 4, 15-19 Mar 1999 Page (s): 2307 -2310 vol.4,

the proposed system makes a first selection (open-loop selection) of the mode, depending on the characteristics of the signal. This decision can be made by classification. Then, from an error measurement, if the performances of the selected mode are not satisfactory, a higher rate mode is applied and the operation is repeated (according to a decision sought in closed loop).
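The combination of an open-loop choice and a closed-loop refinement described above may be sketched, purely as an illustration, by the following Python function. The callables `classify`, `encode` and `error_of`, the uniform quantizer used as a stand-in coder and the numerical values are all assumptions made for this sketch; they do not reproduce the system of the cited document.

```python
def select_mode(segment, modes, classify, encode, error_of, max_error):
    """Open-loop choice followed by closed-loop escalation.

    `modes` is ordered by increasing bit rate. `classify`, `encode` and
    `error_of` are caller-supplied (hypothetical) callables.
    """
    index = classify(segment)              # open-loop (a priori) decision
    while True:
        coded = encode(segment, modes[index])
        if error_of(segment, coded) <= max_error or index == len(modes) - 1:
            return modes[index], coded     # accepted, or no higher rate left
        index += 1                         # closed loop: try the next higher rate


# Toy stand-ins: a "mode" is a number of quantization levels per unit,
# so a higher value behaves like a higher bit rate.
modes = [4, 8, 16]
classify = lambda seg: 0                                   # always start at the lowest mode
encode = lambda seg, levels: [round(x * levels) / levels for x in seg]
error_of = lambda seg, coded: max(abs(a - b) for a, b in zip(seg, coded))

segment = [0.1, 0.33, 0.71]
chosen_mode, coded = select_mode(segment, modes, classify, encode, error_of, max_error=0.05)
```

In this sketch only the modes actually tried are encoded, which is how such schemes limit the number of codings compared with an exhaustive a posteriori selection.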

Similarly, in the documents:

* "Variable rate speech coding for umts" Cellario, L .; Sereno, D .; Speech Coding for Telecommunications, 1993. Proceedings, IEEE Workshop, 1993 Page (s): 1 -2

"Phonetically-based vector excitation coding of speech at 3.6 kbps" Wang, S .; Gersho, A .; Acoustics, Speech, and Signal Processing, 1989. ICASSP-89., 1989 International Conference, 23-26 May 1989 Page (s): 49 -52 vol.1

* "A modified CS-ACELP algorithm for robust variable-rate speech coding in noisy environments" Beritelli, F .; IEEE Signal Processing Letters, Volume: 6 Issue: 2, Feb 1999 Page (s): 31 -34,

similar techniques were used. A first open-loop selection is performed after classification of the input signal (phonetic classification, or voiced/unvoiced classification), then a closed-loop decision is made:

- either on the complete encoder, in which case the whole speech segment is coded again;
- or on a part of the coding, as in the above references preceded by a star (*), for which the choice of the dictionary to be used is made in closed loop.

All the studies mentioned above tend to solve the problem of the complexity of the optimal selection of the mode by the use, total or partial, of a selection or pre-selection a priori, which avoids the multiple coding or decreases the number of coders to be implemented in parallel.

However, no technique of the prior art that makes it possible to reduce the complexity of the codings made in parallel has been proposed.

The present invention improves the situation.

To this end, it proposes a method of multiple coding in compression, in which an input signal is intended to supply in parallel a plurality of coders each comprising a succession of functional blocks, for compression coding of said signal by each encoder.

The method of the invention comprises the following preparatory steps: a) identifying the functional blocks forming each coder, as well as one or more functions performed by each block, b) identifying, among said functions, functions that are common from one coder to another, and c) performing said common functions, once and for all, for at least a part of all the coders, within at least one same calculation module.
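Steps a) and b) can be pictured, as a minimal sketch only, by the following Python fragment. The coder descriptions, block names and function labels are hypothetical placeholders chosen for the illustration; the method itself does not prescribe any particular data structure.

```python
from collections import defaultdict

# Step a): hypothetical description of three coders, block name -> functions performed.
coders = {
    "C0": {"BF1": {"mdct"}, "BF2": {"masking_curve"}, "BF3": {"quantize_16k"}},
    "C1": {"BF1": {"mdct"}, "BF2": {"masking_curve"}, "BF3": {"quantize_24k"}},
    "C2": {"BF1": {"mdct"}, "BF2": {"masking_curve"}, "BF3": {"quantize_32k"}},
}


def common_functions(coders):
    """Step b): map each function used by more than one coder to those coders."""
    users = defaultdict(set)
    for coder, blocks in coders.items():
        for functions in blocks.values():
            for function in functions:
                users[function].add(coder)
    return {f: sorted(u) for f, u in users.items() if len(u) > 1}


shared = common_functions(coders)
```

The functions returned in `shared` are the candidates for step c), that is, for a single execution in a common calculation module.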

In an advantageous embodiment, the above steps are implemented by a computer program product comprising program instructions for this purpose. As such, the present invention also relates to such a computer program product, intended to be stored in a memory of a processing unit, in particular of a computer or mobile terminal, or on a removable memory medium intended to cooperate with a reader of the processing unit.

The present invention also relates to a device for aiding in compression encoding, for implementing the method according to the invention, and comprising a memory adapted to store instructions of a computer program product of the aforementioned type.

Other features and advantages of the invention will appear on examining the detailed description below, and the attached drawings in which:

FIG. 1a schematically illustrates the application context of the present invention, with a plurality of encoders placed in parallel,

FIG. 1b schematically illustrates the application of the invention, with the sharing of functional blocks between several coders in parallel,

FIG. 1c schematically illustrates the application of the invention, with the sharing of functional blocks in multi-mode coding,

FIG. 1d schematically illustrates the application of the invention, in trellis multi-mode coding,

FIG. 2 schematically represents the main functional blocks of a perceptual frequency coder,

FIG. 3 schematically represents the main functional blocks of an analysis-by-synthesis coder,

FIG. 4a schematically represents the main functional blocks of a TDAC coder,

FIG. 4b diagrammatically represents the format of the bit stream coded by the coder of FIG. 4a,

FIG. 5 diagrammatically represents the application of the invention to a plurality of TDAC coders in parallel, in an advantageous embodiment,

FIG. 6a schematically represents the main functional blocks of an MPEG-1 (layer I and II) coder,

FIG. 6b diagrammatically represents the format of the bit stream coded by the coder of FIG. 6a,

FIG. 7 diagrammatically represents the application of the invention to a plurality of MPEG-1 coders (layer I and II) connected in parallel, according to an advantageous embodiment,

and FIG. 8 shows in more detail the functional blocks of an analysis-by-synthesis coder, here of the NB-AMR type according to the 3GPP standard.

Referring firstly to FIG. 1a, there is shown a plurality of encoders C0, C1, ..., CN, in parallel and each receiving an input signal s₀. Each encoder comprises functional blocks BF1 to BFn for implementing successive coding steps and ultimately outputting a coded bit stream BS0, BS1, ..., BSN. In a multi-mode coding application, the outputs of the coders C0 to CN are connected to an optimal mode selection module MM, and the bit stream BS of the optimal coder is transmitted (dotted arrows in FIG. 1a).

For the sake of simplicity, all the coders of the example of FIG. 1a have the same number of functional blocks, but of course all these functional blocks are not necessarily provided in all the coders, in practice. Some BFi function blocks are sometimes identical from one mode (or encoder) to another, while others differ only in quantizer level. Usable relationships also exist when encoders from the same coding family are used, using similar models or computing parameters physically related to the signal.

It is these relationships that the present invention proposes to exploit, in order to reduce the complexity of multiple coding operations.

In a first step, the invention proposes to identify the functional blocks that make up each of the coders. The technical similarities between the coders are then exploited by considering the functional blocks whose functions are equivalent or similar. For each of these blocks, the invention proposes:

- on the one hand, to define so-called "common" operations and to perform them only once for all the coders;
- on the other hand, to implement calculation methods specific to each coder, which in particular use the results of these common calculations.

These calculation methods may produce a result different from that produced by a complete coding. The objective is then to accelerate the processing by exploiting the information available, provided in particular by the common calculations. Such methods for accelerating calculations are, for example, implemented in techniques intended to reduce the complexity of transcoding operations (so-called "intelligent transcoding" techniques).

FIG. 1b illustrates the proposed solution. In the example shown, the aforementioned "common" operations are performed once for at least part of the coders and, preferably, for all the coders, in an independent module M1 which redistributes the results obtained to at least part of the coders, or preferably to all of them. The results obtained are thus shared between at least part of all the coders C0 to CN (this sharing is called "pooling" hereafter). Such an independent module M1 can be part of a device for multiple compression coding as defined above.
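The role of such an independent module can be illustrated, as a sketch under stated assumptions, by the Python class below. The class `SharedModule`, the "energy" operation and the toy coders built by `make_coder` are hypothetical; they only show a common operation being computed once per frame and its result being redistributed to several coders.

```python
class SharedModule:
    """Computes each common operation once per frame and hands out the result."""

    def __init__(self, operations):
        self._operations = operations   # operation name -> callable(frame)
        self._frame = None
        self._cache = {}
        self.calls = 0                  # number of common operations actually run

    def new_frame(self, frame):
        self._frame = frame
        self._cache = {}                # results of the previous frame are discarded

    def get(self, name):
        if name not in self._cache:
            self.calls += 1
            self._cache[name] = self._operations[name](self._frame)
        return self._cache[name]


def make_coder(step):
    """Toy coder: quantizes the shared frame energy with its own step size."""
    def coder(shared):
        energy = shared.get("energy")   # common result, not recomputed
        return round(energy / step) * step
    return coder


m1 = SharedModule({"energy": lambda frame: sum(x * x for x in frame)})
coders = [make_coder(step) for step in (4, 2, 1)]

m1.new_frame([1.0, 2.0, 3.0])
outputs = [coder(m1) for coder in coders]
```

After the three coders have run, `m1.calls` is 1: the energy has been computed once, and only the coder-specific quantization has been repeated.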

In an advantageous variant, rather than resorting to an external calculation module M1, use is made of one or more existing functional blocks BF1 to BFn of one and the same coder or of several different coders, this coder or these coders being chosen according to criteria which will be described later.

The present invention can implement several strategies which, of course, may differ depending on the role of the functional block considered.

A first strategy is to use the parameters of the encoder whose bit rate is the lowest to focus the search parameters for all other modes.

Conversely, a second strategy is to use the parameters of the encoder whose rate is the highest, then to "degrade" progressively to the encoder whose bit rate is the lowest.

Of course, if it is desired to favor a particular encoder, it is possible to encode a signal segment using this encoder, then, by applying the two strategies above, to reach the upper and lower rate encoders.
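The idea of focusing the search of the other modes around the parameters found by a reference coder may be sketched as follows. The parameter searched (called `lag` here), its range, the cost functions and the search radius are all hypothetical values chosen for the illustration.

```python
def counted(cost):
    """Wrap a cost function so the number of evaluations can be read back."""
    stats = {"evaluations": 0}

    def wrapper(candidate):
        stats["evaluations"] += 1
        return cost(candidate)

    return wrapper, stats


def full_search(cost, candidates):
    """Exhaustive search, performed only for the reference coder."""
    return min(candidates, key=cost)


def focused_search(cost, candidates, anchor, radius):
    """Search restricted to the neighbourhood of the reference coder's result."""
    nearby = [c for c in candidates if abs(c - anchor) <= radius]
    return min(nearby, key=cost)


lags = range(20, 144)                                         # hypothetical parameter range
cost_low, stats_low = counted(lambda lag: abs(lag - 57.0))    # reference (lowest-rate) coder
cost_high, stats_high = counted(lambda lag: abs(lag - 58.4))  # another, higher-rate coder

anchor = full_search(cost_low, lags)
refined = focused_search(cost_high, lags, anchor, radius=3)
```

In this toy case the reference coder evaluates 124 candidates and the second coder only 7. The focused result may differ from what a full search would give for the second coder whenever its optimum lies outside the chosen radius.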

Of course, criteria other than bit rate can be used to drive the search. For some functional blocks, for example, it is possible to favor the coder whose parameters lend themselves best to efficient extraction (or analysis) and/or coding of the similar parameters of the other coders, the efficiency being judged according to complexity, quality or a trade-off between the two.

It may also be envisaged to create an independent coding module, not present in the coders, but allowing a more efficient coding of the parameters of the functional block considered, for all the coders.

These various implementation strategies are particularly interesting in the case of multi-mode coding. In this context illustrated in FIG. 1c, the present invention makes it possible to reduce the complexity of the calculations preliminary to the a posteriori selection of an encoder carried out in the last step, for example by the last module MM before the transmission of the bit stream BS.

In this particular case of multi-mode coding, a variant of the present invention, represented in the example of FIG. 1c, proposes to introduce a partial selection module MSPi (with i = 1, 2, ..., N) after each coding step (thus after the functional blocks BFi1 to BFiNi, which are put in competition and of which the result of the selected block or blocks BFicc is used subsequently). Thus, the similarities between the different modes are exploited to speed up the calculation of each functional block. Not all coding schemes will necessarily be evaluated.

A more sophisticated variant of the multi-mode structure, based on the division into functional blocks described above, is now proposed with reference to FIG. 1d. The multi-mode structure of FIG. 1d is called a "trellis", with several possible paths through the trellis. In fact, FIG. 1d shows all possible paths of the trellis, so that it takes the form of a tree. Each path of the trellis is defined by a combination of operating modes of the functional blocks, each functional block feeding several possible variants of the next functional block. Thus, each coding mode is derived from a combination of operating modes of the functional blocks: functional block 1 has N₁ operating modes, functional block 2 has N₂, and so on up to block P. The set of NN = N₁ × N₂ × ... × N_P possible combinations is therefore represented by a trellis of NN branches describing, end to end, a complete multi-mode encoder with NN modes. Some branches of the trellis may be removed a priori, thus defining a tree with a reduced number of branches. A first feature of this structure is that it provides, for a given functional block, one common calculation module per output of the previous functional block. These common calculation modules perform the same operations, but on different signals, since they come from different previous blocks. Advantageously, the common calculation modules of the same level are pooled: the results of a given module that are usable by the following modules are supplied to those modules. A second feature is that a partial selection, made at the end of the processing of each functional block, advantageously makes it possible to eliminate the less efficient branches according to the chosen criterion. It is therefore possible to reduce the number of branches of the trellis to be evaluated.
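The size of such a trellis and the effect of a partial selection after each functional block can be illustrated by the following Python sketch. It assumes, for simplicity, that each operating mode contributes an additive cost to the path; this assumption, the mode names and the cost values are hypothetical. With real coders the criterion is generally not additive, and the partial selection may then discard the path that an exhaustive search would have retained.

```python
from math import prod


def count_paths(modes_per_block):
    """Number of complete paths: the product of the numbers of modes per block."""
    return prod(modes_per_block)


def pruned_search(blocks, keep):
    """Explore the trellis block by block, keeping `keep` survivors per stage.

    `blocks` is a list of stages, each a list of (mode_name, cost) pairs.
    Returns the best surviving path and the number of mode evaluations.
    """
    paths = [((), 0.0)]
    evaluated = 0
    for modes in blocks:
        extended = []
        for path, cost in paths:
            for name, mode_cost in modes:
                evaluated += 1
                extended.append((path + (name,), cost + mode_cost))
        extended.sort(key=lambda item: item[1])
        paths = extended[:keep]          # partial selection after this block
    return paths[0], evaluated


blocks = [
    [("a1", 1.0), ("a2", 2.0)],
    [("b1", 3.0), ("b2", 1.0), ("b3", 2.0)],
    [("c1", 0.5), ("c2", 1.5)],
]

total_paths = count_paths([len(stage) for stage in blocks])
best, evaluated = pruned_search(blocks, keep=1)
```

Here the full trellis has 2 × 3 × 2 = 12 complete paths, whereas keeping a single survivor per stage requires only 2 + 3 + 2 = 7 mode evaluations.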

An advantageous application of this multi-mode trellis structure is as follows.

When the functional blocks are capable of operating at different respective bit rates, using parameters specific to those bit rates, the trellis path chosen for a given functional block is the one passing through the lowest-rate functional block, or through the highest-rate functional block, depending on the coding context. The results obtained from the lowest (or highest) rate functional block are then adapted to the bit rates of at least some of the other functional blocks by a focused search of parameters, up to the highest (or lowest) rate functional block.

As a variant, a functional block of given bit rate is chosen and at least some of the parameters specific to this functional block are progressively adapted:

- to the functional block capable of operating at the lowest rate, by focused search, and
- to the functional block capable of operating at the highest rate, by focused search.

In general, this reduces the complexity associated with multiple coding.

The invention applies to any compression scheme implementing the multiple encoding of a multimedia content. Three embodiments are presented in the following, in the field of audio compression (speech and sound). The first two exemplary embodiments are in the context of the family of transform coders, the following document of which can be given for reference:

"Perceptual Coding of Digital Audio", Painter, T .; Spanias, A .; Proceedings of the IEEE, Vol. 88, No. 4, April 2000.

The third exemplary embodiment is in the context of CELP coders, for which the following document can be cited for reference:

"Code Excited Linear Prediction (CELP): High quality speech at very low bit rates" Schroeder MR; Atal BS; Acoustics, Speech, and Signal Processing, 1985. Proceedings. 1985 IEEE International Conference, Page (s): 937-940.

A reminder of the main characteristics of these two coding families is first presented below.

* Transform or subband coders

These are transform or subband compression coders based on psychoacoustic criteria. This type of coder proceeds by transforming blocks of the time signal to obtain a set of coefficients. The transformations are of the time-frequency type, one of the most used being the Modified Discrete Cosine Transform (MDCT). Before the quantization of these coefficients, an algorithm allocates the bits so that the quantization noise is as inaudible as possible. The bit allocation and the quantization of the coefficients use a masking curve, obtained from a psychoacoustic model which evaluates, for each spectral line considered, a masking threshold representing the amplitude necessary for a sound at this frequency to be audible. FIG. 2 gives the schematic diagram of a frequency coder, whose structure is represented in the form of functional blocks. Referring to FIG. 2, the main functional blocks are:

- a block 21 for time/frequency transformation of the input digital audio signal s₀,
- a block 22 for determining a perceptual model from the transformed signal,
- a block 23 for quantization and coding, based on the perceptual model,
- and a block 24 for formatting the bit stream to obtain a coded audio frame s_c.

* Analysis-by-synthesis coders (CELP coding)

In analysis-by-synthesis coders, the synthesis model of the reconstructed signal is used at the encoder to extract the parameters modeling the signals to be coded. These signals can be sampled at 8 kHz (telephone band, 300-3400 Hz) or at a higher frequency, for example 16 kHz for wideband coding (bandwidth 50 Hz to 7 kHz). Depending on the application and the desired quality, the compression ratio varies from 1 to 16. These coders operate at rates of 2 to 16 kbit/s in the telephone band, and at rates of 6 to 32 kbit/s in wideband. The CELP type digital coding device, currently used as an analysis-by-synthesis coder, is presented in FIG. 3 in the form of its main functional blocks. The speech signal s₀ is sampled and converted into a sequence of frames of L samples. Each frame is synthesized by filtering a waveform extracted from a directory (called a "dictionary"), multiplied by a gain, through two time-varying filters. The fixed excitation dictionary is a finite set of waveforms of L samples. The first filter is a long-term prediction filter. An "LTP" (for "Long Term Prediction") analysis evaluates the parameters of this long-term predictor, which exploits the periodicity of voiced sounds, this harmonic component being modeled in the form of an adaptive dictionary (block 32). The second filter is a short-term prediction filter. "LPC" (for "Linear Prediction Coding") analysis methods yield these short-term prediction parameters, which are representative of the transfer function of the vocal tract and characteristic of the spectral envelope of the signal.

The method used to determine the innovation sequence is the analysis-by-synthesis method, which can be summarized as follows. At the encoder, a large number of innovation sequences from the fixed excitation dictionary are filtered by the LPC filter (synthesis filter of functional block 34 of FIG. 3), the adaptive excitation having been obtained beforehand in a similar way. The sequence selected is the one producing the synthetic signal closest to the original signal (minimizing the error at functional block 35), according to a perceptual weighting criterion (functional block 36) generally known as the "CELP" criterion.

In the block diagram of the CELP coder given in FIG. 3, the extraction of the fundamental frequency of the voiced sounds (or "pitch"), applied to the signal resulting from the LPC analysis of block 31, then makes it possible to extract the long-term correlation at block 32, called the harmonic component or adaptive excitation (EA). The residual signal is finally modeled conventionally by a few pulses, all of whose positions are predefined in a directory called the fixed excitation directory (EF), in block 33.

Decoding is, for its part, much less complex than coding. The bitstream generated by the coder enables the decoder, after demultiplexing, to obtain the quantization index of each parameter. The decoding of the parameters and the application of the synthesis model then make it possible to reconstruct the signal.

The three aforementioned embodiments are described below, starting first with a transform coder of the type shown in FIG. 2.

* First exemplary embodiment: application to a "TDAC" coder

The first embodiment relates to the perceptual frequency coder called "TDAC", described in particular in the published document US-2001/027393. This TDAC coder is used to code digital audio signals sampled at 16 kHz (wideband). FIG. 4a illustrates the main functional blocks of this coder. An audio signal x(n), band-limited to 7 kHz and sampled at 16 kHz, is divided into frames of 320 samples (20 ms). A modified discrete cosine transform (or "MDCT") is applied (functional block 41) to blocks of 640 input samples with 50% overlap, so that the MDCT analysis is refreshed every 20 ms. The spectrum is limited to 7225 Hz by setting the last 31 coefficients to zero (only the first 289 coefficients are nonzero). A masking curve (block 42) is determined from this spectrum and all masked coefficients are set to zero. The spectrum is divided into 32 bands of unequal widths. Any masked bands are determined from the transformed coefficients of the signal. For each band of the spectrum, the energy of the MDCT coefficients is calculated (to obtain scale factors). The 32 scale factors constitute the spectral envelope of the signal, which is then quantized and coded by entropy coding (functional block 43), and finally transmitted in the coded frame s_c.
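The figures quoted above are mutually consistent, as the following short Python calculation shows. It relies only on the values given in the text and on the general property that an MDCT applied to a block of 2N samples yields N coefficients; the constant names are chosen for this illustration.

```python
SAMPLE_RATE_HZ = 16_000
FRAME_SAMPLES = 320
BLOCK_SAMPLES = 640      # MDCT input block, 50% overlap
ZEROED_COEFFS = 31

frame_ms = 1000 * FRAME_SAMPLES / SAMPLE_RATE_HZ             # duration of one frame
hop_samples = BLOCK_SAMPLES // 2                             # advance between blocks at 50% overlap
coeffs_per_block = BLOCK_SAMPLES // 2                        # MDCT: 2N inputs -> N coefficients
coeff_spacing_hz = (SAMPLE_RATE_HZ / 2) / coeffs_per_block   # width covered by one coefficient
kept_coeffs = coeffs_per_block - ZEROED_COEFFS               # coefficients left nonzero
cutoff_hz = kept_coeffs * coeff_spacing_hz                   # resulting spectrum limit

print(frame_ms, hop_samples, coeffs_per_block, coeff_spacing_hz, kept_coeffs, cutoff_hz)
```

Each block of 640 samples thus gives 320 coefficients spaced 25 Hz apart, and keeping the first 289 of them places the limit of the spectrum at 289 × 25 = 7225 Hz, with a new block every 320 samples, that is, every 20 ms.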

The dynamic allocation of the bits (functional block 44) is based on a masking curve per band (functional block 42) calculated from the decoded and dequantized version of the spectral envelope. This measure ensures that the bit allocation of the encoder and that of the decoder are compatible. The normalized MDCT coefficients in each band are then quantized (functional block 45) by vector quantizers using size-nested dictionaries, the dictionaries being composed of a union of type II permutation codes. Finally, with reference to FIG. 4b, the information on the tone (coded here on one bit B₁) and the voicing (coded here on one bit B₀), as well as the spectral envelope e_q(i) and the coded coefficients y_q(j), are multiplexed (block 46 of FIG. 4a) and transmitted in frames.

Since this coder can operate at several rates, it is proposed to build a multi-rate coder, for example at 16, 24 and 32 kbit/s. In this coding scheme, the following functional blocks can be shared between the different modes:

• MDCT transform (block 41),
• voicing detection (functional block 47 of FIG. 4a) and tone detection (functional block 48 of FIG. 4a),
• calculation, quantization and entropy coding of the spectral envelope (block 43),
• calculation of a masking curve, coefficient by coefficient, and of a masking curve per band (block 42).

These different blocks make up 61.5% of the processing complexity in the coding process. Their factorization is therefore of great interest to reduce this complexity when generating several bit streams corresponding to different rates.

The results of these functional blocks already make it possible to obtain a first portion common to all the output bit streams which contains the information bits on the voicing, the tone and the coded spectral envelope.

In a first variant of this exemplary embodiment, it is possible to carry out the bit allocation and quantization operations for each of the output bit streams corresponding to each of the bit rates considered. These two operations are performed in exactly the same way as usual in a TDAC encoder.
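An order of magnitude of the saving obtained with this first variant can be estimated from the 61.5% figure quoted above. The Python sketch below assumes, as a simplification not stated in the text, that every rate has the same total complexity, that the shared blocks represent the same 61.5% share at each rate, and that the remaining 38.5% (bit allocation, quantization, multiplexing) is repeated in full for each bit stream.

```python
SHARED_FRACTION = 0.615   # share of the coding complexity in the pooled blocks (from the text)


def relative_cost(num_rates, shared_fraction=SHARED_FRACTION):
    """Cost of pooled multi-rate coding relative to `num_rates` independent coders.

    Assumption of this sketch: the shared part is computed once and the rest
    is repeated for every rate, all coders having equal complexity.
    """
    pooled = shared_fraction + num_rates * (1.0 - shared_fraction)
    return pooled / num_rates


cost_three_rates = relative_cost(3)   # e.g. 16, 24 and 32 kbit/s
```

Under these assumptions, three rates cost (0.615 + 3 × 0.385) / 3 = 0.59 of the complexity of three independent coders, a reduction of about 41%.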

In a second, more advanced variant, illustrated in FIG. 5, it is possible to use "intelligent" transcoding techniques (as described in the published document US-2001/027393 cited above) to further reduce complexity and pool certain operations, in particular:
• the allocation of bits (functional block 44),
• and the quantization of the coefficients (functional blocks 45_i), as will be seen hereafter.

In FIG. 5, the functional blocks shared between the encoders (or "pooled") bear the same references as those of a single TDAC encoder as represented in FIG. 4a: blocks 41, 42, 47, 48, 43 and 44. In particular, the bit allocation block 44 is used in several passes, and the number of bits allocated is adjusted for the transquantization performed by each coder (blocks 45_1, ..., 45_(K-2), 45_(K-1)), as will be seen below. Note further that these transquantizations use the results obtained by the quantization functional block 45_0 of a chosen encoder, of index 0 (the lowest-rate encoder in the example described). Finally, the only functional blocks of the encoders that operate without real interaction are the multiplexing blocks 46_0, 46_1, ..., 46_(K-2), 46_(K-1), although they all use the same voicing and tone information, as well as the same coded spectral envelope. In this respect, a partial pooling of the multiplexing can also be carried out.

For the bit allocation and quantization functional blocks, the strategy is to use the results obtained by these two blocks for bit stream 0, at the lowest bit rate D_0, to accelerate the operations of the two corresponding functional blocks for the K-1 other bit streams k (1 ≤ k < K). It is also possible to consider a multi-rate coding scheme that uses one bit allocation functional block per bit stream (without factorization of this block) but pools part of the subsequent quantization operations.

The multiple coding techniques presented below are advantageously based on intelligent transcoding used for the reduction of the coded audio stream bit rate, generally located in a node of the network.

In the following, the bit streams k, 0 ≤ k < K, are ordered by increasing bit rate (D_0 < D_1 < ... < D_(K-1)). Thus, bit stream 0 corresponds to the lowest bit rate.

* Bit allocation

Bit allocation in the TDAC encoder is done in two phases. Firstly, a first calculation of the number of bits to be allocated to each band is preferably carried out according to a formula in which C is a constant, B is the total number of bits available, M is the number of bands, e_q(i) is the decoded and dequantized value of the spectral envelope on band i, and s_b(i) is the masking threshold for this band.

Each of the values obtained is rounded to the nearest non-negative integer. If the total rate allocated is not exactly equal to that available, a second phase performs the readjustment. This step is preferably done by a succession of iterative operations, based on a perceptual criterion, that add bits to or remove bits from the bands.

Thus, if the total number of bits distributed is less than that available, bits are added to the bands where the perceptual improvement is greatest. This perceptual improvement is measured by the variation of the noise-to-mask ratio between the initial and final allocation of the bands. The rate is increased for the band where this variation is greatest. In the opposite case, where the total number of bits distributed is greater than that available, bits are removed from the bands by the dual procedure.

In the multi-rate coding scheme corresponding to the TDAC coder, certain bit allocation operations can be factorized. Thus, the first determination step using the above formula can be performed once, based on the lowest bit rate D_0. The readjustment phase by adding bits can then be carried out continuously. Each time the total number of bits distributed reaches the number corresponding to the bit rate of a bit stream k, k = 1, 2, ..., K-1, the current distribution is taken as the one used for the quantization of the band-normalized coefficient vectors of this bit stream.

* Quantization of the coefficients

As far as the quantization of the coefficients is concerned, the TDAC encoder uses a vector quantization using size-nested dictionaries, each dictionary being a union of type II permutation codes. This type of quantization is applied to each of the vectors of MDCT coefficients on a band. Such a vector is normalized beforehand using the dequantized value of the spectral envelope on this band. We denote:

- C(b_i, d_i) the dictionary corresponding to the number of bits b_i and the dimension d_i,

- N(b_i, d_i) the number of elements in this dictionary,

- CL(b_i, d_i) the set of its leaders, and

- NL(b_i, d_i) the number of leaders.

The quantization result for each band i of the frame is a code word m_i transmitted in the bit stream. It represents the index of the quantized vector in the dictionary and is calculated from the following information:
• the number L_i, in the set CL(b_i, d_i) of the leaders of the dictionary C(b_i, d_i), of the quantized leader vector Ỹ_q(i), nearest neighbor of the current leader Ỹ(i),
• the rank r_i of Y_q(i) in the class of the leader Ỹ_q(i),
• and the combination of signs sign_q(i) to be applied to Y_q(i) (or to Ỹ_q(i)),

where the following notations are used:
• Y(i) is the vector of the absolute values of the normalized coefficients of band i,
• sign(i) is the vector of the signs of the normalized coefficients of band i,
• Ỹ(i) is the leader vector of the aforementioned vector Y(i), obtained by sorting its components in descending order (the corresponding permutation is denoted perm(i)),
• and Y_q(i) is the quantized vector of Y(i) (or "the nearest neighbor" of Y(i) in the dictionary C(b_i, d_i)).

In the following, the notation α^(k), with an exponent k, indicates the parameter used in the processing performed to obtain the bit stream of encoder k. Parameters without this exponent are calculated once and for all for bit stream 0; they are independent of the bit rate (or mode) considered.

The "nesting" property of the aforementioned dictionaries is expressed by the relation:

C(b_i^(k-1), d_i) ⊆ C(b_i^(k), d_i),

with also:

CL(b_i^(k-1), d_i) ⊆ CL(b_i^(k), d_i).

The complement of CL(b_i^(k-1), d_i) in CL(b_i^(k), d_i) is denoted CL̄(b_i^(k-1), d_i). Its cardinality is equal to NL(b_i^(k), d_i) − NL(b_i^(k-1), d_i).

The code words m_i^(k) (with 0 ≤ k < K), results of the quantization of the vector of the coefficients of band i for each of the bit streams k, are obtained as follows.

• For the bit stream k = 0, the quantization operation is performed conventionally, as usual in the TDAC coder. It yields the parameters sign_q^(0)(i), L_i^(0) and r_i^(0), from which the code word m_i^(0) is constructed. The vectors Ỹ(i) and sign(i) are also determined in this step. They are stored in memory, together with the permutation perm(i), to be used, if necessary, in the following steps relating to the other bit streams.

• For the bit streams 1 ≤ k < K, we proceed incrementally, from k = 1 to k = K-1, preferably using the following steps:

If b_i^(k) = b_i^(k-1), then:
1. the code word, on band i, of the frame of bit stream k is the same as that of the frame of bit stream (k-1): m_i^(k) = m_i^(k-1).

Otherwise, i.e. if b_i^(k) > b_i^(k-1):
2. The nearest neighbor of Ỹ(i) is searched for among the NL(b_i^(k), d_i) − NL(b_i^(k-1), d_i) leaders of CL̄(b_i^(k-1), d_i).
3. With the result of step 2, and knowing the nearest neighbor of Ỹ(i) in CL(b_i^(k-1), d_i), we test whether the nearest neighbor of Ỹ(i) in CL(b_i^(k), d_i) is in CL(b_i^(k-1), d_i) (case "Flag = 0" below) or in CL̄(b_i^(k-1), d_i) (case "Flag = 1" below).
4. If Flag = 0 (the nearest leader of Ỹ(i) in CL(b_i^(k-1), d_i) is also its nearest neighbor in CL(b_i^(k), d_i)), then m_i^(k) = m_i^(k-1). If Flag = 1 (the nearest leader of Ỹ(i) in CL̄(b_i^(k-1), d_i), found in step 2, is also its nearest neighbor in CL(b_i^(k), d_i)), let L_i^(k) be its number (with L_i^(k) ≥ NL(b_i^(k-1), d_i)); the following steps are then performed:
a. search for the rank r_i^(k) of Y_q^(k)(i) (the new quantized vector of Y(i)) in the class of the leader Ỹ_q^(k)(i), for example by the Schalkwijk algorithm using perm(i),
b. determination of sign_q^(k)(i) using sign(i) and perm(i),
c. determination of the code word m_i^(k) from L_i^(k), r_i^(k) and sign_q^(k)(i).

Second Example: Application to an MPEG-1 Layer I & II Transform Encoder

The MPEG-1 Layer I & II encoder, shown in FIG. 6a, uses a filter bank with 32 uniform subbands (block 61 of FIG. 6a) to perform the time/frequency transformation of the input audio signal s_0. The output samples of each subband are grouped and then normalized by a common scale factor (determined by functional block 67) before being quantized (block 62). The number of levels of the uniform scalar quantizer used for each subband results from a dynamic bit allocation procedure (performed by block 63). This procedure uses a psychoacoustic model (block 64) to determine the bit distribution that makes the quantization noise as imperceptible as possible. The hearing models proposed in the standard are based on an estimate of the spectrum obtained by a fast Fourier transform (FFT) of the input time signal (performed by block 65). Referring to FIG. 6b, the frame s_c, multiplexed by block 66 of FIG. 6a and finally transmitted, contains, after a header field H_D, the set of quantized subband samples E_SB, which represent the main information, and side information used for the decoding operation, consisting of the scale factors F_E and the bit allocation A_i.

From this coding scheme, a multi-rate encoder can be constructed, in one application of the invention, by pooling the following functional blocks, with reference to FIG. 7:
• filter bank analysis 61,
• determination of the scale factors 67,
• calculation of the Fourier transform (FFT) 65,
• determination of the masking thresholds according to a psychoacoustic model 64.

The two blocks 64 and 65 already provide the signal to mask ratios (SMR arrows of FIGS. 6a and 7) used for the bit allocation procedure (block 70 of FIG. 7).

In this exemplary embodiment, as shown in FIG. 7, the bit allocation procedure can also be shared, with some modifications (bit allocation block 70 of FIG. 7). Only the quantization functional blocks 62_0 to 62_(K-1) are therefore specific to each bit stream corresponding to a rate D_k, 0 ≤ k ≤ K-1. The same is true of the multiplexing blocks 66_0 to 66_(K-1).

* Bit allocation

In the MPEG-1 Layer I & II encoder, the allocation is preferably done by a succession of iterative steps, as follows.

Step 0: Initialization to zero of the number of bits b_i of each of the subbands i, 0 ≤ i < M.

Step 1: Update of the distortion function NMR(i) (the "Noise to Mask Ratio") on each of the subbands: NMR(i) = SMR(i) − SNR(b_i), where SNR(b_i) is the signal-to-noise ratio corresponding to the quantizer having a number of bits b_i, and SMR(i) is the signal-to-mask ratio provided by the psychoacoustic model.

Step 2: Incrementing of the number of bits b_(i0) of the subband i_0 where this distortion is maximum: b_(i0) = b_(i0) + ε, with i_0 = argmax_i [NMR(i)], where ε is a positive integer value depending on the band, generally taken equal to 1.

Steps 1 and 2 are repeated iteratively until the total number of available bits, corresponding to the operating rate, is distributed. The result is then a bit distribution vector (b_0, b_1, ..., b_(M-1)).

In the multi-rate coding scheme, these steps are shared, with a few modifications, including:
• the functional block outputs K bit distribution vectors, the vector for bit stream k being obtained when the total number of available bits corresponding to the bit rate D_k has been distributed during the iteration of steps 1 and 2,
• the iteration of steps 1 and 2 stops when the total number of available bits corresponding to the highest bit rate D_(K-1) has been fully distributed (it is recalled that the bit streams are ordered by increasing bit rate).

Note that the bit distribution vectors are obtained successively from k = 0 to k = K-1. The K outputs of this bit allocation block then feed the quantization blocks for each of the bit streams at the given bit rate.
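
The shared allocation loop can be sketched as follows in Python. This is only an illustrative sketch under simplifying assumptions: the rule SNR(b) = 6.02·b dB stands in for the quantizer tables of the standard, each increment is assumed to consume ε bits of the budget (the real cost per subband and the side information are ignored), and the function and parameter names are hypothetical.

```python
def multirate_allocation(smr, budgets, snr_per_bit=6.02, eps=1):
    """Greedy NMR-driven allocation shared by K bit streams.

    smr     : SMR(i) in dB for each subband (from the psychoacoustic model)
    budgets : total bit budgets of the K streams, in increasing order
    Returns one bit distribution vector per budget.
    """
    bits = [0] * len(smr)              # step 0: zero initialization
    total = 0
    snapshots = []
    for budget in budgets:             # streams visited by increasing rate
        while total + eps <= budget:
            # step 1: NMR(i) = SMR(i) - SNR(b_i), with an assumed SNR model
            nmr = [s - snr_per_bit * b for s, b in zip(smr, bits)]
            # step 2: add eps bits where the distortion is maximum
            i0 = max(range(len(nmr)), key=nmr.__getitem__)
            bits[i0] += eps
            total += eps
        snapshots.append(list(bits))   # vector used for this bit stream
    return snapshots


if __name__ == "__main__":
    print(multirate_allocation([20.0, 10.0, 0.0], [2, 4]))
```

Because the loop is never restarted, the vector recorded for each budget is the one a separate run at that budget alone would reach, while steps 1 and 2 are executed only as many times as the highest rate requires.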

Third Embodiment: Application to a CELP Encoder

The last exemplary embodiment relates to multi-mode speech coding with a posteriori decision, based on the 3GPP NB-AMR coder (for "Narrow-Band Adaptive Multi-Rate"), which is a narrow-band adaptive multi-rate speech coder conforming to a 3GPP standard. This encoder, which belongs to the well-known family of CELP coders whose principle was briefly described above, has eight modes (or bit rates) ranging from 12.2 kbit/s to 4.75 kbit/s, all based on the ACELP technique (for "Algebraic Code Excited Linear Prediction"). FIG. 8 shows the coding scheme of this encoder in terms of functional blocks. This structure was used to build a post-decision multi-mode encoder based on 4 modes of the NB-AMR encoder (7.4, 6.7, 5.9, 5.15).

In a first variant, only the sharing of the identical functional blocks is exploited (the results of the four codings are then identical to those of the four parallel codings). In a second variant, the complexity is even smaller. The non-identical functional block calculations for some modes are accelerated by exploiting those of another mode or a common processing module, as will be seen below. The results of the four encodings thus shared are then different from those of the four codings in parallel.

In yet another variant, the functional blocks of these four modes are used for trellis multi-mode coding, as seen above with reference to Figure 1d.

The four modes (7.4, 6.7, 5.9, 5.15) of the 3GPP NB-AMR encoder are briefly described below.

The 3GPP NB-AMR coder operates on a speech signal band-limited to 3.4 kHz, sampled at 8 kHz and divided into 20 ms frames (160 samples). Each frame has 4 subframes of 5 ms (40 samples) grouped 2 by 2 into "super subframes" of 10 ms (80 samples). For all modes, the same types of parameters are extracted from the signal, but with variants in the modeling and/or quantization of these parameters. In the NB-AMR encoder, five types of parameters are analyzed and coded. The LSP parameters (for "Line Spectral Pairs") are processed once per frame for all modes except the 12.2 mode (for which they are processed once per super subframe). The other parameters (in particular the LTP delay, the gain of the adaptive excitation, the fixed excitation and the gain of the fixed excitation) are processed once per subframe.

The four modes considered here (7.4, 6.7, 5.9, 5.15) are distinguished essentially by the quantization of their parameters. The bit allocation of these 4 modes is summarized in Table 1 below.

Table 1: Bit allocation of the 4 modes (7.4, 6.7, 5.9, 5.15) of the 3GPP NB-AMR encoder

These 4 modes of the NB-AMR encoder (7.4, 6.7, 5.9, 5.15) have identical modules such as the preprocessing, the analysis of the linear prediction coefficients and the calculation of the weighted signal. The signal preprocessing consists of high-pass filtering with an 80 Hz cut-off frequency to suppress DC components, combined with a division of the input signal to avoid overflows. The LPC analysis includes the following sub-modules: windowing, autocorrelation calculation, the Levinson-Durbin algorithm, the A(z) → LSP transformation, calculation of the unquantized LSP_i parameters for each subframe (i = 0, ..., 3) by interpolation between the LSPs of the past frame and those of the current frame, and the inverse transformation (LSP_i → A_i(z)).

The calculation of the weighted speech signal consists of filtering by the perceptual weighting filter W_i(z) = A_i(z/γ_1) / A_i(z/γ_2), where A_i(z) is the unquantized filter of the subframe of index i, with γ_1 = 0.94 and γ_2 = 0.6.

Other functional blocks are identical for only three of these modes (7.4, 6.7, 5.9). For example, the open-loop LTP delay search is performed on the weighted signal once per super subframe for these three modes. For the 5.15 mode, however, it is performed only once per frame.

Likewise, although the four modes all use a weighted vector quantization of the LSP parameters in the normalized frequency domain, with first-order MA (for "Moving Average") prediction and mean removal, the LSP parameters of the 5.15 kbit/s mode are quantized on 23 bits and those of the other three modes on 26 bits. After transformation into the normalized frequency domain, the Cartesian product vector quantization (so-called "split VQ") divides the LSP parameters into 3 sub-vectors, of dimension 3, 3 and 4. The first sub-vector, composed of the first 3 LSPs, is quantized on 8 bits by the same dictionary for the four modes. The second sub-vector, composed of the following 3 LSPs, is quantized for the 3 higher-rate modes by a dictionary of size 512 (9 bits) and for the 5.15 mode by half of this dictionary (one vector out of 2). The third and last sub-vector, composed of the last 4 LSPs, is quantized for the 3 higher-rate modes by a dictionary of size 512 (9 bits) and for the lowest-rate mode by a dictionary of size 128 (7 bits). The transformation into the normalized frequency domain, the calculation of the weights of the squared error criterion and the MA prediction of the LSP residue to be quantized are identical for the 4 modes. Since the three higher-rate modes use the same dictionaries to quantize the LSPs, they can share, in addition to the same vector quantization module, the inverse transformation (to return from the normalized frequency domain to the cosine domain), as well as the calculation of the quantized LSP^Q_i for each subframe (i = 0, ..., 3) by interpolation between the quantized LSPs of the past frame and those of the current frame, and finally the inverse transformation LSP^Q_i → A^Q_i(z).

The closed-loop searches of the adaptive and fixed excitations are done sequentially and first require the calculation of the impulse response of the weighted synthesis filter and then of the target signals. The impulse response of the weighted synthesis filter (A_i(z/γ_1) / [A^Q_i(z) A_i(z/γ_2)]) is identical for the 3 higher-rate modes (7.4, 6.7, 5.9). For each subframe, the calculation of the target signal for the adaptive excitation depends on the weighted signal (independent of the mode), the quantized filter A^Q_i(z) (identical for 3 modes) and the past of the subframe (different for each subframe other than the first subframe). For each subframe, the target signal for the fixed excitation is obtained by subtracting from the previous target signal the contribution of the filtered adaptive excitation of this subframe (which differs from one mode to another, except for the first subframe of the first 3 modes).

Three adaptive dictionaries are used. The first dictionary, used for the even subframes (i = 0 and 2) of the modes (7.4, 6.7, 5.9) and for the first subframe of the 5.15 mode, comprises 256 fractional absolute delays, with 1/3 resolution in the range [19⅓, 84⅔] and integer resolution in the range [85, 143]. The search in this dictionary of absolute delays is focused around the delay found in open loop (range of ±5 for the 5.15 mode, ±3 for the other modes). For the first subframe of the modes (7.4, 6.7, 5.9), the target signal and the open-loop delay being identical, the result of this closed-loop search is also identical. The other two dictionaries are of the differential type and make it possible to code the difference between the current delay and the integer delay T_M closest to the fractional delay of the preceding subframe. The first differential dictionary, on 5 bits, used for the odd subframes of the 7.4 mode, has 1/3 resolution around the integer delay T_M in the interval [T_M − 5⅔, T_M + 4⅔]. The second differential dictionary, on 4 bits, included in the first one, is used for the odd subframes of the 6.7 and 5.9 modes as well as for the last three subframes of the 5.15 mode. This second dictionary has integer resolution around the integer delay T_M in the interval [T_M − 5, T_M + 4], plus 1/3 resolution in the interval [T_M − 1⅔, T_M + ⅔].
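
As a consistency check, the sizes of these three dictionaries can be recomputed with exact fractions, as in the Python sketch below. It assumes that the interval bounds are to be read as mixed numbers (for example 5 and 2/3 below T_M for the first differential dictionary), which is the reading under which the counts agree with the stated 256 delays and with the 5-bit and 4-bit indices. The helper name is hypothetical, and the differential delays are expressed relative to T_M.

```python
from fractions import Fraction


def grid(lo, hi, step):
    """Set of values lo, lo + step, ..., hi (inclusive), as exact fractions."""
    count = (Fraction(hi) - Fraction(lo)) / Fraction(step)
    assert count.denominator == 1, "hi must lie on the grid starting at lo"
    return {Fraction(lo) + k * Fraction(step) for k in range(int(count) + 1)}


THIRD = Fraction(1, 3)

# Absolute delays: 1/3 resolution on [19 1/3, 84 2/3], integers on [85, 143]
absolute = grid(Fraction(58, 3), Fraction(254, 3), THIRD) | grid(85, 143, 1)

# Differential delays relative to T_M (assumed mixed-number bounds)
diff_5bit = grid(Fraction(-17, 3), Fraction(14, 3), THIRD)
diff_4bit = grid(-5, 4, 1) | grid(Fraction(-5, 3), Fraction(2, 3), THIRD)

if __name__ == "__main__":
    print(len(absolute), len(diff_5bit), len(diff_4bit))
```

Under this reading the three sets contain 256, 32 and 16 delays respectively, and the second differential set is a subset of the first, as the text indicates.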

Fixed dictionaries belong to the well-known family of ACELP dictionaries. The structure of an ACELP dictionary is based on the ISPP (Interleaved Single-Pulse Permutation) concept, which consists of dividing the set of L positions into K interleaved tracks, each of the N pulses being located in certain predefined tracks. The 4 modes (7.4, 6.7, 5.9, 5.15) use the same division of the 40 samples of a subframe into 5 interleaved tracks of length 8, as shown in Table 2a. Table 2b shows, for the 3 modes (7.4, 6.7, 5.9), the dictionary rate, the number of pulses and their distribution in the tracks. The distribution of the 2 pulses of the 9-bit ACELP dictionary of the 5.15 mode is even more constrained.

Table 2a: Interleaved division of the 40 positions of a subframe of the 3GPP NB-AMR encoder

Table 2b: Pulse distribution in the tracks for the 7.4, 6.7 and 5.9 modes of the 3GPP NB-AMR encoder

The gains of the adaptive and fixed excitations are quantized on 7 or 6 bits (with an MA prediction applied to the gain of the fixed excitation) by a joint vector quantization minimizing the CELP criterion.

* Post-decision multi-mode coding exploiting only the sharing of identical functional blocks

From this coding scheme, a post-decision multi-mode coder can be constructed by pooling the following functional blocks.

Referring to FIG. 8, the following are performed in common for the 4 modes:
• the preprocessing (block 81),
• the analysis of the linear prediction coefficients (windowing and calculation of the autocorrelations 82, Levinson-Durbin algorithm 83, A(z) → LSP transformation 84, LSP interpolation and inverse transformation 862),
• the calculation of the weighted input signal 87,
• the transformation of the LSP parameters into the normalized frequency domain, the calculation of the weights of the squared error criterion for the vector quantization of the LSPs, the MA prediction of the LSP residue and the vector quantization of the first 3 LSPs (in block 85).

The cumulative complexity of all these blocks is thus divided by 4.

For the 3 higher-rate modes (7.4, 6.7, 5.9), the following are performed in common:
• the vector quantization of the last 7 LSPs (once per frame) (in block 85 of FIG. 8),
• the open-loop LTP delay search (twice per frame) (block 88),
• the interpolation of the quantized LSPs (861) and the inverse transformation to the filters A^Q_i (for each subframe),
• the calculation of the impulse response 89 of the weighted synthesis filter (for each subframe).

For these blocks, the calculations are no longer performed 4 times but twice, once for the 3 higher rate modes and once for the low rate mode. Their complexity is therefore divided by 2.

It is also possible, for these 3 higher-rate modes, to pool for the first subframe the calculation of the target signals for the fixed excitation (block 91 in FIG. 8) and for the adaptive excitation (block 90), as well as the closed-loop LTP search (block 881). It should be noted that pooling these operations for the first subframe produces identical results only in the context of multi-mode multiple coding with a posteriori decision. In the general context of multiple coding, the past of the first subframe differs according to the bit rate, as for the other 3 subframes, so these operations generally lead to different results.

* Advanced post-decision multi-mode coding

Non-identical functional blocks can be accelerated by exploiting those of another mode or a common processing module. Depending on the constraints of the application (in terms of quality and/or complexity), different variants can be used. Some examples are described below. It is also possible to rely on intelligent transcoding techniques between CELP coders.

* The vector quantization of the second sub-vector of LSP

As in the embodiment for the TDAC encoder, the nesting of certain dictionaries can be exploited to accelerate the computations. Thus, since the dictionary of the second LSP sub-vector of the 5.15 mode is included in that of the other 3 modes, the quantization of this sub-vector Y for the four modes can advantageously be combined:
• Step 1: Search for the nearest neighbor Y_l in the smaller dictionary (corresponding to half of the large dictionary); Y_l quantizes Y for the 5.15 mode.
• Step 2: Search for the nearest neighbor Y_h in the complement within the large dictionary (i.e. the other half of the dictionary).
• Step 3: Test whether the nearest neighbor of Y in the 9-bit dictionary is Y_l (case "Flag = 0") or Y_h (case "Flag = 1"):
  o case "Flag = 0": Y_l also quantizes Y for the 7.4, 6.7 and 5.9 modes,
  o otherwise (case "Flag = 1"): Y_h quantizes Y for the 7.4, 6.7 and 5.9 modes.

This implementation gives a result identical to that of the non-optimized multi-mode coding. To further reduce the complexity of the quantization, one can stop at step 1 and take Y_l as the quantized vector for the higher-rate modes if this vector is considered sufficiently close to Y. This simplification can therefore give a result different from that of an exhaustive search.
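
The three steps above can be sketched in Python as follows. The sketch is illustrative only: it uses a plain squared error rather than the weighted squared error criterion of the coder, it assumes that the smaller dictionary is made of the even-indexed entries of the large one ("one vector out of 2"), and all function names are hypothetical.

```python
def sq_dist(u, v):
    """Unweighted squared error between two vectors (assumed criterion)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(codebook, indices, y):
    """Index, among `indices`, of the codeword closest to y."""
    return min(indices, key=lambda j: sq_dist(codebook[j], y))


def quantize_nested(codebook, y):
    """Quantize y once for the low-rate and the higher-rate modes.

    Returns (index for the low-rate mode, index for the higher-rate modes,
    flag), both indices referring to the large dictionary.
    """
    small = range(0, len(codebook), 2)   # assumed layout of the small half
    other = range(1, len(codebook), 2)   # complement in the large dictionary
    j_l = nearest(codebook, small, y)    # step 1
    j_h = nearest(codebook, other, y)    # step 2
    # step 3: which of the two candidates is the overall nearest neighbor?
    flag = sq_dist(codebook[j_h], y) < sq_dist(codebook[j_l], y)
    return j_l, (j_h if flag else j_l), flag


if __name__ == "__main__":
    cb = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
    print(quantize_nested(cb, (0.9, 1.2)))
```

Each codeword of the large dictionary is visited exactly once, so the cost is that of a single search in the 9-bit dictionary, while the result for the higher-rate modes is the same as that of an exhaustive search.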

* Acceleration of the open-loop LTP search

The open-loop LTP delay search of the 5.15 mode can exploit the results of the search performed for the other modes. If the two open-loop delays found on the 2 super subframes are close enough to allow differential coding, the open-loop search of the 5.15 mode is not performed; the results of the higher modes are used instead. Otherwise, one can:

- perform the conventional search,
- or focus the open-loop search over the entire frame around the two open-loop delays found by the higher modes.

Conversely, it is also possible to first carry out the open-loop delay search for the 5.15 mode and to focus the two open-loop delay searches of the higher modes around the value determined for the 5.15 mode.
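
A minimal Python sketch of the first strategy (reusing the two delays of the higher modes for the 5.15 mode) is given below. Several elements are assumptions that do not come from the description: the "close enough" test uses the span [−5, +4] of the 4-bit differential dictionary, the first delay is the one kept when the search is skipped, the focused search uses a radius of ±3 and a lag range of 20 to 143, and the search criterion is a simple normalized correlation. All names are hypothetical.

```python
def open_loop_delay(x, start, length, lags):
    """Lag in `lags` maximizing the normalized correlation between
    x[start:start+length] and the same window delayed by that lag."""
    def score(lag):
        window = range(start, start + length)
        num = sum(x[n] * x[n - lag] for n in window)
        den = sum(x[n - lag] ** 2 for n in window)
        return num / den ** 0.5 if den > 0 else 0.0
    return max(lags, key=score)


def frame_delay(x, start, t1, t2, frame_len=160, radius=3,
                min_lag=20, max_lag=143):
    """Frame-level delay derived from two super subframe delays t1 and t2."""
    # Assumed closeness test: t2 is reachable from t1 by differential coding
    if -5 <= t2 - t1 <= 4:
        return t1                      # search skipped, result reused
    # Otherwise, search the whole frame only around the two known delays
    candidates = sorted({lag for t in (t1, t2)
                         for lag in range(t - radius, t + radius + 1)
                         if min_lag <= lag <= max_lag})
    return open_loop_delay(x, start, frame_len, candidates)


if __name__ == "__main__":
    pattern = [((7 * i * i + 3 * i) % 11) - 5 for i in range(50)]
    x = [pattern[i % 50] for i in range(143 + 160)]   # period of 50 samples
    print(frame_delay(x, 143, 50, 51), frame_delay(x, 143, 52, 98))
```

The focused search evaluates at most 14 lags instead of the full range, which is where the saving comes from. The signal must contain at least `max_lag` past samples before `start`.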

In a third, more advanced variant, illustrated in FIG. 1d, it is proposed to produce a trellis multi-mode encoder allowing several combinations of functional blocks, each functional block having at least two modes of operation (or bit rates). This new encoder was constructed from the four NB-AMR encoder bit rates mentioned above (5.15, 5.90, 6.70, 7.40). In this encoder, there are four functional blocks: the LPC block, the LTP block, the fixed excitation block and the gain block. Referring to Table 1 presented above, Table 3a below summarizes, for each of these functional blocks, its number of bit rates and their values.

Table 3a: Number of bit rates and bit rates of the functional blocks for the four modes (5.15; 5.90; 6.70; 7.40) of the NB-AMR encoder.

We thus have P = 4 functional blocks and 2 x 3 x 4 x 2 = 48 possible combinations. In the particular embodiment, it is chosen not to consider the high bit rate of the functional block 2 (LTP bit rate 26 bits / frame). Another choice is possible, of course.

The multi-rate encoder thus obtained has a fine granularity in bit rates, with 32 possible modes given in Table 3b. However, the encoder thus obtained is not interoperable with the aforementioned NB-AMR encoder. In Table 3b, the modes corresponding to the three bit rates of the NB-AMR (5.15, 5.90, 6.70) are shown in bold, the exclusion of the highest bit rate of the LTP functional block eliminating the 7.40 bit rate.

Table 3b: Bit rate per functional block and overall bit rate of the trellis multi-mode encoder
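
The number of trellis paths follows directly from the number of bit rates available for each functional block. The Python sketch below enumerates them, using only the counts given in the text (2, 3, 4 and 2 options, then 2 options for the LTP block once its highest bit rate is left out); the function name is hypothetical and the paths are identified by option indices rather than by actual bit rates.

```python
from itertools import product
from math import ceil, log2


def enumerate_paths(options_per_block):
    """Every trellis path, as a tuple holding one option index per block."""
    return list(product(*(range(n) for n in options_per_block)))


# Block order: LPC, LTP, fixed excitation, gains
full = enumerate_paths([2, 3, 4, 2])
reduced = enumerate_paths([2, 2, 4, 2])     # highest LTP bit rate excluded
mode_bits = ceil(log2(len(reduced)))        # bits needed to signal the mode

if __name__ == "__main__":
    print(len(full), len(reduced), mode_bits)
```

This reproduces the 48 possible combinations, the 32 modes that are retained, and the 5 bits required to identify the mode in the bit stream.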

Since this encoder has 32 possible bit rates, 5 bits are needed to identify the mode used. As in the previous variant, the pooling of functional blocks is exploited. Different coding strategies are applied for the different functional blocks. For example, for functional block 1, comprising the quantization of the LSPs, the low bit rate is favored, as mentioned above, in the following manner:
- the first sub-vector, composed of the first 3 LSPs, is quantized on 8 bits by the same dictionary for the two bit rates associated with this functional block,
- the second sub-vector, composed of the following 3 LSPs, is quantized on 8 bits by the dictionary of the lower bit rate. Since this dictionary corresponds to half of the dictionary of the higher bit rate, the other half of the dictionary is searched only if the distance between the 3 LSPs and the element chosen in the dictionary exceeds a certain threshold,
- the third and last sub-vector, composed of the last 4 LSPs, is quantized by a dictionary of size 512 (9 bits) and by a dictionary of size 128 (7 bits).

On the other hand, as mentioned above for the second variant (corresponding to the advanced multi-mode coding with a posteriori decision), the high bit rate is favored for functional block 2 (LTP delay). In the NB-AMR coder, the open-loop LTP delay search is performed twice per frame for the 24-bit LTP delay and once per frame for the 20-bit one. For this functional block, the aim is to favor the higher bit rate. Therefore, the open-loop LTP delay is computed as follows:

- Two open-loop delays are calculated on the 2 super subframes. If they are close enough to allow differential coding, the open-loop search over the entire frame is not performed. Instead, the results of the two super subframes are used.
- Otherwise, an open-loop search is performed over the entire frame, focused around the two open-loop delays found previously. A complexity-reducing variant retains only the open-loop delay of the first super subframe.

After some functional blocks, a partial selection can be made to reduce the number of combinations to explore. For example, after functional block 1 (LPC), the combinations using 26 bits for this block can be eliminated if the performance of the 23-bit mode is close enough or, conversely, the 23-bit mode can be eliminated if its performance is too degraded compared with that of the 26-bit mode.

Thus, the present invention provides an effective solution to the problem of the complexity of multiple codings, by pooling and accelerating the calculations implemented by the various coders. The coding structures can therefore be represented using functional blocks describing the various operations performed during processing. The functional blocks of the different codings implemented in multiple coding have strong relationships, which are exploited within the meaning of the present invention. These relationships are particularly strong when the different codings correspond to different modes of the same structure. Finally, it is pointed out that the present invention is flexible from the point of view of complexity. It is indeed possible to decide a priori the maximum complexity of the multiple coding and to adapt the number of coders explored as a function of this complexity.

Claims


1. A method of multiple compression coding, wherein an input signal is intended to feed in parallel a plurality of coders each comprising a succession of functional blocks, for compression coding of said signal by each coder, characterized in that it comprises the following preparatory steps: a) identifying the functional blocks forming each coder, and one or more functions performed by each block, b) identifying, among said functions, functions which are common from one coder to another, and c) performing said common functions, once and for all, for at least a part of all the coders, within at least one same calculation module.

2. Method according to claim 1, characterized in that said computation module is constituted by one or more blocks of one of the coders.

3. Method according to claim 2, characterized in that, for each function executed in step c), at least one functional block of an encoder chosen from said plurality of encoders is used, and in that the block of said encoder selected is arranged to deliver partial results to other coders, for efficient coding, from said other coders, verifying an optimal criterion between the complexity and the quality of the coding.

4. The method as claimed in claim 3, in which the coders are capable of operating at different respective bit rates, characterized in that the chosen coder is the lowest bit rate encoder, and in that the results obtained, following the execution of the function in step c) with parameters specific to the chosen encoder, are adapted to the bit rates of at least part of the other coders by a focused search of parameters for at least part of all the other modes, up to the highest bit rate encoder.

5. Method according to claim 3, in which the encoders are able to operate at different respective bit rates, characterized in that the chosen encoder is the highest bit rate encoder, and in that the results obtained, following the execution of the function in step c) with parameters specific to the chosen coder, are adapted to the bit rates of at least part of the other coders by a focused search of parameters for at least part of all the other modes, down to the lowest bit rate encoder.

6. Method according to claim 4, taken in combination with claim 5, characterized in that, for a given bit rate, the functional block of an encoder operating at said given bit rate is used as the calculation module, and at least a portion of the parameters specific to this encoder are progressively adapted: - up to the highest rate encoder, by focused search, and - down to the lowest rate encoder, by focused search.

7. Method according to claim 1, in which the functional blocks of the different coders are arranged in a trellis, with several possible paths in the trellis, characterized in that each path of the trellis is defined by a combination of operating modes of the functional blocks, each functional block supplying several possible variants of the next functional block.

8. Method according to claim 7, characterized in that a partial selection module is provided, after each coding step carried out by one or more functional blocks, capable of selecting the results provided by one or more of these functional blocks, for subsequent coding steps.

9. The method according to claim 7, wherein the functional blocks are capable of operating at different respective bit rates and using respective parameters specific to said bit rates, characterized in that, for a given functional block, the selected trellis path is the one passing through the lowest bit rate functional block, and in that the results obtained from said lowest bit rate functional block are adapted to the bit rates of at least part of the other functional blocks by a focused search of parameters for at least part of all the other functional blocks, up to the highest bit rate functional block.

10. The method according to claim 7, wherein the functional blocks are capable of operating at different respective bit rates and using respective parameters specific to said bit rates, characterized in that, for a given functional block, the selected trellis path is the one passing through the highest bit rate functional block, and in that the results obtained from said highest bit rate functional block are adapted to the bit rates of at least part of the other functional blocks by a focused search of parameters for at least part of all the other functional blocks, down to the lowest bit rate functional block.

11. The method according to claim 9, taken in combination with claim 10, characterized in that, for a given bit rate associated with the parameters of a functional block of an encoder, the functional block operating at said given bit rate is used as the calculation module, and at least a portion of the parameters specific to this functional block are progressively adapted:

- to the functional block capable of operating at the lowest rate, by focussed search, and

- to the functional block capable of operating at the highest rate, by focussed search.

12. Method according to claim 1, characterized in that said calculation module is a module independent of said coders, and arranged to redistribute the results obtained in step c) to all the coders.

13. The method of claim 12, taken in combination with claim 2, characterized in that the independent module and the block or blocks of at least one of the coders are arranged to mutually exchange the results obtained in step c), and in that the calculation module is arranged to perform an adaptation transcoding between functional blocks of different coders.

14. Method according to one of claims 12 and 13, characterized in that the independent module comprises an at least partial coding block and an adaptation transcoding block.

15. Method according to one of the preceding claims, in which the parallel coders are arranged to operate in multi-mode coding, characterized in that a posterior selection module is provided, capable of selecting an encoder from among the coders.

16. The method of claim 15, characterized in that there is provided a partial selection module, after each coding step conducted by one or more functional blocks, independent of the encoders and capable of selecting one or more encoders.

17. Method according to one of the preceding claims, in which the encoders are of the transform type, characterized in that the calculation module comprises a bit allocation block, shared between all the coders, each bit allocation performed for one encoder being followed by an adaptation to this encoder, in particular according to its bit rate.

18. The method of claim 17, characterized in that the method further comprises a quantization step, the results of which are provided to all coders.

19. The method of claim 18, characterized in that it further comprises steps common to all the coders among:

- a time-frequency transform (MDCT),

- a voicing detection in the input signal,

- a tone detection,

- the determination of a masking curve,

- and a spectral envelope coding.

20. The method according to claim 17, wherein the coders perform a sub-band coding (MPEG-1), characterized in that the method further comprises steps common to all the coders among:

- the application of an analysis filter bank,

- a determination of scale factors,

- a spectral transform calculation (FFT),

- and the determination of masking thresholds according to a psychoacoustic model.

21. Method according to one of claims 1 to 16, wherein the coders are of the analysis-by-synthesis type (CELP), characterized in that the method comprises steps common to all the coders among at least:

- a pre-processing,

- an analysis of linear prediction coefficients,

- a weighted input signal calculation,

- and a quantization for at least a portion of the parameters.

22. The method of claim 21, taken in combination with claim 16, characterized in that the partial selection module is implemented after a shared vector quantization step for short-term parameters (LPC).

23. The method of claim 21, taken in combination with claim 16, characterized in that the partial selection module is implemented after a shared open-loop long-term parameter (LTP) search step.

24. Computer program product intended to be stored in a memory of a processing unit, in particular a computer or a mobile terminal, or on a removable memory medium intended to cooperate with a reader of the processing unit, characterized in that it comprises instructions for implementing the transcoding method according to one of the preceding claims.

25. A multiple compression coding device, in which an input signal is intended to supply in parallel a plurality of coders each comprising a succession of functional blocks, for compression coding of said signal by each encoder, characterized in that it comprises a memory adapted to store instructions of a computer program product according to claim 24.

26. Device according to claim 25, characterized in that it further comprises an independent calculation module (M1) for implementing the method according to one of claims 12 to 16 and 22, 23.