EP1875465A1 - Method of adaptation for interoperability between short-term correlation models of digital signals

Method of adaptation for interoperability between short-term correlation models of digital signals

Info

Publication number
EP1875465A1
EP1875465A1 EP06743681A EP06743681A EP1875465A1 EP 1875465 A1 EP1875465 A1 EP 1875465A1 EP 06743681 A EP06743681 A EP 06743681A EP 06743681 A EP06743681 A EP 06743681A EP 1875465 A1 EP1875465 A1 EP 1875465A1
Authority
EP
European Patent Office
Prior art keywords
format
interpolation
block
lpc
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP06743681A
Other languages
English (en)
French (fr)
Inventor
Mohamed Ghenania
Claude Lamblin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orange SA
Original Assignee
France Telecom SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16: Vocoder architecture
    • G10L19/173: Transcoding, i.e. converting between two coded representations avoiding cascaded coding-decoding

Definitions

  • the invention relates to the encoding / decoding of digital signals, particularly in applications for transmitting or storing multimedia signals such as audio signals (speech and / or sounds).
  • encoders use signal properties such as its harmonic structure, exploited by long-term prediction filters, as well as its local stationarity, exploited by short-term prediction filters.
  • the speech signal can be considered as a stationary signal for example over time intervals of 10 to 20 ms. This signal can therefore be analyzed by sample blocks called frames after appropriate windowing.
  • the short-term correlations can be modeled by time-varying linear filters whose coefficients are obtained by means of a linear prediction analysis on frames of short duration (from 10 to 20 ms in the aforementioned example).
  • Linear prediction coding is one of the most widely used digital coding techniques. It involves performing an LPC analysis of the signal to be coded to determine an LPC filter, then quantizing this filter, on the one hand, and modeling and coding the excitation signal, on the other hand. This LPC analysis is performed by minimizing the prediction error on the signal to be modeled or on a modified version of this signal.
  • the autoregressive P-order linear prediction model consists in determining a signal sample at an instant n by a linear combination of the past P samples (prediction principle).
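The prediction principle stated above can be illustrated by a minimal Python sketch. The function names `predict_sample` and `prediction_error` are hypothetical and not part of the source; the sketch only assumes that a sample is estimated as a weighted sum of the P previous samples.

```python
def predict_sample(past, coeffs):
    """Predict x[n] from the P most recent samples.

    past   -- past samples, most recent last (length >= P)
    coeffs -- P prediction coefficients; coeffs[0] weights x[n-1]
    """
    order = len(coeffs)
    if len(past) < order:
        raise ValueError("need at least P past samples")
    # x_hat[n] = sum over i = 1..P of coeffs[i-1] * x[n-i]
    return sum(c * past[-i] for i, c in enumerate(coeffs, start=1))


def prediction_error(signal, coeffs):
    """Residual e[n] = x[n] - x_hat[n], for n >= P."""
    order = len(coeffs)
    return [signal[n] - predict_sample(signal[:n], coeffs)
            for n in range(order, len(signal))]
```

An LPC analysis chooses `coeffs` so that the energy of the residual returned by `prediction_error` is minimal over the frame.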
  • the short-term prediction filter, denoted A(z), is, for a model of order P, a polynomial in $z^{-1}$ whose coefficients are the LPC coefficients $a_i$.
  • LSP coefficients are now the most used for the representation of the LPC filter because they are well suited for vector quantization.
  • Linear prediction exploits the local quasi-stationarity of the signal. However, this assumption of local stationarity is not always verified. In particular, if the LPC coefficients are not updated often enough, the quality of the LPC analysis deteriorates. Increasing the calculation frequency of the LPC parameters obviously improves the quality of the LPC analysis by making it possible to better follow the spectral variations of the signal. However, this leads to an increase in the number of filters to be transmitted and therefore to an increase in the bit rate.
  • the G.729 encoder uses interpolation of the transformed LPC parameters to obtain LPC parameters every 5 ms.
  • the complexity of the LPC analysis is critical when several codings must be performed by a single processing unit such as a gateway responsible for managing many parallel communications or a server distributing a large number of multimedia contents.
  • the problem of complexity is further increased by the multiplicity of signal compression formats that circulate on the networks. It will therefore be understood that a first problem arises with respect to a compromise between bit rate / quality / complexity for the LPC analysis.
  • Transcoding is necessary when, in a transmission chain, a compressed signal frame transmitted by an encoder can no longer continue its path in this format. Transcoding makes it possible to convert this frame into another format compatible with the rest of the transmission chain.
  • the most basic solution (and the most common at the moment) is the end-to-end addition of a decoder and an encoder.
  • the compressed frame arrives in a first format. It is then decompressed.
  • the decompressed signal is then compressed again in a second format accepted by the rest of the communication chain. This cascading of a decoder and a coder is called a tandem.
  • Such a solution is very expensive in complexity (essentially because of the recoding) and it degrades the quality because the second coding is done on a decoded signal which is a degraded version of the original signal.
  • a frame can meet several tandems before arriving at its destination, resulting in a high calculation cost and a loss of quality.
  • the delays associated with each tandem operation accumulate and can hinder the interactivity of communications.
  • the complexity is also problematic in the context of a multi-format compression system where the same content is compressed in several formats. This is typically the case for content servers that broadcast the same content in several formats adapted to the conditions of access, networks and terminals of different customers. This multi-coding operation becomes extremely complex as the number of desired formats increases, so that the resources of the system appear to be rapidly limited.
  • Another case of multiple coding in parallel is post-decision multi-mode compression, which is described as follows. For each signal segment to be encoded, several compression modes are run and the one that optimizes a given criterion or achieves the best compromise between bit rate and distortion is selected.
  • the complexity of each compression mode limits their number and/or leads to an a priori selection of a very limited number of modes.
  • a second problem arises with respect to the multiplicity of possible compression formats.
  • the transcoding of the parameter is done at the binary level by copying its bit field from the bit stream of format A into the bit stream of format B. If the parameter is calculated in the same way but quantized differently, it is usually necessary to requantize it with the method used by coding format B. Similarly, if formats A and B do not calculate this parameter at the same rate (for example if their frame or subframe lengths differ), this parameter must be interpolated. This step can be performed on the parameter alone, without going back to the complete signal. The transcoding is then performed only at the parameter level. The LSP coefficients are usually transcoded at this "parameter" level.
  • a first method consists in calculating the coefficients modeling the LPC filter of the second format for a frame, by interpolating the coefficients of the LPC filters of the first format corresponding substantially to this frame:
  • $p_B(m) = \alpha\, p_A(n-1) + \beta\, p_A(n)$
  • $p_B(m)$ is the vector of coefficients of the second model for its frame m
  • $p_A(n)$ is the vector of coefficients of the first model for its frame n
  • $\alpha$ and $\beta$ are interpolation factors. In general, $\beta$ is equal to $(1-\alpha)$.
  • $p_{EVRC}(3m) = 0.5417\, p_{G.723.1}(2m-1) + 0.4583\, p_{G.723.1}(2m)$
  • $p_{EVRC}(3m+1) = 0.8750\, p_{G.723.1}(2m) + 0.1250\, p_{G.723.1}(2m+1)$
  • the set of interpolation factors is set according to the temporal position of the frame of the second format in its group of frames. Even more complex transcoding methods that involve more than two first-format filters, or second-format filters, use a fixed set of interpolation factors.
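As an illustration of this prior-art static interpolation, the following Python sketch applies a fixed pair of factors chosen only from the position of the frame in its group. The names `interpolate`, `FIXED_FACTORS` and `static_interpolation` are hypothetical; the weights echo the EVRC example above, with 0.4583 assumed to be the complement of 0.5417.

```python
def interpolate(p_prev, p_curr, alpha, beta=None):
    """Return alpha * p_prev + beta * p_curr, element by element.

    p_prev -- coefficient vector of the first format for block n-1
    p_curr -- coefficient vector of the first format for block n
    beta defaults to (1 - alpha), the usual relation.
    """
    if beta is None:
        beta = 1.0 - alpha
    if len(p_prev) != len(p_curr):
        raise ValueError("vectors must have the same length")
    return [alpha * a + beta * b for a, b in zip(p_prev, p_curr)]


# Fixed factors indexed by the position of the frame in its group
# (illustrative values; the second weight of position 0 is an assumption).
FIXED_FACTORS = {0: (0.5417, 0.4583), 1: (0.8750, 0.1250)}


def static_interpolation(p_prev, p_curr, position):
    """Static scheme: the factors depend only on the frame position."""
    alpha, beta = FIXED_FACTORS[position]
    return interpolate(p_prev, p_curr, alpha, beta)
```

Because the table never changes, the same weights are applied whether or not the signal changes abruptly between the two source blocks, which is the limitation the adaptive scheme addresses.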
  • the present invention proposes to use an adaptive (or “dynamic”) interpolation.
  • An object of the invention is the dynamic selection of a set of interpolation factors in a multiple coding context.
  • Another object of the invention is to limit the number of sets of interpolation factors, preferably taking into account a desired quality / complexity compromise and, for a given complexity, to optimize the quality or, conversely, to minimize the complexity for a given quality.
  • the invention firstly proposes a coding method according to a second format, based on information obtained by implementing at least one coding step in a first format.
  • the first and second formats implement, in particular for the coding of a speech signal, short-term prediction models LPC on blocks of samples of digital signal, using filters represented by respective LPC coefficients.
  • the LPC coefficients of the second format are determined from an interpolation on values representative of the LPC coefficients of at least the first format, between at least a first given block and a second block preceding the first block.
  • the aforementioned interpolation is conducted dynamically, by selecting for each current block at least one interpolation factor among a preselection of factors, according to a predetermined criterion.
  • the "preselection of factors" is understood to mean a preconstituted set of interpolation factors which, without limitation, may include sets of factors α and β as defined above (pairs (α, β), or even triplets).
  • the invention proposes to determine a set of several sets of interpolation factors and to use, for each LPC analysis block , a set of interpolation factors selected from this preconstituted set.
  • This selection in the preconstituted set is performed dynamically according to the aforementioned predetermined criterion.
  • This predetermined criterion may advantageously be related to the detection of a break in stationarity of the digital signal between the given block and the previous block.
  • the preselection can be built initially according to a heuristic choice or from a preliminary statistical study, as will be seen in the detailed description below.
  • FIG. 1 schematically represents an exemplary transcoding module for the implementation of the invention
  • FIG. 2 diagrammatically illustrates the interpolation principle used to estimate the representative values of the LPC coefficients of the second format for a succession of blocks m-1, m, m+1 of the second-format coded signal SC2, from an interpolation conducted on the representative values of the first-format LPC coefficients estimated for successive blocks n-2, n-1, n of the first coded signal SC1,
  • FIGS. 3A and 3B schematically illustrate parallel coding and transcoding systems, respectively, involving a transcoding module in the sense of the invention
  • FIG. 4 is a flowchart illustrating the general algorithm of a computer program product in the sense of the invention, for dynamically choosing the interpolation factors among the preselection,
  • FIG. 5 illustrates the steps of construction of the preselection in an advantageous embodiment of the invention
  • FIGS. 6A and 6B illustrate the histograms of the optimum value of the interpolation factor α respectively for the first two frames of the groups of 3 frames of the G.729 standard encoder, as the second coder
  • FIG. 7A illustrates the correspondence between a frame of the G.723.1 standardized coder (30 ms), as the first coder, and 3 frames of the G.729 standardized coder (10 ms), as a second coder
  • FIG. 7B illustrates the correspondence between the sub-frames of the G.729 encoder (5 ms) and the G.723.1 encoder (7.5 ms),
  • FIGS. 8A, 8B and 8C illustrate the distributions of the spectral distortions obtained by static interpolation ("Static" curve in solid line) in the sense of the prior art and by fine dynamic interpolation in the sense of the invention ("Fine" curve in dotted lines), respectively for three successive current frames of the G.729 standard encoder, as the second encoder,
  • FIGS. 9A and 9B illustrate the distributions of the spectral distortions obtained by the fine (“Fine” dashed line) and coarse (“Coarse” curve in solid line) interpolations respectively for two current successive frames of the G.729 coder, and
  • FIG. 10 is a flow diagram of an example of an algorithm for dynamically selecting the interpolation factor α.
  • the transcoding module MOD can for example be arranged between:
  • a first coder COD1 of an input signal S according to a first format, and intended for example to deliver a first coded signal SC1
  • a second coder COD2 of the same input signal S according to a second format, and intended for example to deliver a second coded signal SC2.
  • the first coder COD1 has begun to encode the input signal S, completely or partially, but, in any case, sufficiently to have already determined the LPC coefficients according to the first format.
  • the transcoding module MOD within the meaning of the invention retrieves at least the LPC coefficients obtained by the coding according to the first format, or values representative of these coefficients, for example the vectors (LSP)1, and, from these values, estimates by interpolation the coefficients (LPC)2 (or representative values (LSP)2) which will be used by the second coder COD2 to construct the second coded signal SC2 in the second format.
  • LPC coefficients
  • LPC representative values
  • the transcoding module MOD in the sense of the invention is, in general, adapted for coding a signal S according to a second format, from information (including in particular LPC coefficients obtained from the first coding, or values representative of these coefficients, for example the vectors (LSP)1) obtained by the implementation of at least one coding step of the same input signal S according to the first format (the step that makes it possible to retrieve the information including the representative values of the coefficients (LPC)1).
  • the first and second formats implement, in particular for the coding of a speech signal S, LPC short-term prediction models on blocks of digital signal samples (as will be seen later with reference to FIG. 2), using filters represented by respective LPC coefficients.
  • the module thus comprises: an input 5 (FIG. 1) for receiving information (LPC)1 representative of the LPC coefficients obtained by the first format, and including, for example, the values (LSP)1,
  • a processing unit for determining the LPC coefficients of the second format (referenced (LPC)2, or more particularly the values (LSP)2 of FIG. 1 if the interpolation module 1 processes LSP vector values) from an interpolation (conducted by the module 1 of FIG. 1) on values (LSP)1 representative of the LPC coefficients obtained from the first format between at least a first given block (referenced n in FIG. 2) and a second block preceding the first block.
  • the coded signal in the first format SC1 comprises a succession of sample blocks n, n-1, n-2, and so on. Values (LSP)1[n], (LSP)1[n-1], etc., representative of the LPC coefficients in the first format, have been obtained
  • the signal SC2 coded in the second format also comprises a succession of sample blocks (also called "frames") referenced m-1, m, m+1
  • the processing unit of the transcoding module performs this interpolation dynamically, by selecting for each current block n at least one interpolation factor αj from a preselection (module 3) of factors (α1, α2, ..., αk), according to a predetermined criterion.
  • the predetermined criterion can typically be a criterion of continuity of the signal S over time (or "stationarity" of the signal), or any other criterion of stability of the signal with respect to one or more parameters related to the signal S (gain, energy, long-term LTP parameters, period of the fundamental harmonic (or "pitch")), preferably calculated by COD1.
  • the input 5 of the transcoding module receives such parameters, noted (LPC)1, which inform a module 2 for detecting a break in stationarity in the signal S.
  • the transcoding module MOD comprises a memory 3, for example addressable, which stores a preselection of interpolation factors, noted (α1, α2, ..., αk) in the example shown. This notation means that, in the example described:
  • the module 1 then constructs by interpolation on the vector values (LSP)1 (at the blocks n and n-1), from these two factors αj and βj, the vectors (LSP)2 representative of the LPC coefficients specific to the second format (referenced (LPC)2) to form the second coded signal SC2.
  • the transcoding module MOD is useful both for multiple coding in cascade (so-called “transcoding"), and in parallel (so-called “multi-codings” and “multimode” codings).
  • the situation of the MOD module illustrated in FIG. 1 is a parallel configuration. The same is true for FIG. 3A, where the same input signal S feeds the two coders COD1 and COD2 in parallel, whereas the transcoding module MOD connected to the second coder COD2 receives from the coder COD1 the information (LPC)1 useful for the implementation of the invention, in particular the representative values of the LPC coefficients obtained by the first coding format.
  • the two encoders separately deliver the two coded signals SC1 and SC2.
  • The configuration of FIG. 3B is substantially different in that the input signal S is received by the first coder COD1 only, which delivers to the transcoding module MOD the information (LPC)1 which is useful for the implementation of the invention.
  • a DECOD module is here provided for at least partially decoding the signal SC1 originating from the first coder COD1 and which supplies the second coder COD2.
  • the use of the transcoding module MOD is particularly advantageous here in that it is not necessary to completely decode the signal SC1 from the first coder and that it is not necessary either to apply all the recoding steps in the second format.
  • integer transcoding systems
  • integer multiple coding systems
  • the present invention also aims at such systems, comprising:
  • a coder COD1 according to a first format and a coder COD2 according to a second format implementing short-term prediction models LPC on blocks of digital signal samples, using filters represented by respective LPC coefficients,
  • a transcoding module MOD within the meaning of the invention, of the type described above.
  • the invention also relates to a computer program product intended to be stored in a memory of a transcoding module of the type described above. Referring to FIG. 4, which traces its general algorithm, the computer program, when run on the module, contains instructions for:
  • determining (step 43) representative values (LSP)2 of the second-format LPC coefficients from an interpolation on representative values (LSP)1 of the LPC coefficients obtained from the first format between at least the given block n and the block n-1 preceding the given block n,
  • dynamically performing this interpolation, by choosing (step 42) for each current block at least one interpolation factor αj from among a preselection of factors, according to a predetermined criterion (test 41).
  • this criterion can be associated with the stationarity of the signal, and the test 41 detects a possible break in stationarity of the signal on the basis of the information (LPC)1 communicated to it, for example, by the first coder COD1. If a break in stationarity is actually detected (arrow N at the output of the test 41), the choice of the factor α is changed: the module chooses the best factor αj in the preselection and carries out the interpolation with this factor αj. Otherwise (arrow O at the output of the test 41), the value of the factor α set at the initialization step 40, which occurs before the test 41, is retained.
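The flow of FIG. 4 described above might be sketched as follows in Python. This is only an illustration under stated assumptions: the break flags are taken as given (one boolean per block), the initial factor is reapplied for every block without a break, and `pick_best` is a hypothetical callback standing in for whatever rule selects the best factor in the preselection.

```python
def interpolate(prev_vec, curr_vec, alpha):
    """alpha weights block n-1 and (1 - alpha) weights block n."""
    return [alpha * p + (1.0 - alpha) * c for p, c in zip(prev_vec, curr_vec)]


def transcode_lsp(lsp_first, break_flags, preselection, initial_alpha, pick_best):
    """Estimate second-format vectors from first-format vectors.

    lsp_first     -- list of first-format vectors, one per block
    break_flags   -- break_flags[n] is True when a break is detected at block n
    preselection  -- candidate interpolation factors
    initial_alpha -- factor retained when no break is detected (step 40)
    pick_best     -- hypothetical callback (preselection, n) -> factor
    """
    out = []
    for n in range(1, len(lsp_first)):
        alpha = initial_alpha                      # step 40
        if break_flags[n]:                         # test 41
            alpha = pick_best(preselection, n)     # step 42
        out.append(interpolate(lsp_first[n - 1], lsp_first[n], alpha))  # step 43
    return out
```

For example, `transcode_lsp([[0.0], [1.0], [2.0]], [False, False, True], [0.0, 0.5, 1.0], 0.5, lambda pre, n: pre[0])` keeps the factor 0.5 for the first interpolated block and switches to 0.0 for the block flagged with a break.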
  • LPC information
  • the interpolation in the sense of the invention may involve a first factor β relative to a given first block (n) and a second factor α relative to a second block (n-1) preceding the first block.
  • a third factor γ relating to a block (n-2) preceding the second block.
  • the aforementioned preselection can be initially set to include the value "0", the value "1" and at least a third value between "0" and "1", for example "0.5".
  • the set of interpolation factors as well as the size of this set can be determined heuristically.
  • An elementary example of a heuristic choice is a set of size 3, composed of the values α ∈ {0; 0.5; 1} (taking the aforementioned relation β = 1-α).
  • the preselection of the interpolation factors is initially set following a preliminary statistical study, carried out offline.
  • a first set (50) of K interpolation factors is chosen so as to include the preselection within the meaning of the invention; for this purpose, the number of elements K constituting this first set (50) is chosen sufficiently large; b) for each block n, a best interpolation factor α(n) is determined from the first set 50 according to a chosen criterion, in particular a distance
  • (step 54) between the interpolated values (set calculated in step 52 and noted {[E(LSP)2^j]_i} with j between 1 and M-1 and i between 1 and N) and the values representative of the LPC coefficients obtained by the second format (set 53).
  • a second set 55 of interpolation factors α(n), of reduced size, is then formed, for example by eliminating the elements α(n) that are rarely or never selected and retaining the most recurrent elements of this set.
  • it is also possible to limit the size of this set by grouping the elements closest to each other around an average.
  • the reduction of the size of the set of interpolation factors α(n) can be based on the study of a histogram of the type illustrated in one of FIGS. 6A or 6B. This type of histogram represents:
  • the K factors (α1, α2, ..., αK) arbitrarily chosen initially, for example between 0 and 1 and spaced at a fixed step of 0.01,
  • LSP for "Line Spectral Pairs"
  • N the number of frames
  • the two sets formed correspond to the unquantized LSPs of the two coders.
  • the two sets correspond to the unquantized LSPs of the format B and to the dequantized LSPs of the format A.
  • the interpolated vector $\tilde{p}_B(n) = \alpha(n)\, p_A(n-1) + (1-\alpha(n))\, p_A(n)$, obtained from the vectors of the first format A, is as close as possible to the vector $p_B(n)$ obtained by the second format.
  • distance between two sets of LPC parameters conventionally used in LPC coding, such as the squared error (weighted or not) between two vectors of LSPs or the spectral distortion measure calculated from the coefficients $a_i$.
  • the abscissas of the I 1 peaks of the histogram
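The offline statistical study described above can be sketched in Python as follows. This is a minimal illustration under several assumptions: the distance is the unweighted squared error, the first set holds 101 factors spaced by 0.01 between 0 and 1, and the reduction simply keeps the most frequently selected factors. The names `best_factor` and `build_preselection` are hypothetical.

```python
from collections import Counter


def squared_error(u, v):
    """Unweighted squared error between two parameter vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def best_factor(p_prev, p_curr, target, candidates):
    """Factor whose interpolated vector is closest to the target vector."""
    def distance(alpha):
        interp = [alpha * a + (1.0 - alpha) * b for a, b in zip(p_prev, p_curr)]
        return squared_error(interp, target)
    return min(candidates, key=distance)


def build_preselection(training, steps=100, size=3):
    """Reduce a fine grid of factors to the `size` most selected ones.

    training -- iterable of (p_prev, p_curr, target) triples, where target
                is the vector actually obtained by the second format
    """
    candidates = [i / steps for i in range(steps + 1)]   # 0, 0.01, ..., 1
    histogram = Counter(best_factor(p, c, t, candidates) for p, c, t in training)
    return sorted(alpha for alpha, _ in histogram.most_common(size))
```

Grouping neighbouring factors around an average, as also mentioned above, would be an alternative reduction rule to the frequency-based one used here.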
  • the choice of an interpolation factor α from the preselection of factors, for each current block at least, is preferably carried out a priori.
  • this a priori choice is made according to a certain criterion, preferably a criterion of local stationarity.
  • the a priori choice of an interpolation factor implements an a priori classification based on a local stationarity criterion detected on the digital signal. For example, the presence of a break in stationarity of the signal is first detected and, in case of positive detection, it is then determined which of the two filters' parameters must be given more weight.
  • the variations of some selected parameters of the first format will advantageously be used to evaluate the stationarity criterion. For example, it is possible to use in particular the LPC coefficients obtained by the first coding format. Another example of parameters will be given in an exemplary embodiment below.
  • the complexity of the process is adjustable depending on the desired quality / complexity compromise (or the desired complexity or the desired quality).
  • the determination of the set of interpolation factors will be more or less efficient (that is to say more or less able to select the optimal set of factors).
  • the interpolation factor values can be recalculated according to the classes formed by the selection algorithm. It will therefore be understood that the procedures determining the set of interpolation factors and the associated classification can be iterated.
  • the number of elements in the preselection is chosen according to a predetermined quality / complexity compromise, according to a preferred characteristic of the invention. Typically, the greater the number of parameters used to detect a break in stationarity, the greater the number of elements in the preselection.
  • the embodiment presented below concerns transcoding between two different coding formats, ITU-T G.729 and ITU-T G.723.1.
  • a description of these two standard encoders is first given, along with their LPC modeling.
  • the reconstructed signal synthesis model is used at the encoder to extract the parameters modeling the signals to be encoded. These signals can be sampled at 8 kHz (300-3400 Hz telephone band) or at a higher frequency, for example 16 kHz for wideband coding (bandwidth 50 Hz to 7 kHz).
  • the compression ratio varies from 1 to 16: these encoders operate at rates of 2 to 16 kbit/s in the telephone band, and at rates of 6 to 32 kbit/s in the wideband.
  • the speech signal is sampled and converted into a series of blocks of L samples.
  • Each block is synthesized by filtering a waveform extracted from a repertoire (also called dictionary), multiplied by a gain, through two filters varying in time.
  • the excitation dictionary is a finite set of waveforms of L samples.
  • the first filter is the long-term prediction filter.
  • an "LTP" analysis (for "Long Term Prediction") evaluates the parameters of this long-term predictor, which exploits the periodicity of voiced sounds.
  • the second filter, which is of interest to the invention, is the short-term prediction filter.
  • LPC Linear Prediction Coding
  • the method used to determine the innovation sequence is analysis by synthesis: at the coder, a large number of innovation sequences from the excitation dictionary are filtered by the two filters, LTP and LPC, and the selected waveform is the one producing the synthetic signal closest to the original signal according to a perceptual weighting criterion, generally known as the CELP criterion.
  • Decoding is much less complex than encoding.
  • the bitstream generated by the encoder allows the decoder after demultiplexing to obtain the quantization index of each parameter.
  • the decoding of the parameters and the application of the synthesis model make it possible to reconstruct the signal.
  • the ITU-T G.729 coder operates on a speech signal band-limited to 3.4 kHz, sampled at 8 kHz and cut into 10 ms frames (80 samples). Each frame is divided into two subframes (numbered 0 and 1) of 40 samples (5 ms). A 10th-order LPC analysis is performed every 10 ms (once per frame) using the autocorrelation method with an asymmetric window of 30 ms and a look-ahead of 5 ms. The first 11 autocorrelation coefficients of the windowed speech signal are first calculated, and the LPC coefficients are deduced from them by the Levinson algorithm. These coefficients are then transformed into the line spectral pair (LSP) domain for quantization and interpolation.
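The autocorrelation method followed by the Levinson recursion can be sketched in Python as below. This is a generic textbook form, not the G.729 reference code: the asymmetric window and the look-ahead are not reproduced, and the sign convention (prediction-error filter with `a[0] = 1`) is an assumption of this sketch.

```python
def autocorrelation(x, order):
    """First (order + 1) autocorrelation coefficients of an already windowed block."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations by the Levinson-Durbin recursion.

    Returns (a, err) where a[0] = 1 and A(z) = sum of a[k] * z^-k is the
    prediction-error filter (assumed convention), and err is the residual energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            raise ValueError("autocorrelation sequence is not positive definite")
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err
```

For a 10th-order analysis, `autocorrelation(block, 10)` yields the eleven coefficients mentioned above, which are then passed to `levinson_durbin(r, 10)`.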
  • LSP spectral line pairs
  • Quantization of the LSP values is performed by means of a switched 4th-order predictive vector quantization using 18 bits.
  • the coefficients of the linear prediction filter, quantized and unquantized, are used for the second subframe, whereas for the first subframe the LPC coefficients (quantized and unquantized) are obtained by linear interpolation of the corresponding LSP values in the adjacent subframes (second subframes of the current frame and of the past frame; see FIGS. 7A and 7B). This interpolation is applied to the LSP coefficients in the cosine domain.
  • the coefficients of the perceptual weighting filter are deduced from the linear prediction filter before quantization.
  • Quantized and unquantized LSP coefficients of interpolated filters are reconverted to LPC coefficients to construct the perceptual weighting and synthesis filters for each subframe.
  • the ITU-T G.723.1 coder operates on a speech signal band-limited to 3.4 kHz, sampled at 8 kHz and cut into 30 ms frames (240 samples).
  • Each frame has 4 subframes of 7.5 ms (60 samples) grouped 2 by 2 in super subframes of 15 ms (120 samples).
  • For each subframe, a 10th-order LPC analysis is performed using the autocorrelation method with a 180-sample Hamming window centered on each subframe (for the last subframe, a look-ahead of 7.5 ms is therefore used). For each subframe, eleven autocorrelation coefficients are first calculated, and the LPC coefficients are then calculated by the Levinson algorithm.
  • the LPC filter of the last subframe is quantized using a predictive vector quantizer.
  • the LPC coefficients are first converted to LSP coefficients. Quantization of the LSPs is performed using 24-bit first-order predictive vector quantization.
  • the LSP coefficients of the last sub-frame thus quantized are decoded and then interpolated with the decoded LSP coefficients of the last subframe of the preceding frame to obtain the coefficients of the first three subframes.
  • These LSP coefficients are reconverted into LPC coefficients in order to build the synthesis filters for the 4 subframes.
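The interpolation of decoded LSP vectors across the subframes of a frame can be illustrated by the Python sketch below. The evenly spaced linear weights are an assumption of this sketch and are not taken from the text of the standard; the function name `interpolate_subframes` is hypothetical.

```python
def interpolate_subframes(lsp_prev, lsp_curr, n_sub=4):
    """Return one LSP vector per subframe of the current frame.

    lsp_prev -- decoded LSP vector of the last subframe of the previous frame
    lsp_curr -- decoded LSP vector of the last subframe of the current frame
    The weight of lsp_curr grows linearly (assumed) and reaches 1 for the
    last subframe, which therefore uses the decoded vector itself.
    """
    vectors = []
    for k in range(1, n_sub + 1):
        w = k / n_sub
        vectors.append([(1.0 - w) * p + w * c for p, c in zip(lsp_prev, lsp_curr)])
    return vectors
```

With four subframes, the first three vectors are interpolated and the fourth equals `lsp_curr`, in line with the description above.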
  • transcoding is done at the "parameter" level.
  • the LSP coefficients of the second coding format are determined by dynamic interpolation of the dequantized LSP coefficients of the first coding format.
  • the interpolated coefficients are then quantized by the second format method.
  • a frame of G.723.1 corresponds to three G.729 frames.
  • FIG. 7B shows one frame of G.723.1 and 3 frames of G.729 and their respective subframes. It thus appears that the subframes of G.729 (5 ms) do not coincide with those of G.723.1 (7.5 ms).
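The misalignment of the subframes can be checked with a few lines of Python. The helper `boundaries` is hypothetical; it only lists subframe boundaries in milliseconds from the frame and subframe durations given above.

```python
from fractions import Fraction


def boundaries(frame_ms, n_frames, n_sub):
    """Subframe boundaries (in ms) over n_frames frames of n_sub subframes each."""
    sub = Fraction(frame_ms) / n_sub
    return [i * sub for i in range(n_frames * n_sub + 1)]


g7231 = boundaries(30, 1, 4)   # one G.723.1 frame: 0, 7.5, 15, 22.5, 30 ms
g729 = boundaries(10, 3, 2)    # three G.729 frames: 0, 5, 10, ..., 30 ms

# Boundaries shared by both formats over the 30 ms span
common = sorted(set(g7231) & set(g729))
```

Only the boundaries at 0, 15 and 30 ms are shared, so the LSP vectors of one format cannot simply be copied to the subframes of the other and have to be interpolated.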
  • a) corresponds substantially to the elements ([E(LSP)2]) of FIG. 5, simply specifying here that the best factors α(n) are estimated per subframe, the subframes here being the blocks of samples considered.
  • FIGS. 8A, 8B and 8C compare the distributions of the spectral distortions obtained by static interpolation and the fine dynamic interpolation within the meaning of the invention. They illustrate the improved performance provided by dynamic interpolation.
  • The set of interpolation factors is {0.24; 0.68; 0.98} (respectively 0.01, 0.39, 0.82).
  • FIGS. 9A and 9B show that the performance of this adaptive interpolation, although coarser, is close to that obtained by fine adaptive interpolation and much better than that of static interpolation.
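  • How a set of three factors produces three interpolated LSP vectors can be sketched as below. The convention that each factor weights the current first-format frame (and its complement the preceding one) is an assumption of this sketch, as are the function names; only the values 0.24, 0.68 and 0.98 come from the text.

```python
EXAMPLE_FACTORS = (0.24, 0.68, 0.98)   # values quoted in the text


def interpolate_lsp(prev_lsp, curr_lsp, alpha):
    """Linear interpolation: (1 - alpha) * prev + alpha * curr (assumed convention)."""
    if len(prev_lsp) != len(curr_lsp):
        raise ValueError("LSP vectors must have the same length")
    return [(1.0 - alpha) * p + alpha * c for p, c in zip(prev_lsp, curr_lsp)]


def interpolate_three_frames(prev_lsp, curr_lsp, factors=EXAMPLE_FACTORS):
    """One interpolated LSP vector per second-format frame."""
    return [interpolate_lsp(prev_lsp, curr_lsp, alpha) for alpha in factors]
```

  • Each of the three resulting vectors would then be quantized by the second format's own method, as described earlier.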
  • the selection of the interpolation factor set is then as follows.
  • The distribution of the "optimal" factors α(3n + i) for fine adaptive interpolation has two peaks at the ends of the interval [0,1]. In most cases, these two extreme values correspond to non-stationary zones exhibiting a break in stationarity, such as an attack or a decay.
  • The procedure for selecting the set of interpolation factors among the three possible sets therefore consists of a first step of detecting a local break in stationarity using a stationarity criterion. Then, in case of positive detection, it is determined whether the G.729 frame lies before or after the break.
  • Figure 10 gives the simplified flowchart of the algorithm for selecting the interpolation factor.
  • The stationarity criterion is evaluated in step 80, and test 81 determines whether or not the signal is stationary. If it is stationary (arrow O, "yes", from test 81), the intermediate factor α'2 is assigned to α(m) (step 82). Otherwise (non-stationary signal, arrow N, "no", at the output of test 81), an attempt is made to determine:
  • whether the break occurs before frame (3m + i) of the G.729 encoder (arrow O at the output of test 83), in which case a factor α'1 from the beginning of the histogram is assigned (step 84); or whether the break occurs after frame (3m + i) of the G.729 encoder (arrow N at the output of test 83), in which case a factor α'3 from the end of the histogram is assigned (step 85).
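  • The decision flow of tests 81 and 83 can be written as a small function. This is a sketch only: the two boolean inputs stand in for the stationarity criterion and the break-position test, whose computation is not shown, and the numeric factor values in the usage line are hypothetical placeholders.

```python
def select_factor(is_stationary, break_before_frame, factor_set):
    """Pick one of three candidate factors following the flowchart.

    factor_set holds (first, intermediate, last): the factor from the
    beginning of the histogram, the intermediate factor, and the factor
    from the end of the histogram.
    """
    first, intermediate, last = factor_set
    if is_stationary:                 # test 81, "yes" branch -> step 82
        return intermediate
    if break_before_frame:            # test 83, "yes" branch -> step 84
        return first
    return last                       # test 83, "no" branch -> step 85


# Hypothetical candidate values, for illustration only.
print(select_factor(False, True, (0.05, 0.68, 0.95)))
```

  • Keeping the decision separate from the measurements makes it easy to swap in a different stationarity criterion without touching the selection logic.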
  • this weight can take into account the relative temporal proximities of the blocks (n) and (n-1) with respect to the block (m) and the instant of rupture.
  • The variations of at least one parameter of the G.723.1 coder are advantageously used to evaluate local stationarity.
  • Several types of parameters can be used, such as LSP vectors (or another LPC representation), pitch periods, fixed excitation gains, and so on. It is also possible to use other parameters calculated on the G.723.1 synthesis signal (such as the energy of this signal per subframe). While the variations can be evaluated by a simple (possibly weighted) quadratic error, more sophisticated measures may also be used, for example to estimate the evolution of the pitch trajectory while taking multiples or submultiples into account.
  • Parameters extracted from the frames preceding the current frame of G.729 can also be used.
  • the choice of the number of criteria and their types depends on the desired quality / complexity compromise.
  • A multi-criteria approach (based on the spectral distortion between two consecutive G.723.1 LPC filters, the evolution of the pitch trajectory and the energy variations of the G.723.1 synthesis signal across the subframes) makes it possible to measure local stationarity reliably and then to select the best interpolation factor among the three.
  • Detection is done by comparing the various stationarity measures against thresholds. These thresholds are preferably determined from a statistical study of the distributions of the variation measures obtained for the optimal classification.
  • The threshold values S and S' were determined so as to favor the interpolation factor close to the static coefficient, which restricts the use of dynamic interpolation to cases where a break is clearly detected.
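  • A threshold-based stationarity test of the kind described above might look like the following sketch. It combines two of the measures mentioned in the text, a (possibly weighted) quadratic error between consecutive LSP vectors and the energy variation of the synthesis signal. The function names, the use of decibels for the energy variation and the threshold parameters are assumptions; the pitch-trajectory criterion is left out.

```python
import math


def weighted_quadratic_error(u, v, weights=None):
    """Sum of w * (u - v)^2 over the components; unit weights by default."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    if weights is None:
        weights = [1.0] * len(u)
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, u, v))


def energy_change_db(prev_energy, curr_energy, floor=1e-12):
    """Absolute energy variation in dB, with a floor to avoid log(0)."""
    ratio = max(curr_energy, floor) / max(prev_energy, floor)
    return abs(10.0 * math.log10(ratio))


def is_locally_stationary(prev_lsp, curr_lsp, prev_energy, curr_energy,
                          lsp_threshold, energy_threshold_db):
    """True only if every measure stays at or below its threshold.

    Both thresholds are hypothetical placeholders; the text derives its
    thresholds from a statistical study of the measured variations.
    """
    lsp_ok = weighted_quadratic_error(prev_lsp, curr_lsp) <= lsp_threshold
    energy_ok = energy_change_db(prev_energy, curr_energy) <= energy_threshold_db
    return lsp_ok and energy_ok
```

  • Requiring all measures to pass before declaring the signal stationary is one possible combination rule. Loosening the thresholds makes a break harder to detect and so favors the static factor, which is the conservative behavior described above.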
  • the interpolation factors are recalculated according to the classification performed by this decision algorithm.
  • The dynamic interpolation procedure may be conservative, in which case the static interpolation factor is chosen as the intermediate interpolation factor α'2 and only the extreme factors (α'1, α'3) are optimized.
  • The description above was limited to the case where the LPC parameters of a current frame of the second format are determined by an adaptive interpolation of the LPC parameters of two consecutive frames of the first format.
  • the invention can be applied to more complex interpolation schemes, for example involving more than two frames of the first format and / or possibly other frames of the second format.
  • the method in the sense of the invention is not limited to an embodiment according to which the LPC coefficients of the second format would be deduced from an interpolation on the LPC coefficients of the first format only.
  • a variant that remains within the scope of the invention would consist of using the LPC coefficients of both the first and the second format (possibly determined for previous blocks) to carry out the interpolation.
  • The method within the meaning of the invention has been described above as involving a given block (n) and at least one preceding block (n-1).
  • This given block may be a current block, while the previous block (n-1) is a past block.
  • the interpolation can be carried out for a current block (n) and a future block (n + 1), if a delay is allowed in the process within the meaning of the invention.
  • the invention can be applied to other sample blocks than the frames of the first or second format (for example subframes).

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
EP06743681A 2005-04-26 2006-04-12 Verfahren zur anpassung für interoperabilität zwischen kurzzeit-korrelationsmodellen digitaler signale Withdrawn EP1875465A1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR0504191A FR2884989A1 (fr) 2005-04-26 2005-04-26 Procede d'adaptation pour une interoperabilite entre modeles de correlation a court terme de signaux numeriques.
PCT/FR2006/000805 WO2006114494A1 (fr) 2005-04-26 2006-04-12 Procede d’adaptation pour une interoperabilite entre modeles de correlation a cout terme de signaux numeriques

Publications (1)

Publication Number Publication Date
EP1875465A1 true EP1875465A1 (de) 2008-01-09

Family

ID=35482341

Family Applications (1)

Application Number Title Priority Date Filing Date
EP06743681A Withdrawn EP1875465A1 (de) 2005-04-26 2006-04-12 Verfahren zur anpassung für interoperabilität zwischen kurzzeit-korrelationsmodellen digitaler signale

Country Status (5)

Country Link
US (1) US8078457B2 (de)
EP (1) EP1875465A1 (de)
CN (1) CN101208741B (de)
FR (1) FR2884989A1 (de)
WO (1) WO2006114494A1 (de)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101441896B1 (ko) * 2008-01-29 2014-09-23 삼성전자주식회사 적응적 lpc 계수 보간을 이용한 오디오 신호의 부호화,복호화 방법 및 장치
CN101567203B (zh) * 2008-04-24 2013-06-05 深圳富泰宏精密工业有限公司 自动搜寻及播放音乐的系统及方法
US9245529B2 (en) * 2009-06-18 2016-01-26 Texas Instruments Incorporated Adaptive encoding of a digital signal with one or more missing values
US8743936B2 (en) * 2010-01-05 2014-06-03 Lsi Corporation Systems and methods for determining noise components in a signal set
WO2012103686A1 (en) * 2011-02-01 2012-08-09 Huawei Technologies Co., Ltd. Method and apparatus for providing signal processing coefficients
EP2660811B1 (de) * 2011-02-16 2017-03-29 Nippon Telegraph And Telephone Corporation Kodierungsverfahren, dekodierungsverfahren, kodierer, dekodierer, programm und aufzeichnungsmedium
US9336789B2 (en) * 2013-02-21 2016-05-10 Qualcomm Incorporated Systems and methods for determining an interpolation factor set for synthesizing a speech signal
WO2018108520A1 (en) * 2016-12-16 2018-06-21 Telefonaktiebolaget Lm Ericsson (Publ) Methods, encoder and decoder for handling line spectral frequency coefficients

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6173257B1 (en) * 1998-08-24 2001-01-09 Conexant Systems, Inc Completed fixed codebook for speech encoder
US6240386B1 (en) * 1998-08-24 2001-05-29 Conexant Systems, Inc. Speech codec employing noise classification for noise compensation
EP1095370A1 (de) * 1999-04-05 2001-05-02 Hughes Electronics Corporation Spektrale phasenmodellierung von prototyp-wellenformkomponenten für ein im frequenubereich arbeitendes interpolatives sprach-codec-system
US6434519B1 (en) * 1999-07-19 2002-08-13 Qualcomm Incorporated Method and apparatus for identifying frequency bands to compute linear phase shifts between frame prototypes in a speech coder
US20030195745A1 (en) * 2001-04-02 2003-10-16 Zinser, Richard L. LPC-to-MELP transcoder
US20030028386A1 (en) * 2001-04-02 2003-02-06 Zinser Richard L. Compressed domain universal transcoder
JP4108317B2 (ja) * 2001-11-13 2008-06-25 日本電気株式会社 符号変換方法及び装置とプログラム並びに記憶媒体
JP4263412B2 (ja) * 2002-01-29 2009-05-13 富士通株式会社 音声符号変換方法
US7433815B2 (en) * 2003-09-10 2008-10-07 Dilithium Networks Pty Ltd. Method and apparatus for voice transcoding between variable rate coders

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2006114494A1 *

Also Published As

Publication number Publication date
US8078457B2 (en) 2011-12-13
FR2884989A1 (fr) 2006-10-27
CN101208741B (zh) 2011-08-31
CN101208741A (zh) 2008-06-25
WO2006114494A1 (fr) 2006-11-02
US20090299737A1 (en) 2009-12-03

Similar Documents

Publication Publication Date Title
EP2277172B1 (de) Verbergung von übertragungsfehlern in einem digitalsignal in einer hierarchischen decodierungsstruktur
EP3161659B1 (de) Wiederabtastung eines tonsignals durch interpolation zur codierung/decodierung mit geringer verzögerung
EP1316087B1 (de) Übertragungsfehler-verdeckung in einem audiosignal
EP1051703B1 (de) Verfahren zur dekodierung eines audiosignals mit korrektur von übertragungsfehlern
EP1875465A1 (de) Verfahren zur anpassung für interoperabilität zwischen kurzzeit-korrelationsmodellen digitaler signale
EP0906613B1 (de) Verfahren und vorrichtung zur kodierung eines audiosignals mittels "vorwärts"- und "rückwärts"-lpc-analyse
EP1692687B1 (de) Transcodierung zwischen den indizes von mehrimpuls-wörterbüchern zur codierung bei der digitalen signalkomprimierung
FR2977439A1 (fr) Fenetres de ponderation en codage/decodage par transformee avec recouvrement, optimisees en retard.
EP3069340B1 (de) Übergang von einer transformationscodierung/-decodierung zu einer prädiktiven codierung/decodierung
EP2080194B1 (de) Dämpfung von stimmüberlagerung, im besonderen zur erregungserzeugung bei einem decoder in abwesenheit von informationen
EP1836699B1 (de) Verfahren und Vorrichtung zur Ausführung einer optimalizierten Audiokodierung zwischen zwei Langzeitvorhersagemodellen
EP2795618B1 (de) Verfahren zur erkennung eines vorgegebenen frequenzbandes in einem audiodatensignal, erkennungsvorrichtung und computerprogramm dafür
EP3138095A1 (de) Verbesserte frameverlustkorrektur mit sprachinformationen
EP2652735B1 (de) Verbesserte kodierung einer verbesserungsstufe bei einem hierarchischen kodierer
EP2589045B1 (de) Adaptive lineare prädiktive codierung/decodierung
WO2002091362A1 (fr) Procede d'extraction de parametres d'un signal audio, et codeur mettant en oeuvre un tel procede
FR2830970A1 (fr) Procede et dispositif de synthese de trames de substitution, dans une succession de trames representant un signal de parole
FR2980620A1 (fr) Traitement d'amelioration de la qualite des signaux audiofrequences decodes
FR2997250A1 (fr) Detection d'une bande de frequence predeterminee dans un contenu audio code par sous-bandes selon un codage de type modulation par impulsions

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20071017

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

DAX Request for extension of the european patent (deleted)
RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: ORANGE

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20161101