EP1692689B1 - Optimized multiple coding method - Google Patents

Optimized multiple coding method

Info

Publication number
EP1692689B1
EP1692689B1 EP04805538A EP04805538A EP1692689B1 EP 1692689 B1 EP1692689 B1 EP 1692689B1 EP 04805538 A EP04805538 A EP 04805538A EP 04805538 A EP04805538 A EP 04805538A EP 1692689 B1 EP1692689 B1 EP 1692689B1
Authority
EP
European Patent Office
Prior art keywords
coders
coder
coding
data rate
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Not-in-force
Application number
EP04805538A
Other languages
English (en)
French (fr)
Other versions
EP1692689A1 (de)
Inventor
David Virette
Claude Lamblin
Abdellatif Benjelloun Touimi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orange SA
Original Assignee
France Telecom SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by France Telecom SA
Priority to PL04805538T (PL1692689T3)
Publication of EP1692689A1
Application granted
Publication of EP1692689B1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/002 Dynamic bit allocation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0212 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes

Definitions

  • the present invention relates to the encoding/decoding of digital signals, in applications for transmission or storage of multimedia signals such as audio signals (speech and/or sounds) or video.
  • the present invention lies in the context of optimizing "multiple coding" techniques, which are used whenever a digital signal, or a portion of this signal, is coded using several coding techniques.
  • This multiple coding can be performed simultaneously (in one pass) or not.
  • the processes can be carried out on the same signal, or possibly on versions derived from the same signal (for example according to different bandwidths).
  • Each coder carries out the compression of a version resulting from the decoding of the signal compressed by the preceding encoder.
  • Multiple coding arises, for example, when the same content is coded in several formats and then transmitted to terminals that do not support the same coding formats. For a real-time broadcast, the processing should be done simultaneously. For access to a database, the codings can be carried out one after another, offline. In these examples, multiple coding makes it possible to code the same signal in different formats by using several coders (or possibly several rates or several modes of the same coder), each coder operating independently of the others.
  • multi-mode coding (with reference to the selection of a coding "mode").
  • in this mode, a number of coders sharing a "common past" are required to encode the same signal portion.
  • the coding techniques used may be different, or from a single coding structure. However, they will not be completely independent unless they are " memory-free " techniques.
  • the second use case mentioned concerns multi-mode coding applications, allowing the selection of one encoder from a set for each portion of signal analyzed.
  • the selection requires the definition of a criterion, the most common aiming at the optimization of the rate-distortion compromise.
  • the signal being analyzed over successive time segments, at each segment several codings are evaluated.
  • the coding with the lowest bit rate for a given quality is then selected, or the coding with the best quality for a given bit rate. It will be noted that constraints other than rate/distortion can be used.
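The two selection rules can be sketched as follows. This is only an illustration under stated assumptions: the `Candidate` record, its `bits` and `distortion` fields and both helper functions are hypothetical names introduced here, and the distortion measure is left abstract.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    mode: str          # hypothetical label of the coding mode
    bits: int          # bits spent on the current segment
    distortion: float  # lower is better (measure left abstract)


def lowest_rate_for_quality(cands: List[Candidate], max_distortion: float) -> Optional[Candidate]:
    """Cheapest candidate whose distortion stays within the quality target."""
    ok = [c for c in cands if c.distortion <= max_distortion]
    return min(ok, key=lambda c: c.bits) if ok else None


def best_quality_for_rate(cands: List[Candidate], max_bits: int) -> Optional[Candidate]:
    """Least distorted candidate that fits within the bit budget."""
    ok = [c for c in cands if c.bits <= max_bits]
    return min(ok, key=lambda c: c.distortion) if ok else None


# Toy figures for one segment, invented for the example.
segment = [Candidate("A", 103, 0.30), Candidate("B", 118, 0.22), Candidate("C", 134, 0.15)]
```

Both helpers return `None` when no candidate satisfies the constraint, so the caller has to decide on a fallback.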
  • the coding selection is made “a priori” by an analysis of the signal on the segment in question (selection according to the characteristics of the signal).
  • selection according to the characteristics of the signal has led to the proposal of an "a posteriori" selection of the optimum mode after coding all the modes, at the price, however, of high complexity.
  • the decision a priori is made from a classification of the input signal.
  • There are many methods of signal classification.
  • US 6,581,032 discloses a speech compression system comprising four codecs selectively activated according to a signal to be compressed and according to a classification of the signals.
  • US 6,141,638 discloses an encoder using different code dictionaries according to parameters of the signal to be encoded.
  • the present invention improves the situation.
  • the above steps are implemented by a computer program product comprising program instructions for this purpose.
  • the present invention also aims at such a computer program product, intended to be stored in a memory of a processing unit, in particular a computer or a mobile terminal, or on a removable memory medium and intended to cooperate with a drive of the processing unit.
  • the present invention also aims at a device for aiding compression coding, for the implementation of the method according to the invention, and then including a memory adapted to store instructions of a computer program product of the aforementioned type.
  • FIG. 1a shows a plurality of encoders C0, C1, ..., CN in parallel, each receiving an input signal s0.
  • Each encoder comprises functional blocks BF1 to BFn for implementing successive coding steps and ultimately outputting a coded bitstream BS0, BS1, ..., BSN. In a multi-mode coding application, the outputs of the coders C0 to CN are connected to an optimal mode selection module MM, and the bit stream BS of the optimal coder is transmitted (dotted arrows in figure 1a).
  • Some BFi function blocks are sometimes identical from one mode (or encoder) to another, while others differ only in quantizer level. Usable relationships also exist when encoders from the same coding family are used, using similar models or computing parameters physically related to the signal.
  • Figure 1b illustrates the proposed solution.
  • the aforementioned "common" operations are performed once for at least some of the coders and, preferably, for all the coders, in an independent module MI which redistributes the results obtained to at least some of the coders or, preferably, to all of them. The results obtained are thus shared among at least some of the coders C0 to CN (this sharing is called "pooling" below).
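A minimal sketch of this pooling, assuming a toy common operation: `SharedModule` stands in for the independent module MI, and `frame_energy`, `encode_all` and the two lambda coders are hypothetical names invented for the illustration.

```python
from typing import Callable, Dict, List, Tuple


class SharedModule:
    """Hypothetical stand-in for the independent module MI."""

    def __init__(self, common_op: Callable[[List[float]], float]):
        self.common_op = common_op
        self.calls = 0  # counts how often the common operation really runs

    def run(self, signal: List[float]) -> float:
        self.calls += 1
        return self.common_op(signal)


def frame_energy(signal: List[float]) -> float:
    """Toy common operation: energy of the frame."""
    return sum(x * x for x in signal)


def encode_all(signal, coders: Dict[str, Callable], shared: SharedModule) -> Dict[str, Tuple]:
    pooled = shared.run(signal)  # computed once ...
    # ... then redistributed to every coder instead of being recomputed.
    return {name: coder(signal, pooled) for name, coder in coders.items()}


coders = {
    "C0": lambda s, e: ("C0", len(s), e),
    "C1": lambda s, e: ("C1", len(s), e),
}
mi = SharedModule(frame_energy)
outputs = encode_all([0.5, -0.25, 1.0], coders, mi)
```

Whatever the number of coders, `mi.calls` stays at one per frame, which is the saving the pooling is meant to provide.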
  • Such an independent module MI may be part of a device for a multiple compression coding as defined above.
  • the existing functional block or blocks BF1 to BFn of a single coder or of several different coders are used, this coder or these coders being chosen according to criteria which will be described later.
  • the present invention can implement several strategies which, of course, may differ depending on the role of the functional block considered.
  • a first strategy is to use the parameters of the encoder whose bit rate is the lowest to focus the search parameters for all other modes.
  • a second strategy is to use the parameters of the encoder whose rate is the highest, then to " degrade " progressively to the encoder whose bit rate is the lowest.
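The first strategy (seeding the other modes from the lowest-rate result) might look like the sketch below. It assumes sorted scalar codebooks and a fixed search radius, neither of which comes from the source; `full_search` and `focused_search` are hypothetical helpers.

```python
import bisect


def full_search(codebook, x):
    """Exhaustive nearest-neighbour search in a scalar codebook."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - x))


def focused_search(codebook, x, seed_value, radius):
    """Search only the entries within `radius` indices of the seed position.

    `codebook` must be sorted; `seed_value` is the value already chosen by
    another coder (here the lowest-rate one).
    """
    center = bisect.bisect_left(codebook, seed_value)
    lo = max(0, center - radius)
    hi = min(len(codebook), center + radius + 1)
    return min(range(lo, hi), key=lambda i: abs(codebook[i] - x))


low_rate_cb = [i / 4 for i in range(5)]      # coarse codebook, step 1/4
high_rate_cb = [i / 32 for i in range(33)]   # fine codebook, step 1/32

x = 0.41
seed = low_rate_cb[full_search(low_rate_cb, x)]       # result of the lowest-rate coder
k = focused_search(high_rate_cb, x, seed, radius=4)   # 9 candidates instead of 33
```

If the radius is too small relative to the coarse step, the focused search can miss the entry an exhaustive search would find, so the radius trades complexity against optimality.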
  • the present invention makes it possible to reduce the complexity of the calculations preliminary to the a posteriori selection of an encoder performed in the last step, for example by the last module MM before the transmission of the bit stream BS.
  • MSPi partial selection module
  • A more sophisticated variant of the multi-mode structure, based on the division into functional blocks described above, is now proposed with reference to figure 1d.
  • the multi-mode structure of figure 1d is called a "trellis" structure, with several possible paths through the trellis.
  • all the possible paths of the trellis are represented, so that it takes the form of a tree.
  • each path of the trellis is defined by a combination of operating modes of the functional blocks, each functional block supplying several possible variants of the next functional block.
  • each coding mode is derived from the combination of operating modes of the functional blocks: the functional block 1 has N1 operating modes, the functional block 2 has N2, and so on up to the block P.
  • a first feature of this structure is that it provides, for a given functional block, a common calculation module per output of the previous functional block. These common calculation modules perform the same operations, but on the basis of different signals since they come from different previous blocks.
  • the common calculation modules of the same level are pooled: the results of a given module usable by the following modules are provided to these following modules.
  • a partial selection, made at the end of the processing of each functional block advantageously makes it possible to eliminate the less efficient branches according to the chosen criterion. It is therefore possible to reduce the number of branches of the trellis to be evaluated.
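The effect of partial selection on the number of branches can be illustrated with a toy model. It assumes, purely for the example, that each operating mode of each functional block has a fixed additive cost; in a real coder the cost of a block depends on the output of the previous one. `explore_trellis` and `stage_costs` are hypothetical names.

```python
def explore_trellis(stage_costs, keep=None):
    """Walk the trellis block by block.

    stage_costs[p][m] is the (toy, additive) cost of mode m of block p.
    After each block only the `keep` best partial paths survive
    (keep=None disables the partial selection).
    Returns (best_path, number_of_branches_evaluated).
    """
    paths = [((), 0.0)]
    evaluated = 0
    for costs in stage_costs:
        extended = []
        for modes, total in paths:
            for m, c in enumerate(costs):
                extended.append((modes + (m,), total + c))
                evaluated += 1
        extended.sort(key=lambda p: p[1])          # partial selection criterion
        paths = extended if keep is None else extended[:keep]
    return paths[0], evaluated


# Three functional blocks with N1 = 2, N2 = 3 and N3 = 2 operating modes.
stage_costs = [[3.0, 1.0], [2.0, 0.5, 4.0], [1.0, 2.0]]

best_full, n_full = explore_trellis(stage_costs)              # every branch
best_pruned, n_pruned = explore_trellis(stage_costs, keep=2)  # 2 survivors per block
```

With additive costs the pruned search necessarily finds the same path; with interdependent blocks it is only an approximation, which is the price of the reduced branch count.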
  • the chosen trellis path is the one passing through the lowest-rate functional block, or the highest-rate functional block, depending on the coding context, and the results obtained from the lowest (or highest) bit rate functional block are adapted to the bit rates of at least some of the other functional blocks by a focused search of parameters for at least some of the other functional blocks, up to the highest (or lowest) rate functional block.
  • CELP Code Excited Linear Prediction
  • the reconstructed signal synthesis model is used at the encoder to extract the parameters modeling the signals to be coded.
  • These signals can be sampled at the frequency of 8 kHz (telephone band 300-3400 Hz) or at a higher frequency, for example at 16 kHz for wideband coding (bandwidth 50 Hz to 7 kHz).
  • the compression ratio varies from 1 to 16.
  • These encoders operate at rates of 2 to 16 kbit/s in the telephone band, and at rates of 6 to 32 kbit/s in wideband.
  • the CELP type digital coding device, currently the most widely used analysis-by-synthesis coder, is shown in figure 3 in the form of its main functional blocks.
  • the speech signal s0 is sampled and converted into a sequence of frames of L samples. Each frame is synthesized by filtering a waveform extracted from a codebook (called a "dictionary"), multiplied by a gain, through two time-varying filters.
  • the fixed excitation dictionary is a finite set of waveforms of L samples.
  • the first filter is a long-term prediction filter. A " LTP " ( Long Term Prediction ) analysis makes it possible to evaluate the parameters of this long-term predictor which exploits the periodicity of the voiced sounds, this harmonic component being modeled in the form of an adaptive dictionary (block 32) .
  • the second filter is a short-term prediction filter.
  • the " LPC " ( Linear Prediction Coding ) analysis methods make it possible to obtain these short-term prediction parameters, which are representative of the vocal tract transfer function and characteristics of the envelope of the signal spectrum.
  • the method used to determine the innovation sequence is analysis by synthesis, which can be summarized as follows. At the encoder, a large number of innovation sequences from the fixed excitation dictionary are filtered by the LPC filter (synthesis filter of function block 34 in figure 3). The adaptive excitation has been obtained beforehand in a similar way.
  • the selected waveform is the one producing the synthetic signal closest to the original signal (error minimization at function block 35), according to a perceptual weighting criterion (function block 36) generally known as the "CELP" criterion.
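The analysis-by-synthesis search can be sketched as below. This is a simplified illustration, not the coder of figure 3: the perceptual weighting and the adaptive excitation are omitted, and the filter coefficients, the four-entry dictionary and all function names are assumptions made for the example.

```python
def synthesis_filter(excitation, a):
    """All-pole filter 1/A(z) with A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..."""
    out = []
    for n, x in enumerate(excitation):
        y = x - sum(a[k] * out[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        out.append(y)
    return out


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def search_codebook(target, codebook, a):
    """Return (index, gain) of the codeword whose filtered, scaled version
    is closest to `target` in the squared-error sense."""
    best = None
    for index, codeword in enumerate(codebook):
        y = synthesis_filter(codeword, a)
        energy = dot(y, y)
        if energy == 0.0:
            continue
        corr = dot(target, y)
        # With the optimal gain corr/energy, the error is
        # |target|^2 - corr^2/energy, so we maximise corr^2/energy.
        score = corr * corr / energy
        if best is None or score > best[0]:
            best = (score, index, corr / energy)
    return best[1], best[2]


a = [-0.9]  # hypothetical first-order LPC filter: A(z) = 1 - 0.9 z^-1
codebook = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, -1, 0, 0],
    [1, 0, 0, 0, -1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 0],
]
# Build a target that is exactly 0.7 times the synthesis of codeword 2.
target = [0.7 * v for v in synthesis_filter(codebook[2], a)]
index, gain = search_codebook(target, codebook, a)
```

Every codeword has to be filtered before it can be compared with the target, which is why this search dominates encoder complexity and why sharing it across modes is attractive.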
  • Decoding is, for its part, much less complex than coding.
  • the bitstream generated by the coder enables the decoder, after demultiplexing, to obtain the quantization index of each parameter.
  • the decoding of the parameters and the application of the synthesis model then make it possible to reconstruct the signal.
  • the first embodiment relates to the perceptual frequency-domain coder called "TDAC", described in particular in the published document US 2001/027,393.
  • This TDAC encoder is used to encode digital audio signals sampled at 16 kHz (wide band).
  • the figure 4a illustrates the main functional blocks of this encoder.
  • An audio signal x (n) limited in band at 7 kHz and sampled at 16 kHz is cut into frames of 320 samples (20 ms).
  • a Modified Discrete Cosine Transform (or "MDCT") is applied (function block 41) to blocks of 640 input samples with 50% overlap, so that the MDCT analysis is refreshed every 20 ms.
  • the spectrum is limited to 7225 Hz by setting the last 31 coefficients to zero (only the first 289 coefficients are nonzero).
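The figures quoted for the TDAC framing are mutually consistent, as the short check below shows. It relies only on the numbers given in the text and on the fact that an MDCT over a block of 2N samples produces N coefficients; the constant names are chosen for the example.

```python
SAMPLE_RATE_HZ = 16000
FRAME_SAMPLES = 320      # one frame
MDCT_BLOCK = 640         # block fed to the MDCT, 50% overlap
ZEROED_COEFFS = 31       # last coefficients forced to zero

frame_ms = 1000 * FRAME_SAMPLES / SAMPLE_RATE_HZ   # duration of one frame
hop = MDCT_BLOCK // 2                              # advance between two blocks
coeffs = MDCT_BLOCK // 2                           # MDCT of 2N samples -> N coefficients
bin_hz = (SAMPLE_RATE_HZ / 2) / coeffs             # bandwidth covered by one coefficient
kept = coeffs - ZEROED_COEFFS                      # coefficients left nonzero
cutoff_hz = kept * bin_hz                          # upper limit of the kept spectrum
```

Each of the 320 coefficients covers 25 Hz, so keeping 289 of them gives the stated 7225 Hz limit, and a hop of 320 samples matches the 20 ms refresh.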
  • a masking curve (block 42) is determined from this spectrum and all masked coefficients are set to zero.
  • the spectrum is divided into 32 bands of unequal widths. Any masked bands are determined according to the transformed coefficients of the signals. For each band of the spectrum, the energy of the MDCT coefficients is calculated (to obtain scale factors).
  • the 32 scale factors constitute the spectral envelope of the signal, which is then quantized and coded by entropy encoding (function block 43), and finally transmitted in the coded frame s_c.
  • the dynamic allocation of the bits is based on a band masking curve (functional block 42) calculated from the decoded and dequantized version of the spectral envelope. This measurement makes it possible to have compatibility between the bit allocation of the encoder and the decoder.
  • the normalized MDCT coefficients in each band are then quantized (function block 45) by vector quantizers using size-nested dictionaries, the dictionaries being composed of a type II permutation code union.
  • the information on the tone (coded here on one bit B1) and the voicing (coded here on one bit B0), as well as the spectral envelope e_q(i) and the coded coefficients y_q(j), are multiplexed (block 46 of figure 4a) and transmitted in frames.
  • the functional blocks shared by the coders bear the same references as those of a single TDAC coder as represented in figure 4a.
  • the bit allocation block 44 is used in several passes, and the number of bits allocated is adjusted for the transquantization performed by each coder (blocks 45_1, ..., 45_(K-2), 45_(K-1)), as will be seen below. Note further that these transquantizations use the results obtained by the quantization function block 45_0 for a chosen encoder, of index 0 (the lowest rate encoder in the example described).
  • the only functional blocks of the encoders which act without real interaction are the multiplexing blocks 46_0, 46_1, ..., 46_ (K-2), 46_ (K-1), although they all use the same information of voicing and tone, as well as the same coded spectral envelope. As such, it is simply stated that a partial pooling of the multiplexing can be conducted, again.
  • the strategy employed is to exploit the results of the two functional blocks, bit allocation and quantization, obtained for the bit stream (0), at the lowest bit rate D_0, to speed up the operations of the two corresponding function blocks for the K-1 other bit streams (k) (1 ≤ k < K). It is also possible to consider a multi-rate coding scheme which uses one bit-allocation functional block per bit stream (without factorization of this block) but pools part of the quantization operations thereafter.
  • the multiple coding techniques presented below are advantageously based on intelligent transcoding used for the reduction of the coded audio stream bit rate, generally located in a node of the network.
  • the bit streams k, 0 ≤ k < K, are classified in increasing order of rates (D_0 < D_1 < ... < D_{K-1}).
  • bit stream 0 corresponds to the lowest bit rate.
  • a second phase is used to perform the readjustment. This step is preferably done by a succession of iterative operations based on a perceptual criterion that adds or removes bits from the bands.
  • the bits are added to the bands where the perceptual improvement is greatest. This perceptual improvement is measured by the variation of the noise-to-mask ratio between the initial and final allocation of the bands. The rate is increased for the band where this variation is greatest. In the opposite case, where the total number of distributed bits is greater than the number available, bits are removed from the bands by the dual of this procedure.
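A possible shape for this readjustment loop is sketched below. The perceptual measure is replaced by a toy `benefit` function with diminishing returns, which is an assumption of this example and not the noise-to-mask computation of the coder; `readjust`, `weights` and `max_bits` are likewise hypothetical.

```python
def readjust(bits, budget, benefit, max_bits=15):
    """Add or remove bits one at a time until sum(bits) == budget.

    benefit(band, b) is the perceptual improvement obtained when `band`
    goes from b to b + 1 bits (toy stand-in for the variation of the
    noise-to-mask ratio).
    """
    bits = list(bits)
    while sum(bits) < budget:
        cands = [j for j in range(len(bits)) if bits[j] < max_bits]
        if not cands:
            break
        # Give one more bit to the band that benefits most.
        j = max(cands, key=lambda j: benefit(j, bits[j]))
        bits[j] += 1
    while sum(bits) > budget:
        cands = [j for j in range(len(bits)) if bits[j] > 0]
        # Dual step: take one bit from the band that loses least.
        j = min(cands, key=lambda j: benefit(j, bits[j] - 1))
        bits[j] -= 1
    return bits


weights = [8.0, 4.0, 2.0, 1.0]               # invented perceptual weights per band
benefit = lambda j, b: weights[j] / (b + 1)  # diminishing returns, for illustration

initial = [2, 2, 2, 2]
more = readjust(initial, 11, benefit)   # too few bits distributed: add
fewer = readjust(initial, 6, benefit)   # too many bits distributed: remove
```

Starting from an allocation already computed for another bit stream keeps the number of iterations proportional to the difference between the two budgets.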
  • the first determination step by the above formula can be done once based on the lowest bit rate D 0 .
  • the superscript notation (k) indicates a parameter used in the processing performed to obtain the bit stream of the encoder k.
  • parameters without this superscript are calculated once and for all for the bit stream 0. They are independent of the rate (or mode) considered.
  • the MPEG-1 Layer I & II coder shown in figure 6a uses a filter bank with 32 uniform sub-bands (block 61 of figure 6a) to perform the time/frequency transformation of the input audio signal s0.
  • the output samples of each subband are grouped and then normalized by a common scale factor (determined by the function block 67) before being quantized (block 62).
  • the number of levels of the uniform scalar quantizer used for each subband results from a dynamic bit allocation procedure (performed by block 63). This procedure uses a psychoacoustic model (block 64) to determine the bit distribution that makes the quantization noise as imperceptible as possible.
  • the hearing models proposed in the standard are based on the estimation of the spectrum obtained by a fast Fourier transform (FFT) of the input temporal signal (made by block 65).
  • FFT fast Fourier transform
  • the frame s_c, multiplexed by block 66 of figure 6a and finally transmitted, contains, after a header field H_D, the set of samples of the quantized sub-bands E_SB, which represent the main information, and complementary information used for the decoding operation, consisting of the scale factors F_E and the bit allocation A_i.
  • the two blocks 64 and 65 already provide the signal-to-mask ratios (SMR arrows in figures 6a and 7), used for the bit allocation procedure (block 70 of figure 7).
  • Steps 1 and 2 are repeated iteratively until the total number of available bits, corresponding to the operating rate, is distributed.
  • the result is then a bit distribution vector ( b 0 , b 1 , ..., b M -1 ) .
  • the K outputs of this bit allocation block then feed the quantization blocks for each of the bit streams at the given bit rate.
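One way to obtain K allocations from a single iterative pass is sketched below. The two steps of the loop are not reproduced in this extract, so the sketch assumes they are "find the subband with the lowest mask-to-noise ratio" and "give it one more bit", with roughly 6.02 dB gained per bit; side-information costs are ignored and `multirate_allocation` is a hypothetical name.

```python
def multirate_allocation(smr_db, budgets, db_per_bit=6.02):
    """Run one greedy allocation and take a snapshot at each bit budget.

    smr_db  : signal-to-mask ratio of each subband, in dB
    budgets : total bits available for each bit stream
    Returns one bit distribution vector per budget, in increasing budget order.
    """
    bits = [0] * len(smr_db)
    snapshots = []
    for budget in sorted(budgets):
        while sum(bits) < budget:
            # Step 1 (assumed): mask-to-noise ratio with the current allocation.
            mnr = [db_per_bit * b - s for b, s in zip(bits, smr_db)]
            # Step 2 (assumed): one more bit for the worst subband.
            j = mnr.index(min(mnr))
            bits[j] += 1
        snapshots.append(list(bits))  # allocation for this bit stream
    return snapshots


smr_db = [20.0, 12.0, 5.0, -3.0]   # invented SMR values for 4 subbands
budgets = [4, 8]                   # invented bit budgets for K = 2 bit streams
allocations = multirate_allocation(smr_db, budgets)
```

Because the loop only ever adds bits, the allocation for a higher rate continues from that of the lower rate rather than restarting from zero.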
  • the last exemplary embodiment relates to multi-mode speech coding with a posteriori decision, based on the 3GPP NB-AMR ("Narrow-Band Adaptive Multi-Rate") coder, which is an adaptive multi-rate narrow-band speech encoder conforming to a 3GPP standard.
  • This encoder, which belongs to the well-known family of CELP coders whose principle was briefly described above, has eight modes (or bit rates) ranging from 12.2 kbit/s to 4.75 kbit/s, all based on the ACELP ("Algebraic Code Excited Linear Prediction") technique.
  • figure 8 gives the coding scheme of this encoder in functional blocks. This structure was exploited to build an a posteriori decision multi-mode encoder based on 4 modes of the NB-AMR encoder (7.4, 6.7, 5.9, 5.15).
  • the complexity is even smaller.
  • the non-identical functional block calculations for some modes are accelerated by exploiting those of another mode or a common processing module, as will be seen below.
  • the results of the four encodings thus shared are then different from those of the four codings in parallel.
  • the functional blocks of these four modes are used for trellis multi-mode coding, as has been seen above with reference to figure 1d.
  • the 3GPP NB-AMR coder operates on a speech signal band-limited to 3.4 kHz, sampled at 8 kHz and cut into 20 ms frames (160 samples). Each frame has 4 subframes of 5 ms (40 samples), grouped 2 by 2 into "super subframes" of 10 ms (80 samples). For all modes, the same types of parameters are extracted from the signal, but with variants in the modeling and/or quantization of these parameters. In the NB-AMR encoder, five types of parameters are analyzed and coded. Line Spectral Pairs (LSP) parameters are processed once per frame for all modes except the 12.2 mode (where they are processed once per super subframe). The other parameters (in particular the LTP delay, the gain of the adaptive excitation, the fixed excitation, the gain of the fixed excitation) are processed once per subframe.
  • LSP Line Spectral Pairs
  • the four modes considered here (7.4, 6.7, 5.9, 5.15) differ essentially in the quantization of their parameters.
  • the bit allocation of these 4 modes is summarized in Table 1 below. Table 1: Bit allocation of the 4 modes (7.4, 6.7, 5.9, 5.15) of the 3GPP NB-AMR encoder. Mode (kbit/s): 7.4, 6.7, 5.9, 5.15
  • These 4 modes of the NB-AMR encoder (7.4, 6.7, 5.9, 5.15) have identical modules, such as preprocessing, analysis of the linear prediction coefficients and calculation of the weighted signal.
  • the signal preprocessing is high-pass filtering with an 80 Hz cut-off frequency, to suppress DC components, combined with a division of the input signals to avoid overflows.
  • the quantization of the LSP parameters of the 5.15 kbit/s mode is done on 23 bits, that of the other three modes on 26 bits.
  • the Cartesian product vector quantization (so-called " split VQ ") of the LSP parameters divides the LSP parameters into 3 sub-vectors, of size 3, 3 and 4.
  • the first sub-vector, composed of the first 3 LSPs, is quantized on 8 bits by the same dictionary for the four modes.
  • the second sub-vector, composed of the following 3 LSPs, is quantized for the 3 higher-rate modes by a dictionary of size 512 (9 bits) and for the 5.15 mode by half of this dictionary (one vector out of 2).
  • the third and last sub-vector, composed of the last 4 LSPs, is quantized for the 3 higher-rate modes by a dictionary of size 512 (9 bits) and for the lowest-rate mode by a dictionary of size 128 (7 bits).
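The bit counts of the split quantization, and the way the 5.15 mode reuses every other vector of the 512-entry dictionary, can be checked with the sketch below. The dictionary contents are invented (a simple ramp) and `nearest` and `index_bits` are hypothetical helpers; only the sizes and bit counts come from the text.

```python
def nearest(codebook, v):
    """Index of the codebook vector closest to v (squared error)."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - x) ** 2 for c, x in zip(codebook[i], v)))


def index_bits(size):
    """Bits needed to index a dictionary of `size` entries."""
    return (size - 1).bit_length()


SPLIT = (3, 3, 4)                    # sizes of the three LSP sub-vectors
SIZES_HIGH = (256, 512, 512)         # dictionaries of the 3 higher-rate modes
SIZES_LOW = (256, 512 // 2, 128)     # 5.15 mode: half dictionary, then 128 entries

bits_high = sum(index_bits(s) for s in SIZES_HIGH)
bits_low = sum(index_bits(s) for s in SIZES_LOW)

# Toy 512-entry dictionary for the second sub-vector (contents invented).
full = [[i / 512 + 0.01 * d for d in range(3)] for i in range(512)]
half = full[::2]                     # one vector out of 2

v = [0.30, 0.31, 0.32]
i_full = nearest(full, v)            # index sent by the higher-rate modes
j_half = nearest(half, v)            # index sent by the 5.15 mode
```

Since `half[j]` is the same vector as `full[2 * j]`, the distances computed for the even entries during the full search can be reused for the 5.15 mode instead of being recomputed.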
  • the transformation in the normalized frequency domain, the calculation of the squared error criterion weights and the MA prediction (for " Moving Average ") of the LSP residue to be quantized are identical for the 4 modes.
  • since the three higher-rate modes use the same dictionaries to quantize the LSPs, they can share, in addition to the same vector quantization module, the inverse transformation (to return from the normalized frequency domain to the cosine domain), as well as the calculation of the LSPs.
  • the closed-loop searches of the adaptive and fixed excitations are done sequentially and require the calculation of the impulse response of the weighted synthesis filter and then of the target signals beforehand.
  • the impulse response of the weighted synthesis filter (A_i(z/γ1) / [A^Q_i(z) A_i(z/γ2)]) is identical for the 3 higher-rate modes (7.4, 6.7, 5.9).
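The impulse response of a filter of the form A(z/γ1) / [A_Q(z) A(z/γ2)] can be computed as in the sketch below. The coefficient values and the weighting factors used in the example are placeholders, not those of the NB-AMR coder, and the function names are hypothetical.

```python
def expand(a, gamma):
    """Coefficients of A(z/gamma): the k-th coefficient is scaled by gamma**k."""
    return [c * gamma ** k for k, c in enumerate(a)]


def iir(num, den, x):
    """Direct-form filter num(z)/den(z) applied to x, with den[0] == 1."""
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y


def weighted_impulse_response(a, a_q, gamma1, gamma2, length):
    """Impulse response of A(z/gamma1) / [A_Q(z) * A(z/gamma2)]."""
    impulse = [1.0] + [0.0] * (length - 1)
    h = iir(expand(a, gamma1), a_q, impulse)   # A(z/gamma1) / A_Q(z)
    return iir([1.0], expand(a, gamma2), h)    # ... / A(z/gamma2)


a = [1.0, -0.9]      # placeholder unquantized LPC filter A(z)
a_q = [1.0, -0.88]   # placeholder quantized filter A_Q(z)

# Computed once, then reused by every mode that shares the same A_Q(z).
h = weighted_impulse_response(a, a_q, 0.9, 0.6, 8)
shared = {mode: h for mode in ("7.4", "6.7", "5.9")}
```

The response depends only on A(z), A_Q(z) and the two weighting factors, so modes with the same quantized filter can share one computation per subframe.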
  • the calculation of the target signal for the adaptive excitation depends on the weighted signal (independent of the mode), the quantized filter A^Q_i(z) (identical for 3 modes) and the past of the subframe (different for each subframe other than the first subframe).
  • the target signal for the fixed excitation is obtained by removing from the previous target signal the contribution of the filtered adaptive excitation of this subframe (which differs from one mode to another, except for the first subframe of the first 3 modes).
  • the search in this dictionary of absolute delays is focused around the delay found in open loop (range of ±5 for the 5.15 mode, ±3 for the other modes).
  • the target signal and the open-loop delay being identical, the result of this closed-loop search is also identical.
  • the other two dictionaries are of differential type and make it possible to code the difference between the current delay and the integer delay T_{i-1} closest to the fractional delay of the preceding subframe.
  • the first differential dictionary, on 5 bits, used for the odd subframes of the 7.4 mode, has a resolution of 1/3 around the integer delay T_{i-1}, in the interval [T_{i-1} - (5 + 2/3), T_{i-1} + (4 + 2/3)].
  • the second differential dictionary, on 4 bits and included in the first one, is used for the odd subframes of the 6.7 and 5.9 modes as well as for the last three subframes of the 5.15 mode.
  • This second dictionary has integer resolution around the integer delay T_{i-1} in the interval [T_{i-1} - 5, T_{i-1} + 4], plus a resolution of 1/3 in the interval [T_{i-1} - (1 + 2/3), T_{i-1} + 2/3].
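The sizes of the two differential dictionaries can be verified by enumeration. The sketch assumes that the interval bounds are mixed numbers, i.e. that the first dictionary spans offsets from -(5 + 2/3) to +(4 + 2/3) and the fractional part of the second spans -(1 + 2/3) to +2/3 relative to T_{i-1}; this reading is the one under which the counts match 5 and 4 bits.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)


def grid(lo, hi, step):
    """All values lo, lo + step, ... that do not exceed hi."""
    values, v = [], lo
    while v <= hi:
        values.append(v)
        v += step
    return values


# Offsets are expressed relative to the integer delay T_{i-1}.
dict5 = set(grid(-(5 + 2 * THIRD), 4 + 2 * THIRD, THIRD))   # 1/3 resolution

integer_part = set(grid(Fraction(-5), Fraction(4), Fraction(1)))
fractional_part = set(grid(-(1 + 2 * THIRD), 2 * THIRD, THIRD))
dict4 = integer_part | fractional_part
```

Under this reading the first dictionary has 32 entries and the second 16, and every entry of the second also belongs to the first, consistent with the statement that the second dictionary is included in the first.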
  • ACELP Interleaved Single-Pulse Permutation
  • the four modes (7.4, 6.7, 5.9, 5.15) use the same division of the 40 samples of a subframe into 5 interleaved tracks of length 8, as shown in Table 2a.
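Table 2a is not reproduced in this extract; the sketch below assumes the usual interleaving, in which track t holds the positions t, t + 5, t + 10, and so on.

```python
SUBFRAME = 40   # samples per subframe
TRACKS = 5      # number of interleaved tracks

# Assumed interleaving: track t contains positions t, t + 5, ..., t + 35.
tracks = [list(range(t, SUBFRAME, TRACKS)) for t in range(TRACKS)]

for t, positions in enumerate(tracks):
    print(f"track {t}: {positions}")
```

Each track has 8 positions, so a pulse position within a track can be coded on 3 bits, and the five tracks together cover every sample of the subframe exactly once.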
  • Table 2b shows, for the 3 modes (7.4, 6.7, 5.9) the dictionary rate, the number of pulses and their distribution in the tracks.
  • the distribution of the 2 pulses of the ACELP 9-bit dictionary of the 5.15 mode is even more constrained.
  • the gains of the adaptive and fixed excitations are quantized on 7 or 6 bits (with an MA prediction applied to the gain of the fixed excitation) by a joint vector quantization minimizing the CELP criterion.
  • Non-identical functional blocks can be accelerated by exploiting those of another mode or a common processing module. Depending on the constraints of the application (in terms of quality and / or complexity), different variants can be used. Some examples are described below. It is also possible to rely on intelligent transcoding techniques between CELP coders.
  • This implementation gives a result identical to that of the non-optimized multi-mode coding. If one wishes to further reduce the complexity of the quantization, one can stop at step 1 and take Y 1 as a quantized vector for the high-speed modes if this vector is considered sufficiently close to Y. This simplification can therefore give a different result from an exhaustive search.
  • As described with reference to figure 1d, it is proposed to build a multi-mode trellis coder from several combinations of functional blocks, each functional block having at least two modes of operation (or rates).
  • This new encoder was constructed from the four NB-AMR encoder rates mentioned above (5.15, 5.90, 6.70, 7.40).
  • In this encoder there are four functional blocks: the LPC block, the LTP block, the fixed excitation block and the gain block. With reference to Table 1 presented above, Table 3a below summarizes, for each of these functional blocks, its number of rates and its rates.
  • Table 3a:
      Functional block     Number of rates   Rates of the functional block
      LPC (LSP)            2                 26 and 23
      LTP delay            3                 26, 24 and 20
      Fixed excitation     4                 68, 56, 44 and 36
      Gains                2                 28 and 24
  • the multi-rate encoder thus obtained has a high granularity in rates, with 32 possible modes given in Table 3b. However, it is indicated that the encoder thus obtained is not interoperable with the aforementioned NB-AMR encoder.
  • In Table 3b, the modes corresponding to three rates of the NB-AMR (5.15, 5.90, 6.70) are shown in bold; the exclusion of the highest bit rate of the LTP functional block eliminates the 7.40 rate.
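The count of 32 modes can be reproduced by enumerating the combinations of Table 3a. The sketch assumes that the figures of Table 3a are bits per 20 ms frame, which is consistent with the NB-AMR rates (for example 26 + 26 + 68 + 28 = 148 bits, i.e. 7.4 kbit/s); the names `BLOCK_BITS`, `modes` and `rate_kbps` are chosen for the example.

```python
from itertools import product

# Figures of Table 3a, assumed to be bits per 20 ms frame.
BLOCK_BITS = {
    "LPC (LSP)": (26, 23),
    "LTP delay": (26, 24, 20),
    "Fixed excitation": (68, 56, 44, 36),
    "Gains": (28, 24),
}
FRAME_MS = 20


def modes(block_bits):
    """Every combination of one rate per functional block."""
    names = list(block_bits)
    return [dict(zip(names, combo)) for combo in product(*block_bits.values())]


def rate_kbps(mode):
    """Total bits per frame divided by the frame length in ms gives kbit/s."""
    return sum(mode.values()) / FRAME_MS


# Exclude the highest rate of the LTP block, as stated in the text.
restricted = dict(BLOCK_BITS)
restricted["LTP delay"] = (24, 20)

all_modes = modes(restricted)
rates = sorted({rate_kbps(m) for m in all_modes})
```

Without the exclusion there would be 2 x 3 x 4 x 2 = 48 combinations; with it, 2 x 2 x 4 x 2 = 32, and the highest reachable total is 146 bits per frame, so 7.4 kbit/s no longer appears.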
  • the present invention makes it possible to provide an effective solution to the problem of the complexity of multiple codings by pooling and accelerating the calculations implemented by the various coders.
  • The coding structures can therefore be represented as functional blocks describing the various operations performed during processing.
  • The functional blocks of the different coders used in multiple coding are strongly related, and the present invention exploits these relationships. They are particularly strong when the different coders correspond to different modes of the same structure.
  • The present invention is flexible in terms of complexity. It is possible to fix the maximum complexity of the multiple coding in advance and to adapt the number of coders explored to that limit.
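
As a rough check on the mode counts quoted for the trellis coder, the sketch below enumerates every combination of block rates from Table 3a. It assumes the table entries are bits per 20 ms frame (the excerpt gives no unit; under that reading the per-block sums reproduce the NB-AMR rates). The names `BLOCK_RATES`, `FRAME_MS`, `enumerate_modes`, `without_highest` and `to_kbps` are illustrative and do not come from the patent.

```python
from itertools import product

# Per-block rates from Table 3a (assumed here to be bits per frame).
BLOCK_RATES = {
    "LPC (LSP)": (26, 23),
    "LTP delay": (26, 24, 20),
    "Fixed excitation": (68, 56, 44, 36),
    "Gains": (28, 24),
}
FRAME_MS = 20  # assumption: 20 ms frames, as in NB-AMR


def enumerate_modes(block_rates):
    """Return (total_bits, per-block choice) for every combination, sorted by total."""
    names = list(block_rates)
    combos = product(*(block_rates[name] for name in names))
    return sorted(
        ((sum(combo), dict(zip(names, combo))) for combo in combos),
        key=lambda mode: mode[0],
    )


def without_highest(block_rates, block):
    """Copy of the rate table with the highest rate of one block removed."""
    trimmed = dict(block_rates)
    top = max(block_rates[block])
    trimmed[block] = tuple(r for r in block_rates[block] if r != top)
    return trimmed


def to_kbps(bits_per_frame, frame_ms=FRAME_MS):
    """Bits per frame divided by frame length in ms gives kbit/s."""
    return bits_per_frame / frame_ms


all_modes = enumerate_modes(BLOCK_RATES)
trellis_modes = enumerate_modes(without_highest(BLOCK_RATES, "LTP delay"))
trellis_totals = {bits for bits, _ in trellis_modes}

print(len(all_modes), len(trellis_modes))
for bits in (103, 118, 134, 148):
    print(bits, to_kbps(bits), bits in trellis_totals)
```

With all three LTP rates the table yields 2 x 3 x 4 x 2 = 48 combinations. Dropping the 26-bit LTP option leaves 2 x 2 x 4 x 2 = 32, which matches the 32 modes quoted above, and the largest remaining total is 146 bits, so the 148-bit (7.40 kbit/s) combination is no longer reachable.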

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Amplifiers (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Separation By Low-Temperature Treatments (AREA)

Claims (27)

  1. Multiple compression coding method in which an input signal is intended to feed at least a first coder and a second coder in parallel, each of the first and second coders comprising a succession of functional blocks for compression coding of the input signal by each of the first and second coders,
    at least some of the functional blocks performing calculations to deliver parameters used for coding the input signal by each coder,
    the first and second coders comprising at least a first and a second functional block, respectively, arranged to perform common operations,
    characterized in that:
    - calculations are performed in a same step and in a single block to deliver the same set of parameters to the first block and to the second block, and
    - if the first and/or the second coder operates at a bit rate different from that of the single block, the set of parameters is adapted to the bit rate of the first and/or second coder for use by the first and/or second block, respectively.
  2. Method according to claim 1, characterized in that the single block consists of one or more blocks of one of the first and second coders.
  3. Method according to claim 1, characterized in that it comprises the following preparatory steps:
    a) identifying the functional blocks forming each coder and one or more functions performed by each block,
    b) identifying, among those functions, the functions that are common from one coder to another, and
    c) executing the common functions once and for all, for at least some of the coders, within at least one same calculation module.
  4. Method according to claim 3, characterized in that, for each function executed in step c), at least one functional block of a coder selected from the plurality of coders is used, and in that the block of the selected coder is arranged to deliver partial results to the other coders for efficient coding in the other coders that satisfies an optimum criterion between complexity and coding quality.
  5. Method according to claim 4, in which the coders can operate at different bit rates, characterized in that the selected coder is the coder with the lowest bit rate, and in that the results obtained after executing the function in step c) with parameters specific to the selected coder are adapted to the bit rates of at least some of the other coders by a focused search of parameters for at least some of the other modes, up to the coder with the highest bit rate.
  6. Method according to claim 4, in which the coders can operate at different bit rates, characterized in that the selected coder is the coder with the highest bit rate, and in that the results obtained after executing the function in step c) with parameters specific to the selected coder are adapted to the bit rates of at least some of the other coders by a focused search of parameters for at least some of the other modes, down to the coder with the lowest bit rate.
  7. Method according to claim 5 in combination with claim 6, characterized in that, for a given bit rate, the functional block of a coder operating at the given bit rate is used as the calculation module, and in that at least some of the parameters specific to that coder are progressively adapted:
    - up to the coder with the highest bit rate, by focused search, and
    - down to the coder with the lowest bit rate, by focused search.
  8. Method according to claim 2, in which the functional blocks of the various coders are arranged in a trellis with a plurality of possible paths through the trellis, characterized in that each path of the trellis is defined by a combination of operating modes of the functional blocks, each functional block feeding a plurality of possible variants of the next functional block.
  9. Method according to claim 8, characterized in that a partial selection module is provided after each coding step performed by one or more functional blocks, capable of selecting the results delivered by one or more of those functional blocks for subsequent coding steps.
  10. Method according to claim 8, in which the functional blocks can operate at different bit rates using parameters specific to those bit rates, characterized in that, for a given functional block, the selected path of the trellis is the one passing through the functional block with the lowest bit rate, and in that the results obtained from the functional block with the lowest bit rate are adapted to the bit rates of at least some of the other functional blocks by a focused search of parameters for at least some of the other functional blocks, up to the functional block with the highest bit rate.
  11. Method according to claim 8, in which the functional blocks can operate at different bit rates using parameters specific to those bit rates, characterized in that, for a given functional block, the selected path of the trellis is the one passing through the functional block with the highest bit rate, and in that the results obtained from the functional block with the highest bit rate are adapted to the bit rates of at least some of the other functional blocks by a focused search of parameters for at least some of the other functional blocks, down to the functional block with the lowest bit rate.
  12. Method according to claim 10 in combination with claim 11, characterized in that, for a given bit rate associated with the parameters of a functional block of a coder, the functional block operating at the given bit rate is used as the calculation module, and at least some of the parameters specific to that functional block are progressively adapted:
    - down to the functional block capable of operating at the lowest bit rate, by focused search, and
    - up to the functional block capable of operating at the highest bit rate, by focused search.
  13. Method according to claim 3, characterized in that the calculation module is a module independent of the coders and is arranged to redistribute the results obtained in step c) to all the coders.
  14. Method according to claim 13 in combination with claim 3, characterized in that the independent module and the block or blocks of at least one of the coders are arranged to exchange with one another results obtained in step c), and in that the calculation module is arranged to perform adaptation transcoding between functional blocks of different coders.
  15. Method according to either of claims 13 and 14, characterized in that the independent module comprises an at least partial coding block and an adaptation transcoding block.
  16. Method according to any one of the preceding claims, in which the coders connected in parallel are arranged to operate in multi-mode coding, characterized in that an a posteriori selection module is provided, capable of selecting one of the coders.
  17. Method according to claim 16, characterized in that a partial selection module is provided, independent of the coders and capable of selecting one or more coders after each coding step performed by one or more functional blocks.
  18. Method according to any one of the preceding claims, in which the coders are of the transform type, characterized in that the calculation module comprises a bit allocation block shared among all the coders, each bit allocation performed for a coder being followed by an adaptation to that coder, in particular as a function of its bit rate.
  19. Method according to claim 18, characterized in that the method further comprises a quantization step whose results are delivered to all the coders.
  20. Method according to claim 19, characterized in that it further comprises steps common to all the coders, including:
    - a time-frequency transform (MDCT),
    - detection of voicing in the input signal,
    - tonality detection,
    - determination of a masking curve,
    - and coding of a spectral envelope.
  21. Method according to claim 18, in which the coders perform sub-band coding (MPEG-1), characterized in that the method further comprises steps common to all the coders, including:
    - application of an analysis filter bank,
    - determination of scale factors,
    - calculation of a spectral transform (FFT),
    - and determination of masking thresholds in accordance with a psychoacoustic model.
  22. Method according to any one of claims 1 to 17, in which the coders are of the analysis-by-synthesis type (CELP), characterized in that the method comprises steps common to all the coders, including at least:
    - preprocessing,
    - linear prediction coefficient analysis,
    - calculation of a weighted input signal,
    - and quantization for at least some of the parameters.
  23. Method according to claim 22 in combination with claim 17, characterized in that the partial selection module is applied after a shared vector quantization step for short-term parameters (LPC).
  24. Method according to claim 22 in combination with claim 17, characterized in that the partial selection module is applied after a shared open-loop long-term parameter (LTP) search step.
  25. Computer program product intended to be stored in a memory of a processing unit, in particular of a computer or a mobile terminal, or on a removable storage medium intended to cooperate with a reader of the processing unit, characterized in that it contains instructions for carrying out the transcoding method according to any one of the preceding claims.
  26. Device for assisting multiple compression coding, in which coding an input signal is intended to feed in parallel a plurality of coders each comprising a succession of functional blocks, for the purpose of compression coding of the signal by each coder, characterized in that it comprises a memory storing the instructions of a computer program product according to claim 25.
  27. Device according to claim 26, characterized in that it further comprises an independent calculation module (MI) for carrying out the method according to any one of claims 13 to 17 and 23, 24.
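
The "lowest bit rate first, then focused search" idea of claims 5 and 10 can be illustrated with a toy parameter search. This is only a sketch under stated assumptions, not the patented implementation: the parameter is a single scalar (think of a pitch lag), higher-rate coders are assumed to use a finer candidate grid, and `full_search`, `focused_search`, `lowest_rate_first`, `score` and the grid labels are all hypothetical names.

```python
def full_search(score, candidates):
    """Exhaustive search: best candidate and number of candidates evaluated."""
    candidates = list(candidates)
    return max(candidates, key=score), len(candidates)


def focused_search(score, candidates, center, radius):
    """Search only the candidates lying within `radius` of a previous result."""
    window = [c for c in candidates if abs(c - center) <= radius]
    if not window:
        return center, 0  # nothing nearby: fall back to the previous result
    return max(window, key=score), len(window)


def lowest_rate_first(score, grids, radius):
    """Full search for the first (lowest-rate) grid, focused searches after it.

    `grids` is a list of (label, candidates) ordered from lowest to highest rate.
    """
    (first_label, first_grid), rest = grids[0], grids[1:]
    best, evaluated = full_search(score, first_grid)
    results = {first_label: best}
    for label, grid in rest:
        best, count = focused_search(score, grid, best, radius)
        results[label] = best
        evaluated += count
    return results, evaluated


def score(lag):
    """Toy criterion peaking at lag 57.3 (arbitrary illustrative value)."""
    return -(lag - 57.3) ** 2


grids = [
    ("low", [20 + i for i in range(124)]),          # integer steps, 20..143
    ("mid", [20 + 0.5 * i for i in range(247)]),    # half steps
    ("high", [20 + 0.25 * i for i in range(493)]),  # quarter steps
]
results, evaluated = lowest_rate_first(score, grids, radius=1.0)
print(results, evaluated)
```

In this toy setting the three searches evaluate 124 + 5 + 9 = 138 candidates instead of the 864 that three independent exhaustive searches would need. The result can differ from an exhaustive search whenever the best fine-grid candidate lies outside the window around the coarse result, which is the complexity versus quality trade-off the claims refer to.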
EP04805538A 2003-12-10 2004-11-24 Optimiertes mehrfach-codierungsverfahren Not-in-force EP1692689B1 (de)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PL04805538T PL1692689T3 (pl) 2003-12-10 2004-11-24 Sposób zoptymalizowanego wielokrotnego kodowania

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR0314490A FR2867649A1 (fr) 2003-12-10 2003-12-10 Procede de codage multiple optimise
PCT/FR2004/003009 WO2005066938A1 (fr) 2003-12-10 2004-11-24 Procede de codage multiple optimise

Publications (2)

Publication Number Publication Date
EP1692689A1 EP1692689A1 (de) 2006-08-23
EP1692689B1 EP1692689B1 (de) 2009-09-09

Family

ID=34746281

Family Applications (1)

Application Number Title Priority Date Filing Date
EP04805538A Not-in-force EP1692689B1 (de) 2003-12-10 2004-11-24 Optimiertes mehrfach-codierungsverfahren

Country Status (12)

Country Link
US (1) US7792679B2 (de)
EP (1) EP1692689B1 (de)
JP (1) JP4879748B2 (de)
KR (1) KR101175651B1 (de)
CN (1) CN1890714B (de)
AT (1) ATE442646T1 (de)
DE (1) DE602004023115D1 (de)
ES (1) ES2333020T3 (de)
FR (1) FR2867649A1 (de)
PL (1) PL1692689T3 (de)
WO (1) WO2005066938A1 (de)
ZA (1) ZA200604623B (de)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7987089B2 (en) * 2006-07-31 2011-07-26 Qualcomm Incorporated Systems and methods for modifying a zero pad region of a windowed frame of an audio signal
EP2084708A4 (de) 2006-10-19 2010-11-24 Lg Electronics Inc Codierungsverfahren und vorrichtung und decodierungsverfahren und vorrichtung
KR101411900B1 (ko) * 2007-05-08 2014-06-26 삼성전자주식회사 오디오 신호의 부호화 및 복호화 방법 및 장치
US9653088B2 (en) * 2007-06-13 2017-05-16 Qualcomm Incorporated Systems, methods, and apparatus for signal encoding using pitch-regularizing and non-pitch-regularizing coding
KR101403340B1 (ko) * 2007-08-02 2014-06-09 삼성전자주식회사 변환 부호화 방법 및 장치
CA2729751C (en) * 2008-07-10 2017-10-24 Voiceage Corporation Device and method for quantizing and inverse quantizing lpc filters in a super-frame
FR2936898A1 (fr) * 2008-10-08 2010-04-09 France Telecom Codage a echantillonnage critique avec codeur predictif
MX2011011399A (es) * 2008-10-17 2012-06-27 Univ Friedrich Alexander Er Aparato para suministrar uno o más parámetros ajustados para un suministro de una representación de señal de mezcla ascendente sobre la base de una representación de señal de mezcla descendete, decodificador de señal de audio, transcodificador de señal de audio, codificador de señal de audio, flujo de bits de audio, método y programa de computación que utiliza información paramétrica relacionada con el objeto.
GB0822537D0 (en) 2008-12-10 2009-01-14 Skype Ltd Regeneration of wideband speech
GB2466201B (en) * 2008-12-10 2012-07-11 Skype Ltd Regeneration of wideband speech
US9947340B2 (en) * 2008-12-10 2018-04-17 Skype Regeneration of wideband speech
KR20110001130A (ko) * 2009-06-29 2011-01-06 삼성전자주식회사 가중 선형 예측 변환을 이용한 오디오 신호 부호화 및 복호화 장치 및 그 방법
KR101747917B1 (ko) 2010-10-18 2017-06-15 삼성전자주식회사 선형 예측 계수를 양자화하기 위한 저복잡도를 가지는 가중치 함수 결정 장치 및 방법
CN102394658A (zh) * 2011-10-16 2012-03-28 西南科技大学 一种面向机械振动信号的复合压缩方法
US9386267B1 (en) * 2012-02-14 2016-07-05 Arris Enterprises, Inc. Cooperative transcoding to multiple streams
JP2014123865A (ja) * 2012-12-21 2014-07-03 Xacti Corp 画像処理装置及び撮像装置
US9549178B2 (en) * 2012-12-26 2017-01-17 Verizon Patent And Licensing Inc. Segmenting and transcoding of video and/or audio data
KR101595397B1 (ko) * 2013-07-26 2016-02-29 경희대학교 산학협력단 서로 다른 다계층 비디오 코덱의 통합 부호화/복호화 방법 및 장치
WO2015012514A1 (ko) * 2013-07-26 2015-01-29 경희대학교 산학협력단 서로 다른 다계층 비디오 코덱의 통합 부호화/복호화 방법 및 장치
CN104572751A (zh) * 2013-10-24 2015-04-29 携程计算机技术(上海)有限公司 呼叫中心录音文件的压缩存储方法及系统
SE538512C2 (sv) * 2014-11-26 2016-08-30 Kelicomp Ab Improved compression and encryption of a file
SE544304C2 (en) * 2015-04-17 2022-03-29 URAEUS Communication Systems AB Improved compression and encryption of a file
US10872598B2 (en) * 2017-02-24 2020-12-22 Baidu Usa Llc Systems and methods for real-time neural text-to-speech
US10896669B2 (en) 2017-05-19 2021-01-19 Baidu Usa Llc Systems and methods for multi-speaker neural text-to-speech
US10872596B2 (en) 2017-10-19 2020-12-22 Baidu Usa Llc Systems and methods for parallel wave generation in end-to-end text-to-speech
CN114144790B (zh) 2020-06-12 2024-07-02 百度时代网络技术(北京)有限公司 具有三维骨架正则化和表示性身体姿势的个性化语音到视频
US11587548B2 (en) * 2020-06-12 2023-02-21 Baidu Usa Llc Text-driven video synthesis with phonetic dictionary

Family Cites Families (54)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0398318A (ja) * 1989-09-11 1991-04-23 Fujitsu Ltd 音声符号化方式
US5734789A (en) * 1992-06-01 1998-03-31 Hughes Electronics Voiced, unvoiced or noise modes in a CELP vocoder
US5495555A (en) * 1992-06-01 1996-02-27 Hughes Aircraft Company High quality low bit rate celp-based speech codec
JP3227291B2 (ja) * 1993-12-16 2001-11-12 シャープ株式会社 データ符号化装置
US5602961A (en) * 1994-05-31 1997-02-11 Alaris, Inc. Method and apparatus for speech compression using multi-mode code excited linear predictive coding
US5956674A (en) * 1995-12-01 1999-09-21 Digital Theater Systems, Inc. Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels
US5987506A (en) * 1996-11-22 1999-11-16 Mangosoft Corporation Remote access and geographically distributed computers in a globally addressable storage environment
JP3134817B2 (ja) * 1997-07-11 2001-02-13 日本電気株式会社 音声符号化復号装置
US6141638A (en) * 1998-05-28 2000-10-31 Motorola, Inc. Method and apparatus for coding an information signal
US6249758B1 (en) * 1998-06-30 2001-06-19 Nortel Networks Limited Apparatus and method for coding speech signals by making use of voice/unvoiced characteristics of the speech signals
US6173257B1 (en) * 1998-08-24 2001-01-09 Conexant Systems, Inc Completed fixed codebook for speech encoder
US6192335B1 (en) * 1998-09-01 2001-02-20 Telefonaktiebolaget Lm Ericsson (Publ) Adaptive combining of multi-mode coding for voiced speech and noise-like signals
JP3579309B2 (ja) * 1998-09-09 2004-10-20 日本電信電話株式会社 画質調整方法及びその方法を使用した映像通信装置及びその方法を記録した記録媒体
SE521225C2 (sv) * 1998-09-16 2003-10-14 Ericsson Telefon Ab L M Förfarande och anordning för CELP-kodning/avkodning
US7272556B1 (en) * 1998-09-23 2007-09-18 Lucent Technologies Inc. Scalable and embedded codec for speech and audio signals
US6691084B2 (en) * 1998-12-21 2004-02-10 Qualcomm Incorporated Multiple mode variable rate speech coding
US6260009B1 (en) * 1999-02-12 2001-07-10 Qualcomm Incorporated CELP-based to CELP-based vocoder packet translation
US6640209B1 (en) * 1999-02-26 2003-10-28 Qualcomm Incorporated Closed-loop multimode mixed-domain linear prediction (MDLP) speech coder
DE19911179C1 (de) * 1999-03-12 2000-11-02 Deutsche Telekom Mobil Verfahren zur Adaption der Betriebsart eines Multi-Mode-Codecs an sich verändernde Funkbedingungen in einem CDMA-Mobilfunknetz
JP2000287213A (ja) * 1999-03-31 2000-10-13 Victor Co Of Japan Ltd 動画像符号化装置
US6532593B1 (en) * 1999-08-17 2003-03-11 General Instrument Corporation Transcoding for consumer set-top storage application
US6574593B1 (en) * 1999-09-22 2003-06-03 Conexant Systems, Inc. Codebook tables for encoding and decoding
US6581032B1 (en) * 1999-09-22 2003-06-17 Conexant Systems, Inc. Bitstream protocol for transmission of encoded voice signals
US6604070B1 (en) * 1999-09-22 2003-08-05 Conexant Systems, Inc. System of encoding and decoding speech signals
AU7486200A (en) * 1999-09-22 2001-04-24 Conexant Systems, Inc. Multimode speech encoder
US6522746B1 (en) * 1999-11-03 2003-02-18 Tellabs Operations, Inc. Synchronization of voice boundaries and their use by echo cancellers in a voice processing system
JP3549788B2 (ja) * 1999-11-05 2004-08-04 三菱電機株式会社 多段符号化方法、多段復号方法、多段符号化装置、多段復号装置およびこれらを用いた情報伝送システム
FR2802329B1 (fr) * 1999-12-08 2003-03-28 France Telecom Procede de traitement d'au moins un flux binaire audio code organise sous la forme de trames
AU2547201A (en) * 2000-01-11 2001-07-24 Matsushita Electric Industrial Co., Ltd. Multi-mode voice encoding device and decoding device
SE519981C2 (sv) * 2000-09-15 2003-05-06 Ericsson Telefon Ab L M Kodning och avkodning av signaler från flera kanaler
SE519976C2 (sv) * 2000-09-15 2003-05-06 Ericsson Telefon Ab L M Kodning och avkodning av signaler från flera kanaler
US6615169B1 (en) * 2000-10-18 2003-09-02 Nokia Corporation High frequency enhancement layer coding in wideband speech codec
JP2002202799A (ja) * 2000-10-30 2002-07-19 Fujitsu Ltd 音声符号変換装置
EP1410513A4 (de) * 2000-12-29 2005-06-29 Infineon Technologies Ag Kanal-codec-prozessor, der für mehrere drahtlose kommunikationsstandards konfigurierbar ist
US6614370B2 (en) * 2001-01-26 2003-09-02 Oded Gottesman Redundant compression techniques for transmitting data over degraded communication links and/or storing data on media subject to degradation
US6658383B2 (en) * 2001-06-26 2003-12-02 Microsoft Corporation Method for coding speech and music signals
EP1292036B1 (de) * 2001-08-23 2012-08-01 Nippon Telegraph And Telephone Corporation Verfahren und Vorrichtung zur Decodierung von digitalen Signalen
JP2003125406A (ja) * 2001-09-25 2003-04-25 Hewlett Packard Co <Hp> 有向性非周期グラフに基づくビデオ符号化のモード選択最適化方法およびシステム
US7095343B2 (en) * 2001-10-09 2006-08-22 Trustees Of Princeton University code compression algorithms and architectures for embedded systems
JP2003195893A (ja) * 2001-12-26 2003-07-09 Toshiba Corp 音声再生装置及び音声再生方法
US6829579B2 (en) * 2002-01-08 2004-12-07 Dilithium Networks, Inc. Transcoding method and system between CELP-based speech codes
US7254533B1 (en) * 2002-10-17 2007-08-07 Dilithium Networks Pty Ltd. Method and apparatus for a thin CELP voice codec
US7133521B2 (en) * 2002-10-25 2006-11-07 Dilithium Networks Pty Ltd. Method and apparatus for DTMF detection and voice mixing in the CELP parameter domain
US7023880B2 (en) * 2002-10-28 2006-04-04 Qualcomm Incorporated Re-formatting variable-rate vocoder frames for inter-system transmissions
JP2004208280A (ja) * 2002-12-09 2004-07-22 Hitachi Ltd 符号化装置および符号化方法
WO2004064041A1 (en) * 2003-01-09 2004-07-29 Dilithium Networks Pty Limited Method and apparatus for improved quality voice transcoding
KR100554164B1 (ko) * 2003-07-11 2006-02-22 학교법인연세대학교 서로 다른 celp 방식의 음성 코덱 간의 상호부호화장치 및 그 방법
US7469209B2 (en) * 2003-08-14 2008-12-23 Dilithium Networks Pty Ltd. Method and apparatus for frame classification and rate determination in voice transcoders for telecommunications
US7305055B1 (en) * 2003-08-18 2007-12-04 Qualcomm Incorporated Search-efficient MIMO trellis decoder
US7433815B2 (en) * 2003-09-10 2008-10-07 Dilithium Networks Pty Ltd. Method and apparatus for voice transcoding between variable rate coders
US7613606B2 (en) * 2003-10-02 2009-11-03 Nokia Corporation Speech codecs
US7170988B2 (en) * 2003-10-27 2007-01-30 Motorola, Inc. Method and apparatus for network communication
FR2867648A1 (fr) * 2003-12-10 2005-09-16 France Telecom Transcodage entre indices de dictionnaires multi-impulsionnels utilises en codage en compression de signaux numeriques
US20050258983A1 (en) * 2004-05-11 2005-11-24 Dilithium Holdings Pty Ltd. (An Australian Corporation) Method and apparatus for voice trans-rating in multi-rate voice coders for telecommunications

Also Published As

Publication number Publication date
WO2005066938A1 (fr) 2005-07-21
US20070150271A1 (en) 2007-06-28
ZA200604623B (en) 2007-11-28
CN1890714A (zh) 2007-01-03
ES2333020T3 (es) 2010-02-16
US7792679B2 (en) 2010-09-07
KR20060131782A (ko) 2006-12-20
DE602004023115D1 (de) 2009-10-22
JP2007515677A (ja) 2007-06-14
EP1692689A1 (de) 2006-08-23
PL1692689T3 (pl) 2010-02-26
CN1890714B (zh) 2010-12-29
KR101175651B1 (ko) 2012-08-21
ATE442646T1 (de) 2009-09-15
JP4879748B2 (ja) 2012-02-22
FR2867649A1 (fr) 2005-09-16

Similar Documents

Publication Publication Date Title
EP1692689B1 (de) Optimiertes mehrfach-codierungsverfahren
RU2326450C2 (ru) Способ и устройство для векторного квантования с надежным предсказанием параметров линейного предсказания в кодировании речи с переменной битовой скоростью
EP2452337B1 (de) Zuweisung von bits bei einer verstärkten codierung/decodierung zur verbesserung einer hierarchischen codierung/decodierung digitaler tonsignale
CA2512179C (fr) Procede de codage et de decodage audio a debit variable
EP2254110B1 (de) Stereosignalkodiergerät, stereosignaldekodiergerät und verfahren dafür
WO2008104663A1 (fr) Codage/decodage perfectionnes de signaux audionumeriques
WO2009055493A1 (en) Scalable speech and audio encoding using combinatorial encoding of mdct spectrum
CA2766864C (fr) Codage/decodage perfectionne de signaux audionumeriques
EP2128858B1 (de) Kodiervorrichtung und kodierverfahren
US7634402B2 (en) Apparatus for coding of variable bitrate wideband speech and audio signals, and a method thereof
EP1836699B1 (de) Verfahren und Vorrichtung zur Ausführung einer optimalizierten Audiokodierung zwischen zwei Langzeitvorhersagemodellen
US6611797B1 (en) Speech coding/decoding method and apparatus
FR2784218A1 (fr) Procede de codage de la parole a bas debit
WO2023165946A1 (fr) Codage et décodage optimisé d'un signal audio utilisant un auto-encodeur à base de réseau de neurones
CA2567162A1 (fr) Procede de quantification d'un codeur de parole a tres bas debit
WO2011144863A1 (fr) Codage avec mise en forme du bruit dans un codeur hierarchique
WO2002029786A1 (fr) Procede et dispositif de codage segmental d'un signal audio
FR2791166A1 (fr) Procedes de codage, de decodage et de transcodage
FR2980620A1 (fr) Traitement d'amelioration de la qualite des signaux audiofrequences decodes
FR2737360A1 (fr) Procedes de codage et de decodage de signaux audiofrequence, codeur et decodeur pour la mise en oeuvre de tels procedes
KR19980036961A (ko) 음성 부호화 및 복호화 장치와 그 방법

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20060526

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LU MC NL PL PT RO SE SI SK TR

DAX Request for extension of the european patent (deleted)
17Q First examination report despatched

Effective date: 20070621

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LU MC NL PL PT RO SE SI SK TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

Free format text: NOT ENGLISH

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REF Corresponds to:

Ref document number: 602004023115

Country of ref document: DE

Date of ref document: 20091022

Kind code of ref document: P

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

NLV1 Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act
REG Reference to a national code

Ref country code: ES

Ref legal event code: FG2A

Ref document number: 2333020

Country of ref document: ES

Kind code of ref document: T3

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

REG Reference to a national code

Ref country code: PL

Ref legal event code: T3

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

REG Reference to a national code

Ref country code: IE

Ref legal event code: FD4D

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100111

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100109

BERE Be: lapsed

Owner name: FRANCE TELECOM

Effective date: 20091130

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20091130

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

26N No opposition filed

Effective date: 20100610

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20091130

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20091210

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20091130

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20091130

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20091130

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20091124

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100310

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090909

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 12

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: GB

Payment date: 20151027

Year of fee payment: 12

Ref country code: DE

Payment date: 20151022

Year of fee payment: 12

Ref country code: IT

Payment date: 20151023

Year of fee payment: 12

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: ES

Payment date: 20151110

Year of fee payment: 12

Ref country code: PL

Payment date: 20151026

Year of fee payment: 12

Ref country code: FR

Payment date: 20151023

Year of fee payment: 12

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 602004023115

Country of ref document: DE

GBPC GB: European patent ceased through non-payment of renewal fee

Effective date: 20161124

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20170731

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: IT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20161124

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20161130

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20170601

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20161124

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: PL

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20161124

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: ES

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20161125

REG Reference to a national code

Ref country code: ES

Ref legal event code: FD2A

Effective date: 20181120