EP1433164B1 - Improved frame erasure concealment for predictive speech coding based on extrapolation of the speech waveform - Google Patents

Improved frame erasure concealment for predictive speech coding based on extrapolation of the speech waveform

Info

Publication number
EP1433164B1
EP1433164B1 (application EP02757200A)
Authority
EP
European Patent Office
Prior art keywords
frame
ppfe1
samples
speech
waveform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
EP02757200A
Other languages
German (de)
English (en)
Other versions
EP1433164A1 (fr)
EP1433164A4 (fr)
Inventor
Juin-Hwey Chen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Broadcom Corp
Original Assignee
Broadcom Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US10/183,448 external-priority patent/US7143032B2/en
Priority claimed from US10/183,608 external-priority patent/US7711563B2/en
Priority claimed from US10/183,451 external-priority patent/US7308406B2/en
Application filed by Broadcom Corp filed Critical Broadcom Corp
Publication of EP1433164A1 publication Critical patent/EP1433164A1/fr
Publication of EP1433164A4 publication Critical patent/EP1433164A4/fr
Application granted granted Critical
Publication of EP1433164B1 publication Critical patent/EP1433164B1/fr
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/005Correction of errors induced by the transmission channel, if related to the coding algorithm

Definitions

  • the present invention relates to digital communications. More particularly, the present invention relates to the enhancement of speech quality when frames of a compressed bit stream representing a speech signal are lost within the context of a digital communications system.
  • In speech coding, sometimes called voice compression, a coder encodes an input speech or audio signal into a digital bit stream for transmission. A decoder decodes the bit stream into an output speech signal. The combination of the coder and the decoder is called a codec.
  • the transmitted bit stream is usually partitioned into frames.
  • frames of transmitted bits are lost, erased, or corrupted. This condition is called frame erasure in wireless communications.
  • the same condition of erased frames can happen in packet networks due to packet loss.
  • the decoder cannot perform normal decoding operations since there are no bits to decode in the lost frame.
  • the decoder needs to perform frame erasure concealment (FEC) operations to try to conceal the quality-degrading effects of the frame erasure.
  • One of the earliest FEC techniques is waveform substitution based on pattern matching, as proposed by Goodman, et al. in "Waveform Substitution Techniques for Recovering Missing Speech Segments in Packet Voice Communications", IEEE Transaction on Acoustics, Speech and Signal Processing, December 1986, pp. 1440 - 1448 .
  • This scheme was applied to a Pulse Code Modulation (PCM) speech codec that performs sample-by-sample instantaneous quantization of the speech waveform directly.
  • This FEC scheme uses a piece of decoded speech waveform immediately before the lost frame as the template, and slides this template back in time to find a suitable piece of decoded speech waveform that maximizes some sort of waveform similarity measure (or minimizes a waveform difference measure).
  • Goodman's FEC scheme then uses the section of waveform immediately following a best-matching waveform segment as the substitute waveform for the lost frame. To eliminate discontinuities at frame boundaries, the scheme also uses a raised cosine window to perform an overlap-add technique between the correctly decoded waveform and the substitute waveform. This overlap-add technique increases the coding delay. The delay occurs because at the end of each frame, there are many speech samples that need to be overlap-added to obtain the final values, and thus cannot be played out until the next frame of speech is decoded.
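  • For illustration only, the crossfade at the frame boundary can be sketched in C as follows. This is a hedged sketch of the general overlap-add idea, not Goodman's exact procedure; the buffer names and the window length olen are assumptions.

    #include <math.h>
    #include <stddef.h>

    /* Crossfade the last olen correctly decoded samples with the first
     * olen samples of the substitute waveform using a raised-cosine
     * window; the result overwrites the substitute waveform in place. */
    void raised_cosine_overlap_add(const double *decoded_tail,
                                   double *substitute, size_t olen)
    {
        const double pi = 3.14159265358979323846;
        for (size_t n = 0; n < olen; n++) {
            /* wd ramps down for the old waveform; 1 - wd ramps up. */
            double wd = 0.5 * (1.0 + cos(pi * (double)(n + 1) / (double)(olen + 1)));
            substitute[n] = wd * decoded_tail[n] + (1.0 - wd) * substitute[n];
        }
    }
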
  • the most popular type of speech codec is based on predictive coding.
  • the first publicized FEC scheme for a predictive codec is a "bad frame masking" scheme in the original TIA IS-54 VSELP standard for North American digital cellular radio (rescinded in September 1996).
  • the scheme repeats the linear prediction parameters of the last frame.
  • This scheme derives the speech energy parameter for the current frame by either repeating or attenuating the speech energy parameter of last frame, depending on how many consecutive bad frames have been counted.
  • For the excitation signal (the quantized prediction residual), this scheme does not perform any special operation. It merely decodes the excitation bits, even though they might contain a large number of bit errors.
  • the first FEC scheme for a predictive codec that performs waveform substitution in the excitation domain is probably the FEC system developed by Chen for the ITU-T Recommendation G.728 Low-Delay Code Excited Linear Predictor (CELP) codec, as described in United States Patent No. 5,615,298 issued to Chen, titled "Excitation Signal Synthesis During Frame Erasure or Packet Loss."
  • EP 0 747 882 A2 describes a method for use in a speech decoder which fails to receive a portion of each of first and second consecutive frames of compressed speech information, wherein a pitch delay associated with the first of the consecutive frames is incremented and the incremented value is used as a pitch delay for the second of the consecutive frames.
  • an exemplary FEC technique includes a method of synthesizing a number of corrupted frames output from a decoder including one or more predictive filters.
  • the corrupted frames are representative of one segment of a decoded signal (sq(n)) output from the decoder.
  • the method comprises determining a first preliminary time lag (ppfe1) based upon examining a predetermined number (K) of samples of another segment of the decoded signal and determining a scaling factor (ptfe1) associated with the examined number (K) of samples when the first preliminary time lag (ppfe1) is determined.
  • the method also comprises extrapolating one or more replacement frames based upon the first preliminary time lag (ppfe1) and the scaling factor (ptfe1).
  • FIG. 1 is a block diagram illustration of a conventional predictive decoder
  • FIG. 2 is a block diagram illustration of an exemplary decoder constructed and arranged in accordance with the present invention
  • FIG. 3(a) is a plot of an exemplary unnormalized waveform attenuation window functioning in accordance with the present invention
  • FIG. 3(b) is a plot of an exemplary normalized waveform attenuation window functioning in accordance with the present invention.
  • FIG. 4 is a block diagram of an exemplary computer system on which the present invention can be practiced.
  • the present invention is particularly useful in the environment of the decoder of a predictive speech codec to conceal the quality-degrading effects of frame erasure or packet loss.
  • FIG. 1 illustrates such an environment.
  • the general principles of the invention can be used in any linear predictive codec, although the preferred embodiment described later is particularly well suited for a specific type of predictive decoder.
  • the invention is an FEC technique designed for predictive coding of speech.
  • One characteristic that distinguishes it from the techniques mentioned above, is that it performs waveform substitution in the speech domain rather than the excitation domain. It also performs special operations to update the internal states, or memories, of predictors and filters inside the predictive decoder to ensure maximally smooth reproduction of speech waveform when the next good frame is received.
  • the present invention also avoids the additional delay associated with the overlap-add operation in Goodman's approach and in ITU-T G.711 Appendix I. This is achieved by performing overlap-add between extrapolated speech waveform and the ringing, or zero-input response of the synthesis filter. Other features include a special algorithm to minimize buzzing sounds during waveform extrapolation, and an efficient method to implement a linearly decreasing waveform envelope during extended frame erasure. Finally, the associated memories within the log-gain predictor are updated.
  • the present invention is not restricted to a particular speech codec. Instead, it is generally applicable to predictive speech codecs, including, but not limited to, Adaptive Predictive Coding (APC), Multi-Pulse Linear Predictive Coding (MPLPC), CELP, and Noise Feedback Coding (NFC).
  • FIG. 1 is a block diagram illustration of a conventional predictive decoder 100.
  • the decoder 100 shown in FIG. 1 can be used to describe the decoders of APC, MPLPC, CELP, and NFC speech codecs.
  • the more sophisticated versions of the codecs associated with predictive decoders typically use a short-term predictor to exploit the redundancy among adjacent speech samples and a long-term predictor to exploit the redundancy between distant samples due to pitch periodicity of, for example, voiced speech.
  • the main information transmitted by these codecs is the quantized version of the prediction residual signal after short-term and long-term prediction.
  • This quantized residual signal is often called the excitation signal, because it is used in the decoder to excite the long-term and short-term synthesis filter to produce the output decoded speech.
  • In addition to the excitation signal, several other speech parameters are also transmitted as side information frame-by-frame or subframe-by-subframe.
  • An exemplary range of lengths for each frame (called frame size) is 5 ms to 40 ms, with 10 ms and 20 ms as the two most popular frame sizes for speech codecs.
  • Each frame usually contains a few equal-length subframes.
  • the side information of these predictive codecs typically includes spectral envelope information (in the form of the short-term predictor parameters), pitch period, pitch predictor taps (both long-term predictor parameters), and excitation gain.
  • the conventional decoder 100 includes a bit de-multiplexer 105.
  • the de-multiplexer 105 separates the bits in each received frame of bits into codes for the excitation signal and codes for short-term predictor, long-term predictor, and the excitation gain.
  • the short-term predictor parameters are usually transmitted once a frame, typically as linear predictive coding (LPC) coefficients quantized in the line-spectrum pair (LSP) or line-spectrum frequency (LSF) domain. LSPI represents the transmitted quantizer codebook index representing the LSP parameters in each frame.
  • a short-term predictive parameter decoder 110 decodes LSPI into an LSP parameter set and then converts the LSP parameters to the coefficients for the short-term predictor. These short-term predictor coefficients are then used to control the coefficient update of a short-term predictor 120.
  • Pitch period is defined as the time period at which a voiced speech waveform appears to be repeating itself periodically at a given moment. It is usually measured in terms of a number of samples, is transmitted once a subframe, and is used as the bulk delay in long-term predictors. Pitch taps are the coefficients of the long-term predictor.
  • the bit de-multiplexer 105 also separates out the pitch period index (PPI) and the pitch predictor tap index (PPTI) from the received bit stream.
  • a long-term predictive parameter decoder 130 decodes PPI into the pitch period, and decodes the PPTI into the pitch predictor taps. The decoded pitch period and pitch predictor taps are then used to control the parameter update of a generalized long-term predictor 140.
  • the long-term predictor 140 is just a finite impulse response (FIR) filter, typically first order or third order, with a bulk delay equal to the pitch period.
  • In some codecs, the long-term predictor 140 has been generalized to an adaptive codebook, with the only difference being that when the pitch period is smaller than the subframe size, some periodic repetition operations are performed.
  • the generalized long-term predictor 140 can represent either a straightforward FIR filter, or an adaptive codebook, thus covering most of the predictive speech codecs presently in use.
  • the bit de-multiplexer 105 also separates out a gain index GI and an excitation index CI from the input bit stream.
  • An excitation decoder 150 decodes the CI into an unscaled excitation signal, and also decodes the GI into the excitation gain. Then, it uses the excitation gain to scale the unscaled excitation signal to derive a scaled excitation signal uq(n), which can be considered a quantized version of the long-term prediction residual.
  • An adder 160 combines the output of the generalized long-term predictor 140 with the scaled excitation signal uq(n) to obtain a quantized version of a short-term prediction residual signal dq(n).
  • An adder 170 combines the output of the short-term predictor 120 with dq(n) to obtain an output decoded speech signal sq(n).
  • a feedback loop is formed by the generalized long-term predictor 140 and the adder 160 and can be regarded as a single filter, called a long-term synthesis filter 180.
  • another feedback loop is formed by the short term predictor 120 and the adder 170.
  • This other feedback loop can be considered a single filter called a short-term synthesis filter 190.
  • the long-term synthesis filter 180 and the short-term synthesis filter 190 combine to form a synthesis filter module 195.
  • the conventional predictive decoder 100 depicted in FIG. 1 decodes the parameters of the short-term predictor 120 and the long-term predictor 140, the excitation gain, and the unscaled excitation signal. It then scales the unscaled excitation signal with the excitation gain, and passes the resulting scaled excitation signal uq(n) through the long-term synthesis filter 180 and the short-term synthesis filter 190 to derive the output decoded speech signal sq(n).
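  • For illustration, the synthesis chain just described can be sketched in C as below. This is a minimal sketch, assuming a single-tap long-term (pitch) predictor and an all-pole short-term filter; the buffer layouts and sign conventions are assumptions, not the exact structure of any particular codec.

    /* One frame of nf samples: uq(n) -> long-term synthesis -> dq(n)
     * -> short-term synthesis -> sq(n). dq_hist[0] holds dq(n-1) and
     * must span at least pp samples; sq_hist[0] holds sq(n-1) and must
     * span at least M samples. */
    void synth_frame(const double *uq, double *sq, int nf,
                     double *dq_hist, int pp, double ptap,
                     double *sq_hist, const double *a, int M)
    {
        for (int n = 0; n < nf; n++) {
            /* Long-term synthesis: dq(n) = uq(n) + ptap * dq(n - pp). */
            double dq = uq[n] + ptap * dq_hist[pp - 1];
            for (int k = pp - 1; k > 0; k--)
                dq_hist[k] = dq_hist[k - 1];
            dq_hist[0] = dq;

            /* Short-term synthesis: add back the short-term prediction,
             * the sum over a[k] * sq(n - 1 - k). */
            double s = dq;
            for (int k = 0; k < M; k++)
                s += a[k] * sq_hist[k];
            for (int k = M - 1; k > 0; k--)
                sq_hist[k] = sq_hist[k - 1];
            sq_hist[0] = s;
            sq[n] = s;
        }
    }
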
  • When a frame of input bits is erased due to fading in a wireless transmission or due to packet loss in packet networks, the decoder 100 in FIG. 1 unfortunately loses the indices LSPI, PPI, PPTI, GI, and CI needed to decode the speech waveform in the current frame.
  • the decoded speech waveform immediately before the current frame is stored and analyzed.
  • a waveform-matching search, similar to the approach of Goodman, is performed, and the time lag and scaling factor for repeating the previously decoded speech waveform in the current frame are identified.
  • the time lag and scaling factor are sometimes modified as follows. If the analysis indicates that the stored previous waveform is not likely to be a segment of highly periodic voiced speech, and if the time lag for waveform repetition is smaller than a predetermined threshold, another search is performed for a suitable time lag greater than the predetermined threshold. The scaling factor is also updated accordingly.
  • the present invention copies the speech waveform one time lag earlier to fill the current frame, thus creating an extrapolated waveform.
  • the extrapolated waveform is then scaled with the scaling factor.
  • the present invention also calculates a number of samples of the ringing, or zero-input response, output from the synthesis filter module 195 from the beginning of the current frame. Due to the smoothing effect of the short-term synthesis filter 190, such a ringing signal will seem to flow smoothly from the decoded speech waveform at the end of the last frame.
  • the present invention then overlap-adds this ringing signal and the extrapolated speech waveform with a suitable overlap-add window in order to smoothly merge these two pieces of waveform. This technique will smooth out waveform discontinuity at the beginning of the current frame. At the same time, it avoids the additional delays created by G.711 Appendix I or the approach of Goodman.
  • the extrapolated speech signal is attenuated toward zero. Otherwise, it will create a tonal or buzzing sound.
  • the waveform envelope is attenuated linearly toward zero if the length of the frame erasure exceeds a certain threshold. The present invention then uses a memory-efficient method to implement this linear attenuation toward zero.
  • the present invention After the waveform extrapolation is performed in the erased frame, the present invention properly updates all the internal memory states of the filters within the speech decoder. If updating is not performed, there would be a large discontinuity and an audible glitch at the beginning of the next good frame. In updating the filter memory after a frame erasure, the present invention works backward from the output speech waveform. The invention sets the filter memory contents to be what they would have been at the end of the current frame, if the filtering operations of the speech decoder were done normally. That is, the filtering operations are performed with a special excitation such that the resulting synthesized output speech waveform is exactly the same as the extrapolated waveform calculated above.
  • the memory of the short-term synthesis filter 190 is simply the last M samples of the extrapolated speech signal for the current frame with the order reversed. This is because the short-term synthesis filter 190 in the conventional decoder 100 is an all-pole filter. The filter memory is simply the previous filter output signal samples in reverse order.
  • the present invention performs short-term prediction error filtering of the extrapolated speech signal of the current frame, with initial memory of the short-term predictor 120 set to the last M samples (in reverse order) of the output speech signal in the last frame.
  • After the first received good frame following a frame erasure, the present invention also attempts to correct the filter memories within the long-term synthesis filter 180 and the short-term synthesis filter 190 if certain conditions are met.
  • the present invention first performs linear interpolation between the pitch period of the last good frame before the erasure and the pitch period of the first good frame after the erasure. Such linear interpolation of the pitch period is performed for each of the erased frames. Based on this linearly interpolated pitch contour, the present invention then re-extrapolates the long-term synthesis filter memory and re-calculates the short-term synthesis filter memory at the end of the last erased frame (i.e., before decoding the first good frame after the erasure).
  • FIG. 2 is a block diagram illustration of an exemplary embodiment of the present invention.
  • the decoder can be, for example, the decoder 100 shown in FIG. 1.
  • Also included in the embodiment of FIG. 2 is an input frame erasure flag switch 200. If the input frame erasure flag 200 indicates that the current frame received is a good frame, the decoder 100 performs the normal decoding operations as described above. If, however, the frame is the first good frame after a frame erasure, the long-term and short-term synthesis filter memories can be corrected before starting the normal decoding. When a good frame is received, the frame erasure flag switch 200 is in the upper position, and the decoded speech waveform sq ( n ) is used as the output of the system.
  • the current frame of decoded speech sq ( n ) is also passed to a module 201, which stores the previously decoded speech waveform samples in a buffer.
  • the current frame of decoded speech sq(n) is used to update that buffer.
  • the remaining modules in FIG. 2 are inactive during a good frame.
  • When a frame erasure occurs, the operation of the decoder 100 is halted and the frame erasure flag switch 200 is changed to the lower position.
  • the remaining modules of FIG. 2 then perform frame erasure concealment operations to produce the output speech waveform sq' ( n ) for the current frame, and also update the filter memories of the decoder 100 to prepare the decoder 100 for the normal decoding operations of the next received good frame.
  • the remaining modules of FIG. 2 work in the following way.
  • a module 201 calculates L samples of "ringing," or zero-input response, of the synthesis filter in FIG. 1.
  • a module 202 analyzes the previously decoded speech waveform samples stored in the module 201 to determine a first time lag ppfe1 and an associated scaling factor ptfe1 for waveform extrapolation in the current frame. This can be done in a number of ways; one way, for example, uses the approaches outlined by Goodman et al. and discussed above. If there are multiple consecutive frames erased, the module 202 is active only at the first erased frame. From the second erased frame on, the time lag and scaling factor found in the first erased frame are used.
  • the present invention will typically just search for a "pitch period" in the general sense, as in a pitch-prediction-based speech codec. If the decoder 100 has a decoded pitch period of the last frame, and if it is deemed reliable, then the embodiment of FIG. 2 will simply search around the neighborhood of this pitch period pp to find a suitable time lag. If the decoder 100 does not provide a decoded pitch period, or if this pitch period is deemed unreliable, then the embodiment of FIG. 2 will perform a full-scale pitch estimation to get the desired time lag. In FIG. 2, it is assumed that such a decoded pitch period pp is indeed available and reliable. In this case, the embodiment of FIG. 2 operates as follows.
  • Let pplast denote the pitch period of the last good frame before the frame erasure. If pplast is smaller than 10 ms (80 samples and 160 samples for 8 kHz and 16 kHz sampling rates, respectively), the module 202 uses it as the analysis window size K. If pplast is greater than 10 ms, the module 202 uses 10 ms as the analysis window size K.
  • the module 202 determines the pitch search range as follows. It subtracts 0.5 ms (4 samples and 8 samples for 8 kHz and 16 kHz sampling, respectively) from pplast, compares the result with the minimum allowed pitch period in the codec, and chooses the larger of the two as the lower bound of the search range, lb. It then adds 0.5 ms to pplast, compares the result with the maximum allowed pitch period in the codec, and chooses the smaller of the two as the upper bound of the search range, ub.
  • Nf is the number of samples in a frame.
  • the time lag j that maximizes nc(j) is also the time lag within the search range that maximizes the pitch prediction gain for a single-tap pitch predictor. It is called the optimal time lag ppfe1, which stands for pitch period for frame erasure, 1st version. In the extremely rare, degenerate case where no c(j) in the search range is positive, ppfe1 is set to lb.
  • Once ppfe1 is determined, the associated scaling factor ptfe1 is calculated as ptfe1 = sign(c(ppfe1)) × Σ_{n=N−K+1}^{N} |sq(n)| / Σ_{n=N−K+1}^{N} |sq(n − ppfe1)| (the expression given in claim 1).
  • Such a calculated scaling factor ptfe 1 is then clipped to 1 if it is greater than 1 and clipped to - 1 if it is less than -1. Also, in the degenerate case when the denominator on the right-hand side of the above equation is zero, ptfe 1 is set to 0.
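  • The search and scaling-factor computation above can be sketched in C as follows. The 0-based indexing, with sq[N-1] as the newest stored sample and a buffer long enough that sq[n - j] exists for all lags in [lb, ub], is an assumption.

    #include <math.h>

    /* Returns ppfe1 and writes the clipped scaling factor to *ptfe1. */
    int find_ppfe1(const double *sq, int N, int K, int lb, int ub,
                   double *ptfe1)
    {
        int best_j = lb;                  /* degenerate fallback: lb */
        double best_nc = -1.0, best_c = 0.0;

        for (int j = lb; j <= ub; j++) {
            double c = 0.0, e = 0.0;
            for (int n = N - K; n < N; n++) {   /* n = N-K+1 .. N, 0-based */
                c += sq[n] * sq[n - j];
                e += sq[n - j] * sq[n - j];
            }
            if (c > 0.0 && e > 0.0) {
                double nc = c * c / e;    /* single-tap prediction gain */
                if (nc > best_nc) { best_nc = nc; best_j = j; best_c = c; }
            }
        }

        double num = 0.0, den = 0.0;      /* sums of sample magnitudes */
        for (int n = N - K; n < N; n++) {
            num += fabs(sq[n]);
            den += fabs(sq[n - best_j]);
        }
        double s = (den > 0.0) ? num / den : 0.0;  /* 0 if denominator is 0 */
        if (best_c < 0.0) s = -s;                  /* sign of c(ppfe1) */
        if (s > 1.0) s = 1.0;                      /* clip to [-1, 1] */
        if (s < -1.0) s = -1.0;
        *ptfe1 = s;
        return best_j;
    }
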
  • Although the module 202 performs the above calculation only for the first erased frame when there are multiple consecutive erased frames, it also attempts to modify the first time lag ppfe1 at the second consecutively erased frame, depending on the pitch period contour of the good frames immediately before the erasure. Starting from the last good frame before the erasure, and going backward frame-by-frame for up to 4 frames, the module 202 examines the transmitted pitch period until there is a change in the transmitted pitch period. If no change in pitch period is found during these 4 good frames before the erasure, then the first time lag ppfe1 found above at the first erased frame is also used for the second consecutively erased frame. Otherwise, the first pitch change identified in the backward search is examined to see if the change is relatively small.
  • the amount of pitch period change per frame is calculated and is rounded to the nearest integer.
  • the module 202 then adds this rounded pitch period change per frame, whether positive or negative, to the ppfe 1 found above at the first erased frame.
  • the resulting value is used as the first time lag ppfe 1 for the second and subsequent consecutively erased frames. This modification of the first time lag after the second erased frame improves the speech quality on average.
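  • A sketch of this adjustment is given below. pp_hist[0] is the pitch period of the last good frame and pp_hist[i] that of the frame i frames earlier (at least 5 entries assumed); the threshold deciding that a change is "relatively small" (here one sample per frame) is also an assumption.

    #include <math.h>

    int adjust_lag(int ppfe1, const int *pp_hist)
    {
        for (int i = 1; i <= 4; i++) {
            if (pp_hist[i] != pp_hist[0]) {
                /* pitch change per frame across the first change found */
                double drift = (double)(pp_hist[0] - pp_hist[i]) / (double)i;
                if (fabs(drift) > 1.0)
                    return ppfe1;      /* change too large: keep ppfe1 */
                return ppfe1 + (int)floor(drift + 0.5); /* rounded drift */
            }
        }
        return ppfe1;   /* no pitch change in the last 4 good frames */
    }
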
  • the present invention uses a module 203 to distinguish between highly periodic voiced speech segments and other types of speech segments. If the module 203 determines that the decoded speech is in a highly periodic voiced speech region, it sets the periodic waveform extrapolation flag pwef to 1; otherwise, pwef is set to 0.
  • If pwef is set to 0, a module 204 will find a second, larger time lag ppfe2, greater than 10 ms, to reduce or eliminate the buzzing sound.
  • Using ppfe1 as its input, the module 203 performs further analysis of the previously decoded speech sq(n) to determine the periodic waveform extrapolation flag pwef. Again, this can be done in many possible ways.
  • One exemplary method of determining the periodic waveform flag pwef is described below.
  • the module 203 calculates three signal features: signal gain relative to the long-term average of the input signal level, pitch prediction gain, and the first normalized autocorrelation coefficient. It then calculates a weighted sum of these three signal features, and compares the resulting figure of merit with a pre-determined threshold. If the threshold is exceeded, pwef is set to 1; otherwise it is set to 0. The module 203 then performs special handling for extreme cases.
  • Let lvl be the long-term average logarithmic gain of the active portion of the speech signal (that is, not counting the silence).
  • a separate estimator for input signal level can be employed to calculate lvl.
  • An exemplary signal level estimator is disclosed in U.S. Provisional Application No. 60/312,794, filed August 17, 2001 , entitled "Bit Error Concealment Methods for Speech Coding," and U.S. Provisional Application No.
  • The figure of merit is computed as fom = nlg + 1.25 × ppg + 16 × ρ1, where nlg is the log-gain of the current signal relative to the long-term average level lvl, ppg is the pitch prediction gain, and ρ1 is the first normalized autocorrelation coefficient. If fom > 16, pwef is set to 1; otherwise it is set to 0. Afterward, the flag pwef may be overwritten in certain extreme cases.
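  • A minimal sketch of this classification, assuming the three features nlg, ppg and ρ1 (here rho1) have already been computed:

    /* Returns the periodic waveform extrapolation flag pwef. */
    int classify_pwef(double nlg, double ppg, double rho1)
    {
        double fom = nlg + 1.25 * ppg + 16.0 * rho1;  /* figure of merit */
        return (fom > 16.0) ? 1 : 0;      /* 1 = highly periodic voiced */
    }
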
  • When pwef is 0, the present invention searches for a second time lag ppfe2 ≥ T0, where T0 corresponds to the 10 ms threshold mentioned above.
  • Two waveforms, one extrapolated using the first time lag ppfe 1, and the other extrapolated using the second time lag ppfe 2 are added together and properly scaled, and the resulting waveform is used as the output speech of the current frame.
  • the present invention searches in the neighborhood of the first integer multiple of ppfe 1 that is no smaller than T 0 .
  • If the flag pwef should have been 1 but was misclassified as 0, there is a good chance that an integer multiple of the true pitch period will be chosen as the second time lag ppfe2 for periodic waveform extrapolation.
  • the module 204 sets m1, the lower bound of the time lag search range, to m × ppfe1 − 3 or T0, whichever is larger, where m × ppfe1 is the first integer multiple of ppfe1 that is no smaller than T0.
  • the corresponding scaling factor is set to 1.
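  • The second-lag selection can be sketched as follows; the symmetric upper bound m × ppfe1 + 3 and the reuse of the correlation criterion from the first-lag search are assumptions based on the description above.

    /* sq[0..N-1]: previously decoded speech, newest sample at sq[N-1]. */
    int find_ppfe2(const double *sq, int N, int K, int ppfe1, int T0)
    {
        int m = (T0 + ppfe1 - 1) / ppfe1;   /* first multiple >= T0 */
        int m1 = m * ppfe1 - 3;
        if (m1 < T0) m1 = T0;               /* lower bound of search range */
        int m2 = m * ppfe1 + 3;             /* assumed upper bound */

        int best_j = m1;
        double best_c = -1e300;
        for (int j = m1; j <= m2; j++) {
            double c = 0.0;
            for (int n = N - K; n < N; n++)
                c += sq[n] * sq[n - j];
            if (c > best_c) { best_c = c; best_j = j; }
        }
        return best_j;   /* the corresponding scaling factor is set to 1 */
    }
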
  • the module 205 extrapolates speech waveform for the current erased frame based on the first time lag ppfe 1. It first extrapolates the first L samples of speech in the current frame using the first time lag ppfe 1 and the corresponding scaling factor ptfe 1. A suitable value of L is 8 samples.
  • the sign "←" means the quantity on its right-hand side overwrites the variable values on its left-hand side.
  • the window function w u ( n ) represents the overlap-add window that is ramping up, while w d ( n ) represents the overlap-add window that is ramping down.
  • Many different overlap-add windows can be used. The raised cosine window mentioned in the paper by Goodman et al. is one example; simpler triangular windows can also be used.
  • the resulting waveform is passed to the module 210.
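  • A sketch of this extrapolation and ringing merge using triangular windows; xq is assumed to point at the first sample of the current frame inside the decoded-speech buffer, so negative indices reach previously decoded (or already extrapolated) samples.

    void extrapolate_frame(double *xq, int nf, int ppfe1, double ptfe1,
                           const double *ring, int L)
    {
        for (int n = 0; n < nf; n++)        /* periodic extrapolation */
            xq[n] = ptfe1 * xq[n - ppfe1];

        for (int n = 0; n < L; n++) {       /* merge with the ringing */
            double wu = (double)(n + 1) / (double)(L + 1);  /* ramps up */
            double wd = 1.0 - wu;                           /* ramps down */
            xq[n] = wu * xq[n] + wd * ring[n];
        }
    }
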
  • the module 210 starts waveform attenuation at the instant when the frame erasure has lasted for 20 ms. From there, the envelope of the extrapolated waveform is attenuated linearly toward zero and the waveform magnitude reaches zero at 60 ms into the erasure of consecutive frames. After 60 ms, the output is completely muted. See FIG. 3 (a) for a waveform attenuation window that implements this attenuation strategy.
  • the preferred embodiment of the present invention is used with a noise feedback codec that has a frame size of 5 ms.
  • the time interval between each adjacent pair of vertical lines in FIG. 3(a) represents a frame.
  • the module 210 applies the waveform attenuation window frame-by-frame without any additional buffering.
  • If the frame erasure lasts longer than 20 ms, however, the module 210 cannot directly apply the corresponding section of the window for that frame in FIG. 3(a); doing so would cause a waveform discontinuity at the frame boundary, because the corresponding section of the attenuation window starts from a value less than unity (7/8, 6/8, 5/8, etc.). This would produce a sudden decrease of waveform sample value at the beginning of the frame, and thus an audible waveform discontinuity.
  • To avoid this, the window section for each frame is renormalized so that it starts at unity. Such a normalized attenuation window for each frame is shown in FIG. 3(b).
  • Rather than storing every sample of the normalized attenuation window in FIG. 3(b), the present invention simply stores the decrement between adjacent samples of the window for each of the eight window sections for the fifth to twelfth frames. This decrement is the amount of total decline of the window function in each frame (1/8 for the fifth erased frame, 1/7 for the sixth erased frame, and so on), divided by Nf, the number of speech samples in a frame.
  • If the frame erasure has lasted 20 ms or less, the module 210 does not need to perform any waveform attenuation operation. If the frame erasure has lasted more than 20 ms, then the module 210 applies the appropriate section of the normalized waveform attenuation window in FIG. 3(b), depending on how many consecutive frames have been erased so far. For example, if the current frame is the sixth consecutive frame that is erased, then the module 210 applies the section of the window from 25 ms to 30 ms (with window function going from 1 to 6/7). Since the normalized waveform attenuation window for each frame always starts with unity, the windowing operation will not cause any waveform discontinuity at the beginning of the frame.
  • the normalized window function is not stored; instead, it is calculated on the fly.
  • the module 210 multiplies the first waveform sample of the current frame by 1, and then reduces the window function value by the decrement value calculated and stored beforehand, as mentioned above. It then multiplies the second waveform sample by the resulting decremented window function value.
  • the window function value is again reduced by the decrement value, and the result is used to scale the third waveform sample of the frame. This process is repeated for all samples of the extrapolated waveform in the current frame.
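  • The on-the-fly windowing can be sketched as follows, assuming 5 ms frames and counting j as the number of consecutive erased frames, including the current one.

    void attenuate_frame(double *x, int nf, int j)
    {
        if (j <= 4) return;              /* first 20 ms: no attenuation */
        if (j > 12) {                    /* beyond 60 ms: mute completely */
            for (int n = 0; n < nf; n++) x[n] = 0.0;
            return;
        }
        /* Normalized window for the j-th erased frame declines from 1 by
         * a total of 1/(13 - j); only this per-sample decrement is kept. */
        double dec = 1.0 / ((double)(13 - j) * (double)nf);
        double w = 1.0;                  /* each section restarts at unity */
        for (int n = 0; n < nf; n++) {
            x[n] *= w;                   /* first sample multiplied by 1 */
            w -= dec;
        }
    }
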
  • the output of the module 210 is passed through the switch 200 and becomes the final output speech for the current erased frame.
  • the current frame of sq'(n) is passed to the module 201 to update the current frame portion of the sq ( n ) speech buffer stored there.
  • This signal is also passed to a module 211 to update the memory, or internal states, of the filters inside the decoder 100.
  • a filter memory update is performed in order to ensure that the filter memory is consistent with the extrapolated speech waveform in the current erased frame. This is necessary for a smooth transition of speech waveform at the beginning of the next frame, if the next frame turns out to be a good frame. If the filter memory were frozen without such proper update, then generally there would be audible glitch or disturbance at the beginning of the next good frame.
  • the updated memory is simply the last M samples of the extrapolated speech signal for the current erased frame, but with the order reversed.
  • Let stsm(k) be the k-th memory value of the short-term synthesis filter, or the value stored in the delay line corresponding to the k-th short-term predictor coefficient ak.
  • the module 211 extrapolates the long-term synthesis filter memory based on the first time lag ppfe 1, using procedures similar to speech waveform extrapolation performed at the module 205.
  • If no predictive coding is used for the side information, the operations of the module 211 are completed. If, on the other hand, predictive coding is used for the side information, then the module 211 also needs to update the memory of the involved predictors to minimize the discontinuity of decoded speech parameters at the next good frame.
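  • A sketch of the two synthesis-filter memory updates described above; the buffer layouts are assumptions, and the long-term extrapolation uses a scaling factor of 1 for simplicity.

    /* Short-term memory: last M extrapolated samples in reverse order. */
    void update_short_term_memory(const double *frame, int nf,
                                  double *stsm, int M)
    {
        for (int k = 0; k < M; k++)
            stsm[k] = frame[nf - 1 - k];
    }

    /* Long-term memory ltm[0..len-1] (newest at ltm[len-1]): shift out
     * the oldest nf samples and append nf periodically extrapolated
     * samples, assuming ppfe1 <= len - nf. */
    void extend_long_term_memory(double *ltm, int len, int nf, int ppfe1)
    {
        for (int i = 0; i < len - nf; i++) ltm[i] = ltm[i + nf];
        for (int i = len - nf; i < len; i++) ltm[i] = ltm[i - ppfe1];
    }
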
  • moving-average (MA) predictive coding is used to quantize both the Line-Spectrum Pair (LSP) parameters and the excitation gain.
  • the predictive coding schemes for these parameters work as follows. For each parameter, the long-term mean value of that parameter is calculated off-line and subtracted from the unquantized parameter value. The predicted value of the mean-removed parameter is then subtracted from this mean-removed parameter value. A quantizer quantizes the resulting prediction error. The output of the quantizer is used as the input to the MA predictor. The predicted parameter value and the long-term mean value are both added back to the quantizer output value to reconstruct a final quantized parameter value.
  • the modules 202 through 210 produce the extrapolated speech for the current erased frame.
  • For the current frame, there is no need to extrapolate the side information speech parameters, since the output speech waveform has already been generated.
  • Nevertheless, these parameters are extrapolated from the last frame by simply copying the parameter values from the last frame; the invention then works "backward" from these extrapolated parameter values to update the predictor memory of the predictive quantizers for these parameters.
  • the predictor memory in the predictive LSP quantizer can be updated as follows.
  • the predicted value for the k-th LSP parameter is calculated as the inner product of the predictor coefficient array and the predictor memory array for the k-th LSP parameter.
  • This predicted value and the long-term mean value of the k -th LSP are subtracted from the k -th LSP parameter value at the last frame.
  • the resulting value is used to update the newest memory location for the predictor of the k -th LSP parameter (after the original set of predictor memory is shifted by one memory location, as is well-known in the art). This procedure is repeated for all the LSP parameters (there are M of them).
  • the memory update for the gain predictor is essentially the same as the memory update for the LSP predictors described above.
  • the predicted value of log-gain is calculated (by calculating the inner product of the predictor coefficient array and the predictor memory array for the log-gain). This predicted log-gain and the long-term mean value of the log-gain are then subtracted from the log-gain value of the last frame. The resulting value is used to update the newest memory location for the log-gain predictor (after the original set of predictor memory is shifted by one memory location, as is well-known in the art).
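  • A sketch of this backward memory update for the log-gain MA predictor; array names are illustrative, and the LSP predictor memories are updated in the same way, one parameter at a time.

    void update_gain_predictor(double *mem, const double *coef, int order,
                               double last_log_gain, double long_term_mean)
    {
        double pred = 0.0;            /* predicted mean-removed log-gain */
        for (int k = 0; k < order; k++)
            pred += coef[k] * mem[k];

        /* The value the quantizer would have produced for the last frame. */
        double e = last_log_gain - long_term_mean - pred;

        for (int k = order - 1; k > 0; k--)   /* shift by one location */
            mem[k] = mem[k - 1];
        mem[0] = e;                   /* newest memory location */
    }
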
  • If the frame erasure lasts longer than 60 ms, the output speech is zeroed out, and the base-2 log-gain is assumed to be at an artificially set default silence level of 0. Again, the predicted log-gain and the long-term mean value of the log-gain are subtracted from this default level of 0, and the resulting value is used to update the newest memory location for the log-gain predictor.
  • If the frame erasure lasts more than 20 ms but does not exceed 60 ms, then updating the predictor memory for the predictive gain quantizer may be challenging, because the extrapolated speech waveform is attenuated using the waveform attenuation window of FIG. 3.
  • the log-gain predictor memory is updated based on the log-gain value of the waveform attenuation window in each frame.
  • a correction factor is calculated from the log-gain of the last frame based on the attenuation window of FIG. 3, and the correction factor is stored.
  • the following algorithm calculates these 8 correction factors, or log-gain attenuation factors.
  • the above algorithm calculates the base-2 log-gain value of the waveform attenuation window for a given frame, and then determines the difference between this value and a similarly calculated log-gain for the window of the previous frame, compensated for the normalization of the start of the window to unity for each frame.
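  • Since the exact algorithm is not reproduced here, the following is one simple reading of the description, assuming gain is the per-frame RMS of the normalized window section, so that lga(k) is the negative base-2 log-gain of the section applied in the (k+5)-th erased frame; the 8 kHz sampling rate and 40-sample frame are assumptions.

    #include <math.h>
    #include <stdio.h>

    #define NF 40   /* 5 ms frame at 8 kHz sampling, an assumption */

    /* Base-2 log of the RMS of a linear ramp from w0 to w1 over NF samples. */
    static double log2_rms(double w0, double w1)
    {
        double acc = 0.0;
        for (int n = 0; n < NF; n++) {
            double w = w0 + (w1 - w0) * (double)n / (double)NF;
            acc += w * w;
        }
        return 0.5 * log2(acc / (double)NF);
    }

    int main(void)
    {
        /* The normalized section for the (k+5)-th erased frame ramps from
         * 1 down to (7-k)/(8-k); its negative log-gain approximates the
         * attenuation factor lga(k) under this reading. */
        for (int k = 0; k < 8; k++) {
            double end = (7.0 - (double)k) / (8.0 - (double)k);
            printf("lga(%d) = %.6f\n", k, -log2_rms(1.0, end));
        }
        return 0;
    }
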
  • With the 8 correction factors pre-calculated and stored, the log-gain predictor memory update for frame erasure lasting 20 ms to 60 ms becomes straightforward. If the current erased frame is the j-th frame into frame erasure (4 < j ≤ 12), lga(j−4) is subtracted from the log-gain value of the last frame. From the result of this subtraction, the predicted log-gain and the long-term mean value of the log-gain are further subtracted, and the resulting value is used to update the newest memory location for the log-gain predictor.
  • the decoder 100 uses these values to update the memories of its short-term synthesis filter 190, long-term synthesis filter 180, LSP predictor, and gain predictor, in preparation for the decoding of the next frame, assuming the next frame will be received intact.
  • the frame erasure concealment scheme described above can be used as is, and it will provide significant speech quality improvement compared with applying no concealment. So far, essentially all the frame erasure concealment operations are performed during erased frames.
  • the present invention has an optional feature that improves speech quality by performing "filter memory correction" at the first received good frame after the erasure.
  • the short-term synthesis filter memory and the long-term synthesis filter memory are updated in the module 211 based on waveform extrapolation.
  • Such filter memory mismatch often causes audible distortion even after the frame erasure is over.
  • During erased frames, the pitch period is typically held constant or nearly constant. If the pitch period is instantaneously quantized (i.e., without using inter-frame predictive coding), and if the frame erasure occurs in a voiced speech segment with a smooth pitch contour, then linearly interpolating between the transmitted pitch periods of the last good frame before erasure and the first good frame after erasure often provides a better approximation of the transmitted pitch period contour than holding the pitch period constant during erased frames. Therefore, if the synthesis filter memory is re-calculated or corrected at the first good frame after erasure, based on the linearly interpolated pitch period over the erased frames, better speech quality can often be obtained.
  • the long-term synthesis filter memory is corrected in the following way at the first good frame after the erasure.
  • the received pitch period at the first good frame and the received pitch period at the last good frame before the erasure are used to perform linear interpolation of the pitch period over the erased frames. If an interpolated pitch period is not an integer, it is rounded off to the nearest integer.
  • the long-term synthesis filter memory is "re-extrapolated" frame-by-frame based on the linearly interpolated pitch period in each erased frame, until the end of the last erased frame is reached. For simplicity, a scaling factor of 1 may be used for the extrapolation of the long-term synthesis filter. After such re-extrapolation, the long-term synthesis filter memory is corrected.
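  • A sketch of this per-frame pitch interpolation; placing the erased frames at equally spaced fractions between the two good frames is an assumption.

    #include <math.h>

    /* pp_last: pitch of the last good frame before the erasure;
     * pp_new: pitch of the first good frame after nerased erased frames;
     * pp_out receives one rounded pitch period per erased frame. */
    void interpolate_pitch(int pp_last, int pp_new, int nerased, int *pp_out)
    {
        for (int i = 1; i <= nerased; i++) {
            double t = (double)i / (double)(nerased + 1);
            double pp = (1.0 - t) * (double)pp_last + t * (double)pp_new;
            pp_out[i - 1] = (int)floor(pp + 0.5);  /* round to nearest */
        }
    }
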
  • the short-term synthesis filter memory may be corrected in a similar way, by re-extrapolating the speech waveform frame-by-frame, until the end of the last erased frame is reached. Then, the last M samples of the re-extrapolated speech waveform at the last erased frame, with the order reversed, will be the corrected short-term synthesis filter memory.
  • Another simpler way to correct the short-term synthesis filter memory is to estimate the waveform offset between the original extrapolated waveform and the re-extrapolated waveform, without doing the re-extrapolation.
  • This method is described below. First, the last speech sample of the last erased frame is "projected" backward by ppfe1 samples, where ppfe1 is the original time lag used for extrapolation at that frame; depending on which frame the newly projected sample lands in, it is backward-projected by the ppfe1 of that frame again. This process continues until a newly projected sample lands in a good frame before the erasure. An analogous backward projection is performed using the linearly interpolated pitch periods, and the difference between the two landing points gives the waveform offset.
  • If this waveform offset indicates that the re-extrapolated speech waveform based on the interpolated pitch period would be delayed by X samples relative to the original extrapolated speech waveform at the end of the last erased frame, the short-term synthesis filter memory can be corrected by taking the M consecutive samples of the original extrapolated speech waveform that are X samples away from the end of the last erased frame, and then reversing their order. If, on the other hand, the waveform offset indicates that the original extrapolated speech waveform is delayed by X samples relative to the re-extrapolated speech waveform (if such re-extrapolation were ever done), then the short-term synthesis filter memory correction would need to use certain speech samples that are not extrapolated yet.
  • In this case, the original extrapolation can be extended for X more samples, and the last M samples taken with their order reversed.
  • Alternatively, the system can move back one pitch cycle and use the M consecutive samples (with order reversed) of the original extrapolated speech waveform that are (ppfe1 − X) samples away from the end of the last erased frame, where ppfe1 is the time lag used for the original extrapolation of the last erased frame, assuming ppfe1 > X.
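  • A sketch of the waveform-offset estimate: the last sample of the last erased frame is projected backward twice, once with the original per-frame lags and once with the interpolated ones, and the two landing points are differenced. The frame bookkeeping here is simplified and assumed.

    /* pos is a sample index with 0 at the first sample of the first
     * erased frame; good frames lie at negative indices. lag[f] is the
     * time lag used in erased frame f. */
    static int project_back(int pos, const int *lag, int nf)
    {
        while (pos >= 0) {
            int f = pos / nf;        /* which erased frame pos lies in */
            pos -= lag[f];
        }
        return pos;                  /* landing point in the good region */
    }

    /* The difference of landing points estimates the offset X above. */
    int waveform_offset(const int *orig_lag, const int *interp_lag,
                        int nerased, int nf)
    {
        int last = nerased * nf - 1; /* last sample of last erased frame */
        return project_back(last, orig_lag, nf)
             - project_back(last, interp_lag, nf);
    }
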
  • An example of such a computer system 400, on which the present invention can be implemented, is shown in FIG. 4.
  • all of the elements depicted in FIGs. 1 and 2, for example, can execute on one or more distinct computer systems 400, to implement the various methods of the present invention.
  • the computer system 400 includes one or more processors, such as a processor 404.
  • the processor 404 can be a special purpose or a general purpose digital signal processor, and it is connected to a communication infrastructure 406 (for example, a bus or network).
  • the computer system 400 also includes a main memory 408, preferably random access memory (RAM), and may also include a secondary memory 410.
  • the secondary memory 410 may include, for example, a hard disk drive 412 and/or a removable storage drive 414, representing a floppy disk drive, a magnetic tape drive, an optical disk drive, etc.
  • the removable storage drive 414 reads from and/or writes to a removable storage unit 418 in a well known manner.
  • the removable storage unit 418 represents a floppy disk, magnetic tape, optical disk, etc. which is read by and written to by removable storage drive 414.
  • the removable storage unit 418 includes a computer usable storage medium having stored therein computer software and/or data.
  • the secondary memory 410 may include other similar means for allowing computer programs or other instructions to be loaded into the computer system 400.
  • Such means may include, for example, a removable storage unit 422 and an interface 420.
  • Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and the other removable storage units 422 and the interfaces 420 which allow software and data to be transferred from the removable storage unit 422 to the computer system 400.
  • the computer system 400 may also include a communications interface 424.
  • the communications interface 424 allows software and data to be transferred between the computer system 400 and external devices. Examples of the communications interface 424 may include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, etc.
  • Software and data transferred via the communications interface 424 are in the form of signals 428 which may be electronic, electromagnetic, optical or other signals capable of being received by the communications interface 424. These signals 428 are provided to the communications interface 424 via a communications path 426.
  • the communications path 426 carries the signals 428 and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link and other communications channels.
  • The terms "computer readable medium" and "computer usable medium" are used to generally refer to media such as the removable storage drive 414, a hard disk installed in the hard disk drive 412, and the signals 428.
  • These computer program products are means for providing software to the computer system 400.
  • Computer programs are stored in the main memory 408 and/or the secondary memory 410. Computer programs may also be received via the communications interface 424. Such computer programs, when executed, enable the computer system 400 to implement the present invention as discussed herein.
  • the computer programs when executed, enable the processor 404 to implement the processes of the present invention. Accordingly, such computer programs represent controllers of the computer system 400.
  • the processes/methods performed by signal processing blocks of encoders and/or decoders can be performed by computer control logic.
  • the software may be stored in a computer program product and loaded into the computer system 400 using the removable storage drive 414, the hard drive 412 or the communications interface 424.
  • In another embodiment, features of the invention are implemented primarily in hardware using, for example, hardware components such as Application Specific Integrated Circuits (ASICs) and gate arrays.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Claims (10)

  1. A method of synthesizing a number of corrupted frames output from a decoder (100) comprising one or more predictive filters (180, 190), the corrupted frames being representative of a segment of a decoded signal sq(n) output from the decoder (100), the method comprising:
    determining (202) a preliminary time lag, ppfe1, based upon examining a predetermined number, K, of samples of another segment of the decoded signal;
    wherein the number, K, of examined samples is chosen from among a number, N, of stored samples;
    determining (202) a scaling factor, ptfe1, associated with the number, K, of examined samples when the preliminary time lag, ppfe1, is determined; and
    extrapolating (205) one or more replacement frames based upon the preliminary time lag, ppfe1, and the scaling factor, ptfe1;
    the method being characterized in that
    the scaling factor, ptfe1, is determined according to the expression:

    $$\mathit{ptfe1}=\operatorname{sign}\big(c(\mathit{ppfe1})\big)\times\frac{\sum_{n=N-K+1}^{N}\lvert sq(n)\rvert}{\sum_{n=N-K+1}^{N}\lvert sq(n-\mathit{ppfe1})\rvert}$$

    where c(ppfe1) is a correlation value that is a function of ppfe1.
  2. The method of claim 1, further comprising updating internal states of the filters based upon the extrapolation.
  3. The method of claim 2, wherein the correlation values c(j), associated with candidate preliminary time lags, j, are determined according to the expression:

    $$c(j)=\sum_{n=N-K+1}^{N} sq(n)\,sq(n-j)$$

    and wherein the preliminary time lag, ppfe1, is chosen from among the candidate preliminary time lags, j, and maximizes the expression:

    $$nc(j)=\frac{\Big(\sum_{n=N-K+1}^{N} sq(n)\,sq(n-j)\Big)^{2}}{\sum_{n=N-K+1}^{N} sq^{2}(n-j)}$$

    and the correlation value c(ppfe1) is equal to c(j) evaluated at j = ppfe1.
  4. The method of claim 1, further comprising:
    correcting the internal states of the filters when a first good frame is received, the first good frame being received after the number of corrupted frames.
  5. The method of claim 4, wherein the updating comprises updating short-term and long-term synthesis filters associated with the one or more predictive filters (180, 190).
  6. An apparatus for synthesizing a number of corrupted frames output from a decoder (100) comprising one or more predictive filters (180, 190), the corrupted frames being representative of a segment of a decoded signal sq(n) output by the decoder (100), the apparatus comprising:
    means for determining a preliminary time lag, ppfe1, based upon examining a predetermined number, K, of samples of another segment of the decoded signal;
    wherein the number, K, of examined samples is chosen from among a number, N, of stored samples;
    means for determining a scaling factor, ptfe1, associated with the number, K, of examined samples when the preliminary time lag, ppfe1, is determined; and
    means for extrapolating one or more replacement frames based upon the preliminary time lag, ppfe1, and the scaling factor, ptfe1;
    the apparatus being characterized in that
    the means for determining the scaling factor, ptfe1, is configured to determine the scaling factor, ptfe1, according to the expression:

    $$\mathit{ptfe1}=\operatorname{sign}\big(c(\mathit{ppfe1})\big)\times\frac{\sum_{n=N-K+1}^{N}\lvert sq(n)\rvert}{\sum_{n=N-K+1}^{N}\lvert sq(n-\mathit{ppfe1})\rvert}$$

    where c(ppfe1) is a correlation value that is a function of ppfe1.
  7. The apparatus of claim 6, further comprising means for updating the internal states of the filters based upon the extrapolation.
  8. The apparatus of claim 6, further comprising:
    means for correcting the internal states of the filters when a first good frame is received, the first good frame being received after the number of corrupted frames.
  9. A computer-readable medium carrying one or more sequences of one or more instructions for execution by one or more processors to perform a method of synthesizing a number of corrupted frames output from a decoder (100) comprising one or more predictive filters (180, 190), the corrupted frames being representative of a segment of a decoded signal sq(n) output by the decoder (100), the instructions, when executed by the one or more processors, causing the one or more processors to perform all of the following steps:
    determine (202) a preliminary time lag, ppfe1, based upon examining a predetermined number, K, of samples of another segment of the decoded signal;
    wherein the number, K, of examined samples is chosen from among a number, N, of stored samples;
    determine (202) a scaling factor, ptfe1, associated with the number, K, of examined samples when the preliminary time lag, ppfe1, is determined; and
    extrapolate (205) one or more replacement frames based upon the preliminary time lag, ppfe1, and the scaling factor, ptfe1;
    the computer-readable medium being characterized in that
    the scaling factor, ptfe1, is determined according to the expression:

    $$\mathit{ptfe1}=\operatorname{sign}\big(c(\mathit{ppfe1})\big)\times\frac{\sum_{n=N-K+1}^{N}\lvert sq(n)\rvert}{\sum_{n=N-K+1}^{N}\lvert sq(n-\mathit{ppfe1})\rvert}$$

    where c(ppfe1) is a correlation value that is a function of ppfe1.
  10. The computer-readable medium of claim 9, wherein the one or more instructions, when executed by the one or more processors, further cause the one or more processors to perform the following step:
    correct the internal states of the filters when a first good frame is received, the first good frame being received after the number of corrupted frames.
EP02757200A 2001-08-17 2002-08-19 Improved frame erasure concealment for predictive speech coding based on extrapolation of the speech waveform Expired - Fee Related EP1433164B1 (fr)

Applications Claiming Priority (11)

Application Number Priority Date Filing Date Title
US31278901P 2001-08-17 2001-08-17
US312789P 2001-08-17
US34437402P 2002-01-04 2002-01-04
US344374P 2002-01-04
US10/183,448 US7143032B2 (en) 2001-08-17 2002-06-28 Method and system for an overlap-add technique for predictive decoding based on extrapolation of speech and ringing waveform
US10/183,608 US7711563B2 (en) 2001-08-17 2002-06-28 Method and system for frame erasure concealment for predictive speech coding based on extrapolation of speech waveform
US183448 2002-06-28
US183608 2002-06-28
US10/183,451 US7308406B2 (en) 2001-08-17 2002-06-28 Method and system for a waveform attenuation technique for predictive speech coding based on extrapolation of speech waveform
US183451 2002-06-28
PCT/US2002/026255 WO2003023763A1 (fr) 2001-08-17 2002-08-19 Improved frame erasure concealment for predictive speech coding based on extrapolation of the speech waveform

Publications (3)

Publication Number Publication Date
EP1433164A1 EP1433164A1 (fr) 2004-06-30
EP1433164A4 EP1433164A4 (fr) 2006-07-12
EP1433164B1 true EP1433164B1 (fr) 2007-11-14

Family

ID=27539098

Family Applications (1)

Application Number Title Priority Date Filing Date
EP02757200A Expired - Fee Related EP1433164B1 (fr) Improved frame erasure concealment for predictive speech coding based on extrapolation of the speech waveform

Country Status (3)

Country Link
EP (1) EP1433164B1 (fr)
DE (1) DE60223580T2 (fr)
WO (1) WO2003023763A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2416467B (en) * 2003-05-14 2006-08-30 Oki Electric Ind Co Ltd Apparatus and method for concealing erased periodic signal data
US7519535B2 (en) 2005-01-31 2009-04-14 Qualcomm Incorporated Frame erasure concealment in voice communications
PT3664086T (pt) 2014-06-13 2021-11-02 Ericsson Telefon Ab L M Gestão de erros de tramas em rajada

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69633164T2 (de) * 1995-05-22 2005-08-11 Ntt Mobile Communications Network Inc. Sound decoder (Tondekoder)
US5699485A (en) * 1995-06-07 1997-12-16 Lucent Technologies Inc. Pitch delay modification during frame erasures
US6810377B1 (en) * 1998-06-19 2004-10-26 Comsat Corporation Lost frame recovery techniques for parametric, LPC-based speech coding systems
US6188980B1 (en) * 1998-08-24 2001-02-13 Conexant Systems, Inc. Synchronized encoder-decoder frame concealment using speech coding parameters including line spectral frequencies and filter coefficients

Also Published As

Publication number Publication date
DE60223580T2 (de) 2008-09-18
WO2003023763A1 (fr) 2003-03-20
EP1433164A1 (fr) 2004-06-30
EP1433164A4 (fr) 2006-07-12
DE60223580D1 (de) 2007-12-27

Similar Documents

Publication Publication Date Title
US7590525B2 (en) Frame erasure concealment for predictive speech coding based on extrapolation of speech waveform
EP1288916B1 (fr) Procédé et dispositif de masquage de pertes de trames de parole codée prédictivement utilisant une extrapolation du signal
EP1291851B1 (fr) Procédé et dispositif de masquage du signal de trames de paroles détériorées par des erreurs
EP1363273B1 (fr) Système de communication de la parole et procédé de gestion de trames perdues
US7930176B2 (en) Packet loss concealment for block-independent speech codecs
EP1526507B1 (fr) Méthode pour effectuer un masquage de pertes de paquets et/ou de pertes de trames dans un système de communication
EP2054878B1 (fr) Décodage contraint et contrôlé après perte de paquet
US8386246B2 (en) Low-complexity frame erasure concealment
US8015000B2 (en) Classification-based frame loss concealment for audio signals
EP1110209B1 (fr) Lissage spectral pour le codage de la parole
EP1194924B3 (fr) Compensation d'inclinaisons adaptative pour residus vocaux synthetises
US20080140409A1 (en) Method and apparatus for performing packet loss or frame erasure concealment
EP1288915B1 (fr) Procédé et dispositif d'atténuation du signal de trames de parole détériorées par des erreurs
US6564182B1 (en) Look-ahead pitch determination
US7146309B1 (en) Deriving seed values to generate excitation values in a speech coder
EP1433164B1 (fr) Improved frame erasure concealment for predictive speech coding based on extrapolation of the speech waveform
US20090055171A1 (en) Buzz reduction for low-complexity frame erasure concealment

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20040317

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LI LU MC NL PT SE SK TR

A4 Supplementary search report drawn up and despatched

Effective date: 20060609

RIC1 Information provided on ipc code assigned before grant

Ipc: G10L 19/00 20060101AFI20060602BHEP

17Q First examination report despatched

Effective date: 20060831

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: BROADCOM CORPORATION

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE FR GB

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REF Corresponds to:

Ref document number: 60223580

Country of ref document: DE

Date of ref document: 20071227

Kind code of ref document: P

ET Fr: translation filed
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20080815

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20130831

Year of fee payment: 12

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20130820

Year of fee payment: 12

Ref country code: GB

Payment date: 20130823

Year of fee payment: 12

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 60223580

Country of ref document: DE

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20140819

REG Reference to a national code

Ref country code: DE

Ref legal event code: R079

Ref document number: 60223580

Country of ref document: DE

Free format text: PREVIOUS MAIN CLASS: G10L0011040000

Ipc: G10L0019040000

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20150430

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 60223580

Country of ref document: DE

Effective date: 20150303

Ref country code: DE

Ref legal event code: R079

Ref document number: 60223580

Country of ref document: DE

Free format text: PREVIOUS MAIN CLASS: G10L0011040000

Ipc: G10L0019040000

Effective date: 20150520

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20150303

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140819

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140901