EP0801788B1 - Method for speech coding using analysis by synthesis

Info

Publication number
EP0801788B1
EP0801788B1 EP96901008A EP96901008A EP0801788B1 EP 0801788 B1 EP0801788 B1 EP 0801788B1 EP 96901008 A EP96901008 A EP 96901008A EP 96901008 A EP96901008 A EP 96901008A EP 0801788 B1 EP0801788 B1 EP 0801788B1
Authority
EP
European Patent Office
Prior art keywords
frame
delays
open
delay
loop
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP96901008A
Other languages
English (en)
French (fr)
Other versions
EP0801788A1 (de
Inventor
William Navarro
Michel Mauc
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nortel Networks France SAS
Original Assignee
Matra Nortel Communications SAS
Application filed by Matra Nortel Communications SAS

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/09Long term prediction, i.e. removing periodical redundancies, e.g. by using adaptive codebook or pitch predictor
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93Discriminating between voiced and unvoiced parts of speech signals

Definitions

  • the present invention relates to speech coding using analysis by synthesis.
  • a linear prediction of the speech signal is carried out to obtain the coefficients of a short-term synthesis filter modeling the transfer function of the vocal tract. These coefficients are transmitted to the decoder, as well as parameters characterizing an excitation to be applied to the short-term synthesis filter.
  • a further search is made for the longer-term correlations of the speech signal in order to characterize a long-term synthesis filter accounting for the pitch of the speech.
  • the excitation indeed has a predictable component which can be represented by the past excitation, delayed by TP samples of the speech signal and scaled by a gain g_P.
  • the remaining, unpredictable part of the excitation is called stochastic excitation.
  • CELP: Code Excited Linear Prediction
  • MPLPC: Multi-Pulse Linear Prediction Coding
  • the stochastic excitation comprises a certain number of pulses whose positions are sought by the coder.
  • CELP coders are preferred for low transmission rates, but they are more complex to implement than MPLPC coders.
  • the long-term prediction delay can be determined by an open-loop analysis, a closed-loop analysis, or a combination of both.
  • the open-loop analysis requires little computation, but its accuracy is limited.
  • the closed-loop analysis requires many more calculations, but it is more reliable because it directly contributes to minimizing the perceptually weighted difference between the speech signal and the synthetic signal.
  • an open-loop analysis is first performed to limit the interval in which the closed-loop analysis will search for the prediction delay. This search interval must nevertheless remain relatively wide, because the delay can vary quickly.
  • the invention aims in particular to find a good compromise between the quality of the modeling of the long-term part of the excitation and the complexity of the search for the corresponding delay in a speech coder.
  • the invention thus proposes a method for coding, by analysis by synthesis, a speech signal digitized in successive frames divided into nst sub-frames, comprising the following steps: linear prediction analysis of the speech signal to determine the parameters of a short-term synthesis filter; open-loop analysis of the speech signal to detect the voiced frames of the signal and to determine, for each voiced frame, a degree of voicing of the signal and a search interval for a long-term prediction delay; closed-loop predictive analysis of the speech signal to select, for at least some of the sub-frames of the voiced frames, a long-term prediction delay contained in the search interval and constituting a parameter of a long-term synthesis filter; and determination of a stochastic excitation for each sub-frame, so as to minimize a perceptually weighted difference between the speech signal and the stochastic excitation filtered by the long-term and short-term synthesis filters.
  • in the open-loop analysis step, the search interval is determined for each voiced frame so that it contains a number of delays that depends on the degree of voicing of said frame.
  • the number of delays to be tested in closed loop is thus adapted to the voicing mode of the frame.
  • the search interval will be narrower for the most strongly voiced frames, in order to take account of their greater harmonic stability.
  • a speech coder implementing the invention is applicable in various types of speech transmission and / or storage systems using a digital compression technique.
  • the speech coder 16 is part of a mobile radio station.
  • the speech signal S is a digital signal sampled at a frequency typically equal to 8 kHz.
  • the signal S comes from an analog-digital converter 18 receiving the amplified and filtered output signal from a microphone 20.
  • the converter 18 puts the speech signal S in the form of successive frames, themselves subdivided into nst sub-frames of lst samples.
  • the speech signal S can also be subjected to conventional shaping treatments such as Hamming filtering.
  • the speech coder 16 delivers a binary sequence with a significantly lower bit rate than that of the speech signal S, and addresses this sequence to a channel coder 22 whose function is to introduce redundancy bits into the signal in order to allow detection and / or a correction of any transmission errors.
  • the output signal from the channel encoder 22 is then modulated on a carrier frequency by the modulator 24, and the modulated signal is transmitted on the air interface.
  • the speech coder 16 is an analysis-by-synthesis coder.
  • the coder 16 determines, on the one hand, parameters characterizing a short-term synthesis filter modeling the vocal tract of the speaker and, on the other hand, an excitation sequence which, applied to the short-term synthesis filter, provides a synthetic signal constituting an estimate of the speech signal S according to a perceptual weighting criterion.
  • the short-term synthesis filter has a transfer function of the form 1 / A (z), with:
  • the coefficients a_i are determined by a module 26 for short-term linear prediction analysis of the speech signal S.
  • the a_i are the linear prediction coefficients of the speech signal S.
  • the order q of the linear prediction is typically about 10.
  • the methods applicable by the module 26 for short-term linear prediction are well known in the field of speech coding.
  • the module 26, for example, implements the Durbin-Levinson algorithm (see J. Makhoul: "Linear Prediction: A Tutorial Review", Proc. IEEE, Vol. 63, No. 4, April 1975, pp. 561-580).
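As an illustration of the kind of recursion the module 26 may run, here is a minimal Python sketch of the Levinson-Durbin recursion. It assumes that the autocorrelation values of a (windowed) frame are already available and that the predictor has the form s(n) ≈ Σ a_i·s(n−i); the function name and the sign convention are assumptions of this sketch, not taken from the patent text.

```python
def levinson_durbin(r, order):
    """Solve the normal equations for a[0..order-1] from autocorrelations r[0..order].

    Assumed convention: s(n) is predicted by sum_i a[i-1] * s(n - i).
    Returns (a, err), where err is the remaining prediction-error energy.
    """
    a = [0.0] * order
    err = float(r[0])
    for m in range(order):
        if err <= 0.0:          # degenerate frame: stop early
            break
        # Reflection coefficient for predictor order m + 1.
        acc = r[m + 1] - sum(a[i] * r[m - i] for i in range(m))
        k = acc / err
        prev = a[:]
        a[m] = k
        for i in range(m):
            a[i] = prev[i] - k * prev[m - 1 - i]
        err *= 1.0 - k * k
    return a, err


# Autocorrelation of a first-order process with coefficient 0.9.
r = [0.9 ** k for k in range(3)]
coeffs, residual = levinson_durbin(r, 2)
```

For this first-order input the second coefficient comes out as zero, which is a quick sanity check that the recursion does not over-fit the model order.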
  • the coefficients a_i obtained are supplied to a module 28 which converts them into line spectrum parameters (LSP).
  • the representation of the prediction coefficients a_i by LSP parameters is frequently used in analysis-by-synthesis speech coders.
  • LST t (nst-1) LSP t for sub -frames 0,1,2, ..., nst-1 of the frame t.
  • the coefficients a_i of the filter 1/A(z) are then determined, sub-frame by sub-frame, from the interpolated LSP parameters.
  • the unquantized LSP parameters are supplied by the module 28 to a module 32 for calculating the coefficients of a perceptual weighting filter 34.
  • the coefficients of the perceptual weighting filter are calculated by the module 32 for each sub-frame after interpolation of the LSP parameters received from the module 28.
  • the perceptual weighting filter 34 receives the speech signal S and delivers a perceptually weighted signal SW, which is analyzed by modules 36, 38, 40 to determine the excitation sequence.
  • the excitation sequence of the short-term filter consists of an excitation predictable by a long-term synthesis filter modeling the pitch of the speech, and of an unpredictable stochastic excitation, or innovation sequence.
  • Module 36 performs long-term prediction (LTP) in open loop, i.e. it does not contribute directly to the minimization of the weighted error.
  • LTP long-term prediction
  • the weighting filter 34 is located upstream of the open-loop analysis module, but it could be otherwise: the module 36 could operate directly on the speech signal S, or on the signal S stripped of its short-term correlations by a filter with transfer function A(z).
  • modules 38 and 40 operate in closed loop, i.e. they directly contribute to minimizing the perceptually weighted error.
  • the long-term prediction delay is determined in two steps.
  • the open-loop LTP analysis module 36 detects the voiced frames of the speech signal and determines, for each voiced frame, a degree of voicing MV and a search interval for the long-term prediction delay.
  • the search interval is defined by a central value, represented by its quantization index ZP, and by a width in the domain of the quantization indices that depends on the degree of voicing MV.
  • the module 30 operates the quantization of the LSP parameters which have previously been determined for this frame.
  • This quantization is for example vectorial, that is to say it consists in selecting, from one or more predetermined quantization tables, a set of quantized parameters LSP Q which has a minimum distance from the set of parameters LSP provided by the module 28.
  • the quantization tables differ according to the degree of voicing MV provided to the quantization module 30 by the open-loop analyzer 36.
  • a set of quantization tables for a degree of voicing MV is determined, during prior tests, so as to be statistically representative of frames having this degree MV. These sets are stored both in the coders and in the decoders implementing the invention.
  • the module 30 delivers the set of quantized parameters LSP Q as well as its index Q in the applicable quantization tables.
  • the speech coder 16 further comprises a module 42 for calculating the impulse response of the filter composed of the short-term synthesis filter and the perceptual weighting filter.
  • this composite filter has the transfer function W(z)/A(z).
  • for the perceptual weighting filter W(z), the module 42 takes the one corresponding to the interpolated but unquantized LSP parameters, i.e. the one whose coefficients were calculated by the module 32, and for the synthesis filter 1/A(z), the one corresponding to the quantized and interpolated LSP parameters, i.e. the one that will actually be reconstructed by the decoder.
  • the index of the delay TP is ZP + DP.
  • closed-loop LTP analysis consists in determining, in the search interval for the long-term prediction delays T, the delay TP which maximizes, for each sub-frame of a voiced frame, the normalized correlation: where x(i) denotes the weighted speech signal SW of the sub-frame from which the memory of the weighted synthesis filter has been subtracted (i.e. the response to a zero signal, due to its initial states, of the filter whose impulse response h has been calculated by the module 42), and y_T(i) denotes the convolution product: u(j−T) designating the predictable component of the excitation sequence delayed by T samples, estimated by the well-known adaptive codebook technique.
  • the missing values of u(j−T) can be extrapolated from the previous values.
  • fractional delays are taken into account by oversampling the signal u(j−T) in the adaptive codebook.
  • an oversampling by a factor m is obtained by means of polyphase interpolating filters.
  • the gain g_P of long-term prediction could be determined by the module 38 for each sub-frame, by applying the known formula: However, in a preferred version of the invention, the gain g_P is calculated by the stochastic analysis module 40.
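The closed-loop search described above can be sketched as follows. The sketch is restricted to integer delays and assumes the usual form of the criterion, (Σ x(i)·y_T(i))² / Σ y_T(i)², with the gain taken as Σ x·y / Σ y². The periodic extension used for the missing values of u(j−T) is only one possible extrapolation, and all function and variable names are hypothetical.

```python
def closed_loop_ltp(x, past_exc, h, delays):
    """Pick the delay T (1 <= T <= len(past_exc)) maximizing the normalized correlation.

    x        : target vector of the sub-frame (weighted speech minus filter memory)
    past_exc : past excitation samples, most recent last (adaptive codebook)
    h        : truncated impulse response of the weighted synthesis filter
    Returns (best_delay, gain).
    """
    n = len(x)
    best_delay, best_gain, best_score = None, 0.0, -1.0
    for t in delays:
        # Adaptive-codebook vector u(j - T); samples not yet available are
        # filled by periodic repetition (an assumption of this sketch).
        u = [0.0] * n
        for j in range(n):
            u[j] = past_exc[len(past_exc) - t + j] if j < t else u[j - t]
        # y_T(i) = sum_j u(j - T) * h(i - j)
        y = [sum(u[j] * h[i - j] for j in range(i + 1) if i - j < len(h))
             for i in range(n)]
        num = sum(xi * yi for xi, yi in zip(x, y))
        den = sum(yi * yi for yi in y)
        if den > 0.0 and num * num / den > best_score:
            best_delay, best_gain, best_score = t, num / den, num * num / den
    return best_delay, best_gain


pattern = [1.0, -0.5, 0.25, 0.0, 0.3]            # toy excitation with period 5
delay, gain = closed_loop_ltp((pattern * 2)[:8], pattern * 4, [1.0], range(3, 9))
```

Each candidate delay costs one convolution, which is why the number of delays in the search interval directly drives the complexity of this stage.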
  • the stochastic excitation determined for each subframe by the module 40 is of the multi-pulse type.
  • the positions and gains calculated by the stochastic analysis module 40 are quantized by a module 44.
  • a module 48 is thus provided in the encoder which receives the different parameters and which adds to some of them redundancy bits to detect and / or correct any transmission errors.
  • redundancy bits are added to this parameter by module 48.
  • bit rate per 20 ms frame is for example that indicated in table I.
  • the channel coder 22 is that used in the pan-European mobile radio communication system (GSM).
  • GSM: pan-European mobile radio communication system
  • this channel coder, described in detail in Recommendation GSM 05.03, was developed for a 13 kbit/s speech coder of RPE-LTP type, which also produces 260 bits per 20 ms frame. The sensitivity of each of the 260 bits was determined from listening tests.
  • the bits from the source coder have been grouped into three categories. The first of these categories (IA) groups 50 bits, which are convolutionally coded on the basis of a generator polynomial giving a redundancy of one half, with a constraint length equal to 5. Three parity bits are calculated and added to the 50 bits of category IA before convolutional coding.
  • the second category (IB) has 132 bits which are protected at a rate of a half by the same polynomial as the previous category.
  • the third category (II) contains 78 unprotected bits. After application of the convolutional code, the bits (456 per frame) are subjected to interleaving.
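The bit counts quoted above can be checked with a few lines of arithmetic. The four tail bits used to flush the convolutional coder are not mentioned in the text; they are added here as an assumption taken from the GSM 05.03 full-rate scheme (constraint length 5, hence 4 tail bits), and the constant names are hypothetical.

```python
CLASS_IA = 50      # most sensitive bits, protected by parity + convolutional code
PARITY = 3         # parity bits computed over category IA
CLASS_IB = 132     # protected by the convolutional code only
CLASS_II = 78      # unprotected bits
TAIL = 4           # assumption: constraint length 5 -> 4 tail bits (GSM 05.03)
FRAME_SECONDS = 0.020

source_bits = CLASS_IA + CLASS_IB + CLASS_II                  # bits from the speech coder
protected_in = CLASS_IA + PARITY + CLASS_IB + TAIL            # input of the rate-1/2 coder
channel_bits = 2 * protected_in + CLASS_II                    # bits after channel coding

source_rate_kbps = source_bits / FRAME_SECONDS / 1000.0
channel_rate_kbps = channel_bits / FRAME_SECONDS / 1000.0
```

With these figures the 260 source bits per frame give 13 kbit/s, and the 456 coded bits per frame correspond to 22.8 kbit/s on the channel.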
  • a mobile radio station capable of receiving the speech signal processed by the source encoder 16 is shown schematically in Figure 2.
  • the radio signal received is first processed by a demodulator 50, then by a channel decoder 52, which perform the dual operations of those of the modulator 24 and of the channel coder 22.
  • the channel decoder 52 provides the speech decoder 54 with a binary sequence which, in the absence of transmission errors or when any errors have been corrected by the channel decoder 52, corresponds to the binary sequence delivered by the scheduling module 46 of the coder 16.
  • the decoder 54 comprises a module 56 which receives this binary sequence and which identifies the parameters relating to the different frames and sub-frames.
  • the module 56 also performs some checks on the parameters received. In particular, the module 56 examines the redundancy bits introduced by the module 48 of the coder, to detect and/or correct errors affecting the parameters associated with these redundancy bits.
  • a module 58 of the decoder receives the degree of voicing MV and the quantization index Q of the LSP parameters.
  • the module 58 finds the quantized LSP parameters in the tables corresponding to the value of MV, and, after interpolation, converts them into coefficients a i for the short-term synthesis filter 60.
  • a pulse generator 62 receives the positions p (n) of the np pulses of the stochastic excitation.
  • the generator 62 delivers pulses of unit amplitude, each of which is multiplied in the amplifier 64 by the associated gain g(n).
  • the output of the amplifier 64 is addressed to the long-term synthesis filter 66.
  • this filter 66 has an adaptive codebook structure.
  • the output samples u of the filter 66 are stored in the adaptive codebook 68 so as to be available for the subsequent sub-frames.
  • the delay TP relating to a sub-frame, calculated from the quantization indices ZP and DP, is supplied to the adaptive codebook 68 to produce the suitably delayed signal u.
  • the amplifier 70 multiplies the signal thus delayed by the gain g_P of long-term prediction.
  • the long-term filter 66 finally comprises an adder 72 which adds the outputs of the amplifiers 64 and 70 to provide the excitation sequence u.
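The decoder path formed by the generator 62, the amplifiers 64 and 70, the adaptive codebook 68 and the adder 72 can be sketched as below. The sketch handles integer delays only and feeds freshly computed samples straight back into the buffer, which is a simplification of the adaptive codebook; the function and argument names are hypothetical.

```python
def long_term_synthesis(history, positions, gains, tp, gp, lst):
    """Build one sub-frame of the excitation u from pulses and the delayed past.

    history   : past excitation samples, most recent last
    positions : pulse positions p(n) inside the sub-frame
    gains     : pulse gains g(n), same order as positions
    tp, gp    : long-term prediction delay (integer here) and gain
    lst       : number of samples per sub-frame
    """
    buf = list(history)
    for i in range(lst):
        # Unit pulses scaled by their gains (generator 62, amplifier 64).
        stochastic = sum(g for p, g in zip(positions, gains) if p == i)
        # Excitation delayed by tp samples (codebook 68), zero if unavailable.
        delayed = buf[len(buf) - tp] if 1 <= tp <= len(buf) else 0.0
        # Adder 72: stochastic part plus scaled delayed part (amplifier 70).
        buf.append(stochastic + gp * delayed)
    return buf[len(history):]


u = long_term_synthesis([0.0] * 10, [0], [1.0], tp=4, gp=0.5, lst=10)
```

A single pulse at position 0 reappears every four samples with its amplitude halved each time, which shows how the long-term filter recreates the pitch periodicity.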
  • the excitation sequence is addressed to the short-term synthesis filter 60, and the resulting signal can also, in known manner, be subjected to a post-filter 74 whose coefficients depend on the synthesis parameters received, to form the signal of synthetic speech S '.
  • the output signal S 'of the decoder 54 is then converted into analog by the converter 76 before being amplified to control a loudspeaker 78.
  • the module 36 also determines, for each sub-frame st, the integer delay K_st which maximizes the open-loop estimate P_st(k) of the long-term prediction gain on the sub-frame st, excluding the delays k for which the autocorrelation C_st(k) is negative or smaller than a small fraction ε of the energy R0_st of the sub-frame.
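A possible reading of this per-sub-frame optimization is sketched below. The expression used for P_st(k), 20·log10[R0 / (R0 − C²/G)], is an assumption modeled on the expression given for Gp elsewhere in the description; the value of ε, the function name and the argument layout are likewise hypothetical.

```python
import math


def open_loop_delay(sw, start, lst, kmin, kmax, eps=0.01):
    """Return (K, P) for the sub-frame sw[start:start+lst]; needs start >= kmax.

    C(k) = sum sw(i) * sw(i - k), G(k) = sum sw(i - k)^2, R0 = sum sw(i)^2.
    Delays with C(k) negative or below eps * R0 are excluded.
    """
    seg = sw[start:start + lst]
    r0 = sum(v * v for v in seg)
    best_k, best_p = None, -math.inf
    for k in range(kmin, kmax + 1):
        past = sw[start - k:start - k + lst]
        c = sum(a * b for a, b in zip(seg, past))
        g = sum(b * b for b in past)
        if g <= 0.0 or c < eps * r0:
            continue
        ratio = c * c / g                      # energy explained by delay k
        p = math.inf if ratio >= r0 else 20.0 * math.log10(r0 / (r0 - ratio))
        if p > best_p:
            best_k, best_p = k, p
    return best_k, best_p


signal = [1.0, 0.5, -0.3, 0.2, -0.8, 0.1, 0.4] * 10   # toy signal with period 7
k_best, p_best = open_loop_delay(signal, start=35, lst=14, kmin=5, kmax=12)
```

No filtering is involved at this stage, only correlations on the weighted signal, which is what keeps the open-loop step cheap compared with the closed-loop search.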
  • if the threshold S0 is not exceeded, the degree of voicing MV of the current frame is taken equal to 0 in step 94, which in this case ends the operations performed by the module 36 on this frame. If on the contrary the threshold S0 is exceeded in step 92, the current frame is detected as voiced and the degree MV will be equal to 1, 2 or 3. The module 36 then calculates, for each sub-frame st, a list I_st containing candidate delays to constitute the center ZP of the search interval for the long-term prediction delays.
  • the module 36 determines the basic delay rbf in integer resolution for the rest of the processing. This basic delay could be taken equal to the integer K_st obtained in step 90. Searching for the basic delay in fractional resolution around K_st, however, makes it possible to gain in precision.
  • step 100 thus consists in finding, around the integer delay K_st obtained in step 90, the fractional delay which maximizes the expression C_st²/G_st.
  • this search can be carried out at the maximum resolution of the fractional delays (1/6 in the example described here), even if the integer delay K_st is not in the domain where this maximum resolution applies.
  • the autocorrelations C_st(T) and the delayed energies G_st(T) are obtained by interpolation from the values stored in step 90 for the integer delays.
  • the basic delay relating to a sub-frame could also be determined in fractional resolution from step 90 and taken into account in the first estimation of the overall prediction gain on the frame.
  • step 102 the address j in the list I st and the index m of the submultiple are initialized to 0 and 1, respectively.
  • a comparison 104 is made between the submultiple rbf / m and the minimum delay rmin. The submultiple rbf / m is to be examined if it is greater than rmin.
  • if P_st(r_i) < SE_st, the delay r_i is not taken into account, and we go directly to step 110 of incrementing the index m before carrying out the comparison 104 again for the next submultiple. If test 108 shows that P_st(r_i) ≥ SE_st, the delay r_i is retained and step 112 is executed before incrementing the index m in step 110. In step 112, the index i is stored at the address j in the list I_st, the value m is given to the integer m0 intended to be equal to the index of the smallest submultiple retained, and the address j is then incremented by one unit.
  • the examination of the sub-multiples of the basic delay is finished when the comparison 104 shows that rbf/m is no longer greater than rmin.
  • a comparison 116 is made between the multiple n.rbf/m0 and the maximum delay rmax. If n.rbf/m0 < rmax, test 118 is carried out to determine whether the index m0 of the smallest sub-multiple is an integer multiple of n.
  • if so, the delay n.rbf/m0 has already been examined when examining the sub-multiples of rbf, and we go directly to step 120 of incrementing the index n before carrying out the comparison 116 again for the next multiple. If test 118 shows that m0 is not an integer multiple of n, the multiple n.rbf/m0 is to be examined. We then take for the integer i the value of the index of the quantized delay r_i closest to n.rbf/m0 (step 122), then we compare, at 124, the estimated value of the prediction gain P_st(r_i) with the selection threshold SE_st.
  • if P_st(r_i) < SE_st, the delay r_i is not taken into account, and we go directly to step 120 of incrementing the index n. If test 124 shows that P_st(r_i) ≥ SE_st, the delay r_i is retained and step 126 is executed before incrementing the index n in step 120. In step 126, the index i is stored at the address j in the list I_st, then the address j is incremented by one unit.
  • the list I_st then contains j candidate delay indices. If we wish to limit the maximum length of the list I_st to jmax for the following steps, we can take the length j_st of this list equal to min(j, jmax) (step 128) and then, in step 130, sort the list I_st in order of decreasing gains C_st²(r_Ist(j))/G_st(r_Ist(j)) for 0 ≤ j < j_st, so as to keep only the j_st delays providing the largest gain values.
  • the value of jmax is chosen according to the compromise sought between the efficiency of the search for LTP delays and the complexity of this search. Typical values of jmax range from 3 to 5.
  • the analysis module 36 calculates a quantity Ymax determining a second open-loop estimate of the prediction gain at long term over the entire frame, as well as indexes ZP, ZP0 and ZP1 in a phase 132, the progress of which is detailed in FIG. 6.
  • This phase 132 consists in testing search intervals of length N1 to determine which one maximizes a second estimate of the overall prediction gain on the frame. The intervals tested are those whose centers are the candidate delays contained in the list I st calculated during phase 101.
  • Phase 132 begins with a step 136 where the address j in the list I st is initialized to 0.
  • in step 138, we check whether the index I_st(j) has already been encountered when testing a previous interval centered on I_st'(j') with st' < st and 0 ≤ j' < j_st', in order to avoid testing the same interval twice. If test 138 reveals that I_st(j) already appeared in a list I_st' with st' < st, we directly increment the address j in step 140, then compare it with the length j_st of the list I_st. If the comparison 142 shows that j < j_st, we return to step 138 for the new value of the address j.
  • the quantity Y determining the second estimate of the overall prediction gain for the interval centered on I_st(j) is calculated according to: then compared to Ymax, where Ymax represents the value to be maximized.
  • this value Ymax is for example initialized to 0 at the same time as the index st in step 96. If Y ≤ Ymax, we go directly to step 140 for incrementing the index j. If the comparison 150 shows that Y > Ymax, step 152 is executed before incrementing the address j in step 140. In this step 152, the index ZP is taken equal to I_st(j), and the indices ZP0 and ZP1 are respectively taken equal to the smallest and the largest of the indices i_st' determined in step 148.
  • the index st is incremented by one unit (step 154), then compared, in step 156, with the number nst of sub-frames per frame. If st < nst, we return to step 98 to perform the operations relating to the following sub-frame.
  • the index ZP denotes the center of the search interval that will be provided to the module 38 closed loop LTP analysis
  • ZP0 and ZP1 are indices whose difference is representative of the dispersion of the optimal delays per sub-frame in the interval centered on ZP.
  • Gp = 20·log10[R0 / (R0 − Ymax)].
  • two other thresholds S1 and S2 are used. If Gp ≤ S1, the degree of voicing MV is taken equal to 1 for the current frame.
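The classification of a voiced frame from the gain estimate Gp can be sketched as below. Only the case MV = 1 is stated in the text; the mapping of the remaining ranges to MV = 2 and MV = 3 is an assumption of this sketch, as are the function name and the threshold values in the example.

```python
import math


def voicing_degree(r0, ymax, s1, s2):
    """Map the second open-loop gain estimate to a degree of voicing 1, 2 or 3.

    Gp = 20 * log10(R0 / (R0 - Ymax)); requires 0 <= ymax < r0.
    The frame is assumed to have already passed the voiced test (threshold S0).
    """
    gp = 20.0 * math.log10(r0 / (r0 - ymax))
    if gp <= s1:
        return 1
    if gp <= s2:        # assumed intermediate class
        return 2
    return 3            # assumed most strongly voiced class


# Hypothetical thresholds, in dB.
mv_weak = voicing_degree(1.0, 0.0, s1=4.0, s2=8.0)
mv_mid = voicing_degree(1.0, 0.5, s1=4.0, s2=8.0)
mv_strong = voicing_degree(1.0, 0.9, s1=4.0, s2=8.0)
```

The degree obtained this way is what later selects the width of the closed-loop search interval and the LSP quantization tables.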
  • the index ZP + DP of the delay TP ultimately determined may therefore in some cases be smaller than 0 or larger than 255.
  • this allows the closed-loop LTP analysis to also cover some delays TP smaller than rmin or larger than rmax.
  • reducing the delay search interval for the most strongly voiced frames reduces the complexity of the closed-loop LTP analysis performed by the module 38, by reducing the number of convolutions y_T(i) to be calculated according to formula (1).
  • Another possibility is to provide a parity bit for the delay TP and / or the gain g P , making it possible to detect possible errors affecting these parameters.
  • the first optimizations carried out in step 90 relative to the different subframes are replaced by a single optimization relating to the entire frame.
  • the autocorrelations C (k) and the delayed energies G (k) for the entire frame are also calculated:
  • nz basic delays K_1', ..., K_nz' in integer resolution.
  • the voiced/unvoiced decision (step 92) is taken on the basis of that one of the basic delays K_i' which provides the greatest value of the first open-loop estimate of the long-term prediction gain.
  • the basic delays in fractional resolution are determined by the same process as in step 100, but only allowing the quantized delay values. Examination 101 of the sub-multiples and multiples is not performed. For the phase 132 of calculating the second estimate of the prediction gain, the nz basic delays previously determined are taken as candidate delays. This second variant makes it possible to dispense with the systematic examination of the submultiples and of the multiples which are generally taken into account by virtue of the subdivision of the domain of possible delays.
  • phase 132 is modified in that, in the optimization steps 148, we determine on the one hand the index i_st' which maximizes C_st'²(r_i)/G_st'(r_i) for I_st(j) − N1/2 ≤ i < I_st(j) + N1/2 and 0 ≤ i < N, and on the other hand, during the same maximization loop, the index k_st' which maximizes this same quantity over a reduced interval I_st(j) − N3/2 ≤ i < I_st(j) + N3/2 and 0 ≤ i < N.
  • step 152 is also modified: the indices ZP0 and ZP1 are no longer stored, but a quantity Ymax' is stored instead, defined in the same way as Ymax but with reference to the reduced-length interval:
  • Gp' = 20·log10[R0 / (R0 − Ymax')].
  • the sub-frames for which the prediction gain is negative or negligible can be identified by consulting the nst pointers. If necessary, the module 38 is deactivated for the corresponding sub-frames. This does not affect the quality of the LTP analysis since the prediction gain corresponding to these subframes will be almost zero anyway.
  • Another aspect of the invention relates to the module 42 for calculating the impulse response of the weighted synthesis filter.
  • the closed loop LTP analysis module 38 needs this impulse response h over the duration of a subframe to calculate the convolutions y T (i) according to formula (1).
  • the stochastic analysis module 40 also needs it to calculate convolutions as will be seen below.
  • the operations performed by the module 42 are for example in accordance with the flowchart of FIG. 7.
  • the truncated energies of the impulse response are also calculated:
  • the coefficients a k are those involved in the perceptual weighting filter, i.e. the linear prediction coefficients interpolated but not quantified
  • the coefficients a k are those applied to the synthesis filter, i.e. the quantized and interpolated linear prediction coefficients.
  • the module 42 determines the shortest length Lα such that the energy Eh(Lα−1) of the impulse response truncated to Lα samples is at least equal to a proportion α of its total energy Eh(pst−1), estimated over pst samples.
  • a typical value of α is 98%.
  • the number Lα is initialized to pst in step 162 and decremented by one unit (step 166) as long as Eh(Lα−2) > α·Eh(pst−1) (test 164).
  • the length Lα sought is obtained when test 164 shows that Eh(Lα−2) ≤ α·Eh(pst−1).
  • a corrective term Δ(MV) is added to the value of Lα thus obtained (step 168).
  • this corrective term is preferably an increasing function of the degree of voicing.
  • Δ(0) = −5
  • Δ(3) = +7.
  • the truncation length Lh of the impulse response is taken equal to Lα if Lα ≤ nst and to nst otherwise.
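The truncation procedure of steps 162 to 168 can be sketched as follows. The corrective values for MV = 1 and MV = 2 are not given in the text and are placeholders here; the upper bound is passed as a parameter and the lower bound of one sample is an added safeguard, so both are assumptions of this sketch, as are the names used.

```python
def truncation_length(h, alpha, mv, delta, max_len):
    """Shortest length keeping a proportion alpha of the energy, corrected by delta[mv].

    h       : impulse response estimated over pst = len(h) samples
    delta   : corrective term per degree of voicing MV
    max_len : upper bound applied to the corrected length
    """
    pst = len(h)
    eh, acc = [], 0.0
    for v in h:                       # truncated energies Eh(0..pst-1)
        acc += v * v
        eh.append(acc)
    l_alpha = pst                     # step 162
    while l_alpha >= 2 and eh[l_alpha - 2] > alpha * eh[pst - 1]:   # test 164
        l_alpha -= 1                  # step 166
    l_alpha += delta[mv]              # step 168
    return max(1, min(l_alpha, max_len))


# delta[0] and delta[3] come from the text; delta[1] and delta[2] are hypothetical.
DELTA = {0: -5, 1: 0, 2: 5, 3: 7}
response = [1.0, 0.5, 0.1, 0.01]
lh_mv1 = truncation_length(response, 0.98, 1, DELTA, max_len=4)
lh_mv3 = truncation_length(response, 0.98, 3, DELTA, max_len=4)
lh_mv0 = truncation_length(response, 0.98, 0, DELTA, max_len=4)
```

A shorter response means fewer multiply-accumulate operations in every convolution of the closed-loop and stochastic searches, while the voicing-dependent correction keeps more of the response for frames where accuracy matters most.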
  • a third aspect of the invention relates to the stochastic analysis module 40 used to model the unpredictable part of the excitation.
  • the stochastic excitation considered here is of the multi-pulse type.
  • the stochastic excitation relating to a sub-frame is represented by np pulses of positions p(n) and of amplitudes, or gains, g(n) (1 ≤ n ≤ np).
  • the gain g P of long-term prediction can also be calculated during the same process.
  • the excitation sequence relating to a sub-frame comprises nc contributions associated respectively with nc gains.
  • the contributions are vectors of lst samples which, weighted by the associated gains and summed, correspond to the excitation sequence of the short-term synthesis filter.
  • np vectors comprising only zeros except for one pulse of amplitude 1.
  • the vectors F_p(n) are simply constituted by the vector of the impulse response h shifted by p(n) samples. Truncating the impulse response as described above therefore makes it possible to significantly reduce the number of operations needed to calculate the scalar products involving these vectors F_p(n).
  • the gains g_nc−1(i) are the selected gains, and the minimized quadratic error E is equal to the energy of the target vector e_nc−1.
  • the Cholesky decomposition and the inversion of the matrix M_n, however, require divisions and square-root calculations, which are computationally demanding operations.
  • different constraints can be applied to the domain of maximization of the above quantity, included in the interval [0, lst[.
  • the maximization is carried out in step 182 over the set of possible positions, excluding the segments in which the positions p(1), ..., p(n−1) of the pulses were respectively found during the previous iterations.
  • the module 40 proceeds to the calculation 184 of the line n of the matrices L, R and K involved in the decomposition of the matrix B, which makes it possible to complete the matrices L n , R n and K n defined above.
  • the column index j is first initialized at 0, in step 186.
  • the variable tmp is first initialized at the value of component B (n, j), that is:
  • in step 188, the integer k is also initialized to 0.
  • a comparison 190 is then made between the integers k and j. If k < j, we subtract the term L(n, k)·R(j, k) from the variable tmp, then we increment the integer k by one unit (step 192) before re-performing the comparison 190.
  • if j < n, the component R(n, j) is taken equal to tmp and the component L(n, j) to tmp·K(j) in step 196, then the column index j is incremented by one unit before returning to step 188 to calculate the following components.
  • K(n) is taken equal to 1/tmp if tmp ≠ 0 (step 198) and to 0 otherwise.
  • the calculation 184 requires at most one division 198, to obtain K (n).
  • any singularity of the matrix B n does not cause instabilities since we avoid divisions by 0.
  • the inversion 200 then begins with an initialization 202 of the column index j 'at n-1.
  • the term Linv (j ') is initialized to -L (n, j') and the integer k 'to j' + 1.
  • a comparison 206 is then carried out between the integers k ′ and n.
  • the inversion 200 is followed by the calculation 214 of the reoptimized gains and of the target vector E for the following iteration.
  • The computation of the reoptimized gains is also greatly simplified by the decomposition adopted for the matrix B.
  • Indeed, the vector $g_n = (g_n(0), \ldots, g_n(n))$, solution of $g_n \cdot B_n = b_n$, can be computed according to $g_n(n) = \left[ b(n) + \sum_{i=0}^{n-1} b(i)\, L^{-1}(n,i) \right] K(n)$ and $g_n(i') = g_{n-1}(i') + L^{-1}(n,i')\, g_n(n)$ for $0 \le i' < n$.
  • The calculation 214 is detailed in FIG. 11.
  • The term b(n) serves as the initialization value for the variable tmq.
  • The index i is also initialized to 0.
  • The comparison 218 is then carried out between the integers i and n. If i < n, the term b(i)·Linv(i) is added to the variable tmq and i is incremented by one unit (step 220) before returning to the comparison 218.
  • Step 226 also includes incrementing the index i' before returning to the comparison 224.
  • The segmental pulse search significantly decreases the number of pulse positions to be evaluated during steps 182 of the search for the stochastic excitation. It also allows efficient quantization of the positions found.
  • The case ns > np also has the advantage that good robustness to transmission errors can be obtained for the pulse positions, by virtue of separate quantization of the sequence numbers of the occupied segments and of the relative positions of the pulses within each occupied segment.
  • The possible binary words are stored in a quantization table in which the read addresses are the received quantization indexes.
  • The order in this table, determined once and for all, can be optimized so that a transmission error affecting one bit of the index (the most frequent error case, especially when interleaving is used in the channel encoder 22) has, on average, minimal consequences according to a neighborhood criterion.
  • The neighborhood criterion is, for example, that a word of ns bits can be replaced only by "neighboring" words, at a Hamming distance at most equal to a threshold np-2δ, so as to keep all the pulses except δ of them at valid positions in the event of a transmission error affecting a single bit of the index.
  • Other criteria could be used instead or in addition, for example that two words are considered neighbors if replacing one by the other does not change the order of assignment of the gains associated with the pulses.
  • The order of the words in the quantization table can be determined from arithmetic considerations or, if this is insufficient, by simulating error scenarios on a computer (exhaustively or by Monte Carlo statistical sampling, depending on the number of possible error cases).
  • The bit ordering module 46 can place in the minimum-protection category, or in the unprotected category, a number nx of index bits which, if affected by a transmission error, give rise to a wrong word that nevertheless satisfies the neighborhood criterion with a probability considered satisfactory, and place the other bits of the index in a more protected category. This approach calls for a different ordering of the words in the quantization table.
  • This ordering can also be optimized using simulations if the aim is to maximize the number nx of index bits assigned to the least protected category.
  • One possibility is to start by building a list of words of ns bits by counting in Gray code from 0 to 2^ns - 1, and to obtain the ordered quantization table by deleting from this list the words whose Hamming weight is not np.
  • The table thus obtained is such that two consecutive words have a Hamming distance of np-2. If the indexes in this table have a binary representation in Gray code, any error on the least significant bit causes the index to vary by ±1 and therefore causes the replacement of the actual occupancy word by a neighboring word in the sense of the np-2 threshold on the Hamming distance, and an error on the i-th least significant bit also varies the index by ±1 with a probability of approximately 2^(1-i).
  • By placing the nx least significant bits of the Gray-coded index in an unprotected category, a possible transmission error affecting one of these bits leads to the replacement of the occupancy word by a neighboring word with a probability at least equal to (1 + 1/2 + ... + 1/2^(nx-1)) / nx. This minimum probability decreases from 1 to (2/nb)(1 - 1/2^nb) as nx increases from 1 to nb.
  • The errors affecting the nb-nx most significant bits of the index will most often be corrected thanks to the protection applied to them by the channel coder.
  • The value of nx is in this case chosen as a compromise between robustness to errors (small values) and a reduced size of the protected categories (large values).
  • The possible binary words representing the occupancy of the segments are arranged in ascending order in a search table.
  • An indexing table associates with each address the serial number, in the quantization table stored at the decoder, of the binary word having this address in the search table.
  • The content of the search table and of the indexing table is given in Table III (in decimal values).
  • The quantization of the segment occupancy word deduced from the np positions provided by the stochastic analysis module 40 is performed in two stages by the module 44.
  • A binary (dichotomous) search is first performed in the search table to determine the address in this table of the word to be quantized.
  • The quantization index is then read from the indexing table at the address thus determined, and supplied to the bit ordering module 46.
  • The module 44 also performs the quantization of the gains calculated by the module 40.
  • The quantization bits of Gs are placed in a category protected by the channel encoder 22, as are the most significant bits of the quantization indexes of the relative gains.
  • The quantization bits of the relative gains are ordered so as to allow their assignment to the associated pulses belonging to the segments located by the occupancy word. The segmental search according to the invention also allows effective protection of the relative positions of the pulses associated with the largest gain values.
  • To reconstruct the pulse contributions of the excitation, the decoder 54 first locates the segments by means of the received occupancy word; it then assigns the associated gains; it then assigns the relative positions to the pulses based on the order of importance of the gains.
  • The 13 kbit/s speech coder requires on the order of 15 million instructions per second (Mips) in fixed point. The coder will therefore typically be implemented by programming a commercial digital signal processor (DSP), as will the decoder, which requires only about 5 Mips.
  • DSP digital signal processor
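
The row-by-row decomposition (calculation 184), the inversion 200 and the gain reoptimization 214 described above can be illustrated by the following minimal Python sketch. It is not taken from the patent: the function names, the use of nested lists, and the explicit storage of R(n, n) = tmp and L(n, n) = 1 are assumptions made for illustration. The inner update is written as a subtraction, which is what a factorization B = L·Rᵀ with L(n, j) = R(n, j)·K(j) requires, and the inner loop of the inversion follows the standard recurrence for the last row of the inverse of a unit lower-triangular matrix.

```python
def decompose_row(B, L, R, K, n):
    """Fill row n of L and R, and K[n], assuming rows 0..n-1 are already done."""
    for j in range(n + 1):                 # column index j (from step 186)
        tmp = B[n][j]                      # tmp starts at B(n, j)
        for k in range(j):                 # loop of comparison 190 / step 192
            tmp -= L[n][k] * R[j][k]
        if j < n:                          # step 196
            R[n][j] = tmp
            L[n][j] = tmp * K[j]
        else:                              # step 198: the only division
            R[n][n] = tmp                  # assumption: diagonal of R kept
            L[n][n] = 1.0                  # assumption: unit diagonal of L
            K[n] = 1.0 / tmp if tmp != 0 else 0.0


def inverse_row(L, n):
    """Off-diagonal part of row n of the inverse of L (inversion 200)."""
    linv = [0.0] * n
    for j in range(n - 1, -1, -1):         # j' runs from n-1 down to 0
        acc = -L[n][j]                     # Linv(j') initialized to -L(n, j')
        for k in range(j + 1, n):          # k' runs from j'+1 to n-1
            acc -= L[k][j] * linv[k]
        linv[j] = acc
    return linv


def reoptimize_gains(prev_gains, b, linv, K, n):
    """Update the gains after adding contribution n (calculation 214)."""
    tmq = b[n]                             # tmq initialized to b(n)
    for i in range(n):
        tmq += b[i] * linv[i]
    g_n = tmq * K[n]
    return [prev_gains[i] + linv[i] * g_n for i in range(n)] + [g_n]


def solve_incrementally(B, b):
    """Run the three steps for n = 0, 1, ... and return the final gains."""
    size = len(B)
    L = [[0.0] * size for _ in range(size)]
    R = [[0.0] * size for _ in range(size)]
    K = [0.0] * size
    gains = []
    for n in range(size):
        decompose_row(B, L, R, K, n)
        gains = reoptimize_gains(gains, b, inverse_row(L, n), K, n)
    return gains, L, R, K
```

With a symmetric positive-definite B, the gains returned by `solve_incrementally` satisfy B·g = b, and each new row costs a single division. When the pivot is zero, K(n) is set to 0 and the new gain is simply 0.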
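
The construction of the quantization table by Gray-code counting, and the two-stage quantization of the occupancy word (binary search in the search table, then a read in the indexing table), can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the sizes ns = 6 and np = 4 used in the usage note are arbitrary and not the coder's actual values, and the Gray coding of the transmitted index itself is not covered.

```python
from bisect import bisect_left


def gray_ordered_table(ns, np_):
    """Words of ns bits with Hamming weight np_, kept in Gray-code counting order."""
    words = []
    for i in range(1 << ns):
        w = i ^ (i >> 1)                    # binary-reflected Gray code of i
        if bin(w).count("1") == np_:        # keep only words with np_ occupied segments
            words.append(w)
    return words


def build_tables(quant_table):
    """Search table (ascending words) and indexing table (rank in quant_table)."""
    search_table = sorted(quant_table)
    rank = {w: idx for idx, w in enumerate(quant_table)}
    index_table = [rank[w] for w in search_table]
    return search_table, index_table


def quantize(word, search_table, index_table):
    """Two-stage quantization: binary search, then read the indexing table."""
    addr = bisect_left(search_table, word)
    if addr == len(search_table) or search_table[addr] != word:
        raise ValueError("not a valid occupancy word")
    return index_table[addr]
```

For example, `gray_ordered_table(6, 4)` yields the 15 words of 6 bits with four bits set. The decoder only needs the quantization table, since `quant_table[quantize(w, search_table, index_table)]` returns `w`; the encoder keeps the search and indexing tables so that it never has to scan the quantization table linearly.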

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Investigating Or Analysing Materials By The Use Of Chemical Reactions (AREA)

Claims (10)

  1. Analysis-by-synthesis method for coding a digitized speech signal (S) into successive frames divided into lst sub-frames, comprising the following steps:
    linear prediction analysis of the speech signal to determine the parameters of a short-term synthesis filter (60);
    open-loop analysis of the speech signal to detect the voiced frames of the signal and to determine, for each voiced frame, a degree of voicing (MV) of the signal and a search interval for a long-term prediction delay;
    closed-loop predictive analysis of the speech signal to select, for at least some of the sub-frames of the voiced frames, a long-term prediction delay which is contained in the search interval and constitutes a parameter of a long-term synthesis filter (66); and
    determination of a stochastic excitation for each sub-frame so as to minimize a perceptually weighted distance between the speech signal and the stochastic excitation filtered by the long-term and short-term synthesis filters,
    characterized in that, in the open-loop analysis step, the search interval relating to each voiced frame is determined so that it contains a number of delays (N1, N3) which depends on the degree of voicing of the frame.
  2. Method according to claim 1, characterized in that the search interval for the long-term prediction delay contains fewer delays for the frames with the highest degree of voicing than for the other voiced frames.
  3. Method according to claim 1 or 2, characterized in that the open-loop analysis relating to a frame comprises the determination of nst basic delays (Kst), each maximizing an open-loop estimate of the long-term prediction gain over a respective sub-frame of the frame, then the comparison between a first predetermined threshold (S0) and a first open-loop estimate of the long-term prediction gain over the frame, obtained on the basis of the nst basic delays relating to the respective sub-frames, in order to detect whether the frame is voiced; in that, if the frame is detected as voiced, the open-loop analysis further comprises, for each sub-frame, the determination of a list (Ist) of potential delays for which the open-loop estimate of the prediction gain over the sub-frame is greater than a given fraction (β) of the estimate relating to the basic delay for the sub-frame; in that the potential delay for which a second open-loop estimate of the long-term prediction gain over the frame is maximal is selected from these lists, the second open-loop estimate over the frame associated with a potential delay being obtained on the basis of nst optimal delays contained in an interval of N1 delays centered on the potential delay, these optimal delays respectively maximizing, over this interval, the open-loop estimate of the prediction gain over the nst sub-frames; in that the determination of the degree of voicing of the frame includes a comparison between the second maximized estimate of the prediction gain over the frame and at least one further predetermined threshold (S1, S2); and in that the search interval determined at the end of the open-loop analysis is centered on the selected delay.
  4. Method according to claim 1 or 2, characterized in that the open-loop analysis relating to a frame comprises the determination of a basic delay (K) maximizing a first open-loop estimate of the long-term prediction gain over this frame, then the comparison between a first predetermined threshold (S0) and the first maximized estimate of the long-term prediction gain over the frame, in order to detect whether the frame is voiced; in that, if the frame is detected as voiced, the open-loop analysis further comprises the determination of a list (I) of potential delays for which the open-loop estimate of the prediction gain over the frame is greater than a given fraction (β) of the estimate relating to the basic delay; in that the potential delay for which a second open-loop estimate of the long-term prediction gain over the frame is maximal is selected from this list, the second open-loop estimate over the frame associated with a potential delay being obtained on the basis of ns optimal delays contained in an interval of N1 delays centered on this potential delay, these optimal delays respectively maximizing, over the interval, the open-loop estimate of the prediction gain over the nst sub-frames; in that the determination of the degree of voicing of the frame comprises a comparison between the second maximized estimate of the prediction gain over the frame and at least one further predetermined threshold (S1, S2); and in that the search interval determined at the end of the open-loop analysis is centered on the selected delay.
  5. Method according to claim 1 or 2, characterized in that the open-loop analysis relating to a frame comprises the determination of a number nz of basic delays (K1',...,Knz'), each maximizing, over a respective sub-interval of possible delay values, a first open-loop estimate of the long-term prediction gain over this frame, then the comparison between a first predetermined threshold (S0) and the largest of the nz first maximized estimates of the long-term prediction gain over the frame, in order to detect whether the frame is voiced; in that, if the frame is detected as voiced, the potential delay for which a second open-loop estimate of the long-term prediction gain over the frame is maximal is selected from among nz potential delays obtained from the nz basic delays, the second open-loop estimate over the frame associated with a potential delay being obtained on the basis of nst optimal delays contained in an interval of N1 delays centered on this potential delay, these optimal delays respectively maximizing, over this interval, the open-loop estimate of the prediction gain over the ns sub-frames; in that the determination of the degree of voicing of the frame comprises a comparison between the second maximized estimate of the prediction gain over the frame and at least one further predetermined threshold (S1, S2); and in that the search interval determined at the end of the open-loop analysis is centered on this selected delay.
  6. Method according to any one of claims 3 to 5, characterized in that, if the second maximized estimate of the prediction gain over a voiced frame is greater than one of the thresholds (S2), it is determined whether the nst optimal delays are contained in an interval which is centered on the selected delay and contains a number N3 of delays smaller than N1 and, if so, the frame is assigned a degree of voicing for which the search interval for the long-term prediction delay contains N3 delays, the search interval containing N1 delays for at least one other degree of voicing.
  7. Method according to any one of claims 3 to 5, characterized in that, when maximizing the second open-loop estimate of the long-term prediction gain over a voiced frame, a third open-loop estimate of the gain over the frame is further calculated on the basis of nst delays contained in an interval which is centered on the selected delay and contains a number N3 of delays smaller than N1, these delays respectively maximizing, over this interval of N3 delays, the open-loop estimate of the prediction gain over the nst sub-frames; and in that the frame is assigned a degree of voicing for which the search interval contains N3 delays if the third estimate exceeds a predetermined threshold (S2), the search interval containing N1 delays for at least one other degree of voicing.
  8. Method according to claim 3 or 4, characterized in that the potential delays of a list are selected from among the divisors of the basic delay associated with the list and from among the multiples of the smallest of these divisors for which the open-loop estimate of the prediction gain is greater than the given fraction of the estimate relating to the basic delay.
  9. Method according to claim 8, characterized in that the long-term prediction delays may correspond to integer or fractional numbers of samples of the speech signal; in that the basic delays (rbf) are determined in fractional resolution in order to search for the divisors and multiples to be included in a list of potential delays; and in that the basic delays are determined in integer resolution in order to evaluate the first open-loop estimates of the prediction gain over a frame.
  10. Method according to any one of claims 3 to 9, characterized in that the closed-loop predictive analysis is not carried out for any sub-frame for which the autocorrelation (Cst) of the speech signal associated with the optimal delay for that sub-frame is negative.
EP96901008A 1995-01-06 1996-01-03 Verfahren zur sprachkodierung mittels analyse durch synthese Expired - Lifetime EP0801788B1 (de)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
FR9500134 1995-01-06
FR9500134A FR2729246A1 (fr) 1995-01-06 1995-01-06 Procede de codage de parole a analyse par synthese
PCT/FR1996/000004 WO1996021218A1 (fr) 1995-01-06 1996-01-03 Procede de codage de parole a analyse par synthese

Publications (2)

Publication Number Publication Date
EP0801788A1 EP0801788A1 (de) 1997-10-22
EP0801788B1 true EP0801788B1 (de) 1999-06-09

Family

ID=9474931

Family Applications (1)

Application Number Title Priority Date Filing Date
EP96901008A Expired - Lifetime EP0801788B1 (de) 1995-01-06 1996-01-03 Verfahren zur sprachkodierung mittels analyse durch synthese

Country Status (9)

Country Link
US (1) US5974377A (de)
EP (1) EP0801788B1 (de)
CN (1) CN1145143C (de)
AT (1) ATE181170T1 (de)
AU (1) AU704229B2 (de)
CA (1) CA2209384C (de)
DE (1) DE69602822T2 (de)
FR (1) FR2729246A1 (de)
WO (1) WO1996021218A1 (de)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998006091A1 (fr) 1996-08-02 1998-02-12 Matsushita Electric Industrial Co., Ltd. Codec vocal, support sur lequel est enregistre un programme codec vocal, et appareil mobile de telecommunications
JP3166697B2 (ja) * 1998-01-14 2001-05-14 日本電気株式会社 音声符号化・復号装置及びシステム
US6192335B1 (en) * 1998-09-01 2001-02-20 Telefonaktiebolaget Lm Ericsson (Publ) Adaptive combining of multi-mode coding for voiced speech and noise-like signals
FI116992B (fi) * 1999-07-05 2006-04-28 Nokia Corp Menetelmät, järjestelmä ja laitteet audiosignaalin koodauksen ja siirron tehostamiseksi
US7272553B1 (en) * 1999-09-08 2007-09-18 8X8, Inc. Varying pulse amplitude multi-pulse analysis speech processor and method
JP3372908B2 (ja) * 1999-09-17 2003-02-04 エヌイーシーマイクロシステム株式会社 マルチパルス探索処理方法と音声符号化装置
KR100324204B1 (ko) * 1999-12-24 2002-02-16 오길록 예측분할벡터양자화 및 예측분할행렬양자화 방식에 의한선스펙트럼쌍 양자화기의 고속탐색방법
US6999509B2 (en) * 2001-08-08 2006-02-14 Octasic Inc. Method and apparatus for generating a set of filter coefficients for a time updated adaptive filter
US6957240B2 (en) * 2001-08-08 2005-10-18 Octasic Inc. Method and apparatus for providing an error characterization estimate of an impulse response derived using least squares
US6970896B2 (en) 2001-08-08 2005-11-29 Octasic Inc. Method and apparatus for generating a set of filter coefficients
US6965640B2 (en) * 2001-08-08 2005-11-15 Octasic Inc. Method and apparatus for generating a set of filter coefficients providing adaptive noise reduction
CA2365203A1 (en) * 2001-12-14 2003-06-14 Voiceage Corporation A signal modification method for efficient coding of speech signals
US7720231B2 (en) * 2003-09-29 2010-05-18 Koninklijke Philips Electronics N.V. Encoding audio signals
US7792670B2 (en) 2003-12-19 2010-09-07 Motorola, Inc. Method and apparatus for speech coding
US8329884B2 (en) 2004-12-17 2012-12-11 Roche Molecular Systems, Inc. Reagents and methods for detecting Neisseria gonorrhoeae
CN101320565B (zh) * 2007-06-08 2011-05-11 华为技术有限公司 感知加权滤波方法及感知加权滤波器
US9626982B2 (en) * 2011-02-15 2017-04-18 Voiceage Corporation Device and method for quantizing the gains of the adaptive and fixed contributions of the excitation in a CELP codec
FR2987931A1 (fr) * 2012-03-12 2013-09-13 France Telecom Modification des caracteristiques spectrales d'un filtre de prediction lineaire d'un signal audionumerique represente par ses coefficients lsf ou isf.
PL3011557T3 (pl) 2013-06-21 2017-10-31 Fraunhofer Ges Forschung Urządzenie i sposób do udoskonalonego stopniowego zmniejszania sygnału w przełączanych układach kodowania sygnału audio podczas ukrywania błędów
CN107452390B (zh) 2014-04-29 2021-10-26 华为技术有限公司 音频编码方法及相关装置

Family Cites Families (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NL8302985A (nl) * 1983-08-26 1985-03-18 Philips Nv Multipulse excitatie lineair predictieve spraakcodeerder.
CA1223365A (en) * 1984-02-02 1987-06-23 Shigeru Ono Method and apparatus for speech coding
NL8500843A (nl) * 1985-03-22 1986-10-16 Koninkl Philips Electronics Nv Multipuls-excitatie lineair-predictieve spraakcoder.
US4868867A (en) * 1987-04-06 1989-09-19 Voicecraft Inc. Vector excitation speech or audio coder for transmission or storage
US4802171A (en) * 1987-06-04 1989-01-31 Motorola, Inc. Method for error correction in digitally encoded speech
US4831624A (en) * 1987-06-04 1989-05-16 Motorola, Inc. Error detection method for sub-band coding
CA1337217C (en) * 1987-08-28 1995-10-03 Daniel Kenneth Freeman Speech coding
US5359696A (en) * 1988-06-28 1994-10-25 Motorola Inc. Digital speech coder having improved sub-sample resolution long-term predictor
SE463691B (sv) * 1989-05-11 1991-01-07 Ericsson Telefon Ab L M Foerfarande att utplacera excitationspulser foer en lineaerprediktiv kodare (lpc) som arbetar enligt multipulsprincipen
US5060269A (en) * 1989-05-18 1991-10-22 General Electric Company Hybrid switched multi-pulse/stochastic speech coding technique
US5097508A (en) * 1989-08-31 1992-03-17 Codex Corporation Digital speech coder having improved long term lag parameter determination
JP3268360B2 (ja) * 1989-09-01 2002-03-25 モトローラ・インコーポレイテッド 改良されたロングターム予測器を有するデジタル音声コーダ
EP0570362B1 (de) * 1989-10-17 1999-03-17 Motorola, Inc. Digitaler sprachdekodierer unter verwendung einer nachfilterung mit einer reduzierten spektralverzerrung
US5073940A (en) * 1989-11-24 1991-12-17 General Electric Company Method for protecting multi-pulse coders from fading and random pattern bit errors
US5307441A (en) * 1989-11-29 1994-04-26 Comsat Corporation Wear-toll quality 4.8 kbps speech codec
US5097507A (en) * 1989-12-22 1992-03-17 General Electric Company Fading bit error protection for digital cellular multi-pulse speech coder
US5265219A (en) * 1990-06-07 1993-11-23 Motorola, Inc. Speech encoder using a soft interpolation decision for spectral parameters
JPH04264597A (ja) * 1991-02-20 1992-09-21 Fujitsu Ltd 音声符号化装置および音声復号装置
FI98104C (fi) * 1991-05-20 1997-04-10 Nokia Mobile Phones Ltd Menetelmä herätevektorin generoimiseksi ja digitaalinen puhekooderi
EP0588932B1 (de) * 1991-06-11 2001-11-14 QUALCOMM Incorporated Vocoder mit veraendlicher bitrate
EP0556354B1 (de) * 1991-09-05 2001-10-31 Motorola, Inc. Fehlerschutz für vielfachmodensprachkodierer
US5253269A (en) * 1991-09-05 1993-10-12 Motorola, Inc. Delta-coded lag information for use in a speech coder
TW224191B (de) * 1992-01-28 1994-05-21 Qualcomm Inc
US5495555A (en) * 1992-06-01 1996-02-27 Hughes Aircraft Company High quality low bit rate celp-based speech codec
US5317595A (en) * 1992-06-30 1994-05-31 Nokia Mobile Phones Ltd. Rapidly adaptable channel equalizer
US5717824A (en) * 1992-08-07 1998-02-10 Pacific Communication Sciences, Inc. Adaptive speech coder having code excited linear predictor with multiple codebook searches
FI95086C (fi) * 1992-11-26 1995-12-11 Nokia Mobile Phones Ltd Menetelmä puhesignaalin tehokkaaksi koodaamiseksi
FR2702590B1 (fr) * 1993-03-12 1995-04-28 Dominique Massaloux Dispositif de codage et de décodage numériques de la parole, procédé d'exploration d'un dictionnaire pseudo-logarithmique de délais LTP, et procédé d'analyse LTP.
IT1264766B1 (it) * 1993-04-09 1996-10-04 Sip Codificatore della voce utilizzante tecniche di analisi con un'eccitazione a impulsi.
IT1270438B (it) * 1993-06-10 1997-05-05 Sip Procedimento e dispositivo per la determinazione del periodo del tono fondamentale e la classificazione del segnale vocale in codificatori numerici della voce
US5784532A (en) * 1994-02-16 1998-07-21 Qualcomm Incorporated Application specific integrated circuit (ASIC) for performing rapid speech compression in a mobile telephone system
US5751903A (en) * 1994-12-19 1998-05-12 Hughes Electronics Low rate multi-mode CELP codec that encodes line SPECTRAL frequencies utilizing an offset
FR2729245B1 (fr) * 1995-01-06 1997-04-11 Lamblin Claude Procede de codage de parole a prediction lineaire et excitation par codes algebriques
FR2734389B1 (fr) * 1995-05-17 1997-07-18 Proust Stephane Procede d'adaptation du niveau de masquage du bruit dans un codeur de parole a analyse par synthese utilisant un filtre de ponderation perceptuelle a court terme
US5699485A (en) * 1995-06-07 1997-12-16 Lucent Technologies Inc. Pitch delay modification during frame erasures
US5664055A (en) * 1995-06-07 1997-09-02 Lucent Technologies Inc. CS-ACELP speech compression system with adaptive pitch prediction filter gain based on a measure of periodicity
US5732389A (en) * 1995-06-07 1998-03-24 Lucent Technologies Inc. Voiced/unvoiced classification of speech for excitation codebook selection in celp speech decoding during frame erasures
US5710863A (en) * 1995-09-19 1998-01-20 Chen; Juin-Hwey Speech signal quantization using human auditory models in predictive coding systems
US5790759A (en) * 1995-09-19 1998-08-04 Lucent Technologies Inc. Perceptual noise masking measure based on synthesis filter frequency response
JP3680380B2 (ja) * 1995-10-26 2005-08-10 ソニー株式会社 音声符号化方法及び装置
JP4005154B2 (ja) * 1995-10-26 2007-11-07 ソニー株式会社 音声復号化方法及び装置
FR2742568B1 (fr) * 1995-12-15 1998-02-13 Catherine Quinquis Procede d'analyse par prediction lineaire d'un signal audiofrequence, et procedes de codage et de decodage d'un signal audiofrequence en comportant application
US5729694A (en) * 1996-02-06 1998-03-17 The Regents Of The University Of California Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
US5708757A (en) * 1996-04-22 1998-01-13 France Telecom Method of determining parameters of a pitch synthesis filter in a speech coder, and speech coder implementing such method

Also Published As

Publication number Publication date
CA2209384A1 (en) 1996-07-11
WO1996021218A1 (fr) 1996-07-11
CA2209384C (en) 2001-05-29
FR2729246B1 (de) 1997-03-07
CN1145143C (zh) 2004-04-07
ATE181170T1 (de) 1999-06-15
US5974377A (en) 1999-10-26
FR2729246A1 (fr) 1996-07-12
DE69602822T2 (de) 1999-12-23
EP0801788A1 (de) 1997-10-22
DE69602822D1 (de) 1999-07-15
CN1173939A (zh) 1998-02-18
AU4490196A (en) 1996-07-24
AU704229B2 (en) 1999-04-15

Similar Documents

Publication Publication Date Title
EP0801790B1 (de) Verfahren zur sprachkodierung mittels analyse durch synthese
EP0801788B1 (de) Verfahren zur sprachkodierung mittels analyse durch synthese
EP0801789B1 (de) Verfahren zur sprachkodierung mittels analyse durch synthese
EP1994531B1 (de) Verbesserte celp kodierung oder dekodierung eines digitalen audiosignals
US8401843B2 (en) Method and device for coding transition frames in speech signals
EP1576585B1 (de) Verfahren und vorrichtung zur robusten prädiktiven vektorquantisierung von parametern der linearen prädiktion in variabler bitraten-kodierung
EP1692689B1 (de) Optimiertes mehrfach-codierungsverfahren
FR2734389A1 (fr) Procede d'adaptation du niveau de masquage du bruit dans un codeur de parole a analyse par synthese utilisant un filtre de ponderation perceptuelle a court terme
EP2080194B1 (de) Dämpfung von stimmüberlagerung, im besonderen zur erregungserzeugung bei einem decoder in abwesenheit von informationen
EP0490740A1 (de) Verfahren und Einrichtung zum Bestimmen der Sprachgrundfrequenz in Vocodern mit sehr niedriger Datenrate
FR2880724A1 (fr) Procede et dispositif de codage optimise entre deux modeles de prediction a long terme
EP0616315A1 (de) Vorrichtung zur digitalen Sprachkodierung und -dekodierung, Verfahren zum Durchsuchen eines pseudologarithmischen LTP-Verzögerungskodebuchs und Verfahren zur LTP-Analyse
WO2002029786A1 (fr) Procede et dispositif de codage segmental d'un signal audio
Jung et al. Efficient implementation of ITU-T G. 723.1 speech coder for multichannel voice transmission and storage
EP1194923B1 (de) Verfahren und system für audio analyse und synthese
EP1192618B1 (de) Audiokodierung mit adaptiver lifterung
EP1192621B1 (de) Audiokodierung mit harmonischen komponenten
JP2001100799A (ja) 音声符号化装置、音声符号化方法および音声符号化アルゴリズムを記録したコンピュータ読み取り可能な記録媒体
JP2000330594A (ja) 音声符号化装置及び方法並びに音声符号化プログラムを記録した記憶媒体
WO2001003118A1 (fr) Codage et decodage audio par interpolation
FR2980620A1 (fr) Traitement d'amelioration de la qualite des signaux audiofrequences decodes
WO2001003119A1 (fr) Codage et decodage audio incluant des composantes non harmoniques du signal
JPH0675597A (ja) 音声符号化装置

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19970725

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH DE GB IT LI LU NL SE

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: MATRA NORTEL COMMUNICATIONS

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

17Q First examination report despatched

Effective date: 19981029

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE CH DE GB IT LI LU NL SE

REF Corresponds to:

Ref document number: 181170

Country of ref document: AT

Date of ref document: 19990615

Kind code of ref document: T

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: CH

Ref legal event code: NV

Representative's name: KELLER & PARTNER PATENTANWAELTE AG

GBT Gb: translation of ep patent filed (gb section 77(6)(a)/1977)

Effective date: 19990622

REF Corresponds to:

Ref document number: 69602822

Country of ref document: DE

Date of ref document: 19990715

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20000103

Ref country code: AT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20000103

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20000131

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20000131

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20000131

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

BERE Be: lapsed

Owner name: MATRA NORTEL COMMUNICATIONS

Effective date: 20000131

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20000801

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

NLV4 Nl: lapsed or annulled due to non-payment of the annual fee

Effective date: 20000801

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: SE

Payment date: 20001218

Year of fee payment: 6

REG Reference to a national code

Ref country code: GB

Ref legal event code: IF02

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20020104

EUG Se: european patent has lapsed

Ref document number: 96901008.1

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20040130

Year of fee payment: 9

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20041210

Year of fee payment: 10

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED.

Effective date: 20050103

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20050802

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20060103

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20060103