EP0685833B1 - Method for speech coding using linear prediction - Google Patents

Method for speech coding using linear prediction

Info

Publication number
EP0685833B1
EP0685833B1 (application EP95401262A)
Authority
EP
European Patent Office
Prior art keywords
signal
state
short
quantization
determined
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP95401262A
Other languages
English (en)
French (fr)
Other versions
EP0685833A1 (de)
Inventor
Sophie Scott
William Navarro
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nortel Networks France SAS
Original Assignee
Matra Nortel Communications SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matra Nortel Communications SAS filed Critical Matra Nortel Communications SAS
Publication of EP0685833A1 publication Critical patent/EP0685833A1/de
Application granted granted Critical
Publication of EP0685833B1 publication Critical patent/EP0685833B1/de
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032Quantisation or dequantisation of spectral components
    • G10L19/038Vector quantisation, e.g. TwinVQ audio

Definitions

  • The present invention relates to a linear prediction speech coding method, in which a speech signal digitized in successive frames is subjected to analysis by synthesis to obtain, for each frame, quantization values of synthesis parameters used to reconstruct an estimate of the speech signal, the analysis by synthesis including a short-term linear prediction of the speech signal to determine the coefficients of a short-term synthesis filter.
  • Low bit rate speech coders (typically 5 kbit/s for a sampling frequency of 8 kHz) give their best performance on signals presenting a "telephone" spectrum, that is to say confined to the 300-3400 Hz band and with a pre-emphasis at high frequencies.
  • Such a spectrum corresponds to the IRS (Intermediate Reference System) template.
  • This template has been defined for telephone handsets, both on the input side (microphone) and on the output side (earphone).
  • In other configurations, the speech encoder input signal has a flatter spectrum, for example when a hands-free installation employing a microphone with a linear frequency response is used.
  • The usual vocoders are designed to be independent of the input with which they operate, and they are not informed of the characteristics of this input. If microphones with different characteristics are likely to be connected to the vocoder, or more generally if the vocoder is likely to receive acoustic signals with different spectral characteristics, then there are cases where the vocoder is used sub-optimally.
  • A main purpose of this invention is to improve the performance of a vocoder by making it less dependent on the spectral characteristics of the signal intended for it.
  • The invention provides a speech coding method of the type indicated at the beginning, in which a spectral state of the speech signal is determined from among first and second states such that the signal contains proportionately less energy at low frequencies in the first state than in the second state, and one or the other of two quantization modes is applied to obtain the quantization values of the coefficients of the short-term synthesis filter according to the determined spectral state of the speech signal.
  • The detection of the spectral state makes it possible to adapt the encoder to the characteristics of the input signal.
  • The performance of the encoder can thus be improved or, at identical performance, the number of bits necessary for coding can be reduced.
  • The coefficients of the short-term synthesis filter are represented by a set of p ordered line spectrum frequency parameters, the "LSP parameters", p being the order of the linear prediction.
  • The distribution of these p LSP parameters can be analyzed to give information about the spectral state of the signal and to contribute to the detection of this state.
  • LSP parameters can be scalar or vector quantized.
  • In scalar quantization, the i-th LSP parameter is quantized by subdividing a variation interval, included in a respective reference interval, into 2^Ni segments, Ni being the number of coding bits devoted to the quantization of this parameter.
  • A first possibility is to use, at least for the first ordered LSP parameters, reference intervals each chosen from two distinct intervals according to the determined spectral state of the speech signal.
  • An additional possibility is to give at least some of the numbers of coding bits Ni one or the other of two distinct values according to the determined spectral state of the speech signal, in order to effect a dynamic allocation of bits.
  • In differential vector quantization, the set of p ordered LSP parameters is subdivided into m groups of consecutive parameters and, at least for the first group, a differential quantization can be performed with respect to a mean vector chosen from two distinct vectors according to the determined spectral state of the speech signal.
  • The speech coder illustrated in FIG. 1A rests on the principle of analysis by synthesis. Its general organization is conventional, except for the short-term prediction unit 8 and the spectral state detection unit 20.
  • the speech coder processes the amplified output signal from a microphone 5.
  • a low-pass filter 6 eliminates the frequency components of this signal above the upper limit (for example 4000 Hz) of the bandwidth processed by the coder .
  • the signal is then digitized by the analog-digital converter 7 which delivers the input signal S I in the form of successive frames of 10 to 30 ms consisting of samples taken at a rate of 8000 Hz for example.
  • The coefficients ai of this filter (1 ≤ i ≤ p) can be obtained by short-term linear prediction of the input signal, the number p designating the order of the linear prediction, which is typically equal to 10 for narrowband speech.
  • The short-term prediction unit 8 determines estimates âi of the coefficients ai, which correspond to a quantization of these coefficients by quantization values q(ai).
  • Each input signal frame SI is first subjected to the inverse filter 9 of transfer function A(z), then to a filter 10 of transfer function 1/A(z/γ), where γ denotes a predefined factor generally between 0.8 and 0.9.
  • The combined filter thus constituted, of transfer function W(z) = A(z)/A(z/γ), is a perceptual weighting filter for the residual error of the coder.
  • The coefficients used in the filters 9 and 10 are the estimates âi supplied by the short-term prediction unit 8.
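The denominator A(z/γ) of the weighting filter is obtained from A(z) by a simple scaling of its coefficients. A minimal sketch of this step (function name and the test values are illustrative, not from the patent):

```python
import numpy as np

def bandwidth_expand(a, gamma=0.85):
    """Coefficients of A(z/gamma) from those of A(z): a_i -> a_i * gamma**i.
    Filtering by A(z) followed by 1/A(z/gamma) then realizes the
    perceptual weighting W(z) = A(z)/A(z/gamma)."""
    return a * gamma ** np.arange(len(a))
```

With γ between 0.8 and 0.9 as stated in the text, the poles of 1/A(z/γ) are pulled toward the origin, which widens the formant bandwidths of the weighting filter.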
  • The output R1 of the inverse filter 9 has a long-term periodicity corresponding to the pitch of the speech.
  • The signal R1 is subjected to an inverse filter 11 of transfer function B(z), whose output R2 is supplied to the input of the filter 10.
  • The output SW of the filter 10 thus corresponds to the input signal SI cleared of its long-term correlation by the filter 11 of transfer function B(z), and weighted perceptually by the filters 9, 10 of combined transfer function W(z).
  • The filter 11 includes a subtractor whose positive input receives the signal R1 and whose negative input receives a long-term estimate obtained by delaying the signal R1 by T samples and amplifying it.
  • The signal R1 as well as the long-term estimate are provided to a unit 13 which maximizes the correlation between these two signals to determine the optimal delay T and gain b.
  • Unit 13 explores all the integer and/or fractional values of the delay T between two bounds to select the one which maximizes the normalized correlation.
  • The gain b is deduced from the value of T and is quantized by discretization, which leads to a quantization value q(b); the quantized value b̂ corresponding to this quantization value q(b) is the one supplied as the gain of the amplifier of the filter 11.
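The search performed by unit 13 can be sketched as follows. This is a minimal open-loop version restricted to integer delays; the delay bounds and the function name are illustrative assumptions, not values given in the patent:

```python
import numpy as np

def pitch_search(r1, t_min=20, t_max=143):
    """Find the delay T maximizing the normalized correlation of r1 with
    its delayed version, then deduce the long-term gain b from T."""
    best_t, best_score = t_min, -np.inf
    for t in range(t_min, min(t_max, len(r1) - 1) + 1):
        past = r1[:-t]                    # r1 delayed by t samples
        cur = r1[t:]
        energy = np.dot(past, past)
        if energy <= 0.0:
            continue
        corr = np.dot(cur, past)
        score = corr * corr / energy      # normalized correlation criterion
        if score > best_score:
            best_t, best_score = t, score
    past = r1[:-best_t]
    b = np.dot(r1[best_t:], past) / np.dot(past, past)   # optimal gain
    return best_t, b
```

For a signal that is exactly periodic with period 40 samples, the search returns T = 40 and a gain b of 1.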
  • Speech synthesis in the coder is carried out in a closed loop comprising an excitation generator 12, a filter 14 having the same transfer function as the filter 10, a correlator 15, and a maximization unit 19 of the normalized correlation.
  • The nature of the excitation generator 12 makes it possible to distinguish between different types of analysis-by-synthesis coders, according to the form of the excitation.
  • MPLPC coders combine linear prediction analysis with a multi-pulse excitation.
  • CELP coders combine linear prediction analysis with a vector excitation.
  • The applicant has used a regular pulse sequence excitation, or RPCELP, as described in its European patent application No. 0 347 307.
  • The excitation is represented by an input address k in an excitation vector dictionary, and by an associated gain G.
  • The selected and amplified excitation vector is subjected to the filter 14 of transfer function 1/A(z/γ), whose coefficients âi (1 ≤ i ≤ p) are provided by the short-term prediction unit 8.
  • The resulting signal SW* is supplied to an input of the correlator 15, the other input of which receives the output signal SW of the filter 10.
  • The output of the correlator 15 is the normalized correlation, which is maximized by the unit 19; this amounts to minimizing the coding error.
  • The unit 19 selects the address k and the gain G of the excitation generator which maximize the correlation delivered by the correlator 15.
  • The maximization consists in determining the optimal address k, the gain G being deduced from k.
  • The unit 19 quantizes the digital value of the gain G by discretization, which leads to a quantization value q(G).
  • The quantized value Ĝ corresponding to this quantization value q(G) is the one supplied as the gain of the amplifier of the excitation generator 12.
  • The excitation vector selected in the dictionary of the generator 12, the associated gain G, the parameters b and T of the long-term filter 13, and the coefficients ai of the short-term prediction filter, to which is added a state bit Y described later, constitute the synthesis parameters. The quantization values k, q(G), q(b), T, q(ai), Y are transmitted to the receiver in order to reconstruct an estimate of the speech signal SI. These quantization values are combined on the same channel by the multiplexer 21 for transmission.
  • The associated decoder illustrated in FIG. 1B comprises a unit 50 which restores the quantized values k, Ĝ, T, b̂ and âi on the basis of the quantization values received.
  • An excitation generator 52 identical to the generator 12 of the encoder receives the quantized values of the parameters k and G.
  • The output R̂2 of the generator 52 (which is an estimate of R2) is subjected to the long-term prediction filter 53 of transfer function 1/B(z), whose coefficients are the quantized values of the parameters T and b.
  • The output R̂1 of the filter 53 (which is an estimate of R1) is subjected to the short-term prediction filter 54 of transfer function 1/A(z), whose coefficients are the quantized values of the parameters ai.
  • The resulting signal Ŝ is the estimate of the input signal SI of the coder.
  • FIG. 2 shows an example of the constitution of the short-term prediction unit 8 of the coder.
  • The modeling coefficients ai are calculated for each frame, for example by the autocorrelation method.
  • Block 40 calculates the autocorrelations R(j) = Σ s(n)·s(n-j) for 0 ≤ j ≤ p, the sum being taken over the current frame, n denoting the index of a sample and L the number of samples per frame.
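The computations of blocks 40 and 41 can be sketched as follows: the autocorrelations of the frame, then the prediction coefficients obtained from them by the standard Levinson-Durbin recursion (the patent does not name the recursion used in block 41; Levinson-Durbin is the usual choice for the autocorrelation method):

```python
import numpy as np

def autocorr(frame, p):
    """R(j) = sum_n frame[n]*frame[n-j] for 0 <= j <= p (block 40)."""
    L = len(frame)
    return np.array([np.dot(frame[j:], frame[:L - j]) for j in range(p + 1)])

def levinson(r):
    """Levinson-Durbin recursion (block 41): polynomial coefficients of
    A(z) from the autocorrelations r[0..p]; a[0] = 1 by convention."""
    p = len(r) - 1
    a = np.zeros(p + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, p + 1):
        # reflection coefficient for order i
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]
        a[i] = k
        err *= (1.0 - k * k)
    return a
```

For autocorrelations of the form r[j] = ρ^j (a first-order process), the recursion yields A(z) = 1 - ρ·z⁻¹, as expected.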
  • the representation parameters thus obtained are quantized to reduce the number of bits necessary for their identification.
  • In FIG. 3, the two solid lines correspond to the limits of the IRS template, defined for microphones in CCITT Recommendation P.48.
  • an IRS type microphone signal has a strong attenuation in the lower part of the spectrum (between 0 and 300 Hz) and a relative emphasis in the high frequencies.
  • A linear type signal, provided for example by the microphone of a hands-free system, presents a flatter spectrum, notably without the strong attenuation at low frequencies (a typical example of such a linear type signal is illustrated by a dashed line on the diagram in Figure 3).
  • the detection device 20 comprises a high-pass filter 16 receiving the input acoustic signal S I and delivering a filtered signal S I '.
  • the filter 16 is typically a digital filter of the bi-quad type having an abrupt cutoff at 400 Hz.
  • the energies E1 and E2 contained in each frame of the acoustic input signal S I and of the filtered signal S I ' are calculated by two units 17, 18 each carrying out the sum of the squares of the samples of each frame which it receives.
  • the energy E1 of each frame of the input signal S I is sent to the input of a threshold comparator 25 which delivers a bit Z of value 0 when the energy E1 is less than a predetermined energy threshold, and of value 1 when the energy E1 is greater than the threshold.
  • the energy threshold is typically of the order of -38 dB relative to the signal saturation energy.
  • the comparator 25 serves to inhibit the determination of the state of the signal when the latter contains too little energy to be representative of the characteristics of the source. In this case, the determined state of the signal remains unchanged.
  • the energies E1 and E2 are sent to a digital divider 26 which calculates the ratio E2 / E1 for each frame.
  • This E2 / E1 ratio is sent to another threshold comparator 27 which delivers a bit X of value 0 when the E2 / E1 ratio is greater than a predetermined threshold, and of value 1 when the E2 / E1 ratio is less than the threshold.
  • This threshold on the E2 / E1 ratio is typically of the order of 0.93.
  • Bit X is representative of a signal condition on each frame.
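The per-frame computation of the bits Z and X (units 17, 18 and comparators 25, 27) can be sketched as follows. The thresholds 0.93 and the energy gate come from the text; the patent's 400 Hz bi-quad filter coefficients are not given, so a simple first-difference high-pass stands in for it here, and the function names are illustrative:

```python
import numpy as np

def condition_bits(frame, hp_filter, e1_threshold, ratio_threshold=0.93):
    """Compute Z (enough energy to decide) and X (spectral condition)
    for one frame of the input signal S_I."""
    s_hp = hp_filter(frame)
    e1 = np.dot(frame, frame)             # unit 17: energy of S_I
    e2 = np.dot(s_hp, s_hp)               # unit 18: energy of S_I'
    z = 1 if e1 > e1_threshold else 0     # comparator 25
    # comparator 27: X = 0 (IRS-like) when E2/E1 exceeds the threshold
    x = 0 if e2 / max(e1, 1e-12) > ratio_threshold else 1
    return z, x

# crude stand-in for the 400 Hz high-pass bi-quad of filter 16
first_difference = lambda s: np.diff(s, prepend=s[0])
```

A frame dominated by high frequencies keeps most of its energy after high-pass filtering (X = 0), while a low-frequency frame loses it (X = 1).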
  • The state bit Y is not taken directly equal to the condition bit X: it results from a processing of successive condition bits X by a state determination circuit 29, which modifies the determined state Y only after several successive frames show a signal condition X different from that corresponding to the previously determined state.
  • the operation of the state determination circuit 29 is illustrated in FIG. 5, where the upper timing diagram illustrates an example of evolution of the bit X supplied by the comparator 27.
  • The status bit Y (lower timing diagram) is initialized to 0, because IRS characteristics are the most frequently encountered.
  • The circuit 29 maintains a counting variable V. As soon as the variable V reaches a predetermined threshold (8 in the example considered), it is reset to 0 and the value of the bit Y is changed, so that it is determined that the signal has changed state.
  • In the example of FIG. 5, the signal is in state YA up to frame M, in state YB between frames M and N (change of signal source), then again in state YA from frame N.
  • the above counting mode can for example be obtained by circuit 29 shown in FIG. 4.
  • This circuit includes a four-bit counter 32, whose most significant bit corresponds to the status bit Y and whose three least significant bits represent the counting variable V.
  • The bits X and Y are supplied to the inputs of an EXCLUSIVE-OR gate 33, whose output is sent to the incrementing input of the counter 32 via an AND gate 34, the other input of which receives the bit Z supplied by the threshold comparator 25.
  • The inverted output of the gate 33 is supplied to a decrementing input of the counter 32 via another AND gate 35, whose other two inputs respectively receive the bit Z provided by the comparator 25 and the output of a three-input OR gate 36 receiving the three least significant bits of the counter 32.
  • The counter 32 is arranged to pass the pulses received on its decrementing input when its least significant bit is 0 or when at least one of the next two bits is 1, as shown by the OR gate 37 in FIG. 4.
  • When the bit Z is 0, the determination circuit 29 is not activated, because the AND gates 34, 35 prevent the value of the counter 32 from being changed.
  • The status bit Y thus determined is supplied to the short-term linear prediction unit 8 in order to choose the quantization mode for the coefficients of the short-term synthesis filter.
  • The parameters used to represent the coefficients ai of the short-term synthesis filter are the line spectrum frequencies (LSF), or line spectrum pairs (LSP). These parameters are known to have good statistical properties and to easily ensure the stability of the synthesis filter (see N. Sugamura and F. Itakura: "Speech Analysis And Synthesis Methods Developed At ECL In NTT: From LPC To LSP", Speech Communication, North-Holland, Vol. 5, No. 2, 1986, pp. 199-215).
  • The LSP parameters are calculated by block 42 from the prediction coefficients ai obtained by block 41, by means of Chebyshev polynomials (see P. Kabal and R.P. Ramachandran: "The Computation Of Line Spectral Frequencies Using Chebyshev Polynomials", IEEE Trans. ASSP, Vol. 34, No. 6, 1986, pp. 1419-1426). They can also be obtained directly from the autocorrelations of the signal, by the split Levinson algorithm (see P. Delsarte and Y. Genin: "The Split Levinson Algorithm", IEEE Trans. ASSP, Vol. 34, No. 3, 1986).
  • Block 43 quantizes the LSF frequencies, or more precisely the values cos2πfi, hereinafter called LSP parameters, which are comprised between -1 and +1, which simplifies the dynamic range problems.
  • The LSF frequency calculation method makes it possible to obtain them in order of increasing frequency, that is to say of decreasing cosine.
  • In a vector quantization embodiment, m = 3 independent vector quantizations are performed, of dimensions 3, 3 and 4 respectively, defining the LSP groups I (1,2,3), II (4,5,6) and III (7,8,9,10).
  • Each group is quantized by selecting, from a respective prerecorded quantization table, the vector presenting the minimal Euclidean distance to the parameters of this group.
  • For group I, two disjoint quantization tables T I,1 and T I,2 are defined, with respective sizes 2^n1 and 2^n2.
  • For group II, two quantization tables T II,1 and T II,2 of respective sizes 2^p1 and 2^p2, having a common part, are defined to reduce the necessary memory space.
  • For group III, a single quantization table T III of size 2^q is defined. The addresses AD I, AD II, AD III of the three vectors selected in the quantization tables of the three groups are the quantization values q(ai) of the coefficients of the short-term synthesis filter, which are sent to the multiplexer 21.
  • When the determined state Y indicates a linear type signal, block 43 selects the tables T I,2 and T II,2, whose statistics are established to be representative of a linear type input signal.
  • The table T III is used in all cases, since the upper part of the spectrum is less sensitive to the differences between the IRS and linear characteristics.
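The table selection performed by block 43 for group I can be sketched as follows. The tiny two-entry tables below are purely hypothetical stand-ins (real tables hold 2^n1 and 2^n2 entries trained on IRS and linear speech statistics):

```python
import numpy as np

def vq(group, table):
    """Index of the table vector with minimal Euclidean distance to the
    LSP group: the quantization address AD."""
    d = np.sum((table - group) ** 2, axis=1)
    return int(np.argmin(d))

# hypothetical miniature tables for group I (illustrative values only)
T_I_1 = np.array([[0.90, 0.70, 0.50], [0.80, 0.60, 0.40]])   # IRS statistics
T_I_2 = np.array([[0.95, 0.80, 0.60], [0.85, 0.65, 0.45]])   # linear statistics

def quantize_group_I(group, y):
    """Quantize group I, the table family being selected by state bit Y."""
    table = T_I_2 if y else T_I_1
    ad = vq(group, table)
    return ad, table[ad]
```

The decoder holds the same tables and applies the same selection from the received bit Y, so the address AD alone identifies the quantized vector.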
  • the status bit Y is also supplied to the multiplexer 21.
  • a unit 45 calculates the estimates â i from the discretized values of the LSP parameters given by the three vectors selected.
  • the estimates â i thus obtained are supplied by the unit 45 to the short-term filters 9, 10 and 14 of the coder.
  • the same calculation is carried out by the restitution unit 50, the quantized cosine vectors being found from the quantization addresses AD I , AD II and AD III .
  • the decoder contains the same quantization tables as the coder, and their selection is made as a function of the status bit Y received.
  • The use of two families of quantization tables selected according to the spectral state Y has the advantage of providing better efficiency in terms of the number of coding bits required. Indeed, the total number of bits used for the quantization of the LSP parameters in each case is, for equal performance, less than the number of bits necessary when only one family of tables is used regardless of the detection of the spectral state.
  • block 43 can be arranged to perform differential vector quantization.
  • Each group of parameters I, II, III is then quantified differentially with respect to an average vector.
  • For group I, two distinct mean vectors V I,1 and V I,2 and a difference quantization table TD I are defined.
  • For group II, two distinct mean vectors V II,1 and V II,2 and a difference quantization table TD II are defined.
  • For group III, a single mean vector V III and a difference quantization table TD III are defined.
  • the average vectors V I, 1 and V II, 1 are established to be representative of an IRS type signal statistic, while the average vectors V I, 2 and V II, 2 are established to be representative of a statistic of linear type signals.
  • the advantage of this differential quantization is that it makes it possible to store, in the coder and in the decoder, only one quantization table per group.
  • the quantization values q (a i ) are the addresses of the three optimal difference vectors in the three tables, to which is added the bit Y determining which are the average vectors to be added to these difference vectors to restore the quantized LSP parameters.
  • each parameter is represented separately by the nearest quantized value.
  • For each parameter cos2πfi, an upper bound Mi and a lower bound mi are defined such that, on a large number of speech samples, approximately 90% of the encountered values of cos2πfi are between mi and Mi.
  • The reference interval between the two bounds is divided into 2^Ni equal segments, where Ni is the number of coding bits devoted to the quantization of the parameter cos2πfi.
  • The frequency ordering property of the fi is used to replace, in certain cases, the upper bound Mi by the quantized value of the previous cosine, denoted q(cos2πfi-1).
  • The quantization of cos2πfi is then carried out by subdividing the variation interval [mi, min{Mi, q(cos2πfi-1)}] into 2^Ni equal segments.
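The scalar quantization step described above can be sketched as follows (a minimal version assuming segment midpoints as reconstruction levels, which the patent does not specify; the function name is illustrative):

```python
import numpy as np

def quantize_cos(value, m_i, M_i, prev_q, n_bits):
    """Scalar quantization of cos(2*pi*f_i): the variation interval
    [m_i, min(M_i, prev_q)] is split into 2**n_bits equal segments;
    prev_q is the quantized previous cosine (ordering constraint)."""
    hi = min(M_i, prev_q)
    n_seg = 1 << n_bits
    step = (hi - m_i) / n_seg
    idx = int(np.clip((value - m_i) / step, 0, n_seg - 1))
    q_value = m_i + (idx + 0.5) * step    # midpoint of the chosen segment
    return idx, q_value
```

Because the upper bound shrinks to the previously quantized cosine, the reconstructed cosines stay in decreasing order, which preserves the LSF ordering and hence the stability of the synthesis filter.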
  • The detection of the spectral state of the signal makes it possible to define two families of reference intervals [mi,1, Mi,1] and [mi,2, Mi,2] for the first r parameters (1 ≤ i ≤ r ≤ p).
  • Another possibility, which can supplement or replace the previous one, is to define, for some of the parameters, different numbers Ni of coding bits depending on whether the signal is IRS or linear. For the same total number of coding bits, one can in particular take lower numbers Ni in the IRS case than in the linear case for the first LSP parameters (the largest cosines), since the dynamic range of the first LSP parameters is reduced in the IRS case, the decrease in the first Ni being offset by an increase in the Ni relative to the last LSP parameters, which increases the fineness of quantization of these latter parameters.
  • These different allocations of coding bits are stored both in the encoder and in the decoder; the LSP parameters can thus be recovered by examining the status bit Y.
  • the calculated LSP parameters can directly give a fairly precise idea of the spectral envelope of the speech signal.
  • In the IRS case, the amplitude of the resonances located in the lower part of the spectrum is weaker than in the linear case. So, by analyzing the deviations between the first consecutive LSF frequencies, it can be determined whether the input signal is more like IRS (large deviations) or linear (smaller deviations). This determination can be made for each signal frame to get the condition bit X, which is then processed by a state determination circuit similar to the circuit 29 of Figure 4 to obtain the state bit Y used by the quantization block 43.
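This alternative LSF-based detection can be sketched as follows. The number of frequencies examined and the gap threshold are illustrative assumptions; the patent only states that larger low-band deviations indicate an IRS-like signal:

```python
import numpy as np

def condition_from_lsf(f, gap_threshold):
    """Condition bit X from the spacing of the first LSF frequencies (Hz):
    large low-band gaps suggest an IRS-type signal (X = 0),
    small gaps a linear-type signal (X = 1)."""
    gaps = np.diff(f[:4])      # deviations between first consecutive LSFs
    return 0 if np.mean(gaps) > gap_threshold else 1
```

The resulting X would then feed the same hysteresis circuit as the energy-ratio method, so the two detectors are interchangeable upstream of the quantization block 43.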


Claims (13)

  1. Verfahren zur Sprachkodierung mittels linearer Vorhersage, bei welchem ein Sprachsignal (SI), das in aufeinanderfolgenden Rahmen digitalisiert ist, einer Syntheseanalyse unterzogen wird, um für jeden Rahmen Quantifikationswerte von Syntheseparametern (ai b, T, k G) zu erhalten, die es ermöglichen, eine Abschätzung (S and) des Sprachsignals zu erhalten, und bei welchem die Quantifikationswerte ausgegeben werden, wobei die Syntheseanalyse eine lineare Kurzzeit-Vorhersage des Sprachsignals umfaßt, um die Quantifikationswerte der Koeffizienten eines Kurzzeit-Synthesefilters zu bestimmen, dadurch gekennzeichnet, daß ein spektraler Status (Y) des Sprachsignals unter ersten und zweiten Stati (YA, YB) derart bestimmt wird, daß das Signal proportional weniger Energie bei tiefen Frequenzen in dem ersten Status enthält als in dem zweiten Status, und der eine oder der andere von zwei Quantifikationsmodi angewendet wird, um die Quantifikationswerte der Koeffizienten des Kurzzeit-Synthesefilters gemäß dem bestimmten spektralen Status (Y) des Sprachsignals zu erhalten.
  2. Verfahren nach Anspruch 1, dadurch gekennzeichnet, daß der bestimmte Status (Y) des Sprachsignals nicht modifiziert wird, solange er eine Energie unterhalb einer vorbestimmten Schwelle aufweist.
  3. Verfahren nach Anspruch 1 oder 2, dadurch gekennzeichnet, daß Rahmen für Rahmen ermittelt wird, ob das Signal in einem ersten Zustand ist, der dem ersten Status (YA) entspricht, oder in einem zweiten Zustand, der dem zweiten Status (YB) entspricht, und der Status (Y) des Signals auf der Basis der Zustände Rahmen für Rahmen (X) ermittelt wird, wobei der bestimmte Status nur modifiziert wird, nachdem mehrere nachfolgende Rahmen einen Signalzustand zeigen, der sich von demjenigen unterscheidet, der dem vorhergehend bestimmten Status entspricht.
  4. Verfahren nach Anspruch 3, dadurch gekennzeichnet, daß eine Zählvariable (V) inkrementiert wird, wenn der Zustand (X) des Signals in einem Rahmen sich von demjenigen unterscheidet, der dem bestimmten Status (Y) des Signals entspricht, daß diese Zählvariable (V) dekrementiert wird, wenn der Zustand des Signals in einem Rahmen derjenige ist, der dem bestimmten Status des Signals entspricht, außer wenn diese Variable 0 ist, und dadurch daß dann, wenn die Zählvariable (V) eine vorbestimmte Schwelle erreicht, diese auf 0 zurückgesetzt wird und festgestellt wird, daß das Signal den Status gewechselt hat.
  5. Verfahren nach Anspruch 3 oder 4, dadurch gekennzeichnet, daß das Sprachsignal (SI) einer Hochpaßfilterung unterzogen wird, die Energie (E2) des Signals (SI'), das den Hochpaßfilter durchlaufen hat, mit derjenigen (E1) des nicht gefilterten Signals verglichen wird, um Rahmen für Rahmen zu bestimmen, ob das Signal in dem ersten Zustand ist, für den die Energie des Hochpaß-gefilterten Signals größer ist als ein vorbestimmter Teil der Energie des nicht gefilterten Signals, oder ob das Signal in dem zweiten Zustand ist, für den die Energie des Hochpaß-gefilterten Signals geringer ist als der vorbestimmte Teil der Energie des nicht gefilterten Signals.
  6. Verfahren nach Anspruch 3 oder 4, dadurch gekennzeichnet, daß die Koeffizienten (aI) des Kurzzeit-Synthesefilters durch eine Menge von Frequenzen von Spektrallinien (fI) dargestellt sind und dadurch, daß die Verteilung der Frequenzen der Spektrallinien in jedem Rahmen des Sprachsignals (SI) analysiert wird, um zu ermitteln, ob das Signal in dem ersten oder dem zweiten Zustand ist.
  7. Verfahren nach einem der Ansprüche 1 bis 6, dadurch gekennzeichnet, daß die Koeffizienten (aI) des Kurzzeit-Synthesefilters durch eine Menge von p geordneten Frequenzparametern von Spektrallinien (cos2πfi) dargestellt werden, und zwar unterteilt in m Gruppen von aufeinanderfolgenden Frequenzparametern, wobei p die Ordnung der linearen Kurzzeitvorhersage ist und m eine ganze Zahl größer oder gleich 1 ist, und dadurch, daß wenigstens die erste Gruppe in Bezug auf einen mittleren Vektor differentiell quantifiziert wird, der aus zwei unterschiedlichen Vektoren (VI,1, VI,2) gemäß dem bestimmten spektralen Zustand (Y) des Sprachsignals ausgewählt wird.
  8. Verfahren nach Anspruch 7, dadurch gekennzeichnet, daß die Anzahl m gleich 3 ist und dadurch, daß jede der ersten drei Gruppen der aufeinanderfolgenden Frequenzparameter in Bezug auf einen entsprechenden mittleren Vektors differentiell quantifiziert wird, der aus zwei unterschiedlichen entsprechenden Vektoren gemäß dem bestimmten spektralen Zustand (Y) des Sprachsignals ausgewählt wird.
  9. Verfahren nach einem der Ansprüche 1 bis 6, dadurch gekennzeichnet, daß die Koeffizienten (ai) des Kurzzeit-Synthesefilters durch eine Menge von p geordneten Frequenzparametern von Spektrallinien (cos2πfi) bestimmt werden, wobei die Menge in m Gruppen von aufeinanderfolgenden Frequenzparametern unterteilt ist, wobei p die Ordnung der linearen Kurzzeit-Vorhersage ist und m eine ganze Zahl größer oder gleich 1 ist, und dadurch, daß wenigstens die erste Gruppe quantifiziert wird, indem in einer Quantifizierungstabelle ein Vektor ausgewählt wird, der einen minimalen Abstand zu den Frequenzparametern der Gruppe aufweist, wobei diese Quantifizierungstabelle aus zwei unterschiedlichen Tabellen (TI,1, TI,2) gemäß dem bestimmten spektralen Zustand (Y) des Sprachsignals ausgewählt wird.
  10. Verfahren nach Anspruch 9, dadurch gekennzeichnet, daß die Anzahl gleich 3 ist und dadurch, daß jede der beiden ersten Gruppen der aufeinanderfolgenden Frequenzparameter quantifiziert wird, indem in einer entsprechenden Quantifizierungstabelle ein Vektor ausgewählt wird, der einen minimalen Abstand zu den Frequenzparametern der Gruppe darstellt, wobei jede der beiden Quantifizierungstabellen in Bezug auf die beiden ersten Gruppen aus zwei jeweils unterschiedlichen Tabellen gemäß dem bestimmten spektralen Status (Y) des Sprachsignals ausgewählt wird.
  11. Verfahren nach Anspruch 10, dadurch gekennzeichnet, daß die zwei unterschiedlichen Quantifizierungstabellen (TI,1, TI,2) in Bezug auf die Gruppe disjunkt sind und dadurch, daß die zwei unterschiedlichen Quantifizierungstabellen (TII,1, TII,2) in Bezug au die zweite Gruppe einen gemeinsamen Teil aufweisen.
  12. Method according to any one of claims 1 to 6, characterized in that the coefficients (ai) of the short-term synthesis filter are represented by a set of p ordered line spectrum frequency parameters (cos2πfi), p being the order of the short-term linear prediction, in that each of the p parameters is quantized by dividing a variation interval ([mi, min{Mi, ĉos2πfi-1}]), contained in a respective reference interval ([mi,Mi]), into 2Ni segments, Ni being the number of coding bits used for the quantization of that parameter, and in that, at least for the first-order parameters, reference intervals are used each of which is selected from two distinct intervals ([mi,1,Mi,1], [mi,2,Mi,2]) according to the determined spectral state (Y) of the speech signal.
  13. Method according to any one of claims 1 to 6 or according to claim 12, characterized in that the coefficients (ai) of the short-term synthesis filter are represented by a set of p ordered line spectrum frequency parameters (cos2πfi), p being the order of the short-term linear prediction, in that each of the p parameters is quantized by dividing a variation interval ([mi, min{Mi, ĉos2πfi-1}]), contained in a respective reference interval ([mi,Mi]), into 2Ni segments, Ni being the number of coding bits used for the quantization of that parameter, and in that at least some of the numbers of coding bits Ni are assigned one or the other of two distinct values according to the determined spectral state (Y) of the speech signal.
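The quantization scheme of claims 9 to 13 can be sketched as follows: a vector quantizer that picks its codebook according to the binary spectral state Y (claims 9-11), and a scalar quantizer whose variation interval for each ordered parameter is capped by the previously quantized value and split into 2^Ni segments (claims 12-13). The table contents, bounds and bit allocations below are illustrative assumptions, not values taken from the patent.

```python
def vq_group(group, table_state0, table_state1, state):
    """Vector-quantize one group of ordered LSF parameters (cos 2*pi*f_i):
    from the table selected by the spectral state Y, pick the vector with
    minimal squared Euclidean distance to the group (claims 9-11)."""
    table = table_state1 if state else table_state0
    return min(table,
               key=lambda v: sum((a - b) ** 2 for a, b in zip(v, group)))

def scalar_quantize(params, m_bounds, M_bounds, n_bits):
    """Scalar-quantize ordered parameters (claims 12-13): the variation
    interval of parameter i is [m_i, min(M_i, previous quantized value)],
    divided into 2**N_i segments.  Since cos 2*pi*f_i decreases with i,
    the previous quantized value caps the upper bound and the ordering
    of the quantized parameters is preserved."""
    out = []
    prev = float("inf")  # no cap for the first parameter
    for x, m_i, M_i, N_i in zip(params, m_bounds, M_bounds, n_bits):
        hi = min(M_i, prev)
        levels = 2 ** N_i
        step = (hi - m_i) / levels
        # index of the segment containing x, clamped into [0, levels - 1]
        idx = min(levels - 1, max(0, int((x - m_i) / step)))
        q = m_i + (idx + 0.5) * step  # mid-point of the selected segment
        out.append(q)
        prev = q
    return out
```

A state-dependent codebook lets the coder spend its bits on the region of the LSF space that the determined spectral state makes likely, instead of covering the whole space with a single table.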
EP95401262A 1994-06-03 1995-05-31 Verfahren zur Sprachkodierung mittels linearer Prädiktion Expired - Lifetime EP0685833B1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR9406825 1994-06-03
FR9406825A FR2720850B1 (fr) 1994-06-03 1994-06-03 Procédé de codage de parole à prédiction linéaire.

Publications (2)

Publication Number Publication Date
EP0685833A1 EP0685833A1 (de) 1995-12-06
EP0685833B1 true EP0685833B1 (de) 2000-04-26

Family

ID=9463861

Family Applications (1)

Application Number Title Priority Date Filing Date
EP95401262A Expired - Lifetime EP0685833B1 (de) 1994-06-03 1995-05-31 Verfahren zur Sprachkodierung mittels linearer Prädiktion

Country Status (4)

Country Link
US (1) US5642465A (de)
EP (1) EP0685833B1 (de)
DE (1) DE69516455T2 (de)
FR (1) FR2720850B1 (de)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08179796A (ja) * 1994-12-21 1996-07-12 Sony Corp 音声符号化方法
FR2729247A1 (fr) * 1995-01-06 1996-07-12 Matra Communication Procede de codage de parole a analyse par synthese
JP3196595B2 (ja) * 1995-09-27 2001-08-06 日本電気株式会社 音声符号化装置
JPH09230896A (ja) * 1996-02-28 1997-09-05 Sony Corp 音声合成装置
JP3094908B2 (ja) * 1996-04-17 2000-10-03 日本電気株式会社 音声符号化装置
US6253172B1 (en) * 1997-10-16 2001-06-26 Texas Instruments Incorporated Spectral transformation of acoustic signals
US6094629A (en) * 1998-07-13 2000-07-25 Lockheed Martin Corp. Speech coding system and method including spectral quantizer
US7379865B2 (en) * 2001-10-26 2008-05-27 At&T Corp. System and methods for concealing errors in data transmission
KR20050049103A (ko) * 2003-11-21 2005-05-25 삼성전자주식회사 포만트 대역을 이용한 다이얼로그 인핸싱 방법 및 장치
WO2009081569A1 (ja) * 2007-12-25 2009-07-02 Panasonic Corporation 超音波診断装置
WO2011074233A1 (ja) 2009-12-14 2011-06-23 パナソニック株式会社 ベクトル量子化装置、音声符号化装置、ベクトル量子化方法、及び音声符号化方法
CN102812512B (zh) * 2010-03-23 2014-06-25 Lg电子株式会社 处理音频信号的方法和装置
CN103928031B (zh) * 2013-01-15 2016-03-30 华为技术有限公司 编码方法、解码方法、编码装置和解码装置
CN112927724B (zh) * 2014-07-29 2024-03-22 瑞典爱立信有限公司 用于估计背景噪声的方法和背景噪声估计器

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NL8500843A (nl) 1985-03-22 1986-10-16 Koninkl Philips Electronics Nv Multipuls-excitatie lineair-predictieve spraakcoder.

Also Published As

Publication number Publication date
FR2720850B1 (fr) 1996-08-14
US5642465A (en) 1997-06-24
FR2720850A1 (fr) 1995-12-08
DE69516455D1 (de) 2000-05-31
DE69516455T2 (de) 2001-01-25
EP0685833A1 (de) 1995-12-06

Similar Documents

Publication Publication Date Title
EP0685833B1 (de) Verfahren zur Sprachkodierung mittels linearer Prädiktion
EP0782128B1 (de) Verfahren zur Analyse eines Audiofrequenzsignals durch lineare Prädiktion, und Anwendung auf ein Verfahren zur Kodierung und Dekodierung eines Audiofrequenzsignals
EP0127718B1 (de) Verfahren zur Aktivitätsdetektion in einem Sprachübertragungssystem
EP2419900B1 (de) Verfahren und einrichtung zur objektiven evaluierung der sprachqualität eines sprachsignals unter berücksichtigung der klassifikation der in dem signal enthaltenen hintergrundgeräusche
EP0768770B1 (de) Verfahren und Vorrichtung zur Erzeugung von Hintergrundrauschen in einem digitalen Übertragungssystem
EP2415047B1 (de) Klassifizieren von in einem Tonsignal enthaltenem Hintergrundrauschen
EP1593116B1 (de) Verfahren zur differenzierten digitalen Sprach- und Musikbearbeitung, Rauschfilterung, Erzeugung von Spezialeffekten und Einrichtung zum Ausführen des Verfahrens
EP1692689B1 (de) Optimiertes mehrfach-codierungsverfahren
FR2522179A1 (fr) Procede et appareil de reconnaissance de paroles permettant de reconnaitre des phonemes particuliers du signal vocal quelle que soit la personne qui parle
EP0801790B1 (de) Verfahren zur sprachkodierung mittels analyse durch synthese
FR2639459A1 (fr) Procede de traitement du signal et appareil de formation de donnees issues d'une source sonore
EP0428445B1 (de) Verfahren und Einrichtung zur Codierung von Prädiktionsfiltern in Vocodern mit sehr niedriger Datenrate
FR2690551A1 (fr) Procédé de quantification d'un filtre prédicteur pour vocodeur à très faible débit.
FR2784218A1 (fr) Procede de codage de la parole a bas debit
FR2984580A1 (fr) Procede de detection d'une bande de frequence predeterminee dans un signal de donnees audio, dispositif de detection et programme d'ordinateur correspondant
FR2579356A1 (fr) Procede de codage a faible debit de la parole a signal multi-impulsionnel d'excitation
EP0616315A1 (de) Vorrichtung zur digitalen Sprachkodierung und -dekodierung, Verfahren zum Durchsuchen eines pseudologarithmischen LTP-Verzögerungskodebuchs und Verfahren zur LTP-Analyse
EP2589045B1 (de) Adaptive lineare prädiktive codierung/decodierung
EP0685836B1 (de) Verfahren und Gerät zur Vorverarbeitung eines akustischen Signals vor der Sprachcodierung
EP1192619B1 (de) Audio-kodierung, dekodierung zur interpolation
EP1605440A1 (de) Verfahren zur Quellentrennung eines Signalgemisches
EP1192621B1 (de) Audiokodierung mit harmonischen komponenten
Moreau Predictive speech coding at low bit rates: a unified approach
EP0454552A2 (de) Verfahren und Einrichtung zur Sprachcodierung mit niedriger Bitrate
FR2737360A1 (fr) Procedes de codage et de decodage de signaux audiofrequence, codeur et decodeur pour la mise en oeuvre de tels procedes

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE ES GB IT NL SE

17P Request for examination filed

Effective date: 19951228

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: MATRA NORTEL COMMUNICATIONS

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

17Q First examination report despatched

Effective date: 19990803

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE ES GB IT NL SE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20000426

Ref country code: ES

Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY

Effective date: 20000426

RIC1 Information provided on ipc code assigned before grant

Free format text: 7G 10L 19/04 A, 7G 10L 101:10 Z

REF Corresponds to:

Ref document number: 69516455

Country of ref document: DE

Date of ref document: 20000531

GBT Gb: translation of ep patent filed (gb section 77(6)(a)/1977)

Effective date: 20000601

ITF It: translation for a ep patent filed

Owner name: BARZANO' E ZANARDO MILANO S.P.A.

NLV1 Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed
PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: SE

Payment date: 20010419

Year of fee payment: 7

REG Reference to a national code

Ref country code: GB

Ref legal event code: IF02

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20020601

EUG Se: european patent has lapsed
PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20040528

Year of fee payment: 10

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20050414

Year of fee payment: 11

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED.

Effective date: 20050531

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20051201

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20060531

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20060531