EP0685833B1 - Linear prediction speech coding method - Google Patents
Linear prediction speech coding method
- Publication number
- EP0685833B1 (EP95401262A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- signal
- state
- short
- quantization
- determined
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/038—Vector quantisation, e.g. TwinVQ audio
Definitions
- The present invention relates to a linear prediction speech coding method, in which a speech signal digitized in successive frames is subjected to an analysis by synthesis to obtain, for each frame, quantization values of synthesis parameters used to reconstruct an estimate of the speech signal, the analysis by synthesis including a short-term linear prediction of the speech signal to determine the coefficients of a short-term synthesis filter.
- Low bit rate speech coders (typically 5 kbit/s for a sampling frequency of 8 kHz) give their best performance on signals having a "telephone" spectrum, that is to say in the 300-3400 Hz band and with a pre-emphasis at high frequencies.
- IRS: Intermediate Reference System.
- This template has been defined for telephone handsets, both at the input (microphone) and at the output (earpiece).
- The input signal of the speech coder may, however, have a "flatter" spectrum, for example when a hands-free installation is used, employing a microphone with a linear frequency response.
- Usual vocoders are designed to be independent of the input with which they operate, and they are moreover not informed of the characteristics of that input. If microphones with different characteristics are likely to be connected to the vocoder, or more generally if the vocoder is likely to receive acoustic signals with different spectral characteristics, there are cases where the vocoder is used sub-optimally.
- A main object of the invention is to improve the performance of a vocoder by making it less dependent on the spectral characteristics of the signal supplied to it.
- The invention provides a speech coding method of the type indicated at the beginning, in which a spectral state of the speech signal is determined from among first and second states such that the signal contains proportionately less energy at low frequencies in the first state than in the second state, and one or the other of two quantization modes is applied to obtain the quantization values of the coefficients of the short-term synthesis filter, according to the determined spectral state of the speech signal.
- Detecting the spectral state makes it possible to adapt the coder to the characteristics of the input signal.
- The performance of the coder can be improved or, for identical performance, the number of bits needed for coding can be reduced.
- The coefficients of the short-term synthesis filter are represented by a set of p ordered line spectrum frequency parameters, called "LSP parameters", p being the order of the linear prediction.
- LSP parameters: ordered line spectrum parameters.
- The distribution of these p LSP parameters can be analyzed to give information on the spectral state of the signal and to contribute to the detection of this state.
- The LSP parameters can be scalar or vector quantized.
- In scalar quantization, the i-th LSP parameter is quantized by subdividing a variation interval, included in a respective reference interval, into 2^Ni segments, Ni being the number of coding bits devoted to the quantization of this parameter.
- A first possibility is to use, at least for the first ordered LSP parameters, reference intervals each chosen from two distinct intervals according to the determined spectral state of the speech signal.
- An additional possibility is to give at least some of the numbers of coding bits Ni one or the other of two distinct values according to the determined spectral state of the speech signal, so as to allocate the bits dynamically.
- In differential vector quantization, the set of p ordered LSP parameters is subdivided into m groups of consecutive parameters and, at least for the first group, a differential quantization can be performed with respect to a mean vector chosen from two distinct vectors according to the determined spectral state of the speech signal.
- The speech coder illustrated in FIG. 1A is based on the principle of analysis by synthesis. Its general organization is conventional, except for the short-term prediction unit 8 and the unit 20 for detecting the spectral state of the signal.
- The speech coder processes the amplified output signal of a microphone 5.
- A low-pass filter 6 eliminates the frequency components of this signal above the upper limit (for example 4000 Hz) of the bandwidth processed by the coder.
- The signal is then digitized by the analog-to-digital converter 7, which delivers the input signal S_I in the form of successive frames of 10 to 30 ms consisting of samples taken at a rate of, for example, 8000 Hz.
- The coefficients a_i (1 ≤ i ≤ p) of the short-term synthesis filter can be obtained by short-term linear prediction of the input signal, the number p designating the order of the linear prediction, which is typically equal to 10 for narrowband speech.
- The short-term prediction unit 8 determines estimates â_i of the coefficients a_i, which correspond to a quantization of these coefficients by quantization values q(a_i).
- Each frame of the input signal S_I is first subjected to the inverse filter 9 of transfer function A(z), then to a filter 10 of transfer function 1/A(z/γ), where γ denotes a predefined factor, generally between 0.8 and 0.9.
- The combined filter thus formed, of transfer function W(z) = A(z)/A(z/γ), is a perceptual weighting filter for the residual error of the coder.
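The weighting filter W(z) = A(z)/A(z/γ) can be illustrated with a short sketch. It assumes A(z) is given as a list of polynomial coefficients in powers of z^-1 (leading coefficient 1), so that A(z/γ) is obtained by multiplying the k-th coefficient by γ^k. The function names and the default γ = 0.85 are illustrative choices, not taken from the patent.

```python
def bandwidth_expand(poly, gamma):
    """Coefficients of A(z/gamma), given those of A(z) = sum_k poly[k] * z^-k."""
    return [c * gamma ** k for k, c in enumerate(poly)]


def weighting_filter(x, poly, gamma=0.85):
    """Apply W(z) = A(z) / A(z/gamma) to the samples x (zero initial state)."""
    num = poly                           # numerator A(z)
    den = bandwidth_expand(poly, gamma)  # denominator A(z/gamma)
    y = []
    for n in range(len(x)):
        # Feed-forward part: A(z) applied to the input.
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        # Feedback part: 1/A(z/gamma) applied recursively to the output.
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc / den[0])
    return y
```

With γ = 1 the numerator and denominator coincide and the filter leaves the signal unchanged; smaller values of γ widen the formant peaks of 1/A(z/γ), which is what shapes the coding noise.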
- The coefficients used in filters 9 and 10 are the estimates â_i supplied by the short-term prediction unit 8.
- The output R1 of the inverse filter 9 has a long-term periodicity, corresponding to the pitch of the speech.
- The signal R1 is subjected to an inverse filter 11 of transfer function B(z), whose output R2 is supplied to the input of the filter 10.
- The output S_W of the filter 10 thus corresponds to the input signal S_I with its long-term correlation removed by the filter 11 of transfer function B(z), and perceptually weighted by the filters 9, 10 of combined transfer function W(z).
- The filter 11 includes a subtractor whose positive input receives the signal R1 and whose negative input receives a long-term estimate obtained by delaying the signal R1 by T samples and amplifying it.
- The signal R1 and the long-term estimate are supplied to a unit 13, which maximizes the correlation between these two signals to determine the optimal delay T and gain b.
- The unit 13 explores all the integer and/or fractional values of the delay T between two bounds to select the one that maximizes the normalized correlation.
- The gain b is deduced from the value of T and is quantized by discretization, which leads to a quantization value q(b); the quantized value b̂ corresponding to this quantization value q(b) is the one supplied as the gain of the amplifier of the filter 11.
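The delay search can be sketched as follows. This is a minimal illustration restricted to integer delays and to samples inside a single buffer; the least-squares gain formula, the function names and the search bounds are assumptions made for the example, and a practical coder would also use samples from previous frames.

```python
def best_pitch(r1, t_min, t_max):
    """Return (T, b): the integer delay maximising the normalised
    correlation between r1[n] and r1[n - T], and the matching gain."""
    best_t, best_score, best_b = None, -1.0, 0.0
    for t in range(t_min, t_max + 1):
        num = sum(r1[n] * r1[n - t] for n in range(t, len(r1)))
        den = sum(r1[n - t] ** 2 for n in range(t, len(r1)))
        if den == 0.0 or num <= 0.0:
            continue                      # no usable positive correlation
        score = num * num / den           # squared normalised correlation
        if score > best_score:
            best_t, best_score, best_b = t, score, num / den
    return best_t, best_b


def long_term_residual(r1, t, b):
    """R2[n] = R1[n] - b * R1[n - T], i.e. filtering by B(z) = 1 - b z^-T."""
    return [r1[n] - (b * r1[n - t] if n >= t else 0.0) for n in range(len(r1))]
```

For a pulse train of period 5, the search returns T = 5 with b = 1, and the residual R2 keeps only the first pulse.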
- Speech synthesis in the coder is carried out in a closed loop comprising an excitation generator 12, a filter 14 having the same transfer function as the filter 10, a correlator 15, and a unit 19 for maximizing the normalized correlation.
- The nature of the excitation generator 12 distinguishes the different types of analysis-by-synthesis coders, according to the form of the excitation.
- MPLPC: linear prediction analysis with multi-pulse excitation.
- CELP: linear prediction analysis with vector excitation.
- The applicant has used excitation by regular pulse sequences, or RPCELP, as described in its European patent application No. 0 347 307.
- The excitation is represented by an input address k in a dictionary of excitation vectors, and by an associated gain G.
- The selected and amplified excitation vector is subjected to the filter 14 of transfer function 1/A(z/γ), whose coefficients â_i (1 ≤ i ≤ p) are supplied by the short-term prediction unit 8.
- The resulting signal S_W* is supplied to one input of the correlator 15, the other input of which receives the output signal S_W of the filter 10.
- The output of the correlator 15 is the normalized correlation, which is maximized by the unit 19; this amounts to minimizing the coding error.
- The unit 19 selects the address k and the gain G of the excitation generator that maximize the correlation output by the correlator 15.
- The maximization consists in determining the optimal address k, the gain G being deduced from k.
- The unit 19 quantizes the digital value of the gain G by discretization, which leads to a quantization value q(G).
- The quantized value Ĝ corresponding to this quantization value q(G) is the one supplied as the gain of the amplifier of the excitation generator 12.
- The excitation vector selected in the dictionary of the generator 12, the associated gain G, the parameters b and T of the long-term filter 13, and the coefficients a_i of the short-term prediction filter, to which is added a state bit Y described later, constitute the synthesis parameters. Their quantization values k, q(G), q(b), T, q(a_i), Y are transmitted to the receiver in order to reconstruct an estimate of the speech signal S_I. These quantization values are combined on a single channel by the multiplexer 21 for transmission.
- The associated decoder illustrated in FIG. 1B comprises a unit 50 which restores the quantized values k, Ĝ, T, b̂, â_i from the quantization values received.
- An excitation generator 52 identical to the generator 12 of the coder receives the quantized values of the parameters k and G.
- The output R̂2 of the generator 52 (which is an estimate of R2) is subjected to the long-term prediction filter 53 of transfer function 1/B(z), whose coefficients are the quantized values of the parameters T and b.
- The output R̂1 of the filter 53 (which is an estimate of R1) is subjected to the short-term prediction filter 54 of transfer function 1/A(z), whose coefficients are the quantized values of the parameters a_i.
- The resulting signal Ŝ is the estimate of the input signal S_I of the coder.
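The decoder chain (excitation, then 1/B(z), then 1/A(z)) can be sketched as two recursive filters. The sketch assumes B(z) = 1 - b z^-T, which matches the subtractor structure of filter 11, and assumes the sign convention A(z) = 1 - Σ a_i z^-i, which the text does not spell out; the function names are illustrative.

```python
def long_term_synthesis(excitation, t, b):
    """1/B(z) with B(z) = 1 - b z^-T: out[n] = in[n] + b * out[n - T]."""
    out = []
    for n, e in enumerate(excitation):
        out.append(e + (b * out[n - t] if n >= t else 0.0))
    return out


def short_term_synthesis(residual, a):
    """1/A(z), assuming A(z) = 1 - sum_i a[i-1] z^-i (sign convention assumed)."""
    out = []
    for n, r in enumerate(residual):
        pred = sum(a[i] * out[n - 1 - i] for i in range(len(a)) if n - 1 - i >= 0)
        out.append(r + pred)
    return out


def decode_frame(excitation, t, b, a):
    """Excitation (estimate of R2) -> estimate of R1 -> estimate of S_I."""
    return short_term_synthesis(long_term_synthesis(excitation, t, b), a)
```

Feeding a unit impulse with T = 2, b = 0.5 and a single coefficient a_1 = 0.5 gives the samples 1, 0.5, 0.75, 0.375, 0.4375, 0.21875.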
- FIG. 2 shows an example of the structure of the short-term prediction unit 8 of the coder.
- The modeling coefficients a_i are calculated for each frame, for example by the autocorrelation method.
- Block 40 calculates the autocorrelations R(j) for 0 ≤ j ≤ p, n denoting the index of a sample of the current frame and L the number of samples per frame.
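As an illustration of the autocorrelation method, the sketch below computes R(j) over one frame and then runs the standard Levinson-Durbin recursion, which starts from the error energy E(0) = R(0). The exact summation limits, any windowing, and the predictor sign convention (x[n] ≈ Σ a_i x[n-i]) are assumptions made for the example, not details taken from the patent.

```python
def autocorrelation(frame, p):
    """R(j) = sum over n of frame[n] * frame[n - j], for 0 <= j <= p."""
    L = len(frame)
    return [sum(frame[n] * frame[n - j] for n in range(j, L)) for j in range(p + 1)]


def levinson_durbin(r):
    """Return (a, e): predictor coefficients a[0..p-1] such that
    x[n] is approximated by sum_i a[i-1] * x[n-i], and the final error energy."""
    p = len(r) - 1
    a = [0.0] * (p + 1)          # a[0] is unused
    e = r[0]                     # E(0) = R(0)
    for i in range(1, p + 1):
        if e <= 0.0:
            break                # silent or degenerate frame: stop early
        # Reflection coefficient for order i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / e
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        e *= 1.0 - k * k         # E(i) = (1 - k^2) * E(i-1)
    return a[1:], e
```

For the autocorrelation sequence 1, 0.5, 0.25 of a first-order process, the recursion returns a_1 = 0.5, a_2 = 0 and a residual energy of 0.75.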
- The representation parameters thus obtained are quantized to reduce the number of bits needed to identify them.
- In FIG. 3, the two solid lines correspond to the limits of the IRS template, defined for microphones in CCITT Recommendation P.48.
- An IRS-type microphone signal has a strong attenuation in the lower part of the spectrum (between 0 and 300 Hz) and a relative emphasis at high frequencies.
- A linear-type signal, supplied for example by the microphone of a hands-free installation, has a flatter spectrum, in particular without the strong attenuation at low frequencies (a typical example of such a linear-type signal is illustrated by a dashed line in the diagram of FIG. 3).
- The detection device 20 comprises a high-pass filter 16 receiving the input acoustic signal S_I and delivering a filtered signal S_I'.
- The filter 16 is typically a digital biquad filter with a sharp cutoff at 400 Hz.
- The energies E1 and E2 contained in each frame of the input acoustic signal S_I and of the filtered signal S_I' are calculated by two units 17, 18, each computing the sum of the squares of the samples of each frame it receives.
- The energy E1 of each frame of the input signal S_I is sent to the input of a threshold comparator 25, which delivers a bit Z of value 0 when the energy E1 is below a predetermined energy threshold, and of value 1 when the energy E1 is above the threshold.
- The energy threshold is typically of the order of -38 dB relative to the saturation energy of the signal.
- The comparator 25 serves to inhibit the determination of the state of the signal when the signal contains too little energy to be representative of the characteristics of the source. In this case, the determined state of the signal remains unchanged.
- The energies E1 and E2 are sent to a digital divider 26, which calculates the ratio E2/E1 for each frame.
- The ratio E2/E1 is sent to another threshold comparator 27, which delivers a bit X of value 0 when the ratio E2/E1 is greater than a predetermined threshold, and of value 1 when the ratio E2/E1 is less than the threshold.
- This threshold on the ratio E2/E1 is typically of the order of 0.93.
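The per-frame decision made by units 17, 18 and comparators 25, 27 can be sketched as below. The high-pass filtering itself is left out: the function takes the frame and its already filtered version. The helper that converts -38 dB into an absolute threshold assumes 16-bit samples (full scale 32768), which is an assumption of this example.

```python
def frame_energy(frame):
    """Sum of the squares of the samples of one frame (units 17 and 18)."""
    return sum(s * s for s in frame)


def energy_threshold(frame_len, full_scale=32768.0, db=-38.0):
    """Threshold at `db` decibels relative to the saturation energy of a frame
    (16-bit full scale is an assumption of this sketch)."""
    return frame_len * full_scale ** 2 * 10.0 ** (db / 10.0)


def condition_bits(frame, filtered, e_threshold, ratio_threshold=0.93):
    """Return (Z, X) for one frame.
    Z = 1 when the frame has enough energy to be trusted.
    X = 0 when E2/E1 is above the ratio threshold (IRS-like frame),
    X = 1 when it is below (linear-like frame)."""
    e1 = frame_energy(frame)
    e2 = frame_energy(filtered)
    z = 1 if e1 > e_threshold else 0
    x = 1 if e1 > 0 and e2 / e1 < ratio_threshold else 0
    return z, x
```

A frame whose high-pass version keeps about 98% of the energy yields X = 0, while one that keeps only 25% yields X = 1; a near-silent frame yields Z = 0, so its X bit is ignored downstream.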
- Bit X represents a condition of the signal on each frame.
- The state bit Y is not taken directly equal to the condition bit X; it results from the processing of successive condition bits X by a state determination circuit 29, which modifies the determined state Y only after several successive frames show a signal condition X different from that corresponding to the previously determined state.
- The operation of the state determination circuit 29 is illustrated in FIG. 5, where the upper timing diagram shows an example of the evolution of the bit X supplied by the comparator 27.
- The state bit Y (lower timing diagram) is initialized to 0, because IRS characteristics are the most frequently encountered.
- As soon as the counting variable V reaches a predetermined threshold (8 in the example considered), it is reset to 0 and the value of the bit Y is changed, so that the signal is determined to have changed state.
- The signal is in state Y_A up to frame M, in state Y_B between frames M and N (change of signal source), then again in state Y_A from frame N.
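The counting rule can be sketched as a small state machine. This follows the wording of claim 4 (increment V when X disagrees with Y, decrement it otherwise unless it is zero, toggle Y when V reaches the threshold) with a decrement of one unit per frame, which is a simplifying assumption; the hardware counter of FIG. 4 may decrement differently. The class and method names are illustrative.

```python
class SpectralStateTracker:
    """Hysteresis on the per-frame condition bit X, producing the state bit Y."""

    def __init__(self, threshold=8):
        self.y = 0               # state bit, initialised to 0 (IRS)
        self.v = 0               # counting variable V
        self.threshold = threshold

    def update(self, x, z=1):
        """Process one frame; z = 0 freezes the state (low-energy frame)."""
        if z == 0:
            return self.y
        if x != self.y:
            self.v += 1
            if self.v >= self.threshold:
                self.v = 0       # reset the counter and change state
                self.y ^= 1
        elif self.v > 0:
            self.v -= 1          # agreement slowly cancels earlier disagreement
        return self.y
```

Eight consecutive disagreeing frames are needed before Y toggles, and frames with Z = 0 neither advance nor reset the count.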
- The above counting mode can be obtained, for example, by the circuit 29 shown in FIG. 4.
- This circuit includes a four-bit counter 32, whose most significant bit corresponds to the state bit Y and whose three least significant bits represent the counting variable V.
- The X and Y bits are supplied to the inputs of an EXCLUSIVE-OR gate 33, whose output is sent to the incrementing input of the counter 32 via an AND gate 34 whose other input receives the Z bit supplied by the threshold comparator 25.
- The inverted output of the gate 33 is supplied to a decrementing input of the counter 32 via another AND gate 35, whose two other inputs receive respectively the Z bit supplied by the comparator 25 and the output of a three-input OR gate 36 receiving the three least significant bits of the counter 32.
- The counter 32 is arranged to double the pulses received on its decrementing input when its least significant bit is 0 or when at least one of the next two bits is 1, as shown by the OR gate 37 in FIG. 4.
- When the Z bit is 0, the state determination circuit 29 is not activated, because the AND gates 34, 35 prevent the value of the counter 32 from being changed.
- The state bit Y thus determined is supplied to the short-term linear prediction unit 8 to choose the quantization mode for the coefficients of the short-term synthesis filter.
- The parameters used to represent the coefficients a_i of the short-term synthesis filter are the line spectrum frequencies (LSF), or line spectrum pairs (LSP). These parameters are known to have good statistical properties and to make it easy to ensure the stability of the synthesized filter (see N. Sugamura and F. Itakura: "Speech Analysis And Synthesis Method Developed At ECL In NTT: From LPC to LSP", Speech Communication, North Holland, Vol. 5, No. 2, 1986, pp. 199-215).
- The LSP parameters are calculated by block 42 from the prediction coefficients a_i obtained by block 41, by means of Chebyshev polynomials (see P. Kabal and R. P. Ramachandran: "The Computation Of Line Spectral Frequencies Using Chebyshev Polynomials", IEEE Trans. ASSP, Vol. 34, No. 6, 1986, pp. 1419-1426). They can also be obtained directly from the autocorrelations of the signal, by the split Levinson algorithm (see P. Delsarte and Y. Genin: "The Split Levinson Algorithm", IEEE Trans. ASSP, Vol. 34, No. 3, 1986).
- Block 43 quantizes the LSF frequencies, or more precisely the values cos 2πf_i, hereinafter called LSP parameters, which lie between -1 and +1; this simplifies dynamic range problems.
- The method for calculating the LSF frequencies yields them in order of increasing frequency, that is to say decreasing cosine.
- m = 3 independent vector quantizations are performed, of dimensions 3, 3 and 4 respectively, defining the LSP groups I (1, 2, 3), II (4, 5, 6) and III (7, 8, 9, 10).
- Each group is quantized by selecting, from a respective prerecorded quantization table, the vector at the minimum Euclidean distance from the parameters of this group.
- For group I, two disjoint quantization tables T_I,1 and T_I,2 are defined, of respective sizes 2^n1 and 2^n2.
- For group II, two quantization tables T_II,1 and T_II,2 of respective sizes 2^p1 and 2^p2 are defined, having a common part in order to reduce the memory space required.
- For group III, a single quantization table T_III of size 2^q is defined. The addresses AD_I, AD_II, AD_III of the three vectors selected from the three quantization tables for the three groups are the quantization values q(a_i) of the coefficients of the short-term synthesis filter, which are sent to the multiplexer 21.
- When the signal is determined to be of linear type, block 43 selects the tables T_I,2 and T_II,2, whose statistics are established to be representative of a linear-type input signal.
- Table T_III is used in all cases, since the upper part of the spectrum is less sensitive to the differences between the IRS and linear characteristics.
- The state bit Y is also supplied to the multiplexer 21.
- A unit 45 calculates the estimates â_i from the discretized values of the LSP parameters given by the three selected vectors.
- The estimates â_i thus obtained are supplied by the unit 45 to the short-term filters 9, 10 and 14 of the coder.
- The same calculation is carried out by the restoration unit 50 of the decoder, the quantized cosine vectors being retrieved from the quantization addresses AD_I, AD_II and AD_III.
- The decoder contains the same quantization tables as the coder, and they are selected according to the state bit Y received.
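The table selection can be sketched as a nearest-neighbour search whose codebooks for groups I and II are indexed by the state bit Y. The sketch assumes p = 10 with the 3/3/4 split, and assumes that Y = 0 selects the first table of each pair; the table contents would come from training and are not given here, and the function names are illustrative.

```python
def nearest_index(table, vector):
    """Address of the table entry at minimum Euclidean distance from vector."""
    return min(range(len(table)),
               key=lambda k: sum((t - v) ** 2 for t, v in zip(table[k], vector)))


def quantize_lsp(lsp, y, tables_i, tables_ii, table_iii):
    """Return the addresses (AD_I, AD_II, AD_III) for the 10 LSP parameters.
    tables_i[y] and tables_ii[y] are the codebooks selected by the state bit."""
    groups = (lsp[0:3], lsp[3:6], lsp[6:10])
    tables = (tables_i[y], tables_ii[y], table_iii)   # T_III is shared
    return tuple(nearest_index(t, g) for t, g in zip(tables, groups))


def dequantize_lsp(addresses, y, tables_i, tables_ii, table_iii):
    """Decoder side: rebuild the quantized LSP vector from addresses and Y."""
    ad_i, ad_ii, ad_iii = addresses
    return (list(tables_i[y][ad_i]) + list(tables_ii[y][ad_ii])
            + list(table_iii[ad_iii]))
```

Because the decoder receives Y along with the three addresses, it can index the same codebooks without any further side information.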
- Using two families of quantization tables selected according to the spectral state Y has the advantage of better efficiency in terms of the number of coding bits required. Indeed, for equal performance, the total number of bits used to quantize the LSP parameters in each case is less than the number of bits needed when a single family of tables is used regardless of the detected spectral state.
- n1 8
- Block 43 can be arranged to perform differential vector quantization.
- Each group of parameters I, II, III is then quantized differentially with respect to a mean vector.
- For group I, two distinct mean vectors V_I,1 and V_I,2 and a difference quantization table TD_I are defined.
- For group II, two distinct mean vectors V_II,1 and V_II,2 and a difference quantization table TD_II are defined.
- For group III, a single mean vector V_III and a difference quantization table TD_III are defined.
- The mean vectors V_I,1 and V_II,1 are established to be representative of the statistics of IRS-type signals, while the mean vectors V_I,2 and V_II,2 are established to be representative of the statistics of linear-type signals.
- The advantage of this differential quantization is that only one quantization table per group needs to be stored in the coder and in the decoder.
- The quantization values q(a_i) are the addresses of the three optimal difference vectors in the three tables, to which is added the bit Y determining which mean vectors are to be added to these difference vectors to restore the quantized LSP parameters.
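For one group, the differential variant can be sketched as follows: the mean vector chosen by Y is subtracted, the difference is looked up in a single shared table, and the decoder adds the same mean back. The function names and the indexing of the mean vectors by Y are assumptions of this example.

```python
def quantize_differential(group, mean_vectors, diff_table, y):
    """Address, in the shared difference table, of the entry closest to
    (group - mean), where the mean vector is selected by the state bit y."""
    diff = [g - m for g, m in zip(group, mean_vectors[y])]
    return min(range(len(diff_table)),
               key=lambda k: sum((d - t) ** 2 for d, t in zip(diff, diff_table[k])))


def restore_differential(address, mean_vectors, diff_table, y):
    """Decoder side: quantized parameters = selected mean + difference vector."""
    return [m + d for m, d in zip(mean_vectors[y], diff_table[address])]
```

Only `diff_table` has to be stored once per group; switching between the IRS and linear statistics costs just one extra mean vector for each of the first two groups.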
- With scalar quantization, each parameter is represented separately by the nearest quantized value.
- For each parameter cos 2πf_i, an upper bound M_i and a lower bound m_i are defined such that, over a large number of speech samples, approximately 90% of the values of cos 2πf_i encountered lie between m_i and M_i.
- The reference interval between the two bounds is divided into 2^Ni equal segments, where Ni is the number of coding bits devoted to the quantization of the parameter cos 2πf_i.
- The ordering property of the frequencies f_i is used to replace, in certain cases, the upper bound M_i by the quantized value of the previous cosine, ĉos 2πf_(i-1).
- The quantization of cos 2πf_i is carried out by subdividing the variation interval [m_i, min{M_i, ĉos 2πf_(i-1)}] into 2^Ni equal segments.
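The scalar scheme can be sketched as below. For each cosine, the upper bound is tightened to the previous quantized value, the resulting interval is cut into 2^Ni equal segments, and the segment index is the code. Taking the segment midpoint as the reconstructed value is an assumption of this sketch, as are the function and argument names.

```python
def quantize_cosines(cosines, lower, upper, bits):
    """Quantize ordered (decreasing) cosines one by one.
    lower[i], upper[i] are the reference bounds m_i, M_i; bits[i] is Ni.
    Returns (indices, quantized_values)."""
    indices, values = [], []
    prev = None                              # previous quantized cosine
    for c, m, big_m, n in zip(cosines, lower, upper, bits):
        hi = big_m if prev is None else min(big_m, prev)
        levels = 1 << n                      # 2^Ni segments
        step = (hi - m) / levels
        idx = int((c - m) / step) if step > 0 else 0
        idx = max(0, min(levels - 1, idx))   # clamp values outside the interval
        q = m + (idx + 0.5) * step           # midpoint of the chosen segment
        indices.append(idx)
        values.append(q)
        prev = q
    return indices, values
```

The decoder can rebuild the same intervals from the indices alone, because each tightened upper bound depends only on the previously decoded value.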
- Detecting the spectral state of the signal makes it possible to define two families of reference intervals [m_i,1, M_i,1] and [m_i,2, M_i,2] for the first r parameters (1 ≤ i ≤ r ≤ p).
- Another possibility, which can supplement or replace the previous one, is to define, for some of the parameters, different numbers Ni of coding bits depending on whether the signal is of IRS or linear type. For the same total number of coding bits, smaller numbers Ni can in particular be taken in the IRS case than in the linear case for the first LSP parameters (the largest cosines), since the dynamic range of the first LSP parameters is reduced in the IRS case; the decrease in the first Ni is offset by an increase in the Ni relating to the last LSP parameters, which increases the fineness of quantization of these latter parameters.
- These different allocations of coding bits are stored both in the coder and in the decoder, so that the LSP parameters can be retrieved by examining the state bit Y.
- The calculated LSP parameters can directly give a fairly precise idea of the spectral envelope of the speech signal.
- In the IRS case, the amplitude of the resonances located in the lower part of the spectrum is weaker than in the linear case. Thus, by analyzing the differences between the first consecutive LSF frequencies, it can be determined whether the input signal is of IRS type (large differences) or of linear type (smaller differences). This determination can be made for each signal frame to obtain the condition bit X, which is then processed by a state determination circuit similar to circuit 29 of FIG. 4 to obtain the state bit Y used by the quantization block 43.
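A per-frame decision based on the LSF spacing might look like the sketch below. The text only says that large differences between the first consecutive LSFs indicate an IRS-type signal and smaller ones a linear-type signal; the use of the mean gap, the number of frequencies examined and the 300 Hz threshold are all hypothetical choices made for illustration.

```python
def condition_from_lsf(lsf_hz, n_first=3, gap_threshold_hz=300.0):
    """Return the condition bit X for one frame from its ordered LSFs (in Hz).
    X = 0 for widely spaced first LSFs (IRS-like), X = 1 otherwise (linear-like).
    n_first and gap_threshold_hz are hypothetical tuning values."""
    gaps = [lsf_hz[i + 1] - lsf_hz[i] for i in range(n_first - 1)]
    mean_gap = sum(gaps) / len(gaps)
    return 0 if mean_gap > gap_threshold_hz else 1
```

The resulting bit would then go through the same multi-frame hysteresis as the energy-based condition bit before being used as the state bit Y.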
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Description
- FIGS. 1A and 1B are block diagrams of, respectively, an analysis-by-synthesis speech coder for implementing the invention and an associated decoder;
- FIG. 2 is a block diagram of a linear prediction unit usable in the coder of FIG. 1A;
- FIG. 3 is a diagram illustrating the characteristics of an IRS-type acoustic signal and of a linear-type signal;
- FIG. 4 is a diagram of a device for detecting the spectral state of the signal, usable with the coder of FIG. 1A; and
- FIG. 5 shows timing diagrams illustrating how the state of the signal is detected by the device of FIG. 4.
E(0) = R(0)
For i = 1 to p do:
Claims (13)
- Linear prediction speech coding method, in which a speech signal (S_I) digitized in successive frames is subjected to an analysis by synthesis to obtain, for each frame, quantization values of synthesis parameters (a_i, b, T, k, G) making it possible to reconstruct an estimate (Ŝ) of the speech signal, and said quantization values are transmitted, the analysis by synthesis comprising a short-term linear prediction of the speech signal to determine the quantization values of the coefficients of a short-term synthesis filter, characterized in that a spectral state (Y) of the speech signal is determined from among first and second states (Y_A, Y_B) such that the signal contains proportionally less energy at low frequencies in the first state than in the second state, and one or the other of two quantization modes is applied to obtain the quantization values of the coefficients of the short-term synthesis filter according to the determined spectral state (Y) of the speech signal.
- Method according to claim 1, characterized in that the determined state (Y) of the speech signal is not modified when the signal has an energy below a predetermined threshold.
- Method according to claim 1 or 2, characterized in that it is detected frame by frame whether the signal is in a first condition corresponding to the first state (Y_A) or in a second condition corresponding to the second state (Y_B), and the state (Y) of the signal is determined on the basis of the frame-by-frame conditions (X), the determined state being modified only after several successive frames show a signal condition different from that corresponding to the previously determined state.
- Method according to claim 3, characterized in that a counting variable (V) is incremented when the condition (X) of the signal on a frame differs from that corresponding to the determined state (Y) of the signal, in that this counting variable (V) is decremented when the condition of the signal on a frame is that corresponding to the determined state of the signal, unless this variable is zero, and in that, when the counting variable (V) reaches a predetermined threshold, it is reset to zero and the signal is determined to have changed state.
- Method according to claim 3 or 4, characterized in that the speech signal (S_I) is subjected to high-pass filtering, and the energy (E2) of the high-pass filtered signal (S_I') is compared with the energy (E1) of the unfiltered signal to determine frame by frame whether the signal is in the first condition, for which the energy of the high-pass filtered signal is greater than a predetermined fraction of the energy of the unfiltered signal, or in the second condition, for which the energy of the high-pass filtered signal is less than the predetermined fraction of the energy of the unfiltered signal.
- Method according to claim 3 or 4, characterized in that the coefficients (a_i) of the short-term synthesis filter are represented by a set of line spectrum frequencies (f_i), and in that the distribution of the line spectrum frequencies in each frame of the speech signal (S_I) is analyzed to detect whether the signal is in the first or the second condition.
- Procédé selon l'une quelconque des revendications 1 à 6, caractérisé en ce que les coefficients (ai) du filtre de synthèse à court terme sont représentés par un ensemble de p paramètres fréquentiels de lignes spectrales ordonnés (cos2πfi), subdivisé en m groupes de paramètres fréquentiels consécutifs, p étant l'ordre de la prédiction linéaire à court terme et m étant un nombre entier supérieur ou égal à 1, et en ce qu'au moins le premier groupe est quantifié différentiellement par rapport à un vecteur moyenne choisi parmi deux vecteurs distincts (VI,1,VI,2) suivant l'état spectral déterminé (Y) du signal de parole.
- Procédé selon la revendication 7, caractérisé en ce que le nombre m est égal à 3, et en ce que chacun des deux premiers groupes de paramètres fréquentiels consécutifs est quantifié différentiellement par rapport à un vecteur moyenne respectif choisi parmi deux vecteurs distincts respectifs suivant l'état spectral déterminé (Y) du signal de parole.
- Procédé selon l'une quelconque des revendications 1 à 6, caractérisé en ce que les coefficients (ai) du filtre de synthèse à court terme sont représentés par un ensemble de p paramètres fréquentiels de lignes spectrales ordonnés (cos2πfi), subdivisé en m groupes de paramètres fréquentiels consécutifs, p étant l'ordre de la prédiction linéaire à court terme et m étant un nombre entier supérieur ou égal à 1, et en ce qu'au moins le premier groupe est quantifié en sélectionnant dans une table de quantification un vecteur présentant une distance minimale avec les paramètres fréquentiels dudit groupe, cette table de quantification étant choisie parmi deux tables distinctes (TI,1, TI,2) suivant l'état spectral déterminé (Y) du signal de parole.
- Method according to claim 9, characterized in that the number m is equal to 3, and in that each of the first two groups of consecutive frequency parameters is quantized by selecting, from a respective quantization table, a vector at minimum distance from the frequency parameters of said group, each of the two quantization tables relating to the first two groups being chosen from two respective distinct tables according to the determined spectral state (Y) of the speech signal.
- Method according to claim 10, characterized in that the two distinct quantization tables (T_I,1, T_I,2) relating to the first group are disjoint, and in that the two distinct quantization tables (T_II,1, T_II,2) relating to the second group have a common part.
- Method according to any one of claims 1 to 6, characterized in that the coefficients (a_i) of the short-term synthesis filter are represented by a set of p ordered line spectrum frequency parameters (cos 2πf_i), p being the order of the short-term linear prediction, in that each of these p parameters is quantized by subdividing a variation interval ([m_i, min{M_i, cos 2πf_(i-1)}]), included in a respective reference interval ([m_i, M_i]), into 2^(N_i) segments, N_i being the number of coding bits devoted to the quantization of this parameter, and in that, at least for the first ordered parameters, reference intervals are used that are each chosen from two distinct intervals ([m_i,1, M_i,1], [m_i,2, M_i,2]) according to the determined spectral state (Y) of the speech signal.
- Method according to any one of claims 1 to 6 or claim 12, characterized in that the coefficients (a_i) of the short-term synthesis filter are represented by a set of p ordered line spectrum frequency parameters (cos 2πf_i), p being the order of the short-term linear prediction, in that each of these p parameters is quantized by subdividing a variation interval ([m_i, min{M_i, cos 2πf_(i-1)}]), included in a respective reference interval ([m_i, M_i]), into 2^(N_i) segments, N_i being the number of coding bits devoted to the quantization of this parameter, and in that at least some of the numbers of coding bits N_i are given one or the other of two distinct values according to the determined spectral state (Y) of the speech signal.
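The scalar quantization described in the last two claims can be illustrated with a minimal Python sketch. It is not the patent's implementation: the interval bounds, bit allocations, the order p = 4 and all function names are hypothetical, and two details are assumptions not stated in the claims, namely that the upper bound of each variation interval uses the previously *quantized* parameter (so that a decoder can rebuild the same interval) and that each segment is reconstructed at its midpoint.

```python
import math

# Hypothetical reference intervals (m_i, M_i) and bit counts N_i for each
# spectral state Y (1 or 2). Real tables would come from training data.
REF_INTERVALS = {
    1: [(0.6, 1.0), (0.0, 0.9), (-0.6, 0.5), (-1.0, 0.0)],
    2: [(0.3, 1.0), (-0.3, 0.8), (-0.8, 0.3), (-1.0, -0.2)],
}
BITS = {1: [3, 3, 3, 3], 2: [4, 3, 3, 2]}


def _cell(m, upper, n_bits):
    """Return (number of segments, segment width) for the interval [m, upper]."""
    levels = 2 ** n_bits
    step = max(upper - m, 0.0) / levels
    return levels, step


def quantize_lsp(params, ref_intervals, bits):
    """Quantize ordered parameters cos(2*pi*f_i), which decrease with i.

    Each parameter is coded on N_i bits inside [m_i, min(M_i, previous value)].
    Assumption: the previous *quantized* value bounds the interval.
    Returns (segment indices, reconstructed values).
    """
    indices, recon = [], []
    prev = None
    for x, (m, M), n in zip(params, ref_intervals, bits):
        upper = M if prev is None else min(M, prev)
        levels, step = _cell(m, upper, n)
        k = math.floor((x - m) / step) if step > 0 else 0
        k = max(0, min(levels - 1, k))      # clamp values outside the interval
        xq = m + (k + 0.5) * step           # assumption: midpoint reconstruction
        indices.append(k)
        recon.append(xq)
        prev = xq
    return indices, recon


def dequantize_lsp(indices, ref_intervals, bits):
    """Rebuild the quantized parameters from the indices (decoder side)."""
    recon = []
    prev = None
    for k, (m, M), n in zip(indices, ref_intervals, bits):
        upper = M if prev is None else min(M, prev)
        _, step = _cell(m, upper, n)
        xq = m + (k + 0.5) * step
        recon.append(xq)
        prev = xq
    return recon


# Hypothetical frame: p = 4 ordered parameters, coded in spectral state Y = 1.
EXAMPLE = [0.85, 0.4, -0.1, -0.7]
Y = 1
example_indices, example_recon = quantize_lsp(EXAMPLE, REF_INTERVALS[Y], BITS[Y])
```

Because the variation interval shrinks whenever the previous quantized parameter is below M_i, the same N_i bits give a finer resolution than a fixed interval would, and the reconstructed parameters stay ordered. Switching `Y` selects the other set of intervals and bit counts, which is the state-dependent choice the claims describe.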
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR9406825A FR2720850B1 (fr) | 1994-06-03 | 1994-06-03 | Procédé de codage de parole à prédiction linéaire. |
FR9406825 | 1994-06-03 |
Publications (2)
Publication Number | Publication Date |
---|---|
EP0685833A1 (fr) | 1995-12-06 |
EP0685833B1 (fr) | 2000-04-26 |
Family
ID=9463861
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP95401262A Expired - Lifetime EP0685833B1 (fr) | 1994-06-03 | 1995-05-31 | Procédé de codage de parole à prédiction linéaire |
Country Status (4)
Country | Link |
---|---|
US (1) | US5642465A (fr) |
EP (1) | EP0685833B1 (fr) |
DE (1) | DE69516455T2 (fr) |
FR (1) | FR2720850B1 (fr) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH08179796A (ja) * | 1994-12-21 | 1996-07-12 | Sony Corp | 音声符号化方法 |
FR2729247A1 (fr) * | 1995-01-06 | 1996-07-12 | Matra Communication | Procede de codage de parole a analyse par synthese |
JP3196595B2 (ja) * | 1995-09-27 | 2001-08-06 | 日本電気株式会社 | 音声符号化装置 |
JPH09230896A (ja) * | 1996-02-28 | 1997-09-05 | Sony Corp | 音声合成装置 |
JP3094908B2 (ja) * | 1996-04-17 | 2000-10-03 | 日本電気株式会社 | 音声符号化装置 |
US6253172B1 (en) * | 1997-10-16 | 2001-06-26 | Texas Instruments Incorporated | Spectral transformation of acoustic signals |
US6094629A (en) * | 1998-07-13 | 2000-07-25 | Lockheed Martin Corp. | Speech coding system and method including spectral quantizer |
US7379865B2 (en) * | 2001-10-26 | 2008-05-27 | At&T Corp. | System and methods for concealing errors in data transmission |
KR20050049103A (ko) * | 2003-11-21 | 2005-05-25 | 삼성전자주식회사 | 포만트 대역을 이용한 다이얼로그 인핸싱 방법 및 장치 |
JP4975829B2 (ja) * | 2007-12-25 | 2012-07-11 | パナソニック株式会社 | 超音波診断装置 |
PT3364411T (pt) | 2009-12-14 | 2022-09-06 | Fraunhofer Ges Forschung | Dispositivo de quantização de vetor, dispositivo de codificação de voz, método de quantização de vetor e método de codificação de voz |
US9093068B2 (en) * | 2010-03-23 | 2015-07-28 | Lg Electronics Inc. | Method and apparatus for processing an audio signal |
CN103928031B (zh) * | 2013-01-15 | 2016-03-30 | 华为技术有限公司 | 编码方法、解码方法、编码装置和解码装置 |
RU2713852C2 (ru) | 2014-07-29 | 2020-02-07 | Телефонактиеболагет Лм Эрикссон (Пабл) | Оценивание фонового шума в аудиосигналах |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
NL8500843A (nl) | 1985-03-22 | 1986-10-16 | Koninkl Philips Electronics Nv | Multipuls-excitatie lineair-predictieve spraakcoder. |
- 1994
  - 1994-06-03 FR FR9406825A patent/FR2720850B1/fr not_active Expired - Fee Related
- 1995
  - 1995-05-31 EP EP95401262A patent/EP0685833B1/fr not_active Expired - Lifetime
  - 1995-05-31 DE DE69516455T patent/DE69516455T2/de not_active Expired - Fee Related
  - 1995-06-05 US US08/465,263 patent/US5642465A/en not_active Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
US5642465A (en) | 1997-06-24 |
EP0685833A1 (fr) | 1995-12-06 |
FR2720850A1 (fr) | 1995-12-08 |
DE69516455D1 (de) | 2000-05-31 |
FR2720850B1 (fr) | 1996-08-14 |
DE69516455T2 (de) | 2001-01-25 |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): DE ES GB IT NL SE |
17P | Request for examination filed | Effective date: 19951228 |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: MATRA NORTEL COMMUNICATIONS |
GRAG | Despatch of communication of intention to grant | Free format text: ORIGINAL CODE: EPIDOS AGRA |
17Q | First examination report despatched | Effective date: 19990803 |
GRAG | Despatch of communication of intention to grant | Free format text: ORIGINAL CODE: EPIDOS AGRA |
GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1 Designated state(s): DE ES GB IT NL SE |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20000426; Ref country code: ES Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY Effective date: 20000426 |
RIC1 | Information provided on ipc code assigned before grant | Free format text: 7G 10L 19/04 A, 7G 10L 101:10 Z |
REF | Corresponds to: | Ref document number: 69516455 Country of ref document: DE Date of ref document: 20000531 |
GBT | Gb: translation of ep patent filed (gb section 77(6)(a)/1977) | Effective date: 20000601 |
ITF | It: translation for a ep patent filed | |
NLV1 | Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act | |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
26N | No opposition filed | |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: SE Payment date: 20010419 Year of fee payment: 7 |
REG | Reference to a national code | Ref country code: GB Ref legal event code: IF02 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20020601 |
EUG | Se: european patent has lapsed | |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE Payment date: 20040528 Year of fee payment: 10 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB Payment date: 20050414 Year of fee payment: 11 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IT Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED. Effective date: 20050531 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: DE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20051201 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20060531 |
GBPC | Gb: european patent ceased through non-payment of renewal fee | Effective date: 20060531 |