EP0361432B1 - Method and device for coding and decoding speech signals using multipulse excitation - Google Patents
- Publication number
- EP0361432B1 (application EP89117837A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- signal
- long
- term
- gain
- excitation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/10—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
Definitions
- The present invention concerns medium-low bit-rate speech signal coding systems, and more particularly it relates to a coding-decoding method and device using a multipulse analysis-by-synthesis excitation technique.
- Multipulse linear prediction coding is one of the most promising techniques for obtaining high quality synthetic speech at bit rates below 16 kbit/s. This technique was originally proposed by B. S. Atal and J. R. Remde in the paper entitled "A new method of LPC excitation for producing natural-sounding speech at low bit rates", International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 614-617, Paris, 1982.
- The excitation signal for the synthesis filter consists of a train of pulses whose amplitudes and time positions are determined so as to minimize a perceptually meaningful distortion measurement; such a measurement is obtained by comparing the samples at the synthesis filter output with the original speech samples and simultaneously weighting the difference by a function which takes into account how human perception evaluates the distortion introduced (analysis-by-synthesis procedure).
- Systems in which the synthesizer comprises the cascade of a long-term and a short-term synthesis filter are of particular interest: they provide signals whose quality decreases gradually as the bit rate decreases, and they do not present a dramatic performance deterioration below a threshold rate.
- The invention provides a method and a device allowing quality to be increased while leaving the bit rate unchanged, or a given quality to be maintained even at a lower bit rate.
- This can be achieved by using a combined optimization technique, of sequential type, for the parameters of the long-term synthesis filter and of the excitation within the analysis-by-synthesis procedure; the sequential procedure is sub-optimum with respect to the original optimum one, but it is easier to implement.
- A method is provided in which the parameters are optimized according to a particular error minimization procedure, which is a closed-loop analysis.
- The terms "open loop analysis" and "closed loop analysis" are used here as explained e.g. in IEEE Journal on Selected Areas in Communications, Vol. 6, No. 2, Feb. 1988, pp. 353-363, Kroon and Deprettere.
- the long-term analysis means are apt to determine said lag and gain in two successive steps, preceding a step in which the amplitudes and positions of the excitation pulses are determined by said excitation generator, and comprise: a second long-term synthesis filter, which is fed with a null signal and in which, for the computation of the lag, there is used
- a generic speech signal coding-decoding system can be schematized by a coder COD, a transmission channel CH and a decoder DEC.
- Coder COD receives digital samples s(n) of the original speech signal, organized into frames each comprising a predetermined number of samples, and sends onto channel CH, for each sample frame, the coding of a suitable representation ω(k) of a group of linear prediction coefficients a(k) obtained by a short-term analysis of the speech signal, the coded amplitudes and positions A(i), Cp of the pulses forming the excitation signal, the coded r.m.s. values σ(i) of the excitation pulses, and the codings of two parameters (gain B and lag M) determined by the long-term analysis.
- Decoder DEC reconstructs the excitation and generates a synthesized speech signal on the basis of the reconstructed excitation, the linear prediction coefficients reconstructed starting from the transmitted representation thereof, and long-term analysis parameters.
- the digital sample frames, present on connection 1 are supplied to a spectral shaping circuit SW and to a short-term analysis circuit STA.
- Spectral shaping circuit SW performs a frequency-shaping of the speech signal in order to render the differences between the original and the reconstructed speech signals less perceptible in correspondence with the formants of the original speech signal.
- Such a circuit consists of a pair of cascaded digital filters F1, F2, whose transfer functions, in z transform, are given in a non-limiting example respectively by relations A(z) and 1/A(z/γ), where z^-1 represents a sampling interval delay; â(k) is a quantized linear prediction coefficient vector (1 ≤ k ≤ p, where p is the filter order) reconstructed from the coded representation of the linear prediction coefficients obtained as short-term analysis result; γ is an experimentally determined constant correcting factor, determining the bandwidth increase around the formants.
- A signal r(n), hereinafter referred to as "residual signal", is obtained on output connection 2 of F1.
- The spectrally shaped speech signal sw(n) is obtained on output connection 3 of F2: both signals are used in long-term analysis.
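The cascade F1, F2 described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sign convention A(z) = 1 - Σ â(k) z^-k is assumed (the text does not spell it out), both filters start from zero state, and the function names `fir_filter`, `allpole_filter` and `spectral_shaping` are hypothetical.

```python
def fir_filter(coeffs, x):
    """y(n) = sum_k coeffs[k] * x(n - k), with zero initial state."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * x[n - k]
        y.append(acc)
    return y


def allpole_filter(den, x):
    """y(n) = x(n) - sum_{k>=1} den[k] * y(n - k), with den[0] == 1."""
    y = []
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, len(den)):
            if n - k >= 0:
                acc -= den[k] * y[n - k]
        y.append(acc)
    return y


def spectral_shaping(s, a_hat, gamma):
    """Return (r, sw): residual from F1 = A(z), shaped speech from F2 = 1/A(z/gamma).

    Assumed convention: A(z) = 1 - sum_k a_hat[k-1] * z^-k.
    """
    A = [1.0] + [-a for a in a_hat]
    # Replacing z by z/gamma scales the k-th coefficient by gamma**k.
    A_gamma = [c * gamma ** k for k, c in enumerate(A)]
    r = fir_filter(A, s)             # F1: residual signal r(n)
    sw = allpole_filter(A_gamma, r)  # F2: spectrally shaped signal sw(n)
    return r, sw
```

With gamma equal to 1 the two filters cancel and sw(n) reproduces s(n); values of gamma below 1 widen the formant bandwidths of the second filter, which is the shaping effect the text refers to.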
- Short-term analysis circuit STA determines linear prediction coefficients a(k), which depend on short-term correlations deriving from the non-flat spectral envelope of the speech signal. Circuit STA calculates coefficients a(k) according to the classical autocorrelation method, as described in "Digital Processing of Speech Signals" by L.R. Rabiner and R.W. Schafer (Prentice-Hall, Englewood Cliffs, N.J., USA, 1978), page 401, and uses to this aim a set of digital samples sh(n) which can comprise, besides the samples of the current frame, a certain number of samples of both the preceding and the following frames.
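As an illustration of the autocorrelation method mentioned above, the following Python sketch computes the autocorrelation of a block of samples and solves the normal equations with the Levinson-Durbin recursion. It assumes the predictor convention s(n) ≈ Σ a(k) s(n-k) and omits any analysis window; the function names are hypothetical and the text does not state which window or order STA actually uses.

```python
def autocorrelation(x, p):
    """R(k) = sum_n x(n) * x(n - k) for k = 0..p."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(p + 1)]


def levinson_durbin(R, p):
    """Solve the autocorrelation normal equations for a(1..p).

    Returns (a, err) where a[k-1] is the k-th predictor coefficient and
    err is the final prediction error energy.
    """
    a = [0.0] * (p + 1)
    err = R[0]
    for i in range(1, p + 1):
        if err <= 0.0:
            break  # degenerate (e.g. all-zero) input: stop early
        acc = R[i] - sum(a[j] * R[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient of order i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err
```

For an autocorrelation sequence that decays as 0.5^k, the recursion returns a first coefficient of 0.5 and a second coefficient of 0, as expected for a first-order process.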
- Block STA also comprises circuits for transforming the coefficients into a group of parameters ω(k) in the frequency domain, known as "line spectrum pairs", which are presented on output 5 of STA.
- Line spectrum pairs denote the resonant frequencies at which the acoustic tube to which the vocal tract can be assimilated exhibits a line spectrum structure under extreme boundary conditions corresponding to complete opening and complete closure at the glottis.
- the conversion of linear prediction coefficients into line spectrum pairs is described e.g. by N. Sugamura and F.Itakura in the paper "Speech analysis and synthesis method developed at ECL in NTT - From LPC to LSP", Speech Communication, Vol.5, No.2, June 1986, pages 199-215.
- Line spectrum pairs ω(k), or the differences between adjacent line spectrum pairs, are then vectorially quantized in a vector quantization circuit VQ exploiting techniques of the type described in published European Patent application EP-A-186763 (CSELT), applied to a set of codebooks.
- That vector, instead of being coded by a single word with that number of bits, is quantized by a group of words of smaller size chosen out of suitable sub-codebooks.
- The quantization modalities of the above patent application are applied to obtain each of said words.
- Vector quantizer VQ is one of the characteristics of the present invention and allows a reduction in the number of bits necessary to code the results of the short-term analysis, from about 34-36 bits (scalar quantization) to 24 (vector quantization), while maintaining the same quality of the coded signal.
- The differences, organized into three vectors of 3, 3 and 4 components respectively, may be quantized with 24 bits organized into three groups of 256 words, each group corresponding to one of said vectors.
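The split-codebook arrangement just described (three sub-vectors of 3, 3 and 4 components, each with its own 256-word codebook, hence 3 x 8 = 24 bits) can be sketched as follows. The codebook contents, the squared-error distance and the helper names are assumptions made for illustration only; the actual codebooks and search of EP-A-186763 are not reproduced here.

```python
SPLIT = (3, 3, 4)  # sub-vector sizes stated in the text


def make_toy_codebook(size, dim):
    """Hypothetical placeholder codebook with distinct words (not trained data)."""
    return [[(i + 1) * 0.01 * (j + 1) for j in range(dim)] for i in range(size)]


def nearest(codebook, vec):
    """Index of the codeword with the smallest squared error to vec."""
    best_i, best_d = 0, float("inf")
    for i, word in enumerate(codebook):
        d = sum((w - v) ** 2 for w, v in zip(word, vec))
        if d < best_d:
            best_i, best_d = i, d
    return best_i


def split_vq_encode(values, codebooks):
    """Quantize each sub-vector separately and return one index per codebook."""
    indices, start = [], 0
    for size, cb in zip(SPLIT, codebooks):
        indices.append(nearest(cb, values[start:start + size]))
        start += size
    return indices


def split_vq_decode(indices, codebooks):
    """Concatenate the selected codewords back into a 10-component vector."""
    out = []
    for i, cb in zip(indices, codebooks):
        out.extend(cb[i])
    return out


def bits_needed(codebooks):
    """Bits required to transmit one index per codebook."""
    return sum((len(cb) - 1).bit_length() for cb in codebooks)


codebooks = [make_toy_codebook(256, d) for d in SPLIT]
```

Searching three 256-word codebooks costs 768 distance evaluations per frame, whereas a single 24-bit codebook would hold about 16.7 million words, which is the practical motivation for splitting.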
- the indices of the vectors are sent by VQ on a connection 6 which belongs to channel CH.
- A circuit DCO obtains from said indices quantized linear prediction coefficients â(k) which are supplied, through connection 4, to filters F1, F2 of circuit SW, to an excitation generator EG and to a long-term analysis circuit LTA.
- LTA supplies information dependent on the fine spectral structure of the signal, which information is used to make the synthesized signal more natural-sounding.
- the samples relevant to M preceding sampling instants, weighted by a weighting factor (gain) B, are used.
- The task of LTA is precisely to determine both M and B.
- Lag M, in the case of a voiced sound, corresponds to the pitch period.
- the lag can range from 20 to 83 samples and it is updated every frame. The gain is on the contrary updated every half frame.
- Values M and B are emitted on a connection 7 and are supplied to excitation generator EG which also receives, through a connection 8, a signal swe(n), obtained from sw(n) in a manner which will be described hereinafter. Values M and B are also sent to a coder LTC, which transfers the coded signals onto a connection 9 belonging to channel CH.
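As a small worked example of what coder LTC has to transmit: a lag between 20 and 83 samples takes 64 distinct values, so it fits in 6 bits when sent as an offset from the minimum. The sketch below assumes this offset coding and treats the number of bits per gain value as a free, hypothetical parameter `gain_bits`, since the text does not specify how B is quantized.

```python
LAG_MIN, LAG_MAX = 20, 83  # lag range stated in the text


def lag_bits():
    """Bits needed to index every lag in [LAG_MIN, LAG_MAX]."""
    return (LAG_MAX - LAG_MIN).bit_length()


def encode_lag(M):
    """Assumed coding: transmit the lag as an offset from LAG_MIN."""
    if not LAG_MIN <= M <= LAG_MAX:
        raise ValueError("lag out of range")
    return M - LAG_MIN


def decode_lag(index):
    """Inverse of encode_lag."""
    return index + LAG_MIN


def long_term_bits_per_frame(gain_bits):
    """One lag per frame plus one gain per half frame (gain_bits is hypothetical)."""
    return lag_bits() + 2 * gain_bits
```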
- Long-term analysis circuit LTA performs a closed-loop analysis as a part of the procedure for determining the pulse positions, with modalities allowing a good coder performance to be maintained even if a sub-optimum procedure is used, as will be better described hereinafter.
- Excitation generator EG supplies the sequence of Ns pulses (e.g. 6), distributed within a time period Ls (more particularly corresponding to half a frame), forming the excitation signal; such a signal is computed so as to minimize a mean squared error, frequency shaped as mentioned, between the original signal and the reconstructed one.
- Excitation generator EG supplies, through a connection 10, the pulses it has generated to a circuit PAC coding the amplitudes and the positions of such pulses; this circuit also calculates and codes the r.m.s. values of said pulses.
- The coded values σ(i), A(i) (1 ≤ i ≤ Ns) and Cp are emitted on a connection 11, also belonging to channel CH.
- The structure of circuit PAC is known to those skilled in the art.
- An excitation decoder ED reconstructs the excitation starting from the coded values σ(i), A(i), Cp.
- reconstructed excitation pulses ê are supplied by ED to a long-term synthesis filter LTP1 which, together with a short-term synthesis filter STP, forms synthesizer SYN.
- Reconstructed residual signal r̂ is present at the output of LTP1 and is sent via a connection 14 to short-term synthesis filter STP.
- This is a filter whose transfer function in z transform is 1/A(z), where A(z) is the function already examined for filter F1 of spectral shaping circuit SW.
- Coefficients â(k) for filter STP are supplied through a connection 15 from a circuit STD, which reconstructs them by decoding the information relevant to line spectrum pairs.
- Filter STP emits on connection 16 the reconstructed or synthesized speech signal ŝ.
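The decoder-side synthesizer SYN described above (LTP1 followed by STP) can be sketched as two recursions. The long-term transfer function 1/(1 - B·z^-M) is taken from the claims; the sign convention A(z) = 1 - Σ â(k) z^-k, the zero initial state of STP and the function names are assumptions made for this illustration.

```python
def long_term_synthesis(e, B, M, memory=None):
    """LTP1 sketch: r(n) = e(n) + B * r(n - M), i.e. 1/(1 - B z^-M).

    memory holds past output samples (most recent last); missing history
    is treated as zero.
    """
    past = list(memory) if memory is not None else []
    if len(past) < M:
        past = [0.0] * (M - len(past)) + past
    buf = past
    start = len(buf)
    for n, sample in enumerate(e):
        buf.append(sample + B * buf[start + n - M])
    return buf[start:]


def short_term_synthesis(r, a_hat):
    """STP sketch: s(n) = r(n) + sum_k a_hat(k) * s(n - k), i.e. 1/A(z)."""
    s = []
    for n, sample in enumerate(r):
        acc = sample
        for k, a in enumerate(a_hat, start=1):
            if n - k >= 0:
                acc += a * s[n - k]
        s.append(acc)
    return s


def synthesize(e, B, M, a_hat, memory=None):
    """Cascade of the two filters: excitation -> residual -> speech."""
    return short_term_synthesis(long_term_synthesis(e, B, M, memory), a_hat)
```

Feeding a single unit pulse with B = 0.5 and M = 2 produces echoes of 0.5 and 0.25 at multiples of the lag, which is how the long-term filter restores the pitch periodicity of voiced sounds.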
- the optimum solution would be determining, for each pair of possible values m, b of the lag and gain used to determine the optimum values M, B to be exploited in the synthesis, the combination of excitation pulses, gain and lag minimizing the mean squared error between the original signal and the reconstructed signal.
- The optimum solution is too complex and hence, according to the invention, the determination of M and B is separated from that of the excitation pulses. There are hence two successive operation phases.
- Values M, B of m and b are to be found which minimize the mean squared error between frequency-shaped speech signal sw(n) and a signal sw0(n) obtained by weighting, in the same way as the residual signal, a signal r0 obtained as a response from a long-term synthesis filter (similar to the one of the synthesizer) when a zero has been forced at the filter input (long-term synthesis filter memory).
- A predetermined value b is allotted to the gain and the error is minimized for each value m of lag: once optimum lag M has been found, the next step is that of determining the optimum gain B.
- value B of b is chosen which renders E(M, b) minimum.
- B is computed every half frame, and hence also the excitation pulses will be computed every half frame.
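The two-step determination of M and B can be illustrated with a simplified Python sketch. It is not the procedure of the patent in detail: here both steps simply evaluate E(m, b) by brute force, the weighting filter starts from zero state, the convention A(z) = 1 - Σ â(k) z^-k is assumed, and one block of samples is processed without distinguishing frames from half frames. All function names and the candidate gain set are hypothetical.

```python
def zero_input_response(past_r, b, m, length):
    """Output of 1/(1 - b z^-m) fed with a null signal: r0(n) = b * r0(n - m)."""
    if len(past_r) < m:
        raise ValueError("filter memory must hold at least m past samples")
    buf = list(past_r)
    start = len(buf)
    for n in range(length):
        buf.append(b * buf[start + n - m])
    return buf[start:]


def weight(x, a_hat, gamma):
    """All-pole weighting 1/A(z/gamma), zero initial state (assumed)."""
    y = []
    for n in range(len(x)):
        acc = x[n]
        for k, a in enumerate(a_hat, start=1):
            if n - k >= 0:
                acc += a * gamma ** k * y[n - k]
        y.append(acc)
    return y


def squared_error(sw, past_r, b, m, a_hat, gamma):
    """E(m, b): energy of sw(n) - sw0(n) over the block."""
    sw0 = weight(zero_input_response(past_r, b, m, len(sw)), a_hat, gamma)
    return sum((x - y) ** 2 for x, y in zip(sw, sw0))


def long_term_search(sw, past_r, a_hat, gamma, lags, gains, b_fixed=1.0):
    """Step 1: best lag M for a predetermined gain. Step 2: best gain B for that M."""
    M = min(lags, key=lambda m: squared_error(sw, past_r, b_fixed, m, a_hat, gamma))
    B = min(gains, key=lambda b: squared_error(sw, past_r, b, M, a_hat, gamma))
    return M, B
```

Fixing the gain while scanning the lag reduces the search from (number of lags) x (number of gains) error evaluations to their sum, at the cost of the sub-optimality acknowledged in the text.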
- Fig. 3 shows a block diagram of the devices of LTP and EG in case signal 0 is used to determine M and B.
- a synthesis filter LTP2 having a transfer function similar to that of LTP1 (Fig. 1), is fed with a null signal.
- filter LTP2 successively uses the different values m and, for each of them, an optimum value b(ott) which is implicitly obtained in the above-mentioned derivative operation.
- For the computation of B, LTP2 uses value M of the lag determined in the preceding step and different values b.
- Values m and b are supplied to LTP2 by a processing unit CMB, carrying out the computations and comparisons mentioned above.
- Signal r0 is present on output 20 of LTP2.
- Output 20 is connected to a first input of a multiplexer MX1 receiving at a second input the residual signal r(n) present on connection 2, and letting through signal r0 or signal r depending on the relative value of m and n.
- signal 0 is present on output connection 21 of MX1, and that signal is delayed by a time equal to m samples in a delay element DL1 before being sent to CMB.
- the latter receives also signal r(n) and, for each frame and for all values m, calculates function R'(m) and determines the value M of m which maximizes such function.
- the value is stored into a register RM and made available on wires 7a of connection 7.
- Output 20 of LTP2 is also connected to a weighting filter F3, which is enabled only while B is being computed and has the same transfer function 1/A(z/γ) as filter F2 in SW (Fig. 1).
- Filter F3 weights signal r0 (or r'0, when the gain used in LTP2 is 1) giving at output 22 signal sw0 (s'w0).
- The latter is supplied at an input of an adder SM1 where it is subtracted from signal sw coming from spectral shaping filter SW (Fig. 1) via connection 3.
- SM1 supplies on output 8 signal swe.
- device CMB determines, every half frame, value B of b which minimizes E and stores it into register RB which keeps it available, for the whole half frame, on a group of wires 7b of connection 7.
- Values B, M computed by CMB are supplied to LTC (Fig.1) and to a long-term synthesis filter LTP3 which is part of the excitation generator EG and is followed by a weighting filter F4.
- Filters LTP3, F4 have transfer functions similar to those of LTP1 and F2, respectively;
- LTP3 is fed, during the analysis-by-synthesis procedure, with the excitation pulses e(i) supplied via connection 10 by a processing unit CE which sequentially determines the positions and the amplitudes of the various pulses.
- F4 emits on output 24 signal ŝwe which is supplied to a first input of an adder SM2 receiving at a second input signal swe outgoing from SM1. The difference between the two signals is then supplied via connection 25 to CE, which determines pulses e(i) by minimizing mean squared error dw.
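The text states that CE determines the pulses sequentially but does not detail the search. A common way to do this, in the spirit of the classical multipulse approach, is a greedy loop: place one pulse where it removes the most error energy, subtract its filtered contribution from the target, and repeat. The sketch below assumes that the cascade LTP3 + F4 can be represented by a truncated impulse response `h`, which is a simplification; `multipulse_search` is a hypothetical name.

```python
def multipulse_search(target, h, num_pulses):
    """Greedy sequential pulse placement.

    target: signal to approximate (the role played by swe in the text)
    h: assumed truncated impulse response of the synthesis/weighting cascade
    Returns (pulses, residual) with pulses as (position, amplitude) pairs.
    """
    L = len(target)
    residual = list(target)
    pulses = []
    for _ in range(num_pulses):
        best = None
        for pos in range(L):
            seg = h[: L - pos]  # part of the response that fits in the block
            energy = sum(v * v for v in seg)
            if energy == 0.0:
                continue
            corr = sum(residual[pos + k] * seg[k] for k in range(len(seg)))
            reduction = corr * corr / energy  # error energy removed by this pulse
            if best is None or reduction > best[0]:
                best = (reduction, pos, corr / energy)
        if best is None:
            break
        _, pos, amp = best
        pulses.append((pos, amp))
        # Remove the contribution of the chosen pulse from the target.
        for k in range(min(len(h), L - pos)):
            residual[pos + k] -= amp * h[k]
    return pulses, residual
```

For example, a target built from two well-separated filtered pulses is recovered exactly in two iterations, leaving a zero residual. With overlapping pulses the greedy choice is no longer guaranteed to be jointly optimal, which is why practical coders often re-optimize the amplitudes after the positions are fixed.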
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Analogue/Digital Conversion (AREA)
- Dc Digital Transmission (AREA)
Claims (6)
- A method of speech signal coding and decoding using a multipulse analysis-by-synthesis excitation technique, the method comprising a coding phase including the following operations:
  - conversion of the speech signal into frames of digital samples [s(n)];
  - short-term analysis of the speech signal, to determine a group of linear prediction coefficients [a(k)] (k = 1,...,p) relevant to a current frame and a representation of those coefficients as line spectrum pairs;
  - coding of the representation of the linear prediction coefficients and obtaining quantized linear prediction coefficients [â(k)] from that representation;
  - spectral shaping of the speech signal, by weighting the digital samples [s(n)] of a frame by a first and a second weighting function A(z), 1/A(z/γ), where [...]
  - long-term analysis of the speech signal, using the residual signal [r(n)] and the spectrally shaped signal [sw(n)], to determine the lag (M) separating a current sample from a preceding sample [r(n-M)] used to process the current sample, and the gain (B) by which the preceding sample is weighted for the processing;
  - determination of the positions and amplitudes of the excitation pulses using the results of the short-term and long-term analysis;
  - coding of the lag and gain values of the long-term analysis and of the amplitudes and positions of the excitation pulses, the coded values forming, together with the coded representation of the linear prediction coefficients and the coded r.m.s. values of the excitation pulses, the coded speech signal;

  and also comprising a decoding phase, in which the excitation is reconstructed from the coded values of the amplitudes, positions and r.m.s. values of the pulses, and a synthesized speech signal [ŝ(n)] is generated by passing the reconstructed excitation (ê) through a long-term synthesis filter 1/(1-B·z^-M) followed by a short-term synthesis filter [1/A(z)], which use the long-term analysis parameters and the quantized linear prediction coefficients respectively, in which method the long-term analysis and the excitation pulse generation are performed in successive phases, in the first of which the lag (M) and the gain (B) of the long-term analysis are determined by minimizing a mean squared error between the spectrally shaped speech signal [sw(n)] and a further signal [sw0(n)] obtained by weighting with the second weighting function 1/A(z/γ) the signal resulting from a long-term synthesis filtering, which is similar to that performed during the decoding phase and in which the signal used for the synthesis is a null signal, while in the second phase the amplitudes and positions of the excitation pulses [e(i)] are actually determined, by minimizing the mean squared error between a signal [swe(n)] representing the difference between the spectrally shaped speech signal [sw(n)] and the further signal [sw0(n)], and a third weighted signal [ŝwe(n)], obtained by subjecting the excitation pulses to a long-term synthesis filtering and to a weighting by the second weighting function; and in which the coding of the representation of the linear prediction coefficients consists of a vector quantization of the line spectrum pairs or of the differences between adjacent line spectrum pairs according to a split-codebook quantization technique.
- Method according to claim 1, characterized in that the lag (M) and the gain (B) are determined in two successive steps, in the first of which an optimum value of the lag is determined by minimizing the error for a predetermined value of the gain, while in the second the optimum value of the gain is determined, using the optimum value of the lag.
- Method according to claim 1, characterized in that the lag (M) and the gain (B) are determined in two successive steps, in the first of which the mean squared error is minimized between the residual signal [r(n)] and a signal [₀(n)] which is the signal [r₀(n)] resulting from the long-term synthesis filtering with null input, if the synthesis relevant to a sample of the current frame is performed on the basis of a sample of a preceding frame, and is the residual signal [r(n)] if the synthesis relevant to a sample of the current frame is performed on the basis of a preceding sample of the same frame, while in the second step the gain (B) is computed by the following sequence of operations: a value [s'w0(n)] of the further signal is determined for a unitary value of the gain; a first value E(M,1) of the error is then determined, and the operations are repeated to determine the value of the signal weighted by the second weighting function and of the error for each possible value of the gain, the value adopted being the one which minimizes the error.
- Method according to claim 3, characterized in that the lag (M) is computed at each frame, and the gain (B) at each half frame.
- Device for speech signal coding and decoding by multipulse analysis-by-synthesis excitation techniques, for carrying out the method according to any one of claims 1, 3 or 4, comprising, for speech signal coding:
  - means for converting the speech signal into frames of digital samples [s(n)];
  - means (STA) for the short-term analysis of the speech signal, which receive a set of samples from the converting means, compute a group of linear prediction coefficients [a(k)] (k = 1,...,p) relevant to a current frame and emit a representation of the linear prediction coefficients [a(k)] as line spectrum pairs;
  - means (VQ) for coding the representation of the linear prediction coefficients;
  - means (DCO) for obtaining quantized linear prediction coefficients [â(k)] from the coded representation;
  - a circuit (SW) for the spectral shaping of the speech signal, connected to the converting means and to the means (DCO) obtaining the quantized linear prediction coefficients, and comprising a pair of cascaded digital weighting filters (F1, F2), which weight the digital samples [s(n)] according to a first and a second weighting function A(z), 1/A(z/γ) respectively, where [...]
  - means (LTA) for the long-term analysis of the speech signal, connected to the output of the first filter (F1) and of the spectral shaping circuit (SW), to determine the lag (M) separating a current sample from a preceding sample [r(n-M)] used to process the current sample, and the gain (B) by which the preceding sample is weighted for the processing;
  - an excitation generator (EG) for determining the positions and amplitudes of the excitation pulses, connected to the short-term and long-term analysis means (STA, LTA) and to the spectral shaping circuit (SW);
  - means (LTC, PAC) for coding the lag and gain values of the long-term analysis and the amplitudes and positions of the excitation pulses, the coded values forming, together with the coded representation of the linear prediction coefficients and the r.m.s. values of the excitation pulses, the coded speech signal;

  and also comprising, for speech signal decoding (synthesis):
  - means (ED, LTD, STD) for reconstructing the excitation, the lag (M) and the gain (B) of the long-term analysis as well as the linear prediction coefficients [a(k)] from the coded signal; and
  - a synthesizer, comprising the cascade of a first long-term synthesis filter (LTP1), which receives the excitation pulses and the reconstructed gain and lag and which filters those pulses according to a first transfer function 1/(1-B·z^-M), and of a short-term synthesis filter (STP), having a second transfer function 1/A(z) which is the reciprocal of the first spectral shaping function A(z),

  wherein the long-term analysis means (LTA) are apt to determine the lag (M) and the gain (B) in two successive steps, preceding a phase in which the amplitudes and positions of the excitation pulses are determined by the excitation generator (EG), and comprise:
  - a second long-term synthesis filter (LTP2), which is fed with a null signal and in which, for the computation of the lag (M), there is used a predetermined set of values of the number of samples separating a current sample being synthesized from a preceding sample used for the synthesis and, for the computation of the gain (B), there is used a predetermined set of possible values of the gain itself;
  - a multiplexer (MX1) which receives at a first input a sample of the residual signal [r(n)] and at a second input a sample of the output signal of the second long-term synthesis filter (LTP2) and emits the samples present at one or the other input depending on whether or not the number of samples is lower than the length of a frame;
  - a third weighting filter (F3), which has the same transfer function as the second digital filter (F2) of the spectral shaping circuit (SW), is connected to the output of the second long-term synthesis filter (LTP2) and is enabled only during the determination of the long-term analysis gain (B);
  - a first adder (SM1), which receives at a first input the spectrally shaped signal (sw) and at a second input the output signal of the third weighting filter (F3) and supplies the difference between the signals present at the first and second inputs;
  - a first processing unit (CMB), which in a first of the two successive steps receives the output signal of the multiplexer (MX1) and determines the optimum value of the number of samples, and in the second of the two successive steps receives the output signal of the first adder (SM1) and determines, using the lag computed in the first step, the gain value which minimizes the mean squared error, within a validity period of the excitation pulses, between the input signals of the first adder (SM1);

  and wherein the excitation generator (EG) for generating the excitation pulses [e(i)] comprises:
  - a third long-term synthesis filter (LTP3), which has the same transfer function as the first long-term synthesis filter (LTP1) and which is fed with the excitation pulses generated;
  - a fourth weighting filter (F4), connected to the output of the third synthesis filter (LTP3) and having the same transfer function as the second and third weighting filters (F2, F3);
  - a second adder (SM2), which receives at a first input the output signal of the first adder (SM1) and at a second input the output signal of the fourth weighting filter (F4), and supplies at the output the difference between the signals present at the first and second inputs;
  - a second processing unit (CE) which is connected to the output of the second adder (SM2) and which determines the amplitudes and positions of the pulses by minimizing the mean squared error, within a validity period of the pulses, between the input signals of the second adder (SM2).
- Device according to claim 5, characterized in that the means (VQ) for coding the representation of the linear prediction coefficients comprise a vector quantizer (VQ) for the split-codebook vector quantization of the line spectrum pairs or of the differences between adjacent line spectrum pairs.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IT67868/88A IT1224453B (it) | 1988-09-28 | 1988-09-28 | Method and device for coding and decoding speech signals using multipulse excitation |
IT6786888 | 1988-09-28 |
Publications (3)
Publication Number | Publication Date |
---|---|
EP0361432A2 (fr) | 1990-04-04 |
EP0361432A3 (en) | 1990-09-26 |
EP0361432B1 (fr) | 1994-08-17 |
Family
ID=11305936
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP89117837A Expired - Lifetime EP0361432B1 (fr) | 1988-09-28 | 1989-09-27 | Method and device for coding and decoding speech signals using multi-pulse excitation |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP0361432B1 (fr) |
AT (1) | ATE110180T1 (fr) |
DE (2) | DE361432T1 (fr) |
ES (1) | ES2017906T3 (fr) |
GR (1) | GR900300170T1 (fr) |
IT (1) | IT1224453B (fr) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE69232879T2 (de) * | 1991-02-26 | 2003-05-08 | Nec Corp., Tokio/Tokyo | Sprachparameterkodierungsvorrichtung |
FI98104C (fi) * | 1991-05-20 | 1997-04-10 | Nokia Mobile Phones Ltd | Menetelmä herätevektorin generoimiseksi ja digitaalinen puhekooderi |
ES2042410B1 (es) * | 1992-04-15 | 1997-01-01 | Control Sys S A | Metodo de codificacion y codificador de voz para equipos y sistemas de comunicacion. |
FI95086C (fi) * | 1992-11-26 | 1995-12-11 | Nokia Mobile Phones Ltd | Menetelmä puhesignaalin tehokkaaksi koodaamiseksi |
FI96248C (fi) * | 1993-05-06 | 1996-05-27 | Nokia Mobile Phones Ltd | Menetelmä pitkän aikavälin synteesisuodattimen toteuttamiseksi sekä synteesisuodatin puhekoodereihin |
GB9408037D0 (en) * | 1994-04-22 | 1994-06-15 | Philips Electronics Uk Ltd | Analogue signal coder |
1988
- 1988-09-28 IT IT67868/88A patent/IT1224453B/it active
1989
- 1989-09-27 DE DE198989117837T patent/DE361432T1/de active Pending
- 1989-09-27 EP EP89117837A patent/EP0361432B1/fr not_active Expired - Lifetime
- 1989-09-27 AT AT89117837T patent/ATE110180T1/de active
- 1989-09-27 DE DE68917552T patent/DE68917552T2/de not_active Expired - Fee Related
- 1989-09-27 ES ES89117837T patent/ES2017906T3/es not_active Expired - Lifetime
1991
- 1991-09-27 GR GR90300170T patent/GR900300170T1/el unknown
Non-Patent Citations (2)
Title |
---|
ICASSP 86, IEEE-IECEJ-ASJ INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, Tokyo, 7th - 11th April 1986, vol. 4, pages 3067-3070, IEEE, New York, US; G. OHYAMA et al.: "A novel approach to estimating excitation code in code-excited linear prediction coding" * |
SIGNAL PROCESSING, Tokyo, 7th - 11th April 1986, vol. 3, pages 1689-1692, IEEE, New York, US; K. OZAWA et al.: "High quality multi-pulse speech coder with pitch prediction" * |
Also Published As
Publication number | Publication date |
---|---|
GR900300170T1 (en) | 1991-09-27 |
ES2017906T3 (es) | 1994-10-16 |
DE68917552D1 (de) | 1994-09-22 |
IT8867868A0 (it) | 1988-09-28 |
ATE110180T1 (de) | 1994-09-15 |
IT1224453B (it) | 1990-10-04 |
DE68917552T2 (de) | 1995-01-12 |
EP0361432A3 (en) | 1990-09-26 |
ES2017906A4 (es) | 1991-03-16 |
EP0361432A2 (fr) | 1990-04-04 |
DE361432T1 (de) | 1991-03-21 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
EP0409239B1 (fr) | Method for coding and decoding speech | |
EP1221694B1 (fr) | Voice encoder/decoder | |
EP1232494B1 (fr) | Gain smoothing in a wideband speech and audio signal decoder | |
CA1181854A (fr) | Digital speech coder | |
EP0360265B1 (fr) | Transmission system capable of modifying speech quality by classifying speech signals | |
US5602961A (en) | Method and apparatus for speech compression using multi-mode code excited linear predictive coding | |
US7280959B2 (en) | Indexing pulse positions and signs in algebraic codebooks for coding of wideband signals | |
EP1125286B1 (fr) | Perceptual weighting device and method for efficient coding of wideband signals | |
EP1224662B1 (fr) | CELP-type variable bit-rate speech coding with phonetic classification | |
KR100264863B1 (ko) | 디지털 음성 압축 알고리즘에 입각한 음성 부호화 방법 | |
US5339384A (en) | Code-excited linear predictive coding with low delay for speech or audio signals | |
JPH10187196A (ja) | 低ビットレートピッチ遅れコーダ | |
US5027405A (en) | Communication system capable of improving a speech quality by a pair of pulse producing units | |
EP0361432B1 (fr) | Method and device for coding and decoding speech signals using multi-pulse excitation | |
Cuperman et al. | Backward adaptation for low delay vector excitation coding of speech at 16 kbit/s | |
JPH086597A (ja) | 音声の励振信号符号化装置および方法 | |
US4908863A (en) | Multi-pulse coding system | |
US5708756A (en) | Low delay, middle bit rate speech coder | |
JPH0720897A (ja) | ディジタルコーダにおけるスペクトルパラメータを量子化する方法および装置 | |
KR0155798B1 (ko) | 음성신호 부호화 및 복호화 방법 | |
JP3296411B2 (ja) | 音声符号化方法および復号化方法 | |
JP2853170B2 (ja) | 音声符号化復号化方式 | |
JPH08320700A (ja) | 音声符号化装置 | |
JP3144244B2 (ja) | 音声符号化装置 | |
EP1212750A1 (fr) | VSELP-type vocoder |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
AK | Designated contracting states | Kind code of ref document: A2 Designated state(s): AT BE CH DE ES FR GB GR LI NL SE |
PUAL | Search report despatched | Free format text: ORIGINAL CODE: 0009013 |
AK | Designated contracting states | Kind code of ref document: A3 Designated state(s): AT BE CH DE ES FR GB GR LI NL SE |
17P | Request for examination filed | Effective date: 19901019 |
EL | Fr: translation of claims filed | |
TCAT | At: translation of patent claims filed | |
DET | De: translation of patent claims | |
TCNL | Nl: translation of patent claims filed | |
17Q | First examination report despatched | Effective date: 19920814 |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1 Designated state(s): AT BE CH DE ES FR GB GR LI NL SE |
REF | Corresponds to: | Ref document number: 110180 Country of ref document: AT Date of ref document: 19940915 Kind code of ref document: T |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: SE Payment date: 19940831 Year of fee payment: 6; Ref country code: BE Payment date: 19940831 Year of fee payment: 6 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: CH Payment date: 19940906 Year of fee payment: 6 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB Payment date: 19940919 Year of fee payment: 6 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GR Payment date: 19940921 Year of fee payment: 6 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: ES Payment date: 19940922 Year of fee payment: 6 |
REF | Corresponds to: | Ref document number: 68917552 Country of ref document: DE Date of ref document: 19940922 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE Payment date: 19940929 Year of fee payment: 6; Ref country code: AT Payment date: 19940929 Year of fee payment: 6 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: NL Payment date: 19940930 Year of fee payment: 6; Ref country code: FR Payment date: 19940930 Year of fee payment: 6 |
ET | Fr: translation filed | |
REG | Reference to a national code | Ref country code: ES Ref legal event code: FG2A Ref document number: 2017906 Country of ref document: ES Kind code of ref document: T3 |
REG | Reference to a national code | Ref country code: GR Ref legal event code: FG4A Free format text: 3012980 |
EAL | Se: european patent in force in sweden | Ref document number: 89117837.8 |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
26N | No opposition filed | |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GB Effective date: 19950927; Ref country code: AT Effective date: 19950927 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SE Effective date: 19950928; Ref country code: ES Free format text: LAPSE BECAUSE OF THE APPLICANT RENOUNCES Effective date: 19950928 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: LI Effective date: 19950930; Ref country code: CH Effective date: 19950930; Ref country code: BE Effective date: 19950930 |
BERE | Be: lapsed | Owner name: SOCIETA ITALIANA TELECOMUNICAZIONI S.P.A. ITALTE Effective date: 19950930; Owner name: SOCIETA ITALIANA PER L'ESERCIZIO DELLE TELECOMUNIC Effective date: 19950930 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GR Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY Effective date: 19960331 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: NL Effective date: 19960401 |
REG | Reference to a national code | Ref country code: CH Ref legal event code: PL |
GBPC | Gb: european patent ceased through non-payment of renewal fee | Effective date: 19950927 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: FR Effective date: 19960531 |
REG | Reference to a national code | Ref country code: GR Ref legal event code: MM2A Free format text: 3012980 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: DE Effective date: 19960601 |
NLV4 | Nl: lapsed or anulled due to non-payment of the annual fee | Effective date: 19960401 |
EUG | Se: european patent has lapsed | Ref document number: 89117837.8 |
REG | Reference to a national code | Ref country code: FR Ref legal event code: ST |
REG | Reference to a national code | Ref country code: ES Ref legal event code: FD2A Effective date: 19991007 |