EP0076234B1 - Procédé et dispositif pour traitement digital de la parole réduisant la redondance - Google Patents


Info

Publication number
EP0076234B1
Authority
EP
European Patent Office
Prior art keywords
speech
section
parameters
coded
sections
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
EP82810391A
Other languages
German (de)
English (en)
Other versions
EP0076234A1 (fr)
Inventor
Stephan Dr. Horvath
Carlo Bernasconi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Omnisec AG Te Regensdorf Zwitserland
Original Assignee
Gretag AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gretag AG filed Critical Gretag AG
Priority to AT82810391T priority Critical patent/ATE15415T1/de
Publication of EP0076234A1 publication Critical patent/EP0076234A1/fr
Application granted granted Critical
Publication of EP0076234B1 publication Critical patent/EP0076234B1/fr
Expired legal-status Critical Current

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction using predictive techniques
    • G10L19/06: Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

Definitions

  • The invention relates to a method operating according to the principle of linear prediction, and to a corresponding device, for redundancy-reducing digital speech processing according to the preamble of claims 1 and 13.
  • The LPC vocoders known and available today are not yet fully satisfactory. Although the speech resynthesized after analysis is usually still relatively intelligible, it is distorted and sounds artificial. One reason is the difficulty of deciding with sufficient certainty whether a speech section is voiced or unvoiced. Other causes include poor determination of the pitch period and inaccurate determination of the sound-formation filter parameters.
  • In many cases the data rate must be limited to a relatively low value; in telephone networks, for example, it is typically only 2.4 kbit/s.
  • The data rate is determined by the number of speech parameters analyzed per speech section, by the number of bits required for these parameters, and by the so-called frame rate, i.e. the number of speech sections per second.
  • At least slightly more than 50 bits are required per speech section. This automatically limits the maximum frame rate, e.g. to around 45/s in a 2.4 kbit/s system.
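The arithmetic behind this limit can be made explicit. The helper name `max_frame_rate` is illustrative, and 53 bits stands in for "slightly more than 50":

```python
def max_frame_rate(bit_rate_bps: int, bits_per_frame: int) -> float:
    """Maximum number of speech sections (frames) per second that fit
    into a fixed channel bit rate."""
    return bit_rate_bps / bits_per_frame

# 2.4 kbit/s channel, slightly more than 50 bits per frame:
rate = max_frame_rate(2400, 53)
print(round(rate))  # 45
```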
  • The voice quality at these relatively low frame rates is correspondingly poor. Increasing the frame rate, which would in itself improve voice quality, is not possible because it would exceed the specified data rate. Reducing the number of bits per frame, on the other hand, would require fewer parameters or a coarser quantization, which would automatically degrade the quality of the speech reproduction.
  • The present invention addresses primarily these difficulties caused by predetermined data rates, and aims in particular to improve a method or device of the type defined at the outset with regard to the quality of the speech reproduction without increasing the data rate.
  • The basic idea of the invention is therefore to save bits by improved coding of the speech parameters, so that the frame rate can be increased.
  • There is also an interrelation between the coding of the parameters and the frame rate, since less bit-intensive, redundancy-reducing coding is only possible, or sensible, at higher frame rates.
  • This is because the coding of the parameters according to the invention exploits the correlation between adjacent voiced speech sections (interframe correlation), which naturally becomes stronger with increasing frame rate.
  • The general structure and mode of operation of the speech processing device according to the invention are shown in FIG. 1. The analog voice signal originating from any source, e.g. a microphone 1, is band-limited in a filter 2 and then sampled and digitized in an A/D converter 3. The sampling rate is about 6 to 16 kHz, preferably about 8 kHz.
  • The resolution is about 8 to 12 bits.
  • The pass band of the filter 2 usually extends from approximately 80 Hz to approximately 3.1-3.4 kHz for so-called broadband speech, and from approximately 300 Hz to 3.1-3.4 kHz for telephone speech.
  • The speech section length is approximately 10 to 30 ms, preferably approximately 20 ms.
  • The frame rate, i.e. the number of frames per second, is approximately 30 to 100, preferably 50 to 70.
  • Speech sections as short as possible, and correspondingly high frame rates, are desirable, but limits are imposed on the one hand by the restricted performance of the computer used for real-time processing and on the other hand by the requirement of the lowest possible bit rates during transmission.
  • The analysis is therefore essentially divided into two main procedures: on the one hand the calculation of the gain factor or volume parameter and of the coefficients or filter parameters of the underlying vocal tract model filter, and on the other hand the voiced-unvoiced decision and, in the voiced case, the determination of the pitch period.
  • The filter coefficients are obtained in a parameter calculator 4 by solving the system of equations that results when the energy of the prediction error, i.e. the energy of the difference between the actual samples and the samples estimated on the basis of the model assumption in the interval under consideration (speech section), is minimized as a function of the coefficients.
  • The system of equations is preferably solved by the autocorrelation method using Durbin's algorithm (see, for example, L.R. Rabiner and R.W. Schafer, "Digital Processing of Speech Signals", Prentice-Hall Inc., Englewood Cliffs, NJ, 1978, pages 411-413).
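Durbin's recursion referred to above can be sketched as follows. This is an illustrative textbook implementation (sign conventions for the reflection coefficients vary between authors), not the patent's own code:

```python
def levinson_durbin(r, order):
    """Durbin's recursion: solves the LPC normal equations given the
    autocorrelation sequence r[0..order].  Returns the prediction-error
    filter coefficients a (with a[0] = 1), the reflection coefficients k,
    and the residual prediction-error energy E."""
    a = [0.0] * (order + 1)
    a[0] = 1.0
    k = []
    E = r[0]                      # zero-lag autocorrelation = signal energy
    for i in range(1, order + 1):
        # correlation of the current prediction error with the signal
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / E
        k.append(ki)
        a_prev = a[:]
        for j in range(1, i):     # symmetric coefficient update
            a[j] = a_prev[j] + ki * a_prev[i - j]
        a[i] = ki
        E *= 1.0 - ki * ki        # remaining error energy shrinks each step
    return a, k, E

# Toy autocorrelation sequence of an AR(1)-like signal:
a, k, E = levinson_durbin([1.0, 0.5, 0.25], 2)
# k ≈ [-0.5, 0.0]; as noted above, the reflection coefficients
# stay below 1 in magnitude.
```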
  • The algorithm also yields the so-called reflection coefficients (k_j), which are transforms of the filter coefficients (a_j) that are less sensitive to quantization.
  • The reflection coefficients are always smaller than 1 in magnitude and, in addition, their magnitude decreases with increasing order. Because of these advantages, the reflection coefficients (k_j) are preferably transmitted instead of the filter coefficients (a_j).
  • The volume parameter G results from the algorithm as a by-product.
  • The digital speech signal s_n is first stored in a buffer 5 until the filter parameters (a_j) have been calculated. The signal then passes through an inverse filter 6 set with the parameters (a_j), whose transfer function is the inverse of that of the vocal tract model filter.
  • The result of this inverse filtering is a prediction error signal e_n, which resembles the excitation signal x_n multiplied by the gain factor G.
  • This prediction error signal e_n is supplied, directly in the case of telephone speech or via a low-pass filter 7 in the case of broadband speech, to an autocorrelation stage 8, which forms the autocorrelation function (AKF) normalized to the zero-order autocorrelation maximum. On the basis of this function, the pitch period p is determined in a pitch extraction stage 9, specifically, in a known manner, as the distance between the second autocorrelation maximum R_xx and the first (zero-order) maximum, an adaptive search method preferably being used.
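A minimal sketch of this pitch determination step, using a synthetic pulse-train error signal and a hypothetical helper `pitch_period`; the adaptive search refinements mentioned above are omitted, and the search is simply restricted to a plausible pitch range:

```python
import numpy as np

def pitch_period(e, fs=8000, f_lo=60.0, f_hi=400.0):
    """Pitch period estimate from the normalized autocorrelation of the
    prediction-error signal e: the lag of the largest maximum within a
    plausible pitch range (sketch of the approach described above)."""
    e = np.asarray(e, dtype=float)
    r = np.correlate(e, e, mode="full")[len(e) - 1:]  # non-negative lags
    r = r / r[0]                       # normalize to the zero-lag maximum
    lo, hi = int(fs / f_hi), int(fs / f_lo)
    lag = lo + int(np.argmax(r[lo:hi]))
    return lag, r[lag]

# Synthetic error signal: a 100 Hz pulse train sampled at 8 kHz
fs = 8000
e = np.zeros(400)
e[::80] = 1.0                          # pulses every 80 samples
lag, peak = pitch_period(e, fs)
print(lag)  # 80
```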
  • The speech section under consideration is classified as voiced or unvoiced in a decision stage 11 according to certain criteria, which include the energy of the speech signal and the number of zero crossings in the section under consideration. These two values are determined in an energy determination stage 12 and a zero-crossing determination stage 13.
  • The parameter calculator described above determines a set of filter parameters for each speech section (frame).
  • The filter parameters could also be determined differently, for example continuously by means of adaptive inverse filtering or another known method, with the filter parameters being readjusted at each sampling cycle but made available for further processing or transmission only at the times determined by the frame rate.
  • The invention is in no way restricted in this regard; it is only essential that a set of filter parameters exists for each speech section.
  • The parameters (k_j), G and p obtained by the method just described are then fed to a coding stage 14 where, in a manner to be described in more detail below, they are brought into a particularly bit-efficient form suitable for transmission (formatted) and made available.
  • The speech signal is recovered or synthesized from the parameters in a known manner: the parameters, first decoded in a decoder 15, control a pulse-noise generator 16, an amplifier 17 and a vocal tract model filter 18; the output signal of the model filter 18 is brought into analog form by a D/A converter 19 and then, after the usual filtering 20, made audible by a playback device, e.g. a loudspeaker 21.
  • The volume parameter G controls the gain of the amplifier 17; the filter parameters (k_j) define the transfer function of the sound-formation or vocal tract model filter 18.
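The synthesis stage just described (pulse-noise generator, amplifier, all-pole vocal tract model filter) can be sketched as follows. This assumes direct-form coefficients a_j rather than the reflection coefficients actually transmitted (a decoder would convert k_j to a_j first), and all names are hypothetical:

```python
import numpy as np

def synthesize_frame(a, G, pitch, voiced, n=160,
                     rng=np.random.default_rng(0)):
    """One frame of LPC synthesis (sketch): excitation from a pulse-noise
    generator, scaled by the gain G, filtered by the all-pole model
    1/A(z) with A(z) = 1 + a[0] z^-1 + ... + a[p-1] z^-p."""
    if voiced:
        x = np.zeros(n)
        x[::pitch] = 1.0            # pulse train at the pitch period
    else:
        x = rng.standard_normal(n)  # white noise for unvoiced sections
    y = np.zeros(n)
    p = len(a)
    for i in range(n):
        acc = G * x[i]
        for j in range(1, p + 1):   # feedback through past output samples
            if i - j >= 0:
                acc -= a[j - 1] * y[i - j]
        y[i] = acc
    return y

# A voiced frame with a trivial (all-zero) filter passes the excitation through:
y = synthesize_frame([0.0], 1.0, 80, True, n=10)
```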
  • An example of such a system is shown in Fig. 2 as a block diagram.
  • The multi-processor system shown essentially comprises four functional blocks, namely a main processor 50, two secondary processors 60 and 70 and an input/output unit 80. It implements both analysis and synthesis.
  • The input/output unit 80 contains the stages designated 81 for analog signal processing, such as amplifiers, filters and automatic gain control, as well as the A/D converter and the D/A converter.
  • The main processor 50 carries out the actual speech analysis or synthesis: on the analysis side the determination of the filter parameters and the volume parameter (parameter calculator 4), the determination of the energy and zero crossings of the speech signal (stages 12 and 13), the voiced-unvoiced decision (stage 11) and the determination of the pitch period (stage 9); on the synthesis side the generation of the excitation signal (stage 16), its volume variation (stage 17) and its filtering in the speech model filter (filter 18).
  • The main processor 50 is supported by the secondary processor 60, which carries out the intermediate storage (buffer 5), the inverse filtering (stage 6), optionally the low-pass filtering (stage 7) and the autocorrelation (stage 8).
  • The secondary processor 70 deals exclusively with the coding and decoding of the speech parameters and with the data traffic with, e.g., a modem 90 or the like via an interface designated 71.
  • The data rate in an LPC vocoder system is determined by the so-called frame rate, i.e. the number of speech sections per second, the number of speech parameters used and the number of bits required to encode these parameters.
  • The basic principle of the invention rests on the consideration that if the speech signal is analyzed more often, i.e. the frame rate is increased, the transients of the speech signal can be tracked better. In stationary speech sections a greater correlation between the parameters of successive speech sections is thereby obtained, which in turn permits a more efficient, i.e. bit-saving, coding, so that the overall data rate does not increase despite the increased frame rate while the voice quality is significantly improved.
  • This special coding of the speech parameters according to the invention is explained in more detail below.
  • The basic idea of the parameter coding according to the invention is the so-called block coding principle: the speech parameters are not coded independently for each individual speech section; instead, two or three speech sections are combined to form a block and the parameters of all two or three speech sections within this block are coded according to uniform rules, in such a way that only the parameters of the first section are coded in full form, while the parameters of the other speech section(s) are coded in differential form or, where appropriate, omitted or substituted entirely.
  • The coding within the block is also carried out differently, taking into account the typical properties of human speech, depending on whether the block is voiced or unvoiced, the first speech section in each case determining the voiced character of the block.
  • Complete coding is understood to mean the usual coding of the parameters: for example 6 bits for the pitch parameter, 5 bits for the volume parameter and, for a ten-pole filter, 5 bits each for the first four filter coefficients, 4 bits each for the next four, and 3 and 2 bits respectively for the last two.
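Summing this example bit allocation confirms the "slightly more than 50 bits" figure mentioned earlier. This is a sketch using only the text's example values; a real frame format may add further bits, e.g. for synchronization:

```python
# Example per-frame bit allocation for full (non-differential) coding,
# taken from the figures in the text (ten-pole filter):
bits = {
    "pitch": 6,
    "volume": 5,
    "k1-k4": 4 * 5,   # first four reflection coefficients, 5 bits each
    "k5-k8": 4 * 4,   # next four, 4 bits each
    "k9": 3,
    "k10": 2,
}
total = sum(bits.values())
print(total)  # 52
```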
  • The decreasing number of bits for the higher filter coefficients is explained by the fact that the reflection coefficients usually employed decrease in magnitude with increasing order and essentially determine only the fine structure of the short-term speech spectrum.
  • The coding according to the invention differs for the individual parameter types (filter coefficients, volume, pitch). It is explained below using the example of blocks consisting of three speech sections each.
  • The filter parameters of the first section are coded in full form; those of the second and third sections, however, in differential form, i.e. only as their difference from the corresponding parameters of the first or, where appropriate, of the second section.
  • The difference of a 5-bit parameter is represented, e.g., by a 4-bit word, and so on.
  • The last, only 2-bit parameter could also be coded in this way, but with only 2 bits this would make little sense.
  • The last filter parameter of the second and third sections is therefore either replaced by that of the first section or set to zero; in both cases its transmission is saved.
  • Alternatively, the filter coefficients of the second speech section can be adopted directly from those of the first section and therefore need not be coded or transmitted at all.
  • The bits freed in this way can be used to encode the difference between the filter parameters of the third section and those of the first section with greater resolution.
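The voiced-block scheme for a single filter parameter can be sketched as follows, with hypothetical helper names and the example bit widths from the text (5-bit full value, 4-bit clamped difference):

```python
def encode_block(frames, diff_bits=4):
    """Block coding sketch for one filter parameter across a voiced block:
    the first section's value in full form, the remaining sections as
    differences to the first, clamped to the representable diff range."""
    lo, hi = -(1 << (diff_bits - 1)), (1 << (diff_bits - 1)) - 1
    first = frames[0]
    diffs = [max(lo, min(hi, f - first)) for f in frames[1:]]
    return first, diffs

def decode_block(first, diffs):
    """Inverse of encode_block: reconstruct all sections of the block."""
    return [first] + [first + d for d in diffs]

# Quantized values of one parameter in three successive speech sections:
first, diffs = encode_block([17, 19, 16])
print(decode_block(first, diffs))  # [17, 19, 16]
# Bit cost: 5 + 4 + 4 = 13 bits instead of 3 * 5 = 15 for full coding.
```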
  • In the unvoiced case the coding is done differently. The filter parameters of the first section are again coded in full, i.e. in full form or full bit length; the filter parameters of the other two sections are not coded differentially but also in full form. For bit reduction, use is made of the fact that in the unvoiced case the higher filter coefficients contribute little to the sound image; accordingly the higher filter coefficients, e.g. from the seventh onward, are not coded or transmitted at all. On the synthesis side they are then interpreted as zero.
  • The coding of the volume parameter is performed largely the same in the voiced and unvoiced cases, or, in one variant, even completely the same.
  • The volume parameters of the first and third sections are each fully coded; that of the middle section is coded as its difference from that of the first section.
  • Alternatively, the volume parameter of the middle speech section can be assumed equal to that of the first section and then need not be coded or transmitted at all.
  • The synthesis-side decoder then automatically derives this parameter from that of the first speech section.
  • The pitch parameter is coded the same for voiced and unvoiced blocks, just like the filter coefficients in the voiced case: in full form for the first speech section (e.g. 7 bits) and differentially for the other two sections.
  • The differences are preferably represented with 3 bits.
  • A voiced/unvoiced change is indicated by a special code word, in that the difference from the pitch parameter of the first speech section, which in such a case exceeds the representable difference range anyway, is replaced by this code word.
  • The code word naturally has the same format as the pitch parameter differences.
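A sketch of this differential pitch coding with a reserved escape word. The 3-bit width is the text's example figure; reserving the most negative two's-complement value as the code word is an assumption for illustration:

```python
DIFF_BITS = 3
ESCAPE = -(1 << (DIFF_BITS - 1))   # -4: reserved as the special code word

def encode_pitch_diff(p_first, p):
    """Differential pitch coding sketch: differences in -3..+3 are sent
    as-is; anything outside the range (e.g. at a voiced/unvoiced change)
    is replaced by the reserved escape code word."""
    d = p - p_first
    if ESCAPE < d < -ESCAPE:       # representable range: -3 .. +3
        return d
    return ESCAPE                  # out of range -> code word

print(encode_pitch_diff(40, 42))   # 2
print(encode_pitch_diff(40, 0))    # -4  (escape: e.g. voicing change)
```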
  • As the running average, a running mean of the pitch parameters of a number of, e.g., 2 to 7 preceding speech sections is used.
  • On the synthesis side, the decoded pitch parameter is preferably compared with this running average and replaced by it when a predetermined maximum deviation, for example about ±30% to ±60%, is exceeded.
  • The "outlier" does not enter into further averaging.
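The synthesis-side smoothing just described might look as follows. This is a sketch: the window length and the ±50% threshold are example values chosen from within the ranges given above:

```python
from collections import deque

def smooth_pitch(decoded, window=5, max_dev=0.5):
    """Synthesis-side pitch smoothing sketch: compare each decoded pitch
    value with the running mean of recent values; replace outliers that
    deviate by more than max_dev (here ±50%) with the running mean, and
    keep outliers out of further averaging."""
    history = deque(maxlen=window)   # rolling window of accepted values
    out = []
    for p in decoded:
        if history:
            mean = sum(history) / len(history)
            if abs(p - mean) > max_dev * mean:
                out.append(round(mean))  # outlier replaced, not averaged in
                continue
        history.append(p)
        out.append(p)
    return out

# A transmission error doubles one pitch value; smoothing repairs it:
print(smooth_pitch([80, 82, 81, 160, 79]))  # [80, 82, 81, 81, 79]
```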
  • In the case of blocks comprising two speech sections each, the coding is basically the same as for blocks with three sections. All parameters of the first section are coded in full form.
  • In voiced blocks, the filter parameters of the second speech section are either coded in differential form or assumed equal to those of the first section and accordingly not coded at all.
  • In unvoiced blocks, the filter coefficients of the second speech section are also coded in full form, but the higher coefficients are omitted.
  • The pitch parameter of the second speech section is again coded the same in the voiced and in the unvoiced case, namely as its difference from the pitch parameter of the first section.
  • In the event of a voiced/unvoiced change, the code word is used again.
  • The volume parameter of the second speech section is coded as for blocks with three sections, i.e. in differential form or not at all.
  • The coding and decoding are preferably carried out in software on the computer system already present for the remaining speech processing.
  • The creation of a suitable program is within the skill of the average person skilled in the art.
  • The coding rules A1, A2 and A3 and B1, B2 and B3 contained in FIG. 3 are shown in more detail in FIG. 4 and each indicate the format (bit assignment) of the parameters to be coded.
  • The programs for decoding are of course analogous.


Claims (13)

1. A redundancy-reducing speech processing method according to the linear prediction method, in which, on the analysis side, the digital speech signal obtained by sampling the analog speech signal, band-limited where appropriate, is divided into sections and, for each speech section, the parameters of a speech model filter, a volume parameter and the pitch parameter (period of the vocal cord fundamental frequency) are determined and held ready for transmission in coded form or are transmitted, and in which, on the synthesis side, the filter parameters, the volume parameter and the pitch parameter are decoded and, by means of said parameters, a synthesis stage consisting essentially of an excitation generator and a speech model filter is controlled to regenerate the speech signal, characterized in that the coding of the parameters is carried out in blocks of two or three successive speech sections, the parameters of the first speech section being coded in full form and at least some of the parameters of the remaining sections being coded in differential form or omitted.
2. Method according to claim 1, characterized in that the coding of the parameters is carried out differently depending on whether the first speech section of a block of speech sections is voiced or unvoiced.
3. Method according to claim 2, characterized in that, in the case of blocks each comprising three speech sections and for a voiced first speech section, the filter and pitch parameters of the first section are coded in full form while the filter and pitch parameters of the two remaining sections are coded as their differences from the corresponding parameters of the first or of the second speech section, and in that, in the case of an unvoiced first speech section, the higher-order filter parameters are omitted and the remaining filter parameters of the three speech sections are coded in full form while the pitch parameters are coded in the same way as in the voiced case.
4. Method according to claim 2, characterized in that, in the case of blocks each comprising three speech sections and for a voiced first speech section, the filter parameters and pitch parameters of the first speech section are coded in full form, the filter parameters of the middle speech section are essentially not coded and the pitch parameter of that speech section is coded as its difference from the pitch parameter of the first speech section, while the filter and pitch parameters of the last speech section are coded as their differences from the corresponding parameters of the first speech section, and in that, in the case of an unvoiced first speech section, higher-order filter parameters are omitted and the remaining filter parameters of the three speech sections are coded in full form while the pitch parameters are coded as in the voiced case.
5. Method according to claim 2, characterized in that, in the case of blocks each comprising two speech sections and for a voiced first speech section, the filter and pitch parameters of the first speech section are coded in full form and the filter parameters of the second speech section are essentially not coded, or are coded as their differences from the corresponding parameters of the first section, while the pitch parameter of the second speech section is coded as its difference from the pitch parameter of the first speech section, and in that, in the case of an unvoiced first speech section, the higher-order filter parameters are omitted and the remaining filter parameters of the two speech sections are coded in full form while the pitch parameter is coded in the same way as in the voiced case.
6. Method according to either of claims 3 and 4, characterized in that, in the case of a voiced first speech section, the volume parameters of the first and last speech sections are coded in full form while that of the middle speech section is essentially not coded, or is coded as its difference from the volume parameter of the first speech section, and in that, in the case of an unvoiced first speech section, the volume parameters of the first and last speech sections are coded in full form while that of the middle section is coded as its difference from the volume parameter of the first speech section.
7. Method according to claim 5, characterized in that, in the case of a voiced first speech section, the volume parameter of the first speech section is coded in full form and that of the second speech section is essentially not coded, or is coded as its difference from the volume parameter of the first speech section, and in that, in the case of an unvoiced first speech section, the volume parameter of the first speech section is coded in full form while that of the second speech section is coded as its difference from the volume parameter of the first speech section.
8. Method according to one of claims 3 to 7, characterized in that, in the event of a change from voiced to unvoiced, or vice versa, within a block of speech sections, the pitch parameter of the speech section concerned is replaced by a special code word.
9. Method according to claim 8, characterized in that, on the synthesis side, when the code word occurs and the preceding speech section was unvoiced, a mean value obtained from the pitch parameters of a number of previously produced speech sections is used as the corresponding pitch parameter.
10. Method according to one of the preceding claims, characterized in that, on the synthesis side, the decoded pitch parameter is compared with a mean value obtained from the pitch parameters of a number of preceding speech sections and, when a predetermined maximum deviation is exceeded, is replaced by the current mean value.
11. Method according to one of the preceding claims, characterized in that the length of the individual speech sections, for each of which the speech parameters are obtained, amounts to at most about 30 ms, preferably about 20 ms.
12. Method according to one of the preceding claims, characterized in that the number of speech sections per second amounts to at least about 55, preferably at least 60.
13. Device for carrying out the method according to one of the preceding claims, comprising a signal preparation part which samples the analog speech signal in a clocked manner and converts the sample values thus obtained into digital signals, an analysis part which analyzes the digitally converted speech signal section by section and comprises a parameter calculator, a voicing discrimination stage and a pitch calculation stage, and further a coding stage which codes the speech parameters obtained by the analysis part, characterized in that the analysis part is a multi-processor system comprising a main processor (50) and two auxiliary processors (60, 70), one auxiliary processor (60) buffering the speech signal, producing the prediction error signal from the buffered speech signal by inverse filtering and forming from it, where appropriate after low-pass filtering, the normalized autocorrelation function, the main processor (50) carrying out the actual analysis of the speech signal, and the other auxiliary processor (70) handling the coding of the speech parameters obtained by the main processor in conjunction with the first auxiliary processor.
EP82810391A 1981-09-24 1982-09-20 Procédé et dispositif pour traitement digital de la parole réduisant la redondance Expired EP0076234B1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AT82810391T ATE15415T1 (de) 1981-09-24 1982-09-20 Verfahren und vorrichtung zur redundanzvermindernden digitalen sprachverarbeitung.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CH616881 1981-09-24
CH6168/81 1981-09-24

Publications (2)

Publication Number Publication Date
EP0076234A1 EP0076234A1 (fr) 1983-04-06
EP0076234B1 true EP0076234B1 (fr) 1985-09-04

Family

ID=4305342

Family Applications (1)

Application Number Title Priority Date Filing Date
EP82810391A Expired EP0076234B1 (fr) 1981-09-24 1982-09-20 Procédé et dispositif pour traitement digital de la parole réduisant la redondance

Country Status (6)

Country Link
US (1) US4618982A (fr)
EP (1) EP0076234B1 (fr)
JP (1) JPS5870300A (fr)
AT (1) ATE15415T1 (fr)
CA (1) CA1184656A (fr)
DE (1) DE3266042D1 (fr)

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA1229681A (fr) * 1984-03-06 1987-11-24 Kazunori Ozawa Methode et appareil de codage de signaux dans la bande de frequences vocales
CA1255802A (fr) * 1984-07-05 1989-06-13 Kazunori Ozawa Codage et decodage de signaux a faible debit binaire utilisant un nombre restreint d'impulsions d'excitation
CA1252568A (fr) * 1984-12-24 1989-04-11 Kazunori Ozawa Codeur et decodeur de signaux a faible debit binaire pouvant reduire la vitesse de transmission de l'information
US4912764A (en) * 1985-08-28 1990-03-27 American Telephone And Telegraph Company, At&T Bell Laboratories Digital speech coder with different excitation types
US4890328A (en) * 1985-08-28 1989-12-26 American Telephone And Telegraph Company Voice synthesis utilizing multi-level filter excitation
EP0245531A1 (fr) * 1986-05-14 1987-11-19 Deutsche ITT Industries GmbH Application d'une mémoire morte semi-conductrice
EP0360265B1 (fr) * 1988-09-21 1994-01-26 Nec Corporation Système de transmission capable de modifier la qualité de la parole par classement des signaux de paroles
US4972474A (en) * 1989-05-01 1990-11-20 Cylink Corporation Integer encryptor
JPH03136100A (ja) * 1989-10-20 1991-06-10 Canon Inc 音声処理方法及び装置
US6006174A (en) 1990-10-03 1999-12-21 Interdigital Technology Coporation Multiple impulse excitation speech encoder and decoder
JP2810252B2 (ja) * 1991-05-22 1998-10-15 シャープ株式会社 音声再生装置
US5317567A (en) * 1991-09-12 1994-05-31 The United States Of America As Represented By The Secretary Of The Air Force Multi-speaker conferencing over narrowband channels
US5272698A (en) * 1991-09-12 1993-12-21 The United States Of America As Represented By The Secretary Of The Air Force Multi-speaker conferencing over narrowband channels
FI95086C (fi) * 1992-11-26 1995-12-11 Nokia Mobile Phones Ltd Menetelmä puhesignaalin tehokkaaksi koodaamiseksi
US5517511A (en) * 1992-11-30 1996-05-14 Digital Voice Systems, Inc. Digital transmission of acoustic signals over a noisy communication channel
FI96248C (fi) * 1993-05-06 1996-05-27 Nokia Mobile Phones Ltd Method for implementing a long-term synthesis filter and synthesis filter for speech coders
US5457685A (en) * 1993-11-05 1995-10-10 The United States Of America As Represented By The Secretary Of The Air Force Multi-speaker conferencing over narrowband channels
PL174216B1 (pl) * 1993-11-30 1998-06-30 At And T Corp Method for real-time reduction of noise in speech transmission
US5715365A (en) * 1994-04-04 1998-02-03 Digital Voice Systems, Inc. Estimation of excitation parameters
AU696092B2 (en) * 1995-01-12 1998-09-03 Digital Voice Systems, Inc. Estimation of excitation parameters
US5754974A (en) * 1995-02-22 1998-05-19 Digital Voice Systems, Inc. Spectral magnitude representation for multi-band excitation speech coders
US5701390A (en) * 1995-02-22 1997-12-23 Digital Voice Systems, Inc. Synthesis of MBE-based coded speech using regenerated phase information
US6240384B1 (en) * 1995-12-04 2001-05-29 Kabushiki Kaisha Toshiba Speech synthesis method
SE506034C2 (sv) * 1996-02-01 1997-11-03 Ericsson Telefon Ab L M Method and apparatus for improving parameters representing noisy speech
US6131084A (en) * 1997-03-14 2000-10-10 Digital Voice Systems, Inc. Dual subframe quantization of spectral magnitudes
US6161089A (en) * 1997-03-14 2000-12-12 Digital Voice Systems, Inc. Multi-subframe quantization of spectral parameters
US6199037B1 (en) 1997-12-04 2001-03-06 Digital Voice Systems, Inc. Joint quantization of speech subframe voicing metrics and fundamental frequencies
US6377916B1 (en) 1999-11-29 2002-04-23 Digital Voice Systems, Inc. Multiband harmonic transform coder
US7080009B2 (en) * 2000-05-01 2006-07-18 Motorola, Inc. Method and apparatus for reducing rate determination errors and their artifacts
DE102004001293A1 (de) * 2004-01-07 2005-08-11 Deutsche Thomson-Brandt Gmbh Apparatus and method for data transmission with a reduced amount of data

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3017456A (en) * 1958-03-24 1962-01-16 Technicolor Corp Bandwidth reduction system for television signals
DE1162398B (de) * 1961-10-24 1964-02-06 Ibm Compressor for data consisting of bits of different significance
US3236947A (en) * 1961-12-21 1966-02-22 Ibm Word code generator
US3439753A (en) * 1966-04-19 1969-04-22 Bell Telephone Labor Inc Reduced bandwidth pulse modulation scheme using dual mode encoding in selected sub-block sampling periods
US4053712A (en) * 1976-08-24 1977-10-11 The United States Of America As Represented By The Secretary Of The Army Adaptive digital coder and decoder
CA1123955A (fr) * 1978-03-30 1982-05-18 Tetsu Taguchi Speech analysis and synthesis apparatus
US4335277A (en) * 1979-05-07 1982-06-15 Texas Instruments Incorporated Control interface system for use with a memory device executing variable length instructions

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
IEEE TRANSACTIONS ON COMMUNICATIONS, Vol. COM-23, No. 12, December 1975, pages 1466-1474, New York, USA, C.K. UN et al.: "The residual-excited linear prediction vocoder with transmission rate below 9.6 kbits/s" *

Also Published As

Publication number Publication date
DE3266042D1 (en) 1985-10-10
EP0076234A1 (fr) 1983-04-06
US4618982A (en) 1986-10-21
CA1184656A (fr) 1985-03-26
JPS5870300A (ja) 1983-04-26
ATE15415T1 (de) 1985-09-15

Similar Documents

Publication Publication Date Title
EP0076234B1 (fr) Method and device for redundancy-reducing digital speech processing
DE3244476C2 (fr)
DE69915830T2 (de) Improved methods for recovering lost data frames for an LPC-based parametric speech coding system
EP0076233B1 (fr) Method and device for redundancy-reducing digital speech processing
DE60219351T2 (de) Signal modification method for efficient coding of speech signals
DE68912692T2 (de) Transmission system suitable for modifying speech quality by classifying the speech signals
DE60123651T2 (de) Method and apparatus for robust speech classification
DE60209861T2 (de) Adaptive post-filtering for speech decoding
DE69910240T2 (de) Apparatus and method for recovering the high-frequency portion of an oversampled synthesized wideband signal
EP1979901B1 (fr) Method and devices for coding audio signals
DE60011051T2 (de) CELP transcoding
DE69816810T2 (de) Systems and methods for audio coding
DE60006271T2 (de) Variable-bit-rate CELP speech coding using phonetic classification
DE102008042579B4 (de) Method for error concealment in the case of faulty transmission of speech data
DE69916321T2 (de) Coding of an enhancement feature for improving performance in the coding of communication signals
DE60308567T2 (de) Decoding device, coding device, decoding method and coding method
DE69013738T2 (de) Speech coding device
DE60133757T2 (de) Method and device for coding unvoiced speech
DE4237563A1 (fr)
DE60034429T2 (de) Method and device for determining speech coding parameters
DE60309651T2 (de) Method for speech coding using generalized analysis by synthesis, and speech coder for carrying out the method
DE68917584T2 (de) Coding device suitable for improving speech quality, using a dual pulse-generation system
DE69629485T2 (de) Compression system for repetitive sounds
DE60112407T2 (de) Method and device for converting an audio signal between different data compression formats
DE3884839T2 (de) Coding of acoustic waveforms

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19820922

AK Designated contracting states

Designated state(s): AT CH DE FR GB IT LI NL SE

ITF It: translation for a ep patent filed

Owner name: SOCIETA' ITALIANA BREVETTI S.P.A.

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Designated state(s): AT CH DE FR GB IT LI NL SE

REF Corresponds to:

Ref document number: 15415

Country of ref document: AT

Date of ref document: 19850915

Kind code of ref document: T

REF Corresponds to:

Ref document number: 3266042

Country of ref document: DE

Date of ref document: 19851010

ET Fr: translation filed
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: AT

Payment date: 19860825

Year of fee payment: 5

26N No opposition filed
PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: NL

Payment date: 19870930

Year of fee payment: 6

REG Reference to a national code

Ref country code: CH

Ref legal event code: PUE

Owner name: OMNISEC AG

REG Reference to a national code

Ref country code: FR

Ref legal event code: TP

REG Reference to a national code

Ref country code: GB

Ref legal event code: 732

ITPR It: changes in ownership of a european patent

Owner name: CESSIONE;OMNISEC AG

NLS Nl: assignments of ep-patents

Owner name: OMNISEC AG OF REGENSDORF, SWITZERLAND.

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Effective date: 19880920

Ref country code: AT

Effective date: 19880920

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LI

Effective date: 19880930

Ref country code: CH

Effective date: 19880930

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Effective date: 19890401

NLV4 Nl: lapsed or annulled due to non-payment of the annual fee
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 19890531

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

GBPC Gb: european patent ceased through non-payment of renewal fee
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Effective date: 19890601

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: SE

Payment date: 19890921

Year of fee payment: 8

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Effective date: 19900921

EUG Se: european patent has lapsed

Ref document number: 82810391.1

Effective date: 19910527