WO1994007313A1 - Speech codec - Google Patents
Speech codec
- Publication number
- WO1994007313A1 (PCT/DE1993/000839)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- mode
- speech
- channel
- bit rate
- encoder
- Prior art date
Links
- 238000000034 method Methods 0.000 claims abstract description 22
- 230000005540 biological transmission Effects 0.000 claims abstract description 15
- 230000007704 transition Effects 0.000 claims description 2
- 230000007774 longterm Effects 0.000 description 11
- 239000013598 vector Substances 0.000 description 5
- 230000003044 adaptive effect Effects 0.000 description 4
- 238000004364 calculation method Methods 0.000 description 2
- 230000005284 excitation Effects 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 238000005311 autocorrelation function Methods 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 230000006866 deterioration Effects 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 238000010295 mobile communication Methods 0.000 description 1
- 238000001308 synthesis method Methods 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04B—TRANSMISSION
- H04B14/00—Transmission systems not characterised by the medium used for transmission
- H04B14/02—Transmission systems not characterised by the medium used for transmission characterised by the use of pulse modulation
- H04B14/04—Transmission systems not characterised by the medium used for transmission characterised by the use of pulse modulation using pulse code modulation
- H04B14/046—Systems or methods for reducing noise or bandwidth
Definitions
- the invention relates to a method for coding speech signals according to the preamble of claim 1.
- speech coding methods are known, for example from German Patent 38 34 871.
- a common feature of all speech coding methods is a prediction analysis of the input signal (linear prediction coder, LPC).
- the voice signal at the input of the encoder is subdivided into frames of a fixed duration, for example 20-30 ms.
- Each speech frame is subjected to a linear prediction analysis in the encoder, which removes linear dependencies in the speech signal.
- the linear prediction is carried out with the help of FIR filters (Finite Impulse Response).
- the coefficients of these filters are determined anew in every frame, i.e. these are adaptive filters.
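The per-frame linear prediction analysis described above can be illustrated with a minimal sketch. It assumes the common autocorrelation method with the Levinson-Durbin recursion, which the text does not name; the function names are hypothetical and the code is not taken from the patent.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the predictor s_hat(n) = sum_k a[k] * s(n - k).

    Returns (coefficients a[1..order], remaining prediction error energy).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: keep the coefficients found so far
        acc = r[m] - sum(a[k] * r[m - k] for k in range(1, m))
        k_m = acc / err  # reflection coefficient of stage m
        new_a = a[:]
        new_a[m] = k_m
        for k in range(1, m):
            new_a[k] = a[k] - k_m * a[m - k]
        a = new_a
        err *= 1.0 - k_m * k_m
    return a[1:], err


def lpc_residual(frame, coeffs):
    """FIR analysis filter: e(n) = s(n) - sum_k a[k] * s(n - k), zero initial state."""
    residual = []
    for n, sample in enumerate(frame):
        prediction = sum(c * frame[n - k - 1]
                         for k, c in enumerate(coeffs) if n - k - 1 >= 0)
        residual.append(sample - prediction)
    return residual
```

Because the coefficients are recomputed from the autocorrelation of each new frame, the analysis filter adapts to the signal frame by frame, which is the sense in which the text calls these filters adaptive.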
- CELP: Code-Excited Linear Prediction
- SELP: Stochastically Excited Linear Prediction
- the filter coefficients of the short-term predictor are determined once per speech frame, while the coefficients of the long-term predictor are typically determined four times per speech frame.
- the so-called residual: the error signal of the LPC analysis
- the residual is modeled by a Gaussian-distributed random sequence: a code book containing the random vectors is searched, and the vector that produces the smallest error in the synthesized speech signal is selected. Only the addresses of the selected vectors in the code book then need to be transmitted.
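A minimal sketch of such a code book search is given here. It is an illustration under stated assumptions, not the patent's implementation: the all-pole synthesis filter starts from a zero state, a gain is fitted per vector, and the perceptual weighting used in practical CELP coders is omitted. All function names are hypothetical.

```python
import random


def make_codebook(size, length, seed=0):
    """Build a code book of Gaussian-distributed random vectors."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(size)]


def synthesize(excitation, coeffs):
    """All-pole synthesis filter: y(n) = x(n) + sum_k a[k] * y(n - k)."""
    out = []
    for n, x in enumerate(excitation):
        feedback = sum(c * out[n - k - 1]
                       for k, c in enumerate(coeffs) if n - k - 1 >= 0)
        out.append(x + feedback)
    return out


def search_codebook(target, codebook, coeffs):
    """Return (index, gain, error energy) of the vector that best matches target."""
    best = None
    for index, vector in enumerate(codebook):
        synth = synthesize(vector, coeffs)
        energy = sum(v * v for v in synth)
        if energy == 0.0:
            continue  # an all-zero vector cannot match anything
        # Least-squares gain for this vector, then the remaining error energy.
        gain = sum(t * v for t, v in zip(target, synth)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, synth))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best
```

Only the winning index (the code book address) and a quantized gain would have to be sent, since the receiver holds the same code book.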
- Good voice quality is generally required for voice transmission, both with error-free and with disturbed channels.
- in digital voice transmission, a redundancy R is added to the bits from the speech encoder (channel coding) so that transmission errors can be corrected on the receiving side. Since the channel capacity is a predefined and unchangeable system parameter, transmission errors can no longer be corrected once channel disturbances reach a certain level, and the quality or intelligibility of the received speech signal suffers as a result.
- the object of the present invention is to specify a speech coding method of the type mentioned at the outset that increases the quality or intelligibility of speech transmitted over a channel, both for an interference-free and for a disturbed channel, i.e. that increases both the voice quality with error-free transmission and the robustness of the voice transmission system.
- the invention is based on the knowledge that a speech signal can be divided into three classes:
- this lower bit rate is used to encode the voiced speech sections. Statistically, this applies to about 45 to 50% of the total speech transmission. The speech quality of the codec does not deteriorate if the channel is free of errors, but the quality is significantly increased if the channel is disturbed.
- in the method according to the invention, it is additionally proposed to force mode 1 independently of the statistics of the speech input signal if the channel interference exceeds a certain level and the intelligibility would otherwise be greatly reduced.
- the method according to the invention significantly increases the robustness and thus the quality or intelligibility in the case of a disturbed or severely disturbed channel. If a measure of the level of the interference is available as a signal, the reception quality can be controlled in the transmitter using this. This is possible, for example, if a return channel is available, via which a corresponding signal is transmitted back from the receiver to the transmitter as a measure of the quality of the received signal.
- Figures 1 and 2 show block diagrams of speech and channel coders and decoders with variable bit rate.
- Figure 3 shows a radio transmission system with a return channel.
- Figure 4 shows the structure of a variable bit rate speech coder.
- FIG. 5 is a flow chart of a
- the channel coding on the transmitting and receiving sides is adapted to the bit rate of the speech codec.
- the mode in which the speech encoder works is signaled with a so-called mode bit. This mode bit must be reconstructed on the receiving side in the channel decoder.
- Figures 1 and 2 give an overview of the transmitting and receiving part.
- the bit rate of the encoder part is controlled by two blocks. One of these is the voiced/unvoiced decision maker SH/SL, which statistically evaluates the speech input signal s(n).
- the speech coder SE is informed whether a speech frame is voiced or unvoiced.
- the encoder is switched to mode 1, in which differential coding of the pitch analysis parameters is used.
- This differential coding of the pitch analysis parameters can also be enforced independently of the statistics of the input signal by appropriate setting of the parameters relevant for this in the block external control AS.
- the percentage of speech frames transmitted with mode 1, i.e. with differential coding, can be increased and an optimal balance between speech quality and robustness on the channel can be achieved.
- the method according to the invention uses two channel encoders KE0 and KE1, which encode the speech parameters generated by the speech encoder together with the mode bit, in mode 0 at the bit rate B0 and in mode 1 at the bit rate B1, where B0 is greater than B1.
- the receiver according to Figure 2 contains a module for mode determination, which switches the channel signal to be decoded to the channel decoder KD0 in mode 0 and to the channel decoder KD1 in mode 1.
- the output signals of the two channel decoders are decoded by the subsequent speech decoder SD into the output speech signal s (n).
- FIG. 3 shows a radio transmission system with a return channel, the modules according to FIGS. 1 and 2 being contained in simplified form.
- the reception quality is determined at the demodulator output of the receiver and transmitted to the transmitter.
- the received quality signals act directly on an external control AS, through which the speech encoder SE can be switched to the mode with differential coding. If the reception quality is poor, the percentage of delta-coded speech frames (mode 1) can be increased. Although the voice quality deteriorates slightly, the robustness against transmission errors and thus the quality at the receiver is improved. If the reception quality improves, the proportion of mode 1 speech frames is reduced to the normal proportion, and the speech quality is correspondingly better. It is thus possible to dynamically adapt the speech codec to the channel conditions in a simple manner.
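The feedback behaviour described above can be sketched as a simple mapping from a reported reception quality to a target share of mode 1 frames. Everything in this sketch is an assumption made for illustration: the quality scale from 0 to 1, the two limits, the linear interpolation and the function name are hypothetical. Only the normal share of roughly 45% and the idea of forcing mode 1 on a badly disturbed channel come from the text.

```python
def mode1_share(quality, normal_share=0.45, poor=0.3, good=0.7):
    """Map a reception-quality report in [0, 1] to a target share of mode 1 frames.

    quality >= good : only the statistically voiced frames use mode 1
    quality <= poor : mode 1 is forced for every frame
    in between      : the share rises linearly as the quality drops
    """
    if quality >= good:
        return normal_share
    if quality <= poor:
        return 1.0
    drop = (good - quality) / (good - poor)  # 0 at `good`, 1 at `poor`
    return normal_share + drop * (1.0 - normal_share)
```

In a transmitter, the external control AS would translate such a target share into the parameters that make the encoder choose differential coding more often; how that translation works is not specified here.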
- the blocks excitation analysis and LPC analysis are carried out as in known CELP methods (see reference 1).
- the long-term prediction parameters are determined using the also known closed-loop method (reference 2).
- the parameters of the LPC analysis are determined, for example, once per speech frame (for example 20 ms), and the long-term prediction analysis is carried out N_sub times per frame (for example every 5 ms).
- the speech section for which the long-term prediction parameters are determined is referred to as the speech subframe.
- the long-term predictor can be represented as an adaptive code book.
- the code book consists of 256 signals, for example
- N s L (n) n 0 ... -1
- the error energy between the prediction signal and the speech signal s (n) serves as a measure of the quality of the prediction
- only the function blocks that are relevant for the variable bit rate speech codec are described below.
- This decision maker is an "open loop" pitch analysis which is carried out in three steps:
- T_G is a threshold that is set in the "External control" module or, when a return channel is used, dynamically.
- P optimal delta pitch period, which was calculated on the condition that a delta coding for the pitch period of the last speech subframe is possible with a predetermined number of bits.
- ⁇ * optimal delta scaling factor for the adaptive codebook vector, which was calculated on the condition that a delta coding to the value of the scaling factor of the last speech subframe is possible with a predetermined number of bits
- E ⁇ error energy of the "closed loop" pitch predictor if the pitch period and the scaling factor are differentially coded with the corresponding values of the last speech subframe with a predetermined number of bits.
- P_SH: pitch period from the "voiced/unvoiced" decision maker. In the voiced case, this value defines the delta environment for the "closed loop" pitch in the first speech subframe.
- T_G: threshold by means of which differential coding can be forced. These parameters are either fixed or can be varied over time by evaluating the return channel information.
- differential coding is applied to the pitch period and to the scaling factor: the difference between the current and the last calculated parameters is coded and transmitted.
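A minimal sketch of differential coding with a fixed number of bits follows. It assumes an integer-valued parameter such as the pitch period and a symmetric signed range stored as an unsigned code; the bit count and the function names are hypothetical and not taken from the patent.

```python
def delta_range(bits):
    """Signed range of differences that fits into `bits` bits."""
    half = 1 << (bits - 1)
    return -half, half - 1


def encode_delta(current, previous, bits):
    """Return the unsigned code for current - previous, or None if it does not fit."""
    low, high = delta_range(bits)
    difference = current - previous
    if low <= difference <= high:
        return difference - low  # shift into 0 .. 2**bits - 1
    return None


def decode_delta(code, previous, bits):
    """Rebuild the current value from the code and the previously decoded value."""
    low, _ = delta_range(bits)
    return previous + code + low
```

For example, with 3 bits a pitch period of 52 following a period of 50 is sent as code 6, while a jump from 50 to 60 cannot be represented. In the differential mode the search is therefore restricted to values that can be reached from the previous subframe with the available bits.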
- the bit rate for transmitting the long-term prediction parameters is 2.4 kbit/s in mode 0 and 1.8 kbit/s in mode 1.
- no differential coding can be carried out for the first parameter of a speech subframe, which is why no bit rate can be saved at this point.
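The two bit rates for the long-term prediction parameters can be turned into bit counts per frame. The sketch below assumes the example timing mentioned in the description (20 ms frames, long-term prediction analysis every 5 ms); the constant and function names are hypothetical. The per-subframe figure is only an average, because the first parameter is not differentially coded and the text does not say how the bits are split.

```python
FRAME_MS = 20       # example frame length from the description
SUBFRAME_MS = 5     # example long-term prediction interval from the description
LTP_RATE_BPS = {0: 2400, 1: 1800}  # long-term prediction bit rate per mode


def bits_in(rate_bps, duration_ms):
    """Number of bits that a given bit rate delivers in a given duration."""
    return rate_bps * duration_ms // 1000


def ltp_budget(mode):
    """Return (bits per frame, average bits per subframe) for one mode."""
    per_frame = bits_in(LTP_RATE_BPS[mode], FRAME_MS)
    subframes = FRAME_MS // SUBFRAME_MS
    return per_frame, per_frame / subframes


# Bits per frame that mode 1 frees for additional channel-coding redundancy.
freed_bits_per_frame = ltp_budget(0)[0] - ltp_budget(1)[0]
```

Under these assumptions mode 0 spends 48 bits per frame on the long-term prediction parameters and mode 1 spends 36, so 12 bits per 20 ms frame (0.6 kbit/s) become available to the channel encoder.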
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention relates to a method for coding speech signals to be transmitted from a transmitter to a receiver over a channel of limited transmission capacity (BBR), using a speech encoder and a channel encoder, characterized in that the speech encoder and the channel encoder each have two different modes. In the first mode, the speech encoder codes the speech signal at a lower bit rate (B1) than in the second mode (B0). In the first mode, the bit rate difference (B0-B1) relative to the bit rate (B0) of the second mode is made available to the channel encoder. The channel encoder uses this additional bit rate to transmit further redundancy information. The method improves the quality of speech transmission and can be used, for example, in mobile radiotelephones.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU49434/93A AU4943493A (en) | 1992-09-24 | 1993-09-11 | Speech codec |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DEP4231918.8 | 1992-09-24 | ||
DE19924231918 DE4231918C1 (de) | 1992-09-24 | 1992-09-24 | Verfahren für die Codierung von Sprachsignalen |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1994007313A1 (fr) | 1994-03-31 |
Family
ID=6468675
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/DE1993/000839 WO1994007313A1 (fr) | 1992-09-24 | 1993-09-11 | Codec de signaux vocaux |
Country Status (3)
Country | Link |
---|---|
AU (1) | AU4943493A (fr) |
DE (1) | DE4231918C1 (fr) |
WO (1) | WO1994007313A1 (fr) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1996019880A1 (fr) * | 1994-12-19 | 1996-06-27 | Nokia Telecommunications Oy | Procede de transmission de donnees, systeme de transmission de donnees et systeme de radio cellulaire |
EP0803989A1 (fr) * | 1996-04-26 | 1997-10-29 | Deutsche Thomson-Brandt Gmbh | Procédé et appareil pour le codage d'un signal audio-nimérique |
WO1997041549A1 (fr) * | 1996-04-26 | 1997-11-06 | Telefonaktiebolaget Lm Ericsson | Procede de commande de mode de codage et appareil de determination de mode de decodage |
US6009399A (en) * | 1996-04-26 | 1999-12-28 | Deutsche Thomson-Brandt Gmbh | Method and apparatus for encoding digital signals employing bit allocation using combinations of different threshold models to achieve desired bit rates |
US6134220A (en) * | 1994-04-13 | 2000-10-17 | Alcatel Cit | Method of adapting the air interface in a mobile radio system and corresponding base transceiver station, mobile station and transmission mode |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6134521A (en) * | 1994-02-17 | 2000-10-17 | Motorola, Inc. | Method and apparatus for mitigating audio degradation in a communication system |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1989012292A1 (fr) * | 1988-06-08 | 1989-12-14 | Fujitsu Limited | Appareil codeur/decodeur |
US5060269A (en) * | 1989-05-18 | 1991-10-22 | General Electric Company | Hybrid switched multi-pulse/stochastic speech coding technique |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE3834871C1 (en) * | 1988-10-13 | 1989-12-14 | Ant Nachrichtentechnik Gmbh, 7150 Backnang, De | Method for encoding speech |
-
1992
- 1992-09-24 DE DE19924231918 patent/DE4231918C1/de not_active Expired - Fee Related
-
1993
- 1993-09-11 WO PCT/DE1993/000839 patent/WO1994007313A1/fr active Application Filing
- 1993-09-11 AU AU49434/93A patent/AU4943493A/en not_active Abandoned
Non-Patent Citations (1)
Title |
---|
TANIGUCHI: "Combined Source and Channel Coding Based on Multimode Coding", ICASSP 90, SPEECH PROCESSING 1, vol. 1, April 1990 (1990-04-01), NY,USA, pages 477 - 480, XP000146509 * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6456598B1 (en) | 1994-04-13 | 2002-09-24 | Alcatel Cit | Method of adapting the air interface in a mobile radio system and corresponding base transceiver station, mobile station and transmission mode |
US6134220A (en) * | 1994-04-13 | 2000-10-17 | Alcatel Cit | Method of adapting the air interface in a mobile radio system and corresponding base transceiver station, mobile station and transmission mode |
AU698404B2 (en) * | 1994-12-19 | 1998-10-29 | Nokia Telecommunications Oy | Data transmission method, data transmission system, and cellular radio system |
WO1996019880A1 (fr) * | 1994-12-19 | 1996-06-27 | Nokia Telecommunications Oy | Procede de transmission de donnees, systeme de transmission de donnees et systeme de radio cellulaire |
US6092222A (en) * | 1994-12-19 | 2000-07-18 | Nokia Telecommunications Oy | Data transmission method, data transmission system, and cellular radio system |
WO1997041662A1 (fr) * | 1996-04-26 | 1997-11-06 | Telefonaktiebolaget Lm Ericsson (Publ) | Procede et appareil de commande de mode de codage de source/voie |
US5982766A (en) * | 1996-04-26 | 1999-11-09 | Telefonaktiebolaget Lm Ericsson | Power control method and system in a TDMA radio communication system |
US6009399A (en) * | 1996-04-26 | 1999-12-28 | Deutsche Thomson-Brandt Gmbh | Method and apparatus for encoding digital signals employing bit allocation using combinations of different threshold models to achieve desired bit rates |
AU720308B2 (en) * | 1996-04-26 | 2000-05-25 | Telefonaktiebolaget Lm Ericsson (Publ) | Encoding mode control method and decoding mode determining apparatus |
WO1997041663A1 (fr) * | 1996-04-26 | 1997-11-06 | Telefonaktiebolaget Lm Ericsson (Publ) | Appareil et procede de commande adaptative de mode de codage dans un systeme de radiocommunication amrt |
WO1997041549A1 (fr) * | 1996-04-26 | 1997-11-06 | Telefonaktiebolaget Lm Ericsson | Procede de commande de mode de codage et appareil de determination de mode de decodage |
US6163577A (en) * | 1996-04-26 | 2000-12-19 | Telefonaktiebolaget Lm Ericsson (Publ) | Source/channel encoding mode control method and apparatus |
US6195337B1 (en) | 1996-04-26 | 2001-02-27 | Telefonaktiebolaget Lm Ericsson (Publ) | Encoding mode control method and decoding mode determining apparatus |
EP0803989A1 (fr) * | 1996-04-26 | 1997-10-29 | Deutsche Thomson-Brandt Gmbh | Procédé et appareil pour le codage d'un signal audio-nimérique |
MY119786A (en) * | 1996-04-26 | 2005-07-29 | Ericsson Telefon Ab L M | Power control method and system in a tdma radio communication system. |
Also Published As
Publication number | Publication date |
---|---|
AU4943493A (en) | 1994-04-12 |
DE4231918C1 (de) | 1993-12-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE69900786T2 (de) | Sprachkodierung | |
DE69029232T2 (de) | System und Methode zur Sprachkodierung | |
DE60219351T2 (de) | Signaländerungsverfahren zur effizienten kodierung von sprachsignalen | |
DE3856211T2 (de) | Verfahren zur adaptiven Filterung von Sprach- und Audiosignalen | |
DE69932460T2 (de) | Sprachkodierer/dekodierer | |
DE2945414C2 (de) | Sprachsignal-Voraussageprozessor und Verfahren zur Verarbeitung eines Sprachleistungssignals | |
DE69113866T2 (de) | Sprachdecoder. | |
DE19604273C5 (de) | Verfahren und Vorrichtung zum Durchführen einer Suche in einem Kodebuch im Hinblick auf das Kodieren eines Klangsignales, Zellkommunikationssystem, Zellnetzwerkelement und mobile Zell-Sender-/Empfänger-Einheit | |
DE602004007786T2 (de) | Verfahren und vorrichtung zur quantisierung des verstärkungsfaktors in einem breitbandsprachkodierer mit variabler bitrate | |
DE69132885T2 (de) | CELP-Kodierung niedriger Verzögerung und 32 kbit/s für ein Breitband-Sprachsignal | |
DE69902233T2 (de) | Sprachkodierung unter verwendung einer weichen adaptation | |
EP1025646A2 (fr) | Procede et dispositif de codage de signaux audio ainsi que procede et dispositif de decodage d'un train de bits | |
DE69524890T2 (de) | Parametrische Sprachkodierung | |
DE10296562T5 (de) | Rauschunterdrückung | |
DE69033510T2 (de) | Numerischer sprachkodierer mit verbesserter langzeitvorhersage durch subabtastauflösung | |
DE68913691T2 (de) | System zur Sprachcodierung und -decodierung. | |
DE69329568T2 (de) | Verfahren zur Sprachkodierung | |
DE60309651T2 (de) | Verfahren zur Sprachkodierung mittels verallgemeinerter Analyse durch Synthese und Sprachkodierer zur Durchführung dieses Verfahrens | |
DE69609089T2 (de) | Sprachkodierer mit aus aktuellen und vorhergehenden Rahmen extrahierten Merkmalen | |
EP1080464B1 (fr) | Procede et dispositif de codage de la parole | |
DE69827313T2 (de) | Verfahren zur Kodierung des Zufallskomponenten-Vektors in einem ACELP-Kodierer | |
DE69232166T2 (de) | Fehlerschutz für vielfachmodensprachkodierer | |
DE69324732T2 (de) | Selektive Anwendung von Sprachkodierungstechniken | |
DE4231918C1 (de) | Verfahren für die Codierung von Sprachsignalen | |
DE69922388T2 (de) | Linear-prädiktives Analyse-durch-Synthese-Kodierverfahren und Kodierer |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AK | Designated states | Kind code of ref document: A1. Designated state(s): AU CA FI JP US |
| AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| 122 | Ep: pct application non-entry in european phase | |
| NENP | Non-entry into the national phase | Ref country code: CA |