EP1521242A1 - Speech coding method applying noise reduction by modifying the codebook gain - Google Patents
- Publication number
- EP1521242A1 (application EP03022249A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- signal
- time interval
- noise
- fixed gain
- gain
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02168—Noise filtering characterised by the method used for estimating noise the estimation exclusively taking place during speech pauses
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
Definitions
- The invention relates to a speech coding method applying noise reduction.
- Noise reduction methods have been developed in speech processing. Most of these methods are performed in the frequency domain. They commonly comprise three major components:
- The suppression rule modifies only the spectral amplitude, not the phase. It has been shown that there is no need to modify the phase in speech enhancement processing. This approximation is only valid for a signal-to-noise ratio (SNR) greater than 6 dB; however, this condition is assumed to be satisfied in the majority of noise reduction algorithms.
- A scheme for the treatment of a speech signal with noise reduction is depicted in Fig. 1.
- The speech component $s(p)$, where $p$ denotes a time interval, is superimposed with a noise component $n(p)$.
- This results in the total signal $y(p)$.
- The total signal $y(p)$ undergoes an FFT.
- The results are the Fourier components $Y(p, f_k)$, where $f_k$ denotes a quantized frequency.
- The noise reduction NR is applied, producing the modified Fourier components $\hat{S}(p, f_k)$. After an IFFT, this leads to a clean speech signal estimate $\hat{s}(p)$.
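The processing chain of Fig. 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the suppression rule of any particular system: a naive DFT stands in for the FFT, the per-bin noise power `noise_power` is assumed to be supplied from elsewhere, and the power-subtraction gain with a spectral floor is an illustrative choice. All function and parameter names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(N^2)); stands in for the FFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse DFT; returns the real part of the reconstruction."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_weighting(y, noise_power, floor=0.1):
    """Scale each |Y(p, f_k)| by a real gain and leave the phase untouched.

    noise_power[k] is an assumed, externally supplied noise power estimate
    for bin k. The power-subtraction gain is an illustrative choice only.
    """
    Y = dft(y)
    S_hat = []
    for k, Yk in enumerate(Y):
        power = abs(Yk) ** 2
        gain = floor
        if power > 0.0:
            gain = max(floor, math.sqrt(max(0.0, 1.0 - noise_power[k] / power)))
        # A real, non-negative gain changes the amplitude but not the phase.
        S_hat.append(gain * Yk)
    return idft(S_hat)
```

With a zero noise estimate the gain is 1 in every bin and the input is returned unchanged; with a very large noise estimate every bin is attenuated to the spectral floor.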
- A problem of any spectral weighting noise reduction method is its computational complexity, e.g. if the following steps have to be performed successively:
- In a method for transmitting speech data, said speech data are encoded using an analysis-through-synthesis method.
- A synthesised signal is produced to approximate the original signal.
- The synthesised signal is produced using at least a fixed codebook with a respective fixed gain and, optionally, an adaptive codebook with an adaptive gain. The codebook entries and the gains are chosen such that the synthesised signal resembles the original signal.
- Parameters describing these quantities will be transmitted from a sender to a receiver, e.g. from a near-end speaker to a far-end speaker or vice versa.
- The invention is based on the idea of modifying the fixed gain determined for the signal containing a noise component and a speech component. The objective of this modification is to obtain a useful estimate of the fixed gain of the speech component, i.e. of the clean signal.
- The modification is done using a modification factor, which is determined on the basis of an estimate of the signal-to-noise ratio.
- This signal-to-noise ratio is calculated successively, also using past values of this quantity. The noise component is thereby represented by its fixed gain.
- One advantage of this procedure is its low computational complexity, particularly if the speech enhancement through noise reduction is done independently of an encoding/decoding unit, e.g. at a certain position within a network. There, according to a noise reduction method in the time domain, all the steps of decoding, FFT, speech enhancement, IFFT and encoding would have to be performed one after the other. This is not necessary for a noise reduction method based on modification of parameters.
- Another advantage is that, because the modification takes place in the parameters themselves, a repeated encoding and decoding process, the so-called "tandeming", can be avoided. Any tandeming decreases the speech quality. Furthermore, the delay due to the additional encoding/decoding, which in GSM is typically 5 ms, can be avoided.
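A minimal sketch of such a gain modification is given below. The text only states that the factor depends on a signal-to-noise ratio estimate that also uses its own past and that the noise is represented by its fixed gain. The concrete choices here are assumptions made for illustration: a decision-directed style recursion for the SNR, a Wiener-like factor, and the constants `alpha` and `xi_min`. All names are hypothetical.

```python
def modify_fixed_gains(g_y, g_n_hat, alpha=0.98, xi_min=0.01):
    """Return clean-speech fixed gain estimates, one per subframe m.

    g_y     : fixed codebook gains of the noisy (total) signal
    g_n_hat : estimates of the fixed gain of the noise component
    alpha   : weight given to the past SNR term (assumed value)
    xi_min  : lower bound on the SNR estimate (assumed value)
    """
    g_s_hat = []
    prev_gs = 0.0   # previous clean-speech gain estimate
    prev_gn = None  # previous noise gain estimate
    for gy, gn in zip(g_y, g_n_hat):
        gn = max(gn, 1e-12)                  # guard against division by zero
        post_snr = (gy * gy) / (gn * gn)     # instantaneous SNR from the gains
        inst = max(post_snr - 1.0, 0.0)
        if prev_gn is None:
            xi = inst                        # no past available yet
        else:
            # Recursive estimate: mix the past SNR with the current one.
            xi = alpha * (prev_gs ** 2) / (prev_gn ** 2) + (1.0 - alpha) * inst
        xi = max(xi, xi_min)
        gamma = xi / (1.0 + xi)              # modification factor in (0, 1)
        gs = gamma * gy
        g_s_hat.append(gs)
        prev_gs, prev_gn = gs, gn
    return g_s_hat
```

Because the factor stays below 1, the modified gain never exceeds the gain of the noisy signal; at high SNR it is left almost unchanged, and in noise-only subframes it is strongly attenuated.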
- The procedure is furthermore applicable within a communications network.
- An encoding apparatus set up for performing the above-described encoding method includes at least a processing unit.
- The encoding apparatus may be part of a communications device, e.g. a cellular phone, or it may be situated in a communications network or a component thereof.
- The AMR codec consists of a multi-rate speech codec (it can switch between the bit rates 12.2, 10.2, 7.95, 7.40, 6.70, 5.90, 5.15 and 4.75 kbit/s), a source-controlled rate scheme including Voice Activity Detection (VAD), a comfort noise generation system and an error concealment mechanism to compensate for the effects of transmission errors.
- Fig. 2 shows the scheme of the AMR encoder. It uses an LTP (long-term prediction) filter, which is transformed into an equivalent structure called the adaptive codebook. This codebook stores former LPC-filtered excitation signals. Instead of subtracting a long-term prediction as the LTP filter does, an adaptive codebook search is done to obtain an excitation vector from former LPC-filtered speech samples. The amplitude of this excitation is adjusted by a gain factor $g_a$.
- The encoder transforms the speech signal into parameters which describe the speech.
- These parameters, namely the LSF (or LPC) coefficients, the lag of the adaptive codebook, the index of the fixed codebook and the codebook gains, are referred to as "speech coding parameters".
- The domain will be called the "(speech) codec parameter domain", and the signals of this domain are subscripted with frame index $k$.
- Fig. 3 shows the signal flow of the decoder.
- The decoder receives the speech coding parameters and computes the excitation signal of the synthesis filter. This excitation signal is the sum of the excitations of the fixed and adaptive codebooks, each scaled with its respective gain factor. After the synthesis filtering is performed, the speech signal is post-processed.
- A (total) signal containing clean speech, or a speech component, and a noise component is encoded.
- A fixed gain $g_y(m)$ of the total signal is calculated.
- This fixed gain $g_y(m)$ of the total signal is subject to a gain modification which is based on a noise gain estimation.
- An estimate $\hat{g}_n(m)$ of the fixed gain of the noise is determined, which is used for the gain modification.
- The result of the gain modification is an estimate $\hat{g}_s(m)$ of the fixed gain of the clean speech or the speech component.
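The decoder step described here can be sketched as follows. This is a simplified illustration, not the AMR reference decoder: post-processing is omitted, the sign convention $A(z) = 1 + \sum_i a_i z^{-i}$ for the LPC polynomial is an assumption, and all function and parameter names are hypothetical.

```python
def excitation(adaptive_vec, fixed_vec, g_a, g_f):
    """Sum of the adaptive and fixed codebook vectors scaled by their gains."""
    return [g_a * v + g_f * c for v, c in zip(adaptive_vec, fixed_vec)]


def synthesis_filter(u, lpc, state=None):
    """All-pole synthesis filter 1/A(z) with A(z) = 1 + sum_i lpc[i-1] z^-i.

    u     : excitation samples
    lpc   : coefficients a_1 .. a_M (assumed sign convention, see above)
    state : optional past outputs, most recent first
    """
    order = len(lpc)
    mem = list(state) if state is not None else [0.0] * order
    out = []
    for x in u:
        # s(n) = u(n) - sum_i a_i * s(n - i)
        s = x - sum(a * m for a, m in zip(lpc, mem))
        out.append(s)
        mem = ([s] + mem)[:order]  # shift the newest output into the memory
    return out
```

Since the fixed gain enters the excitation as a plain scale factor, a modified fixed gain received in place of the original one changes the decoded signal without any change to the decoder itself.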
- This parameter is transmitted from a sender to a receiver. At the receiver side it is decoded. This procedure will now be described in detail:
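As a rough illustration of the noise gain estimation step only, the sketch below tracks the fixed gain of the noise by recursive averaging during speech pauses. The voice activity flags, the smoothing constant `beta` and the bootstrap from the first subframe are assumptions made for this example and are not taken from the text; all names are hypothetical.

```python
def track_noise_gain(g_y, speech_active, beta=0.9, initial=None):
    """Return one noise fixed-gain estimate per subframe.

    g_y           : fixed codebook gains of the total signal
    speech_active : per-subframe flags from a voice activity detector (assumed)
    beta          : smoothing constant of the recursive average (assumed value)
    initial       : optional starting value for the estimate
    """
    estimates = []
    g_n = initial
    for gy, active in zip(g_y, speech_active):
        if g_n is None:
            g_n = gy  # bootstrap from the first subframe (assumption)
        elif not active:
            # Update only in speech pauses; hold the estimate during speech.
            g_n = beta * g_n + (1.0 - beta) * gy
        estimates.append(g_n)
    return estimates
```

The estimate is frozen while speech is active, so loud speech subframes do not inflate the noise gain that feeds the gain modification.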
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP03022249A EP1521242A1 (en) | 2003-10-01 | 2003-10-01 | Speech coding method applying noise reduction by modifying the codebook gain |
PCT/EP2004/051712 WO2005031708A1 (en) | 2003-10-01 | 2004-08-04 | Speech coding method applying noise reduction by modifying the codebook gain |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP03022249A EP1521242A1 (en) | 2003-10-01 | 2003-10-01 | Speech coding method applying noise reduction by modifying the codebook gain |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1521242A1 | 2005-04-06 |
Family
ID=34306816
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP03022249A Withdrawn EP1521242A1 (en) | 2003-10-01 | 2003-10-01 | Speech coding method applying noise reduction by modifying the codebook gain |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP1521242A1 (en) |
WO (1) | WO2005031708A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3701523B1 (en) * | 2017-10-27 | 2021-10-20 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Noise attenuation at a decoder |
CN114023352B (en) * | 2021-11-12 | 2022-12-16 | 华南理工大学 | Speech enhancement method and device based on energy spectrum depth modulation |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6026356A (en) * | 1997-07-03 | 2000-02-15 | Nortel Networks Corporation | Methods and devices for noise conditioning signals representative of audio information in compressed and digitized form |
WO2001002929A2 (en) * | 1999-07-02 | 2001-01-11 | Tellabs Operations, Inc. | Coded domain noise control |
US20020184010A1 (en) * | 2001-03-30 | 2002-12-05 | Anders Eriksson | Noise suppression |
EP1301018A1 (en) * | 2001-10-02 | 2003-04-09 | Alcatel | Method and apparatus for modifying a digital signal in a coded domain |
- 2003-10-01: EP application EP03022249A, published as EP1521242A1 (not active, withdrawn)
- 2004-08-04: WO application PCT/EP2004/051712, published as WO2005031708A1 (active, application filing)
Non-Patent Citations (2)
Title |
---|
CHANDRAN R ET AL: "COMPRESSED DOMAIN NOISE REDUCTION AND ECHO SUPPRESSION FOR NETWORK SPEECH ENHANCEMENT", PROCEEDINGS OF THE 43RD. IEEE MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS. MWSCAS 2000. LANSING, MI, NEW YORK, NY: IEEE, US, vol. 1 OF 3, 8 August 2000 (2000-08-08) - 11 August 2000 (2000-08-11), pages 10 - 13, XP002951730, ISBN: 0-7803-6476-7 * |
MARTIN R ET AL: "Optimized estimation of spectral parameters for the coding of noisy speech", IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, vol. 3, 5 June 2000 (2000-06-05), Istanbul, Turkey, pages 1479 - 1482, XP010507630 * |
Also Published As
Publication number | Publication date |
---|---|
WO2005031708A1 (en) | 2005-04-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1363273B1 (en) | | Speech communication system and method for handling lost frames |
CA2399706C (en) | | Background noise reduction in sinusoidal speech coding systems |
US6782360B1 (en) | | Gain quantization for a CELP speech coder |
US6931373B1 (en) | | Prototype waveform phase modeling for a frequency domain interpolative speech codec system |
US7379866B2 (en) | | Simple noise suppression model |
US6996523B1 (en) | | Prototype waveform magnitude quantization for a frequency domain interpolative speech codec system |
CN104021796B (en) | | Speech enhancement processing method and device |
JPH08328591A (en) | | Method for adapting the noise masking level in an analysis-by-synthesis speech coder using a short-term perceptual weighting filter |
EP1313091A2 (en) | | Methods for analysis, synthesis and quantization of speech |
EP0899718B1 (en) | | Nonlinear filter for noise suppression in linear prediction coding devices |
EP1301018A1 (en) | | Method and apparatus for modifying a digital signal in a coded domain |
EP2608200B1 (en) | | Estimation of speech energy based on code excited linear prediction (CELP) parameters extracted from a partially decoded CELP-encoded bit stream |
US10672411B2 (en) | | Method for adaptively encoding an audio signal in dependence on noise information for higher encoding accuracy |
EP1521242A1 (en) | | Speech coding method applying noise reduction by modifying the codebook gain |
JPH09508479A (en) | | Burst excited linear prediction |
EP1521243A1 (en) | | Speech coding method applying noise reduction by modifying the codebook gain |
EP1521241A1 (en) | | Transmission of speech coding parameters with echo cancellation |
WO2005031707A1 (en) | | Speech coding method applying echo cancellation by modifying the code |
EP1944761A1 (en) | | Disturbance reduction in digital signal processing |
EP0984433A2 (en) | | Noise suppression in a voice communication unit and method of operation |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
| | AK | Designated contracting states | Kind code of ref document: A1. Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR |
| | AX | Request for extension of the European patent | Extension state: AL LT LV MK |
| | AKX | Designation fees paid | |
| | REG | Reference to a national code | Ref country code: DE. Ref legal event code: 8566 |
| | STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
| | 18D | Application deemed to be withdrawn | Effective date: 20060414 |