EP0685836B1 - Method and apparatus for preprocessing an acoustic signal before speech coding - Google Patents

Method and apparatus for preprocessing an acoustic signal before speech coding

Info

Publication number
EP0685836B1
EP0685836B1 EP95401261A EP95401261A EP0685836B1 EP 0685836 B1 EP0685836 B1 EP 0685836B1 EP 95401261 A EP95401261 A EP 95401261A EP 95401261 A EP95401261 A EP 95401261A EP 0685836 B1 EP0685836 B1 EP 0685836B1
Authority
EP
European Patent Office
Prior art keywords
signal
state
frame
energy
acoustic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP95401261A
Other languages
German (de)
French (fr)
Other versions
EP0685836A1 (en
Inventor
Sophie Scott
William Navarro
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nortel Networks France SAS
Original Assignee
Matra Nortel Communications SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matra Nortel Communications SAS
Publication of EP0685836A1
Application granted
Publication of EP0685836B1
Anticipated expiration
Expired - Lifetime (current legal status)

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06: Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

Definitions

  • The present invention relates to a method and a device for preprocessing the acoustic signal supplied to a speech coder. It applies in particular, but not exclusively, to improving the performance of low bit rate speech coders.
  • Low bit rate speech coders (typically 5 kbit/s for a sampling frequency of 8 kHz) give their best performance on signals with a "telephone" spectrum, that is to say in the 300-3400 Hz band and with a pre-emphasis of the high frequencies.
  • IRS Intermediate Reference System
  • This template has been defined for telephone handsets, both at the input (microphone) and at the output (earpiece).
  • The input signal of the speech coder has a "flatter" spectrum, for example when a hands-free installation is used with a microphone having a linear frequency response.
  • Conventional vocoders are designed to be independent of the input they operate with, and they are not informed of the characteristics of that input. If microphones with different characteristics may be connected to the vocoder, or more generally if the vocoder may receive acoustic signals with different spectral characteristics, there are cases in which the vocoder is used sub-optimally.
  • A main object of the present invention is to improve the performance of a vocoder by making it less dependent on the spectral characteristics of the signal supplied to it.
  • The method according to the invention consists in subjecting the input acoustic signal to high-pass filtering; comparing the energy of the high-pass filtered signal with that of the unfiltered signal in order to determine a state of the signal, chosen between a first state in which the energy of the high-pass filtered signal is greater than a predetermined fraction of the energy of the unfiltered signal and a second state in which it is less than that fraction; and, when the signal is in its second state, supplying the coder input with the high-pass filtered signal after pre-emphasis of its high frequencies.
  • The high-pass filter used is typically a filter with a sharp cutoff at 400 Hz, and the predetermined energy fraction is typically 85 to 95%.
  • the first signal state corresponds to IRS characteristics
  • the second state corresponds to a flatter spectrum of the input acoustic signal containing proportionally more energy at low frequencies.
  • A flat-spectrum signal is preprocessed (high-pass filtering and pre-emphasis) to bring its spectral characteristics closer to those of the IRS template.
  • Using high-pass rather than low-pass filtering to determine the state of the signal has the advantage that the filtered signal can itself be supplied (after pre-emphasis) to the vocoder input.
  • The determined state of the signal can be changed only when the input acoustic signal, or the high-pass filtered signal, has an energy greater than a predetermined threshold. Otherwise (for example during silence or low ambient noise), the energy of the signal is too low for its spectral characteristics to be assessed reliably.
  • When the acoustic signal is digitized in successive frames, it is detected whether the signal in each frame is in a first condition corresponding to the first state or in a second condition corresponding to the second state, and the state of the signal is determined from these frame-by-frame conditions, the determined state being changed only after several successive frames show a signal condition different from the one corresponding to the previously determined state.
  • This introduces a kind of hysteresis that accommodates variations of the spectral envelope of the speech signal, due to ambient noise or to the speech itself (the timbre of the voice is not constant). The risk of wrongly determining the state of the signal is thereby reduced, which improves the quality of the coded signal and avoids timbre discontinuities that untimely changes of the determined state could cause.
  • The preprocessing device includes a high-pass filter receiving the input acoustic signal, means for calculating the energies contained respectively in said acoustic signal and in the output signal of the high-pass filter, means for comparing the calculated energies, and a high-frequency pre-emphasis filter whose input receives the output signal of the high-pass filter and whose output delivers the signal supplied to the coder input when the comparison means reveal that the output signal of the high-pass filter contains less than a predetermined fraction of the energy of said acoustic signal.
  • The two solid lines correspond to the bounds of the IRS template defined for microphones in CCITT Recommendation P48.
  • An IRS-type microphone signal has a strong attenuation in the lower part of the spectrum (between 0 and 300 Hz) and a relative emphasis of the high frequencies.
  • A linear-type signal, supplied for example by the microphone of a hands-free installation, has a flatter spectrum, in particular without the strong attenuation at low frequencies (a typical example of such a linear-type signal is shown by a dashed line in the diagram of FIG. 1).
  • The coder 12 is a low bit rate coder optimized for an IRS-type input signal. It may be, among others, a linear prediction coder with excitation by regular pulse vectors (RP-CELP), as described in document EP-A-0 347 307. The coder 12 has no a priori knowledge of the source of the acoustic signal supplied to it.
  • RP-CELP regular pulse vectors
  • The input acoustic signal S_I is the output signal of a microphone 13, amplified and digitized by an analog-to-digital converter 14.
  • The signal is typically digitized at a sampling rate of 8 kHz and arranged in successive 30 ms frames, each containing 240 samples of 16 bits.
  • The preprocessing device 10 comprises a high-pass filter 16 that receives the input acoustic signal S_I and delivers a filtered signal S_I'.
  • The filter 16 is typically a digital bi-quad filter with a sharp cutoff at 400 Hz.
  • The energies E1 and E2 contained in each frame of the input acoustic signal S_I and of the filtered signal S_I', respectively, are calculated by two units 17, 18, each of which sums the squares of the samples of each frame it receives.
  • The calculated energies E1 and E2 are supplied to a comparison unit 20, which determines the state of the signal in the form of a bit Y equal to 0 when the signal is determined to be of IRS type (state Y_A) and to 1 when the signal is determined to be rather of linear type (state Y_B).
  • The output of the preprocessing device 10, connected to the input of the coder 12, is formed by one terminal of a switch 21 whose other terminal is connected either to the input of the high-pass filter 16 or to the output of a pre-emphasis filter 22, depending on the value of the bit Y delivered by the comparison unit 20.
  • H(z) = 1 - β/z
  • β denotes a pre-emphasis coefficient, typically of the order of 0.4.
  • the comparison unit 20 is for example in accordance with the diagram illustrated in FIG. 3.
  • The energy E1 of each frame of the input signal S_I is supplied to the input of a threshold comparator 25, which delivers a bit Z equal to 0 when the energy E1 is less than a predetermined energy threshold and equal to 1 when the energy E1 is greater than the threshold.
  • the energy threshold is typically of the order of -38 dB relative to the signal saturation energy.
  • the comparator 25 serves to inhibit the determination of the state of the signal when the latter contains too little energy to be representative of the characteristics of the source. In this case, the determined state of the signal remains unchanged.
  • the energies E1 and E2 are sent to a digital divider 26 which calculates the ratio E2 / E1 for each frame.
  • This E2 / E1 ratio is sent to another threshold comparator 27 which delivers a bit X of value 0 when the E2 / E1 ratio is greater than a predetermined threshold, and of value 1 when the E2 / E1 ratio is less than the threshold.
  • This threshold on the E2 / E1 ratio is typically of the order of 0.93.
  • Bit X is representative of a signal condition on each frame.
  • the state bit Y is not taken directly equal to the condition bit X, but it results from a processing of successive condition bits X by a state determination circuit 29.
  • the operation of the state determination circuit 29 is illustrated in FIG. 4, where the upper timing diagram illustrates an example of evolution of the bit X provided by the comparator 27.
  • The state bit Y (lower timing diagram) is initialized to 0, because IRS characteristics are the most frequently encountered.
  • As soon as the counting variable V reaches a predetermined threshold (8 in the example considered), it is reset to 0 and the value of the bit Y is changed, so that the signal is determined to have changed state.
  • The signal is in state Y_A up to frame M, in state Y_B between frames M and N (change of signal source), then again in state Y_A from frame N.
  • other modes of incrementation and decrementation and other threshold values would be usable.
  • the above counting mode can for example be obtained by circuit 29 shown in Figure 3.
  • This circuit includes a four-bit counter 32, whose most significant bit corresponds to the state bit Y and whose three least significant bits represent the counting variable V.
  • The X and Y bits are supplied to the inputs of an EXCLUSIVE-OR gate 33, whose output is applied to the incrementing input of the counter 32 via an AND gate 34 whose other input receives the Z bit supplied by the threshold comparator 25.
  • The inverted output of the gate 33 is supplied to a decrementing input of the counter 32 via another AND gate 35, whose other two inputs receive, respectively, the Z bit supplied by the comparator 25 and the output of a three-input OR gate 36 receiving the three least significant bits of the counter 32.
  • The counter 32 is arranged to double the pulses received on its decrementing input when its least significant bit is 0 or when at least one of the next two bits is 1, as shown by the OR gate 37 in FIG. 3.
  • When the Z bit is 0, the state determination circuit 29 is not activated, because the AND gates 34, 35 prevent the value of the counter 32 from being changed.
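
The counting behaviour described by the gates above can be sketched in software. This is only an illustrative model, not the circuit itself: the function names `update_state` and `track_state` are hypothetical, and the decrement rule (two steps per agreeing frame, or one step when V equals 1) is one reading of the description of the decrementing input.

```python
V_THRESHOLD = 8  # the three-bit count V overflows into the state bit Y at 8


def update_state(y, v, x, z):
    """Return the new (Y, V) pair after one frame.

    y: current state bit, v: counting variable (0..7),
    x: condition bit of the frame, z: energy gate bit of the frame.
    """
    if z == 0:
        return y, v                  # low energy: both counter inputs are blocked
    if x != y:
        v += 1                       # frame disagrees with the current state
        if v == V_THRESHOLD:
            return 1 - y, 0          # overflow: Y toggles and V restarts at 0
        return y, v
    # Frame agrees with the current state: count down, never below 0.
    # Assumed rule: two steps when V >= 2, one step when V == 1.
    if v >= 2:
        v -= 2
    elif v == 1:
        v -= 1
    return y, v


def track_state(x_bits, z_bits, y=0, v=0):
    """Run the counter over a sequence of frames and return Y after each frame."""
    states = []
    for x, z in zip(x_bits, z_bits):
        y, v = update_state(y, v, x, z)
        states.append(y)
    return states


steady_change = track_state([1] * 8, [1] * 8)    # eight linear-like frames in a row
silent_frames = track_state([1] * 8, [0] * 8)    # same conditions, but gated off
flicker = track_state([1, 0] * 10, [1] * 20)     # isolated disagreeing frames
```

With this model, eight consecutive disagreeing frames are needed to toggle Y, while isolated disagreeing frames are cancelled by the faster count-down.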

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Description

The present invention relates to a method and a device for preprocessing the acoustic signal supplied to a speech coder. It applies in particular, but not exclusively, to improving the performance of low bit rate speech coders.

Current low bit rate speech coders (typically 5 kbit/s for a sampling frequency of 8 kHz) give their best performance on signals with a "telephone" spectrum, that is to say in the 300-3400 Hz band and with a pre-emphasis of the high frequencies. These spectral characteristics correspond to the IRS (Intermediate Reference System) template defined by the CCITT in Recommendation P48. This template was defined for telephone handsets, both at the input (microphone) and at the output (earpiece).

However, it is increasingly common for the input signal of a speech coder to have a "flatter" spectrum, for example when a hands-free installation is used with a microphone having a linear frequency response. Conventional vocoders are designed to be independent of the input they operate with, and they are not informed of the characteristics of that input. If microphones with different characteristics may be connected to the vocoder, or more generally if the vocoder may receive acoustic signals with different spectral characteristics, there are cases in which the vocoder is used sub-optimally.

In this context, a main object of the present invention is to improve the performance of a vocoder by making it less dependent on the spectral characteristics of the signal supplied to it.

The method according to the invention consists in subjecting the input acoustic signal to high-pass filtering; comparing the energy of the high-pass filtered signal with that of the unfiltered signal in order to determine a state of the signal, chosen between a first state in which the energy of the high-pass filtered signal is greater than a predetermined fraction of the energy of the unfiltered signal and a second state in which it is less than that fraction; and, when the signal is in its second state, supplying the coder input with the high-pass filtered signal after pre-emphasis of its high frequencies.

The high-pass filter used is typically a filter with a sharp cutoff at 400 Hz, and the predetermined energy fraction is typically 85 to 95%. The first state of the signal corresponds to IRS characteristics, and the second state corresponds to a flatter spectrum of the input acoustic signal, containing proportionally more energy at low frequencies. With the method according to the invention, such a flat-spectrum signal is preprocessed (high-pass filtering and pre-emphasis) to bring its spectral characteristics closer to those of the IRS template. Using high-pass rather than low-pass filtering to determine the state of the signal has the advantage that the filtered signal can itself be supplied (after pre-emphasis) to the vocoder input.

Preferably, the determined state of the signal can be changed only when the input acoustic signal, or the high-pass filtered signal, has an energy greater than a predetermined threshold. Otherwise (for example during silence or low ambient noise), the energy of the signal is too low for its spectral characteristics to be assessed reliably.

When the acoustic signal is digitized in successive frames, it is detected whether the signal in each frame is in a first condition corresponding to the first state or in a second condition corresponding to the second state, and the state of the signal is determined from these frame-by-frame conditions, the determined state being changed only after several successive frames show a signal condition different from the one corresponding to the previously determined state. This introduces a kind of hysteresis that accommodates rapid variations of the spectral envelope of the speech signal, due to ambient noise or to the speech itself (the timbre of the voice is not constant). The risk of wrongly determining the state of the signal is thereby reduced, which improves the quality of the coded signal and avoids timbre discontinuities that untimely changes of the determined state could cause.

The preprocessing device according to the invention comprises a high-pass filter receiving the input acoustic signal, means for calculating the energies contained respectively in said acoustic signal and in the output signal of the high-pass filter, means for comparing the calculated energies, and a high-frequency pre-emphasis filter whose input receives the output signal of the high-pass filter and whose output delivers the signal supplied to the coder input when the comparison means reveal that the output signal of the high-pass filter contains less than a predetermined fraction of the energy of said acoustic signal.

Other features and advantages of the present invention will appear in the following description of a preferred but non-limiting embodiment, with reference to the accompanying drawings, in which:

  • FIG. 1 is a diagram illustrating the characteristics of an IRS-type acoustic signal and of a linear-type signal;
  • FIG. 2 is a block diagram of a preprocessing device according to the invention;
  • FIG. 3 is a more detailed diagram of the comparison means of the device of FIG. 2; and
  • FIG. 4 shows timing diagrams illustrating how the state of the signal is determined by the means of FIG. 3.

In FIG. 1, the two solid lines correspond to the bounds of the IRS template defined for microphones in CCITT Recommendation P48. It can be seen that an IRS-type microphone signal has a strong attenuation in the lower part of the spectrum (between 0 and 300 Hz) and a relative emphasis of the high frequencies. By comparison, a linear-type signal, supplied for example by the microphone of a hands-free installation, has a flatter spectrum, in particular without the strong attenuation at low frequencies (a typical example of such a linear-type signal is shown by a dashed line in the diagram of FIG. 1).

These spectral properties are exploited in the preprocessing device 10 according to the invention, shown schematically in FIG. 2. This device processes the input signal supplied by an acoustic signal source and delivers it to a speech coder 12. The coder 12 is a low bit rate coder optimized for an IRS-type input signal. It may be, among others, a linear prediction coder with excitation by regular pulse vectors (RP-CELP), as described in document EP-A-0 347 307. The coder 12 has no a priori knowledge of the source of the acoustic signal supplied to it.

In the diagram of FIG. 2, the input acoustic signal S_I is the output signal of a microphone 13, amplified and digitized by an analog-to-digital converter 14. The signal is typically digitized at a sampling rate of 8 kHz and arranged in successive 30 ms frames, each containing 240 samples of 16 bits.
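
The frame size quoted above follows directly from the sampling rate and the frame duration (8000 samples/s × 0.030 s = 240 samples). A minimal sketch of the framing step is given below; the helper `split_frames` is hypothetical, and dropping a trailing partial frame is an assumption, since the text does not say how an incomplete frame is handled.

```python
SAMPLE_RATE_HZ = 8000      # sampling rate quoted in the text
FRAME_MS = 30              # frame duration quoted in the text
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000   # samples per frame


def split_frames(samples, frame_len=FRAME_LEN):
    """Cut a sequence of samples into consecutive, non-overlapping frames.

    A trailing partial frame is dropped (assumption, see lead-in).
    """
    n_frames = len(samples) // frame_len
    return [list(samples[i * frame_len:(i + 1) * frame_len])
            for i in range(n_frames)]


one_second = [0] * SAMPLE_RATE_HZ      # one second of digital silence
frames = split_frames(one_second)      # 33 full frames, 80 samples left over
```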

The preprocessing device 10 comprises a high-pass filter 16 that receives the input acoustic signal S_I and delivers a filtered signal S_I'. The filter 16 is typically a digital bi-quad filter with a sharp cutoff at 400 Hz. The energies E1 and E2 contained in each frame of the input acoustic signal S_I and of the filtered signal S_I', respectively, are calculated by two units 17, 18, each of which sums the squares of the samples of each frame it receives. The calculated energies E1 and E2 are supplied to a comparison unit 20, which determines the state of the signal in the form of a bit Y equal to 0 when the signal is determined to be of IRS type (state Y_A) and to 1 when the signal is determined to be rather of linear type (state Y_B).
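
As an illustration of the filter and energy units, the sketch below uses a single second-order high-pass section with a Butterworth-style quality factor, computed with the widely used bilinear-transform biquad formulas. These coefficient choices are assumptions: the text only specifies a bi-quad filter with a sharp 400 Hz cutoff, and a single section of this kind is gentler than that. All function names are hypothetical, and the filter state is reset for each frame for simplicity.

```python
import math


def highpass_biquad_coeffs(fc_hz=400.0, fs_hz=8000.0, q=1 / math.sqrt(2)):
    """Normalized (b, a) coefficients of a second-order high-pass section."""
    w0 = 2 * math.pi * fc_hz / fs_hz
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b = [(1 + cos_w0) / 2 / a0, -(1 + cos_w0) / a0, (1 + cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a


def biquad_filter(x, b, a):
    """Direct-form I filtering of one frame, starting from a zero state."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def frame_energy(frame):
    """Sum of the squares of the samples of a frame (units 17 and 18)."""
    return sum(s * s for s in frame)


def energy_ratio(frame, b, a):
    """E2/E1: energy after high-pass filtering over energy before it."""
    e1 = frame_energy(frame)
    e2 = frame_energy(biquad_filter(frame, b, a))
    return e2 / e1


def tone(freq_hz, n=240, fs_hz=8000.0):
    """One frame of a unit-amplitude sine wave, used as a test signal."""
    return [math.sin(2 * math.pi * freq_hz * k / fs_hz) for k in range(n)]


b, a = highpass_biquad_coeffs()
low_ratio = energy_ratio(tone(100.0), b, a)     # energy mostly below the cutoff
high_ratio = energy_ratio(tone(1000.0), b, a)   # energy mostly above the cutoff
```

A frame dominated by low frequencies loses most of its energy in the filter, so its ratio E2/E1 is small, whereas a frame whose energy lies well above 400 Hz keeps a ratio close to 1.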

The output of the preprocessing device 10, connected to the input of the coder 12, is formed by one terminal of a switch 21 whose other terminal is connected either to the input of the high-pass filter 16 or to the output of a pre-emphasis filter 22, depending on the value of the bit Y delivered by the comparison unit 20. When Y = 0 (state Y_A), the switch 21 is in the position shown in FIG. 2, and the input acoustic signal S_I is supplied to the input of the coder 12. In the other position (Y = 1, state Y_B), the output of the pre-emphasis filter 22 is supplied to the input of the coder 12. The pre-emphasis filter 22 receives the high-pass filtered signal S_I' and applies to it a transfer function of the form H(z) = 1 - β/z, in which β denotes a pre-emphasis coefficient typically of the order of 0.4. Thus, when the acoustic signal is of linear type, it is transformed by high-pass filtering (filter 16) and pre-emphasis (filter 22) so as to reach the input of the coder 12 with spectral characteristics closer to those of the IRS template.
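
In the time domain, the transfer function H(z) = 1 - β/z corresponds to the difference equation y[n] = x[n] - β·x[n-1]. The sketch below applies it to a frame and evaluates its gain on the unit circle; the function names and the `prev` argument (last sample of the previous frame) are illustrative assumptions.

```python
import math


def pre_emphasis(x, beta=0.4, prev=0.0):
    """Apply y[n] = x[n] - beta * x[n-1] to one frame.

    prev is the sample that preceded the frame (0.0 at start-up).
    """
    y = []
    for xn in x:
        y.append(xn - beta * prev)
        prev = xn
    return y


def pre_emphasis_gain(freq_hz, beta=0.4, fs_hz=8000.0):
    """Magnitude of H(z) = 1 - beta/z evaluated at z = exp(j*2*pi*f/fs)."""
    w = 2 * math.pi * freq_hz / fs_hz
    return math.sqrt(1 + beta * beta - 2 * beta * math.cos(w))


constant_input = pre_emphasis([1.0, 1.0, 1.0])   # a DC input is attenuated
gain_dc = pre_emphasis_gain(0.0)                 # 1 - beta
gain_nyquist = pre_emphasis_gain(4000.0)         # 1 + beta
```

With β = 0.4 the gain rises from 0.6 at 0 Hz to 1.4 at 4 kHz, which is the high-frequency tilt that the pre-emphasis adds to the high-pass filtered signal.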

Since the high-pass filter 16 has little effect on the input signal when the latter has IRS characteristics, it is also possible to supply the coder 12 with the high-pass filtered signal S_I' when the signal has been determined to be in state Y_A, corresponding to IRS characteristics. A variant of the diagram of FIG. 2 then consists in dispensing with the switch 21, connecting the output of the pre-emphasis filter 22 directly to the input of the coder 12, and controlling the value of the coefficient β in the filter 22 as a function of the value of the state bit Y (for example β = 0 when Y = 0 and β = 0.4 when Y = 1).
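
The two routing options can be compared side by side. The sketch below is a simplified per-frame model with hypothetical function names; it resets the pre-emphasis memory at each call, which a real implementation would carry over from frame to frame.

```python
def pre_emphasis(x, beta):
    """Apply y[n] = x[n] - beta * x[n-1], with x[-1] taken as 0."""
    y, prev = [], 0.0
    for xn in x:
        y.append(xn - beta * prev)
        prev = xn
    return y


def route_with_switch(s_in, s_hp, y_bit, beta=0.4):
    """Switch arrangement: the raw input when Y = 0, otherwise the
    pre-emphasized high-pass signal."""
    if y_bit == 0:
        return list(s_in)
    return pre_emphasis(s_hp, beta)


def route_without_switch(s_hp, y_bit, beta_linear=0.4):
    """Variant without switch: the pre-emphasis filter always feeds the
    coder, and only its coefficient depends on Y."""
    beta = beta_linear if y_bit == 1 else 0.0
    return pre_emphasis(s_hp, beta)


raw = [5.0, 6.0]          # stands for a frame of the input signal
filtered = [1.0, 2.0]     # stands for the same frame after high-pass filtering

irs_switch = route_with_switch(raw, filtered, 0)      # raw input passes through
irs_variant = route_without_switch(filtered, 0)       # filtered signal, beta = 0
linear_switch = route_with_switch(raw, filtered, 1)
linear_variant = route_without_switch(filtered, 1)
```

The two arrangements give the same output in the linear state; they differ only in the IRS state, where the variant delivers the high-pass filtered signal instead of the raw input.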

The comparison unit 20 conforms, for example, to the diagram illustrated in FIG. 3. The energy E1 of each frame of the input signal SI is sent to the input of a threshold comparator 25, which delivers a bit Z of value 0 when the energy E1 is below a predetermined energy threshold, and of value 1 when the energy E1 is above the threshold. The energy threshold is typically of the order of -38 dB relative to the saturation energy of the signal. The comparator 25 serves to inhibit the determination of the state of the signal when the signal contains too little energy to be representative of the characteristics of the source. In that case, the determined state of the signal remains unchanged.
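A minimal sketch of the threshold comparator 25 follows. It assumes that the frame energy is the sum of squared samples and that the saturation energy is supplied by the caller (for instance, the frame length times the square of the full-scale amplitude). Both are assumptions of this example rather than statements from the passage above, as are the function names.

```python
def frame_energy(frame):
    """Energy of one frame, taken here as the sum of squared samples."""
    return sum(x * x for x in frame)


def energy_gate(e1, saturation_energy, threshold_db=-38.0):
    """Return the bit Z: 1 if E1 exceeds the threshold, else 0.

    The threshold is expressed in dB relative to the saturation energy,
    so it is converted with 10 ** (dB / 10) because it is an energy ratio.
    """
    threshold = saturation_energy * 10.0 ** (threshold_db / 10.0)
    return 1 if e1 > threshold else 0
```

With a full-scale amplitude of 1.0 and 160-sample frames, for example, the saturation energy is 160 and the threshold is roughly 0.025, so a frame of samples at amplitude 0.1 passes the gate while a frame at amplitude 0.001 does not.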

The energies E1 and E2 are sent to a digital divider 26, which calculates the ratio E2/E1 for each frame. This ratio E2/E1 is sent to another threshold comparator 27, which delivers a bit X of value 0 when the ratio E2/E1 is above a predetermined threshold, and of value 1 when the ratio E2/E1 is below the threshold. This threshold on the ratio E2/E1 is typically of the order of 0.93. The bit X is representative of a condition of the signal in each frame. The condition X = 0 corresponds to the IRS characteristics of the input signal (state YA), and the condition X = 1 corresponds to the linear characteristics (state YB). To avoid repeated and spurious changes of state caused by short-term variations in the vocal excitation, the state bit Y is not simply set equal to the condition bit X; instead, it results from processing of the successive condition bits X by a state determination circuit 29.
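The per-frame condition bit X can be sketched as below. Instead of dividing, the example compares E2 with the threshold times E1, which gives the same result for E1 > 0 and avoids a division by zero on silent frames. That rearrangement, the function name, and the choice of X = 0 when the ratio exactly equals the threshold are assumptions of this illustration.

```python
def condition_bit(e1, e2, ratio_threshold=0.93):
    """Return X for one frame.

    X = 1 (linear characteristics) when E2/E1 is below the threshold,
    X = 0 (IRS characteristics) otherwise. Written as a product so that
    a frame with E1 == 0 does not raise ZeroDivisionError.
    """
    return 1 if e2 < ratio_threshold * e1 else 0
```

A frame that keeps 95 % of its energy after high-pass filtering yields X = 0, whereas a frame that keeps only 80 % yields X = 1, meaning that a noticeable share of its energy lay at low frequencies.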

The operation of the state determination circuit 29 is illustrated in FIG. 4, where the upper timing diagram shows an example of the evolution of the bit X supplied by the comparator 27. The state bit Y (lower timing diagram) is initialized to 0, because IRS characteristics are the ones most frequently encountered. A counting variable V, initially set to 0, is calculated frame after frame. The variable V is incremented by one unit each time the condition X of the signal in a frame differs from that corresponding to the determined state Y (X = 1 and Y = 0, or X = 0 and Y = 1). Otherwise (X = Y = 0 or 1), the variable V is decremented by two units if it is different from 0 and from 1, decremented by one unit if it is equal to 1, and kept unchanged if it is equal to 0. As soon as the variable V reaches a predetermined threshold (8 in the example considered), it is reset to 0 and the value of the bit Y is changed, so that the signal is determined to have changed state. Thus, in the example shown in FIG. 4, the signal is in state YA up to frame M, in state YB between frames M and N (change of signal source), then again in state YA from frame N onward. Of course, other modes of incrementation and decrementation and other threshold values could be used.
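The counting rule described above can be sketched as a small state tracker. The class and method names are hypothetical. The optional argument `z` models the energy-gate bit Z from the description of comparator 25, and it defaults to 1 so that the counting rule can be exercised on its own.

```python
class StateTracker:
    """Software model of the state determination rule (illustrative only)."""

    def __init__(self, threshold=8):
        self.y = 0  # state bit Y, initialized to 0 (IRS characteristics)
        self.v = 0  # counting variable V
        self.threshold = threshold

    def update(self, x, z=1):
        """Process one frame's condition bit X and return the state Y."""
        if z == 0:
            # Too little energy: the determined state stays unchanged.
            return self.y
        if x != self.y:
            self.v += 1
            if self.v >= self.threshold:
                self.v = 0
                self.y ^= 1  # the signal is determined to have changed state
        elif self.v > 1:
            self.v -= 2
        elif self.v == 1:
            self.v -= 1
        return self.y
```

Because agreement removes two units while disagreement adds only one, isolated frames that contradict the current state are quickly forgotten. With the threshold of 8, at least eight disagreeing frames are needed before Y flips, and more if agreeing frames are interleaved.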

The counting mode described above can be obtained, for example, by the circuit 29 shown in FIG. 3. This circuit comprises a four-bit counter 32 whose most significant bit corresponds to the state bit Y and whose three least significant bits represent the counting variable V. The bits X and Y are supplied to the input of an EXCLUSIVE-OR gate 33, whose output is sent to the increment input of the counter 32 via an AND gate 34 whose other input receives the bit Z supplied by the threshold comparator 25. Thus, the variable V is incremented when X ≠ Y and Z = 1. The inverted output of the gate 33 is supplied to a decrement input of the counter 32 via another AND gate 35, whose other two inputs receive, respectively, the bit Z supplied by the comparator 25 and the output of a three-input OR gate 36 receiving the three least significant bits of the counter 32. The counter 32 is arranged to double the pulses received on its decrement input when its least significant bit is 0 or when at least one of the next two bits is 1, as shown schematically by the OR gate 37 in FIG. 3. Thus, the counter 32 is decremented (by one unit if V = 1 and by two units if V > 1) when X = Y, Z = 1 and V ≠ 0. When the energy of the input signal is insufficient, Z = 0 and the determination circuit 29 is not activated, because the AND gates 34, 35 prevent the value of the counter 32 from being modified.
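One way to read the four-bit counter is that the carry out of the three low bits is what toggles the most significant bit: incrementing from V = 7 gives V = 0 and flips Y, which matches the threshold of 8. The text does not spell this mechanism out, so the sketch below is an interpretation, and the function name and bit masks are assumptions of this example.

```python
def step_counter(counter, x, z):
    """Advance a 4-bit counter: bit 3 is Y, bits 2..0 hold V."""
    y = (counter >> 3) & 1
    v = counter & 0b111
    if z == 0:
        # AND gates 34 and 35 block both inputs: the counter is frozen.
        return counter
    if x != y:
        # XOR gate 33 is high: increment. The carry out of V toggles Y,
        # and wrapping at 4 bits takes Y from 1 back to 0.
        return (counter + 1) & 0b1111
    if v == 0:
        # OR gate 36 is low: no decrement when V is already zero.
        return counter
    # Doubled decrement pulse unless V == 1 (role of OR gate 37).
    return counter - (1 if v == 1 else 2)
```

Since V is never decremented below zero, a decrement cannot borrow from the Y bit, so in this model Y changes only through the increment path.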

Claims (8)

  1. Method of preprocessing an acoustic signal upstream of a speech coder (12), characterized in that the acoustic signal (SI) is subjected to high-pass filtering, the energy (E2) of the high-pass filtered signal (SI') is compared with that (E1) of the unfiltered signal in order to determine a state (Y) of the signal from among a first state (YA) for which the energy of the high-pass filtered signal is above a predetermined fraction of the energy of the unfiltered signal and a second state (YB) for which the energy of the high-pass filtered signal is below the predetermined fraction of the energy of the unfiltered signal, and the high-pass filtered signal subjected to pre-emphasis of the high frequencies is addressed to the input of the coder (12) when the signal is in its second state.
  2. Method according to Claim 1, characterized in that the determined state of the signal is not modified when said acoustic signal or the high-pass filtered signal has energy below a predetermined threshold.
  3. Method according to Claim 1 or 2, characterized in that the acoustic signal (SI) being digitized as successive frames, there is frame-by-frame detection of whether the signal is in a first condition, corresponding to the first state (YA), for which the calculated energy (E2) of the frame of the high-pass filtered signal (SI') is above the predetermined fraction of the calculated energy (E1) of the frame of the unfiltered signal (SI) or in a second condition, corresponding to the second state (YB), for which the calculated energy of the frame of the high-pass filtered signal is below the predetermined fraction of the calculated energy of the frame of the unfiltered signal, and the state (Y) of the signal is determined on the basis of the frame-by-frame conditions (X), by modifying the determined state only after several successive frames show a signal condition different from that corresponding to the previously determined state.
  4. Method according to Claim 3, characterized in that a counting variable (V) is incremented when the condition (X) of the signal in a frame differs from that corresponding to the determined state (Y) of the signal, in that said counting variable (V) is decremented when the condition of the signal in a frame is that corresponding to the determined state of the signal, unless this variable equals zero, and in that, when the counting variable (V) reaches a predetermined threshold, it is reset to zero and the signal is determined to have changed state.
  5. Device (10) for preprocessing an acoustic signal upstream of a speech coder (12), characterized in that it comprises a high-pass filter (16) receiving said acoustic signal (SI), means (17, 18) for calculating the energies (E1, E2) contained respectively in said acoustic signal (SI) and in the output signal (SI') of the high-pass filter, means (20) for comparing the calculated energies, and a filter (22) for pre-emphasis of the high frequencies, the input of which receives the output signal from the high-pass filter, and the output of which delivers the signal addressed to the input of the coder (12) when the means of comparison (20) reveal that the output signal from the high-pass filter contains less than a predetermined fraction of the energy of said acoustic signal.
  6. Device according to Claim 5, characterized in that, with the acoustic signal being digitized as successive frames, the energies (E1, E2) are calculated for each frame by the means of calculation (17, 18), and the means of comparison (20) comprise a comparator (27) which detects frame by frame whether the signal is in a first or a second condition according to whether the ratio (E2/E1) between the calculated energy of the output signal from the high-pass filter (16) and the calculated energy of said acoustic signal (SI) is above or, respectively, below a predetermined value, and means (29) for determining a state (Y) of the signal from among first and second states (YA, YB) corresponding respectively to the first and second conditions of the signal per frame, these means (29) modifying the determined state of the signal only after the comparator (27) indicates for several successive frames a signal condition different from that corresponding to the previously determined state, and the pre-emphasis filter (22) being used to filter the signal addressed to the input of the coder (12) only when the means (29) have determined that the signal is in its second state.
  7. Device according to Claim 6, characterized in that the means (29) for determining the state of the signal comprise a counter (32) calculating after each frame a counting variable (V), incrementing it when the comparator (27) indicates a signal condition different from that corresponding to the determined state of the signal, decrementing it, unless it equals zero, when the comparator (27) indicates a signal condition identical to that corresponding to the determined state of the signal, and resetting it to zero when it reaches a predetermined threshold, the determined state (Y) of the signal being modified on each reset to zero of the counting variable (V).
  8. Device according to Claim 6 or 7, characterized in that it comprises another comparator (25) which compares the calculated energy of said acoustic signal or of the high-pass filtered signal with a predetermined threshold, so as to activate the means (29) of determining state of the signal only when said threshold is exceeded.
EP95401261A 1994-06-03 1995-05-31 Method and apparatus for preprocessing an acoustic signal before speech coding Expired - Lifetime EP0685836B1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR9406824A FR2720849B1 (en) 1994-06-03 1994-06-03 Method and device for preprocessing an acoustic signal upstream of a speech coder.
FR9406824 1994-06-03

Publications (2)

Publication Number Publication Date
EP0685836A1 EP0685836A1 (en) 1995-12-06
EP0685836B1 true EP0685836B1 (en) 1999-07-21

Family

ID=9463860

Family Applications (1)

Application Number Title Priority Date Filing Date
EP95401261A Expired - Lifetime EP0685836B1 (en) 1994-06-03 1995-05-31 Method and apparatus for preprocessing an acoustic signal before speech coding

Country Status (4)

Country Link
US (1) US5644679A (en)
EP (1) EP0685836B1 (en)
DE (1) DE69510865T2 (en)
FR (1) FR2720849B1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2729247A1 (en) * 1995-01-06 1996-07-12 Matra Communication SYNTHETIC ANALYSIS-SPEECH CODING METHOD
US6799159B2 (en) * 1998-02-02 2004-09-28 Motorola, Inc. Method and apparatus employing a vocoder for speech processing

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0243562B1 (en) * 1986-04-30 1992-01-29 International Business Machines Corporation Improved voice coding process and device for implementing said process
FR2632758B1 (en) * 1988-06-13 1991-06-07 Matra Communication LINEAR PREDICTION SPEECH CODING AND ENCODING METHOD
JP2626223B2 (en) * 1990-09-26 1997-07-02 日本電気株式会社 Audio coding device

Also Published As

Publication number Publication date
EP0685836A1 (en) 1995-12-06
FR2720849A1 (en) 1995-12-08
US5644679A (en) 1997-07-01
DE69510865T2 (en) 2000-07-13
FR2720849B1 (en) 1996-08-14
DE69510865D1 (en) 1999-08-26

Similar Documents

Publication Publication Date Title
EP0127718B1 (en) Process for activity detection in a voice transmission system
EP2419900B1 (en) Method and device for the objective evaluation of the voice quality of a speech signal taking into account the classification of the background noise contained in the signal
EP0139803B1 (en) Method of recovering lost information in a digital speech transmission system, and transmission system using said method
EP2002428B1 (en) Method for trained discrimination and attenuation of echoes of a digital signal in a decoder and corresponding device
JP3423906B2 (en) Voice operation characteristic detection device and detection method
EP0932964B1 (en) Method and device for blind equalizing of transmission channel effects on a digital speech signal
EP0685833B1 (en) Method for speech coding using linear prediction
US7167544B1 (en) Telecommunication system with error messages corresponding to speech recognition errors
EP0906613B1 (en) Method and device for coding an audio signal by "forward" and "backward" lpc analysis
EP0428445B1 (en) Method and apparatus for coding of predictive filters in very low bitrate vocoders
SE470577B (en) Method and apparatus for encoding and / or decoding background noise
EP0043056B1 (en) Process for the detection of speech in a telephone circuit signal, and speech detector therefor
EP0685836B1 (en) Method and apparatus for preprocessing an acoustic signal before speech coding
EP0692883A1 (en) Blind equalisation method, and its application to speech recognition
EP0714088B1 (en) Voice activity detection
EP1229517B1 (en) Method for recognizing speech with noise-dependent variance normalization
FR2494525A1 (en) TONE CONTROL CIRCUIT
EP1021805B1 (en) Method and apparatus for conditioning a digital speech signal
CA1165917A (en) Device for measuring the attenuation in a transmission path
EP0776114A2 (en) Telephone apparatus with controllable volume in response to ambient noise
US6633847B1 (en) Voice activated circuit and radio using same
EP0989544A1 (en) Device and method for filtering a speech signal, receiver and telephone communications system
EP0337868B1 (en) Method and apparatus for signal discrimination
EP0015363B1 (en) Speech detector with a variable threshold level
EP0073720B1 (en) Device for digital frequency generation

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE ES GB IT NL SE

17P Request for examination filed

Effective date: 19951118

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: MATRA NORTEL COMMUNICATIONS

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

17Q First examination report despatched

Effective date: 19981112

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE ES GB IT NL SE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY

Effective date: 19990721

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 19990721

Ref country code: ES

Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY

Effective date: 19990721

GBT Gb: translation of ep patent filed (gb section 77(6)(a)/1977)

Effective date: 19990804

REF Corresponds to:

Ref document number: 69510865

Country of ref document: DE

Date of ref document: 19990826

ITF It: translation for a ep patent filed
NLV1 Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed
REG Reference to a national code

Ref country code: GB

Ref legal event code: IF02

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20040528

Year of fee payment: 10

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20050414

Year of fee payment: 11

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED.

Effective date: 20050531

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20051201

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20060531

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20060531