EP0685836B1 - Method and apparatus for preprocessing an acoustic signal before speech coding - Google Patents
- Publication number
- EP0685836B1 (application EP95401261A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- signal
- state
- frame
- energy
- acoustic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
Definitions
- The present invention relates to a method and a device for preprocessing the acoustic signal supplied to a speech coder. It applies in particular, but not exclusively, to improving the performance of low bit rate speech coders.
- Low bit rate speech coders (typically 5 kbit/s for a sampling frequency of 8 kHz) perform best on signals having a "telephone" spectrum, that is to say, in the 300-3400 Hz band and with pre-emphasis at high frequencies.
- IRS: Intermediate Reference System
- This template was defined for telephone handsets, both at the input (microphone) and at the output (earpiece).
- The speech coder input signal has a "flatter" spectrum, for example when a hands-free installation is used with a microphone having a linear frequency response.
- Conventional vocoders are designed to be independent of the input with which they operate, and they are not informed of the characteristics of that input. If microphones with different characteristics may be connected to the vocoder, or more generally if the vocoder may receive acoustic signals with different spectral characteristics, there are cases in which the vocoder is used sub-optimally.
- A main object of the invention is to improve the performance of a vocoder by making it less dependent on the spectral characteristics of the signal supplied to it.
- The method according to the invention consists in subjecting the acoustic input signal to high-pass filtering, in comparing the energy of the high-pass filtered signal with that of the unfiltered signal to determine a signal state from among a first state, in which the energy of the high-pass filtered signal is greater than a predetermined fraction of the energy of the unfiltered signal, and a second state, in which the energy of the high-pass filtered signal is less than that predetermined fraction, and in supplying to the coder input the high-pass filtered signal subjected to high-frequency pre-emphasis when the signal is in its second state.
- The high-pass filter used is typically a filter with a sharp cutoff at 400 Hz, and the predetermined energy fraction is typically 85 to 95%.
- The first signal state corresponds to IRS characteristics.
- The second state corresponds to a flatter spectrum of the acoustic input signal, containing proportionally more energy at low frequencies.
- A flat-spectrum signal is preprocessed (high-pass filtering and pre-emphasis) to bring its spectral characteristics closer to those of the IRS template.
- Using high-pass filtering to determine the signal state has the advantage, compared with low-pass filtering, that the filtered signal itself can be supplied (after pre-emphasis) to the input of the vocoder.
- The determined state of the signal can be changed only when the acoustic input signal, or the high-pass filtered signal, has an energy greater than a predetermined threshold. Otherwise (for example during silence or low ambient noise), the signal energy is too low for its spectral characteristics to be assessed reliably.
- When the acoustic signal is digitized in successive frames, it is detected whether the signal in each frame is in a first condition corresponding to the first state or in a second condition corresponding to the second state, and the state of the signal is determined from these frame-by-frame conditions, the determined state being modified only after several successive frames have shown a signal condition different from that corresponding to the previously determined state.
- This introduces a kind of hysteresis that accommodates variations in the spectral envelope of the speech signal, due to ambient noise or to the speech itself (the timbre of the voice is not constant). The risk of wrongly determining the signal state is thus reduced, which improves the quality of the coded signal and avoids introducing timbre discontinuities that could result from spurious changes of the determined state.
- The preprocessing device includes a high-pass filter receiving the acoustic input signal, means for calculating the energies contained respectively in said acoustic signal and in the output signal of the high-pass filter, means for comparing the calculated energies, and a high-frequency pre-emphasis filter whose input receives the output signal of the high-pass filter and whose output delivers the signal supplied to the coder input when the comparison means reveal that the output signal of the high-pass filter contains less than a predetermined fraction of the energy of said acoustic signal.
- The two solid lines correspond to the bounds of the IRS template defined for microphones in CCITT Recommendation P48.
- An IRS-type microphone signal has strong attenuation in the lower part of the spectrum (between 0 and 300 Hz) and a relative emphasis at high frequencies.
- A linear-type signal, supplied for example by the microphone of a hands-free installation, has a flatter spectrum and in particular lacks the strong attenuation at low frequencies (a typical example of such a linear-type signal is shown by the dashed line in the diagram of Figure 1).
- Coder 12 is a low bit rate coder optimized for an IRS-type input signal. It may be, among others, a linear prediction coder with excitation by regular pulse vectors (RP-CELP), as described in document EP-A-0 347 307. The coder 12 has no a priori knowledge of the source of the acoustic signal supplied to it.
- RP-CELP: linear prediction coding with excitation by regular pulse vectors
- The acoustic input signal SI is the output signal of a microphone 13 after amplification and digitization by an analog-to-digital converter 14.
- The signal is typically digitized at a sampling rate of 8 kHz and arranged in successive 30 ms frames, each containing 240 16-bit samples.
- The preprocessing device 10 comprises a high-pass filter 16 receiving the acoustic input signal SI and delivering a filtered signal SI'.
- The filter 16 is typically a digital bi-quad filter with a sharp cutoff at 400 Hz.
- The energies E1 and E2 contained in each frame of the acoustic input signal SI and of the filtered signal SI' are calculated by two units 17, 18, each computing the sum of the squares of the samples of each frame it receives.
- The calculated energies E1 and E2 are supplied to a comparison unit 20, which determines the state of the signal in the form of a bit Y equal to 0 when the signal is determined to be of IRS type (state YA), and to 1 when the signal is determined to be rather of linear type (state YB).
- The output of the preprocessing device 10, connected to the input of the coder 12, is one terminal of a switch 21 whose other terminal is connected either to the input of the high-pass filter 16 or to the output of a pre-emphasis filter 22, according to the value of the bit Y delivered by the comparison unit 20.
- H(z) = 1 - β/z
- β denotes a pre-emphasis coefficient, typically of the order of 0.4.
- the comparison unit 20 is for example in accordance with the diagram illustrated in FIG. 3.
- the energy E1 of each frame of the input signal S I is sent to the input of a threshold comparator 25 which delivers a bit Z of value 0 when the energy E1 is less than a predetermined energy threshold, and of value 1 when the energy E1 is greater than the threshold.
- the energy threshold is typically of the order of -38 dB relative to the signal saturation energy.
- the comparator 25 serves to inhibit the determination of the state of the signal when the latter contains too little energy to be representative of the characteristics of the source. In this case, the determined state of the signal remains unchanged.
- the energies E1 and E2 are sent to a digital divider 26 which calculates the ratio E2 / E1 for each frame.
- This E2 / E1 ratio is sent to another threshold comparator 27 which delivers a bit X of value 0 when the E2 / E1 ratio is greater than a predetermined threshold, and of value 1 when the E2 / E1 ratio is less than the threshold.
- This threshold on the E2 / E1 ratio is typically of the order of 0.93.
- Bit X is representative of a signal condition on each frame.
- the state bit Y is not taken directly equal to the condition bit X, but it results from a processing of successive condition bits X by a state determination circuit 29.
- the operation of the state determination circuit 29 is illustrated in FIG. 4, where the upper timing diagram illustrates an example of evolution of the bit X provided by the comparator 27.
- The state bit Y (lower timing diagram) is initialized to 0, because IRS characteristics are the most frequently encountered.
- As soon as the variable V reaches a predetermined threshold (8 in the example considered), it is reset to 0 and the value of the bit Y is changed, so that the signal is determined to have changed state.
- The signal is in state YA up to frame M, in state YB between frames M and N (change of signal source), then again in state YA from frame N.
- other modes of incrementation and decrementation and other threshold values would be usable.
- the above counting mode can for example be obtained by circuit 29 shown in Figure 3.
- This circuit includes a four-bit counter 32, whose most significant bit corresponds to the state bit Y and whose three least significant bits represent the counting variable V.
- Bits X and Y are supplied to the inputs of an EXCLUSIVE OR gate 33, whose output is fed to the increment input of counter 32 via an AND gate 34 whose other input receives the bit Z supplied by threshold comparator 25.
- The inverted output of gate 33 is supplied to a decrement input of counter 32 via another AND gate 35, whose two other inputs receive, respectively, the bit Z supplied by comparator 25 and the output of a three-input OR gate 36 receiving the three least significant bits of counter 32.
- Counter 32 is arranged to double the pulses received on its decrement input when its least significant bit is 0 or when at least one of the next two bits is 1, as represented by OR gate 37 in Figure 3.
- The state determination circuit 29 is not activated because AND gates 34, 35 prevent the value of counter 32 from being changed.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Description
The present invention relates to a method and a device for preprocessing the acoustic signal supplied to a speech coder. It applies in particular, but not exclusively, to improving the performance of low bit rate speech coders.
Current low bit rate speech coders (typically 5 kbit/s for a sampling frequency of 8 kHz) perform best on signals having a "telephone" spectrum, that is to say, in the 300-3400 Hz band and with pre-emphasis at high frequencies. These spectral characteristics correspond to the IRS (Intermediate Reference System) template defined by the CCITT in Recommendation P48. This template was defined for telephone handsets, both at the input (microphone) and at the output (earpiece).
However, it is increasingly common for the input signal of a speech coder to have a "flatter" spectrum, for example when a hands-free installation is used with a microphone having a linear frequency response. Conventional vocoders are designed to be independent of the input with which they operate, and they are in any case not informed of the characteristics of that input. If microphones with different characteristics may be connected to the vocoder, or more generally if the vocoder may receive acoustic signals with different spectral characteristics, there are cases in which the vocoder is used sub-optimally.
In this context, a main object of the present invention is to improve the performance of a vocoder by making it less dependent on the spectral characteristics of the signal supplied to it.
The method according to the invention consists in subjecting the acoustic input signal to high-pass filtering, in comparing the energy of the high-pass filtered signal with that of the unfiltered signal to determine a signal state from among a first state, in which the energy of the high-pass filtered signal is greater than a predetermined fraction of the energy of the unfiltered signal, and a second state, in which the energy of the high-pass filtered signal is less than that predetermined fraction, and in supplying to the coder input the high-pass filtered signal subjected to high-frequency pre-emphasis when the signal is in its second state.
The high-pass filter used is typically a filter with a sharp cutoff at 400 Hz, and the predetermined energy fraction is typically 85 to 95%. The first signal state corresponds to IRS characteristics, and the second state corresponds to a flatter spectrum of the acoustic input signal, containing proportionally more energy at low frequencies. With the method according to the invention, such a flat-spectrum signal is preprocessed (high-pass filtering and pre-emphasis) to bring its spectral characteristics closer to those of the IRS template. Using high-pass filtering to determine the signal state has the advantage, compared with low-pass filtering, that the filtered signal itself can be supplied (after pre-emphasis) to the input of the vocoder.
Preferably, the determined state of the signal can be changed only when the acoustic input signal, or the high-pass filtered signal, has an energy greater than a predetermined threshold. Otherwise (for example during silence or low ambient noise), the signal energy is too low for its spectral characteristics to be assessed reliably.
When the acoustic signal is digitized in successive frames, it is detected whether the signal in each frame is in a first condition corresponding to the first state or in a second condition corresponding to the second state, and the state of the signal is determined from these frame-by-frame conditions, the determined state being modified only after several successive frames have shown a signal condition different from that corresponding to the previously determined state. This introduces a kind of hysteresis that accommodates rapid variations in the spectral envelope of the speech signal, due to ambient noise or to the speech itself (the timbre of the voice is not constant). The risk of wrongly determining the signal state is thus reduced, which improves the quality of the coded signal and avoids introducing timbre discontinuities that could result from spurious changes of the determined state.
The preprocessing device according to the invention comprises a high-pass filter receiving the acoustic input signal, means for calculating the energies contained respectively in said acoustic signal and in the output signal of the high-pass filter, means for comparing the calculated energies, and a high-frequency pre-emphasis filter whose input receives the output signal of the high-pass filter and whose output delivers the signal supplied to the coder input when the comparison means reveal that the output signal of the high-pass filter contains less than a predetermined fraction of the energy of said acoustic signal.
Other features and advantages of the present invention will become apparent from the following description of a preferred but non-limiting embodiment, with reference to the accompanying drawings, in which:
- Figure 1 is a diagram illustrating the characteristics of an IRS-type acoustic signal and of a linear-type signal;
- Figure 2 is a block diagram of a preprocessing device according to the invention;
- Figure 3 is a more detailed diagram of the comparison means of the device of Figure 2; and
- Figure 4 shows timing diagrams illustrating how the state of the signal is determined by the means of Figure 3.
In Figure 1, the two solid lines correspond to the bounds of the IRS template defined for microphones in CCITT Recommendation P48. It can be seen that an IRS-type microphone signal has strong attenuation in the lower part of the spectrum (between 0 and 300 Hz) and a relative emphasis at high frequencies. By comparison, a linear-type signal, supplied for example by the microphone of a hands-free installation, has a flatter spectrum and in particular lacks the strong attenuation at low frequencies (a typical example of such a linear-type signal is shown by the dashed line in the diagram of Figure 1).
These spectral properties are exploited in the preprocessing device 10 according to the invention, shown schematically in Figure 2. This device processes the input signal supplied by an acoustic signal source in order to deliver it to a speech coder 12. The coder 12 is a low bit rate coder optimized for an IRS-type input signal. It may be, among others, a linear prediction coder with excitation by regular pulse vectors (RP-CELP), as described in document EP-A-0 347 307. The coder 12 has no a priori knowledge of the source of the acoustic signal supplied to it.
In the diagram of Figure 2, the acoustic input signal SI is the output signal of a microphone 13 after amplification and digitization by an analog-to-digital converter 14. The signal is typically digitized at a sampling rate of 8 kHz and arranged in successive 30 ms frames, each containing 240 16-bit samples.
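The framing arithmetic above (8 kHz and 30 ms giving 240 samples) can be sketched as follows. This is a minimal illustration only: the names `split_frames` and `FRAME_LEN` are hypothetical, and dropping a trailing partial frame is an assumption, since the text does not say how an incomplete frame is handled.

```python
SAMPLE_RATE_HZ = 8000   # sampling rate given in the description
FRAME_MS = 30           # frame duration given in the description
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 8000 * 30 / 1000 = 240 samples


def split_frames(samples, frame_len=FRAME_LEN):
    """Yield consecutive, non-overlapping frames of frame_len samples.

    Assumption: a trailing partial frame is simply dropped.
    """
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        yield samples[start:start + frame_len]
```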
The preprocessing device 10 comprises a high-pass filter 16 receiving the acoustic input signal SI and delivering a filtered signal SI'. The filter 16 is typically a digital bi-quad filter with a sharp cutoff at 400 Hz. The energies E1 and E2 contained in each frame of the acoustic input signal SI and of the filtered signal SI' are calculated by two units 17, 18, each computing the sum of the squares of the samples of each frame it receives. The calculated energies E1 and E2 are supplied to a comparison unit 20, which determines the state of the signal in the form of a bit Y equal to 0 when the signal is determined to be of IRS type (state YA), and to 1 when the signal is determined to be rather of linear type (state YB).
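As a rough sketch of filter 16 and energy units 17, 18, the code below builds one second-order (bi-quad) high-pass section and computes the per-frame sum of squares. The coefficient formulas are the common bilinear-transform "cookbook" design with a Butterworth-like Q; that design, the Q value, and the function names are assumptions, since the patent specifies only a bi-quad with a sharp 400 Hz cutoff.

```python
import math


def highpass_biquad_coeffs(cutoff_hz=400.0, sample_rate_hz=8000.0, q=math.sqrt(0.5)):
    """Return normalized (b, a) coefficients of a second-order high-pass section.

    Assumption: bilinear-transform cookbook design; the patent does not give coefficients.
    """
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 + cos_w0) / (2.0 * a0), -(1.0 + cos_w0) / a0, (1.0 + cos_w0) / (2.0 * a0)]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def biquad_filter(samples, b, a):
    """Direct-form I bi-quad; state starts at zero (a streaming device would carry it across frames)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def frame_energy(frame):
    """Sum of the squares of the samples of one frame (units 17 and 18)."""
    return sum(s * s for s in frame)
```

With these helpers, E1 would be `frame_energy(frame)` and E2 would be `frame_energy(biquad_filter(frame, b, a))`; a frame dominated by content below 400 Hz gives a small E2/E1, while content well above the cutoff gives a ratio close to 1.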
The output of the preprocessing device 10, connected to the input of the coder 12, is one terminal of a switch 21 whose other terminal is connected either to the input of the high-pass filter 16 or to the output of a pre-emphasis filter 22, according to the value of the bit Y delivered by the comparison unit 20. When Y = 0 (state YA), the switch 21 is in the position shown in Figure 2, and the acoustic input signal SI is supplied to the input of the coder 12. In the other position (Y = 1, state YB), it is the output of the pre-emphasis filter 22 that is supplied to the input of the coder 12. The pre-emphasis filter 22 receives the high-pass filtered signal SI' and applies to it a transfer function of the form H(z) = 1 - β/z, in which β denotes a pre-emphasis coefficient, typically of the order of 0.4. Thus, when the acoustic signal is of linear type, it is transformed by high-pass filtering (filter 16) and pre-emphasis (filter 22) so that it reaches the input of the coder 12 with spectral characteristics closer to those of the IRS template.
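In the time domain, the transfer function H(z) = 1 - β/z corresponds to y[n] = x[n] - β·x[n-1], which has gain 1 - β at DC and 1 + β at the Nyquist frequency. The sketch below applies it and mimics switch 21; the function names and the `previous` parameter (the last sample of the preceding frame) are illustrative assumptions.

```python
def pre_emphasis(samples, beta=0.4, previous=0.0):
    """Apply y[n] = x[n] - beta * x[n-1]; `previous` is the sample before the frame."""
    out = []
    for x in samples:
        out.append(x - beta * previous)
        previous = x
    return out


def coder_input(raw_frame, highpassed_frame, state_y, beta=0.4):
    """Mimic switch 21: raw signal when Y = 0, pre-emphasized high-passed signal when Y = 1."""
    if state_y == 0:
        return list(raw_frame)
    return pre_emphasis(highpassed_frame, beta)
```

For a constant input the output settles at 1 - β = 0.6 of the input level, and for an alternating input at 1 + β = 1.4, which is the high-frequency tilt the text describes.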
Since the high-pass filter 16 has little effect on the input signal when the latter has IRS characteristics, it is also possible to supply the high-pass filtered signal SI' to the coder 12 when the signal has been determined to be in state YA, corresponding to IRS characteristics. A variant of the diagram of Figure 2 then consists in dispensing with the switch 21 by connecting the output of the pre-emphasis filter 22 directly to the input of the coder 12, and in controlling the value of the coefficient β in the filter 22 according to the value of the state bit Y (for example β = 0 when Y = 0 and β = 0.4 when Y = 1).
The comparison unit 20 is, for example, in accordance with the diagram shown in Figure 3. The energy E1 of each frame of the input signal SI is supplied to the input of a threshold comparator 25, which delivers a bit Z of value 0 when the energy E1 is less than a predetermined energy threshold, and of value 1 when the energy E1 is greater than the threshold. The energy threshold is typically of the order of -38 dB relative to the saturation energy of the signal. The comparator 25 serves to inhibit the determination of the state of the signal when the latter contains too little energy to be representative of the characteristics of the source. In that case, the determined state of the signal remains unchanged.
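To give a feel for the -38 dB figure, the sketch below converts it into a frame-energy threshold. It assumes that "saturation energy" means the energy of a 240-sample frame in which every 16-bit sample is at full scale (32768); the patent does not define the term more precisely, so this reading, like the function names, is an assumption. Equality with the threshold is mapped to Z = 0, a case the text leaves open.

```python
def energy_threshold(frame_len=240, full_scale=32768, level_db=-38.0):
    """Frame-energy threshold at level_db relative to an assumed saturation energy."""
    saturation_energy = frame_len * full_scale ** 2   # every sample at full scale (assumption)
    return saturation_energy * 10.0 ** (level_db / 10.0)  # energy ratio, hence /10


def activity_bit(e1, threshold):
    """Bit Z of comparator 25: 1 when E1 exceeds the threshold, else 0."""
    return 1 if e1 > threshold else 0
```

Under these assumptions the threshold is about 4.1e7, which corresponds to an RMS sample amplitude of roughly 410 on a 16-bit scale.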
The energies E1 and E2 are supplied to a digital divider 26, which calculates the ratio E2/E1 for each frame. This ratio E2/E1 is supplied to another threshold comparator 27, which delivers a bit X of value 0 when the ratio E2/E1 is greater than a predetermined threshold, and of value 1 when the ratio E2/E1 is less than the threshold. This threshold on the ratio E2/E1 is typically of the order of 0.93. The bit X is representative of a condition of the signal over each frame. The condition X = 0 corresponds to IRS characteristics of the input signal (state YA), and the condition X = 1 corresponds to linear characteristics (state YB). To avoid repeated and spurious changes of state caused by short-term variations of the vocal excitation, the state bit Y is not taken directly equal to the condition bit X, but results from processing of the successive condition bits X by a state determination circuit 29.
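The per-frame condition bit X can be sketched as below. To avoid a division by zero on a silent frame, the comparison is written as E2 < threshold × E1 rather than E2/E1 < threshold; this rearrangement and the function name are illustrative choices, not part of the patent, which describes a divider followed by a comparator.

```python
def condition_bit(e1, e2, ratio_threshold=0.93):
    """Bit X of comparator 27: 1 (linear-type frame) when E2/E1 is below the threshold, else 0.

    Written without division so that e1 == 0 is safe (it yields X = 0).
    """
    return 1 if e2 < ratio_threshold * e1 else 0
```

For example, a frame with E1 = 100 and E2 = 95 gives X = 0 (IRS-like), while E2 = 80 gives X = 1 (linear-like).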
The operation of the state determination circuit 29 is
illustrated in Figure 4, where the upper timing diagram
illustrates an example of the evolution of the bit X
supplied by the comparator 27. The state bit Y (lower timing
diagram) is initialized to 0, since the IRS characteristics
are the ones most frequently encountered. A counting
variable V, initially set to 0, is calculated frame after
frame. The variable V is incremented by one unit each time
the condition X of the signal in a frame differs from that
corresponding to the determined state Y (X = 1 and Y = 0, or
X = 0 and Y = 1). Otherwise (X = Y = 0 or 1), the variable V
is decremented by two units if it is different from 0 and
from 1, decremented by one unit if it is equal to 1, and
kept unchanged if it is equal to 0. As soon as the variable
V reaches a predetermined threshold (8 in the example
considered), it is reset to 0 and the value of the bit Y is
changed, so that the signal is determined to have changed
state. Thus, in the example shown in Figure 4, the signal is
in state YA up to frame M, in state YB between frames M and
N (change of signal source), then in state YA again from
frame N onward. Of course, other incrementation and
decrementation modes and other threshold values could be
used.
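The counting rule described above can be sketched in software as follows. The class name `StateTracker` and its attribute names are hypothetical; the optional argument `z` models the energy check that freezes the state on low-energy frames, and defaults to 1 (frame accepted).

```python
class StateTracker:
    """Hysteresis on the per-frame condition bit X, yielding the state bit Y."""

    def __init__(self, threshold=8):
        self.threshold = threshold  # 8 in the example considered
        self.y = 0                  # state bit, initialised to 0 (state YA)
        self.v = 0                  # counting variable V

    def update(self, x, z=1):
        """Process one frame and return the current state bit Y."""
        if not z:
            return self.y           # too little energy: nothing changes
        if x != self.y:
            self.v += 1             # frame disagrees with current state
            if self.v >= self.threshold:
                self.v = 0
                self.y ^= 1         # the signal has changed state
        elif self.v > 1:
            self.v -= 2             # frame agrees: decay by two
        elif self.v == 1:
            self.v -= 1             # frame agrees: decay by one
        return self.y
```

Because agreement removes two units while disagreement adds only one, isolated or alternating contrary frames never accumulate enough to flip the state; with this rule, eight consecutive contrary frames are the fastest way to reach the threshold.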
The above counting mode can for example be obtained by the
circuit 29 shown in Figure 3. This circuit comprises a
four-bit counter 32, whose most significant bit corresponds
to the state bit Y, and whose three least significant bits
represent the counting variable V. The bits X and Y are
supplied to the input of an EXCLUSIVE-OR gate 33 whose
output is sent to the incrementation input of the counter 32
via an AND gate 34 whose other input receives the bit Z
supplied by the threshold comparator 25. Thus, the variable
V is incremented when X ≠ Y and Z = 1. The inverted output
of the gate 33 is supplied to a decrementation input of the
counter 32 via another AND gate 35 whose two other inputs
receive respectively the bit Z supplied by the comparator
25, and the output of a three-input OR gate 36 receiving the
three least significant bits of the counter 32. The counter
32 is arranged to double the pulses received on its
decrementation input when its least significant bit is 0 or
when at least one of the next two bits is 1, as shown
schematically by the OR gate 37 in Figure 3. Thus, the
counter 32 is decremented (by one unit if V = 1 and by two
units if V > 1) when X = Y and Z = 1 and V ≠ 0. When the
energy of the input signal is insufficient, Z = 0 and the
determination circuit 29 is not activated because the AND
gates 34, 35 prevent the value of the counter 32 from being
modified.
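A bit-level model of this counter arrangement might look like the sketch below. The function names are hypothetical and the gate numbering in the comments simply follows the description. The sketch relies on one inference that the text does not spell out: since Y is the most significant bit of the four-bit counter, incrementing from V = 7 carries into that bit, which both resets V to 0 and toggles Y, so the threshold of 8 comes for free from the counter width (wrapping modulo 16 is assumed for the transition back from Y = 1 to Y = 0).

```python
def step(counter, x, z):
    """Advance the 4-bit counter 32 by one frame and return its new value."""
    y = (counter >> 3) & 1                  # most significant bit: state Y
    b0 = counter & 1                        # three low bits: variable V
    b1 = (counter >> 1) & 1
    b2 = (counter >> 2) & 1

    xor33 = x ^ y                           # EXCLUSIVE-OR gate 33
    inc = xor33 & z                         # AND gate 34
    or36 = b0 | b1 | b2                     # OR gate 36: V != 0
    dec = (xor33 ^ 1) & z & or36            # AND gate 35
    or37 = (b0 ^ 1) | b1 | b2               # OR gate 37: double the pulse

    if inc:
        counter = (counter + 1) & 0xF       # carry out of V toggles Y
    elif dec:
        counter -= 2 if or37 else 1         # V > 1: minus 2, V == 1: minus 1
    return counter


def state_bit(counter):
    """State bit Y (most significant bit of the counter)."""
    return (counter >> 3) & 1


def count_value(counter):
    """Counting variable V (three least significant bits)."""
    return counter & 0x7
```

The increment and decrement paths are mutually exclusive because they are driven by the output of gate 33 and its inverse, and a decrement never borrows from the Y bit since it is only applied when V is at least as large as the amount removed.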
Claims (8)
- Method of preprocessing an acoustic signal upstream of a speech coder (12), characterized in that the acoustic signal (SI) is subjected to high-pass filtering, the energy (E2) of the high-pass filtered signal (SI') is compared with that (E1) of the unfiltered signal in order to determine a state (Y) of the signal from among a first state (YA) for which the energy of the high-pass filtered signal is above a predetermined fraction of the energy of the unfiltered signal and a second state (YB) for which the energy of the high-pass filtered signal is below the predetermined fraction of the energy of the unfiltered signal, and the high-pass filtered signal subjected to pre-emphasis of the high frequencies is addressed to the input of the coder (12) when the signal is in its second state.
- Method according to Claim 1, characterized in that the determined state of the signal is not modified when said acoustic signal or the high-pass filtered signal has energy below a predetermined threshold.
- Method according to Claim 1 or 2, characterized in that the acoustic signal (SI) being digitized as successive frames, there is frame-by-frame detection of whether the signal is in a first condition, corresponding to the first state (YA), for which the calculated energy (E2) of the frame of the high-pass filtered signal (SI') is above the predetermined fraction of the calculated energy (E1) of the frame of the unfiltered signal (SI) or in a second condition, corresponding to the second state (YB), for which the calculated energy of the frame of the high-pass filtered signal is below the predetermined fraction of the calculated energy of the frame of the unfiltered signal, and the state (Y) of the signal is determined on the basis of the frame-by-frame conditions (X), by modifying the determined state only after several successive frames show a signal condition different from that corresponding to the previously determined state.
- Method according to Claim 3, characterized in that a counting variable (V) is incremented when the condition (X) of the signal in a frame differs from that corresponding to the determined state (Y) of the signal, in that said counting variable (V) is decremented when the condition of the signal in a frame is that corresponding to the determined state of the signal unless this variable equals zero and in that, when the counting variable (V) reaches a predetermined threshold, it is reset to zero and the signal is determined to have changed state.
- Device (10) for preprocessing an acoustic signal upstream of a speech coder (12), characterized in that it comprises a high-pass filter (16) receiving said acoustic signal (SI), means (17, 18) for calculating the energies (E1, E2) contained respectively in said acoustic signal (SI) and in the output signal (SI') of the high-pass filter, means (20) for comparing the calculated energies, and a filter (22) for pre-emphasis of the high frequencies, the input of which receives the output signal from the high-pass filter, and the output of which delivers the signal addressed to the input of the coder (12) when the means of comparison (20) reveal that the output signal from the high-pass filter contains less than a predetermined fraction of the energy of said acoustic signal.
- Device according to Claim 5, characterized in that, with the acoustic signal being digitized as successive frames, the energies (E1, E2) are calculated for each frame by the means of calculation (17, 18), and the means of comparison (20) comprise a comparator (27) which detects frame by frame whether the signal is in a first or a second condition according to whether the ratio (E2/E1) between the calculated energy of the output signal from the high-pass filter (16) and the calculated energy of said acoustic signal (SI) is above or, respectively, below a predetermined value, and means (29) for determining a state (Y) of the signal from among first and second states (YA, YB) corresponding respectively to the first and second conditions of the signal per frame, these means (29) modifying the determined state of the signal only after the comparator (27) indicates for several successive frames a signal condition different from that corresponding to the previously determined state, and the pre-emphasis filter (22) being used to filter the signal addressed to the input of the coder (12) only when the means (29) have determined that the signal is in its second state.
- Device according to Claim 6, characterized in that the means (29) for determining the state of the signal comprise a counter (32) calculating after each frame a counting variable (V), incrementing it when the comparator (27) indicates a signal condition different from that corresponding to the determined state of the signal, decrementing it, unless it equals zero, when the comparator (27) indicates a signal condition identical to that corresponding to the determined state of the signal, and resetting it to zero when it reaches a predetermined threshold, the determined state (Y) of the signal being modified on each reset to zero of the counting variable (V).
- Device according to Claim 6 or 7, characterized in that it comprises another comparator (25) which compares the calculated energy of said acoustic signal or of the high-pass filtered signal with a predetermined threshold, so as to activate the means (29) of determining state of the signal only when said threshold is exceeded.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR9406824A FR2720849B1 (en) | 1994-06-03 | 1994-06-03 | Method and device for preprocessing an acoustic signal upstream of a speech coder. |
FR9406824 | 1994-06-03 |
Publications (2)
Publication Number | Publication Date |
---|---|
EP0685836A1 (en) | 1995-12-06 |
EP0685836B1 (en) | 1999-07-21 |
Family
ID=9463860
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP95401261A Expired - Lifetime EP0685836B1 (en) | 1994-06-03 | 1995-05-31 | Method and apparatus for preprocessing an acoustic signal before speech coding |
Country Status (4)
Country | Link |
---|---|
US (1) | US5644679A (en) |
EP (1) | EP0685836B1 (en) |
DE (1) | DE69510865T2 (en) |
FR (1) | FR2720849B1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2729247A1 (en) * | 1995-01-06 | 1996-07-12 | Matra Communication | SYNTHETIC ANALYSIS-SPEECH CODING METHOD |
US6799159B2 (en) * | 1998-02-02 | 2004-09-28 | Motorola, Inc. | Method and apparatus employing a vocoder for speech processing |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0243562B1 (en) * | 1986-04-30 | 1992-01-29 | International Business Machines Corporation | Improved voice coding process and device for implementing said process |
FR2632758B1 (en) * | 1988-06-13 | 1991-06-07 | Matra Communication | LINEAR PREDICTION SPEECH CODING AND ENCODING METHOD |
JP2626223B2 (en) * | 1990-09-26 | 1997-07-02 | 日本電気株式会社 | Audio coding device |
-
1994
- 1994-06-03 FR FR9406824A patent/FR2720849B1/en not_active Expired - Fee Related
-
1995
- 1995-05-31 DE DE69510865T patent/DE69510865T2/en not_active Expired - Fee Related
- 1995-05-31 EP EP95401261A patent/EP0685836B1/en not_active Expired - Lifetime
- 1995-06-05 US US08/462,209 patent/US5644679A/en not_active Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
EP0685836A1 (en) | 1995-12-06 |
FR2720849A1 (en) | 1995-12-08 |
US5644679A (en) | 1997-07-01 |
DE69510865T2 (en) | 2000-07-13 |
FR2720849B1 (en) | 1996-08-14 |
DE69510865D1 (en) | 1999-08-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): DE ES GB IT NL SE |
|
17P | Request for examination filed |
Effective date: 19951118 |
|
RAP1 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: MATRA NORTEL COMMUNICATIONS |
|
GRAG | Despatch of communication of intention to grant |
Free format text: ORIGINAL CODE: EPIDOS AGRA |
|
17Q | First examination report despatched |
Effective date: 19981112 |
|
GRAG | Despatch of communication of intention to grant |
Free format text: ORIGINAL CODE: EPIDOS AGRA |
|
GRAH | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOS IGRA |
|
GRAH | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOS IGRA |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): DE ES GB IT NL SE |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SE Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY Effective date: 19990721 Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 19990721 Ref country code: ES Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY Effective date: 19990721 |
|
GBT | Gb: translation of ep patent filed (gb section 77(6)(a)/1977) |
Effective date: 19990804 |
|
REF | Corresponds to: |
Ref document number: 69510865 Country of ref document: DE Date of ref document: 19990826 |
|
ITF | It: translation for a ep patent filed | ||
NLV1 | Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act | ||
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
26N | No opposition filed | ||
REG | Reference to a national code |
Ref country code: GB Ref legal event code: IF02 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DE Payment date: 20040528 Year of fee payment: 10 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20050414 Year of fee payment: 11 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IT Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED. Effective date: 20050531 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20051201 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20060531 |
|
GBPC | Gb: european patent ceased through non-payment of renewal fee |
Effective date: 20060531 |