WO1998005031A2 - A method and a device for the reduction impulse noise from a speech signal - Google Patents

Publication number
WO1998005031A2
WO1998005031A2 PCT/FI1997/000458 FI9700458W WO9805031A2 WO 1998005031 A2 WO1998005031 A2 WO 1998005031A2 FI 9700458 W FI9700458 W FI 9700458W WO 9805031 A2 WO9805031 A2 WO 9805031A2
Authority
WO
WIPO (PCT)
Prior art keywords
residual
signal
numerical values
calculated
noise
Prior art date
Application number
PCT/FI1997/000458
Other languages
French (fr)
Other versions
WO1998005031A3 (en)
Inventor
Paavo Alku
Original Assignee
Paavo Alku
Priority date
Filing date
Publication date
Application filed by Paavo Alku
Priority to AU35445/97A
Publication of WO1998005031A2
Publication of WO1998005031A3

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/12 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being prediction coefficients

Definitions

  • a method and a device for the reduction impulse noise from a speech signal
  • the invention relates in general to reduction of noise in an electrically reproduced speech signal.
  • the invention relates to a method and an apparatus whereby impulse noise, or brief noise peaks, can be eliminated from a speech signal. More specifically the invention relates to a method and an apparatus which improve the quality of sound reproduced by digital mobile stations.
  • the microphone of the transmitting party converts the acoustic speech signal into electric form whereafter it is conveyed using various coding and modulation methods to the receiving party whose equipment, including a loudspeaker, reconverts the signal into acoustic form.
  • the transmission path may include radio transmission and reception as well as transmission across fixed wire and fibre-optic connections.
  • Recording, transmission and playback conditions distort the transmitted speech signal and induce in it many kinds of spurious effects which the receiving party, or the listener, experiences as disturbing.
  • the transmitting party with the microphone may be in a noisy place so that the microphone also records the ambient noise in the background. Electric switches, motors, lightning and other electric systems operating in the same area also induce noise in an electric communication system.
  • the noise may be relatively continuous in nature, in which case it is called white noise, or it may consist of single momentary disturbances, in which case it is called impulse noise.
  • a differential microphone structure aims at the same effect using only one microphone element which the speech signal affects only in one direction but noise affects symmetrically in both directions.
  • So-called active noise reduction employs special sound generating elements that produce acoustic oscillation having the same amplitude as the background noise measured, but the opposite phase.
  • Prior-art methods have some known disadvantages. Filters having effect on certain frequency areas require prior knowledge of useful and unwanted frequencies or alternatively complex circuit arrangements that analyse the speech signal's frequency structure and control the filters' frequency response. Methods that aim at eliminating background noise in the transmitting end are ineffective against disturbances generated at a later stage of the transmission path and usually are unable to eliminate impulse noise. Active noise reduction in the receiving end requires expensive and complex electronics to produce a real-time response to measured background noise impulses, and noise reduction is then not directed to disturbances in the transmitted signal but to local background noise.
  • An object of this invention is to provide a method and an apparatus with which the understandability of an electrically reproduced speech signal can be enhanced in an advantageous manner compared to the prior art. Another object of the invention is that the method according to it only calls for slight changes in prior-art apparatus reproducing sound electrically so that the manufacturing costs of the apparatus according to the invention are not substantially bigger than in the prior art. A further object of the invention is that the method according to it is applicable in small, portable communication devices such as mobile phones.
  • the objects of the invention are achieved by eliminating from an electric waveform corresponding to an acoustic signal features that cannot have originated in the human sound production system.
  • samples radically different from their surroundings are processed so as to better match samples produced by the human voice production mechanism by replacing them by estimates from linear predictive coding (LPC) analysis.
  • LPC: linear predictive coding
  • the method according to the invention for reducing noise in a digitally transmitted speech signal consisting of numerical values representing successive samples of a certain waveform, wherein those of said numerical values are recognised that represent waveform details (2) which produce noise in significant portions, is characterised in that, to recognise a certain first numerical value, the following are calculated:
  • a first residual which equals the difference of a numerical value representing a sample of said waveform and an estimate calculated for that numerical value on the basis of other numerical values, calculated at sampling moments in a certain set of numerical values including said first numerical value
  • a second residual which equals the difference of a numerical value representing a sample of said waveform and an estimate calculated for that numerical value on the basis of other numerical values, calculated at sampling moments in said set of numerical values when said first numerical value has been replaced by the estimate calculated for it, and said first numerical value is recognised on the basis of said first and second residuals.
  • the invention is also directed to an apparatus for realising the method described above.
  • the apparatus according to the invention for reducing noise in a digitally transmitted signal which consists of numerical values representing samples of a certain waveform, is characterised in that it comprises
  • the invention is based on the knowledge that, as a rule, a speech signal produced by the human vocal organs conforms to certain laws. If an acoustic signal is given an electrical representation, these laws are manifest in a certain regularity in the waveform of the electric signal.
  • an electric signal is represented by a train of samples, where the samples can be processed individually or in groups of desired sizes. If the value of a given sample substantially differs from an otherwise regular waveform, it can be assumed that it was not generated on the basis of the original acoustic signal but as a consequence of a sudden disturbance.
  • the deviating sample value is replaced by a value that is closer to the estimate obtained on the basis of the regular waveform so that a near-original, noiseless signal can be reconstructed.
  • a preferred embodiment of the invention is based on the known LPC method, in which at each sample of a digital signal a difference is calculated representing the difference of the current sample and the estimate calculated for it on the basis of the previous samples.
  • a sudden disturbance in the original signal results in the degradation of the estimate, which shows as increasing amplitudes of the difference values at the location of the disturbance.
  • the effect of the disturbance is removed when the deviating sample is replaced by another value which better conforms to the average behaviour of the speech signal.
  • a preferred replacement value is the value calculated for the sample in question as an estimate determined on the basis of the previous speech samples.
  • Noise reduction according to the invention is preferably realised at the receiving end so that it works for all disturbances originated at the transmitting end or on the transmission path and transmitted via a communications link to the receiving apparatus.
  • This brings considerable advantage as a communications device applying the method according to the invention produces a higher-quality acoustic signal from a received radio signal than an apparatus without the noise reduction according to the invention.
  • the invention is not limited to be used at the receiving end only, but it can also be used to enhance the signal quality prior to transmission.
  • the invention is applicable in local digital sound reproduction systems that do not require signal transmission in a communications network.
  • the noise reduction according to the invention can be applied to cellular networks and mobile stations.
  • the inventional method can be realised in a receiving mobile station without any hardware changes as the implementation of the method can utilise information already computed in the mobile station.
  • the inventional method is flexible in a mobile station environment in that it can be realised so as to process different quantities of candidate impulses so that computing capacity can be adapted to the environment.
  • the simplicity of the method is an advantage as far as the production of mobile stations is concerned because it makes possible to perform the new calculations using the existing DSP circuit in the phone without any changes in the electronic circuitry of the apparatus.
  • the proposed method does not call for changes in the LPC parameters, such as the order and frame size, of the most typical speech encoders.
  • the invention can be realised in a mobile station mainly by modifying the software that controls the operation of the mobile station.
  • the method according to the invention can be realised not only in a receiving mobile station but also in a transmitting mobile station or in cellular base stations where transmission and reception coding is identical to that of mobile stations. Realisation of the method in base stations has the special advantage that no changes are needed in the existing mobile stations.
  • Fig. 1a shows a certain signal waveform and sampling applied to it
  • Fig. 1b shows an LPC residual formed from the waveform of Fig. 1a according to the prior art
  • Fig. 1c shows an LPC residual shaped according to a preferred embodiment of the invention
  • Fig. 2 shows in the form of block diagram a communications device which includes noise removal according to the invention
  • Fig. 3 shows in the form of block diagram a noise removal arrangement according to the invention applied to a transmitter according to the GSM system and Fig. 4 shows in the form of block diagram noise removal according to the invention applied to a receiver according to the GSM system.
  • the sound production process of a healthy person starts with air being pushed from the lungs through the vocal cords in the larynx so that the vocal cords start oscillating and generate a so-called glottis impulse.
  • a speaker can produce various resonating frequencies (so-called formants).
  • As the glottal excitation that has left the vocal cords passes through the vocal tract, the excitation is strongly filtered by the vocal tract.
  • the sound produced, e.g. the vowel /a/, gets its characteristic shape mainly from the vocal tract's resonating frequencies used in the sound production.
  • In voiceless sounds the vocal cords do not oscillate in the speech production, but the speaker uses the mouth to produce interruptions and/or broadband noise oscillation in the outgoing air flow. So, the human sound production mechanism, particularly in the case of voiced sounds, produces speech in physiological processes based on the periodic motion of two baggy organs (vocal cords) containing plenty of water. The waveform thus produced, when received by a microphone and converted to electric form, is quite regular when observed across a short span of time.
  • Fig. 1a schematically shows part of a waveform of an electric signal.
  • it represents a human speech sound, recorded by a microphone and converted in analog form into an electric signal, the amplitude of which varies as a function of time as described by the waveform 1.
  • the human sound production system can only produce certain relatively regular waveforms, so we can say about the waveform 1 that the sudden peak 2 in it is probably caused by a disturbance and not included in the original speech sound.
  • the waveform 1 must be shaped such that the effect of the peak 2 is eliminated so that when an electric signal according to the shaped waveform is fed to a loudspeaker, the loudspeaker reproduces the sound without the disturbance.
  • Signal distortion caused by sudden disturbances such as the peak 2 is called impulse noise.
  • the waveform 1 is not processed as such, but it is sampled, i.e. samples are taken from the analog signal at regular intervals. Sampling can be realised using various methods.
  • a preferred embodiment of the invention for removing impulse noise involves the LPC method, which is known per se, but the essential features of the method will nevertheless be described below to provide background information for the preferred embodiment of the invention.
  • In the LPC method an estimate is produced for each sample of a digital sample train, said estimate being a weighted linear combination of the p previous samples, where p is a positive integer. The difference between the current sample and the estimate calculated for it in this manner is then computed; this difference is called the residual signal, or residual. The better the estimate, the smaller the energy of the residual signal.
  • the LPC analysis has become the most significant technique facilitating speech coding.
  • a speech signal of good quality can be transmitted using a considerably lower bit rate than that used e.g. by the pulse code modulation (PCM) technique traditionally used in the fixed telecommunications network.
  • PCM: pulse code modulation
  • Efficient speech coding using the LPC analysis is based on the fact that a residual resembling noise can be quantised using a small amount of information without significantly degrading the quality of the signal synthesised from the residual back into a speech signal.
  • Because the LPC method cannot predict sudden noise peaks, it produces for a sample corresponding to a time index k an estimate which is represented by the darkened circle 4 in the case depicted by Fig. 1a.
  • the open circle 3 represents a sample of the original signal suffering from impulse noise at time index k, so that the difference between the signal and the estimate is quite large.
  • Fig. 1b shows a prior-art LPC residual which shows that the prediction error grows considerably across the time span from index k to index k+p, where p is the order of the prediction, i.e. the number of the previous samples used for calculating a given estimate.
  • an apparatus processing the original noisy speech signal examines whether it would be more reasonable to replace the speech signal sample at a given time index by the estimate produced by the LPC analysis, so that the residual of the new signal thus obtained would be smaller than the residual computed from the original speech signal.
  • Fig. 1c shows a residual calculated by replacing a noisy sample of the original speech signal at time index k by an estimate produced by the LPC analysis.
  • the invention can be applied as a method for enhancing the speech quality of a sound-reproducing apparatus prior to the conversion of a digital signal into an analog one by a D/A converter.
  • the method aims at reconstructing a speech signal spoiled by impulse noise so that it becomes nearer to the waveform that represented the signal before the introduction of the noise.
  • By feeding the reconstructed waveform to a loudspeaker we get an acoustic signal that sounds almost like the original noiseless speech signal.
  • the apparatus calculates for each digital signal sample an estimate on the basis of previous samples.
  • the preferred embodiment of the invention only requires that the apparatus knows a certain threshold value to which it compares the ratio of the energies of two residuals. A first residual has been calculated from the original signal which may include noise.
  • a second residual is the residual obtained from the signal wherein the sample of the original speech signal at the current time index has been replaced by the LPC estimate of that sample.
  • These two residuals differ from each other only in the index span from k to k+p, where k is the current time index and p is the order of the LPC prediction.
  • Computing capacity can be saved in the energy calculation by taking into account the fact that differences occur only in that span.
  • a residual and its energy are always computed within a certain time window that includes the current sample or samples. If the ratio of the energies of said two residuals exceeds said threshold value, the apparatus assumes that the speech signal sample at the current time index is a noise peak so that there is good reason to replace it by the LPC estimate.
  • FIG. 2 shows a simple block diagram of an apparatus applying the preferred embodiment of the invention described above. It generally comprises a reception part 24, the detailed structure of which is not shown but which may include e.g. known means for receiving, demodulating and decoding a radio signal transmitted from a base station in a digital cellular radio system.
  • the reception part 24 feeds to block 20 a bit string that corresponds to a speech signal.
  • the digital-to-analog conversion is performed in block 21.
  • Block 22 calculates the residuals and compares the ratio of the energies of two residuals to a predetermined threshold value. When block 22 detects that the threshold value is exceeded, it instructs block 21 to replace the sample in the current time index by the value produced by the LPC prediction.
  • Average magnitude of residuals varies according to the regularity of the original signal.
  • the resulting waveform is very regular and thus very predictable.
  • the residual energies are very small and disturbances caused by impulse noise are clearly discernible. If the speaker utters an /s/ sound, the acoustic signal resembles random noise and its predictability is poor. Then, disturbances caused by impulse noise do not induce in the residual as great a change as in the case of voiced sounds.
  • the method according to the invention is unable to improve the quality of voiceless sounds as effectively as that of voiced sounds. This, however, is not very significant since in the case of the /s/ sound impulse noise is mixed with the characteristic noise structure of the sound itself and a listener will not find it as disturbing as in the case of a voiced sound.
  • the invention advantageously prepares for this divergence of sounds in such a manner that the threshold value at which a signal sample is interpreted as a noise peak, is adaptive, i.e. its magnitude is determined in a certain manner by the average ratio of the residual energies compared.
  • the comparison block 22 comprises e.g. a FIFO-type (First-In-First-Out) register 22a that always stores the n latest residuals or residual energy ratios, where n is a positive parameter.
  • the comparison block 22 must include calculating means 22b for computing an average or median or another suitable parameter from the contents of the register as well as means for determining the threshold value on the basis of said parameter. All said functional features are preferably realised as a program executed by a digital signal processor.
  • Fig. 3 illustrates the operation of the method according to the invention in a transmitter encoder 10 of the GSM system. More specifically, it is a regular pulse excited-long term predictive (RPE-LTP) speech encoder, which is currently widely considered the technically most capable speech encoder, but the inventional ideas of the example discussed here can also be applied to the CELP (Code Excited Linear Prediction), SBC-APCM, SBC-ADPCM and MPE-LTP speech encoders.
  • RPE-LTP: regular pulse excited-long term predictive
  • a pre-emphasised speech signal is divided into segments comprising 160 samples in a segmentation part 40 to which the signal is brought via connection 34. After that, the segments of the signal are windowed using so-called Hamming windows in block 50.
  • the windowed and segmented signal goes to short term prediction (STP) analysis and filtering 300 where autocorrelation coefficients are first computed in block 60 in which the signal arrives via connection 56.
  • the autocorrelation coefficients are taken via connection 67 to block 70 where reflection coefficients are calculated using the so-called Schur recursion.
  • the reflection coefficients are sent via connection 714 to the impulse removal 140 where they are used for recognising noise peaks.
  • the reflection coefficients are also sent forward in the STP analysis (300) via connection 78 to block 80, where logarithmic area ratios (LAR) are computed from them.
  • the logarithmic area ratios are quantised in blocks 90 and 100 and interpolated linearly in blocks 110 and 120.
  • an STP residual is computed from the processed logarithmic area ratios using a so-called partial correlation (PARCOR) method.
  • PARCOR: partial correlation
  • This STP residual is fed to the impulse removal 140, where according to the invention a corrected residual is formed using both the original speech signal (connection 14) and the reflection coefficients (connection 714), said corrected residual being characterised in that the effects of the most significant noise peaks have been removed from it.
  • the corrected residual is fed via connection 141 to a long term prediction (LTP) analysis, where a peak structure reproduced at the fundamental frequency of the speech is removed from the residual signal using a so-called long term LPC estimate.
  • LTP: long term prediction
  • the residual signal can be substantially improved by adding the impulse removal according to the invention, block 140, connections 14, 714, 1314, 141, to the known GSM speech encoder. It is obvious that a better quality of the residual signal can substantially reduce noise in the speech signal.
  • an RPE decoder 170 detects a speech signal from a coded signal brought to a GSM receiver, said speech signal being then sent via connection 1718 to be LTP synthesised in block 180.
  • the speech signal is then taken via connections 1813 and 1319 to signal postprocessing 500.
  • the LTP and STP syntheses (180, 190) and filterings performed on the speech signal along this path are substantially inverse operations of the LTP and STP analyses and filterings already described in conjunction with Fig. 3.
  • the RPE decoder 170 also receives LPC coefficients which are sent via connection 1710 to the STP synthesis filtering chain 400 where reflection coefficients are filtered from the LPC coefficients using parameters of logarithmic area ratios.
  • the reflection coefficients are filtered in blocks 100, 110, 120 by quantising and linearly interpolating the logarithmic area ratio parameters.
  • the method is substantially the inverse operation of the compilation of the uncorrected residual described in conjunction with Fig. 3.
  • the reflection coefficients are sent via connection 1214 to the impulse removal 140 where the strongest noise peaks are removed from the speech signal according to the invention using the reflection coefficients received via connection 1214.
  • the method discussed above substantially improves the quality of speech reproduced by a GSM receiver.
  • the embodiment shown in Fig. 4 is particularly advantageous in a receiving GSM mobile station since in this case the corrected signal is no longer sent to possibly corruptive transmission paths but will propagate directly to the listener's ear.
  • the invention provides an apparatus and a method for improving the understandability of an electrically reproduced speech signal in a manner which is easy to implement and which effectively attenuates all disturbances caused by the transmitting end, transmission path and preceding parts of the receiving apparatus in addition to disturbances acoustically coupled to the microphone.
  • the method according to the invention does not require complex equipment or special components so that it is ideal for small, portable communications devices, such as mobile stations, but on the other hand, it can be easily applied to base stations of various cellular networks as well.
  • the invention is also applicable to data communications systems of fixed networks. If the invention is applied in a communications device designed for the handling of both speech and other signals, it may be advantageous to include in the device a switch or an automatic function that activates the operation according to the invention only when the communications device is processing speech.
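
The adaptive threshold described above for the comparison block 22 (FIFO register 22a holding the n latest residual energy ratios, calculating means 22b deriving a threshold from their average or median) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the class name `AdaptiveThreshold`, the multiplier `factor` and the lower bound `floor` are hypothetical and do not come from the patent text, which leaves the exact rule open.

```python
from collections import deque
from statistics import median


class AdaptiveThreshold:
    """Hypothetical sketch of register 22a plus calculating means 22b."""

    def __init__(self, n=32, factor=5.0, floor=2.0):
        # FIFO register: once n ratios are stored, the oldest one drops out.
        self.history = deque(maxlen=n)
        self.factor = factor  # assumed multiplier applied to the median ratio
        self.floor = floor    # assumed lower bound used before any history exists

    def current(self):
        """Threshold derived from the median of the stored energy ratios."""
        if not self.history:
            return self.floor
        return max(self.floor, self.factor * median(self.history))

    def is_impulse(self, ratio):
        """Compare a new energy ratio to the threshold, then store the ratio."""
        flagged = ratio > self.current()
        self.history.append(ratio)
        return flagged
```

The median is used here because a single large ratio caused by an impulse barely moves it; an average, which the text also allows, would react more strongly to the very peaks being detected.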

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Samples, or numerical values representing the samples, that represent a signal feature (2) which cannot have been produced by the human sound production system are removed from a digital speech signal. Applied to the LPC method, a numerical value is interpreted as noise when the ratio of the residual energy it produces to the residual energy calculated when the numerical value in question has been replaced by an LPC estimate exceeds a certain threshold condition, and the corresponding sample in the signal waveform is replaced by an estimate (4) according to the LPC method. The method particularly reduces the effect of impulse noise and it can be included in the D/A conversion (21) of a sound-reproducing apparatus. The method is advantageously applied to mobile communication systems, in particular to the GSM system.
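
The decision rule summarised in the abstract can be illustrated with a minimal Python sketch. Several things here are assumptions rather than part of the patent: the predictor coefficients `a` are taken as already known, with the sign convention estimate = a[0]*x[n-1] + ... + a[p-1]*x[n-p]; the threshold is a fixed number; the energies are computed only over the indices k..k+p; samples are processed sequentially so that later estimates use already corrected values; and the function names are hypothetical.

```python
import numpy as np


def window_energy(sig, a, k):
    """Residual energy over indices k..k+p, the residual values that depend on sig[k]."""
    p = len(a)
    energy = 0.0
    for n in range(k, k + p + 1):
        # Prediction a[0]*sig[n-1] + ... + a[p-1]*sig[n-p]
        prediction = np.dot(a, sig[n - p:n][::-1])
        energy += (sig[n] - prediction) ** 2
    return energy


def remove_impulses(x, a, threshold=10.0, eps=1e-12):
    """Replace a sample by its LPC estimate when the energy ratio exceeds the threshold."""
    y = np.asarray(x, dtype=float).copy()
    a = np.asarray(a, dtype=float)
    p = len(a)
    replaced = []
    for k in range(p, len(y) - p):
        estimate = float(np.dot(a, y[k - p:k][::-1]))
        trial = y.copy()
        trial[k] = estimate  # signal with the candidate sample replaced
        # eps (an assumption) avoids division by zero for perfectly predictable signals
        ratio = window_energy(y, a, k) / (window_energy(trial, a, k) + eps)
        if ratio > threshold:
            y[k] = estimate
            replaced.append(k)
    return y, replaced
```

For example, a sinusoid cos(w*n) obeys x[n] = 2*cos(w)*x[n-1] - x[n-2] exactly, so with `a = [2*cos(w), -1]` a single added spike is the only sample that the sketch replaces.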

Description

A method and a device for the reduction impulse noise from a speech signal
The invention relates in general to reduction of noise in an electrically reproduced speech signal. In particular the invention relates to a method and an apparatus whereby impulse noise, or brief noise peaks, can be eliminated from a speech signal. More specifically the invention relates to a method and an apparatus which improve the quality of sound reproduced by digital mobile stations.
Electric reproduction of speech and other sounds plays an important part in modern telecommunications. In particular, the increased popularity of mobile phones has created a situation where users can be in voice connection with each other independent of the location and surrounding conditions. In electric speech transmission in general, the microphone of the transmitting party converts the acoustic speech signal into electric form, whereafter it is conveyed using various coding and modulation methods to the receiving party, whose equipment, including a loudspeaker, reconverts the signal into acoustic form. The transmission path may include radio transmission and reception as well as transmission across fixed wire and fibre-optic connections.
Recording, transmission and playback conditions distort the transmitted speech signal and induce in it many kinds of spurious effects which the receiving party, or the listener, experiences as disturbing. The transmitting party with the microphone may be in a noisy place so that the microphone also records the ambient noise in the background. Electric switches, motors, lightning and other electric systems operating in the same area also induce noise in an electric communication system. The noise may be relatively continuous in nature, in which case it is called white noise, or it may consist of single momentary disturbances, in which case it is called impulse noise.
Several different approaches to eliminating noise from a transmitted speech signal are known from the prior art. Electric signal processing commonly employs various filters that only pass through signals propagating at a desired frequency (bandpass filtering) or stop the propagation of a certain signal element located on a relatively narrow frequency band (band rejection filtering). Filters can be used to eliminate noise transmitted at a different frequency than the desired signal. So-called psychoacoustic models are used to eliminate from a speech signal frequencies that are known to be of little significance to the understanding of speech (then, however, the main idea is not so much noise elimination but rather an attempt to decrease the amount of data transmitted). In addition, it is known to use in the transmitting end various microphone arrangements wherein, for example, one microphone records both the user's voice and background noise and a second microphone is oriented such that it records background noise only. The signal transmitted is the difference of these two so that the background noise element is cancelled out. A differential microphone structure aims at the same effect using only one microphone element which the speech signal affects only in one direction but noise affects symmetrically in both directions. So-called active noise reduction employs special sound generating elements that produce acoustic oscillation having the same amplitude as the background noise measured, but the opposite phase.
Prior-art methods have some known disadvantages. Filters having effect on certain frequency areas require prior knowledge of useful and unwanted frequencies or alternatively complex circuit arrangements that analyse the speech signal's frequency structure and control the filters' frequency response. Methods that aim at eliminating background noise in the transmitting end are ineffective against disturbances generated at a later stage of the transmission path and usually are unable to eliminate impulse noise. Active noise reduction in the receiving end requires expensive and complex electronics to produce a real-time response to measured background noise impulses, and noise reduction is then not directed to disturbances in the transmitted signal but to local background noise.
An object of this invention is to provide a method and an apparatus with which the understandability of an electrically reproduced speech signal can be enhanced in an advantageous manner compared to the prior art. Another object of the invention is that the method according to it only calls for slight changes in prior-art apparatus reproducing sound electrically so that the manufacturing costs of the apparatus according to the invention are not substantially bigger than in the prior art. A further object of the invention is that the method according to it is applicable in small, portable communication devices such as mobile phones.
The objects of the invention are achieved by eliminating from an electric waveform corresponding to an acoustic signal features that cannot have originated in the human sound production system. In a sample string corresponding to an electric waveform, samples radically different from their surroundings are processed so as to better match samples produced by the human voice production mechanism, by replacing them with estimates from linear predictive coding (LPC) analysis. The interpretation of a sample as impulse noise and the resulting substitution with an LPC estimate are based on the utilisation of a residual produced by the LPC. The method according to the invention for reducing noise in a digitally transmitted speech signal consisting of numerical values representing successive samples of a certain waveform, wherein those of said numerical values are recognised that represent waveform details (2) which produce noise in significant portions, is characterised in that to recognise a certain first numerical value there is calculated
- a first residual which equals the difference of a numerical value representing a sample of said waveform and an estimate calculated for that numerical value on the basis of other numerical values, calculated at sampling moments in a certain set of numerical values including said first numerical value, and
- a second residual which equals the difference of a numerical value representing a sample of said waveform and an estimate calculated for that numerical value on the basis of other numerical values, calculated at sampling moments in said set of numerical values when said first numerical value has been replaced by the estimate calculated for it,
and said first numerical value is recognised on the basis of said first and second residuals.
The invention is also directed to an apparatus for realising the method described above. The apparatus according to the invention for reducing noise in a digitally transmitted signal, which consists of numerical values representing samples of a certain waveform, is characterised in that it comprises
- a calculation and comparison element (22) for calculating residuals for sets of numerical values using said numerical values and for recognising numerical values on the basis of said residuals.
The invention is based on the knowledge that, as a rule, a speech signal produced by the human vocal organs conforms to certain laws. If an acoustic signal is given an electrical representation, these laws are manifest in a certain regularity in the waveform of the electric signal. In digital systems, an electric signal is represented by a train of samples, where the samples can be processed individually or in groups of desired sizes. If the value of a given sample substantially differs from an otherwise regular waveform, it can be assumed that it was not generated on the basis of the original acoustic signal but as a consequence of a sudden disturbance. According to the invention, the deviating sample value is replaced by a value that is closer to the estimate obtained on the basis of the regular waveform so that a near-original, noiseless signal can be reconstructed.
A preferred embodiment of the invention is based on the known LPC method, in which at each sample of a digital signal a difference is calculated representing the difference of the current sample and the estimate calculated for it on the basis of the previous samples. A sudden disturbance in the original signal results in the degradation of the estimate, which shows as increasing amplitudes of the difference values at the location of the disturbance. The effect of the disturbance is removed when the deviating sample is replaced by another value which better conforms to the average behaviour of the speech signal. A preferred replacement value is the value calculated for the sample in question as an estimate determined on the basis of the previous speech samples.
Noise reduction according to the invention is preferably realised at the receiving end so that it works for all disturbances originated at the transmitting end or on the transmission path and transmitted via a communications link to the receiving apparatus. This brings considerable advantage as a communications device applying the method according to the invention produces a higher-quality acoustic signal from a received radio signal than an apparatus without the noise reduction according to the invention. Furthermore, the invention is not limited to be used at the receiving end only, but it can also be used to enhance the signal quality prior to transmission. In addition, the invention is applicable in local digital sound reproduction systems that do not require signal transmission in a communications network.
The noise reduction according to the invention can be applied particularly advantageously to cellular networks and mobile stations. The method according to the invention can be realised in a receiving mobile station without any hardware changes as the implementation of the method can utilise information already computed in the mobile station. The method is flexible in a mobile station environment in that it can be realised so as to process different quantities of candidate impulses so that computing capacity can be adapted to the environment. The simplicity of the method is an advantage as far as the production of mobile stations is concerned because it makes it possible to perform the new calculations using the existing DSP circuit in the phone without any changes in the electronic circuitry of the apparatus. Furthermore, the proposed method does not call for changes in the LPC parameters, such as the order and frame size, of the most typical speech encoders.
The invention can be realised in a mobile station mainly by modifying the software that controls the operation of the mobile station. The method according to the invention can be realised not only in a receiving mobile station but also in a transmitting mobile station or in cellular base stations where transmission and reception coding is identical to that of mobile stations. Realisation of the method in base stations has the special advantage that no changes are needed in the existing mobile stations.
The invention is described in more detail with reference to the preferred embodiments presented by way of example and to the accompanying drawing wherein
Fig. 1a shows a certain signal waveform and sampling applied to it,
Fig. 1b shows an LPC residual formed from the waveform of Fig. 1a according to the prior art,
Fig. 1c shows an LPC residual shaped according to a preferred embodiment of the invention,
Fig. 2 shows in the form of a block diagram a communications device which includes noise removal according to the invention,
Fig. 3 shows in the form of a block diagram a noise removal arrangement according to the invention applied to a transmitter according to the GSM system and
Fig. 4 shows in the form of a block diagram noise removal according to the invention applied to a receiver according to the GSM system.
The sound production process of a healthy person, as regards voiced sounds in the speech, starts with air being pushed from the lungs through the vocal cords in the larynx so that the vocal cords start oscillating and generate a so-called glottal impulse. By varying the profile of the physiological filter (so-called vocal tract) consisting of the mouth cavity and partly the nasal cavity a speaker can produce various resonating frequencies (so-called formants). As the glottal excitation that has left the vocal cords passes through the vocal tract, the excitation is strongly filtered by the vocal tract. The sound produced, e.g. vowel /a/, gets its characteristic shape mainly from the vocal tract's resonating frequencies used in the sound production. In the case of voiceless sounds, such as /k/ and /s/, the vocal cords do not oscillate in the speech production, but the speaker produces in the outcoming air flow interruptions and/or broadband noise oscillation using the mouth. So, the human sound production mechanism, particularly in the case of voiced sounds, produces speech in physiological processes based on the periodic motion of two baggy organs (vocal cords) containing plenty of water. The waveform thus produced, when received by a microphone and converted to electric form, is quite regular when observed across a short span of time.
Fig. 1a schematically shows part of a waveform of an electric signal. Let it represent a human speech sound, recorded by a microphone and converted into an analog electric signal the amplitude of which varies as a function of time as described by the waveform 1. Reference was made above to the fact that the human sound production system can only produce certain relatively regular waveforms, so we can say about the waveform 1 that the sudden peak 2 in it is probably caused by a disturbance and not included in the original speech sound. According to the invention, the waveform 1 must be shaped such that the effect of the peak 2 is eliminated so that when an electric signal according to the shaped waveform is fed to a loudspeaker, the loudspeaker reproduces the sound without the disturbance. Signal distortion caused by sudden disturbances such as the peak 2 is called impulse noise.
In digital sound processing systems the waveform 1 is not processed as such, but it is sampled, i.e. samples are taken from the analog signal at regular intervals. Sampling can be realised using various methods. A preferred embodiment of the invention for removing impulse noise involves the LPC method, which is known per se, but the essential features of the method will nevertheless be described below to provide background information for the preferred embodiment of the invention. In the LPC method an estimate is produced for each sample of a digital sample train, said estimate being a weighted linear combination of p previous samples, where p is a positive parameter. Then the difference between the current sample and the estimate calculated for it in the manner described above is calculated, said difference being called a residual signal, or residual. The better the estimate, the smaller the energy of the residual signal. Calculating the energy of the residual or other signal as the sum of the squares of the signal values is known per se and modern signal processors perform the calculation very rapidly and efficiently. The residual not only has less energy than the original signal but also has a flatter spectrum. The fact that the energy is reduced and the spectrum changes to correspond to noise when the original signal is compared to its residual calculated using the LPC method makes linear prediction an effective tool in speech transmission.
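The residual computation described above can be sketched as follows. This is a minimal illustration only, assuming the predictor coefficients are already known; the function names `lpc_residual` and `energy`, the example coefficients, and the convention that samples before the start of the signal count as zero are assumptions made for the example and are not taken from the specification.

```python
def lpc_residual(samples, coeffs):
    """Return e[n] = x[n] - sum_i coeffs[i-1] * x[n-i] for every sample.

    coeffs holds the p prediction weights; samples before index 0 are
    treated as zero (an assumption of this sketch).
    """
    p = len(coeffs)
    residual = []
    for n, x in enumerate(samples):
        estimate = sum(coeffs[i - 1] * samples[n - i]
                       for i in range(1, p + 1) if n - i >= 0)
        residual.append(x - estimate)
    return residual


def energy(signal):
    """Energy as the sum of the squares of the signal values."""
    return sum(v * v for v in signal)


# A perfectly predictable toy signal: each sample is half of the previous one.
x = [8.0, 4.0, 2.0, 1.0, 0.5]
e = lpc_residual(x, [0.5])   # order p = 1, hypothetical weight 0.5
```

In this toy case only the first sample, which has no history, leaves a non-zero residual, so the residual energy is smaller than the energy of the signal itself.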
Especially in digital mobile telephone systems the LPC analysis has become the most significant technique facilitating speech coding. Using the LPC analysis, a speech signal of good quality can be transmitted using a considerably lower bit rate than that used e.g. by the pulse code modulation (PCM) technique traditionally used in the fixed telecommunications network. Efficient speech coding using the LPC analysis is based on the fact that a residual resembling noise can be quantised using a small amount of information without significantly degrading the quality of the signal synthesised from the residual back into a speech signal. As the LPC method cannot predict sudden noise peaks, it produces for a sample corresponding to a time index k an estimate which is represented by the darkened circle 4 in the case depicted by Fig. 1a. The open circle 3 represents a sample of the original signal suffering from impulse noise at time index k so that the difference of the signal and the estimate is quite big. Fig. 1b shows a prior-art LPC residual which shows that the prediction error grows considerably across the time span from index k to index k+p, where p is the order of the prediction, i.e. the number of the previous samples used for calculating a given estimate. In a preferred embodiment of the invention an apparatus processing the original noisy speech signal examines whether it would be more logical to replace at a given time index the speech signal sample by the estimate produced by the LPC analysis so that the residual of the new signal thus obtained would be smaller than the residual computed from the original speech signal. Fig. 1c shows a residual calculated by replacing a noisy sample of the original speech signal at time index k by an estimate produced by the LPC analysis.
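The statement that a single noise peak at index k disturbs the residual only across indexes k to k+p can be checked numerically. The sketch below is illustrative; the test signal, the order p = 2, the weights and the helper name `affected_indexes` are all hypothetical choices.

```python
def lpc_residual(samples, coeffs):
    """Residual e[n] = x[n] - sum_i a_i * x[n-i], with zeros assumed before n = 0."""
    return [x - sum(a * samples[n - i]
                    for i, a in enumerate(coeffs, start=1) if n - i >= 0)
            for n, x in enumerate(samples)]


def affected_indexes(clean, noisy, coeffs):
    """Indexes at which the residuals of the clean and noisy signals differ."""
    e_clean = lpc_residual(clean, coeffs)
    e_noisy = lpc_residual(noisy, coeffs)
    return [n for n, (c, d) in enumerate(zip(e_clean, e_noisy)) if c != d]


coeffs = [1.2, -0.5]                                # hypothetical weights, p = 2
clean = [float((n * 7) % 5) for n in range(12)]     # arbitrary test signal
k = 4
noisy = list(clean)
noisy[k] += 10.0                                    # one impulse at index k
idx = affected_indexes(clean, noisy, coeffs)        # indexes k, k+1, k+2
```

Sample k enters the residual once as the current sample and p times as a previous sample, which is why exactly p + 1 residual values change.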
Let the invention be applied as an enhancement method for the speech quality of a sound-reproducing apparatus prior to the conversion of a digital signal into an analog one by a D/A converter. The method aims at reconstructing a speech signal spoiled by impulse noise so that it becomes nearer to the waveform that represented the signal before the introduction of the noise. By feeding the reconstructed waveform to a loudspeaker we get an acoustic signal that sounds almost like the original noiseless speech signal. According to the known LPC method the apparatus calculates for each digital signal sample an estimate on the basis of previous samples. The preferred embodiment of the invention only requires that the apparatus knows a certain threshold value to which it compares the ratio of the energies of two residuals. A first residual has been calculated from the original signal which may include noise. A second residual is the residual obtained from the signal wherein the sample of the original speech signal at the current time index has been replaced by the LPC estimate of that sample. These two residuals differ from each other only in the index span from k to k+p, where k is the current time index and p is the order of the LPC prediction. Computing capacity can be saved in the energy calculation by taking into account the fact that differences occur only in that span. A residual and its energy are always computed within a certain time window that includes the current sample or samples. If the ratio of the energies of said two residuals exceeds said threshold value, the apparatus assumes that the speech signal sample at the current time index is a noise peak so that there is good reason to replace it by the LPC estimate.
Fig. 2 shows a simple block diagram of an apparatus applying the preferred embodiment of the invention described above. It generally comprises a reception part 24, the detailed structure of which is not shown but which may include e.g. known means for receiving, demodulating and decoding a radio signal transmitted from a base station in a digital cellular radio system. The reception part 24 feeds to block 20 a bit string that corresponds to a speech signal. The digital-to-analog conversion is performed in block 21. Block 22 calculates the residuals and compares the ratio of the energies of two residuals to a predetermined threshold value. When block 22 detects that the threshold value is exceeded, it instructs block 21 to replace the sample at the current time index by the value produced by the LPC prediction.
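The detection and replacement rule performed by block 22 and block 21 might be sketched as below. This is a hedged illustration, not the claimed implementation: it assumes that the ratio is formed as the energy of the first residual divided by the energy of the second, that energies are summed only over indexes k to k+p, and that a zero denominator is handled separately. The names `estimate`, `span_energy` and `remove_impulse`, the weight 0.9 and the threshold 10.0 are hypothetical.

```python
def estimate(samples, coeffs, n):
    """LPC estimate of samples[n] from the p previous samples (zeros before n = 0)."""
    return sum(c * samples[n - i]
               for i, c in enumerate(coeffs, start=1) if n - i >= 0)


def span_energy(samples, coeffs, k):
    """Residual energy over indexes k..k+p, the only span where the residuals differ."""
    last = min(k + len(coeffs), len(samples) - 1)
    return sum((samples[n] - estimate(samples, coeffs, n)) ** 2
               for n in range(k, last + 1))


def remove_impulse(samples, coeffs, k, threshold):
    """Return (signal, flag); sample k is replaced by its estimate if flagged."""
    first = span_energy(samples, coeffs, k)          # from the original signal
    candidate = list(samples)
    candidate[k] = estimate(samples, coeffs, k)      # sample k replaced
    second = span_energy(candidate, coeffs, k)       # from the modified signal
    if second == 0.0:                                # assumption: avoid dividing by zero
        is_noise = first > 0.0
    else:
        is_noise = first / second > threshold
    return (candidate if is_noise else list(samples)), is_noise


# Toy signal x[n] = 0.9 * x[n-1] + 0.1 with one impulse added at index 6.
clean = [0.0]
for _ in range(11):
    clean.append(0.9 * clean[-1] + 0.1)
noisy = list(clean)
noisy[6] += 5.0
corrected, flagged = remove_impulse(noisy, [0.9], 6, threshold=10.0)
untouched, clean_flag = remove_impulse(noisy, [0.9], 3, threshold=10.0)
```

With these toy values the disturbed index gives an energy ratio well above the threshold, while an undisturbed index gives a ratio below one and is left as it is.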
Applicability of the invention to apparatus requiring real-time or near real-time information processing, say, mobile phones, depends on how heavy signal processing it requires and what is the computing power of the signal processing hardware used for the sound reproduction. Different depth levels can be used in the application of the invention according to the hardware resources available and the length of delays allowed. For example, in the GSM system a speech signal is sampled by frames so that each frame has a certain number of samples. One frame corresponds to a speech period of 20 ms, so its processing in the receiver must not take longer than 20 ms. The signal processing power permitting, all samples in a frame can be even processed several times. If computing resources are scarce, the calculation according to the invention is performed only for r time indexes, for example. Since it is advantageous to direct the calculation to the removal of major disturbances, said r time indexes are chosen such that they correspond to the highest amplitude values of the original signal.
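Selecting the r time indexes that correspond to the highest amplitude values of a frame could look like the following sketch. The function name `candidate_indexes` and the sample frame are hypothetical, and ranking by absolute value is an assumption about how "highest amplitude" is measured.

```python
def candidate_indexes(frame, r):
    """Indexes of the r largest-magnitude samples of a frame, in time order."""
    ranked = sorted(range(len(frame)), key=lambda n: abs(frame[n]), reverse=True)
    return sorted(ranked[:r])


frame = [0.1, -0.4, 0.2, 3.0, -0.3, -2.5, 0.0, 0.6]   # toy frame, not 20 ms long
picked = candidate_indexes(frame, 3)
```

Only the indexes returned here would then be passed to the residual comparison, so the amount of computation per frame grows with r rather than with the frame length.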
The average magnitude of residuals varies according to the regularity of the original signal. When a clear-voiced speaker utters a long vowel in an even tone, the resulting waveform is very regular and thus very predictable. On average the residual energies are very small and disturbances caused by impulse noise are clearly discernible. If the speaker utters an /s/ sound, the acoustic signal resembles random noise and its predictability is poor. Then, disturbances caused by impulse noise do not induce in the residual as great a change as in the case of voiced sounds. Particularly in the case of low-amplitude impulse noise the method according to the invention is unable to improve the quality of voiceless sounds as effectively as that of voiced sounds. This, however, is not very significant since in the case of the /s/ sound impulse noise is mixed with the characteristic noise structure of the sound itself and a listener will not experience it as disturbing as in the case of a voiced sound.
The invention advantageously prepares for this divergence of sounds in such a manner that the threshold value at which a signal sample is interpreted as a noise peak, is adaptive, i.e. its magnitude is determined in a certain manner by the average ratio of the residual energies compared. This requires that in the apparatus according to Fig. 2 the comparison block 22 comprises e.g. a FIFO-type (First-In- First-Out) register 22a that always stores the n latest residuals or residual energy ratios, where n is a positive parameter. Additionally, the comparison block 22 must include calculating means 22b for computing an average or median or another suitable parameter from the contents of the register as well as means for determining the threshold value on the basis of said parameter. All said functional features are preferably realised as a program executed by a digital signal processor.
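A register of the n latest residual energy ratios with a threshold derived from it might be sketched as follows. The text leaves the exact rule open, so the choice of the median multiplied by a fixed factor, the fallback value used while the register is empty, and the class name `AdaptiveThreshold` are all assumptions of this sketch.

```python
from collections import deque
from statistics import median


class AdaptiveThreshold:
    """FIFO register of the n latest energy ratios plus a threshold rule."""

    def __init__(self, n, scale, initial):
        self.history = deque(maxlen=n)   # plays the role of register 22a
        self.scale = scale               # hypothetical multiplier
        self.initial = initial           # used until a ratio has been stored

    def value(self):
        """Current threshold: scale times the median of the stored ratios."""
        if not self.history:
            return self.initial
        return self.scale * median(self.history)

    def update(self, ratio):
        """Store the newest ratio; the oldest one drops out when full."""
        self.history.append(ratio)


threshold = AdaptiveThreshold(n=3, scale=4.0, initial=10.0)
for ratio in (1.0, 2.0, 3.0):
    threshold.update(ratio)
```

A median is less affected than an average by the very ratios that mark noise peaks, which is one reason it may be a reasonable choice for the parameter computed by means 22b.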
Fig. 3 illustrates the operation of the method according to the invention in a transmitter encoder 10 of the GSM system. More specifically, it is a regular pulse excited-long term predictive (RPE-LTP) speech encoder, which is currently widely considered the technically most capable speech encoder, but the inventional ideas of the example discussed here can also be applied to the CELP (Code Excited Linear Prediction), SBC-APCM, SBC-ADPCM and MPE-LTP speech encoders. An original speech signal is first processed in a preprocessing part 200 whence it is first taken via connection 14 to a pre-emphasiser 30 and impulse removal 140, where it is used for recognising noise peaks. A pre-emphasised speech signal is divided into segments comprising 160 samples in a segmentation part 40 to which the signal is brought via connection 34. After that, the segments of the signal are windowed using so-called Hamming windows in block 50. The windowed and segmented signal goes to short term prediction (STP) analysis and filtering 300 where autocorrelation coefficients are first computed in block 60 in which the signal arrives via connection 56. The autocorrelation coefficients are taken via connection 67 to block 70 where reflection coefficients are calculated using the so-called Schur recursion. The reflection coefficients are sent via connection 714 to the impulse removal 140 where they are used for recognising noise peaks. The reflection coefficients are also sent forward in the STP analysis (300) via connection 78 to block 80, where logarithmic area ratios (LAR) are computed from them. The logarithmic area ratios are quantised in blocks 90 and 100 and interpolated linearly in blocks 110 and 120. In step 130 of the STP analysis and filtering, an STP residual is computed from the processed logarithmic area ratios using a so-called partial correlation (PARCOR) method. 
This STP residual is fed to the impulse removal 140, where according to the invention a corrected residual is formed using both the original speech signal (connection 14) and the reflection coefficients (connection 714), said corrected residual being characterised in that the effects of the most significant noise peaks have been removed from it. The corrected residual is fed via connection 141 to a long term prediction (LTP) analysis, where a peak structure reproduced at the fundamental frequency of the speech is removed from the residual signal using a so-called long term LPC estimate. Finally, the parameters transmitted are quantised in the encoder block 160.
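The front part of the STP analysis chain (windowing, autocorrelation, reflection coefficients, logarithmic area ratios) can be illustrated with the sketch below. Note that it substitutes the Levinson-Durbin recursion for the Schur recursion named in the text; both derive reflection coefficients from autocorrelation values, but this is not the encoder's actual routine. The sign convention of the reflection coefficients, the order 2, the test segment and all function names are assumptions, and quantisation and interpolation are omitted.

```python
import math


def hamming(n_samples):
    """Hamming window w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if n_samples < 2:
        return [1.0] * n_samples
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def autocorrelation(x, max_lag):
    """r[lag] = sum_n x[n] * x[n - lag] for lag = 0..max_lag."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]


def reflection_coefficients(r, order):
    """Levinson-Durbin recursion (swapped in for Schur); returns k_1..k_order."""
    a = [0.0] * (order + 1)          # a[j] multiplies x[n - j]; a[0] is unused
    err = r[0]                       # prediction error energy at order 0
    ks = []
    for i in range(1, order + 1):
        if err <= 0.0:               # degenerate segment, e.g. all zeros
            ks.append(0.0)
            continue
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
        ks.append(k)
    return ks


def log_area_ratio(k):
    """LAR = log10((1 + k) / (1 - k)), defined for |k| < 1."""
    return math.log10((1.0 + k) / (1.0 - k))


# One hypothetical 160-sample segment made of two sinusoids.
segment = [math.sin(0.3 * n) + 0.5 * math.sin(1.7 * n) for n in range(160)]
windowed = [s * w for s, w in zip(segment, hamming(160))]
r = autocorrelation(windowed, 2)
lars = [log_area_ratio(k) for k in reflection_coefficients(r, 2)]
```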
The residual signal can be substantially improved by adding the impulse removal according to the invention, block 140, connections 14, 714, 1314, 141, to the known GSM speech encoder. It is obvious that a better quality of the residual signal can substantially reduce noise in the speech signal.
In Fig. 4, an RPE decoder 170 detects a speech signal from a coded signal brought to a GSM receiver, said speech signal being then sent via connection 1718 to be LTP synthesised in block 180. The speech signal is then taken via connections 1813 and 1319 to signal postprocessing 500. The LTP and STP syntheses (180, 190) and filterings performed on the speech signal along this path are substantially inverse operations of the LTP and STP analyses and filterings already described in conjunction with Fig. 3.
The RPE decoder 170 also receives LPC coefficients which are sent via connection 1710 to the STP synthesis filtering chain 400 where reflection coefficients are filtered from the LPC coefficients using parameters of logarithmic area ratios. The reflection coefficients are filtered in blocks 100, 110, 120 by quantising and linearly interpolating the logarithmic area ratio parameters. The method is substantially the inverse operation of the compilation of the uncorrected residual described in conjunction with Fig. 3. The reflection coefficients are sent via connection 1214 to the impulse removal 140 where the strongest noise peaks are removed from the speech signal according to the invention using the reflection coefficients received via connection 1214.
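How the impulse removal 140 uses the reflection coefficients is not spelled out. One possibility, shown here purely as an assumption, is to convert decoded logarithmic area ratios back to reflection coefficients and then to direct-form prediction weights with the standard step-up recursion, so that an estimate can be formed for any sample. The function names, the LAR values and the sign convention are hypothetical.

```python
import math


def lar_to_reflection(lar):
    """Invert LAR = log10((1 + k) / (1 - k)) to recover k."""
    t = 10.0 ** lar
    return (t - 1.0) / (t + 1.0)


def step_up(ks):
    """Convert reflection coefficients k_1..k_p to prediction weights a_1..a_p."""
    a = []
    for k in ks:
        m = len(a)
        # a_j(new) = a_j - k * a_(m+1-j) for the old weights, then append k.
        a = [a[j] - k * a[m - 1 - j] for j in range(m)] + [k]
    return a


def estimate(samples, coeffs, n):
    """Estimate of samples[n] as a weighted sum of the p previous samples."""
    return sum(c * samples[n - i]
               for i, c in enumerate(coeffs, start=1) if n - i >= 0)


lars = [math.log10(3.0), 0.0]                 # hypothetical decoded LAR values
ks = [lar_to_reflection(v) for v in lars]     # approximately [0.5, 0.0]
coeffs = step_up(ks)                          # approximately [0.5, 0.0]
```

A lattice filter driven directly by the reflection coefficients would give the same estimates without this conversion; the direct form is used here only because it matches the weighted-sum description of the LPC estimate.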
The method discussed above substantially improves the quality of speech reproduced by a GSM receiver. The embodiment shown in Fig. 4 is particularly advantageous in a receiving GSM mobile station since in this case the corrected signal is no longer sent over possibly corrupting transmission paths but propagates directly to the listener's ear. The invention provides an apparatus and a method for improving the understandability of an electrically reproduced speech signal in a manner which is easy to implement and which effectively attenuates all disturbances caused by the transmitting end, transmission path and preceding parts of the receiving apparatus in addition to disturbances acoustically coupled to the microphone. The method according to the invention does not require complex equipment or special components so that it is ideal for small, portable communications devices, such as mobile stations, but on the other hand, it can be easily applied to base stations of various cellular networks as well. The invention is also applicable to data communications systems of fixed networks. If the invention is applied in a communications device designed for the handling of both speech and other signals, it may be advantageous to include in the device a switch or an automatic function that activates the operation according to the invention only when the communications device is processing speech.
Above, the invention was described using different implementations and embodiments related to mobile communication systems but it is obvious that the invention is not limited to those but can be modified within the scope of the inventional idea and the claims set forth below.

Claims

1. A method for attenuating or reducing noise in a digitally transmitted speech signal comprising numerical values representing successive samples of a waveform, wherein it is recognised those of said numerical values that represent waveform details (2) which contain noise in significant portions, characterised in that to recognise a certain first numerical value it is calculated
- a first residual which equals in a certain set of numerical values including said first numerical value the difference of a numerical value representing a sample of said waveform and an estimate calculated for that numerical value on the basis of other numerical values, calculated at sampling moments, and
- a second residual which equals in said set of numerical values the difference of a numerical value representing a sample of said waveform and an estimate calculated for that numerical value on the basis of other numerical values, calculated at sampling moments when said first numerical value has been replaced by the estimate calculated for it and said first numerical value is recognised on the basis of said first and second residuals.
2. The method of claim 1, characterised in that to recognise a certain first numerical value it is calculated - the ratio or difference of the energies of said first residual and said second residual or another mathematical quantity representing the difference of the residuals mentioned above, so that, if said ratio exceeds a certain first threshold value, said first numerical value is considered recognised.
3. The method of claims 1 and 2, characterised in that said first threshold value is determined adaptively according to the average residual energy ratio or residual energy difference which is calculated by averaging the ratio or difference mentioned in claim 2 across several different sets of sample values.
4. The method of claim 1, 2 or 3, characterised in that each recognised numerical value (3) is replaced by an estimate (4) calculated for it.
5. The method of any one of the preceding claims, characterised in that it is part of a D/A conversion for feeding said digitally transmitted speech signal in analog form to a loudspeaker (23).
6. The method of any one of the preceding claims, characterised in that it is adapted so as to be used in a digital data communications system in connection with transmission.
7. The method of any one of the preceding claims, characterised in that it is adapted so as to be used in a digital data communications system in connection with reception.
8. An apparatus for reducing noise in a signal transmitted in digital form, said signal comprising numerical values representing successive samples of a certain waveform, characterised in that it comprises - a calculation and comparison element (22) for calculating on the basis of said numerical values residuals for sets of numerical values and for recognising numerical values on the basis of said residuals.
9. The apparatus of claim 8, characterised in that said calculation and comparison element (22) comprises means for comparing calculated residuals with a predetermined threshold condition.
10. The apparatus of claim 8 or 9, characterised in that it comprises replacement means (21) for replacing numerical values recognised by said calculation and comparison element by other numerical values.
11. The apparatus of any one of claims 8 to 10, characterised in that said replacement means (21) also comprises a D/A converter for reconstructing said waveform on the basis of said numerical values.
12. The apparatus of claim 8 or 9, characterised in that said calculation and comparison element (22) comprises a register (22a) for the temporary storage of a certain number of residual ratios or residual differences and means (22b) for producing said threshold condition on the basis of the contents of said register.
13. The apparatus of any one of claims 8 to 12, characterised in that it is a digital signal processor.
14. The apparatus of any one of claims 8 to 13, characterised in that it is adapted so as to be used in a digital data communications system in connection with transmission so that
- the calculation and comparison element (22) is located in an impulse removal unit (140) where it corrects an STP-analysed residual on the basis of the information contained in reflection coefficients and the original speech signal, said impulse removal unit (140) also including replacement means (21) which substitutes estimates for the noise peaks removed from the corrected residual.
15. The apparatus of any one of claims 8 to 13, characterised in that it is adapted so as to be used in a digital data communications system in connection with reception so that
- the calculation and comparison element (22) and the replacement means (21) are adapted to impulse removal (140) which produces a speech signal in which the calculation and comparison element (22) recognises strong noise peaks on the basis of the information contained in the reflection coefficients, and the replacement means (21) substitutes estimates for said noise peaks.
16. The method of any one of claims 1 to 7 or the apparatus of any one of claims 8 to 15, characterised in that said residual values are adapted so as to be transmitted in a digital data communications system.
17. The method of any one of claims 1 to 7 or the apparatus of any one of claims 8 to 15, characterised in that it is adapted so as to be used in mobile stations and/or base stations of a cellular network.
18. The method of any one of claims 6 to 7 or the apparatus of any one of claims 14 to 16, characterised in that said digital data communications system is a GSM system.
PCT/FI1997/000458 1996-07-25 1997-07-25 A method and a device for the reduction impulse noise from a speech signal WO1998005031A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU35445/97A AU3544597A (en) 1996-07-25 1997-07-25 A method and a device for the reduction impulse noise from speech signal

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FI962965A FI962965A (en) 1996-07-25 1996-07-25 Method and apparatus for eliminating impulse interference from the speech signal
FI962965 1996-07-25

Publications (2)

Publication Number Publication Date
WO1998005031A2 true WO1998005031A2 (en) 1998-02-05
WO1998005031A3 WO1998005031A3 (en) 1998-03-05

Family

ID=8546431

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/FI1997/000458 WO1998005031A2 (en) 1996-07-25 1997-07-25 A method and a device for the reduction impulse noise from a speech signal

Country Status (3)

Country Link
AU (1) AU3544597A (en)
FI (1) FI962965A (en)
WO (1) WO1998005031A2 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0336685A2 (en) * 1988-04-08 1989-10-11 Cedar Audio Limited Impulse noise detection and supression
JPH06177786A (en) * 1992-12-08 1994-06-24 Pioneer Electron Corp Impulse noise elimination device
JPH08195688A (en) * 1995-01-17 1996-07-30 Kokusai Electric Co Ltd Receiving device having noise eliminating function
JPH08275086A (en) * 1995-04-03 1996-10-18 Toshiba Corp Pcm demodulation processing circuit

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000038180A1 (en) * 1998-12-18 2000-06-29 Telefonaktiebolaget Lm Ericsson (Publ) Noise suppression in a mobile communications system
WO2006127968A1 (en) * 2005-05-26 2006-11-30 Berkley Integrated Audio Software, Inc. Restoring audio signals corrupted by impulsive noise
US7787975B2 (en) 2005-05-26 2010-08-31 Berkley Integrated Audio Software, Inc. Restoring audio signals
EP2149879A1 (en) * 2008-07-31 2010-02-03 Fujitsu Limited Noise detecting device and noise detecting method
US8892430B2 (en) 2008-07-31 2014-11-18 Fujitsu Limited Noise detecting device and noise detecting method

Also Published As

Publication number Publication date
FI962965A (en) 1998-01-26
AU3544597A (en) 1998-02-20
WO1998005031A3 (en) 1998-03-05
FI962965A0 (en) 1996-07-25

Similar Documents

Publication Publication Date Title
US5933803A (en) Speech encoding at variable bit rate
KR100575193B1 (en) A decoding method and system comprising an adaptive postfilter
JP4927257B2 (en) Variable rate speech coding
EP0673013B1 (en) Signal encoding and decoding system
EP0993670B1 (en) Method and apparatus for speech enhancement in a speech communication system
KR100421160B1 (en) Adaptive Filter and Filtering Method for Low Bit Rate Coding
KR100216018B1 (en) Method and apparatus for encoding and decoding of background sounds
US20030065507A1 (en) Network unit and a method for modifying a digital signal in the coded domain
EP1020848A2 (en) Method for transmitting auxiliary information in a vocoder stream
WO1994025959A1 (en) Use of an auditory model to improve quality or lower the bit rate of speech synthesis systems
KR100498177B1 (en) Signal quantizer
WO1998005031A2 (en) A method and a device for the reduction impulse noise from a speech signal
KR20000028699A (en) Device and method for filtering a speech signal, receiver and telephone communications system
GB2336978A (en) Improving speech intelligibility in presence of noise
Stansfield et al. Speech processing techniques for HF radio security
JP3896654B2 (en) Audio signal section detection method and apparatus
Stansfield et al. Adaptive filters in speech coding
WO1993021627A1 (en) Digital signal coding
Kulakcherla Non linear adaptive filters for echo cancellation of speech coded signals

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH HU IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW AM AZ BY KG KZ MD RU TJ TM

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH KE LS MW SD SZ UG ZW AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL

AK Designated states

Kind code of ref document: A3

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH HU IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW AM AZ BY KG KZ MD RU TJ TM

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH KE LS MW SD SZ UG ZW AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase in:

Ref country code: CA

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase in:

Ref country code: JP

Ref document number: 1998508288

Format of ref document f/p: F

122 Ep: pct application non-entry in european phase