EP0852052A1 - System for adaptive filtering of audio signals to improve speech intelligibility in ambient noise - Google Patents

System for adaptive filtering of audio signals to improve speech intelligibility in ambient noise

Info

Publication number
EP0852052A1
EP0852052A1 EP96931552A EP96931552A EP0852052A1 EP 0852052 A1 EP0852052 A1 EP 0852052A1 EP 96931552 A EP96931552 A EP 96931552A EP 96931552 A EP96931552 A EP 96931552A EP 0852052 A1 EP0852052 A1 EP 0852052A1
Authority
EP
European Patent Office
Prior art keywords
noise
speech
filter circuit
estimate
energy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP96931552A
Other languages
English (en)
French (fr)
Other versions
EP0852052B1 (de
Inventor
Torbjörn W. SÖLVE
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ericsson Inc
Original Assignee
Ericsson Inc
Ericsson GE Mobile Communications Holding Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ericsson Inc, Ericsson GE Mobile Communications Holding Inc filed Critical Ericsson Inc
Publication of EP0852052A1 publication Critical patent/EP0852052A1/de
Application granted granted Critical
Publication of EP0852052B1 publication Critical patent/EP0852052B1/de
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L2021/02168Noise filtering characterised by the method used for estimating noise the estimation exclusively taking place during speech pauses
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals
    • G10L2025/783Detection of presence or absence of voice signals based on threshold decision
    • G10L2025/786Adaptive threshold
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L21/0232Processing in the frequency domain

Definitions

  • the present invention is related to U.S. Patent Application Serial No. 08/128,639, entitled “Adaptive Noise Reduction for Speech Signals” filed on September 29, 1993; and to U.S. Patent Application No. 07/967,027 entitled “Multi-Mode Signal Processing” filed on
  • the present invention relates to noise reduction systems, and in particular, to an adaptive speech intelligibility enhancement system for use in portable digital radio telephones.
  • PCNs personal communication networks
  • Digital communication systems take advantage of powerful digital signal processing techniques.
  • Digital signal processing refers generally to mathematical and other manipulation of digitized signals. For example, after converting (digitizing) an analog signal into digital form, that digital signal may be filtered, amplified, and attenuated using simple mathematical routines in a digital signal processor (DSP) .
  • DSPs are manufactured as high speed integrated circuits so that data processing operations can be performed essentially in real time. DSPs may also be used to reduce the bit transmission rate of digitized speech which translates into reduced spectral occupancy of the transmitted radio signals and increased system capacity.
  • a serial bit rate of 112 Kbits/sec is produced.
  • voice coding techniques can be used to compress the serial bit rate from 112 Kbits/sec to 7.95 Kbits/sec to achieve a 14:1 reduction in bit transmission rate. Reduced transmission rates translate into more available bandwidth.
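  • As a quick check of the figures quoted above, the compression ratio can be worked out directly. The sketch below only restates the two bit rates from the text; the constant and function names are illustrative and do not come from the patent.

```python
# Bit rates quoted in the text, expressed in bits per second.
RAW_BIT_RATE_BPS = 112_000    # 112 kbit/s before voice coding
CODED_BIT_RATE_BPS = 7_950    # 7.95 kbit/s after voice coding


def compression_ratio(raw_bps: float, coded_bps: float) -> float:
    """Return how many times smaller the coded bit rate is than the raw one."""
    return raw_bps / coded_bps


ratio = compression_ratio(RAW_BIT_RATE_BPS, CODED_BIT_RATE_BPS)
print(f"compression ratio is roughly {ratio:.1f}:1")
```

  • The result is slightly above 14, which is consistent with the 14:1 reduction stated in the text.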
  • VSELP vector sourcebook excited linear predictive coding
  • the distortion is caused in large part by the environment in which the mobile telephones are used.
  • Mobile telephones are typically used in a vehicle's interior where there is often ambient noise produced by the vehicle's engine and surrounding vehicular traffic.
  • This ambient noise in the vehicle's interior is typically concentrated in the low audible frequency range and the magnitude of the noise can vary due to such factors as the speed and acceleration of the vehicle and the extent of the surrounding vehicular traffic.
  • This type of low frequency noise also has the tendency of significantly decreasing the intelligibility of the speech coming from the speaking person in the car environment.
  • the decrease in speech intelligibility caused by low frequency noise can be particularly significant in communication systems deploying a VSELP vocoder, but can also occur in communication systems that do not include a VSELP vocoder.
  • the influence of the ambient noise on the mobile telephone can also be affected by the manner in which the mobile telephone is used.
  • the mobile telephone may be used in a hands-free mode where the telephone user talks on the telephone while the mobile telephone is in a cradle. This frees the telephone user's hands to drive but also increases the distance that the telephone user's audible words must travel before reaching the microphone input of the mobile telephone. This increased distance between the user and the mobile telephone, along with the varying ambient noise, can result in noise being a significant portion of the total power spectral energy of the audio signal inputted into the mobile telephone.
  • the present invention provides an adaptive noise reduction system that reduces the undesirable contributions of encoded background noise while both minimizing any negative impact on the quality of the encoded speech and minimizing any increased drain on digital signal processor resources.
  • the method and system of the present invention increases the intelligibility of the speech in a digitized audio signal by passing frames of the digitized audio signal through a filter circuit.
  • the filter circuit functions as an adjustable, high-pass filter which filters a portion of the digitized signal in a low audible frequency range and passes the portion of the digitized signal falling in higher frequency ranges.
  • the filter circuit filters a large segment of the noise in the digitized audio signal while only filtering less important segments of the speech. This results in a relatively larger portion of the noise energy being removed compared to the portion of the speech energy removed.
  • a filter control circuit is used to adjust the filter circuit to exhibit different frequency response curves as a function of a noise estimate and/or a spectral profile result corresponding to the noise in the audio signal.
  • the noise estimate and/or the spectral profile result are adjusted on a frame-by-frame basis for the digital signal and as a function of speech detection. If speech is not detected, the noise estimate and/or spectral profile result is updated for the current frame. If speech is detected, the noise estimate and/or spectral profile result is left unadjusted.
  • the filter circuit calculates noise estimates for the frames of the digitized audio signals. The noise estimates correspond to the amount of background noise in the frames of the digitized audio signals.
  • the filter control circuit uses the noise estimates to adjust the filter circuit to filter larger portions of the low frequency range of speech as the relative amount of background noise to speech in a low frequency range of speech increases.
  • when there is no background noise, no portion of the speech signal is filtered.
  • Larger portions of noise and speech information are extracted when there is a higher level of background noise. Because noise tends to be concentrated in a low frequency range and only a relatively small portion of the intelligibility content of speech falls within this low frequency range, the overall intelligibility of the audio signal can be increased by increasing the portion of low frequency energy being filtered as the noise estimates increase.
  • a modified filter control circuit is used to adjust the filter circuit to exhibit different frequency response curves as a function of a noise profile of the noise estimate over a selected frequency range in the audio signal.
  • the filter control circuit includes a spectral analyzer for determining a noise profile estimate as a function of the detection of speech. A noise profile estimate is determined for a current frame and compared to a reference noise profile. Based on this comparison, the filter circuit is adaptively adjusted to extract varying amounts of low frequency energy from the current frame.
  • the adaptive noise reduction system may be advantageously applied to telecommunication systems in which portable/mobile radio transceivers communicate over RF channels with each other or with fixed telephone line subscribers.
  • Each transceiver includes an antenna, a receiver for converting radio signals received over an RF channel via the antenna into analog audio signals, and a transmitter.
  • the transmitter includes a coder-decoder (codec) for digitizing analog audio signals to be transmitted into frames of digitized speech information, the speech information including both speech and background noise.
  • codec coder-decoder
  • a digital signal processor processes a current frame based on an estimate of the background noise and the detection of speech in the current frame to minimize background noise.
  • a modulator modulates an RF carrier with the processed frame of digitized speech information for subsequent transmission via the antenna.
  • FIGURE 1 is a general functional block diagram of the present invention
  • FIGURE 2 illustrates the frame and slot structure of the U.S. digital standard IS-54 for cellular radio communications;
  • FIGURE 3 is a block diagram of a first preferred embodiment of the present invention implemented using a digital signal processor
  • FIGURE 4 is a functional block diagram of an exemplary embodiment of the present invention in one of plural portable radio transceivers in a telecommunication system
  • FIGURES 5A and 5B are a flow chart which illustrates functions/operations performed by the digital signal processor in implementing the first preferred embodiment of the present invention
  • FIGURE 6A is a graph illustrating a first example of an attenuation vs. frequency characteristic of a filter circuit according to the first preferred embodiment of the present invention
  • FIGURE 6B is a graph illustrating a second example of an attenuation vs. frequency characteristic of a filter circuit according to the first preferred embodiment of the present invention.
  • FIGURE 7 is an example look-up table accessible by the filter control circuit of the first preferred embodiment of the present invention
  • FIGURES 8A and 8B are graphs illustrating the amplitude vs. frequency characteristics of example input audio signals
  • FIGURES 9A and 9B are graphs illustrating the amplitude vs. frequency characteristics of the input audio signals of Figures 8A and 8B, respectively, after having been filtered by the filter circuit of the present invention;
  • FIGURE 10 is a block diagram of a second preferred embodiment of the present invention implemented using a digital signal processor
  • FIGURE 11 is a flow chart, corresponding to the flow chart of Figure 5B, which illustrates functions/operations performed by the digital signal processor in implementing the second preferred embodiment of the present invention.
  • FIGURE 12 is an example look-up table accessible by the filter control circuit of the second preferred embodiment of the present invention.
  • FIG. 1 is a general block diagram of the adaptive noise reduction system 100 according to the present invention.
  • Adaptive noise reduction system 100 includes a filter control circuit 105 connected to a filter circuit 115.
  • Filter control circuit 105 generates a filter control signal for a current frame of a digitized audio signal.
  • the filter control signal is outputted to the filter circuit 115, and the filter circuit 115 adjusts in response to the filter control signal to exhibit a high-pass frequency response curve selected based on the filter control signal.
  • the adjusted filter circuit 115 filters the current frame of the digitized audio signal.
  • the filtered signal is processed by a voice coder 120 to produce a coded signal representing the digitized audio signal.
  • Figure 2 illustrates the time division multiple access (TDMA) frame structure employed by the IS-54 standard for digital cellular telecommunications.
  • a "frame" is a twenty millisecond time period which includes one transmit block TX, one receive block RX, and a signal strength measurement block used for mobile-assisted hand-off (MAHO).
  • the two consecutive frames shown in Figure 2 are transmitted in a forty millisecond time period. Digitized speech and background noise information is processed and filtered on a frame-by-frame basis as further described below.
  • the functions of the filter control circuit 105, filter circuit 115, and voice coder 120 shown in Figure 1 are implemented with a high speed digital signal processor.
  • One suitable digital signal processor is the TMS320C53 DSP available from Texas Instruments.
  • the TMS320C53 DSP includes on a single integrated chip a sixteen-bit microprocessor, on-chip RAM for storing data such as speech frames to be processed, ROM for storing various data processing algorithms including the VSELP speech compression algorithm, and other algorithms to be described below for implementing the functions performed by the filter control circuit 105 and the filter circuit 115.
  • a first embodiment of the present invention is shown in Figure 3.
  • the filter circuit 115 is adjusted as a function of background noise estimates determined by the filter control circuit.
  • Frames of pulse code modulated (PCM) audio information are sequentially stored in the DSP's on-chip RAM. The audio information could be digitized using other digitization techniques.
  • PCM pulse code modulated
  • Each PCM frame is retrieved from a DSP on-chip RAM and processed by frame energy estimator 210, and stored temporarily in temporary frame store 220.
  • the energy of the current frame determined by frame energy estimator 210 is provided to noise estimator 230 and speech detector 240 function blocks.
  • Speech detector 240 indicates that speech is present in the current frame when the frame energy estimate exceeds the sum of the previous noise estimate and a speech threshold. If the speech detector 240 determines that no speech is present, the digital signal processor 200 calculates an updated noise estimate as a function of the previous noise estimate and the current frame energy (block 230) .
  • the updated noise estimate is outputted to a filter selector 235.
  • Filter selector 235 generates a filter control signal based on the noise estimate.
  • the filter selector 235 accesses a look-up table in generating the filter control signal.
  • the look-up table includes a series of filter control values that are each matched with a noise estimate or range of noise estimates.
  • a filter control value from a look-up table is selected based on the updated noise estimate and this filter control value is represented by a filter control signal outputted to a filter bank 265 for the filter circuit 115.
  • a hangover time of N frames is set upon the selection of a new filter.
  • a new filter can only be selected every N frames, where N is an integer greater than one and preferably greater than 10.
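  • One way to read the hangover rule above is as a small state machine that holds the current filter until N frames have elapsed since the last change. The sketch below is an interpretation under that assumption; the class name, the default of 12 frames, and the integer filter indices are illustrative choices and do not come from the patent.

```python
class HangoverSelector:
    """Holds the active filter for a hangover period after each change."""

    def __init__(self, hangover_frames: int = 12, initial_filter: int = 0):
        self.hangover_frames = hangover_frames
        self.current = initial_filter
        self.frames_left = 0  # frames remaining before a change is allowed

    def step(self, requested: int) -> int:
        """Process one frame and return the filter index actually in use."""
        if self.frames_left > 0:
            # Still inside the hangover period: ignore the request.
            self.frames_left -= 1
        elif requested != self.current:
            # A change is allowed; start a new hangover period so that the
            # next change can happen no sooner than N frames from now.
            self.current = requested
            self.frames_left = self.hangover_frames - 1
        return self.current


selector = HangoverSelector(hangover_frames=3)
used = [selector.step(r) for r in [1, 2, 2, 2, 2]]
print(used)
```

  • With a hangover of three frames, the request for filter 2 is honoured only on the fourth frame, which avoids audible switching when the noise estimate hovers near a table boundary.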
  • the filter circuit 115 is adjusted in response to the filter control signal to exhibit a high-pass frequency response curve that corresponds with the inputted filter control signal and noise estimate.
  • Various different types of filter circuits well known in prior art can be utilized to exhibit selected frequency response curves in response to the filter control signal.
  • These prior art filters include IIR filters such as Butterworth, Chebyshev (Tschebyscheff) or elliptic filters. IIR filters are preferable to FIR filters, which also can be used, due to lower processing requirements.
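  • As an illustration of why an IIR filter is cheap to run, the sketch below builds a second-order high-pass section and applies it sample by sample. The patent does not specify a filter order or design method; the coefficient formulas here are the commonly used bilinear-transform biquad high-pass formulas, which give a Butterworth response when Q is 1/sqrt(2). All function names and parameters are assumptions made for this example.

```python
import math


def highpass_biquad(cutoff_hz: float, sample_rate_hz: float,
                    q: float = 1 / math.sqrt(2)):
    """Return normalized (b, a) coefficients of a 2nd-order high-pass biquad."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = [(1 + cos_w0) / 2 / a0, -(1 + cos_w0) / a0, (1 + cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a


def filter_frame(samples, b, a, state=(0.0, 0.0)):
    """Filter one frame (transposed direct form II) and return (output, state).

    Passing the returned state into the next call keeps the filter
    continuous across frame boundaries.
    """
    z1, z2 = state
    out = []
    for x in samples:
        y = b[0] * x + z1
        z1 = b[1] * x - a[1] * y + z2
        z2 = b[2] * x - a[2] * y
        out.append(y)
    return out, (z1, z2)


b, a = highpass_biquad(cutoff_hz=300.0, sample_rate_hz=8000.0)
dc_out, _ = filter_frame([1.0] * 2000, b, a)           # constant (0 Hz) input
nyq_out, _ = filter_frame([1.0, -1.0] * 1000, b, a)    # 4000 Hz input
```

  • Each output sample costs five multiplications, whereas an FIR filter with a comparably sharp low-frequency cut-off would typically need many more taps. After the start-up transient, the constant input is suppressed and the high-frequency input passes at essentially full amplitude.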
  • the filtered signal is processed by a voice coder 120 which is used to compress the bit rate of the filtered signal.
  • the voice coder 120 uses vector sourcebook excited linear predictive coding (VSELP) to code the audio signal.
  • CELP code excited linear predictive
  • RPE-LTP residual pulse excited linear predictive
  • IMBE improved multiband excited
  • the digital signal processor 200 described in conjunction with Figure 3 can be used, for example, in the transceiver of a digital portable/mobile radiotelephone used in a radio telecommunications system.
  • Figure 4 illustrates one such digital radio transceiver which may be used in a cellular telecommunications network. Although Figure 4 generally describes the basic function blocks included in the radio transceiver, a more detailed description of this transceiver may be obtained from the previously referenced U.S. Patent Application Serial No. 07/967,027 entitled "Multi-Mode Signal Processing" which is incorporated herein by reference.
  • Audio signals including speech and background noise are input in a microphone 400 to a coder-decoder (codec) 402 which preferably is an application specific integrated circuit (ASIC).
  • ASIC application specific integrated circuit
  • the band limited audio signals detected at microphone 400 are sampled by the codec 402 at a rate of 8,000 samples per second and blocked into frames. Accordingly, each twenty millisecond frame includes 160 speech samples. These samples are quantized and converted into a coded digital format such as 14-bit linear PCM.
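  • The framing arithmetic above can be checked with a few lines of code. The sketch below uses only the sampling rate, frame length and sample width quoted in the text; the helper names are illustrative, and dropping a trailing partial frame is an assumption made for simplicity.

```python
SAMPLE_RATE_HZ = 8000    # samples per second, from the text
FRAME_MS = 20            # frame duration in milliseconds, from the text
BITS_PER_SAMPLE = 14     # 14-bit linear PCM, from the text


def samples_per_frame(sample_rate_hz: int, frame_ms: int) -> int:
    """Number of samples in one frame."""
    return sample_rate_hz * frame_ms // 1000


def block_into_frames(samples, frame_len: int):
    """Split a sample sequence into consecutive full frames.

    A trailing partial frame is dropped in this sketch.
    """
    count = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]


frame_len = samples_per_frame(SAMPLE_RATE_HZ, FRAME_MS)
raw_bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE
frames = block_into_frames(list(range(400)), frame_len)
print(frame_len, raw_bit_rate, len(frames))
```

  • The product of 8000 samples per second and 14 bits per sample is 112,000 bits per second, which matches the 112 Kbits/sec serial bit rate mentioned earlier in the text.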
  • the transmit DSP 200 performs channel encoding functions, the frame energy estimation, noise estimation, speech detection, FFT, filter functions and digital speech coding/compression in accordance with the VSELP algorithm, as described above in conjunction with Figure 3.
  • a supervisory microprocessor 432 controls the overall operation of all of the components in the transceiver shown in Figure 4.
  • the filtered PCM data stream generated by transmit DSP 200 is provided for quadrature modulation and transmission.
  • an ASIC gate array 404 generates in-phase (I) and quadrature (Q) channels of information based upon the filtered PCM data stream from DSP 200.
  • the I and Q bit streams are processed by matched, low pass filters 406 and 408 and passed onto IQ mixers in balanced modulator 410.
  • a reference oscillator 412 and a multiplier 414 provide a transmit intermediate frequency (IF) .
  • the I signal is mixed with in-phase IF, and the Q signal is mixed with quadrature IF (i.e., the in-phase IF delayed by 90 degrees by phase shifter 416) .
  • the mixed I and Q signals are summed, converted "up" to an RF channel frequency selected by channel synthesizer 430, and transmitted via duplexer 420 and antenna 422 over the selected radio frequency channel.
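  • The mixing and summing steps described above can be sketched numerically. The example below assumes a cosine in-phase carrier, so the quadrature carrier (the same carrier delayed by 90 degrees) is a sine; that choice, the 1000 Hz carrier used in the usage lines, and the function name are assumptions for illustration only.

```python
import math


def quadrature_modulate(i_val: float, q_val: float,
                        if_hz: float, t: float) -> float:
    """Return the summed I/Q mixer output at time t (seconds)."""
    phase = 2 * math.pi * if_hz * t
    in_phase_if = math.cos(phase)
    # The in-phase carrier delayed by 90 degrees, equal to sin(phase).
    quadrature_if = math.cos(phase - math.pi / 2)
    return i_val * in_phase_if + q_val * quadrature_if


# At t = 0 only the I channel contributes; a quarter period later only Q does.
at_zero = quadrature_modulate(1.0, 0.0, 1000.0, 0.0)
at_quarter = quadrature_modulate(0.5, -1.0, 1000.0, 0.00025)
```

  • Because the two carriers are orthogonal, the receiver can recover the I and Q streams independently from the single summed signal.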
  • signals received via antenna 422 and duplexer 420 are down converted from the selected receive channel frequency in a mixer 424 to a first IF frequency using a local oscillator signal synthesized by channel synthesizer 430 based on the output of reference oscillator 428.
  • the output of the first IF mixer 424 is filtered and down converted in frequency to a second IF frequency based on another output from channel synthesizer 430 and demodulator
  • a receive gate array 434 then converts the second IF signal into a series of phase samples and a series of frequency samples.
  • the receive DSP 436 performs demodulation, filtering, gain/attenuation, channel decoding, and speech expansion on the received signals.
  • the processed speech data are then sent to codec 402 and converted to baseband audio signals for driving loudspeaker 438.
  • Frame energy estimator 210 determines the energy in each frame of audio signals.
  • Frame energy estimator 210 determines the energy of the current frame by calculating the sum of the squared values of each PCM sample in the frame (step 505). Since there are 160 samples per twenty millisecond frame for an 8000 samples per second sampling rate, 160 squared PCM samples are summed. Expressed mathematically, the frame energy estimate is determined according to equation 1 below: $\text{frame energy estimate} = \sum_{i=1}^{160} (\text{PCM sample}_i)^2$ (equation 1)
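  • The sum-of-squares energy described above is straightforward to compute. This is a minimal sketch; the function name is illustrative, and the frame is assumed to be a plain sequence of integer PCM sample values.

```python
def frame_energy(frame) -> int:
    """Return the sum of the squared PCM sample values in one frame."""
    return sum(sample * sample for sample in frame)


# A 160-sample frame alternating between +100 and -100.
example_frame = [100, -100] * 80
energy = frame_energy(example_frame)
```

  • Squaring makes the sign of each sample irrelevant, so the example frame has the same energy as 160 samples of constant amplitude 100.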
  • the frame energy value calculated for the current frame is stored in the on-chip RAM 202 of DSP 200 (step 510) .
  • the functions of speech detector 240 include fetching a noise estimate previously determined by noise estimator 230 from the on-chip RAM of DSP 200 (step 515) .
  • Decision block 520 anticipates the situation in which no previous noise estimate exists, as at initial start-up, and assigns a noise estimate in step 525.
  • an arbitrarily high value, e.g., 20 dB above normal speech levels, is assigned as the noise estimate in order to force an update of the noise estimate value as will be described below.
  • the frame energy determined by frame energy estimator 210 is retrieved from the on-chip RAM 202 of DSP 200 (block 530) .
  • a decision is made in block 535 as to whether the frame energy estimate exceeds the sum of the retrieved noise estimate plus a predetermined speech threshold value, as shown in equation 2 below: frame energy estimate > (noise estimate + speech threshold) (equation 2)
  • the speech threshold value may be a fixed value determined empirically to be larger than short term energy variations of typical background noise and may, for example, be set to 9 dB. In addition, the speech threshold value may be adaptively modified to reflect changing speech conditions such as when the speaker enters a noisier or quieter environment. If the frame energy estimate exceeds the sum in equation 2, a flag is set in block 570 that speech exists. If speech detector 240 detects that speech exists, then noise estimator 230 is bypassed and the noise estimate calculated for the previous frame in the digitized audio is retrieved and used as the current noise estimate. Conversely, if the frame energy estimate is less than the sum in equation 2, the speech flag is reset in block 540. Other systems for detecting speech in a current frame can also be used.
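  • The threshold test of equation 2 can be sketched as follows. Because the threshold is quoted in dB and is added to the noise estimate, this sketch assumes that both the frame energy and the noise estimate are held as dB values; that representation, the helper names, and the small floor used to avoid taking the logarithm of zero are assumptions, not details from the patent.

```python
import math


def energy_to_db(energy: float, floor: float = 1e-12) -> float:
    """Convert a linear energy value to dB, guarding against log of zero."""
    return 10.0 * math.log10(max(energy, floor))


def speech_detected(frame_energy_db: float, noise_estimate_db: float,
                    speech_threshold_db: float = 9.0) -> bool:
    """Equation 2: speech is flagged when energy exceeds noise plus threshold."""
    return frame_energy_db > noise_estimate_db + speech_threshold_db


loud = speech_detected(frame_energy_db=50.0, noise_estimate_db=40.0)
quiet = speech_detected(frame_energy_db=49.0, noise_estimate_db=40.0)
```

  • With the example 9 dB threshold, a frame 10 dB above the noise estimate is flagged as speech, while a frame exactly 9 dB above it is not, since the comparison is strict.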
  • the European Telecommunications Standards Institute (ETSI) has developed a standard for voice activity detection (VAD) in the Global System for Mobile communications (GSM) system, described in ETSI Reference RE/SMG-020632P, which is incorporated by reference.
  • VAD voice activity detection
  • GSM Global System for Mobile communications
  • This standard could be used for speech detection in the present invention and is incorporated by reference.
  • the noise estimation update routine of noise estimator 230 is executed.
  • the noise estimate is a running average of the frame energy during periods of no speech. As described above, if the initial start-up noise estimate is chosen sufficiently high, speech is not detected, and the speech flag will be reset thereby forcing an update of the noise estimate.
  • a difference/error delta (Δ) is determined in block 545 between the frame noise energy generated by frame energy estimator 210 and a noise estimate previously calculated by noise estimator 230 in accordance with the following equation:
  • Δ = current frame energy - previous noise estimate (equation 3)
  • a determination is made in decision block 550 whether Δ exceeds zero. If Δ is negative, as occurs for high values of the noise estimate, then the noise estimate is recalculated in block 560 in accordance with the following equation: noise estimate = previous noise estimate + Δ/2 (equation 4) Since Δ is negative, this results in a downward correction of the noise estimate.
  • the relatively large step size of Δ/2 is chosen to rapidly correct for decreasing noise levels.
  • If Δ exceeds zero, the noise estimate is recalculated in accordance with the following equation: noise estimate = previous noise estimate + Δ/256 (equation 5) Since Δ is positive, the noise estimate must be increased. However, a smaller step size of Δ/256 (as compared to Δ/2) is chosen to gradually increase the noise estimate and provide substantial immunity to transient noise.
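  • The asymmetric update of equations 3 to 5 can be sketched in a few lines: the estimate falls by half the difference when the frame energy is below it and rises by 1/256 of the difference when the frame energy is above it. The function name and the numeric values in the usage lines are illustrative; the update is assumed to run only on frames where no speech was detected, as the text describes.

```python
def update_noise_estimate(previous: float, frame_energy: float) -> float:
    """Apply equations 3 to 5 to one frame in which no speech was detected."""
    delta = frame_energy - previous          # equation 3
    if delta > 0:
        return previous + delta / 256        # equation 5: slow upward step
    return previous + delta / 2              # equation 4: fast downward step


# Start-up behaviour: begin with a deliberately high estimate and feed
# twenty quiet frames whose energy is 10.
estimate = 1000.0
for _ in range(20):
    estimate = update_noise_estimate(estimate, 10.0)
```

  • Starting from the deliberately high value, the estimate halves its distance to the true level on every frame and is within a fraction of a percent after twenty frames, whereas a single loud transient frame would raise it by only 1/256 of the difference.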
  • the noise estimate calculated for the current frame is outputted to the filter selector 235.
  • filter selector 235 accesses a look-up table and uses the current noise estimate to select a filter control value (Step 572) .
  • the filter circuit 115 (in Step 574) is then adjusted as a function of the selected filter control value to exhibit a frequency response curve intended to increase the amount of noise filtered as the noise estimate and background noise increases.
  • the PCM samples stored in DSP RAM are then passed through the adjusted filter circuit 265 to filter the PCM samples in order to remove noise (Step 576) .
  • the filtered PCM samples are then processed by voice coder 120 (step 578) , and the coded samples are then outputted to RF transmit circuits (Step 580) .
  • Figures 6A and 6B show examples of how the filter circuit 115 adjusts to exhibit different frequency response curves F1-F4 for different filter control signals inputted to the filter circuit 115.
  • the filter circuit 115 can be selected to exhibit a series of different frequency response curves with the frequency response curves F1-F4 having cut-off frequencies F1c-F4c, respectively.
  • the cut-off frequencies of filter circuit 115 may range in the preferred embodiment from 300 Hz to 800 Hz.
  • the filter circuit 115 is designed to exhibit frequency response curves having higher cut-off frequencies. The higher cut-off frequencies result in a larger portion of frame energy falling within the lower frequency range of speech being extracted by the filter circuit 115.
  • the filter circuit 115 can be selected to exhibit a series of different frequency response curves F1-F4 with each frequency response curve having a different slope and the same cut-off frequencies.
  • the cut-off frequency for frequency response curves F1-F4 is in the above- mentioned range.
  • the filter circuit 115 is adjusted to exhibit frequency response curves having steeper slopes. The steeper slopes result in a larger portion of frame energy falling within the lower frequency range of speech being extracted by the filter circuit 115.
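  • The effect of raising the cut-off frequency or steepening the slope can be illustrated with the magnitude response of an ideal Butterworth high-pass filter, whose squared gain is 1 / (1 + (fc/f)^(2n)) for cut-off fc and order n. The patent does not state which filter family or orders produce curves F1-F4, so the choice of Butterworth and the specific frequencies and orders below are assumptions for illustration.

```python
import math


def highpass_gain_db(freq_hz: float, cutoff_hz: float, order: int) -> float:
    """Gain in dB of an ideal Butterworth high-pass filter at freq_hz."""
    magnitude = 1.0 / math.sqrt(1.0 + (cutoff_hz / freq_hz) ** (2 * order))
    return 20.0 * math.log10(magnitude)


# Attenuation of a 200 Hz noise component under different filter settings.
low_cutoff = highpass_gain_db(200.0, cutoff_hz=300.0, order=2)
high_cutoff = highpass_gain_db(200.0, cutoff_hz=800.0, order=2)
gentle_slope = highpass_gain_db(200.0, cutoff_hz=500.0, order=2)
steep_slope = highpass_gain_db(200.0, cutoff_hz=500.0, order=4)
```

  • Both a higher cut-off at a fixed order and a higher order at a fixed cut-off attenuate the 200 Hz component more strongly, which corresponds to the two adjustment strategies of Figures 6A and 6B.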
  • the filter circuit 115 filters the current frames as a function of the noise estimate calculated for the current frame.
  • the current frame is filtered so that the noise is reduced and a major portion of the speech is passed.
  • the major portion of speech which is passed unfiltered provides for recognizable speech output with only a minimal reduction in the quality of the speech signal.
  • a combination of different cutoff frequencies and different slopes could be used for adaptively extracting selected portions of frame energy falling within a low frequency range of speech.
  • Figure 7 depicts an example look-up table accessed by filter selector 235 in order to select one of the filter response curves F1-F4 for filter circuit 115.
  • the look-up table includes a series of potential noise estimates N1-Nn and filter control values F1-Fn that correspond with potential response curves that are exhibitable by the filter circuit 115.
  • Noise estimates N1-Nn can each represent a range of noise estimates and are each matched with a particular filter control value F1-F4.
  • the filter control circuit 105 generates a filter control signal by calculating a noise estimate and retrieving from the look-up table the filter control value associated therewith.
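  • A range-based look-up of this kind can be sketched with a sorted list of range boundaries. The boundary values below are hypothetical, since the contents of the table in Figure 7 are not reproduced in this text; treating the noise estimate as a dB value and the names used here are likewise assumptions.

```python
import bisect

# Hypothetical upper bounds (in dB) of the first three noise-estimate ranges;
# anything at or above the last bound falls into the final range.
NOISE_RANGE_BOUNDS_DB = [30.0, 40.0, 50.0]
FILTER_CONTROL_VALUES = ["F1", "F2", "F3", "F4"]


def select_filter(noise_estimate_db: float) -> str:
    """Return the filter control value whose range contains the estimate."""
    index = bisect.bisect_right(NOISE_RANGE_BOUNDS_DB, noise_estimate_db)
    return FILTER_CONTROL_VALUES[index]


selected = [select_filter(n) for n in (25.0, 30.0, 45.0, 60.0)]
print(selected)
```

  • Keeping the boundaries sorted lets the selection run as a binary search, and a higher noise estimate never maps to a lower filter control value.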
  • Figures 8A & B and 9A & B show how the audio signals for two frames are each adaptively filtered to provide an improved audio signal outputted to the RF transmitter.
  • Figures 8A and 8B show a first frame and a second frame of an audio signal containing speech components s1 and s2 and noise components n1 and n2, respectively. As shown, the noise energy n1 and n2 in both frames is concentrated in a low audible frequency range, while the speech energy s1 and s2 is concentrated in a higher audible frequency range.
  • Figure 9A shows the noise signal n1 and speech signal s1 for the first frame after filtering.
  • Figure 9B shows the noise signal n2 and speech signal s2 for the second frame after filtering.
  • the adaptive audio noise reduction system 100 is designed to account for the difference in noise level between the first frame and the second frame by adjusting the filter control circuit 105 based on a calculated noise estimate for the current frame. For example, a noise estimate N1 and a spectral profile S1 are calculated by filter control circuit 105 and a filter control value of F1 is selected for the first frame.
  • the filter circuit 115 is adjusted based on filter control value F1 and exhibits a frequency response curve F1 having a cut-off frequency F1c, as shown in Figure 6A. The first frame is passed through this adjusted filter circuit 115.
  • the filter circuit 115 is selected so that a large portion of the noise n1 and only a small portion of speech s1 falls below the cut-off frequency F1c of the frequency response curve F1. This results in noise n1 being effectively filtered and only a relatively insignificant portion of speech s1 being filtered.
  • the filtered audio signal of the first frame is shown in Figure 9A.
  • in the second frame, a higher background noise is present, and assuming speech is not detected, a higher noise estimate N2 is calculated by filter control circuit 105.
  • a higher corresponding filter control value F2 is determined for the second frame based on the higher noise estimate.
  • the filter circuit 115 is adjusted in response to the higher filter control value F2 to exhibit a frequency response curve having a higher cut-off frequency F2c, as shown in Figure 6A.
  • the subsequent frame of audio signal is passed through the adjusted filter circuit 115. Because the cut-off frequency F2c of the frequency response curve F2 is higher for the subsequent frame, a larger portion of both the noise n2 and speech s2 is filtered.
  • the portion of speech s2 filtered is still relatively insignificant to the intelligibility information contained by the frame, so that there is only minimal effect on the speech.
  • the disadvantage of filtering a larger portion of the speech s2 is offset by the advantage of the increased removal of noise n2 from the second frame.
  • the filtered spectral portion of the speech does not significantly contribute to the intelligibility of the speech.
  • the filtered audio signal of the second frame is shown in Figure 9B.
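The two-frame example can be illustrated with a minimal sketch, assuming a first-order high-pass filter as a stand-in for filter circuit 115 and a small table that maps the noise estimate to a cut-off frequency. The table bounds, the cut-off values, the 8 kHz sample rate and the function names `select_cutoff` and `high_pass` are illustrative assumptions, not values taken from the description.

```python
import math

# Hypothetical table: (upper bound on noise estimate, cut-off frequency in Hz).
# The description gives no numeric values; these are placeholders.
CUTOFF_TABLE = [(100.0, 300.0), (1000.0, 450.0), (float("inf"), 600.0)]


def select_cutoff(noise_estimate):
    """Return a higher cut-off frequency for a higher noise estimate."""
    for upper_bound, cutoff_hz in CUTOFF_TABLE:
        if noise_estimate < upper_bound:
            return cutoff_hz
    return CUTOFF_TABLE[-1][1]


def high_pass(frame, cutoff_hz, sample_rate=8000.0):
    """First-order high-pass filter applied to one frame of samples.

    This is only a stand-in for filter circuit 115; the actual filter
    structure is not specified in this passage.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in frame:
        # Standard discrete RC high-pass recurrence.
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


if __name__ == "__main__":
    # A quiet frame gets the lower cut-off, a noisy frame the higher one.
    for noise_estimate in (50.0, 5000.0):
        print(noise_estimate, select_cutoff(noise_estimate))
```

With this arrangement a frame with a higher noise estimate is filtered with a higher cut-off, so more low-frequency energy (noise and some speech) is removed, which mirrors the trade-off described for the second frame.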
  • a second preferred embodiment of adaptive noise reduction system 100 is shown in Figures 10-12.
  • the filter control circuit 105 adjusts the filter circuit 115 as a function of noise profile estimates. A noise profile estimate is calculated for each frame and is compared to a reference noise profile. Based on this comparison, the filter circuit 115 is adaptively adjusted to extract varying amounts of low frequency energy from the current frame.
  • the filter control circuit 105 includes a spectral analyzer 270, in addition to frame energy estimator 210, noise estimator 230, speech detector 240, and filter selector 235 which are described with respect to the first preferred embodiment.
  • the filter control circuit 105 determines noise estimates and detects speech for the received frames as described for the first embodiment and shown in flow charts 5A and 5B.
  • the spectral analyzer 270 updates the noise profile estimate and uses the noise profile estimate in adjusting the filter circuit 115.
  • Figure 11 shows the steps performed by spectral analyzer 270 incorporated into the overall process previously described in the flow charts of Figures 5A and 5B for the first preferred embodiment.
  • the spectral analyzer 270 first determines a noise profile for the current frame (step 600) .
  • the noise profile determined for the current frame includes energy calculations for different frequencies (i.e., frequency bins) within a selected low frequency range of speech for the current frame. In the preferred embodiment, the selected frequency range is approximately 300 to 800 hertz.
  • the noise profile of the current frame can be determined by processing the current frame using a Fast Fourier Transform (FFT) having N frequency bins. Processing digital signals using an FFT is well-known in the prior art and is advantageous in that very little processing power is required where the FFT is limited to a relatively small number of frequency bins such as 32. An FFT having N frequency bins produces energy calculations at N different frequencies.
  • the energy calculations for the frequency bins falling within the selected frequency range form the noise profile for the current frame.
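As a rough illustration of step 600, the sketch below computes per-bin energies for a 32-point transform and keeps only the bins whose centre frequency falls in the 300 to 800 hertz range. It evaluates the selected bins with a direct DFT sum rather than an FFT, uses only the first 32 samples of the frame, and assumes an 8 kHz sample rate; these choices and the name `noise_profile` are assumptions made for the example.

```python
import cmath
import math


def noise_profile(frame, sample_rate=8000.0, n_bins=32, f_lo=300.0, f_hi=800.0):
    """Return {bin centre frequency: energy} for bins inside [f_lo, f_hi].

    A direct DFT sum is used for the few bins of interest; an FFT would
    give the same values for all bins at once.
    """
    samples = frame[:n_bins]
    profile = {}
    for k in range(n_bins // 2 + 1):
        freq = k * sample_rate / n_bins
        if f_lo <= freq <= f_hi:
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / n_bins)
                for n, x in enumerate(samples)
            )
            profile[freq] = abs(acc) ** 2
    return profile


if __name__ == "__main__":
    # A 500 Hz tone sampled at 8 kHz falls exactly on one of the selected bins.
    tone = [math.sin(2 * math.pi * 500.0 * n / 8000.0) for n in range(32)]
    for freq, energy in sorted(noise_profile(tone).items()):
        print(freq, round(energy, 3))
```

With 32 bins at 8 kHz the bin spacing is 250 Hz, so only the 500 Hz and 750 Hz bins fall in the selected range; a real implementation would pick the transform size and range to suit its sampling rate.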
  • the noise profile for the current frame is averaged with a noise profile estimate determined for the previous frame of the audio signal. Where no previous noise profile estimate is available, such as after initialization, a stored, initial noise profile estimate can be used.
  • each noise energy estimate e_i corresponds to an average of the energy calculations at a particular frequency in the selected frequency range over a plurality of successive frames in which no speech was detected.
  • the filter circuit 115 is adjusted on a more gradual basis.
  • the noise profile estimate can be equated to the noise profile of the current frame.
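The averaging step can be sketched as a per-bin weighted average that is updated only on frames without detected speech. The `weight` parameter and the function name are assumptions: a weight of 0.5 gives the plain average of the current profile with the previous estimate, and a weight of 1.0 corresponds to equating the estimate to the current frame's profile.

```python
def update_noise_profile_estimate(previous, current, speech_detected, weight=0.5):
    """Blend the current frame's noise profile into the running estimate.

    previous, current: per-bin energies in the same bin order.
    speech_detected:   if True, the estimate is left unchanged, since the
                       estimate is meant to average speech-free frames only.
    weight:            hypothetical smoothing weight (not from the source).
    """
    if speech_detected:
        return list(previous)
    return [(1.0 - weight) * p + weight * c for p, c in zip(previous, current)]


if __name__ == "__main__":
    # After initialization, a stored initial estimate plays the role of `previous`.
    estimate = [10.0, 4.0]
    estimate = update_noise_profile_estimate(estimate, [20.0, 8.0], speech_detected=False)
    print(estimate)
```

A smaller weight makes the estimate, and hence the filter adjustment, change more gradually from frame to frame.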
  • the energy estimates e_i of the noise profile estimate are then compared with a reference noise profile (step 604).
  • the reference energy thresholds e_ri can be determined empirically.
  • the noise energy estimates e_i are successively compared to corresponding reference energy thresholds e_ri from the highest frequency energy estimate e_1 to the lowest frequency energy estimate e_n.
  • noise energy estimate e_1 is first compared to reference noise threshold e_r1. If e_1 is greater than reference noise threshold e_r1, then a comparison value c_1 is selected and inputted into filter selector 235. If noise energy estimate e_1 is less than reference noise threshold e_r1, then noise energy estimate e_2 (which is a noise energy estimate taken at a lower frequency than e_1) is compared to reference noise threshold e_r2. If noise energy estimate e_2 is greater than reference noise threshold e_r2, then a comparison value c_2 is selected and inputted to filter selector 235.
  • the filter selector 235 uses the determined comparison value c_i to determine a filter control value.
  • the filter control value is selected from a look-up table such as that shown in Figure 12.
  • the look-up table includes a series of comparison values c_i and corresponding filter control values Fi.
  • the filter circuit 115 is adjusted as a function of the selected filter control value.
  • the filter circuit 115 is adjusted to exhibit a frequency response curve for extracting low frequency energy from the current frame.
  • the filter circuit 115 is adjusted to extract increasing amounts of low frequency energy as noise energy estimates at successively higher frequencies surpass their corresponding reference energy thresholds.
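The comparison and look-up steps can be sketched as follows. The estimates and thresholds are assumed to be ordered from the highest frequency to the lowest, the first estimate that exceeds its threshold determines the comparison index, and a table maps that index to a filter control value. The table contents (cut-off frequencies in hertz), the treatment of an estimate equal to its threshold as not exceeding it, and the function names are illustrative assumptions; Figure 12 is not reproduced here.

```python
def select_comparison_index(estimates, thresholds):
    """Return the 1-based index i of the first estimate e_i above e_ri.

    Both sequences run from the highest-frequency bin (i = 1) to the
    lowest-frequency bin (i = n). Returns None if no estimate exceeds
    its threshold. An estimate equal to its threshold is treated as not
    exceeding it, which is an assumption.
    """
    for i, (e, r) in enumerate(zip(estimates, thresholds), start=1):
        if e > r:
            return i
    return None


# Hypothetical look-up table: comparison index -> cut-off frequency in Hz.
# Noise reaching a higher frequency bin (smaller i) selects a higher cut-off.
FILTER_CONTROL_TABLE = {1: 600.0, 2: 450.0, 3: 300.0}


def select_filter_control(estimates, thresholds, table=FILTER_CONTROL_TABLE, default=None):
    """Map the comparison result to a filter control value via the table."""
    index = select_comparison_index(estimates, thresholds)
    if index is None:
        return default
    return table[index]


if __name__ == "__main__":
    thresholds = [2.0, 2.0, 2.0]
    print(select_filter_control([5.0, 1.0, 1.0], thresholds))  # noise in highest bin
    print(select_filter_control([1.0, 5.0, 9.0], thresholds))  # noise in lower bins only
```

Checking from the highest frequency downward means the most aggressive filter setting is chosen as soon as noise is found high in the selected range, consistent with extracting more low-frequency energy as noise at successively higher frequencies surpasses its threshold.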
  • Figures 6A and 6B show example frequency response curves for selected filter control values.
  • using noise profile estimates helps improve the ability to adaptively adjust the filter circuit to extract low frequency energy in a manner that improves the overall quality of speech. Since the car environment is not the only environment where a mobile telecommunications device is used, and the noise profile in certain situations could therefore be tilted more towards higher frequencies, the spectral analyzer 270 can be selectively disabled when noise energy in the low frequencies is small. Also, when a significant portion of the noise frequency spectrum resides in lower frequencies, a steeper filtering slope could be applied even though some processing power may be sacrificed. This extra processing requirement is still fairly small.
  • the adaptive noise filter system of the present invention is implemented simply and without significant increase in DSP calculations. More complex methods of reducing noise, such as "spectral subtraction," require several calculation-related MIPS and a large amount of memory for data and program code storage. By comparison, the present invention may be implemented using only a fraction of the MIPS and memory required for the spectral subtraction algorithm, which also introduces more speech distortion.
  • Reduced memory reduces the size of the DSP integrated circuits; decreased MIPS decreases power consumption. Both of these attributes are desirable for battery-powered portable/mobile radiotelephones.

EP96931552A 1995-09-14 1996-09-13 System for adaptively filtering audio signals to enhance speech intelligibility in noisy environmental conditions Expired - Lifetime EP0852052B1 (de)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US52800595A 1995-09-14 1995-09-14
US528005 1995-09-14
PCT/US1996/014665 WO1997010586A1 (en) 1995-09-14 1996-09-13 System for adaptively filtering audio signals to enhance speech intelligibility in noisy environmental conditions

Publications (2)

Publication Number Publication Date
EP0852052A1 (de) 1998-07-08
EP0852052B1 (de) 2001-06-13

Family

ID=24103874

Family Applications (1)

Application Number Title Priority Date Filing Date
EP96931552A Expired - Lifetime EP0852052B1 (de) 1995-09-14 1996-09-13 System for adaptively filtering audio signals to enhance speech intelligibility in noisy environmental conditions

Country Status (15)

Country Link
EP (1) EP0852052B1 (de)
JP (1) JPH11514453A (de)
KR (1) KR100423029B1 (de)
CN (1) CN1121684C (de)
AU (1) AU724111B2 (de)
BR (1) BR9610290A (de)
CA (1) CA2231107A1 (de)
DE (1) DE69613380D1 (de)
EE (1) EE03456B1 (de)
MX (1) MX9801857A (de)
NO (1) NO981074L (de)
PL (1) PL185513B1 (de)
RU (1) RU2163032C2 (de)
TR (1) TR199800475T1 (de)
WO (1) WO1997010586A1 (de)

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4461025A (en) * 1982-06-22 1984-07-17 Audiological Engineering Corporation Automatic background noise suppressor
US4630305A (en) * 1985-07-01 1986-12-16 Motorola, Inc. Automatic gain selector for a noise suppression system
US4811404A (en) * 1987-10-01 1989-03-07 Motorola, Inc. Noise suppression system
DE4012349A1 (de) * 1989-04-19 1990-10-25 Ricoh Kk Einrichtung zum beseitigen von geraeuschen
JP3065739B2 (ja) * 1991-10-14 2000-07-17 三菱電機株式会社 音声区間検出装置
US5412735A (en) * 1992-02-27 1995-05-02 Central Institute For The Deaf Adaptive noise reduction circuit for a sound reproduction system
JPH05259928A (ja) * 1992-03-09 1993-10-08 Oki Electric Ind Co Ltd 適応制御ノイズキャンセラ装置及び適応制御ノイズキャンセル方法
US5251263A (en) * 1992-05-22 1993-10-05 Andrea Electronics Corporation Adaptive noise cancellation and speech enhancement system and apparatus therefor
JPH0695693A (ja) * 1992-09-09 1994-04-08 Fujitsu Ten Ltd 音声認識装置用騒音低減回路
JP3270866B2 (ja) * 1993-03-23 2002-04-02 ソニー株式会社 雑音除去方法および雑音除去装置
US5485522A (en) * 1993-09-29 1996-01-16 Ericsson Ge Mobile Communications, Inc. System for adaptively reducing noise in speech signals
US5657422A (en) * 1994-01-28 1997-08-12 Lucent Technologies Inc. Voice activity detection driven noise remediator

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9710586A1 *

Cited By (63)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9646614B2 (en) 2000-03-16 2017-05-09 Apple Inc. Fast, language-independent method for user authentication by voice
US10318871B2 (en) 2005-09-08 2019-06-11 Apple Inc. Method and apparatus for building an intelligent automated assistant
US9626955B2 (en) 2008-04-05 2017-04-18 Apple Inc. Intelligent text-to-speech conversion
US9865248B2 (en) 2008-04-05 2018-01-09 Apple Inc. Intelligent text-to-speech conversion
US11080012B2 (en) 2009-06-05 2021-08-03 Apple Inc. Interface for a virtual digital assistant
US10475446B2 (en) 2009-06-05 2019-11-12 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US10795541B2 (en) 2009-06-05 2020-10-06 Apple Inc. Intelligent organization of tasks items
US10283110B2 (en) 2009-07-02 2019-05-07 Apple Inc. Methods and apparatuses for automatic speech recognition
US10706841B2 (en) 2010-01-18 2020-07-07 Apple Inc. Task flow identification based on user intent
US11423886B2 (en) 2010-01-18 2022-08-23 Apple Inc. Task flow identification based on user intent
US10049675B2 (en) 2010-02-25 2018-08-14 Apple Inc. User profiling for voice input processing
US9633660B2 (en) 2010-02-25 2017-04-25 Apple Inc. User profiling for voice input processing
US10102359B2 (en) 2011-03-21 2018-10-16 Apple Inc. Device access using voice authentication
EP2575128A2 (de) * 2011-09-30 2013-04-03 Apple Inc. Benutzung von Kontextinformation um die Verarbeitung von Kommandos in einem virtuellen Assistenten zu ermöglichen
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US9953088B2 (en) 2012-05-14 2018-04-24 Apple Inc. Crowd sourcing information to fulfill user requests
US10079014B2 (en) 2012-06-08 2018-09-18 Apple Inc. Name recognition system
US9971774B2 (en) 2012-09-19 2018-05-15 Apple Inc. Voice-based media searching
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
US9966060B2 (en) 2013-06-07 2018-05-08 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US10657961B2 (en) 2013-06-08 2020-05-19 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US9966068B2 (en) 2013-06-08 2018-05-08 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US10185542B2 (en) 2013-06-09 2019-01-22 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10169329B2 (en) 2014-05-30 2019-01-01 Apple Inc. Exemplar-based natural language processing
US9668024B2 (en) 2014-06-30 2017-05-30 Apple Inc. Intelligent automated assistant for TV user interactions
US10904611B2 (en) 2014-06-30 2021-01-26 Apple Inc. Intelligent automated assistant for TV user interactions
US9986419B2 (en) 2014-09-30 2018-05-29 Apple Inc. Social reminders
US11500672B2 (en) 2015-09-08 2022-11-15 Apple Inc. Distributed personal assistant
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
US11526368B2 (en) 2015-11-06 2022-12-13 Apple Inc. Intelligent automated assistant in a messaging environment
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US11069347B2 (en) 2016-06-08 2021-07-20 Apple Inc. Intelligent automated assistant for media exploration
US10049663B2 (en) 2016-06-08 2018-08-14 Apple Inc. Intelligent automated assistant for media exploration
US10354011B2 (en) 2016-06-09 2019-07-16 Apple Inc. Intelligent automated assistant in a home environment
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US11037565B2 (en) 2016-06-10 2021-06-15 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10733993B2 (en) 2016-06-10 2020-08-04 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US11152002B2 (en) 2016-06-11 2021-10-19 Apple Inc. Application integration with a digital assistant
US10521466B2 (en) 2016-06-11 2019-12-31 Apple Inc. Data driven natural language event detection and classification
US10089072B2 (en) 2016-06-11 2018-10-02 Apple Inc. Intelligent device arbitration and control
US10269345B2 (en) 2016-06-11 2019-04-23 Apple Inc. Intelligent task discovery
US10297253B2 (en) 2016-06-11 2019-05-21 Apple Inc. Application integration with a digital assistant
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
US10791176B2 (en) 2017-05-12 2020-09-29 Apple Inc. Synchronization and task delegation of a digital assistant
US11405466B2 (en) 2017-05-12 2022-08-02 Apple Inc. Synchronization and task delegation of a digital assistant
US10810274B2 (en) 2017-05-15 2020-10-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
US11831799B2 (en) 2019-08-09 2023-11-28 Apple Inc. Propagating context information in a privacy preserving manner
US11501758B2 (en) 2019-09-27 2022-11-15 Apple Inc. Environment aware voice-assistant devices, and related systems and methods
US12087284B1 (en) 2019-09-27 2024-09-10 Apple Inc. Environment aware voice-assistant devices, and related systems and methods

Also Published As

Publication number Publication date
EP0852052B1 (de) 2001-06-13
TR199800475T1 (xx) 1998-06-22
CA2231107A1 (en) 1997-03-20
NO981074L (no) 1998-05-13
KR19990044659A (ko) 1999-06-25
EE9800068A (et) 1998-08-17
AU724111B2 (en) 2000-09-14
RU2163032C2 (ru) 2001-02-10
NO981074D0 (no) 1998-03-11
CN1201547A (zh) 1998-12-09
BR9610290A (pt) 1999-03-16
WO1997010586A1 (en) 1997-03-20
AU7078496A (en) 1997-04-01
EE03456B1 (et) 2001-06-15
KR100423029B1 (ko) 2004-07-01
JPH11514453A (ja) 1999-12-07
MX9801857A (es) 1998-11-29
PL185513B1 (pl) 2003-05-30
DE69613380D1 (de) 2001-07-19
CN1121684C (zh) 2003-09-17
PL325532A1 (en) 1998-08-03

Similar Documents

Publication Publication Date Title
EP0852052B1 (de) System for adaptive filtering of audio signals to improve speech intelligibility in ambient noise
EP0645756B1 (de) System for adaptive reduction of noise in speech signals
EP1017042B1 (de) Noise suppression controlled by voice activity detection
US8019599B2 (en) Speech codecs
US5794199A (en) Method and system for improved discontinuous speech transmission
EP0699334B1 (de) Method and device for group coding of signals
CA2348913C (en) Complex signal activity detection for improved speech/noise classification of an audio signal
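FI116643B (fi) Noise suppression
EP0599664B1 (de) Speech coder and method for speech coding
JP2002501225A (ja) Decoding method and system comprising an adaptive postfilter
KR19990007936A (ko) Battery-powered radio transceiver with improved battery life and method of operating the same
JP2003524796A (ja) Method and apparatus for interleaving line spectral information quantization methods in a speech coder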
US5710862A (en) Method and apparatus for reducing an undesirable characteristic of a spectral estimate of a noise signal between occurrences of voice signals
WO2000007177A1 (en) Communication terminal
JP2002076960A (ja) Noise suppression method and mobile phone
EP1238479A1 (de) Method and apparatus for suppressing acoustic background noise in a communication system

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19980210

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): BE DE DK ES FI FR GB GR IT NL PT SE

17Q First examination report despatched

Effective date: 19990517

RIC1 Information provided on ipc code assigned before grant

Free format text: 7G 10L 21/02 A

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): BE DE DK ES FI FR GB GR IT NL PT SE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20010613

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20010613

Ref country code: BE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20010613

REF Corresponds to:

Ref document number: 69613380

Country of ref document: DE

Date of ref document: 20010719

ITF It: translation for an EP patent filed
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20010913

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20010913

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20010913

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20010914

Ref country code: DE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20010914

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20010917

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20011220

EN Fr: translation not filed
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20010913

26N No opposition filed
PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: NL

Payment date: 20051229

Year of fee payment: 11

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: IT

Payment date: 20070926

Year of fee payment: 12

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080401

NLV4 Nl: lapsed or annulled due to non-payment of the annual fee

Effective date: 20080401

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20080913