CN1121684C - System for adaptively filtering audio signals to enhance speech intelligibility in noisy environmental conditions

Info

Publication number: CN1121684C
Application number: CN96198008A
Authority: CN
Inventors: T·W·索尔维
Original assignee: Ericsson Inc
Current assignee: Ericsson Inc
Priority date: 1995-09-14
Filing date: 1996-09-13
Publication date: 2003-09-17
Anticipated expiration: 2016-09-13
Also published as: MX9801857A; AU7078496A; NO981074D0; EE03456B1; TR199800475T1; WO1997010586A1; BR9610290A; KR100423029B1; KR19990044659A; AU724111B2; EE9800068A; PL185513B1; PL325532A1; RU2163032C2; CA2231107A1; DE69613380D1; EP0852052B1; NO981074L; JPH11514453A; EP0852052A1

Abstract

A method and system are provided for adaptively reducing noise in frames of digitized audio signals that include both speech and background noise. Frames of digitized audio signals are passed through an adjustable, high-pass filter circuit to filter a portion of background noise located in a low frequency range of the digitized signal. The filter circuit is adjusted by a filter control circuit adapted for a current frame to exhibit a selected frequency response curve. The filter control circuit includes a speech detector for detecting the presence or absence of speech in the frames of digitized audio signals. The filter circuit is adjusted when no speech is detected in the current frame. In a first preferred embodiment, the filter control circuit controls the filter circuit by calculating a noise estimate corresponding to the background noise, and adjusting the filter circuit based on the noise estimate. As the noise estimates increase, the filter circuit is adjusted to extract increasing amounts of energy falling in low frequency ranges of speech. In a second preferred embodiment, the filter circuit is adjusted as a function of a noise profile estimate. A noise profile estimate for a current frame is determined as a function of speech detection and is compared to a reference noise profile. Based on this comparison, the filter circuit is adaptively adjusted.

Description

Method and apparatus for selectively modifying a frame of a digital signal

Technical field

This invention relates to noise reduction systems, and in particular to an adaptive speech intelligibility enhancement system for portable digital radio telephones and to a method and apparatus for selectively modifying a frame of a digital signal.

Background technology

The cellular telephone industry has made significant progress in commercial operation in the United States and other parts of the world. In major metropolitan areas, demand for cellular service is outstripping the capacity of existing systems. Assuming this trend continues, cellular radio communication will reach even the smallest rural markets. Cellular network capacity must therefore be increased while high-quality service is maintained at reasonable cost. An important step toward increased capacity is the conversion of cellular systems from analog to digital transmission. This conversion is also important because the first generation of personal communication networks (PCNs), which employ low-cost, pocket-sized cordless telephones that can be carried easily and used to make or receive calls at home, in the office, on the street, in a car, and so on, may be provided by cellular carriers using the next generation of digital cellular infrastructure.

Digital communication systems take advantage of powerful digital signal processing techniques. Digital signal processing generally refers to the mathematical or other manipulation of digitized signals. For example, after an analog signal has been converted to digital form (digitized), simple mathematical routines in a digital signal processor (DSP) can filter, amplify, and attenuate the digital signal. DSPs are generally manufactured as high-speed integrated circuits so that data processing operations can be performed essentially in real time. DSPs can also be used to reduce the bit rate of digitized speech, which reduces the spectral occupancy of the transmitted radio signal and increases system capacity. For example, if a speech signal is digitized using 14-bit linear pulse code modulation (PCM) and sampled at 8 kHz, the resulting serial bit rate is 112 kbit/s. Speech coding techniques that exploit the redundancy and other predictable characteristics of human speech can then mathematically compress the 112 kbit/s serial bit rate to 7.95 kbit/s, a bit rate reduction of about 14:1. A lower bit rate means that more bandwidth becomes available.
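The bit rate figures in the preceding paragraph follow from simple arithmetic. A minimal sketch, with variable names chosen here purely for illustration:

```python
# Illustrative arithmetic only; the variable names are not from the source.
BITS_PER_SAMPLE = 14     # 14-bit linear PCM
SAMPLE_RATE_HZ = 8000    # 8 kHz sampling rate
CODED_RATE_BPS = 7950    # 7.95 kbit/s after speech coding

pcm_rate_bps = BITS_PER_SAMPLE * SAMPLE_RATE_HZ     # uncompressed serial bit rate
compression_ratio = pcm_rate_bps / CODED_RATE_BPS   # roughly 14:1

print(pcm_rate_bps)                  # 112000
print(round(compression_ratio, 1))   # 14.1
```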

In the United States, a popular speech compression technique, adopted by the TIA as the digital standard for second-generation cellular telephone systems (i.e., IS-54), is vector sourcebook excited linear predictive (VSELP) coding. Unfortunately, when an audio signal that contains speech mixed with high levels of ambient noise (especially "colored noise") is encoded/compressed using VSELP, the result may contain undesirable audio characteristics. For example, if a digital mobile telephone is used in a noisy environment (for example, inside a moving vehicle), both the ambient noise and the desired speech are compressed by the VSELP coding algorithm and transmitted to a base station, where the compressed signal is decoded and reconstructed into audible speech. When the background noise is reconstructed into analog form, undesirable audible distortion of the noise, and occasionally of the speech, is introduced. This distortion is annoying to the average listener.

Much of this distortion is caused by the environment in which mobile telephones are used. Mobile telephones are typically used inside vehicles, where there is often ambient noise generated by the vehicle's engine and by surrounding traffic. The ambient noise inside the vehicle is usually concentrated in the low frequency range, and its amplitude can vary with factors such as the speed and acceleration of the vehicle and the amount of surrounding traffic. This low-frequency noise also tends to severely reduce the intelligibility of speech from a talker inside the vehicle. The loss of intelligibility caused by low-frequency noise may be especially pronounced in communication systems that employ a VSELP vocoder, but it can also occur in systems that do not.

The effect of ambient noise on a mobile telephone may also depend on how the telephone is used. In particular, a mobile telephone may be used hands-free, with the user speaking toward a telephone mounted in a cradle. This frees the user's hands for driving, but it also increases the distance that the user's speech must travel before reaching the microphone input of the telephone. This increased distance between user and telephone, combined with the varying ambient noise, can cause noise to make up a significant portion of the total power spectral energy of the audio signal input to the telephone.

Descriptions of the prior art are found in EP 0 645 756, EP 0 558 312, EP 0 665 530, DE 4 012 349, and U.S. Patent Nos. 4,811,404, 4,461,025, and 5,251,263, all of which describe ways of filtering out undesired signal content.

In theory, various digital signal processing algorithms could be implemented on a DSP to filter VSELP-encoded background noise. However, these solutions often require significant DSP overhead, measured in millions of instructions per second (MIPS), which consumes valuable processing time, memory space, and power. In a portable radiotelephone, each of these signal processing resources is limited. Therefore, simply increasing the processing burden on the DSP is not the best way to minimize VSELP-encoded background noise and other types of background noise.

Brief summary of the invention

This invention provides an adaptive noise reduction system that reduces the undesirable effects of encoded background noise while minimizing both any negative effect on encoded speech quality and any increase in the consumption of digital signal processing resources. The method and system of this invention increase the intelligibility of speech in a digitized audio signal by passing frames of the digitized audio signal through a filter circuit. The filter circuit acts as an adjustable high-pass filter: it filters out the portion of the digitized signal in the low frequency range and passes the portion in the higher frequency range. Because noise in a vehicle tends to be concentrated in the low frequency range, and only a small part of the intelligibility content of speech falls in that range, the filter circuit removes most of the noise in the digitized audio signal while removing only a non-critical part of the speech. As a result, a relatively larger fraction of the noise energy is removed than of the speech energy. By adaptively adjusting and selecting the frequency response curve of the filter circuit, the amount of speech that is filtered out is limited, so the intelligibility of the speech output by the radio is minimally affected.

A filter control circuit adjusts the filter circuit to exhibit different frequency response curves as a function of a noise estimate and/or spectral profile corresponding to the noise in the audio signal. The noise estimate and/or spectral profile is adjusted on a frame-by-frame basis as a function of speech detection. If no speech is detected, the noise estimate and/or spectral profile is updated for the current frame. If speech is detected, the noise estimate and/or spectral profile is not adjusted.

In a first embodiment, the filter control circuit calculates a noise estimate for the digitized audio signal frame. This noise estimate corresponds to the amount of background noise in the frame. The noise estimate increases as the background noise in the low frequency range of speech increases relative to the speech. As the noise estimate increases, the filter control circuit uses it to adjust the filter circuit to filter out a larger portion of the low frequency range of speech. When there is no background noise, no speech signal is filtered out. At higher noise levels, a larger portion of the noise and speech information is extracted. Because noise tends to be concentrated in the low frequency range and only a relatively small part of the intelligibility content of speech falls in that range, the overall intelligibility of the audio signal can be improved by filtering out more low-frequency energy as the noise estimate increases.

In a second embodiment, a modified filter control circuit adjusts the filter circuit to exhibit different frequency response curves as a function of a noise profile, that is, a profile of noise estimates over a selected frequency range of the audio signal. This filter control circuit includes a spectrum analyzer that determines a noise profile estimate as a function of speech detection. A noise profile estimate is determined for the current frame and compared with a reference noise profile. Based on this comparison, the filter circuit is adaptively adjusted to extract varying amounts of low-frequency energy from the current frame.

The adaptive noise reduction system according to this invention can be applied advantageously to a radio communication system in which portable/mobile radio transceivers communicate over RF channels with each other and with fixed telephone line subscribers. Each transceiver includes an antenna, a receiver for converting radio signals received over an RF channel via the antenna into analog audio signals, and a transmitter. The transmitter includes a coder-decoder (codec) for digitizing the analog audio signal to be transmitted into frames of digitized speech information, which include both speech and background noise. A digital signal processor processes the current frame, based on a background noise estimate and on speech detection in the current frame, to minimize background noise. A modulator modulates an RF carrier with the processed frames of digitized speech information for subsequent transmission via the antenna.

Brief description of the drawings

The features and advantages of this invention will be readily apparent to those skilled in the art from the following written description, read in conjunction with the drawings.

Fig. 1 is a general functional block diagram of this invention;

Fig. 2 illustrates the frame and slot structure of the U.S. digital standard IS-54 for cellular radio communication;

Fig. 3 is a block diagram of a first preferred embodiment of this invention implemented using a digital signal processor;

Fig. 4 is a functional block diagram of an exemplary embodiment of this invention as applied to one of a plurality of portable radio transceivers in a radio communication system;

Figs. 5A and 5B are a flowchart illustrating the functions/operations performed by the digital signal processor in implementing the first preferred embodiment of this invention;

Fig. 6A is a first example graph illustrating the attenuation-versus-frequency characteristics of the filter circuit according to the first preferred embodiment of this invention;

Fig. 6B is a second example graph illustrating the attenuation-versus-frequency characteristics of the filter circuit according to the first preferred embodiment of this invention;

Fig. 7 is an example lookup table that can be accessed by the filter control circuit in the first preferred embodiment of this invention;

Figs. 8A and 8B illustrate the amplitude-versus-frequency characteristics of example input audio signals;

Figs. 9A and 9B illustrate the amplitude-versus-frequency characteristics of the input audio signals of Figs. 8A and 8B, respectively, after filtering by the filter circuit of this invention;

Fig. 10 is a block diagram of a second preferred embodiment of this invention implemented using a digital signal processor;

Fig. 11 is a flowchart, corresponding to the flowchart of Fig. 5B, illustrating the functions/operations performed by the digital signal processor in implementing the second preferred embodiment of this invention; and

Fig. 12 is an example lookup table that can be accessed by the filter control circuit in the second preferred embodiment of this invention.

Detailed description of the drawings

In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular circuits, circuit components, techniques, flowcharts, and so on, in order to provide a thorough understanding of this invention. However, it will be apparent to those skilled in the art that this invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known methods, devices, and circuits are omitted so as not to obscure the description of this invention with unnecessary detail.

Fig. 1 is a general block diagram of the adaptive noise reduction system 100 according to this invention. The adaptive noise reduction system 100 includes a filter control circuit 105 connected to a filter circuit 115. The filter control circuit 105 generates a filter control signal for the current frame of the digitized audio signal. The filter control signal is output to the filter circuit 115, which is adjusted in response to exhibit a high-pass frequency response curve selected on the basis of the filter control signal. The adjusted filter circuit 115 filters the current frame of the digitized audio signal. The filtered signal is processed by a vocoder 120 to produce a coded signal representing the digitized audio signal.

In an exemplary application of this invention to portable/mobile radiotelephone transceivers in a cellular radio communication system, Fig. 2 illustrates the time division multiple access (TDMA) frame structure employed by the IS-54 standard for digital cellular radio communication. A "frame" is a 20-millisecond time period that includes a transmit block TX, a receive block RX, and a signal strength measurement block used for mobile-assisted handoff (MAHO). The two consecutive frames shown in Fig. 2 are transmitted within a 40-millisecond time period. Digitized speech and background noise information is processed and filtered on a frame-by-frame basis, as further described below.

Preferably, the functions of the filter control circuit 105, the filter circuit 115, and the vocoder 120 of Fig. 1 are implemented with a high-speed digital signal processor. One suitable digital signal processor is the TMS320C53 DSP available from Texas Instruments. The TMS320C53 DSP includes, on a single integrated chip, a 16-bit microprocessor, on-chip RAM for storing data such as speech frames to be processed, and ROM for storing various data processing algorithms, including the VSELP speech compression algorithm and the other algorithms, described below, that implement the functions performed by the filter control circuit 105 and the filter circuit 115.

A first embodiment of this invention is shown in Fig. 3. In the first embodiment, the filter circuit 115 is adjusted as a function of a background noise estimate determined by the filter control circuit. Frames of pulse code modulated (PCM) audio information are stored sequentially in the on-chip RAM of the DSP. Other digitization techniques can be used to digitize the audio information. Each PCM frame is retrieved from the on-chip RAM of the DSP, processed by a frame energy estimator 210, and then stored temporarily in a temporary frame memory 220. The energy of the current frame, as determined by the frame energy estimator 210, is supplied to the noise estimator 230 and speech detector 240 function blocks. The speech detector 240 indicates that speech is present in the current frame when the frame energy estimate exceeds the sum of the previous noise estimate and a speech threshold. If the speech detector determines that no speech is present, the digital signal processor 200 calculates an updated noise estimate as a function of the current noise estimate and the current frame energy.

The updated noise estimate is output to a filter selector 235, which generates a filter control signal based on the noise estimate. In the preferred embodiment, the filter selector 235 consults a lookup table when generating the filter control signal. The lookup table contains a series of filter control values, each matched to a noise estimate or a range of noise estimates. A filter control value is selected from the lookup table on the basis of the updated noise estimate and is represented by a filter control signal that is output to the filter bank 265 of the filter circuit 115. To stabilize this process and avoid continual switching between different filters, a hold time of N frames is imposed on the selection of a new filter: a new filter can be selected only once every N frames, where N is an integer greater than 1 and preferably greater than 10.
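The table lookup and the N-frame hold time can be sketched as follows. This is only an illustration under stated assumptions: the class name, the threshold values, the decibel units, and the labels "F1" to "F4" are hypothetical and are not taken from the lookup table of Fig. 7.

```python
import bisect


class FilterSelector:
    """Hypothetical lookup-table filter selector with an N-frame hold time."""

    def __init__(self, thresholds, control_values, hold_frames=16):
        # Noise estimates below thresholds[0] map to control_values[0];
        # estimates in [thresholds[i-1], thresholds[i]) map to control_values[i];
        # estimates at or above the last threshold map to control_values[-1].
        if len(control_values) != len(thresholds) + 1:
            raise ValueError("need exactly one more control value than thresholds")
        self.thresholds = list(thresholds)
        self.control_values = list(control_values)
        self.hold_frames = hold_frames
        self.current = self.control_values[0]
        self.frames_since_switch = hold_frames  # permit a switch on the first frame

    def update(self, noise_estimate):
        """Return the filter control value to apply to the current frame."""
        self.frames_since_switch += 1
        index = bisect.bisect_right(self.thresholds, noise_estimate)
        candidate = self.control_values[index]
        # Switch only if the hold time has elapsed since the last switch.
        if candidate != self.current and self.frames_since_switch >= self.hold_frames:
            self.current = candidate
            self.frames_since_switch = 0
        return self.current


# Usage with made-up thresholds (in dB) and a short hold time of 3 frames.
selector = FilterSelector([40.0, 50.0, 60.0], ["F1", "F2", "F3", "F4"], hold_frames=3)
choices = [selector.update(n) for n in (45.0, 65.0, 65.0, 65.0)]
print(choices)  # ['F2', 'F2', 'F2', 'F4']
```

With the hold time in place, a sudden jump in the noise estimate changes the selected filter only after N frames have passed since the previous switch, which is the stabilizing behavior described above.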

The filter circuit 115 is adjusted in response to the filter control signal to exhibit a high-pass frequency response curve corresponding to the input filter control signal and the noise estimate. Various types of filter circuits that are well known in the art can be used to exhibit the frequency response curve selected by the filter control signal. These prior-art filters include IIR filters such as Butterworth, Chebyshev, or elliptic filters. FIR filters can also be used, but IIR filters are preferred because of their lower processing requirements.
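As one possible illustration of an adjustable IIR high-pass filter, the sketch below builds a second-order Butterworth high-pass section with the standard bilinear-transform biquad formulas and applies it to a block of samples. The function names, the filter order, and the 500 Hz test cutoff are assumptions made for this example; the source does not specify the filter design.

```python
import math


def butterworth_highpass_coeffs(cutoff_hz, sample_rate_hz=8000.0):
    """Second-order Butterworth high-pass coefficients (bilinear transform)."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / math.sqrt(2.0)  # sin(w0) / (2 * Q) with Q = 1 / sqrt(2)
    a0 = 1.0 + alpha
    b = [(1.0 + cos_w0) / (2.0 * a0), -(1.0 + cos_w0) / a0, (1.0 + cos_w0) / (2.0 * a0)]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def iir_filter(samples, b, a):
    """Direct-form I filtering; the filter state starts at zero for simplicity."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def tone_gain(freq_hz, cutoff_hz, sample_rate_hz=8000.0, n=1600):
    """Steady-state RMS gain of the filter for a sine tone (second half of the output)."""
    tone = [math.sin(2.0 * math.pi * freq_hz * k / sample_rate_hz) for k in range(n)]
    b, a = butterworth_highpass_coeffs(cutoff_hz, sample_rate_hz)
    filtered = iir_filter(tone, b, a)

    def rms(values):
        return math.sqrt(sum(v * v for v in values) / len(values))

    return rms(filtered[n // 2:]) / rms(tone[n // 2:])


# A 100 Hz tone is strongly attenuated and a 2 kHz tone passes almost unchanged.
low_gain = tone_gain(100.0, cutoff_hz=500.0)
high_gain = tone_gain(2000.0, cutoff_hz=500.0)
```

In a frame-by-frame system the filter state would normally be carried over from one frame to the next rather than reset to zero; that detail is omitted here to keep the sketch short.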

The filtered signal is processed by the vocoder 120, which compresses the bit rate of the filtered signal. In the preferred embodiment, the vocoder 120 encodes the audio signal using vector sourcebook excited linear predictive (VSELP) coding. Other speech coding techniques and algorithms can also be used, such as code excited linear predictive (CELP) coding, residual pulse excited linear predictive (RPE-LTP) coding, and improved multiband excitation (IMBE) coding. By filtering frames of the audio signal according to this invention before speech coding, background noise is minimized, which substantially reduces any undesirable noise effects in the speech when it is reconstructed. It also prevents the speech from being "drowned out" by low-frequency noise.

The digital signal processor 200 described in connection with Fig. 3 can be used, for example, in the transceiver of a digital portable/mobile radiotelephone in a radio communication system. Fig. 4 illustrates such a digital radio transceiver, which can be used in a cellular radio communication network.

An audio signal containing speech and background noise is input from a microphone 400 to a coder-decoder (codec) 402, which is preferably an application-specific integrated circuit (ASIC). The band-limited audio signal detected at the microphone 400 is sampled by the codec 402 at a rate of 8000 samples per second and blocked into frames. Accordingly, each 20-millisecond frame contains 160 speech samples. These samples are quantized and converted into a coded digital format such as 14-bit linear PCM. Once the 160 samples of digitized speech for the current frame are stored in the on-chip RAM of the transmit DSP 200, the transmit DSP 200 performs, as described above in connection with Fig. 3, channel coding, frame energy estimation, noise estimation, speech detection, FFT, filtering, and digital speech coding/compression according to the VSELP algorithm.
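The frame length follows directly from the sampling rate and the frame duration. A minimal sketch of blocking a sample stream into frames, assuming (hypothetically) that a trailing partial frame is simply dropped:

```python
SAMPLE_RATE_HZ = 8000
FRAME_MS = 20
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 160 samples per 20 ms frame


def block_into_frames(samples, frame_len=FRAME_LEN):
    """Split a sample sequence into consecutive full frames.

    Dropping a trailing partial frame is an assumption made for this sketch.
    """
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


frames = block_into_frames(list(range(400)))
print(FRAME_LEN, len(frames))  # 160 2
```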

A supervisory microprocessor 432 controls the overall operation of all the components of the transceiver shown in Fig. 4. The filtered PCM data stream produced by the transmit DSP 200 is provided for quadrature modulation and transmission. To this end, an ASIC gate array 404 generates in-phase (I) and quadrature (Q) information channels based on the filtered PCM data stream from the DSP 200. The I and Q bit streams are processed by matched low-pass filters 406 and 408 and passed to the IQ mixers in a balanced modulator 410. A reference oscillator 412 and a multiplier 414 provide a transmit intermediate frequency (IF). The I signal is mixed with the in-phase IF, and the Q signal is mixed with the quadrature IF (that is, the in-phase IF delayed by 90 degrees by a phase shifter 416). The mixed I and Q signals are summed, converted up to the RF channel frequency selected by a channel synthesizer 430, and transmitted on the selected radio frequency channel via a duplexer 420 and an antenna 422.

On the receive side, signals received via the antenna 422 and the duplexer 420 are down-converted in a mixer 424 from the selected receive channel frequency to a first IF frequency, using a local oscillator signal synthesized by the channel synthesizer 430 based on the output of a reference oscillator 428. The output of the first IF mixer 424 is filtered and down-converted to a second IF frequency based on another output of the channel synthesizer 430 and a demodulator 426. A receive gate array 434 then converts the second IF signal into a series of phase samples and a series of frequency samples. A receive DSP 436 performs demodulation, filtering, gain/attenuation, channel decoding, and speech expansion on the received signal. The processed speech data is then sent to the codec 402 and converted into a baseband audio signal that drives a loudspeaker 438.

The functions and operations performed by the digital signal processor 200 to implement the filter control circuit 105, the filter circuit 115, and the vocoder 120 will now be described in conjunction with the flowchart of Figs. 5A and 5B. The frame energy estimator 210 determines the energy of each frame of the audio signal. It determines the energy of the current frame by calculating the sum of the squared values of the PCM samples in the frame (step 505). Because each 20-millisecond frame contains 160 samples at a sampling rate of 8000 samples per second, 160 squared PCM samples are summed. Expressed mathematically, the frame energy estimate is determined according to Equation 1:

Frame energy estimate = Σ (i = 1 to 160) x(i)² (Equation 1)

where x(i) denotes the i-th PCM sample of the frame.

The frame energy value calculated for the current frame is stored in the on-chip RAM 202 of the DSP 200 (step 510).
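The frame energy computation of Equation 1 can be sketched in a few lines. The function names are illustrative, and the decibel conversion is an assumption added here only because the thresholds elsewhere in the text are quoted in dB; the source does not state how energies are scaled.

```python
import math


def frame_energy(frame):
    """Sum of the squared PCM samples in one frame (Equation 1)."""
    return sum(int(s) * int(s) for s in frame)


def energy_db(energy, floor=1.0):
    """Energy on a decibel scale (assumed convention); `floor` avoids log10(0)."""
    return 10.0 * math.log10(max(energy, floor))


# A 160-sample frame with constant amplitude 100 has energy 160 * 100**2.
example_energy = frame_energy([100] * 160)
print(example_energy)  # 1600000
```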

The functions of the speech detector 240 include retrieving, from the on-chip RAM 202 of the DSP 200, a noise estimate previously determined by the noise estimator 230 (step 515). Of course, when the transceiver is first powered up, no noise estimate exists. Decision block 520 anticipates this situation, and a noise estimate is assigned in step 525. Preferably, an arbitrarily high value, for example 20 dB above normal speech levels, is assigned as the noise estimate in order to force an update of the noise estimate, as described below. The frame energy determined by the frame energy estimator 210 is retrieved from the on-chip RAM 202 of the DSP 200 (block 530). In block 535, it is determined whether the frame energy estimate exceeds the sum of the retrieved noise estimate and a predetermined speech threshold, as expressed by Equation 2 below:

Frame energy estimate > (noise estimate + speech threshold) (Equation 2)

The speech threshold can be a fixed value, determined empirically to be larger than the short-term energy variance of typical background noise, and can be set to, for example, 9 dB. Alternatively, the speech threshold can be adapted to reflect changing speech conditions, for example when the talker moves into a noisier or quieter environment. If the frame energy estimate exceeds the sum in Equation 2, a flag is set in block 570 to indicate that speech is present. If the speech detector 240 detects that speech is present, the noise estimator 230 is bypassed, and the noise estimate calculated for the previous frame of digitized audio is retrieved and used as the current noise estimate. Conversely, if the frame energy estimate is less than the sum in Equation 2, the speech flag is cleared in block 540.
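The comparison of Equation 2 amounts to a one-line test. The sketch below assumes that the frame energy and the noise estimate are both expressed in dB, so that the 9 dB example threshold can be added directly; the function and parameter names are illustrative.

```python
SPEECH_THRESHOLD_DB = 9.0  # example fixed threshold mentioned in the text


def speech_present(frame_energy_db, noise_estimate_db, threshold_db=SPEECH_THRESHOLD_DB):
    """Equation 2: flag speech when the frame energy exceeds noise estimate + threshold."""
    return frame_energy_db > noise_estimate_db + threshold_db


print(speech_present(60.0, 45.0))  # True, since 60 > 45 + 9
print(speech_present(50.0, 45.0))  # False, since 50 <= 45 + 9
```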

Other systems for detecting speech in the current frame can also be used. For example, the European Telecommunications Standards Institute (ETSI) has developed a standard for voice activity detection (VAD) in the Global System for Mobile communications (GSM), described in ETSI reference RE/SMG-020632P, which is incorporated herein by reference.

If no speech is present, the noise estimate update routine of the noise estimator 230 is executed. During periods when no speech is present, the noise estimate is essentially a running average of the frame energy. As described above, if the initial power-up noise estimate is chosen high enough, no speech is detected, and the speech flag is therefore cleared, forcing an update of the noise estimate.

In the noise estimation routine executed by the noise estimator 230, a difference/error (Δ) is determined in block 545 between the frame energy produced by the frame energy estimator 210 and the noise estimate previously calculated by the noise estimator 230, according to the equation:

Δ = current frame energy - previous noise estimate (Equation 3)

Decision block 550 determines whether Δ exceeds 0. If Δ is negative, as occurs when the noise estimate is too high, the noise estimate is recalculated in block 560 according to the equation:

Noise estimate = previous noise estimate + Δ/2 (Equation 4)

Because Δ is negative, this corrects the noise estimate downward. The relatively large step size of Δ/2 is chosen here to correct quickly for decreasing noise levels. However, if the frame energy exceeds the noise estimate, giving a Δ greater than 0, the noise estimate is updated in block 555 according to the equation:

Noise estimate = previous noise estimate + Δ/256 (Equation 5)

Because Δ is positive, the noise estimate must increase. However, the smaller step size of Δ/256 (compared with Δ/2) is chosen here so that the noise estimate increases gradually and is substantially immune to transient noise.
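The asymmetric update of Equations 3 to 5 can be sketched as follows. The function name and the numeric values in the usage example are hypothetical; any consistent energy scale can be used.

```python
def update_noise_estimate(noise_estimate, frame_energy, speech_flag):
    """Return the noise estimate for the current frame (Equations 3 to 5)."""
    if speech_flag:
        return noise_estimate              # estimator bypassed while speech is present
    delta = frame_energy - noise_estimate  # Equation 3
    if delta < 0:
        return noise_estimate + delta / 2  # Equation 4: fast downward correction
    return noise_estimate + delta / 256    # Equation 5: slow upward correction


# Start-up behavior: an arbitrarily high initial estimate is pulled down quickly
# toward the actual frame energy (here a constant 10.0) over successive frames.
estimate = 1000.0
for _ in range(20):
    estimate = update_noise_estimate(estimate, 10.0, speech_flag=False)
```

The fast downward step lets the estimate recover quickly from the deliberately high power-up value, while the slow upward step keeps a short burst of loud noise from raising the estimate appreciably.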

The noise estimate calculated for the current frame is output to the filter selector 235. In the first preferred embodiment, the filter selector 235 consults a lookup table and uses the current noise estimate to select a filter control value (step 572). The filter circuit 115 is then adjusted (step 574), as a function of the selected filter control value, to exhibit a frequency response curve that is designed to increase the amount of noise filtered out as the noise estimate and the background noise increase. The PCM samples stored in the DSP RAM are then filtered by the adjusted filter circuit 265 to remove noise (step 576). The filtered PCM samples are then processed by the vocoder 120 (step 578), and the coded samples are output to the RF transmit circuitry (step 580).

Figs. 6A and 6B give several examples of how the filter circuit can be adjusted to exhibit different frequency response curves F1-F4 for different filter control signals input to the filter circuit 115. As shown in Fig. 6A, the filter circuit 115 can be selected to exhibit a series of different frequency response curves F1-F4 having cutoff frequencies F1c-F4c, respectively. In the preferred embodiment, the cutoff frequency of the filter circuit 115 can range from 300 Hz to 800 Hz. As the noise estimate increases, the filter circuit 115 is adjusted to exhibit a frequency response curve with a higher cutoff frequency. The higher cutoff frequency causes the filter circuit 115 to extract a larger portion of the frame energy that falls in the low frequency range of speech.

Same, shown in Fig. 6 B, filter circuit 115 can be selected to show a series of different frequency response curve F1-F4, and each frequency response curve has the different gradients and identical cutoff frequency.In the scope that the cutoff frequency of frequency response curve F1-F4 is mentioned in the above.When noise estimation value increased, filter circuit 115 was adjusted to show to have the more frequency response curve of steep gradient.This steeper gradient causes more, and major part drops on interior filtered device circuit 115 extractions of frame energy of low frequency ranges of voice.

Filter circuit 115 comes the filtering present frame with the form of a certain Noise Estimation value function, and noise estimation value is wherein calculated for present frame.The filtered noise that makes of present frame has been passed through the major part of voice by reduction.Do not provided discernible voice output by filtering and the voice major part passed through, quality of speech signal has only very little reduction.The combination of different cutoff frequencys and different gradient can be used to extract adaptively the part of selecting that drops on the interior frame energy of voice low-frequency range.
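The effect of raising the cutoff frequency can be illustrated with a minimal sketch. The specification does not state the filter topology, so a first-order high-pass filter is used here purely as a stand-in; the 8 kHz sampling rate, the 160-sample frame and the 200 Hz test tone are likewise assumptions made for illustration.

```python
import math

def high_pass(samples, cutoff_hz, sample_rate_hz=8000.0):
    """First-order high-pass filter (illustrative stand-in for filter circuit 115)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        # Standard discrete RC high-pass recurrence.
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out

def energy(samples):
    """Sum of squared samples."""
    return sum(s * s for s in samples)

# A 200 Hz tone standing in for low-frequency noise (assumed 20 ms frame at 8 kHz).
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(160)]
e_300 = energy(high_pass(tone, 300.0))  # lowest cutoff in the stated range
e_800 = energy(high_pass(tone, 800.0))  # highest cutoff in the stated range
```

In this sketch the 800 Hz cutoff leaves less of the 200 Hz tone than the 300 Hz cutoff does, which mirrors the behaviour described for curves with higher cutoff frequencies. A steeper slope, as in Fig. 6B, would require a higher-order filter and is not shown.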

Fig. 7 depicts an example lookup table that is read by the filter selector 235 to select one of the filter response curves F1-F4 for the filter circuit 115. The lookup table contains a series of possible noise estimates N1-Nn and filter control values F1-Fn, which correspond to the possible response curves exhibited by the filter circuit 115. Each of the noise estimates N1-Nn may represent a range of noise estimates, and each is matched with a particular filter control value F1-F4. The filter control circuit 105 generates a filter control signal by calculating a noise estimate and retrieving the associated filter control value from the lookup table.
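A lookup of this kind can be sketched as a range search. The range boundaries below are invented for illustration, since the contents of Fig. 7 are not reproduced in the text; only the labels F1-F4 come from the description.

```python
from bisect import bisect_left

# Hypothetical inclusive upper bounds of the noise estimate ranges N1..N3.
# Any estimate above the last bound falls into the final range N4.
NOISE_RANGE_UPPER_BOUNDS = [100.0, 400.0, 1600.0]
FILTER_CONTROL_VALUES = ["F1", "F2", "F3", "F4"]

def select_filter_control(noise_estimate):
    """Return the filter control value whose noise range contains the estimate."""
    index = bisect_left(NOISE_RANGE_UPPER_BOUNDS, noise_estimate)
    return FILTER_CONTROL_VALUES[index]
```

For example, `select_filter_control(250.0)` falls into the second range and yields `"F2"` under these assumed boundaries.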

Figs. 8A and 8B and Figs. 9A and 9B show how each of two frames of the audio signal is adaptively filtered to provide an improved audio signal that is output to the RF transmitter. Figs. 8A and 8B show a first frame and a second frame of the audio signal containing speech components s1, s2 and noise components n1, n2, respectively. As shown, the noise energy n1 and n2 in both frames is concentrated in the low audio frequency range, while the speech energy s1 and s2 is concentrated in the higher audio frequency range. Fig. 9A shows the noise signal n1 and the speech signal s1 of the first frame after filtering. Fig. 9B shows the noise signal n2 and the speech signal s2 of the second frame after filtering.

As discussed, the adaptive audio noise reduction system 100 is designed to account for the difference in noise level between the first frame and the second frame by adjusting the filter control circuit 105 based on the noise estimate calculated for the current frame. For example, the filter control circuit 105 calculates a noise estimate N1 and a spectral envelope s1 and selects a filter control value F1 for the first frame. In the preferred embodiment, based on the filter control value F1, the filter circuit 115 is adjusted to exhibit a frequency response curve having a cutoff frequency F1c, as shown in Fig. 6A. The first frame is then passed through the adjusted filter circuit 115. The filter circuit 115 is selected so that most of the noise n1 and only a small portion of the speech s1 fall below the cutoff frequency F1c of the frequency response curve F1. As a result, the noise n1 is effectively filtered out, while only a relatively unimportant portion of the speech s1 is filtered out. The filtered first frame of the audio signal is shown in Fig. 9A.

In the second frame, shown in Fig. 8B, the background noise is higher and, assuming that no speech is detected, the filter control circuit 105 calculates a higher noise estimate n2. Based on this higher noise estimate, a correspondingly higher filter control value F2 is determined for the second frame. In the first preferred embodiment, the filter circuit 115 is adjusted according to the higher filter control value F2 to exhibit a frequency response curve having a higher cutoff frequency F2c, as shown in Fig. 6A. A subsequent frame of the audio signal is then passed through the adjusted filter circuit 115. Because the cutoff frequency F2c of the frequency response curve F2 is higher for the subsequent frame, a larger portion of both the noise n2 and the speech s2 is filtered out. The portion of the speech s2 that is filtered out is, however, still relatively insignificant compared with the intelligibility information contained in the frame, so the effect on the speech is only slight. The disadvantage of filtering out a larger portion of the speech s2 is offset by the advantage of removing more of the noise n2 in the second frame. The portion of the speech spectrum that is filtered out does not contribute significantly to the intelligibility of the speech. The filtered audio signal of the second frame is shown in Fig. 9B.

A second preferred embodiment of the adaptive noise reduction system 100 is shown in Figs. 10-12. In the second preferred embodiment, the filter control circuit 105 adjusts the filter circuit 115 as a function of a noise envelope estimate. A noise envelope estimate is calculated for each frame and compared with a reference noise envelope. Based on this comparison, the filter circuit 115 is adaptively adjusted to extract varying amounts of low frequency energy from the current frame.

Referring to Fig. 10, the DSP 200 is shown configured according to the second preferred embodiment. As shown, in addition to the frame energy estimator 210, noise estimator 230, speech detector 240 and filter selector 235 described with reference to the first preferred embodiment, the filter control circuit 105 includes a spectrum analyzer 270. As described for the first embodiment and shown in the flowcharts of Figs. 5A and 5B, the filter control circuit 105 determines a noise estimate for each received frame and detects the presence of speech. When no speech is detected for the current frame, the spectrum analyzer 270 updates the noise envelope estimate, and this estimate is used in adjusting the filter circuit 115.

Referring to Fig. 11, the steps for updating the noise envelope estimate and adjusting the filter circuit 115 are shown. Fig. 11 shows the steps performed by the spectrum analyzer 270; these steps are incorporated into the overall process described above in the flowcharts of Figs. 5A and 5B for the first preferred embodiment.

If no speech is detected in the current frame, the spectrum analyzer 270 first determines a noise envelope for the current frame (step 600). The noise envelope determined for the current frame comprises energy calculations at different frequencies (i.e., frequency bins) located within a selected low frequency range of speech. In the preferred embodiment, the selected frequency range is approximately 300 to 800 Hz. The noise envelope of the current frame can be determined by processing the current frame with a fast Fourier transform (FFT) having N frequency bins. Processing digital signals with an FFT is well known in the art; an FFT is advantageous in that, when limited to a relatively small number of frequency bins, for example 32, it requires very little processing power. An FFT having N frequency bins produces energy calculations at N different frequencies. The energy calculations for the frequency bins falling within the selected frequency range form the noise envelope of the current frame.
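The following sketch shows one way the per-bin energies in the selected range could be computed. A direct discrete Fourier transform is evaluated only at the bins of interest, standing in for the FFT named in the text; the 8 kHz sampling rate and the 32-sample frame are assumptions, and the function name is hypothetical. The result is ordered from the highest frequency to the lowest, matching the ordering later used for the noise energy estimates.

```python
import cmath
import math

def noise_envelope(frame, sample_rate_hz=8000.0, low_hz=300.0, high_hz=800.0):
    """Return (frequency, energy) pairs for bins in the selected range, highest first."""
    n = len(frame)
    envelope = []
    for k in range(n // 2 + 1):
        freq = k * sample_rate_hz / n
        if low_hz <= freq <= high_hz:
            # DFT coefficient at bin k, computed directly (stand-in for an FFT).
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(frame))
            envelope.append((freq, abs(coeff) ** 2))
    envelope.reverse()  # highest frequency first
    return envelope

# A 500 Hz tone over an assumed 32-sample frame at 8 kHz.
frame = [math.sin(2 * math.pi * 500 * i / 8000) for i in range(32)]
env = noise_envelope(frame)
```

Under these assumptions only the 500 Hz and 750 Hz bins fall inside the 300 to 800 Hz range, so the envelope has two entries and the energy of the test tone lands in the 500 Hz bin.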

To determine the noise envelope estimate for the current frame (step 604), the noise envelope of the current frame is averaged with the noise envelope estimate determined for the previous frame of the audio signal. When no previous noise envelope estimate is available, for example after initialization, a stored initial noise envelope estimate may be used. The noise envelope estimate comprises noise energy estimates e_i (where i = 1, 2, ..., n) at successive low frequencies (i.e., within the selected frequency range, e_1 being the noise energy estimate at the highest frequency and e_n being the noise energy estimate at the lowest frequency). In the preferred embodiment, each noise energy estimate e_i corresponds to the average of the energy calculations at a particular frequency, namely one frequency bin in the selected frequency range, over a number of successive frames in which no speech is detected. By using a number of frames to determine the noise envelope estimate, the filter circuit 115 is adjusted on a more gradual basis. In an alternative embodiment, the noise envelope estimate may equal the noise envelope of the current frame.
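A minimal sketch of this averaging step follows. The text says only that the current envelope is averaged with the earlier estimate, so the equal weighting used here is an assumption, as are the function name and the all-zero initial estimate.

```python
# Hypothetical stored initial estimate for a two-bin envelope.
INITIAL_ESTIMATE = [0.0, 0.0]

def update_envelope_estimate(current_envelope, previous_estimate=None,
                             initial_estimate=INITIAL_ESTIMATE):
    """Average the current frame's noise envelope with the prior estimate, bin by bin."""
    # Fall back to the stored initial estimate when no previous one exists.
    base = initial_estimate if previous_estimate is None else previous_estimate
    # Assumption: equal weighting of the prior estimate and the current envelope.
    return [(b + c) / 2 for b, c in zip(base, current_envelope)]
```

Applying the function repeatedly over speech-free frames smooths the estimate, so a single unusual frame moves each e_i only part of the way toward its new value.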

Next, the energy estimates e_i of the noise envelope estimate are compared with a reference noise envelope (step 604). The reference noise envelope comprises reference energy thresholds e_Ri (where i = 1, 2, ..., n) located at the frequency bins corresponding to the noise energy estimates e_i of the noise envelope estimate. The reference energy thresholds e_Ri may be determined empirically. The noise energy estimates e_i are compared successively with the corresponding reference energy thresholds e_Ri, in order from the highest-frequency energy estimate e_1 to the lowest-frequency energy estimate e_n.

More specifically, the noise energy estimate e_1 is first compared with the reference noise threshold e_R1. If e_1 is greater than the reference noise threshold e_R1, a comparison value C_1 is selected and input to the filter selector 235. If the noise energy estimate e_1 is less than the reference noise threshold e_R1, the noise energy estimate e_2 (the noise energy estimate obtained at a frequency lower than that of e_1) is compared with the reference threshold e_R2. If the noise energy estimate e_2 is greater than the reference noise threshold e_R2, a comparison value C_2 is selected and input to the filter selector 235. The comparison process continues until a comparison value C_i (where i = 1, 2, ..., n) is selected.
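The comparison procedure can be sketched as a scan from the highest-frequency bin downward. The function name is hypothetical, and the behaviour when no estimate exceeds its threshold is not stated in the text, so returning `None` in that case is an assumption.

```python
def select_comparison_value(energy_estimates, reference_thresholds):
    """Return the index i of the first e_i that exceeds e_Ri, scanning from e_1.

    Both sequences are ordered from the highest frequency (index 1) to the lowest.
    """
    pairs = zip(energy_estimates, reference_thresholds)
    for i, (e_i, e_ri) in enumerate(pairs, start=1):
        if e_i > e_ri:
            return i  # corresponds to comparison value C_i
    # Assumption: no comparison value is selected if no threshold is exceeded.
    return None
```

A small index means that significant noise energy is already present at the highest frequencies of the selected range, which is the case in which more low frequency energy is extracted.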

The filter selector 235 uses the selected comparison value C_i to determine a filter control value. The filter control value is selected from a lookup table, an example of which is shown in Fig. 12. The lookup table contains a series of comparison values C_i and corresponding filter control values F_i. The filter circuit 115 is adjusted as a function of the selected filter control value to exhibit a frequency response curve that extracts low frequency energy from the current frame. When the noise energy estimates at successively higher frequencies exceed their corresponding reference energy thresholds, the filter circuit 115 is adjusted to extract more low frequency energy. Figs. 6A and 6B show example frequency response curves for the selected filter control values.

Using the noise envelope estimate improves the ability to adaptively adjust the filter circuit to extract low frequency energy in a manner that improves overall speech quality. The automobile environment is not the only environment in which mobile radio communication devices are used, and the noise envelope in some environments may be skewed toward higher frequencies. When there is little noise energy at low frequencies, the spectrum analyzer 270 may optionally be disabled. Conversely, when a large portion of the noise spectrum lies at low frequencies, a steeper filter slope is used even if some processing power is sacrificed. This additional processing requirement remains very small.

As is apparent from the above description, the adaptive filtering system of the invention is simple to implement and does not significantly increase the computational load on the DSP. More complex methods of noise reduction, such as "spectral subtraction", require several MIPS of computation and a large amount of memory for storing data and program code. By comparison, the invention can be implemented using only a small fraction of the MIPS and memory required by a "spectral subtraction" algorithm, which also introduces more speech distortion. The reduced memory requirement reduces the size of the DSP integrated circuit, and the reduced MIPS requirement reduces power consumption. Both characteristics are highly desirable for battery-powered portable/mobile radiotelephones.

Although the invention has been particularly shown and described with reference to its preferred embodiments, it is not limited to those embodiments. For example, although the DSP is described as performing the functions of the frame energy estimator 210, noise estimator 230, speech detector 240, filter selector 235 and filter circuit 265, these functions may be implemented using other digital and/or analog components. In addition, the adaptive filtering system 100 may also be implemented with the filter circuit 115 adjusted as a function of both the noise estimate and the noise envelope estimate.

Claims

1. A method for selectively altering a frame of a digital signal formed of a plurality of successive frames, the digital signal representing an audio signal received at a transmitter, the audio signal being formed of a speech component, of a noise component, or of a speech component together with a noise component, said method characterized by the steps of:

Estimating an energy magnitude value of the frame of the digital signal;

Determining, from the estimate obtained in said estimating step, whether the frame of the digital signal contains a speech component;

Updating a noise estimate as a function of a previous noise estimate and the energy magnitude value estimated in said estimating step when it is determined in said determining step that a speech component does not form part of the frame;

Reading an entry in a lookup table containing filter characteristics indexed by noise estimate magnitude, the entry read corresponding to the noise estimate updated in said updating step;

Selecting a filter circuit filtering characteristic to be exhibited by a filter, the selected filtering characteristic corresponding to the filter characteristic stored in the entry read in said reading step;

Filtering the frame of the digital signal with the filter exhibiting the filter circuit filtering characteristic, thereby altering the frame of the digital signal according to the filter circuit filtering characteristic.

2. The method of claim 1, further characterized by the additional intermediate step of determining a noise envelope estimate for the frame of the digital signal if the frame is determined not to contain a speech component.

3. The method of claim 2, wherein the noise envelope estimate determined in said step of determining a noise envelope estimate is used to update the noise estimate in said updating step.

4. The method of claim 1, characterized in that the lookup table read in said reading step contains a plurality of entries, each entry containing a separate filter characteristic.

5. The method of claim 4, wherein the separate filter characteristics of the plurality of entries in the lookup table comprise separate high-pass filter characteristics, each high-pass filter characteristic being defined by a separate cutoff frequency.

6. The method of claim 4, wherein the separate filter characteristics of the plurality of entries in the lookup table comprise separate high-pass filter characteristics, each high-pass filter characteristic being defined by a separate frequency response curve slope.

7. The method of claim 1, further characterized by the step of incrementing a counter value, thereby counting the number of frames for which an energy magnitude value has been estimated in said estimating step.

8. The method of claim 7, wherein said step of selecting the filter circuit filtering characteristic is performed each time the counter value has been incremented N times, N being an integer greater than 1.

9. An apparatus (100; 200) for selectively altering a frame of a digital signal formed of a plurality of successive frames, the digital signal representing an audio signal received at a transmitter, the audio signal being formed of a speech component, of a noise component, or of both components together, said apparatus characterized by:

An energy value estimator (210) coupled to receive an indication of the frame of the digital signal, said energy value estimator for estimating an energy value of the frame of the digital signal;

A speech detector (240) coupled to said energy value estimator, said speech component determiner for determining whether the frame of the digital signal contains a speech component;

A noise estimator (230) operable when said speech component determiner determines that a speech component does not form part of the frame, said noise estimator for updating a noise estimate as a function of a previous noise estimate and the energy value estimated by said estimator;

A lookup table containing a plurality of entries, each entry being indexed by noise estimate, an entry in the lookup table being read according to the noise estimate formed by said noise estimator;

A filter (265) coupled to receive the frame of the digital signal, said filter exhibiting a selectable filter circuit filtering characteristic, the filter circuit filtering characteristic of the filter being selected according to the entry of the lookup table read according to the noise estimate updated by said noise estimator.

10. The apparatus of claim 9, further characterized by a noise envelope estimator (270) which determines a noise envelope estimate for the frame of the digital signal if said speech component determiner determines that the frame does not contain a speech component.