CN100440317C - Voice frequency compression method of digital deaf-aid

Info

Publication number: CN100440317C
Application number: CNB2005100117800A
Authority: CN
Inventors: 迟惠生; 吴玺宏; 张志平; 陈婧
Original assignee: Science & Technology Development Department Peking University
Current assignee: Science & Technology Development Department Peking University
Priority date: 2005-05-24
Filing date: 2005-05-24
Publication date: 2008-12-03
Anticipated expiration: 2025-05-24
Also published as: CN1870133A

Abstract

The present invention provides a speech frequency compression method for a digital hearing aid, which compresses a wideband speech signal into a narrowband speech signal by processing the signal's short-time spectral coefficients. The method comprises the following techniques: (1) a windowed Fourier transform is used for time-frequency conversion; (2) the energy distribution of the speech signal is judged from the slope of the signal's log spectrum, and the speech spectrum is adjusted dynamically; (3) low-pass filtering is implemented by removing the high-frequency spectral coefficients; (4) the speech band is compressed by adjusting the spectral envelope. The invention compresses the signal bandwidth only by adjusting the spectral envelope, which is equivalent to linear processing of the signal and avoids the audible distortion that nonlinear frequency shifting introduces into speech, while the dynamic spectral adjustment preserves the low-frequency information of the speech as far as possible. The scheme can therefore produce clear, high-quality speech.

Description

Speech frequency compression method for a digital hearing aid

Technical field

The invention belongs to the field of speech processing technology and relates to a speech processing method for digital hearing aids, in particular to a digital hearing aid speech processing method for treating severe high-frequency hearing loss.

Background technology

Spoken communication is the basic mode of communication in human society and one of an individual's basic survival skills. For people with hearing loss, however, the communication barriers caused by impaired hearing seriously affect quality of life. This brings great suffering to the individuals and their families and also adds to the burden on society as a whole. According to statistics published by the China Disabled Persons' Federation on February 7, 2002, China has 20.57 million people with hearing and speech disabilities, accounting for 34.3% of the country's 60 million people with disabilities. In addition, owing to heredity, medication, infection, noise, accidents and other causes, about 30,000 children become deaf each year. The size of this population and the hardship they face motivate workers in related fields to do their part to help these people return to the world of sound and live as people with normal hearing do, reflecting the humanitarian care of a harmonious society.

In treatment plans for patients with sensorineural hearing loss, wearing a hearing aid is a common approach. It is suitable for patients whose hearing is impaired but who retain some residual hearing. One main characteristic of hearing-impaired listeners, relative to listeners with normal hearing, is that the hearing threshold rises markedly while the pain threshold changes little, so the overall audible dynamic range narrows. Another is that the degree of hearing loss differs across frequencies. The frequency response compensation technique used in the digital hearing aids now in wide clinical use was proposed to address these problems. Based on the patient's hearing loss at each frequency, it applies dynamic range compression to the signal in each frequency band separately, amplifying sounds that a normal listener can hear into the patient's audible range so that the patient can recover normal hearing.

However, some studies suggest that when high-frequency hearing loss exceeds 60 dB, amplifying sound in those high-frequency ranges not only fails to improve speech recognition but can even be counterproductive. To address this problem, a new frequency-shifting hearing aid method has been proposed. Such a hearing aid compresses the frequency band of speech into the band in which the hearing-impaired patient retains residual hearing, so that the patient can use the remaining low-frequency hearing to perceive high-frequency speech information, which improves speech intelligibility.

The present invention is based on this new hearing aid strategy and proposes a new band compression method for hearing aids, to address auditory rehabilitation for patients with severe high-frequency hearing loss.

Summary of the invention

The speech frequency compression method for a digital hearing aid proposed in the present invention compresses a wideband speech signal into a narrowband speech signal by processing the short-time spectral coefficients of the signal.

The speech frequency compression method according to the present invention mainly comprises the following techniques:

1) Time-frequency conversion. The present invention uses a windowed Fourier transform.

2) The energy distribution of the speech signal is judged from the slope of the signal's log spectrum, and the speech spectrum is adjusted dynamically. The inner product of the log spectrum with a line segment of slope 1 is taken, and the result is used as the slope of the log spectrum. If the slope is less than a predetermined threshold, the energy of the frame is concentrated mainly at low frequencies, which is typical of voiced speech segments, and low-pass filtering is applied. Otherwise, for frames with more high-frequency energy, which are typically unvoiced speech segments, spectral envelope compression is applied.

3) Low-pass filtering is implemented by removing the high-frequency spectral coefficients.

4) The speech band is compressed by adjusting the spectral envelope. Linear prediction analysis is first used to obtain the spectral envelope of the short-time signal, and the envelope is removed from the original spectrum to obtain the whitened spectrum. The wideband envelope is then compressed into a narrowband envelope, which is used to modulate the low-frequency spectral lines of the whitened spectrum, while the high-frequency spectrum is removed.

The advantage of the above technical scheme is that it compresses the signal bandwidth only by adjusting the spectral envelope of the signal. This is equivalent to linear processing of the signal and avoids the audible distortion that nonlinear frequency shifting introduces into speech. The dynamic spectral adjustment also ensures, as far as possible, that the low-frequency information of the speech is not damaged. The scheme can therefore produce clear, high-quality speech.

Description of drawings

The present invention is described in further detail below with reference to the accompanying drawings:

Fig. 1 is a flowchart of the frequency compression method;

Fig. 2 compares the spectrograms of speech before and after compression, wherein

Fig. 2a is the spectrogram of the original speech signal;

Fig. 2b is the spectrogram of the speech signal after compression.

Embodiment

The preferred embodiment of the present invention is described in more detail below with reference to the accompanying drawings.

The digital hearing aid frequency compression method proposed by the invention is introduced below through an example; the flow of the method is shown in Fig. 1. The input is a speech signal with 16-bit quantization and a 16000 Hz sampling rate, so the signal bandwidth is 8000 Hz. The output signal bandwidth is 2000 Hz, with the sampling rate and quantization precision unchanged. The specific implementation steps are as follows:

1. Time-frequency conversion

One frame of the short-time signal (512 samples) is transformed to the frequency domain by a Fourier transform with a Hanning window. The power spectrum of the signal is then obtained from the Fourier transform coefficients X_i.
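As a minimal sketch of this step, assuming NumPy and a magnitude-squared definition of the power spectrum (the source does not give the exact definition), the analysis of one frame might look like the following. The function name `analyze_frame` and the 1 kHz test tone are illustrative only.

```python
import numpy as np

FRAME_LEN = 512  # samples per frame, as in the embodiment


def analyze_frame(frame):
    """Return (X, power) for one 512-sample frame using a Hanning window."""
    frame = np.asarray(frame, dtype=float)
    if frame.shape != (FRAME_LEN,):
        raise ValueError("expected a frame of 512 samples")
    window = np.hanning(FRAME_LEN)
    X = np.fft.fft(frame * window)   # complex coefficients X_i, i = 0..511
    power = np.abs(X) ** 2           # assumed definition: |X_i|^2
    return X, power


# Usage: a 1 kHz tone sampled at 16 kHz falls on a single bin (1000 / 31.25).
n = np.arange(FRAME_LEN)
X, power = analyze_frame(np.sin(2 * np.pi * 1000 * n / 16000))
peak_bin = int(np.argmax(power[:FRAME_LEN // 2 + 1]))
```

With a 16000 Hz sampling rate and 512 points, each bin spans 31.25 Hz, so bins 0 to 256 cover 0 to 8000 Hz.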

2. Compute the spectral slope and dynamically adjust the speech spectrum

Take the logarithm of the power spectrum of the short-time signal to obtain the log spectrum P_i, then take its inner product with a line segment L_i of slope 1 whose length equals the speech spectrum bandwidth. The result of the inner product is used as the spectral slope λ of the frame, where:

L_i = i - 128, \quad 0 \le i \le 256

\lambda = \sum_{i=1}^{256} L_i \cdot P_i

When λ is less than a predetermined threshold th, the signal energy is concentrated mainly in the low-frequency band and the signal is low-pass filtered. When λ is greater than th, more of the signal energy lies in the high-frequency band and the signal spectrum is compressed.
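The slope measure and the decision can be sketched as below. This is an illustration under stated assumptions: the logarithm base (base 10 here), the small `eps` guard against log of zero, the handling of λ exactly equal to th, and the names `spectral_slope` and `choose_mode` are not specified in the source, and no value for th is given there.

```python
import numpy as np


def spectral_slope(power, eps=1e-12):
    """Inner product of the log power spectrum with the ramp L_i = i - 128."""
    power = np.asarray(power, dtype=float)
    i = np.arange(1, 257)                     # summation range i = 1..256
    log_spec = np.log10(power[1:257] + eps)   # P_i (base and eps are assumptions)
    ramp = i - 128                            # L_i
    return float(np.dot(ramp, log_spec))


def choose_mode(power, th):
    """Return 'lowpass' when the slope is below th, else 'compress'."""
    # The source leaves the case slope == th open; it is treated as 'compress' here.
    return "lowpass" if spectral_slope(power) < th else "compress"
```

A spectrum whose log power falls with frequency gives a negative λ, and one that rises gives a positive λ, which is what lets a single threshold separate voiced-like frames from unvoiced-like frames.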

3. Spectrum processing

According to the decision made in the previous step, the signal is either low-pass filtered or processed by spectral compression.

Low-pass filtering: the coefficients in the high-frequency part of the signal's complex spectrum are set to zero and the low-frequency part is left unchanged, expressed as:

Y_i = \begin{cases} X_i, & 0 \le i < 64 \text{ or } 448 < i \le 511 \\ 0, & 64 \le i \le 448 \end{cases}
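The bin limits follow from the embodiment's parameters: 16000 Hz / 512 points gives 31.25 Hz per bin, so 2000 Hz corresponds to bin 64, and bins 449 to 511 are the mirror images of bins 63 to 1 for a real input. A minimal NumPy sketch of the zeroing step, with the hypothetical name `lowpass_bins`:

```python
import numpy as np

FRAME_LEN = 512
SAMPLE_RATE_HZ = 16000
CUTOFF_HZ = 2000
CUTOFF_BIN = CUTOFF_HZ * FRAME_LEN // SAMPLE_RATE_HZ   # 2000 / 31.25 = 64


def lowpass_bins(X):
    """Zero bins 64..448 of a 512-point spectrum; keep 0..63 and 449..511."""
    Y = np.array(X, dtype=complex)                     # work on a copy
    Y[CUTOFF_BIN:FRAME_LEN - CUTOFF_BIN + 1] = 0       # slice end is exclusive
    return Y
```

Keeping both the low bins and their mirrored counterparts preserves conjugate symmetry, so the inverse transform of the filtered spectrum stays real.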

Spectral envelope compression:

(1) Perform linear prediction analysis on the current short-time frame and obtain the spectral envelope E_i of the signal from the linear prediction coefficients.

(2) Divide the Fourier transform coefficients of the signal by the corresponding spectral envelope values to obtain the spectrum with the envelope removed, also called the whitened spectrum W_i:

W_i = X_i / E_i

(3) Compress the spectral envelope into a narrowband envelope according to the compression ratio. A simple implementation takes one envelope value out of every 4 spectral bins, that is:

E'_i = E_{4i}, \quad 0 \le i < 64

(4) Multiply the whitened spectral coefficients by the compressed narrowband envelope and set the high-band spectral coefficients to zero to obtain the compressed speech spectrum Y_i:

Y_i = \begin{cases} W_i \cdot E'_i, & 0 \le i < 64 \text{ or } 448 < i \le 511 \\ 0, & 64 \le i \le 448 \end{cases}
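The four sub-steps can be sketched as follows. Several details are assumptions rather than part of the source: the LPC order (16), the autocorrelation method with a Hanning window and light regularisation, the gain-scaled envelope sqrt(err) / |A|, and the use of the mirrored index for bins 449 to 511, where E' is not explicitly defined. The names `lpc_envelope` and `compress_envelope` are hypothetical.

```python
import numpy as np

FRAME_LEN = 512
CUTOFF_BIN = 64      # 2000 Hz at 16 kHz with a 512-point transform
RATIO = 4            # 8000 Hz input bandwidth / 2000 Hz output bandwidth
LPC_ORDER = 16       # assumption: the source does not state the order


def lpc_envelope(frame, order=LPC_ORDER):
    """Magnitude envelope E_i (i = 0..511) from autocorrelation-method LPC."""
    x = np.asarray(frame, dtype=float) * np.hanning(len(frame))
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]  # lags 0..order
    r[0] = r[0] * (1 + 1e-9) + 1e-12        # light regularisation (assumption)
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])  # predictor coefficients
    err = r[0] - np.dot(a, r[1:order + 1])  # prediction error energy
    A = np.fft.fft(np.concatenate(([1.0], -a)), FRAME_LEN)
    return np.sqrt(max(err, 1e-12)) / np.abs(A)


def compress_envelope(X, E):
    """Whiten X with E, then re-impose the 4:1 decimated envelope below bin 64."""
    W = X / E                                       # W_i = X_i / E_i
    E_narrow = E[RATIO * np.arange(CUTOFF_BIN)]     # E'_i = E_{4i}, 0 <= i < 64
    Y = np.zeros(FRAME_LEN, dtype=complex)          # bins 64..448 stay zero
    Y[:CUTOFF_BIN] = W[:CUTOFF_BIN] * E_narrow
    # Bins 449..511 mirror bins 63..1, so the mirrored E' value is reused there.
    k = np.arange(FRAME_LEN - CUTOFF_BIN + 1, FRAME_LEN)
    Y[k] = W[k] * E_narrow[FRAME_LEN - k]
    return Y
```

In this sketch the fine structure of the low band (the whitened spectrum) is left in place and only its envelope is replaced, so the shape of the 0 to 8000 Hz envelope is squeezed into 0 to 2000 Hz.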

4. Time-domain recovery

The processed spectral coefficients Y_i are transformed back to the time domain by an inverse Fourier transform, multiplied by a Hanning window, and overlap-added to the previous results. The frame is then advanced by 1/4 of the frame length and processing returns to step 1 for the next frame.
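A minimal overlap-add loop for this step is sketched below. The periodic form of the Hanning window and the division by 1.5 are assumptions added so that an unmodified spectrum reconstructs the input exactly (the squared periodic Hanning windows at a 1/4-frame shift sum to 1.5); the source does not mention a normalisation. `process_signal` and its `process_spectrum` callback are hypothetical names.

```python
import numpy as np

FRAME_LEN = 512
HOP = FRAME_LEN // 4     # frame shift of 1/4 frame length = 128 samples


def process_signal(x, process_spectrum):
    """Analyse, modify and overlap-add frames; process_spectrum maps X_i to Y_i."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(FRAME_LEN + 1)[:-1]          # periodic Hanning window
    out = np.zeros(len(x))
    for start in range(0, len(x) - FRAME_LEN + 1, HOP):
        X = np.fft.fft(x[start:start + FRAME_LEN] * window)   # step 1
        Y = process_spectrum(X)                               # steps 2 and 3
        y = np.fft.ifft(Y).real                               # back to time domain
        out[start:start + FRAME_LEN] += y * window            # window, overlap-add
    return out / 1.5     # assumed normalisation for the overlapped windows
```

Passing an identity callback, for example `process_signal(x, lambda X: X)`, returns the input unchanged away from the first and last frame, which is a convenient check before plugging in the low-pass or envelope-compression step.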

Fig. 2 shows the spectrograms of the speech signal before and after processing. As the figure shows, the processed speech is confined to the band below 2000 Hz. For voiced segments, the high-frequency band is simply filtered out, whereas for unvoiced segments the spectrum is compressed proportionally.

Although specific embodiments of the invention and the accompanying drawings are disclosed for purposes of illustration, to help the reader understand and implement the invention, those skilled in the art will appreciate that various substitutions, changes and modifications are possible without departing from the spirit and scope of the invention and the appended claims. The invention should therefore not be limited to the content disclosed in the preferred embodiment and the drawings.

Claims

1. A speech frequency compression method for a digital hearing aid, comprising the following steps:

1) performing time-frequency conversion on the speech signal;

2) judging the energy distribution of the speech signal from the slope of the signal's log spectrum, and dynamically adjusting the speech spectrum;

3) if the slope of the signal's log spectrum is less than a predetermined threshold, performing low-pass filtering by removing the high-frequency spectral coefficients;

4) if the slope of the signal's log spectrum is greater than said predetermined threshold, compressing the speech band by adjusting the spectral envelope.

2. The speech frequency compression method for a digital hearing aid as claimed in claim 1, characterized in that a windowed Fourier transform is used for the time-frequency conversion of the speech signal.

3. The speech frequency compression method for a digital hearing aid as claimed in claim 1, characterized in that the step further comprises: taking the inner product of the log spectrum of the signal with a line segment of slope 1, the result of the inner product being the slope of the log spectrum.

4. The speech frequency compression method for a digital hearing aid as claimed in claim 1, characterized in that the spectral envelope adjustment first uses linear prediction analysis to obtain the spectral envelope of the short-time signal and removes it from the original spectrum to obtain the whitened spectrum; the wideband envelope is then compressed into a narrowband envelope, which is used to modulate the low-frequency spectral lines of the whitened spectrum, while the high-frequency spectrum is removed.

5. The speech frequency compression method for a digital hearing aid as claimed in claim 3, characterized in that the spectral envelope compression comprises the following steps:

1) performing linear prediction analysis on the current short-time frame, and obtaining the spectral envelope of the signal from the linear prediction coefficients;

2) dividing the Fourier transform coefficients of the signal by the corresponding spectral envelope values to obtain the spectrum with the envelope removed, also called the whitened spectrum;

3) compressing the spectral envelope into a narrowband envelope according to the compression ratio;

4) multiplying the whitened spectral coefficients by the compressed narrowband envelope, and setting the high-band spectral coefficients to zero, to obtain the compressed speech spectrum.

6. The speech frequency compression method for a digital hearing aid as claimed in claim 3, characterized in that the low-pass filtering sets the coefficients in the high-frequency part of the signal's complex spectrum to zero and leaves the low-frequency part unchanged.

7. The speech frequency compression method for a digital hearing aid as claimed in any one of claims 1 to 6, characterized in that, when the signal is restored to the time domain, the processed spectral coefficients are transformed back to the time domain by an inverse Fourier transform, multiplied by a Hanning window, and overlap-added to the previous results.