CN102222507B - Method and equipment for compensating hearing loss of Chinese language - Google Patents

Info

Publication number
CN102222507B
Authority
CN
China
Prior art keywords
consonant
vowel
short
voice signal
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201110150755A
Other languages
Chinese (zh)
Other versions
CN102222507A (en)
Inventor
蔡宇
侯朝焕
洪缨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Acoustics CAS
Original Assignee
Institute of Acoustics CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Acoustics CAS
Priority to CN201110150755A
Publication of CN102222507A
Application granted
Publication of CN102222507B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention relates to a method and equipment for compensating hearing loss of Chinese language. The method comprises the following steps of: receiving a voice signal x(m); detecting the consonants in the voice signal x(m); and performing amplifying compensation on the detected consonants. The method and equipment provided by the invention reduce the voice distortion, improve the hearing comfort level for a patient, and are applicable to all hearing aids.

Description

A hearing loss compensation method and equipment suitable for the Chinese language
Technical field
The present invention relates to the field of voice signal processing, and in particular to a hearing loss compensation method.
Background art
In Chinese, vowels (finals) and consonants (initials) are the basic phonetic units of the language. Nearly every Chinese character is pronounced as a syllable that begins with a consonant (initial) and ends with a vowel (final). This includes syllables that consist of a vowel only, with no preceding consonant, which are said to have a "zero initial". Chinese speech also has the following characteristics:
(1) The phonetic system is simple: there are few phonemes and few syllables (about 60 phonemes, but only about 407 syllables; even when tone is taken into account there are only a little over 1,330 toned syllables), and the syllable structure is simple (only two types, CV and V, where C is a consonant and V is a single or compound vowel);
(2) There are many voiceless consonants, most of them weak;
(3) The speech sounds sonorous, and words are clearly separated.
In Chinese, the vowel is the backbone of a syllable: it accounts for most of the syllable in both duration and energy. The consonant sits at the front of the syllable, and both its duration and its loudness are relatively small, yet it is crucial for distinguishing and understanding speech. A listener who cannot hear the consonant, or cannot hear it clearly, will therefore find it hard to tell words apart.
As people age, hearing loss (mostly sensorineural hearing loss) usually begins in the high-frequency band, while the center frequency of most consonants lies above 2.5 kHz (for some voiceless consonants even above 4 kHz). For patients with hearing loss, being able to hear voiceless consonants clearly is therefore especially important.
Hearing loss compensation is the core algorithm of a hearing aid. Its role is to apply a series of processing steps, such as compression, amplification and enhancement, to the external sound signal according to the patient's hearing loss, and thereby compensate the patient's hearing to some degree.
Almost all existing hearing loss compensation techniques use multichannel compensation: the voice signal is first divided into several separate frequency bands, and each band is then amplified according to the patient's hearing loss in that band. There are two traditional approaches: multichannel compensation using a filter bank, and multichannel compensation using the discrete Fourier transform.
Fig. 1 is a schematic diagram of the prior-art multichannel compensation method using a filter bank. This method operates in the time domain. The voice signal is first fed to a bank of band-pass filters, the analysis filter bank (N filters, say), which yields N subband signals (channels). The gain of each channel is then computed and applied. Finally, the compensated signals are fed to the synthesis filter bank and summed to give the compensated voice signal. For details see T. Lunner and J. Hellgren, "A digital filterbank hearing aid-design, implementation and evaluation", ICASSP (International Conference on Acoustics, Speech, and Signal Processing), 1991; and R. Brennan and T. Schneider, "A flexible filterbank structure for extensive signal manipulations in digital hearing aids", IEEE International Symposium on Circuits and Systems, 1998.
Fig. 2 is a schematic diagram of the prior-art multichannel compensation method using the discrete Fourier transform. This method operates in the frequency domain. The input voice signal is first transformed with the discrete Fourier transform (DFT); the frequency-domain signal is then grouped into channels as required, and the gain of each channel is computed and applied; finally, the compensated signal is transformed back with the inverse Fourier transform to give the compensated time-domain signal. For details see F. Asano, Y. Suzuki and T. Sone, "A digital hearing aid that compensates loudness for sensorineural impaired listeners", ICASSP, 1991; and J. C. Tejero, S. Bernal and J. A. Hidalgo, "A digital hearing aid that compensates loudness for sensorineural hearing impairments", ICASSP, 1995.
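The DFT-based prior-art scheme can be illustrated with a minimal sketch. It is not taken from the cited papers: the function names, the band edges and the per-band gains are assumptions chosen for illustration, and a naive DFT is used so that the example needs only the standard library.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n))
            for f in range(n)]

def idft(spectrum):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * math.pi * f * k / n)
                for f in range(n)).real / n
            for k in range(n)]

def multichannel_compensate(frame, sample_rate, band_edges_hz, band_gains_db):
    """Apply one gain per frequency band (channel) to a single frame."""
    n = len(frame)
    spectrum = dft(frame)
    for f in range(n):
        # Fold the bin index so that conjugate bins receive the same gain,
        # which keeps the output real-valued.
        freq = min(f, n - f) * sample_rate / n
        for (lo, hi), gain_db in zip(band_edges_hz, band_gains_db):
            if lo <= freq < hi:
                spectrum[f] *= 10 ** (gain_db / 20)
                break
    return idft(spectrum)
```

For instance, with hypothetical bands `[(0, 1500), (1500, 4001)]` and gains `[0.0, 20.0]` dB at an 8 kHz sampling rate, a 2000 Hz tone is scaled by a factor of 10 while a 1000 Hz tone passes unchanged. A formant straddling 1500 Hz would have its two halves scaled differently, which is the distortion the description goes on to criticize.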
Prior-art hearing compensation, whether based on a filter bank or on the discrete Fourier transform, has to divide the voice signal into several independent frequency channels, amplify each channel by a different amount, and finally synthesize and play back the sound. This approach raises the following problems:
(1) If a vowel formant happens to fall where two frequency bands overlap, it is "split" into two parts that are amplified by different amounts. This is likely to shift or distort the formant and thus greatly reduce speech intelligibility;
(2) While amplifying the voice signal, this approach also amplifies the noise and interference in it, which reduces the patient's listening comfort.
Summary of the invention
The present invention provides a hearing loss compensation method and equipment, suitable for the Chinese language, that overcome the above problems.
In a first aspect, the invention provides a hearing loss compensation method. The method first receives a voice signal x(m), then detects the consonants in the voice signal x(m), and finally amplifies the detected consonants to compensate for the hearing loss.
In a second aspect, the invention provides hearing loss compensation equipment that receives a voice signal from the outside. The equipment comprises a consonant acquisition module and a consonant compensation module. The consonant acquisition module detects the consonants in the voice signal according to their characteristics. The consonant compensation module amplifies the consonants in the voice signal based on the detected consonant regions.
Based on the characteristics of Chinese speech, the invention proposes a new hearing loss compensation strategy that uses selective amplification: the consonant portions of speech, which the patient has difficulty hearing and recognizing, are amplified, while the vowel portions, which have high energy and are easy to recognize, are left unprocessed. Compared with the prior art, the invention has the following advantages:
(1) The speech is not split into frequency bands, so formants are not shifted or distorted and speech distortion is reduced;
(2) Only the consonant portions are amplified, which avoids amplifying noise and interference and improves the patient's listening comfort.
Brief description of the drawings
Specific embodiments of the invention are explained in more detail below with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the prior-art multichannel compensation method using a filter bank;
Fig. 2 is a schematic diagram of the prior-art multichannel compensation system using the discrete Fourier transform;
Fig. 3 is a block diagram of the hearing loss compensation equipment of one embodiment of the invention;
Fig. 4 is a schematic diagram of the short-time average magnitude waveform of a vowel in Chinese;
Fig. 5 is a schematic diagram of the short-time zero-crossing rate waveform of a consonant in Chinese;
Fig. 6 is the audiogram of a patient with hearing loss;
Fig. 7 is a flowchart of the hearing loss compensation method of one embodiment of the invention.
Detailed description of the embodiments
Fig. 3 is a block diagram of the hearing loss compensation equipment of one embodiment of the invention. The equipment comprises a windowing and framing module 310, a vowel start/end point acquisition module 320, a consonant acquisition module 330, and a consonant compensation module 340.
The windowing and framing module 310 applies windowing and framing to the time-domain voice signal x(l) to obtain the framed voice signal x(m).
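A minimal sketch of the windowing and framing step follows. The description does not state the window type, the overlap or the sampling rate, so the Hamming window, the hop length and the function name `frame_signal` are assumptions made for illustration only.

```python
import math

def frame_signal(x, frame_len, hop_len):
    """Split x into frames of frame_len samples, hop_len samples apart,
    and multiply each frame by a Hamming window (window type assumed)."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * m / (frame_len - 1))
              for m in range(frame_len)]
    frames = []
    # Only full frames are kept; trailing samples that do not fill a frame are dropped.
    for start in range(0, len(x) - frame_len + 1, hop_len):
        frames.append([x[start + m] * window[m] for m in range(frame_len)])
    return frames
```

At a hypothetical 16 kHz sampling rate, a 10 ms frame corresponds to `frame_len=160`; `hop_len=80` would give 50% overlap between neighbouring frames.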
The vowel start/end point acquisition module 320 receives the framed voice signal x(m) from the windowing and framing module 310 and locates the vowels (finals) in the speech. In one embodiment of the invention, the vowel positions are obtained from the short-time energy of the speech. The reason is that in Chinese speech a vowel lasts longer and has far more energy than consonants and noise, so vowels can be detected by analyzing the short-time energy of each frame.
The following briefly describes how the vowel start/end point acquisition module 320 detects vowels from the short-time energy of the speech.
Let the n-th frame of the voice signal, $x_n(m)$, have length N (usually corresponding to 10 ms). The short-time average magnitude of this frame, which can serve as a measure of its energy, is:
$$M_n = \sum_{m=0}^{N-1} \left| x_n(m) \right| \tag{1}$$
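Formula (1) translates directly into code. The sketch below assumes each frame is a plain list of samples; the function names are illustrative and not from the patent.

```python
def short_time_magnitude(frame):
    """Short-time average magnitude M_n of one frame, per formula (1):
    the sum of the absolute values of its samples."""
    return sum(abs(sample) for sample in frame)

def magnitude_contour(frames):
    """M_n for every frame, in frame order."""
    return [short_time_magnitude(frame) for frame in frames]
```

Note that, as in formula (1), the sum is not divided by N; because all frames have the same length, this does not change how frames compare with one another or with thresholds derived from the same contour.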
First, the high energy threshold ITU and the low energy threshold ITL of the short-time average magnitude are computed. Then the start and end positions N1 and N2 of the vowel are obtained from ITU and ITL together with the short-time average magnitude characteristics of a vowel; see Fig. 4. Fig. 4 is a schematic diagram of the short-time average magnitude waveform of a Chinese vowel (final). The start and end positions of the vowel follow from the vowel's short-time average magnitude in Fig. 4 and its relation to ITU and ITL.
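The description does not say how ITU and ITL are computed or exactly how N1 and N2 are read off the contour. The sketch below is one plausible reading, offered as an assumption: the threshold constants follow the style of the Rabiner and Sambur endpoint detector cited later in the description, the first few frames are assumed to contain only background noise, and the function names are hypothetical.

```python
def energy_thresholds(magnitudes, n_silence_frames=10):
    """Return (ITU, ITL) from a short-time magnitude contour.
    Constants are assumed, in the style of Rabiner and Sambur (1975)."""
    imx = max(magnitudes)                                    # peak magnitude
    imn = sum(magnitudes[:n_silence_frames]) / n_silence_frames  # noise level
    i1 = 0.03 * (imx - imn) + imn
    i2 = 4.0 * imn
    itl = min(i1, i2)
    itu = 5.0 * itl
    return itu, itl

def find_vowel(magnitudes, itu, itl, start=0):
    """Return (N1, N2), the frame indices of the first vowel at or after
    `start`, or None if no frame exceeds ITU."""
    n = len(magnitudes)
    peak = next((i for i in range(start, n) if magnitudes[i] > itu), None)
    if peak is None:
        return None
    n1 = peak
    while n1 > start and magnitudes[n1 - 1] > itl:   # extend back to ITL crossing
        n1 -= 1
    n2 = peak
    while n2 + 1 < n and magnitudes[n2 + 1] > itl:   # extend forward to ITL crossing
        n2 += 1
    return n1, n2
```

The two-threshold design means a frame must clearly exceed the noise level (ITU) before a vowel is declared, while the boundaries are then pushed outward to the quieter ITL crossings so that the weak onset and decay of the vowel are not cut off.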
Returning to Fig. 3, the consonant acquisition module 330 receives the voice signal x(m) and the vowel start position from the vowel start/end point acquisition module 320, and searches from the start of the vowel toward earlier frames to locate the consonant (initial). In one example, the consonant is located from the short-time zero-crossing rate of the speech, beginning at the start of the vowel. The following briefly describes how.
First the short-time zero-crossing rate of the speech is computed. The short-time zero-crossing rate of the n-th frame $x_n(m)$ is:
$$Z_n = \frac{1}{2} \sum_{m=0}^{N-1} \left| \operatorname{sgn}[x_n(m)] - \operatorname{sgn}[x_n(m-1)] \right| \tag{2}$$
where N is the length of $x_n(m)$ and sgn[·] is the sign function, which satisfies
$$\operatorname{sgn}[x] = \begin{cases} 1, & x \ge 0 \\ -1, & x < 0 \end{cases} \tag{3}$$
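Formulas (2) and (3) can be sketched as follows. The formula refers to the sample x_n(-1) just before the frame; how that sample is supplied is not stated, so the `prev_sample` parameter (defaulting to 0) is an assumption, as are the function names.

```python
def sgn(x):
    """Sign function of formula (3): +1 for x >= 0, -1 for x < 0."""
    return 1 if x >= 0 else -1

def short_time_zcr(frame, prev_sample=0.0):
    """Short-time zero-crossing rate Z_n of one frame, per formula (2).
    prev_sample stands in for x_n(-1), the sample preceding the frame."""
    total = 0
    previous = prev_sample
    for sample in frame:
        # Each sign change contributes |(+1) - (-1)| = 2, hence the final 1/2.
        total += abs(sgn(sample) - sgn(previous))
        previous = sample
    return total / 2
```

The result is simply the number of sign changes inside the frame, which is why noise-like voiceless consonants score much higher than the smoother, lower-frequency vowel waveform.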
Fig. 5 is a schematic diagram of the short-time zero-crossing rate waveform of a Chinese consonant (initial). In Chinese, the short-time zero-crossing rate of a consonant has the waveform characteristics shown in Fig. 5, and the consonant can be located from them, because the short-time zero-crossing rate of a consonant, and of a voiceless consonant in particular, is far higher than that of vowels and noise. The search therefore covers a number of frames (e.g. 20) before the vowel start N1, without going back past the end point N2' of the previous vowel, and compares the short-time zero-crossing rate of each frame in turn. If the short-time zero-crossing rate exceeds the zero-crossing detection threshold IZCT for more than a certain number of consecutive frames (e.g. 10), the frames whose short-time zero-crossing rate exceeds IZCT are taken as the consonant range, i.e. the region to which hearing compensation is applied.
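The backward search can be sketched as below. This is an illustrative reading, not the patent's implementation: the function name, the default value of IZCT and the treatment of "more than a certain number of consecutive frames" as "at least `min_run` frames" are all assumptions.

```python
def find_consonant(zcr, n1, prev_n2=-1, izct=25.0, max_back=20, min_run=10):
    """Return (start, end) frame indices of the consonant preceding the
    vowel that starts at frame n1, or None if no consonant is found.
    zcr      per-frame short-time zero-crossing rates
    prev_n2  end frame N2' of the previous vowel (-1 if there is none)
    izct     zero-crossing detection threshold (value assumed)"""
    lowest = max(prev_n2 + 1, n1 - max_back, 0)
    best = None
    run_end = None
    for i in range(n1 - 1, lowest - 1, -1):     # walk toward earlier frames
        if zcr[i] > izct:
            if run_end is None:
                run_end = i                     # latest frame of the current run
            if run_end - i + 1 >= min_run:
                best = (i, run_end)             # run is long enough; keep extending
        else:
            if best is not None:
                break                           # consonant region has ended
            run_end = None                      # run too short; start over
    return best
```

A syllable of type V, or one whose initial is a low-frequency plosive, would typically yield `None` here, in which case nothing is amplified for that syllable.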
Note that while the consonant acquisition module 330 is locating the consonant (initial), two cases can arise. If the character being examined is a single or compound vowel, i.e. has a "V" structure, no consonant appears in the period before it (generally taken to be 20 frames). If the character has a consonant + vowel structure, i.e. a "CV" structure, and its consonant is a plosive with relatively high energy and low frequency (such as [b], [d], [g]), the short-time zero-crossing rate may fail to detect the consonant position. Such plosives, however, are easier for the patient to hear than voiceless sounds, so this does not affect the overall hearing compensation.
The vowel start/end point acquisition module 320 and the consonant acquisition module 330 mainly use the endpoint detection method based on short-time energy and short-time zero-crossing rate, first proposed in 1975 by L. R. Rabiner and M. R. Sambur; for implementation details see "An algorithm for determining the endpoints of isolated utterances". Other methods that can be used to detect vowels and consonants include the optimal filter design and real-time energy normalization method proposed by Q. Li et al. in 2002, see "Robust endpoint detection and energy normalization for real-time speech and speaker recognition", and the method of L. F. Lamel et al. proposed in 2003, "An improved endpoint detector for isolated word recognition".
Returning to Fig. 3, the consonant compensation module 340 receives the voice signal x(m) and the consonant positions from the consonant acquisition module 330 and applies hearing compensation to the consonants in the speech.
Specifically, the consonant compensation module 340 applies hearing compensation to the consonant frequency range according to the patient's audiogram. The audiogram stored in the consonant compensation module 340 is configurable, so the hearing compensation equipment of the invention can be used for patients with a wide range of hearing impairments.
For example, the consonant compensation module 340 can compensate at the three frequencies where speech consonants are concentrated, 1000 Hz, 2000 Hz and 4000 Hz, by taking 1/3 of the mean hearing threshold at 1000 Hz, 2000 Hz and 4000 Hz as a fixed compensation value and amplifying the consonants in the speech x(m) by that value. This is explained further below using the audiogram of the patient in Fig. 6.
In Fig. 6, "o" marks the hearing thresholds (the quietest sound that can be heard in a hearing test) of the patient's right ear, and "x" marks those of the left ear. For the left ear, the consonant compensation module 340 can sum the patient's hearing thresholds at 1000 Hz, 2000 Hz and 4000 Hz (60 dB, 65 dB and 65 dB), take the mean, and use 1/3 of that mean (21.11 dB) as the fixed compensation value by which the consonants of the speech are amplified.
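The arithmetic of this example, and one way of applying the resulting gain, can be sketched as follows. The description speaks of compensating the consonant frequency range; for simplicity this sketch applies a single broadband gain to the consonant frames and assumes non-overlapping frames, both of which are simplifying assumptions, as are the function names.

```python
def fixed_gain_db(thresholds_db):
    """One third of the mean hearing threshold (in dB) over the supplied
    frequencies, here 1000 Hz, 2000 Hz and 4000 Hz."""
    return sum(thresholds_db) / len(thresholds_db) / 3

def amplify_consonant(frames, consonant_range, gain_db):
    """Scale the samples of the frames inside consonant_range (inclusive
    frame indices) by gain_db; all other frames are copied unchanged."""
    start, end = consonant_range
    scale = 10 ** (gain_db / 20)        # decibels to linear amplitude factor
    return [[s * scale for s in frame] if start <= i <= end else list(frame)
            for i, frame in enumerate(frames)]
```

With the left-ear thresholds of Fig. 6, `fixed_gain_db([60, 65, 65])` evaluates to about 21.11 dB, matching the value in the text; as a linear amplitude factor that is roughly 11.4.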
Fig. 7 is a flowchart of the hearing loss compensation method of one embodiment of the invention.
In step 710, windowing and framing are applied to the time-domain voice signal x(l) to obtain the framed voice signal x(m).
In step 720, the short-time average magnitude $M_n$ of each frame of the voice signal is computed according to formula (1).
In step 730, the high energy threshold ITU and the low energy threshold ITL are computed.
In step 740, the short-time average magnitude $M_n$ of each of these frames is compared with the high energy threshold ITU and the low energy threshold ITL obtained in step 730 to obtain the start and end positions of the vowel (final).
In step 750, based on the short-time zero-crossing rate characteristics of consonants and the vowel start position obtained in step 740, the search proceeds from the start of the vowel toward earlier frames to locate the consonant (initial).
In step 760, hearing compensation is applied to the consonants in the speech according to the consonant position obtained in step 750 and the patient's audiogram.
Obviously, many variations of the invention described here are possible without departing from its true spirit and scope. All changes that would be apparent to those skilled in the art are therefore intended to fall within the scope of the claims. The scope of protection sought is limited only by the claims.

Claims (5)

1. A hearing loss compensation method, comprising:
receiving a voice signal x(m);
detecting the consonants in the voice signal x(m);
amplifying the detected consonants to compensate for hearing loss;
wherein
the start and end positions of the vowels in the voice signal are obtained from the short-time average magnitude of the voice signal, so that the consonants are detected based on the start and end positions of the vowels; the step of obtaining the start and end positions of a vowel comprises: computing the short-time average magnitude of each frame of an initial plurality of frames of the voice signal; computing a high energy threshold ITU and a low energy threshold ITL of the short-time magnitude; and obtaining the start and end positions of the vowel from the high energy threshold ITU and the low energy threshold ITL together with the short-time average magnitude characteristics of a vowel;
the compensation step comprises applying hearing compensation to the consonant frequency range of the voice signal, using as a fixed compensation value 1/3 of the mean hearing threshold at the three frequencies where speech consonants are concentrated, the three frequencies being 1000 Hz, 2000 Hz and 4000 Hz.
2. The hearing loss compensation method of claim 1, characterized in that the step of detecting the consonants in the voice signal comprises detecting the consonants based on the short-time zero-crossing rate of the speech.
3. The hearing loss compensation method of claim 1, characterized in that the step of detecting the consonants in the voice signal comprises:
computing the short-time zero-crossing rate of the speech;
comparing the short-time zero-crossing rate of each frame in turn, working from the position of the vowel toward earlier frames without going past the end position of the previous vowel;
if the short-time zero-crossing rate of a plurality of consecutive frames exceeds the zero-crossing detection threshold IZCT, locating the frames whose short-time zero-crossing rate exceeds IZCT as the consonant region.
4. Hearing loss compensation equipment that receives a voice signal from the outside, characterized in that it comprises:
a consonant acquisition module, which detects the consonants in the voice signal;
a consonant compensation module, which amplifies the detected consonants to compensate for hearing loss;
wherein
the equipment further comprises a vowel start/end position acquisition module, which obtains the start and end positions of the vowels in the voice signal from the short-time average magnitude of the voice signal, so that the consonants are detected based on the start and end positions of the vowels; obtaining the start and end positions of a vowel comprises: computing the short-time average magnitude of each frame of an initial plurality of frames of the voice signal; computing a high energy threshold ITU and a low energy threshold ITL of the short-time magnitude; and obtaining the start and end positions of the vowel from the high energy threshold ITU and the low energy threshold ITL together with the short-time average magnitude characteristics of a vowel;
the compensation module is further configured to apply hearing compensation to the consonant frequency range of the voice signal, using as a fixed compensation value 1/3 of the mean hearing threshold at the three frequencies where speech consonants are concentrated, the three frequencies being 1000 Hz, 2000 Hz and 4000 Hz.
5. The hearing loss compensation equipment of claim 4, characterized in that the consonant acquisition module detects the consonants based on the short-time zero-crossing rate of the speech.
CN201110150755A 2011-06-07 2011-06-07 Method and equipment for compensating hearing loss of Chinese language Expired - Fee Related CN102222507B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110150755A CN102222507B (en) 2011-06-07 2011-06-07 Method and equipment for compensating hearing loss of Chinese language

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110150755A CN102222507B (en) 2011-06-07 2011-06-07 Method and equipment for compensating hearing loss of Chinese language

Publications (2)

Publication Number Publication Date
CN102222507A CN102222507A (en) 2011-10-19
CN102222507B true CN102222507B (en) 2012-10-24

Family

ID=44779040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110150755A Expired - Fee Related CN102222507B (en) 2011-06-07 2011-06-07 Method and equipment for compensating hearing loss of Chinese language

Country Status (1)

Country Link
CN (1) CN102222507B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104093111A (en) * 2014-03-25 2014-10-08 嘉兴益尔电子科技有限公司 Digital hearing aid with Chinese tone enhancing method
CN111107478B (en) 2019-12-11 2021-04-09 江苏爱谛科技研究院有限公司 Sound enhancement method and sound enhancement system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006011405A1 (en) * 2004-07-28 2006-02-02 The University Of Tokushima Digital filtering method and device
CN1269106C (en) * 2004-08-31 2006-08-09 四川微迪数字技术有限公司 Chinese voice signal process method for digital deaf-aid
JP4876245B2 (en) * 2006-02-17 2012-02-15 国立大学法人九州大学 Consonant processing device, voice information transmission device, and consonant processing method
US8374877B2 (en) * 2009-01-29 2013-02-12 Panasonic Corporation Hearing aid and hearing-aid processing method

Also Published As

Publication number Publication date
CN102222507A (en) 2011-10-19

Similar Documents

Publication Publication Date Title
US9384759B2 (en) Voice activity detection and pitch estimation
CN101593522B (en) Method and equipment for full frequency domain digital hearing aid
US8504360B2 (en) Automatic sound recognition based on binary time frequency units
US20160066088A1 (en) Utilizing level differences for speech enhancement
KR101414233B1 (en) Apparatus and method for improving speech intelligibility
Hamid Frame blocking and windowing speech signal
Yoo et al. Speech signal modification to increase intelligibility in noisy environments
US9240190B2 (en) Formant based speech reconstruction from noisy signals
US9437213B2 (en) Voice signal enhancement
US10176824B2 (en) Method and system for consonant-vowel ratio modification for improving speech perception
JP2011033717A (en) Noise suppression device
JP5115818B2 (en) Speech signal enhancement device
CN102222507B (en) Method and equipment for compensating hearing loss of Chinese language
Sadjadi et al. A comparison of front-end compensation strategies for robust LVCSR under room reverberation and increased vocal effort
Hsu et al. Modulation Wiener filter for improving speech intelligibility
EP2063420A1 (en) Method and assembly to enhance the intelligibility of speech
Zorila et al. On the Quality and Intelligibility of Noisy Speech Processed for Near-End Listening Enhancement.
Remes et al. Comparing human and automatic speech recognition in a perceptual restoration experiment
Alwan et al. Human and machine recognition of nasal consonants in noise
US20230217194A1 (en) Methods for synthesis-based clear hearing under noisy conditions
KR20180087038A (en) Hearing aid with voice synthesis function considering speaker characteristics and method thereof
Dai et al. 2D psychoacoustic filtering for robust speech recognition
JP4005166B2 (en) Audio signal processing circuit
JP6435133B2 (en) Phoneme segmentation apparatus, speech processing system, phoneme segmentation method, and phoneme segmentation program
Liu et al. A targeting-and-extracting technique to enhance hearing in the presence of competing speech

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121024

Termination date: 20150607

EXPY Termination of patent right or utility model