CN109493877A - Voice enhancement method and device for a hearing aid device

Info

Publication number: CN109493877A
Application number: CN201710817728.7A
Authority: CN
Inventors: 王志华; 孙卓异; 姜汉钧
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2017-09-12
Filing date: 2017-09-12
Publication date: 2019-03-19
Anticipated expiration: 2037-09-12
Also published as: CN109493877B

Abstract

Embodiments of the invention disclose a voice enhancement method and device for a hearing aid device, relating to the fields of medical electronics and audio signal processing. The method comprises: acquiring four-channel audio data from the hearing aid device; extracting acoustic environment features from the acquired audio data to obtain the acoustic scene corresponding to the audio data; performing channel-wise sound compensation and speech enhancement on the acquired audio data according to the acoustic scene; and outputting two channels of enhanced audio data. The portable terminal performs speech enhancement on the acquired audio data and finally outputs two channels of real-time audio data. Sound quality is improved intelligently, hearing aids become more widely wearable, and a better hearing aid effect and enhancement mode can be achieved. Because the audio data processing is not fixed in the processor of the hearing aid device but runs on the general-purpose processor chip of the portable terminal, future system upgrades and refinement of the speech enhancement method are also facilitated.

Description

Voice enhancement method and device of hearing aid device

Technical Field

The present invention relates to the field of medical electronic technology and audio signal processing, and in particular, to a method and apparatus for speech enhancement in a hearing aid device.

Background

China is now an increasingly aging society. Life expectancy is rising, and the number of people with hearing loss or hearing damage caused by excessive use of electronic products is growing. In recent years, as the level of medical care has improved, the proportion of elderly people and hearing-impaired patients who wear hearing aids has also increased. Modern hearing aid technology is based on advanced digital signal processing, wireless communication, and artificial intelligence techniques. With the rapid development of technology, hearing aids are becoming smaller and their functions more comprehensive, including multi-channel wide dynamic range compression, active noise reduction, adaptive directivity, sound field analysis, and wireless connection to other audio or communication systems.

An important requirement for a hearing aid is that it compensate for the hearing loss of the hearing-impaired patient and improve audio quality without causing further hearing damage. The algorithms built into existing hearing aids are fixed in the processor and cannot be upgraded intelligently as the processor changes.

Disclosure of Invention

In order to solve the above technical problem, embodiments of the present invention provide a method and an apparatus for speech enhancement of a hearing aid device, which use a portable intelligent terminal (e.g., a mobile phone) to implement the speech enhancement function of a hearing aid.

In a first aspect, the present invention provides a method for speech enhancement of a hearing aid device, comprising:

acquiring four-channel audio data of a hearing aid device;

extracting acoustic environment characteristics from the acquired audio data to obtain an acoustic scene corresponding to the audio data;

performing channel-division sound compensation and voice enhancement on the acquired audio data according to the acoustic scene;

and outputting two paths of enhanced audio data.

Preferably, extracting the acoustic environment features from the acquired audio data, and obtaining the acoustic scene corresponding to the audio data includes:

extracting acoustic environment features of the audio data;

and matching the extracted acoustic environment characteristics with a preset voice environment, and determining the environment mode of the user.

Preferably, the performing of the channel-division sound compensation and the speech enhancement on the acquired audio data according to the acoustic scene includes:

preprocessing the audio data and filtering the audio data in channels;

sub-band division is carried out on the audio data after the channel division filtering; carrying out spectrum analysis on the sub-band of each audio data to obtain the signal-to-noise ratio of the sub-band of the audio data;

gating a sound source corresponding to the audio data according to the determined environment mode, and calculating the angle of the position of the sound source;

according to the determined angle of the position of the sound source and the signal to noise ratio of the sub-band, carrying out noise reduction and howling elimination processing on each sub-band of the audio data;

performing dynamic compression and sound intensity amplification processing on each sub-band of the audio data subjected to noise reduction;

performing time-frequency conversion on the frequency domain signal corresponding to each sub-band of the compressed and amplified audio data, and performing linear phase compensation;

each sub-band of the audio data is combined into a time-domain speech signal.

Preferably, the pre-processing the audio data comprises:

and performing first-order high-pass filtering on the components of the audio data whose frequencies are greater than a preset value.

Preferably, extracting the acoustic environment features from the acquired audio data, and obtaining the acoustic scene corresponding to the audio data further includes:

obtaining parameters of at least one of the following of the environmental patterns:

a modulation amplitude parameter, a directivity control parameter, a compression-amplification ratio parameter, and a noise suppression parameter.

Preferably, the sound source corresponding to the audio data is gated according to the determined environment mode, and calculating the angle of the position of the sound source includes:

gating sound sources of all directions of the hearing aid device according to the directivity control parameters;

the angle of the position of the sound source is calculated.

Preferably, the noise reduction processing for each sub-band of the audio data comprises:

identifying whether the audio data is noise according to the modulation amplitude parameter based on the envelope modulation characteristic and the spectrum analysis result of the audio data;

and carrying out suppression processing on the noise according to the determined signal-to-noise ratio and the noise suppression parameter.

Preferably, the performing time-frequency conversion on the frequency domain signal corresponding to each sub-band of the compressed and amplified audio data, and performing linear phase compensation includes:

performing time-frequency conversion on the frequency domain signal corresponding to each sub-band of the compressed and amplified audio data;

and performing phase compensation of corresponding degree according to the compression and amplification scale factor.

In a second aspect, the present invention also provides a speech enhancement device for a hearing aid device, comprising:

a sound pickup module configured to acquire four-channel audio data of the hearing aid device;

the acoustic environment monitoring module is configured to extract acoustic environment characteristics from the acquired audio data and acquire an acoustic scene corresponding to the audio data;

the sound processing module is arranged for performing channel-division sound compensation and voice enhancement on the acquired audio data according to the acoustic scene;

and the output module is used for outputting two paths of enhanced audio data.

Preferably, the extracting, by the acoustic environment monitoring module, the acoustic environment feature from the acquired audio data, and the obtaining an acoustic scene corresponding to the audio data includes:

extracting acoustic environment features of the audio data;

Preferably, the sound processing module includes:

a preprocessing unit configured to perform preprocessing and channel-wise filtering on the audio data;

the sub-band dividing unit is used for dividing the sub-band of the audio data after the sub-channel filtering; carrying out spectrum analysis on the sub-band of each audio data to obtain the signal-to-noise ratio of the sub-band of the audio data;

the sound source positioning unit is used for gating a sound source corresponding to the audio data according to the determined environment mode and calculating the angle of the position of the sound source;

the howling suppression and feedback elimination unit is set to perform noise reduction and howling elimination processing on each sub-band of the audio data according to the determined angle of the position of the sound source and the signal-to-noise ratio of the sub-band;

the compression and amplification unit is used for carrying out dynamic compression and sound intensity amplification processing on each sub-band of the audio data subjected to noise reduction;

the sound compensation unit is used for performing time-frequency conversion on the frequency domain signal corresponding to each sub-band of the compressed and amplified audio data and performing linear phase compensation;

a sound synthesis unit arranged to combine each sub-band of the audio data into a time domain speech signal.

Preferably, the preprocessing module preprocessing the audio data includes:

Preferably, the acoustic environment monitoring module is further configured to:

Preferably, the sound source positioning unit gates the sound source corresponding to the audio data according to the determined environment mode, and calculating the angle of the position of the sound source includes:

selecting sound sources in all directions of the hearing aid device according to the directivity control parameters;

the angle of the position of the sound source is calculated.

Preferably, the performing, by the howling suppression and feedback cancellation unit, noise reduction processing on each sub-band of the audio data includes:

Preferably, the sound compensation unit performs time-frequency conversion on the frequency domain signal corresponding to each sub-band of the compressed and amplified audio data, and performs linear phase compensation, including:

In a third aspect, the present invention further provides a speech enhancement apparatus, including: a memory and a processor;

the memory is used for storing executable instructions;

the processor is configured to execute the executable instructions stored in the memory, and perform the following operations:

acquiring four-channel audio data of a hearing aid device;

and outputting two paths of enhanced audio data.

In a fourth aspect, the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores computer-executable instructions, and when the computer-executable instructions are executed, the following operations are performed:

acquiring four-channel audio data of a hearing aid device;

and outputting two paths of enhanced audio data.

According to the voice enhancement method and device of the hearing aid device, the ear-side hearing aid device acquires four channels of audio data and transmits them to the portable terminal, the portable terminal performs speech enhancement on the acquired audio data, and two channels of real-time audio data are finally output. Unlike the hearing aid system built into a conventional hearing aid device, the embodiment of the invention takes into account the differing performance of portable terminal processors and provides a speech enhancement method that can be upgraded intelligently across different portable terminals. Sound quality is improved intelligently, hearing aids become more widely wearable, and a better hearing aid effect and enhancement mode can be achieved. Because the audio data processing is not fixed in the processor of the hearing aid device but runs on the general-purpose processor chip of the portable terminal, future system upgrades and refinement of the speech enhancement method are also facilitated.

Drawings

The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the example serve to explain the principles of the invention and not to limit the invention.

Fig. 1 is a flowchart of a speech enhancement method of a hearing aid device according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a speech enhancement device of a hearing aid device according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a sound processing module according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.

Portable intelligent terminals (such as mobile phones) are now widespread, and the computing power of their general-purpose processors has grown considerably. However, most existing hearing aids lack a mature speech enhancement method that works together with a portable terminal to implement hearing aid functions, and the algorithms built into existing hearing aids are fixed in the processor and cannot be upgraded intelligently as the processor changes. As shown in Fig. 1, an embodiment of the present invention provides a method for enhancing speech of a hearing aid device, implemented by a portable terminal processor, including:

S101, acquiring four-channel audio data of a hearing aid device;

S102, extracting acoustic environment characteristics from the acquired audio data to obtain an acoustic scene corresponding to the audio data;

S103, performing channel-division sound compensation and voice enhancement on the acquired audio data according to the acoustic scene;

and S104, outputting two paths of enhanced audio data.

In the embodiment of the present invention, the four-channel audio data refers to the sound inputs that the portable terminal acquires from the left-ear front microphone, the left-ear rear microphone, the right-ear front microphone, and the right-ear rear microphone of the hearing aid device.

According to the embodiment of the invention, intelligent speech enhancement is implemented on the portable terminal without updating the ear-side hearing aid device, so the hearing loss of a hearing-impaired patient can be compensated using the portable terminal alone. The ear-side hearing aid device acquires four channels of audio data and transmits them to the portable terminal, the portable terminal performs speech enhancement on the acquired audio data, and two channels of real-time audio data are finally output. Sound quality is improved intelligently, and hearing aids become more widely wearable. Intelligent speech enhancement runs in real time alongside the basic hearing aid function, which is convenient for the user and makes the product easy to upgrade.

Step S102, extracting acoustic environment characteristics from the acquired audio data, and acquiring an acoustic scene corresponding to the audio data includes:

extracting acoustic environment features of the audio data;
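The matching of extracted features against preset environments, described for this step in the summary above, could be sketched as a nearest-centroid lookup. This is only an illustration: the feature pair (mean level, modulation depth), the preset names, the centroid values and the scaling factors are all assumptions and do not come from the patent.

```python
import math

# Hypothetical preset environments. Each maps to an assumed feature centroid:
# (mean level in dB, envelope modulation depth in the range 0..1).
PRESET_ENVIRONMENTS = {
    "quiet": (35.0, 0.2),
    "speech_in_quiet": (60.0, 0.8),
    "speech_in_noise": (70.0, 0.5),
    "noise": (75.0, 0.1),
}


def match_environment(features, presets=PRESET_ENVIRONMENTS, scales=(40.0, 1.0)):
    """Return the name of the preset whose centroid is nearest to `features`.

    `scales` normalises each feature so that neither dominates the distance;
    the values used here are illustrative assumptions.
    """
    def distance(centroid):
        return math.sqrt(
            sum(((f - c) / s) ** 2 for f, c, s in zip(features, centroid, scales))
        )

    return min(presets, key=lambda name: distance(presets[name]))


environment_mode = match_environment((62.0, 0.75))
```

The returned name would then select the per-environment parameters (modulation amplitude, directivity control, compression-amplification ratio, noise suppression) used by the later steps.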

Step S103 of performing the channel-by-channel sound compensation and the speech enhancement on the acquired audio data according to the acoustic scene includes:

S1031, preprocessing the audio data and filtering the audio data in a channel division manner;

S1032, sub-band division is carried out on the audio data after the channel division filtering; carrying out spectrum analysis on the sub-band of each audio data to obtain the signal-to-noise ratio of the sub-band of the audio data;

S1033, gating a sound source corresponding to the audio data according to the determined environment mode, and calculating the angle of the position of the sound source;

S1034, according to the determined angle of the position of the sound source and the signal-to-noise ratio of the sub-band, carrying out noise reduction and howling elimination processing on each sub-band of the audio data;

S1035, performing dynamic compression and sound intensity amplification processing on each sub-band of the audio data subjected to noise reduction;

S1036, performing time-frequency conversion on the frequency domain signals corresponding to each sub-band of the compressed and amplified audio data, and performing linear phase compensation;

and S1037, combining each sub-band of the audio data into a time domain voice signal.
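The per-sub-band signal-to-noise ratio obtained in step S1032 could be computed along the following lines. This is a minimal sketch that assumes a noise power estimate is already available for every sub-band; the function names and the use of the a posteriori SNR (noisy power over estimated noise power) are illustrative choices, not details given in the patent.

```python
import math


def band_power(samples):
    """Mean-square power of the samples of one sub-band."""
    return sum(x * x for x in samples) / len(samples)


def posterior_snr_db(noisy_power, noise_power, eps=1e-12):
    """A posteriori SNR in dB for each sub-band.

    noisy_power[k] is the power of the noisy signal in sub-band k and
    noise_power[k] is the estimated noise power in the same sub-band.
    `eps` guards against division by zero and log of zero.
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("power lists must have the same length")
    return [
        10.0 * math.log10(max(y, eps) / max(n, eps))
        for y, n in zip(noisy_power, noise_power)
    ]
```

For example, a sub-band whose noisy power is ten times the noise estimate has an a posteriori SNR of 10 dB.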

In step S1031 of this embodiment, channel-wise filtering is performed: the background noise is determined from the detected voice endpoints, and a first filtering pass is performed using spectral subtraction, yielding four channels of speech signals from which noise has been preliminarily removed.

Wherein preprocessing the audio data comprises:

In the embodiment of the invention, preprocessing mainly refers to pre-emphasis: the high-frequency components are passed through a first-order high-pass filter, which increases the high-frequency resolution of the speech.
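Pre-emphasis is commonly implemented as the first-order difference y[n] = x[n] - c·x[n-1]. The sketch below assumes that form; the coefficient 0.97 is a conventional choice and is not specified in the patent.

```python
def pre_emphasis(samples, coeff=0.97):
    """First-order high-pass pre-emphasis: y[n] = x[n] - coeff * x[n-1].

    The sample before the start of the signal is taken to be zero.
    `coeff` is an assumed value; the patent does not state one.
    """
    out = []
    prev = 0.0
    for x in samples:
        out.append(x - coeff * prev)
        prev = x
    return out
```

A constant (low-frequency) input is almost cancelled after the first sample, while a sign-alternating (high-frequency) input is nearly doubled in amplitude, which is the intended boost of high-frequency content.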

The embodiment of the invention performs channel-wise filtering with gammatone filters, as follows:

Owing to the structure of the human ear, different locations on the cochlear basilar membrane respond to different frequencies. This frequency selectivity can be expressed by an N-th order gammatone filter of non-uniform bandwidth, whose time-domain expression satisfies the following formula:

$$g(t) = a\,t^{N-1}\,e^{-2\pi b t}\cos(2\pi f_c t + \varphi),\quad t \ge 0$$

where $\varphi$ represents the phase, $f_c$ the center frequency, $b$ the bandwidth, $N$ the order of the filter, $t$ the time, and $a$ the amplitude.
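As an illustration, the impulse response of one gammatone channel can be sampled directly, assuming the standard gammatone form g(t) = a·t^(N-1)·exp(-2πbt)·cos(2πf_c·t + φ) with the symbols defined above. The sample rate, duration and default order in this sketch are assumptions.

```python
import math


def gammatone_impulse_response(fc, b, order=4, amplitude=1.0, phase=0.0,
                               sample_rate=16000, duration=0.025):
    """Sample g(t) = a * t**(N-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t + phase).

    fc and b are in Hz, duration in seconds. The defaults (order 4,
    16 kHz sampling, 25 ms) are illustrative assumptions.
    """
    n_samples = int(round(duration * sample_rate))
    response = []
    for i in range(n_samples):
        t = i / sample_rate
        envelope = amplitude * t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        response.append(envelope * math.cos(2.0 * math.pi * fc * t + phase))
    return response


h = gammatone_impulse_response(fc=1000.0, b=125.0)
```

Filtering a channel then amounts to convolving its samples with `h`; a bank of such responses with different center frequencies and bandwidths gives the non-uniform channel division.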

The noise contained in the four channels of audio data usually lies in the low frequency band. There, the noise is attenuated using spectral subtraction with a variable noise subtraction parameter α, so that the degree of speech distortion is controllable. For speech signals in the high frequency band, the noise spectral components are removed using a cross-correlation function method, which retains the parameters required for localization and does not attenuate the speech signal. The variable noise subtraction parameter α is determined according to the following formula:

where k denotes the sub-band index and l the sub-band frame index; the initial value of α is chosen at random; SNR_p denotes the a posteriori signal-to-noise ratio; σ is a positive integer that controls the extent of spectral subtraction applied to the sub-band noise spectrum; α_i(k) is the evaluation parameter and is related to the estimate of the a priori SNR; and β prevents the denominator from being zero (the a posteriori SNR may approach zero) and is calculated as the reciprocal of the difference between the maximum and minimum values of α over the speech segments.

The boundary between the high band and the low band is calculated from the noise power spectrum of each sub-band output signal; the dividing frequency is usually chosen in the range of about 800 Hz to 1000 Hz.
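The role of the noise subtraction parameter α can be illustrated with a generic power spectral subtraction. This sketch uses a fixed α and a spectral floor; it does not reproduce the patent's formula for the variable α, and the `floor` parameter is an assumption unrelated to the β described above.

```python
import math


def spectral_subtract(noisy_mag, noise_mag, alpha=2.0, floor=0.05):
    """Generic power spectral subtraction for one frame.

    For each frequency bin: |S|^2 = max(|Y|^2 - alpha * |N|^2, floor * |Y|^2).
    A larger alpha removes more noise at the cost of more speech distortion.
    `floor` keeps a small residual so that no bin becomes negative.
    """
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        power = y * y - alpha * n * n
        cleaned.append(math.sqrt(max(power, floor * y * y)))
    return cleaned
```

In a scheme like the one described, α would be recomputed for each sub-band and frame from the SNR estimates, and the subtraction would be applied only to the sub-bands below the dividing frequency.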

Step S102 is to extract acoustic environment features from the acquired audio data, and after obtaining an acoustic scene corresponding to the audio data, the method further includes:

The directivity control parameters include parameters such as the binaural time difference, the binaural intensity difference, the binaural phase difference, and the front-rear ear phase difference.

Gating the sound source corresponding to the audio data according to the determined environment mode, wherein the step of calculating the angle of the position of the sound source comprises the following steps:

the angle of the position of the sound source is calculated.
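One way the angle might be derived from the binaural time difference is to estimate the inter-channel delay by cross-correlation and convert it with a far-field model, sin θ = c·τ/d. The patent does not state its localization formula, so the model, the speed of sound constant and the microphone spacing below are all assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, assumed room-temperature value


def estimate_delay_samples(left, right, max_lag):
    """Lag (in samples) that maximises the cross-correlation.

    A positive result means the right channel lags the left channel.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += x * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle_deg(delay_samples, sample_rate, mic_distance):
    """Far-field angle from the delay: sin(theta) = c * tau / d.

    mic_distance is the assumed spacing between the two microphones in metres.
    The sine is clamped to [-1, 1] so that noisy delays cannot fail asin.
    """
    sin_theta = SPEED_OF_SOUND * delay_samples / sample_rate / mic_distance
    sin_theta = max(-1.0, min(1.0, sin_theta))
    return math.degrees(math.asin(sin_theta))
```

With a 3-sample delay at 16 kHz and an assumed 0.18 m spacing, the estimated angle is roughly 21 degrees off the median plane.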

Performing noise reduction processing on each sub-band of the audio data includes:

In the embodiment of the invention, the modulation amplitude parameter is determined by the environment. Because the envelope of a speech signal has modulation characteristics, the modulation rate obtained after spectrum analysis can be used to identify whether the input acoustic signal is speech or noise. The noise suppression parameter also depends on the environment: the ambient noise spectrum differs between noisy and quiet environments, and so does the calculated input signal-to-noise ratio. The noise suppression parameter is therefore used to calculate the variable noise subtraction parameter α.
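The speech-versus-noise decision can be illustrated with a simple envelope measure. The sketch below uses the envelope modulation depth as a stand-in for the modulation rate mentioned above, and treats the modulation amplitude parameter as a decision threshold; both choices are assumptions, since the patent does not define the measure.

```python
import math


def envelope(samples, frame_len):
    """RMS envelope computed over non-overlapping frames."""
    env = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        env.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return env


def modulation_depth(env):
    """(max - min) / (max + min) of the envelope, in the range 0..1."""
    if not env:
        return 0.0
    hi, lo = max(env), min(env)
    return 0.0 if hi + lo == 0 else (hi - lo) / (hi + lo)


def is_speech(samples, frame_len, modulation_threshold):
    """Classify as speech when the envelope is strongly modulated."""
    return modulation_depth(envelope(samples, frame_len)) > modulation_threshold
```

Stationary noise has a nearly flat envelope and a depth close to zero, whereas speech alternates between loud and quiet segments and produces a depth close to one.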

Performing time-frequency conversion on the frequency domain signal corresponding to each sub-band of the compressed and amplified audio data, and performing linear phase compensation comprises:

The compression-amplification ratio parameter is determined by the patient's hearing loss. An audiogram is generated after audiometry, marking the patient's hearing at different frequencies, and the compression-amplification ratio parameter is determined from these data so that sound is amplified to the normal hearing level. In sound compensation, the compression-amplification scale factor differs between environments, so that different degrees of compensation are applied.

According to the embodiment of the invention, the speech signals are integrated after pre-emphasis filtering, gammatone channel-wise filtering, spectral subtraction, and the other processing described above, so that a better hearing aid effect and enhancement can be achieved.
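How audiogram data might translate into per-band gain can be sketched with a simple compression curve. The half-gain heuristic (`gain_fraction=0.5`), the knee point and the compression ratio below are illustrative assumptions; the patent does not give its fitting rule.

```python
def compression_gain_db(input_level_db, linear_gain_db, knee_db=50.0, ratio=2.0):
    """Gain of a simple wide dynamic range compressor.

    Below the knee the gain is constant. Above it the output level rises by
    only 1/ratio dB per dB of input, so the gain shrinks as the input grows.
    """
    if input_level_db <= knee_db:
        return linear_gain_db
    return linear_gain_db - (input_level_db - knee_db) * (1.0 - 1.0 / ratio)


def band_gains_db(audiogram_db_hl, input_levels_db, gain_fraction=0.5,
                  knee_db=50.0, ratio=2.0):
    """Per-band gain from the hearing loss and the current input level.

    audiogram_db_hl[k] is the measured hearing loss in band k (dB HL).
    gain_fraction scales the loss into a linear gain target (assumed value).
    """
    return [
        compression_gain_db(level, gain_fraction * loss, knee_db, ratio)
        for loss, level in zip(audiogram_db_hl, input_levels_db)
    ]
```

Switching environments would correspond to choosing a different `ratio` or `knee_db` per acoustic scene, so that soft sounds are amplified more than loud ones by a scene-dependent amount.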

As shown in fig. 2, an embodiment of the present invention further provides a speech enhancement device for a hearing aid device, disposed at a portable terminal side, including:

a sound pickup module 11 configured to acquire four-channel audio data of the hearing aid device;

the acoustic environment monitoring module 12 is configured to extract acoustic environment features from the acquired audio data to obtain an acoustic scene corresponding to the audio data;

the sound processing module 13 is configured to perform channel-division sound compensation and speech enhancement on the acquired audio data according to the acoustic scene;

an output module 14 configured to output two paths of enhanced audio data.

The acoustic environment monitoring module 12 extracts acoustic environment features from the acquired audio data, and acquiring an acoustic scene corresponding to the audio data includes:

extracting acoustic environment features of the audio data;

The sound processing module 13 includes:

a preprocessing unit 131 configured to perform preprocessing and channel-wise filtering on the audio data;

a sub-band division unit 132 configured to sub-band-divide the audio data after the channel division filtering; carrying out spectrum analysis on the sub-band of each audio data to obtain the signal-to-noise ratio of the sub-band of the audio data;

a sound source positioning unit 133 configured to gate a sound source corresponding to the audio data according to the determined environment mode, and calculate an angle of a position where the sound source is located;

a howling suppression and feedback elimination unit 134 configured to perform noise reduction and howling elimination processing on each sub-band of the audio data according to the determined angle of the position of the sound source and the signal-to-noise ratio of the sub-band;

a compressing and amplifying unit 135 configured to perform dynamic compression and sound intensity amplification processing on each sub-band of the audio data after noise reduction;

the sound compensation unit 136 is configured to perform time-frequency conversion on the frequency domain signal corresponding to each sub-band of the compressed and amplified audio data, and perform linear phase compensation;

a sound synthesis unit 137 arranged to combine each sub-band of the audio data into a time domain speech signal.

The preprocessing module preprocessing the audio data comprises:

The acoustic environment monitoring module is further configured to:

The sound source positioning unit 133 gates the sound source corresponding to the audio data according to the determined environment mode, and calculating the angle of the position of the sound source includes:

the angle of the position of the sound source is calculated.

The noise reduction processing of each sub-band of the audio data by the howling suppression and feedback elimination unit comprises:

The sound compensation unit performs time-frequency conversion on the frequency domain signal corresponding to each sub-band of the compressed and amplified audio data, and performs linear phase compensation, including:

An embodiment of the present invention further provides a speech enhancement apparatus, including: a memory and a processor;

the memory is used for storing executable instructions;

acquiring four-channel audio data of a hearing aid device;

and outputting two paths of enhanced audio data.

The embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores computer-executable instructions, and when the processor executes the computer-executable instructions, the following operations are performed:

acquiring four-channel audio data of a hearing aid device;

and outputting two paths of enhanced audio data.

It will be understood by those skilled in the art that all or part of the steps of the above methods may be implemented by a program instructing associated hardware (e.g., a processor) which may be stored in a computer readable storage medium such as a read only memory, a magnetic or optical disk, etc. Alternatively, all or part of the steps of the above embodiments may be implemented using one or more integrated circuits. Accordingly, the modules/units in the above embodiments may be implemented in hardware, for example, by an integrated circuit, or may be implemented in software, for example, by a processor executing programs/instructions stored in a memory to implement the corresponding functions. Embodiments of the invention are not limited to any specific form of hardware or software combination.

Although the embodiments of the present invention have been described above, the above description is only for the convenience of understanding the present invention, and is not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A method for speech enhancement of a hearing assistance device, comprising:

acquiring four-channel audio data of a hearing aid device;

and outputting two paths of enhanced audio data.

2. The speech enhancement method according to claim 1, wherein extracting acoustic environment features from the acquired audio data and obtaining an acoustic scene corresponding to the audio data comprises:

extracting acoustic environment features of the audio data;

3. The speech enhancement method of claim 2, wherein performing the sub-channel sound compensation and speech enhancement on the acquired audio data according to the acoustic scene comprises:

preprocessing the audio data and filtering the audio data in channels;

each sub-band of the audio data is combined into a time-domain speech signal.

4. The speech enhancement method of claim 3 wherein pre-processing the audio data comprises:

5. The speech enhancement method according to claim 3, wherein the step of extracting the acoustic environment features from the obtained audio data and obtaining the acoustic scene corresponding to the audio data further comprises:

6. The speech enhancement method of claim 5, wherein the gating of the sound source corresponding to the audio data according to the determined environmental mode, the calculating of the angle of the position of the sound source comprises:

the angle of the position of the sound source is calculated.

7. The speech enhancement method of claim 5 wherein denoising each sub-band of the audio data comprises:

8. The speech enhancement method of claim 5, wherein performing time-frequency conversion on the frequency domain signal corresponding to each sub-band of the compressed and amplified audio data and performing linear phase compensation comprises:

9. A speech enhancement device for a hearing assistance device, comprising:

and the output module is used for outputting two paths of enhanced audio data.

10. The speech enhancement device according to claim 9, wherein the acoustic environment monitoring module extracts acoustic environment features from the acquired audio data, and obtaining an acoustic scene corresponding to the audio data includes:

extracting acoustic environment features of the audio data;

11. The speech enhancement device of claim 10 wherein the sound processing module comprises:

12. The speech enhancement device of claim 11, wherein the pre-processing module pre-processes the audio data comprising:

13. The speech enhancement device of claim 11, wherein the acoustic environment monitoring module is further configured to:

14. The apparatus according to claim 13, wherein the sound source localization unit gates the sound source corresponding to the audio data according to the determined environment mode, and the calculating the angle of the position of the sound source comprises:

the angle of the position of the sound source is calculated.

15. The apparatus of claim 13, wherein the means for performing noise reduction on each sub-band of the audio data comprises:

16. The apparatus of claim 13, wherein the sound compensation unit performs time-frequency conversion on the frequency domain signal corresponding to each sub-band of the compressed and amplified audio data, and performs linear phase compensation, and comprises:

17. A speech enhancement apparatus, comprising: a memory and a processor;

the memory is used for storing executable instructions;

acquiring four-channel audio data of a hearing aid device;

and outputting two paths of enhanced audio data.

18. A computer-readable storage medium having computer-executable instructions stored thereon that, when executed, perform operations comprising:

acquiring four-channel audio data of a hearing aid device;

and outputting two paths of enhanced audio data.