CN117278896B - Voice enhancement method and device based on double microphones and hearing aid equipment - Google Patents


Info

Publication number
CN117278896B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311566583.XA
Other languages
Chinese (zh)
Other versions
CN117278896A (en)
Inventor
方韶劻
林凤梅
曾庆宁
罗瀛
龙超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Aangsi Science & Technology Co ltd
Original Assignee
Shenzhen Aangsi Science & Technology Co ltd
Application filed by Shenzhen Aangsi Science & Technology Co ltd
Priority to CN202311566583.XA
Publication of CN117278896A
Application granted
Publication of CN117278896B
Legal status: Active


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00: Details of transducers, loudspeakers or microphones
    • H04R1/08: Mouthpieces; Microphones; Attachments therefor
    • H04R3/00: Circuits for transducers, loudspeakers or microphones
    • H04R2410/00: Microphones
    • H04R2410/01: Noise reduction using microphones having different directional characteristics
    • H04R2430/00: Signal processing covered by H04R, not provided for in its groups
    • H04R2430/03: Synergistic effects of band splitting and sub-band processing

Abstract

The application belongs to the field of voice signal processing and provides a dual-microphone-based voice enhancement method and device, a computer readable storage medium and hearing aid equipment. The method comprises: sampling the signals received by two microphones at an oversampling frequency to obtain two noise-containing voice signals, and downsampling each of them to obtain a plurality of downsampled noise-containing voice signals; selecting one downsampled noise-containing voice signal for each of the two noise-containing voice signals and performing voice differential beamforming to obtain one differential voice signal, and selecting a plurality of downsampled noise-containing voice signals for each of the two noise-containing voice signals and performing a plurality of noise differential beamforming operations to obtain a plurality of differential noise signals; applying a multi-channel adaptive noise cancellation (MANC) algorithm improved with voice activity detection (VAD) to the one differential voice signal and the plurality of differential noise signals to remove the residual noise in the differential voice signal; and then performing a time-domain recovery operation and outputting the result. The method cancels high-frequency and broadband noise well.

Description

Voice enhancement method and device based on double microphones and hearing aid equipment
Technical Field
The application belongs to the technical field of voice signal processing, and particularly relates to a voice enhancement method and device based on double microphones, a computer readable storage medium and hearing aid equipment.
Background
Hearing assistance devices mainly include hearing aids, cochlear implants, middle-ear implants, assistive listening devices and the like. Speech enhancement is an important technique in such devices; its main purpose is to suppress the noise in noisy speech and obtain clean speech, so as to improve the clarity and intelligibility of the speech.
Speech enhancement in the prior art is mainly either single-channel or multi-channel. Single-channel speech enhancement inevitably damages the speech signal, whereas multi-channel speech enhancement generally performs better because it has more signals available.
Multi-channel speech enhancement is usually implemented with a microphone array, whose large spatial aperture severely limits its use in many applications. A dual microphone, consisting of two microphones, can be regarded as the simplest microphone array and is more widely used. However, the multi-channel speech enhancement approaches adopted in the prior art cancel noise poorly, especially high-frequency noise or broadband noise.
Disclosure of Invention
The present application aims to provide a dual-microphone-based voice enhancement method and device, a computer readable storage medium and hearing aid equipment, so as to solve the problem that the multi-channel speech enhancement approaches adopted in the prior art cancel noise poorly, especially high-frequency noise or broadband noise.
In a first aspect, the present application provides a dual microphone based speech enhancement method, the method comprising the steps of:
S101, sampling the signals received by two microphones at an oversampling frequency to obtain two noise-containing voice signals, and downsampling each of the two noise-containing voice signals to obtain a plurality of downsampled noise-containing voice signals;
S102, for each of the two noise-containing voice signals, selecting one corresponding downsampled noise-containing voice signal, and performing voice differential beamforming on the selected signals to obtain one differential voice signal; for each of the two noise-containing voice signals, selecting a plurality of corresponding downsampled noise-containing voice signals, and performing a plurality of noise differential beamforming operations on the selected signals to obtain a plurality of differential noise signals;
S103, applying a multi-channel adaptive noise cancellation (MANC) algorithm improved with voice activity detection (VAD) to the one differential voice signal and the plurality of differential noise signals, to further eliminate the residual noise in the differential voice signal;
S104, performing a time-domain recovery operation on the differential voice signal from which the residual noise has been further eliminated, to obtain the output enhanced voice.
In a second aspect, the present application provides a dual microphone based speech enhancement apparatus comprising:
a sampling module, configured to sample the signals received by two microphones at an oversampling frequency to obtain two noise-containing voice signals, and to downsample each of the two noise-containing voice signals to obtain a plurality of downsampled noise-containing voice signals;
a differential beamforming module, configured to select, for each of the two noise-containing voice signals, one corresponding downsampled noise-containing voice signal and perform voice differential beamforming on the selected signals to obtain one differential voice signal, and to select, for each of the two noise-containing voice signals, a plurality of corresponding downsampled noise-containing voice signals and perform a plurality of noise differential beamforming operations on the selected signals to obtain a plurality of differential noise signals;
a noise cancellation module, configured to apply a multi-channel adaptive noise cancellation algorithm improved with voice activity detection to the one differential voice signal and the plurality of differential noise signals, to further eliminate the residual noise in the differential voice signal;
and a recovery operation module, configured to perform a time-domain recovery operation on the differential voice signal from which the residual noise has been further eliminated, to obtain the output enhanced voice.
In a third aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of a dual microphone based speech enhancement method as described.
In a fourth aspect, the present application provides a hearing assistance device comprising:
one or more processors;
a memory; and one or more computer programs, the processor and the memory being connected by a bus, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, which when executing the computer programs implement the steps of the dual microphone based speech enhancement method as described.
In the present application, the signals received by the two microphones are sampled at an oversampling frequency to obtain two noise-containing voice signals, and each of them is downsampled to obtain a plurality of downsampled noise-containing voice signals; one downsampled signal per microphone is used for voice differential beamforming to obtain one differential voice signal, and a plurality of downsampled signals per microphone are used for a plurality of noise differential beamforming operations to obtain a plurality of differential noise signals; a multi-channel adaptive noise cancellation (MANC) algorithm improved with voice activity detection (VAD) is applied to the one differential voice signal and the plurality of differential noise signals to further eliminate the residual noise in the differential voice signal; and a time-domain recovery operation is performed on the result to obtain the output enhanced voice. As a result, the noise cancellation effect is better, especially for high-frequency noise or broadband noise.
Drawings
Fig. 1 is a flowchart of a dual microphone-based speech enhancement method according to an embodiment of the present application.
Fig. 2 is a schematic diagram of a dual microphone acquisition signal.
Fig. 3 is a beam pattern of voice differential beamforming.
Fig. 4 is a beam pattern of noise differential beamforming.
Fig. 5 is a schematic diagram of an improved multipath adaptive noise cancellation algorithm.
Fig. 6 is a functional block diagram of a dual microphone based speech enhancement device according to an embodiment of the present application.
Fig. 7 is a specific block diagram of a hearing aid device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantageous effects of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In order to illustrate the technical solutions described in the present application, the following description is made by specific examples.
Fig. 1 is a flowchart of the dual-microphone-based voice enhancement method provided by an embodiment of the present application. The embodiment is described mainly by taking the application of the method to hearing aid equipment as an example. The method includes the following steps:
S101, sampling the signals received by two microphones at an oversampling frequency to obtain two noise-containing voice signals, and downsampling each of the two noise-containing voice signals to obtain a plurality of downsampled noise-containing voice signals.
In one embodiment of the present application, the voice source is located on the line through the two microphones, beyond one of the microphones, and the noise source is located in another direction, ideally on the perpendicular bisector of the line connecting the two microphones.
Referring to Fig. 2, the distance $d$ between the two microphones is calculated as:
$$d = \frac{c}{f_s} \tag{1}$$
where $f_s$ is the sampling frequency of the preset common voice signal and $c$ is the speed of sound. Assuming $f_s$ = 16 kHz and $c$ = 340 m/s, $d$ = 0.02125 m.
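The numbers quoted above (16 kHz, 340 m/s, 0.02125 m) are consistent with a spacing equal to the speed of sound divided by the sampling frequency, so that sound travelling along the microphone axis takes one sample period to pass from one microphone to the other. The following is a minimal sketch of that arithmetic; the function name `mic_spacing` and its parameter names are illustrative assumptions, not taken from the patent.

```python
def mic_spacing(sample_rate_hz: float, speed_of_sound_m_s: float = 340.0) -> float:
    """Return the spacing (in metres) that sound covers in one sample period."""
    return speed_of_sound_m_s / sample_rate_hz


spacing = mic_spacing(16_000.0)
print(f"{spacing:.5f} m")  # 0.02125 m
```

At 22.05 kHz, the other example rate mentioned in the description, the same formula would give a somewhat smaller spacing.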
As shown in Fig. 2, ideally the voice source is directly in front of the wearer of the hearing aid equipment and the noise source is to the left or right of the wearer. That is, the voice source lies on the extension of the line connecting the two microphones, close to one of the microphones, and the noise lies on the perpendicular bisector of the line connecting the two microphones.
In an embodiment of the present application, the S101 may specifically be:
sampling the signals received by the two microphones at an oversampling frequency $M f_s$ to obtain two noise-containing voice signals, and downsampling each of the two noise-containing voice signals to obtain $M$ paths of downsampled noise-containing voice signals with a sampling rate of $f_s$, where $M$ is a positive integer greater than 1 and $f_s$ is the sampling frequency of the preset common voice signal, for example 22.05 kHz or 16 kHz.
The specific formulas are as follows:
The noise-containing voice signal sampled by the i-th microphone at sampling time n is defined by equations (2) to (4) (not reproduced in this text).
The quantities in these equations are: the voice signal theoretically sampled by the i-th microphone at sampling time n; the interference noise theoretically sampled by the i-th microphone at sampling time n; the amplitude attenuation factor of the speech as it propagates from one microphone to the other; and a positive integer greater than 1.
The theoretically sampled voice signal, interference noise and noise-containing voice signal are then downsampled to give the downsampled voice signal, the downsampled interference noise and the downsampled noise-containing voice signal, as defined by equations (5) to (7) (not reproduced in this text).
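One common way to turn a single oversampled stream into several streams at the base rate is a phase-offset (polyphase) split. The sketch below assumes that layout; the patent's own downsampling equations are not available in this text, so the function name `polyphase_split`, the choice of offsets and the handling of the tail are all assumptions made for illustration.

```python
from typing import List


def polyphase_split(oversampled: List[float], factor: int) -> List[List[float]]:
    """Split one oversampled stream into `factor` streams at the base rate.

    Stream m keeps samples m, m + factor, m + 2 * factor, ... (assumed layout).
    """
    if factor < 2:
        raise ValueError("factor must be an integer greater than 1")
    usable = len(oversampled) - len(oversampled) % factor  # drop the ragged tail
    return [oversampled[m:usable:factor] for m in range(factor)]


mic = [float(n) for n in range(10)]
streams = polyphase_split(mic, 4)
# streams == [[0.0, 4.0], [1.0, 5.0], [2.0, 6.0], [3.0, 7.0]]
```

A practical implementation would normally low-pass filter before decimating to limit aliasing; that step is omitted here to keep the sketch short.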
s102, respectively selecting one noise-containing voice signal corresponding to each path of noise-containing voice signal after downsampling to perform voice differential beam forming to obtain one differential voice signal; and respectively selecting a plurality of down-sampled noise-containing voice signals corresponding to each path of noise-containing voice signals to form a plurality of noise differential beams so as to obtain a plurality of differential noise signals.
Voice differential beamforming uses a fixed differential beamforming algorithm to enhance signals from the voice direction and suppress noise signals from other directions.
Noise differential beamforming uses a fixed differential beamforming algorithm to enhance signals from the noise direction and suppress signals from the voice direction, providing the input signals for the subsequent multi-channel adaptive noise cancellation algorithm.
In an embodiment of the present application, the S102 may specifically be:
for each of the two noise-containing voice signals, selecting one corresponding downsampled noise-containing voice signal and performing voice differential beamforming on the selected signals to obtain one differential voice signal, given by equation (8) (not reproduced in this text).
As can be seen from equation (8), after the voice differential operation is performed on the downsampled noise-containing voice signals, the downsampled interference noise is cancelled and the downsampled voice signal is retained, but the voice signal is distorted and must be recovered by a subsequent filtering process.
In practical applications, however, the noise may not lie on the perpendicular bisector of the line connecting the two microphones, or it may arrive over reflection paths in the real environment, so voice differential beamforming often cannot cancel the noise completely. The residual noise can be determined from the beam pattern of the voice differential beamforming.
Fig. 3 shows the beam pattern of voice differential beamforming at a frequency of 1 kHz. The signal is fully retained in the front and rear directions and attenuated in other directions, with the greatest attenuation to the left and right. When the dual microphone is used in a hearing device, the ear or other parts of the device typically cause signals from the rear to be suppressed as well.
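The beam-pattern behaviour described for Fig. 3 (full response to the front and rear, a null to the left and right) is what a plain two-microphone subtraction produces. The sketch below evaluates that response at 1 kHz under stated assumptions: unit gain on both microphones, no amplitude attenuation between them, far-field plane waves, and the 0.02125 m spacing from the example above. The function name `dipole_response` and the angle convention (0 degrees along the microphone axis) are illustrative choices, not the patent's notation.

```python
import cmath
import math


def dipole_response(angle_deg: float, freq_hz: float = 1000.0,
                    spacing_m: float = 0.02125, speed_m_s: float = 340.0) -> float:
    """Magnitude response of x1(n) - x2(n) to a plane wave from angle_deg."""
    # Inter-microphone delay seen by a wave arriving at this angle.
    delay_s = spacing_m * math.cos(math.radians(angle_deg)) / speed_m_s
    return abs(1.0 - cmath.exp(-2j * math.pi * freq_hz * delay_s))


front = dipole_response(0.0)
for angle in (0, 45, 90, 135, 180):
    # Print the response normalised to the front direction.
    print(angle, round(dipole_response(angle) / front, 3))
```

Under these assumptions the response is symmetric between front and rear, which matches the remark that any extra rear suppression comes from the ear or the device rather than from the subtraction itself.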
for each of the two noise-containing voice signals, selecting a plurality of corresponding downsampled noise-containing voice signals and performing a plurality of noise differential beamforming operations on the selected signals to obtain a plurality of differential noise signals, given by equation (9) (not reproduced in this text).
As can be seen from equation (9), after the noise differential operation is performed on the downsampled noise-containing voice signals, the voice signal is cancelled and the noise signal is retained, but in distorted form.
Likewise, in practice noise differential beamforming often cannot cancel the voice completely; the residual voice can be determined from the beam pattern of the noise differential beamforming.
Fig. 4 shows the beam pattern of noise differential beamforming at a frequency of 1 kHz. The noise signal is fully retained in the left and right directions and attenuated in other directions, with the greatest attenuation to the front and rear.
S103, applying a multi-channel adaptive noise cancellation (MANC) algorithm improved with voice activity detection (VAD) to the one differential voice signal and the plurality of differential noise signals, to further eliminate the residual noise in the differential voice signal.
In practical applications, the voice source is not necessarily directly in front of the wearer and the noise source is not necessarily directly to the left or right of the wearer. In addition, the voice and the noise generally reach the microphones over both direct and reflected paths, so the noise cannot be completely eliminated by the processing in step S102. Step S103 is therefore required to further eliminate the residual noise in the differential voice signal.
Referring to fig. 5, the step S103 may specifically be:
the primary input of the MANC algorithm is the differential voice signal, and the multiple reference inputs are the differential noise signals. The adaptive filter A of the MANC algorithm is implemented with the least mean square (LMS) algorithm. The VAD detects speech and non-speech periods from the differential voice signal, and the detection result controls whether the coefficients of the adaptive filter are updated: they are updated only during non-speech periods.
The LMS algorithm has the advantages of a small amount of computation and easy real-time implementation.
The specific formulas of S103, using the LMS adaptive algorithm, are equations (10) to (14) (not reproduced in this text).
The quantities in these equations are: the vector formed by the values of the differential noise signals at different time instants; the output signal of the adaptive filter; the error signal, obtained by subtracting the output signal of the adaptive filter from the differential voice signal, which is also the distorted voice signal after the residual noise has been further reduced; the coefficient vector of the adaptive filter, whose elements on the right-hand side of equation (11) are the coefficients of the adaptive filter, with initial values at n = 0 that may be any real numbers (usually 0) and that are then iterated according to equation (14); the elements on the right-hand side of equation (12), which are the values of the differential noise signals at different time instants; the maximum number of delay samples for each input signal of the adaptive filter; and the update step size of the adaptive filter. The coefficients of the adaptive filter are adjusted so that the mean square value of the error signal is minimized.
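The description above maps onto a textbook multi-reference LMS canceller whose coefficient update is gated by a per-sample speech flag. The sketch below is written under that assumption; it is not the patent's equations (10) to (14), which are unavailable here. The function name `vad_gated_manc`, the `2 * step` update factor, the tap count and the synthetic test signals are all illustrative assumptions, and the VAD decision is taken as a given input rather than computed.

```python
import random
from typing import List, Sequence


def vad_gated_manc(primary: Sequence[float],
                   references: Sequence[Sequence[float]],
                   speech_active: Sequence[bool],
                   taps: int = 4,
                   step: float = 0.01) -> List[float]:
    """Multi-reference LMS canceller that adapts only when no speech is flagged.

    primary       -- differential voice signal (speech plus residual noise)
    references    -- one sequence per differential noise signal
    speech_active -- per-sample VAD decision (True means speech present)
    Returns the error signal: the primary input minus the filter output.
    """
    weights = [[0.0] * taps for _ in references]  # zero initial coefficients
    output = []
    for n, target in enumerate(primary):
        # Most recent `taps` samples of every reference, zero-padded before n = 0.
        history = [[ref[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
                   for ref in references]
        estimate = sum(w * u for ws, us in zip(weights, history)
                       for w, u in zip(ws, us))
        error = target - estimate
        if not speech_active[n]:  # coefficients stay frozen during speech
            for ws, us in zip(weights, history):
                for k in range(taps):
                    ws[k] += 2.0 * step * error * us[k]
        output.append(error)
    return output


# Synthetic check: the primary input is pure noise leakage, with no speech.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
leak = [0.8 * noise[n] - 0.3 * (noise[n - 1] if n else 0.0) for n in range(2000)]
cleaned = vad_gated_manc(leak, [noise], [False] * 2000)
```

In this synthetic case the error signal shrinks toward zero as the filter learns the leakage path. If every sample is flagged as speech, the coefficients never leave zero and the output equals the primary input, which is the freezing behaviour the VAD gate is meant to provide.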
In voice enhancement, it is difficult to guarantee that the reference inputs of the MANC are pure noise signals; some voice is inevitably mixed in, so noise cancellation inevitably cancels part of the voice as well. To avoid or mitigate this cancellation of voice, an improved MANC voice enhancement approach is adopted: the adaptive filter coefficients of the MANC are updated only during periods without speech and are kept unchanged during periods with speech. This improved method is referred to as the MANC method based on voice activity detection (VAD). In this method, the VAD detects the speech and silence periods, and the detection result controls whether the adaptive filter coefficients are updated; they are updated only in the silence periods.
Multi-channel adaptive noise cancellation (MANC) generally cancels noise better than single-channel adaptive noise cancellation (ANC), even when the number of adaptive filter coefficients in the single-channel ANC is not less than the total number of adaptive filter coefficients in the MANC.
S104, performing time domain recovery operation on the differential voice signal with the residual noise eliminated, and obtaining the output enhanced voice.
In an embodiment of the present application, S104 may specifically be:
using an L-order time-domain finite impulse response (FIR) filtering algorithm to perform the recovery operation on the differential voice signal from which the residual noise has been further eliminated, to obtain the output enhanced voice, where L is a positive integer.
The value of L can generally be much smaller than the frame length used in a frequency-domain recovery algorithm.
Compared with the frequency-domain recovery algorithms of the prior art, the FIR filtering algorithm requires less computation and introduces less delay, and is easier to implement in real time.
Experiments in real environments show that even though the multiple reference noise inputs of the MANC are obtained by downsampling only two oversampled signals, the noise cancellation effect is better than that of single-channel ANC, especially for high-frequency or broadband noise, even when the number of adaptive filter coefficients in the single-channel ANC is not less than the total number of adaptive filter coefficients in the MANC. This is the main reason the present application samples the signals with oversampling.
As can be seen from equation (8), the differential voice signal output by the differential operation is not the voice signal itself but a distorted version of it, so a time-domain recovery operation must be performed on it.
The recovery operation using the L-order time-domain finite impulse response (FIR) filtering algorithm is as follows:
taking the Z-transform of equation (8) gives equation (15), from which equation (16) is obtained. Converting equation (16) from the Z domain to the time domain, with the positive integer L sufficiently large, gives equation (17). With the definitions in equations (18) and (19), equation (20) follows, where h is the coefficient vector of the recovery filter and L is the order of the recovery filter. Equations (15) to (20) are not reproduced in this text.
Since the MANC suppresses the residual noise in the differential voice signal, the error signal output by the MANC has a higher signal-to-noise ratio than the differential voice signal, so the recovery filtering can be applied to the MANC output instead, as given by equations (21) and (22) (not reproduced in this text).
The algorithm of equation (21) recovers the voice signal with only an L-order FIR filter, and the signal delay it introduces is only L sampling points (L is usually 32). Compared with existing frequency-domain recovery algorithms, which delay the signal by one frame (usually 256 sampling points), both the delay and the amount of computation are significantly reduced, which greatly benefits real-time implementation.
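The recovery step can be illustrated with a small sketch. It assumes that the differential distortion has the first-order form s(n) - a·s(n-1), where a is the inter-microphone attenuation factor with magnitude below 1; the patent's equation (8) is not available in this text, so that form is an assumption inferred from the one-sample spacing and the Z-transform steps described. Under that assumption the inverse 1 / (1 - a·z^-1) expands into the series 1 + a·z^-1 + a^2·z^-2 + ..., and truncating it gives FIR taps. The names `recovery_filter` and `fir` are illustrative.

```python
from typing import List, Sequence


def recovery_filter(order: int, attenuation: float) -> List[float]:
    """Taps 1, a, a^2, ..., a^order: the truncated inverse of 1 - a * z^-1."""
    return [attenuation ** k for k in range(order + 1)]


def fir(signal: Sequence[float], taps: Sequence[float]) -> List[float]:
    """Causal FIR filtering with zero initial state."""
    return [sum(h * signal[n - k] for k, h in enumerate(taps) if n - k >= 0)
            for n in range(len(signal))]


a = 0.9  # assumed attenuation factor, chosen only for the demonstration
speech = [1.0, 0.5, -0.25, 0.0, 0.75, -1.0, 0.3, 0.0]
distorted = [speech[n] - a * (speech[n - 1] if n else 0.0)
             for n in range(len(speech))]
restored = fir(distorted, recovery_filter(32, a))
```

For the delay figures quoted above, at the 16 kHz example rate 32 samples correspond to 2 ms and a 256-sample frame to 16 ms.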
Referring to Fig. 6, the dual-microphone-based voice enhancement device provided by an embodiment of the present application may be a computer program or program code running in hearing aid equipment; for example, the device may be application software. The device can be used to perform the corresponding steps of the dual-microphone-based voice enhancement method provided by the embodiments of the present application. The device includes:
the sampling module 11, configured to sample the signals received by two microphones at an oversampling frequency to obtain two noise-containing voice signals, and to downsample each of the two noise-containing voice signals to obtain a plurality of downsampled noise-containing voice signals;
the differential beamforming module 12, configured to select, for each of the two noise-containing voice signals, one corresponding downsampled noise-containing voice signal and perform voice differential beamforming on the selected signals to obtain one differential voice signal, and to select, for each of the two noise-containing voice signals, a plurality of corresponding downsampled noise-containing voice signals and perform a plurality of noise differential beamforming operations on the selected signals to obtain a plurality of differential noise signals;
the noise cancellation module 13, configured to apply a multi-channel adaptive noise cancellation algorithm improved with voice activity detection to the one differential voice signal and the plurality of differential noise signals, to further eliminate the residual noise in the differential voice signal;
the recovery operation module 14, configured to perform a time-domain recovery operation on the differential voice signal from which the residual noise has been further eliminated, to obtain the output enhanced voice.
The dual-microphone-based voice enhancement device and the dual-microphone-based voice enhancement method provided by the embodiments of the present application belong to the same concept; the detailed implementation is described throughout the specification and is not repeated here.
An embodiment of the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of a dual microphone based speech enhancement method as provided by an embodiment of the present application.
Fig. 7 shows a specific block diagram of a hearing assistance device according to an embodiment of the present application, and the hearing assistance device 100 includes: one or more processors 101, a memory 102, and one or more computer programs, wherein the processors 101 and the memory 102 are connected by a bus, the one or more computer programs being stored in the memory 102 and configured to be executed by the one or more processors 101, the processor 101 implementing the steps of a dual microphone based speech enhancement method as provided by an embodiment of the present application when the computer programs are executed.
In the present application, the signals received by the two microphones are sampled at an oversampling frequency to obtain two noise-containing voice signals, and each of them is downsampled to obtain a plurality of downsampled noise-containing voice signals; one downsampled signal per microphone is used for voice differential beamforming to obtain one differential voice signal, and a plurality of downsampled signals per microphone are used for a plurality of noise differential beamforming operations to obtain a plurality of differential noise signals; a multi-channel adaptive noise cancellation (MANC) algorithm improved with voice activity detection (VAD) is applied to the one differential voice signal and the plurality of differential noise signals to further eliminate the residual noise in the differential voice signal; and a time-domain recovery operation is performed on the result to obtain the output enhanced voice. As a result, the noise cancellation effect is better, especially for high-frequency noise or broadband noise.
It should be understood that the steps in the embodiments of the present application are not necessarily performed in the order indicated by the step numbers. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The foregoing embodiments illustrate only several implementations of the invention. Although they are described in detail, they are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the invention, all of which fall within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.

Claims (9)

1. A method of dual microphone based speech enhancement, the method comprising the steps of:
S101, respectively sampling signals received by two microphones at an oversampling frequency to obtain two paths of noise-containing voice signals, and respectively downsampling the two paths of noise-containing voice signals to obtain a plurality of downsampled noise-containing voice signals;
S102, respectively selecting one downsampled noise-containing voice signal corresponding to each path of noise-containing voice signal to perform voice differential beam forming to obtain one differential voice signal; respectively selecting a plurality of downsampled noise-containing voice signals corresponding to each path of noise-containing voice signal to perform a plurality of noise differential beam forming operations to obtain a plurality of differential noise signals;
S103, further eliminating residual noise in the differential voice signal by applying, to the one differential voice signal and the plurality of differential noise signals, a multi-channel adaptive noise cancellation (MANC) algorithm improved on the basis of voice activity detection (VAD);
S104, performing a time domain recovery operation on the differential voice signal with the residual noise eliminated to obtain the output enhanced voice;
the step S103 specifically includes:
the main input of the MANC algorithm is the differential voice signal, and the multiple reference inputs are the differential noise signals, the number of which is a positive integer greater than 1; the adaptive filter of the MANC algorithm is implemented with the least mean square (LMS) algorithm; the VAD detects voiced or silent periods from the differential voice signal, and the detection result is used to control whether the coefficients of the adaptive filter are updated, i.e., the coefficients are updated only in silent periods;
the specific formulas of S103 are as follows:
with LMS used as the adaptive algorithm,
(10);
(11);
(12);
(13);
(14);
wherein the quantities in expressions (10) to (14) are: the vector formed by the values of a differential noise signal at different moments; the output signal of the adaptive filter; the error signal, which is generated by subtracting the output signal of the adaptive filter from the differential voice signal and is also the distorted voice signal after the residual noise has been further reduced; the coefficient vector of the adaptive filter, each element on the right side of expression (11) being a coefficient of the adaptive filter that may take any real number as its initial value, i.e., at n=0, and is then iterated according to expression (14); the values of the differential noise signal at different moments, which are the elements on the right side of expression (12); the maximum number of delay sampling points of each input signal of the adaptive filter; and the update step size of the adaptive filter; the coefficients of the adaptive filter being adjusted so that the mean square expectation of the error signal is minimized.
2. The method of claim 1, wherein the distance between the two microphones is calculated as follows:
(1);
wherein the quantities in expression (1) are the preset sampling frequency of an ordinary speech signal and the speed of sound;
the step S101 specifically includes:
respectively sampling the signals received by the two microphones at the oversampling frequency to obtain two paths of noise-containing voice signals, and respectively downsampling the two paths of noise-containing voice signals to obtain multiple paths of downsampled noise-containing voice signals at the downsampled sampling rate, wherein the number of paths is a positive integer greater than 1;
the specific formulas are as follows:
for the noise-containing voice signal sampled by the i-th microphone at sampling time n,
(2);
(3);
(4);
wherein the quantities in expressions (2) to (4) are: the voice signal theoretically sampled by the i-th microphone at sampling time n; the interference noise theoretically sampled by the i-th microphone at sampling time n; and the signal amplitude attenuation factor of the voice propagating from one microphone to the other microphone;
downsampling the theoretically sampled voice signal, interference noise, and noise-containing voice signal gives the downsampled voice signal, the downsampled interference noise, and the downsampled noise-containing voice signal as follows:
(5) ;
(6);
(7)。
3. The method of claim 2, wherein,
the voice differential beam forming adopts a fixed differential beam forming algorithm to enhance signals in the voice direction and suppress noise signals in non-voice directions;
the noise differential beam forming adopts a fixed differential beam forming algorithm to enhance signals in the noise direction and suppress signals in the voice direction, and provides input signals for the subsequent multi-channel adaptive noise cancellation algorithm.
4. A method according to claim 3, wherein S102 is specifically:
respectively selecting one downsampled noise-containing voice signal corresponding to each path of noise-containing voice signal to perform voice differential beam forming to obtain one differential voice signal:
(8);
respectively selecting a plurality of downsampled noise-containing voice signals corresponding to each path of noise-containing voice signal to perform a plurality of noise differential beam forming operations to obtain a plurality of differential noise signals:
(9)。
5. The method of claim 4, wherein S104 is specifically:
performing a recovery operation on the differential voice signal with the residual noise eliminated by using an L-order time domain finite impulse response (FIR) filtering algorithm to obtain the output enhanced voice, wherein L is a positive integer.
6. The method of claim 5, wherein the recovery operation performed by using an L-order time domain finite impulse response FIR filtering algorithm is specifically:
applying the Z-transform to expression (8) gives
(15);
expression (16) is obtained from expression (15):
(16);
converting expression (16) from the Z domain to the time domain, and when the positive integer L is sufficiently large, gives
(17);
taking
(18);
(19);
then
(20);
wherein h is the coefficient of the recovery filter, and L is the order of the recovery filter;
recovery filtering is then performed, i.e.,
(21);
wherein
(22).
7. A dual microphone based speech enhancement apparatus comprising:
a sampling module, configured to respectively sample signals received by two microphones at an oversampling frequency to obtain two paths of noise-containing voice signals, and to respectively downsample the two paths of noise-containing voice signals to obtain a plurality of downsampled noise-containing voice signals;
a differential beam forming module, configured to respectively select one downsampled noise-containing voice signal corresponding to each path of noise-containing voice signal to perform voice differential beam forming to obtain one differential voice signal, and to respectively select a plurality of downsampled noise-containing voice signals corresponding to each path of noise-containing voice signal to perform a plurality of noise differential beam forming operations to obtain a plurality of differential noise signals;
a noise cancellation module, configured to further eliminate residual noise in the differential voice signal by applying, to the one differential voice signal and the plurality of differential noise signals, a multi-channel adaptive noise cancellation (MANC) algorithm improved on the basis of voice activity detection (VAD);
a recovery operation module, configured to perform a time domain recovery operation on the differential voice signal with the residual noise eliminated to obtain the output enhanced voice;
the noise cancellation module is specifically configured to:
the main input of the MANC algorithm is the differential voice signal, and the multiple reference inputs are the differential noise signals, the number of which is a positive integer greater than 1; the adaptive filter of the MANC algorithm is implemented with the least mean square (LMS) algorithm; the VAD detects voiced or silent periods from the differential voice signal, and the detection result is used to control whether the coefficients of the adaptive filter are updated, i.e., the coefficients are updated only in silent periods;
the specific formulas are as follows:
with LMS used as the adaptive algorithm,
(10);
(11);
(12);
(13);
(14);
wherein the quantities in expressions (10) to (14) are: the vector formed by the values of a differential noise signal at different moments; the output signal of the adaptive filter; the error signal, which is generated by subtracting the output signal of the adaptive filter from the differential voice signal and is also the distorted voice signal after the residual noise has been further reduced; the coefficient vector of the adaptive filter, each element on the right side of expression (11) being a coefficient of the adaptive filter that may take any real number as its initial value, i.e., at n=0, and is then iterated according to expression (14); the values of the differential noise signal at different moments, which are the elements on the right side of expression (12); the maximum number of delay sampling points of each input signal of the adaptive filter; and the update step size of the adaptive filter; the coefficients of the adaptive filter being adjusted so that the mean square expectation of the error signal is minimized.
8. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the dual microphone based speech enhancement method according to any of claims 1 to 6.
9. A hearing assistance device comprising:
one or more processors;
a memory; and one or more computer programs, the processor and the memory being connected by a bus, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, characterized in that the processor, when executing the computer programs, implements the steps of the dual microphone based speech enhancement method according to any of claims 1 to 6.
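The VAD-controlled noise cancellation recited in claims 1 and 7 can be illustrated with a minimal sketch. Because expressions (10) to (14) are not reproduced in this text, the code below uses the textbook multi-channel LMS form (filter output, error, gradient update) and is not the patent's own formulas. The function name `vad_gated_manc`, the parameters `taps` and `mu`, the factor 2 in the update, and the externally supplied boolean VAD mask are all assumptions made for illustration.

```python
import numpy as np


def vad_gated_manc(d, refs, speech_active, taps=8, mu=0.01):
    """Multi-channel LMS noise canceller whose weights adapt only when VAD reports silence.

    d             : primary input (differential voice signal), shape (N,)
    refs          : reference inputs (differential noise signals), shape (M, N)
    speech_active : boolean mask, shape (N,), True where speech is detected
    Returns the error signal e (shape (N,)) and the final weights (shape (M, taps)).
    """
    d = np.asarray(d, dtype=float)
    refs = np.atleast_2d(np.asarray(refs, dtype=float))
    n_ch, n = refs.shape
    w = np.zeros((n_ch, taps))  # assumed zero initial coefficients
    e = np.zeros(n)
    # Zero-pad the past so the delay line is defined for the first samples.
    padded = np.concatenate([np.zeros((n_ch, taps - 1)), refs], axis=1)
    for t in range(n):
        window = padded[:, t:t + taps][:, ::-1]  # newest first: x(t), x(t-1), ...
        y = np.sum(w * window)                   # filter output summed over channels
        e[t] = d[t] - y                          # error = primary minus noise estimate
        if not speech_active[t]:                 # update only in silent periods
            w += 2.0 * mu * e[t] * window
    return e, w


# Hypothetical noise-only test: the primary is a mix of the two references.
rng = np.random.default_rng(0)
n = 5000
r1 = rng.standard_normal(n)
r2 = rng.standard_normal(n)
r2_delayed = np.concatenate([[0.0], r2[:-1]])
noise_only = 0.5 * r1 - 0.3 * r2_delayed
silent = np.zeros(n, dtype=bool)
e_adapt, w_adapt = vad_gated_manc(noise_only, [r1, r2], silent, taps=4, mu=0.01)
e_frozen, w_frozen = vad_gated_manc(noise_only, [r1, r2], ~silent, taps=4, mu=0.01)
```

Freezing the coefficients while speech is present keeps the filter from adapting to speech that leaks into the reference inputs, so it does not learn to cancel the target voice; the cost is that the noise estimate cannot track changes in the noise during long voiced segments.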

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311566583.XA CN117278896B (en) 2023-11-23 2023-11-23 Voice enhancement method and device based on double microphones and hearing aid equipment

Publications (2)

Publication Number Publication Date
CN117278896A (en) 2023-12-22
CN117278896B (en) 2024-03-19

Family

ID=89203085

Country Status (1)

Country Link
CN (1) CN117278896B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20090013621A (en) * 2007-08-02 2009-02-05 재단법인서울대학교산학협력재단 Space noise reducing system using multi-channel active noise control
CN102074245A (en) * 2011-01-05 2011-05-25 瑞声声学科技(深圳)有限公司 Dual-microphone-based speech enhancement device and speech enhancement method
CN102074246A (en) * 2011-01-05 2011-05-25 瑞声声学科技(深圳)有限公司 Dual-microphone based speech enhancement device and method
CN109243482A (en) * 2018-10-30 2019-01-18 深圳市昂思科技有限公司 Improve the miniature array voice de-noising method of ACRANC and Wave beam forming
CN113782043A (en) * 2021-09-06 2021-12-10 北京捷通华声科技股份有限公司 Voice acquisition method and device, electronic equipment and computer readable storage medium
CN116711007A (en) * 2021-04-01 2023-09-05 深圳市韶音科技有限公司 Voice enhancement method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010091077A1 (en) * 2009-02-03 2010-08-12 University Of Ottawa Method and system for a multi-microphone noise reduction

Also Published As

Publication number Publication date
CN117278896A (en) 2023-12-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant