CN114333884B - Voice noise reduction method based on combination of microphone array and wake-up word - Google Patents
Voice noise reduction method based on combination of microphone array and wake-up word
- Publication number
- CN114333884B CN114333884B CN202011061741.2A CN202011061741A CN114333884B CN 114333884 B CN114333884 B CN 114333884B CN 202011061741 A CN202011061741 A CN 202011061741A CN 114333884 B CN114333884 B CN 114333884B
- Authority
- CN
- China
- Prior art keywords
- noise
- covariance
- wake
- voice
- stage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Abstract
The invention provides a voice noise reduction method based on a microphone array combined with a wake-up word. On top of the echo cancellation, DOA estimation and beamforming applied to the multi-channel audio data received by the microphone array, a post noise reduction operation is added. Noise estimation is performed in combination with a position marker for the voice wake-up word, and after voice wake-up, human-voice noise and music noise other than the desired speech are further suppressed, improving the processing capability of the entire microphone-array-based speech front end. The post noise reduction is divided into two stages according to the wake-up state of the wake-up word: a non-wake stage; and the stage after wake-up, lasting until the speech recognition result returns or until a certain time has elapsed, which can be set to the average duration a speaker needs to finish the sentence to be recognized. Different noise estimates are used in the two stages, and a masking effect is used in the noise reduction, thereby suppressing human-voice and music noise in the recognition stage after wake-up.
Description
Technical Field
The invention relates to the technical field of audio processing, in particular to a voice noise reduction method based on combination of a microphone array and wake-up words.
Background
With the continuous development of artificial intelligence and speech recognition, voice wake-up and recognition appear more and more often in daily life, for example in smart speakers and in-vehicle voice systems, and the application scenarios are increasingly diverse. Ambient noise and the sound emitted by the device itself are unavoidable in these applications and degrade speech recognition, so the speech provided to the recognition system must first be processed; this is speech front-end processing.
Speech front-end processing mainly uses a microphone array to pick up sound and applies a series of operations, such as echo cancellation, sound source localization, beamforming and noise reduction, to the picked-up multi-channel speech signals. The speech signal from the desired direction is enhanced while noise from undesired directions is suppressed, thereby improving speech recognition.
Noise reduction and noise estimation in microphone-array-based speech front-end processing currently target mainly stationary environmental noise, such as kitchen noise (microwave ovens, range hoods) and white noise. Noise estimation mainly relies on VAD: frames in which no speech is detected are treated as noise, and the corresponding noise reduction processing then suppresses that noise.
In multi-speaker scenes, such as family chat or while songs are playing, after the device is woken by the wake-up word and DOA localization and beamforming are applied, speech from directions other than the desired signal in the woken direction is not well suppressed by beamforming and is not further suppressed in the subsequent noise reduction. It can therefore still be picked up by the speech recognition system and degrade the recognition result.
Existing microphone-array-based speech front-end processing has no notable effect in suppressing human voices and songs that are not the desired speech signal, yet such voice and music noise degrades speech recognition far more than stationary environmental noise does.
Furthermore, technical terms commonly used in the art include:
microphone array: a system consisting of a number of acoustic sensors (typically microphones) for sampling and processing the spatial characteristics of the sound field.
Echo cancellation: (Acoustic Echo Cancellation, AEC) for canceling the sound emitted by the device itself.
Direction of arrival: (Direction of Arrival, DOA) used to determine the direction of a sound.
Beamforming: (Beamforming) forming a main beam in a particular direction to receive the useful desired signal while forming ultra-low sidelobes to reject noise and interference signals.
Linearly constrained minimum variance: (Linearly Constrained Minimum Variance, LCMV) a beamforming algorithm.
Noise reduction: sound from other sound sources than the desired speech signal is suppressed.
Wake-up word: keywords for voice wakeup, such as "little degree", "little lovely" and the like.
Toeplitz matrix: also written as a T-matrix for short. All elements on its main diagonal are equal, and the elements on each line parallel to the main diagonal are equal as well; every element is symmetric about the anti-diagonal, i.e. a T-matrix is persymmetric. Simple T-matrices include the forward shift matrix and the backward shift matrix. In Matlab, the function that generates a Toeplitz matrix is toeplitz(x, y): it produces a Toeplitz matrix with x as the first column and y as the first row, where x and y are vectors that need not be of equal length.
Let T = [t_ij] ∈ C^(n×n). If t_ij = t_(j-i) for i, j = 1, 2, ..., n, that is, each entry depends only on the difference j - i so that every diagonal is constant, then T is called a Toeplitz matrix.
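The Matlab call mentioned above can be mirrored in Python with `scipy.linalg.toeplitz`; the numbers below are arbitrary illustration values:

```python
import numpy as np
from scipy.linalg import toeplitz

# First column and first row of the desired matrix; they share the
# top-left corner element, and scipy keeps the column's value there.
col = np.array([1, 2, 3])
row = np.array([1, 4, 5])

T = toeplitz(col, row)
# Every descending diagonal is constant:
# [[1 4 5]
#  [2 1 4]
#  [3 2 1]]
```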
Discrete cosine transform: (Discrete Cosine Transform, DCT) a mathematical operation closely related to the Fourier transform. In a Fourier series expansion, if the expanded function is a real even function, the series contains only cosine terms; discretizing those cosine terms yields the discrete cosine transform.
Bartlett window: (Bartlett window) a triangular window function, w[n] = 1 - |2n/(N-1) - 1| for n = 0, 1, ..., N-1, commonly used to taper data in spectral estimation.
Voice activity detection: (Voice Activity Detection, VAD) used to detect whether the current audio signal contains speech, i.e. to judge the input signal and distinguish speech from various background noise signals, so that different processing can be applied to each.
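The patent does not specify a particular VAD algorithm. As a stand-in, a minimal energy-based detector can illustrate the idea; the threshold and the floor-tracking constant below are assumed values, not taken from the patent:

```python
import numpy as np

def energy_vad(frame, noise_floor, threshold_db=6.0):
    """Toy energy-based VAD: flag a frame as speech when its mean
    energy exceeds the tracked noise floor by threshold_db decibels.
    threshold_db and the 0.9/0.1 tracking constants are assumptions."""
    energy = float(np.mean(frame ** 2)) + 1e-12
    is_speech = 10.0 * np.log10(energy / noise_floor) > threshold_db
    if not is_speech:
        # slowly track the noise floor during non-speech frames
        noise_floor = 0.9 * noise_floor + 0.1 * energy
    return is_speech, noise_floor
```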
Laplace transform: an integral transform commonly used in engineering mathematics. It is a linear transform that converts a function of a real variable t (t >= 0) into a function of a complex variable s. In symbols it can be written as
F(s) = ∫_0^∞ f(t) e^(-st) dt.
This is the Laplace transform: given a function of t as input, a function of s is obtained.
Disclosure of Invention
In order to solve the above problems in the prior art, the object of the present invention is to provide a wake-up-word-based noise suppression method aimed at scenes with human-voice noise, such as multi-person conversation or song playback, in which microphone-array speech front-end processing suppresses that noise poorly.
According to the method, on top of the echo cancellation, DOA estimation and beamforming applied to the multi-channel audio data received by the microphone array, a post noise reduction operation is added. Noise estimation is performed in combination with a position marker for the voice wake-up word, and after voice wake-up, human-voice noise and music noise other than the desired speech are further suppressed, improving the processing capability of the entire microphone-array-based speech front end.
After the microphone array collects the multi-channel audio data, the invention performs echo cancellation and preprocessing such as pre-emphasis, framing and windowing. Once DOA estimation has determined the speech angle and beamforming has enhanced the desired speech signal in the target direction while preliminarily suppressing audio from other angles, noise estimation and noise reduction post-processing are applied to the beamforming output in combination with the voice wake-up word.
The post noise reduction is divided into two stages according to the wake-up state of the wake-up word: a non-wake stage; and the stage after wake-up, lasting until the speech recognition result returns or until a certain time has elapsed, which can be set to the average duration a speaker needs to finish the sentence to be recognized. Below, the first is called the non-wake stage and the second the wake stage. Different noise estimates are used in the two stages, and a masking effect is used in the noise reduction to suppress music noise to a certain extent, thereby suppressing human-voice and music noise in the recognition stage after wake-up.
Specifically, the invention provides a voice noise reduction method based on a microphone array combined with wake-up words, which comprises the following steps:
S1, performing framing and windowing on the single channel of audio data output after echo cancellation (AEC), direction-of-arrival (DOA) estimation and beamforming;
S2, covariance calculation:
s2.1, calculating the circular convolution of the whole frame of data;
S2.2, taking the last L data in the convolution result to form a Toeplitz matrix, wherein the matrix is covariance of the data, and L is the length of subframe data;
S3, determining initial values: noise covariance and noise power spectral density are maintained separately for the non-wake stage and the wake stage, and an initial value is determined for each;
S4, judging whether the system is in the wake stage,
S4.1, if in the non-wake stage, going to S4.1.1;
S4.1.1, performing VAD judgment on the data,
if judged to be noise, updating the noise covariance matrix and updating the noise power spectral density;
if judged to be speech, not updating the noise covariance matrix or power spectral density, and keeping the previous noise estimates;
S4.1.2, in the non-wake stage, taking the audio data of this stage as the noise of the wake stage, updating the wake-stage noise covariance and noise power spectral density, and storing them; for this purpose, a storage space longer than the wake-up word must be allocated to hold the wake-stage noise covariance and noise power spectrum calculated in this step;
S4.1.3, calculating the covariance of the current frame data, subtracting the noise covariance from it to obtain the covariance of the speech signal, and going to S5;
S4.2, if in the stage after wake-up while waiting for the recognition result, namely the wake stage, going to S4.2.1;
S4.2.1, after wake-up, stepping back from the current position in the storage space by the maximum wake-up word length, taking the noise covariance and power spectral density at that storage position as the noise covariance and power spectrum for this stage, and calculating the covariance of the speech signal for this stage;
S5, performing eigenvalue decomposition on the speech-signal covariance obtained in S4.1.3 or S4.2.1, applying the Laplace transform, and transforming from the frequency domain to the eigenvalue domain;
S6, in order to remove music noise and other non-stationary noise, further calculating a mask using critical bandwidths and the masking effect, calculating weights from the mask and the result of S5, and calculating the final noise-reduced data.
In summary, the method offers the following advantages: it effectively improves the processing capability of the entire microphone-array-based speech front end; it improves the effect of speech recognition; and it is simple to implement.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application.
Fig. 1 is a schematic block flow diagram of the method of the present invention.
Detailed Description
In order that the technical content and advantages of the present invention may be more clearly understood, a further detailed description of the present invention will now be made with reference to the accompanying drawings.
As shown in fig. 1, a voice noise reduction method based on a microphone array combined with a wake-up word includes the following steps:
S1, performing framing and windowing on the single channel of audio data output after echo cancellation (AEC), direction-of-arrival (DOA) estimation and beamforming; a data length of 2-4 ms is selected as the subframe length L, and x denotes the whole frame of data;
S2, covariance calculation:
S2.1, calculating a cyclic convolution of the whole frame data, cx=xcorr (x, L-1, 'biased');
S2.2, taking the last L data in the convolution result to form a Toeplitz matrix, wherein the matrix is covariance of the data, and L is the length of subframe data;
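Steps S2.1 and S2.2 can be sketched in Python; the biased lag products below stand in for Matlab's `xcorr(x, L-1, 'biased')`, keeping only the non-negative lags needed to build the Toeplitz covariance (the frame and subframe sizes are illustrative):

```python
import numpy as np
from scipy.linalg import toeplitz

def frame_covariance(x, L):
    """Estimate an L x L covariance of frame x from its biased
    autocorrelation, mirroring xcorr(x, L-1, 'biased') followed by
    forming a Toeplitz matrix from the last L lag values."""
    N = len(x)
    # biased estimate: divide every lag product by the full frame length N
    r = np.array([np.dot(x[:N - k], x[k:]) / N for k in range(L)])
    return toeplitz(r)  # symmetric Toeplitz covariance estimate

rng = np.random.default_rng(0)
frame = rng.standard_normal(256)   # one frame of beamformed audio
C = frame_covariance(frame, L=32)  # L spans roughly 2-4 ms of samples
```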
s3, determining an initial value:
The invention needs to calculate and maintain the noise covariance and the power spectrum density of the two stages of non-awakening and awakening respectively, which are called the noise covariance and the noise power spectrum density of the non-awakening stage and the noise covariance and the noise power spectrum density of the awakening stage respectively;
a) Initial value of noise covariance: calculating by using the first frame data and adopting the covariance calculation method;
b) Initial value of the noise power spectral density: obtained by applying a DCT to the result of the cyclic convolution, whose biased estimate is implicitly tapered by a Bartlett (triangular) window;
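The initial values of S3 can be sketched as below. Treating the patent's window reference as the triangular (Bartlett-type) taper that the biased autocorrelation estimate applies implicitly is our reading, and the choice of DCT-II is likewise an assumption:

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import toeplitz

def initial_noise_stats(first_frame, L):
    """Initial noise covariance and PSD from the first frame (S3).
    The biased autocorrelation implicitly applies a triangular taper
    (1 - k/N) to lag k; we read the patent's window reference as that
    Bartlett-type taper."""
    N = len(first_frame)
    r = np.array([np.dot(first_frame[:N - k], first_frame[k:]) / N
                  for k in range(L)])
    R_noise = toeplitz(r)        # a) initial noise covariance
    psd_noise = dct(r, type=2)   # b) initial noise PSD via DCT
    return R_noise, psd_noise

rng = np.random.default_rng(1)
R0, psd0 = initial_noise_stats(rng.standard_normal(256), L=32)
```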
S4, judging whether the system is in the wake stage,
S4.1, if in the non-wake stage, going to S4.1.1;
S4.1.1, performing VAD judgment on the data,
if judged to be noise, updating the noise covariance matrix (the new noise covariance is calculated from the covariance of the current frame and the previous noise covariance using a forgetting factor) and updating the noise power spectral density;
if judged to be speech, not updating the noise covariance matrix or power spectral density, and keeping the previous noise estimates;
S4.1.2, in the non-wake stage, taking the audio data of this stage as the noise of the wake stage, updating the wake-stage noise covariance and noise power spectral density, and storing them; for this purpose, a storage space longer than the wake-up word must be allocated to hold the wake-stage noise covariance and noise power spectrum calculated in this step;
S4.1.3, calculating the covariance of the current frame data, subtracting the noise covariance from it to obtain the covariance of the speech signal, and going to S5;
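A minimal sketch of the non-wake-stage update; the forgetting factor value is assumed, since the patent does not give one:

```python
import numpy as np

ALPHA = 0.95  # forgetting factor -- an assumed value, not specified in the patent

def update_noise_stats(is_noise, C_frame, psd_frame, R_noise, psd_noise):
    """S4.1.1: blend the current frame's statistics into the running
    noise estimates only when the VAD flags the frame as noise;
    otherwise keep the previous estimates unchanged."""
    if is_noise:
        R_noise = ALPHA * R_noise + (1.0 - ALPHA) * C_frame
        psd_noise = ALPHA * psd_noise + (1.0 - ALPHA) * psd_frame
    return R_noise, psd_noise

def speech_covariance(C_frame, R_noise):
    """S4.1.3: speech covariance = frame covariance - noise covariance."""
    return C_frame - R_noise
```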
S4.2, if in the stage after wake-up while waiting for the recognition result, namely the wake stage, going to S4.2.1;
S4.2.1, after wake-up, because the wake-up word belongs to the desired signal, the covariance and power spectral density calculated from the wake-up word cannot be used as the noise covariance and power spectral density; the noise covariance and power spectral density from before the wake-up word must instead be taken out of the wake-stage noise covariance and power spectrum storage maintained above. According to the wake-up word length, step back from the current position in the storage space by the maximum wake-up word length, take the noise covariance and power spectral density at that storage position as the noise estimates for this stage, and calculate the speech-signal covariance for this stage;
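The storage space "longer than the wake-up word length" can be modeled as a ring buffer; measuring the wake-up word in frames and the one-extra-slot sizing are our assumptions:

```python
import collections

class NoiseHistory:
    """Ring buffer for per-frame wake-stage noise statistics (S4.1.2).
    Sized so that, when the wake-up word is detected, stepping back to
    the oldest entry reaches statistics recorded before the word began:
    the wake-up word itself is desired speech, not noise."""

    def __init__(self, max_wakeword_frames):
        # one extra slot so the oldest entry strictly predates the word
        self.buf = collections.deque(maxlen=max_wakeword_frames + 1)

    def push(self, noise_cov, noise_psd):
        self.buf.append((noise_cov, noise_psd))

    def stats_before_wakeword(self):
        # S4.2.1: retreat by the maximum wake-up word length
        return self.buf[0]

h = NoiseHistory(max_wakeword_frames=2)
for k in range(1, 5):          # frames 1..4 arrive; word spans the last 2
    h.push(k, 10 * k)
pre_cov, pre_psd = h.stats_before_wakeword()  # stats from before the word
```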
S5, performing eigenvalue decomposition on the speech-signal covariance obtained in S4.1.3 or S4.2.1, applying the Laplace transform, and transforming from the frequency domain to the eigenvalue domain;
S6, in order to remove music noise and other non-stationary noise, further calculating a mask using critical bandwidths and the masking effect, calculating weights from the mask and the result of S5, and calculating the final noise-reduced data.
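Steps S5-S6 can be approximated by eigen-domain filtering. This sketch replaces the patent's critical-band masking weights with a simpler Wiener-like gain rule, so it only illustrates the structure (eigendecomposition, weighting in the eigenvalue domain, reconstruction), not the exact method; mu is an assumed over-suppression factor:

```python
import numpy as np

def subspace_denoise(speech_cov, noise_cov, subframe, mu=1.0):
    """Sketch of S5-S6: eigen-decompose the speech covariance and
    apply Wiener-like gains in the eigenvalue domain; the patent's
    masking-based weights are replaced by this simpler gain rule."""
    lam, V = np.linalg.eigh(speech_cov)                # eigenvalue decomposition (S5)
    lam = np.maximum(lam, 0.0)                         # clamp negative estimates
    sigma2 = np.trace(noise_cov) / noise_cov.shape[0]  # mean noise power
    gains = lam / (lam + mu * sigma2)                  # per-eigenvalue weights (S6)
    H = V @ np.diag(gains) @ V.T                       # filter back in signal domain
    return H @ subframe
```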
The wake stage lasts from wake-up until the speech recognition result returns, or until a certain time has elapsed after wake-up; this time is the average duration needed to finish speaking the sentence to be recognized.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, and various modifications and variations can be made to the embodiments of the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (7)
1. A voice noise reduction method based on a microphone array combined with wake-up words is characterized by comprising the following steps:
S1, performing framing and windowing on the single channel of audio data output after echo cancellation (AEC), direction-of-arrival (DOA) estimation and beamforming;
S2, covariance calculation:
s2.1, calculating the circular convolution of the whole frame of data;
S2.2, taking the last L data in the convolution result to form a Toeplitz matrix, wherein the matrix is covariance of the data, and L is the length of subframe data;
S3, determining initial values: noise covariance and noise power spectral density are maintained separately for the non-wake stage and the wake stage, and an initial value is determined for each;
S4, judging whether the system is in the wake stage,
S4.1, if in the non-wake stage, going to S4.1.1;
S4.1.1, performing voice activity detection on the data,
if judged to be noise, updating the noise covariance matrix and updating the noise power spectral density;
if judged to be speech, not updating the noise covariance matrix or power spectral density, and keeping the previous noise estimates;
S4.1.2, in the non-wake stage, taking the audio data of this stage as the noise of the wake stage, updating the wake-stage noise covariance and noise power spectral density, and storing them; for this purpose, a storage space longer than the wake-up word must be allocated to hold the wake-stage noise covariance and noise power spectrum calculated in this step;
S4.1.3, calculating the covariance of the current frame data, subtracting the noise covariance from it to obtain the covariance of the speech signal, and going to S5;
S4.2, if in the stage after wake-up while waiting for the recognition result, namely the wake stage, going to S4.2.1;
S4.2.1, after wake-up, stepping back from the current position in the storage space by the maximum wake-up word length, taking the noise covariance and power spectral density at that storage position as the noise covariance and power spectrum for this stage, and calculating the covariance of the speech signal for this stage;
S5, performing eigenvalue decomposition on the speech-signal covariance obtained in S4.1.3 or S4.2.1, applying the Laplace transform, and transforming from the frequency domain to the eigenvalue domain;
S6, in order to remove music noise and other non-stationary noise, further calculating a mask using critical bandwidths and the masking effect, calculating weights from the mask and the result of S5, and calculating the final noise-reduced data.
2. The method for voice noise reduction based on the combination of a microphone array and a wake-up word according to claim 1, wherein in the step S1, a data length of 2-4ms is selected as a length L of subframe data, and x is whole frame data.
3. The method for voice noise reduction based on the combination of the microphone array and the wake-up word according to claim 2, wherein the cyclic convolution of the whole frame of data in step S2 is denoted as: cx=xcorr (x, L-1, 'biased').
4. A method of voice noise reduction based on a microphone array in combination with wake-up words according to claim 3, wherein the initial value determination in step S3 is specifically calculated as follows:
a) Initial value of noise covariance: calculating by using the first frame data and adopting an S2 covariance calculation method;
b) Initial value of the noise power spectral density: obtained by applying a DCT to the result of the cyclic convolution, whose biased estimate is implicitly tapered by a Bartlett (triangular) window.
5. The method according to claim 1, wherein in step S4.1.1 the noise covariance is calculated from the covariance of the current frame and the previous noise covariance using a forgetting factor.
6. The method according to claim 1, wherein in step S4.2.1, because the wake-up word belongs to the desired signal, the covariance and power spectral density calculated from the wake-up word cannot be used as the noise covariance and power spectral density; instead, the noise covariance and power spectral density from before the wake-up word are taken out of the wake-stage noise covariance and power spectrum storage maintained in step S3.
7. The method for voice noise reduction based on a microphone array combined with a wake-up word according to claim 1, wherein the wake stage lasts from wake-up until the speech recognition result returns or until a certain time has elapsed after wake-up, the certain time being the average duration needed to finish speaking the sentence to be recognized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011061741.2A CN114333884B (en) | 2020-09-30 | 2020-09-30 | Voice noise reduction method based on combination of microphone array and wake-up word |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114333884A CN114333884A (en) | 2022-04-12 |
CN114333884B true CN114333884B (en) | 2024-05-03 |
Family
ID=81010630
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011061741.2A Active CN114333884B (en) | 2020-09-30 | 2020-09-30 | Voice noise reduction method based on combination of microphone array and wake-up word |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114333884B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108122563A (en) * | 2017-12-19 | 2018-06-05 | 北京声智科技有限公司 | Improve voice wake-up rate and the method for correcting DOA |
CN108538305A (en) * | 2018-04-20 | 2018-09-14 | 百度在线网络技术(北京)有限公司 | Audio recognition method, device, equipment and computer readable storage medium |
CN109949810A (en) * | 2019-03-28 | 2019-06-28 | 华为技术有限公司 | A kind of voice awakening method, device, equipment and medium |
US10667045B1 (en) * | 2018-12-28 | 2020-05-26 | Ubtech Robotics Corp Ltd | Robot and auto data processing method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110556103B (en) | Audio signal processing method, device, system, equipment and storage medium | |
CN109473118B (en) | Dual-channel speech enhancement method and device | |
CN109584896A (en) | A kind of speech chip and electronic equipment | |
CN106875938B (en) | Improved nonlinear self-adaptive voice endpoint detection method | |
CN110211599B (en) | Application awakening method and device, storage medium and electronic equipment | |
CN109509465B (en) | Voice signal processing method, assembly, equipment and medium | |
CN110634497A (en) | Noise reduction method and device, terminal equipment and storage medium | |
CN110610718B (en) | Method and device for extracting expected sound source voice signal | |
CN111435598B (en) | Voice signal processing method, device, computer readable medium and electronic equipment | |
CN110660407B (en) | Audio processing method and device | |
Wang et al. | Mask weighted STFT ratios for relative transfer function estimation and its application to robust ASR | |
CN105355199A (en) | Model combination type speech recognition method based on GMM (Gaussian mixture model) noise estimation | |
CN110706719A (en) | Voice extraction method and device, electronic equipment and storage medium | |
CN114171041A (en) | Voice noise reduction method, device and equipment based on environment detection and storage medium | |
Han et al. | Robust GSC-based speech enhancement for human machine interface | |
CN113203987A (en) | Multi-sound-source direction estimation method based on K-means clustering | |
CN113870893A (en) | Multi-channel double-speaker separation method and system | |
CN112259117B (en) | Target sound source locking and extracting method | |
WO2024017110A1 (en) | Voice noise reduction method, model training method, apparatus, device, medium, and product | |
CN114333884B (en) | Voice noise reduction method based on combination of microphone array and wake-up word | |
CN112363112A (en) | Sound source positioning method and device based on linear microphone array | |
CN111192569B (en) | Double-microphone voice feature extraction method and device, computer equipment and storage medium | |
CN115620739A (en) | Method for enhancing voice in specified direction, electronic device and storage medium | |
CN113223552A (en) | Speech enhancement method, speech enhancement device, speech enhancement apparatus, storage medium, and program | |
CN113611319A (en) | Wind noise suppression method, device, equipment and system based on voice component |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||