CN111028857B - Method and system for reducing noise of multichannel audio-video conference based on deep learning - Google Patents
Info
- Publication number
- CN111028857B CN111028857B CN201911378821.8A CN201911378821A CN111028857B CN 111028857 B CN111028857 B CN 111028857B CN 201911378821 A CN201911378821 A CN 201911378821A CN 111028857 B CN111028857 B CN 111028857B
- Authority
- CN
- China
- Prior art keywords
- noise
- covariance matrix
- calculating
- frequency domain
- channel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0264—Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
Abstract
The invention relates to a method and a system for reducing noise in a multichannel audio-video conference based on deep learning, wherein the method comprises the following steps: collecting original multichannel signals and converting the collected time domain signals into frequency domain signals; calculating the noise probability existing on each frequency band by using a neural network, and calculating a covariance matrix of the noise from the noise probability; calculating the eigenvectors of the covariance matrix of the noise, and calculating the weights for combining the multiple channels from the covariance matrix of the noise and its eigenvectors; and outputting a noise reduction result according to the combining weights and the frequency domain signal. The invention has the advantages of high recognition efficiency, a small amount of calculation, and good performance in actual use.
Description
Technical Field
The invention relates to the technical field of noise reduction processing, in particular to a multichannel audio-video conference noise reduction method based on deep learning.
Background
Noise is common in audio-video conferences, for example table-knocking sounds, keyboard-typing sounds, or squeaks from a desk, all of which greatly degrade the quality of the conference. In addition, relatively loud noise may also come from the other end of the video conference; for example, the remote party may be on a train or otherwise in transit. When the noise is loud, participants must concentrate hard to follow what is being said, which costs considerable mental effort and leaves them very tired.
Solving the problem of conference noise generally involves acoustic processing: noise must be removed from the acoustic signal by exploiting its acoustic characteristics. The acoustic signal is a one-dimensional time domain signal, and a common processing approach is to decompose it into a two-dimensional time-frequency representation using a mathematical transform such as the Fourier transform. However, human speech and noises such as table knocks overlap in time-frequency space, so there is no straightforward way to separate them.
In recent years, with the development of deep learning, deep learning methods have been applied to the noise reduction problem; for example, in "Recurrent Neural Networks for Noise Reduction in Robust ASR" the authors use an RNN to denoise the acoustic signal. In actual use, however, the following problems remain: noise that should theoretically be estimated within 3-5 seconds takes 8-16 seconds in practice, so the method is too slow; for noise types not seen during training, recognition efficiency is very low; and the amount of computation is too large, so the method performs poorly in practical use.
Disclosure of Invention
Therefore, the invention aims to solve the technical problems of sound-quality loss and poor practical performance in the prior art, and provides a deep-learning-based method and system for reducing noise in a multichannel audio-video conference that preserve sound quality and perform well in actual use.
In order to solve the technical problems, the method for reducing noise of the multichannel audio-video conference based on deep learning comprises the following steps: collecting original multichannel signals, and converting the collected time domain signals into frequency domain signals; calculating noise probability existing on each frequency band by using a neural network, and calculating a covariance matrix of the noise by using the noise probability; calculating eigenvectors of a covariance matrix of the noise through the covariance matrix of the noise, and calculating weights of the combined multiple channels according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise; and outputting a noise reduction result according to the weight of the combined multi-channel and the frequency domain signal.
In one embodiment of the present invention, the method for acquiring the original multi-channel signal is as follows: the raw multichannel signal is acquired by a microphone array.
In one embodiment of the invention, the method for converting the acquired time domain signal into the frequency domain signal comprises the following steps: the acquired time domain signal is converted into the frequency domain signal by fast fourier transform using a single filter or a plurality of filters.
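As an illustrative sketch (not part of the patent's disclosure), the time-to-frequency conversion can be performed with a windowed FFT; the frame length, hop size, and Hann window below are assumed values, not taken from the embodiment:

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    """Convert a 1-D time-domain signal into a time-frequency matrix via
    windowed FFT. frame_len and hop are illustrative, not from the patent."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-redundant positive-frequency bins
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, frame_len//2 + 1)

# Multichannel case: apply the same transform to every microphone channel
fs = 16000  # the embodiment states a 16 kHz sampling rate
t = np.arange(fs) / fs
channels = [np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 440 * t + 0.1)]
Y = np.stack([stft(c) for c in channels])  # shape: (N_channels, T, F)
```

Each channel yields a complex time-frequency matrix, so the stacked array `Y` holds the per-channel frequency domain signals Y_{i,t} used in the later steps.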
In one embodiment of the present invention, the noise probability existing on each frequency band is calculated by the neural network as follows: pre-labelled data are input into the neural network, which after computation outputs the noise probability for each frequency band.
In one embodiment of the present invention, the covariance matrix of the noise is calculated as follows: let the covariance matrix of the noise be Φ_f and the frequency domain signal be Y_{i,t}; then Φ_f = Σ_t Y_t · Y_t^H, where Y_t = (Y_{1,t}, …, Y_{N,t})^T, Y_{i,t} represents the frequency domain signal of the i-th channel at time t, N represents the number of channels, and Y_t^H is the conjugate transpose of Y_t.
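A minimal numerical sketch of the covariance estimate for a single frequency bin. Weighting each frame's outer product by the network's noise probability is an assumption made here to connect this step with step S2; the patent's exact formula image is not reproduced in the text:

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 4, 100            # channels and frames (illustrative sizes)
# Complex frequency-domain samples of one bin f across channels and time
Y = rng.normal(size=(N, T)) + 1j * rng.normal(size=(N, T))
p = rng.uniform(size=T)  # per-frame noise probability from the network

# Probability-weighted sum of outer products, normalised by the total weight:
# Phi_f = sum_t p_t * y_t y_t^H / sum_t p_t
Phi_f = (p * Y) @ Y.conj().T / p.sum()   # shape (N, N), Hermitian
```

Frames the network judges noisier contribute more to the estimate, which is why the covariance can converge quickly once the probabilities are available.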
In one embodiment of the present invention, the eigenvectors of the covariance matrix of the noise are obtained from the eigendecomposition Φ_f W_f = W_f Λ, where W_f is the matrix of eigenvectors of the noise covariance matrix Φ_f and Λ is the diagonal matrix of eigenvalues.
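The relation Φ_f W_f = W_f Λ can be computed directly with `numpy.linalg.eigh`, which is appropriate because a covariance matrix is Hermitian; the 2×2 matrix below is a made-up example, not data from the patent:

```python
import numpy as np

# A small Hermitian matrix standing in for the noise covariance Phi_f
Phi_f = np.array([[2.0, 1j],
                  [-1j, 2.0]])
eigvals, W_f = np.linalg.eigh(Phi_f)  # eigh: Hermitian eigendecomposition
Lam = np.diag(eigvals)                # diagonal matrix of eigenvalues
# By construction, Phi_f @ W_f equals W_f @ Lam
```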
In one embodiment of the present invention, the weights for combining the multiple channels are calculated from the noise covariance matrix Φ_f and its eigenvectors W_f, where w_f denotes the combining weight and W_f^H denotes the conjugate transpose of W_f.
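One common minimum-variance choice consistent with the description — an assumption on our part, since the patent's weight formula image is not reproduced in the text — is the MVDR solution w = Φ⁻¹ d / (dᴴ Φ⁻¹ d), sketched here with a steering vector d that could be taken from an eigenvector of the covariance:

```python
import numpy as np

# Illustrative noise covariance and steering vector (made-up values)
Phi_f = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
d = np.array([1.0, 1.0]) / np.sqrt(2)

# MVDR weights: minimise noise power subject to w^H d = 1
Phi_inv_d = np.linalg.solve(Phi_f, d)     # Phi_f^{-1} d without forming the inverse
w_f = Phi_inv_d / (d.conj() @ Phi_inv_d)  # normalise to satisfy the constraint
```

The denominator enforces the distortionless constraint, so the target direction passes through with unit gain while correlated noise is suppressed.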
In one embodiment of the present invention, the noise reduction result is output by applying the combining weights w_f to the frequency domain signal.
the invention also discloses a multichannel audio-video conference noise reduction system based on deep learning, which comprises: the acquisition module is used for acquiring original multichannel signals and converting the acquired time domain signals into frequency domain signals; the first calculation module is used for calculating noise probability existing on each frequency band by using the neural network, and calculating a covariance matrix of the noise by the noise probability; the second calculation module is used for calculating the eigenvectors of the covariance matrix of the noise through the covariance matrix of the noise, and calculating the weights of the combined multiple channels according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise; and the output module is used for outputting a noise reduction result according to the weight of the combined multi-channel and the frequency domain signal.
Compared with the prior art, the technical scheme of the invention has the following advantages:
according to the method and the system for reducing noise of the multichannel audio-video conference based on deep learning, the covariance matrix of noise can be calculated more rapidly and effectively, and then the covariance matrix is brought into a traditional signal processing frame, so that the covariance matrix of noise can be converged rapidly, and the spectrum matrix of the noise can be calculated; in addition, the invention uses the physical characteristics of the signals to reduce dryness and uses the traditional signal processing framework with physical significance, so the restored original sound is more real.
Drawings
In order that the invention may be more readily understood, a more particular description of the invention will be rendered by reference to specific embodiments thereof that are illustrated in the appended drawings, in which
FIG. 1 is a flow chart of a method for reducing noise of a multichannel audio-video conference based on deep learning;
fig. 2 is a schematic diagram of a system for noise reduction in a multi-channel audio-video conference based on deep learning.
Detailed Description
Example 1
As shown in fig. 1, this embodiment provides a method for reducing noise in a multichannel audio-video conference based on deep learning, which includes the following steps: Step S1: collecting original multichannel signals, and converting the collected time domain signals into frequency domain signals; Step S2: calculating the noise probability existing on each frequency band by using a neural network, and calculating a covariance matrix of the noise from the noise probability; Step S3: calculating the eigenvectors of the covariance matrix of the noise, and calculating the weights for combining the multiple channels from the covariance matrix of the noise and its eigenvectors; Step S4: outputting a noise reduction result according to the combining weights and the frequency domain signal.
In the above method, step S1 collects the original multichannel signals and converts the collected time domain signals into frequency domain signals, which facilitates subsequent processing. Step S2 calculates the noise probability on each frequency band using the neural network and computes the noise covariance matrix from that probability, so the covariance matrix converges quickly, which helps in calculating the noise spectrum matrix. Step S3 calculates the eigenvectors of the noise covariance matrix and the multichannel combining weights from the covariance matrix and its eigenvectors; because the physical characteristics of the signals are exploited to reduce the noise, the recognition efficiency is high. Step S4 outputs the noise reduction result from the combining weights and the frequency domain signal, which not only helps restore the original sound more realistically but is also fast and effective in actual use.
The original multichannel signals are collected through a microphone array, so the acquisition is both accurate and fast. In addition, in this embodiment, the sampling rate is 16 kHz.
The method for converting the acquired time domain signals into frequency domain signals comprises the following steps: the acquired time domain signal is converted into the frequency domain signal by fast fourier transform using a single filter or a plurality of filters. In this embodiment, a multi-filter bank is used, so that signals of each frequency band can be effectively reserved.
The noise probability existing on each frequency band is calculated by the neural network as follows: pre-labelled data are input into the neural network, which outputs the noise probability for each frequency band. This approach is simple and the amount of calculation is small, so it is fast.
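As a toy illustration of such a per-band probability estimator — the architecture, sizes, and random weights below are stand-in assumptions; the patent only specifies that the network is trained on pre-labelled data — a one-hidden-layer network with a sigmoid output keeps every band's probability in (0, 1):

```python
import numpy as np

rng = np.random.default_rng(0)

def noise_probability(mag_frame, W1, b1, W2, b2):
    """Toy one-hidden-layer network mapping a magnitude-spectrum frame to a
    per-band noise probability. Weights are random stand-ins; in the patent
    they would be learned from pre-labelled training data."""
    h = np.tanh(mag_frame @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid per frequency band

F = 257  # frequency bands, e.g. from a 512-point FFT (an assumption)
W1 = rng.normal(scale=0.1, size=(F, 64)); b1 = np.zeros(64)
W2 = rng.normal(scale=0.1, size=(64, F)); b2 = np.zeros(F)

mag = np.abs(rng.normal(size=F))          # one magnitude-spectrum frame
p = noise_probability(mag, W1, b1, W2, b2)  # noise probability per band
```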
The covariance matrix of the noise is calculated as follows: let the covariance matrix of the noise be Φ_f and the frequency domain signal be Y_{i,t}; then Φ_f = Σ_t Y_t · Y_t^H, where Y_t = (Y_{1,t}, …, Y_{N,t})^T, Y_{i,t} represents the frequency domain signal of the i-th channel at time t, N represents the number of channels, and Y_t^H is the conjugate transpose of Y_t; Φ_f characterizes the spectrum of the noise. The eigenvectors of the noise covariance matrix are obtained from the eigendecomposition Φ_f W_f = W_f Λ, where W_f is the matrix of eigenvectors of Φ_f and Λ is the diagonal matrix of eigenvalues.
The weights for combining the multiple channels are calculated from the noise covariance matrix Φ_f and its eigenvectors W_f, where w_f denotes the combining weight and W_f^H is the conjugate transpose of W_f. Because the noise covariance matrix Φ_f is substituted into a traditional minimum variance filter, the calculation is simple and fast.
The noise reduction result is output by applying the combining weights w_f to the frequency domain signal. Because the invention reduces noise by exploiting the physical characteristics of the signals and uses a traditional, physically meaningful signal processing framework, the restored original sound is more realistic.
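Applying the combining weights to the frequency domain signal can be sketched as a weighted sum across channels; the uniform weights below are a placeholder assumption standing in for the minimum-variance solution:

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 4, 50  # channels and frames (illustrative sizes)
# Multichannel frequency-domain samples of one bin f
Y = rng.normal(size=(N, T)) + 1j * rng.normal(size=(N, T))

w_f = np.ones(N) / N     # placeholder weights; in practice the MVDR solution
S_hat = w_f.conj() @ Y   # shape (T,): one noise-reduced output per frame
```

Repeating this per frequency bin and inverting the FFT yields the single-channel, noise-reduced time-domain output.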
Example two
Based on the same inventive concept, this embodiment provides a system for reducing noise in a multichannel audio-video conference based on deep learning. Since its problem-solving principle is similar to that of the method described above, repeated details are omitted.
Referring to fig. 2, the system for noise reduction in a multi-channel audio/video conference based on deep learning according to the present embodiment includes:
the acquisition module is used for acquiring original multichannel signals and converting the acquired time domain signals into frequency domain signals;
the first calculation module is used for calculating noise probability existing on each frequency band by using the neural network, and calculating a covariance matrix of the noise by the noise probability;
the second calculation module is used for calculating the eigenvectors of the covariance matrix of the noise through the covariance matrix of the noise, and calculating the weights of the combined multiple channels according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise;
and the output module is used for outputting a noise reduction result according to the weight of the combined multi-channel and the frequency domain signal.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above examples are given by way of illustration only and do not limit the embodiments. Other variations and modifications will be apparent to those of ordinary skill in the art in light of the foregoing description; it is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived therefrom remain within the scope of the invention.
Claims (6)
1. The method for reducing noise of the multichannel audio-video conference based on the deep learning is characterized by comprising the following steps of:
step S1: collecting original multichannel signals, and converting the collected time domain signals into frequency domain signals;
step S2: the method for calculating the covariance matrix of the noise by using the neural network to calculate the noise probability existing on each frequency band comprises the following steps: if the covariance matrix of the noise is phi f The frequency domain signal is Y i,t ThenWherein Y is i,t Representing the frequency domain signal of the ith channel at time t, N representing the number of channels, +.>Is Y i,t The eigenvector calculation method of the covariance matrix of the noise is phi f W f =W f Λ, wherein the eigenvector of the covariance matrix of the noise is W f The covariance matrix of the noise is phi f And a represents a matrix of characteristic values, and the method for calculating the weight of the combined multi-channel is as follows: />Wherein the weight of the merging multi-channel is +.> Is W f Is a conjugate transpose of (2);
step S3: calculating eigenvectors of a covariance matrix of the noise through the covariance matrix of the noise, and calculating weights of the combined multiple channels according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise;
step S4: and outputting a noise reduction result according to the weight of the combined multi-channel and the frequency domain signal.
2. The deep learning-based multichannel audio-video conference noise reduction method according to claim 1, wherein: the method for collecting the original multichannel signals comprises the following steps: the raw multichannel signal is acquired by a microphone array.
3. The method for noise reduction of a deep learning-based multichannel audio-video conference according to claim 1, wherein: the method for converting the acquired time domain signals into frequency domain signals comprises the following steps: the acquired time domain signal is converted into the frequency domain signal by fast fourier transform using a single filter or a plurality of filters.
4. The method for noise reduction of a deep learning-based multichannel audio-video conference according to claim 1, wherein: the method for calculating the noise probability existing on each frequency band by using the neural network comprises the following steps: and inputting the data marked in advance into the neural network, and outputting noise probability existing on each frequency band after calculation of the neural network.
5. The method for noise reduction of a deep learning-based multichannel audio-video conference according to claim 1, wherein: the noise reduction result is output by applying the combining weights w_f to the frequency domain signal.
6. a system for noise reduction in a multichannel audio-video conference based on deep learning, comprising:
the acquisition module is used for acquiring original multichannel signals and converting the acquired time domain signals into frequency domain signals;
the first calculating module is used for calculating noise probability existing on each frequency band by using a neural network, and calculating a covariance matrix of the noise by the noise probability, wherein the calculating method of the covariance matrix of the noise comprises the following steps: if the covariance matrix of the noise is phi f The frequency domain signal is Y i,t ThenWherein Y is i,t Representing the frequency domain signal of the ith channel at time t, N representing the number of channels, +.>Is Y i,t The eigenvector calculation method of the covariance matrix of the noise is phi f W f =W f Λ, wherein the eigenvector of the covariance matrix of the noise is W f The covariance matrix of the noise is phi f And a represents a matrix of characteristic values, and the method for calculating the weight of the combined multi-channel is as follows: />Wherein the weight of the merging multi-channel is +.> Is the conjugate transpose of Wf;
the second calculation module is used for calculating the eigenvectors of the covariance matrix of the noise through the covariance matrix of the noise, and calculating the weights of the combined multiple channels according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise;
and the output module is used for outputting a noise reduction result according to the weight of the combined multi-channel and the frequency domain signal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911378821.8A CN111028857B (en) | 2019-12-27 | 2019-12-27 | Method and system for reducing noise of multichannel audio-video conference based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111028857A CN111028857A (en) | 2020-04-17 |
CN111028857B true CN111028857B (en) | 2024-01-19 |
Family
ID=70196500
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911378821.8A Active CN111028857B (en) | 2019-12-27 | 2019-12-27 | Method and system for reducing noise of multichannel audio-video conference based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111028857B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113628633A (en) * | 2021-10-14 | 2021-11-09 | 辰风策划(深圳)有限公司 | Noise reduction method for multi-channel information transmission of enterprise multi-party meeting |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013054258A (en) * | 2011-09-06 | 2013-03-21 | Nippon Telegr & Teleph Corp <Ntt> | Sound source separation device and method, and program |
CN103517185A (en) * | 2012-06-26 | 2014-01-15 | 鹦鹉股份有限公司 | Method for suppressing noise in an acoustic signal for a multi-microphone audio device operating in a noisy environment |
CN103811020A (en) * | 2014-03-05 | 2014-05-21 | 东北大学 | Smart voice processing method |
CN106653047A (en) * | 2016-12-16 | 2017-05-10 | 广州视源电子科技股份有限公司 | Automatic gain control method and device for audio data |
CN108831495A (en) * | 2018-06-04 | 2018-11-16 | 桂林电子科技大学 | A kind of sound enhancement method applied to speech recognition under noise circumstance |
CN109994120A (en) * | 2017-12-29 | 2019-07-09 | 福州瑞芯微电子股份有限公司 | Sound enhancement method, system, speaker and storage medium based on diamylose |
CN110136737A (en) * | 2019-06-18 | 2019-08-16 | 北京拙河科技有限公司 | A kind of voice de-noising method and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9173025B2 (en) * | 2012-02-08 | 2015-10-27 | Dolby Laboratories Licensing Corporation | Combined suppression of noise, echo, and out-of-location signals |
-
2019
- 2019-12-27 CN CN201911378821.8A patent/CN111028857B/en active Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013054258A (en) * | 2011-09-06 | 2013-03-21 | Nippon Telegr & Teleph Corp <Ntt> | Sound source separation device and method, and program |
CN103517185A (en) * | 2012-06-26 | 2014-01-15 | 鹦鹉股份有限公司 | Method for suppressing noise in an acoustic signal for a multi-microphone audio device operating in a noisy environment |
CN103811020A (en) * | 2014-03-05 | 2014-05-21 | 东北大学 | Smart voice processing method |
CN106653047A (en) * | 2016-12-16 | 2017-05-10 | 广州视源电子科技股份有限公司 | Automatic gain control method and device for audio data |
CN109994120A (en) * | 2017-12-29 | 2019-07-09 | 福州瑞芯微电子股份有限公司 | Sound enhancement method, system, speaker and storage medium based on diamylose |
CN108831495A (en) * | 2018-06-04 | 2018-11-16 | 桂林电子科技大学 | A kind of sound enhancement method applied to speech recognition under noise circumstance |
CN110136737A (en) * | 2019-06-18 | 2019-08-16 | 北京拙河科技有限公司 | A kind of voice de-noising method and device |
Non-Patent Citations (1)
Title |
---|
He Xi; Yang Xuemei; Xu Jiapin. A spectrum sensing algorithm based on the maximum-eigenvalue distribution of random matrices. Computer Measurement & Control, no. 02, full text. *
Also Published As
Publication number | Publication date |
---|---|
CN111028857A (en) | 2020-04-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110970053B (en) | Multichannel speaker-independent voice separation method based on deep clustering | |
Grais et al. | Raw multi-channel audio source separation using multi-resolution convolutional auto-encoders | |
WO2015196729A1 (en) | Microphone array speech enhancement method and device | |
JP6482173B2 (en) | Acoustic signal processing apparatus and method | |
JP2007017900A (en) | Data-embedding device and method, and data-extracting device and method | |
US9530429B2 (en) | Reverberation suppression apparatus used for auditory device | |
CN111863015A (en) | Audio processing method and device, electronic equipment and readable storage medium | |
CN111429939A (en) | Sound signal separation method of double sound sources and sound pickup | |
CN106664472A (en) | Signal processing apparatus, signal processing method, and computer program | |
CN110503967B (en) | Voice enhancement method, device, medium and equipment | |
CN116405823B (en) | Intelligent audio denoising enhancement method for bone conduction earphone | |
US20240177726A1 (en) | Speech enhancement | |
CN111028857B (en) | Method and system for reducing noise of multichannel audio-video conference based on deep learning | |
CN107592600B (en) | Pickup screening method and pickup device based on distributed microphones | |
CN113744715A (en) | Vocoder speech synthesis method, device, computer equipment and storage medium | |
CN112908353A (en) | Voice enhancement method for hearing aid by combining edge computing and cloud computing | |
Kates et al. | Integrating cognitive and peripheral factors in predicting hearing-aid processing effectiveness | |
CN114023352B (en) | Voice enhancement method and device based on energy spectrum depth modulation | |
CN110992966B (en) | Human voice separation method and system | |
CN114189781A (en) | Noise reduction method and system for double-microphone neural network noise reduction earphone | |
CN114283832A (en) | Processing method and device for multi-channel audio signal | |
Donley et al. | DARE-Net: Speech dereverberation and room impulse response estimation | |
Muhsina et al. | Signal enhancement of source separation techniques | |
JP2008278406A (en) | Sound source separation apparatus, sound source separation program and sound source separation method | |
JP3787103B2 (en) | Speech processing apparatus, speech processing method, speech processing program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information |
Address after: No. 229, Lingqiao Road, Haishu District, Ningbo, Zhejiang 315000 Applicant after: Suzhou Auditoryworks Co.,Ltd. Address before: 215000 unit 2-b504, creative industry park, 328 Xinghu street, Suzhou Industrial Park, Jiangsu Province Applicant before: Suzhou frog sound technology Co.,Ltd. |
|
GR01 | Patent grant | ||