CN111028857B - Method and system for reducing noise of multichannel audio-video conference based on deep learning - Google Patents

Method and system for reducing noise of multichannel audio-video conference based on deep learning Download PDF

Info

Publication number
CN111028857B
CN111028857B CN201911378821.8A CN201911378821A CN111028857B CN 111028857 B CN111028857 B CN 111028857B CN 201911378821 A CN201911378821 A CN 201911378821A CN 111028857 B CN111028857 B CN 111028857B
Authority
CN
China
Prior art keywords
noise
covariance matrix
calculating
frequency domain
channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911378821.8A
Other languages
Chinese (zh)
Other versions
CN111028857A (en
Inventor
辛鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Auditoryworks Co ltd
Original Assignee
Suzhou Auditoryworks Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Auditoryworks Co ltd filed Critical Suzhou Auditoryworks Co ltd
Priority to CN201911378821.8A priority Critical patent/CN111028857B/en
Publication of CN111028857A publication Critical patent/CN111028857A/en
Application granted granted Critical
Publication of CN111028857B publication Critical patent/CN111028857B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L21/0232Processing in the frequency domain
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0264Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention relates to a method and a system for reducing noise of a multichannel audio-video conference based on deep learning, wherein the method comprises the following steps: collecting original multichannel signals, and converting the collected time domain signals into frequency domain signals; calculating noise probability existing on each frequency band by using a neural network, and calculating a covariance matrix of the noise by using the noise probability; calculating eigenvectors of a covariance matrix of the noise through the covariance matrix of the noise, and calculating weights of the combined multiple channels according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise; and outputting a noise reduction result according to the weight of the combined multi-channel and the frequency domain signal. The invention has the advantages of high recognition efficiency, small calculated amount and poor actual use effect.

Description

Method and system for reducing noise of multichannel audio-video conference based on deep learning
Technical Field
The invention relates to the technical field of noise reduction processing, in particular to a multichannel audio-video conference noise reduction method based on deep learning.
Background
Noise is usually generated in an audio-video conference, such as a table-knocking sound, a keyboard-knocking sound, a squeak sound generated by a table, etc., which greatly affect the quality of the conference, and besides, the other end of the video conference also generates relatively large noise, for example, a party opposite to the audio-video may be located on a train or in motion. When the noise is loud, the participant usually needs to concentrate on what he or she can say, thus causing the participant to spend much mental effort and resulting in very tiredness.
To solve the problem of conference noise, which generally involves acoustic processing, it is necessary to remove noise from an acoustic signal by utilizing acoustic characteristics. The acoustic signal is a one-dimensional time domain signal, and a common processing method is to decompose the signal into two-dimensional time-frequency signals by using a mathematical method such as fourier transform. However, the sounds of human voice and noise such as a knock table or the like are coincident in time-frequency space, and thus there is no very good way to distinguish between them.
In recent years, with the development of deep learning, a deep learning method has been used to solve the noise reduction problem, for example, "Recurrent Neural Networks for Noise Reduction in Robust ASR", in which an author uses RNN to perform noise reduction on an acoustic signal, however, in the actual use process, the following problems exist: the noise is estimated in 3-5 seconds theoretically, but in practical use, the noise can be estimated in 8-16 seconds, so that the practical use speed is too slow; for noise data types that are not trained, the recognition efficiency is very low; the calculated amount is too large, so that the effect is poor in practical use.
Disclosure of Invention
Therefore, the invention aims to solve the technical problem that the quality of the sound is lost in the prior art, and the use effect is poor, so that the method and the system for reducing the noise of the multichannel audio-video conference based on the deep learning, which have the advantages of no loss of the quality of the sound and good practical use effect, are provided.
In order to solve the technical problems, the method for reducing noise of the multichannel audio-video conference based on deep learning comprises the following steps: collecting original multichannel signals, and converting the collected time domain signals into frequency domain signals; calculating noise probability existing on each frequency band by using a neural network, and calculating a covariance matrix of the noise by using the noise probability; calculating eigenvectors of a covariance matrix of the noise through the covariance matrix of the noise, and calculating weights of the combined multiple channels according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise; and outputting a noise reduction result according to the weight of the combined multi-channel and the frequency domain signal.
In one embodiment of the present invention, the method for acquiring the original multi-channel signal is as follows: the raw multichannel signal is acquired by a microphone array.
In one embodiment of the invention, the method for converting the acquired time domain signal into the frequency domain signal comprises the following steps: the acquired time domain signal is converted into the frequency domain signal by fast fourier transform using a single filter or a plurality of filters.
In one embodiment of the present invention, the method for calculating the probability of noise existing on each frequency band by using the neural network is as follows: and inputting the data marked in advance into the neural network, and outputting noise probability existing on each frequency band after calculation of the neural network.
In one embodiment of the present invention, the method for calculating the covariance matrix of the noise is as follows: if the covariance matrix of the noise is phi f The frequency domain signal is Y i,t ThenWherein Y is i,t Representing the frequency domain signal of the ith channel at time t, N representing the number of channels, +.>Is Y i,t Is a conjugate transpose of (a).
In one embodiment of the present invention, the eigenvector calculation method of the covariance matrix of the noise is Φ f W f =W f Λ, wherein the eigenvector of the covariance matrix of the noise is W f The covariance matrix of the noise is phi f Λ represents a matrix of eigenvalues.
In one embodiment of the present invention, the method for calculating the weights of the merging multi-channels is as follows:
wherein the weight of the merging multi-channel is +.> Is W f Is a conjugate transpose of (a).
In one embodiment of the present invention, the method for outputting the noise reduction result according to the weights of the combining multiple channels and the frequency domain signal includes:
the invention also discloses a multichannel audio-video conference noise reduction system based on deep learning, which comprises: the acquisition module is used for acquiring original multichannel signals and converting the acquired time domain signals into frequency domain signals; the first calculation module is used for calculating noise probability existing on each frequency band by using the neural network, and calculating a covariance matrix of the noise by the noise probability; the second calculation module is used for calculating the eigenvectors of the covariance matrix of the noise through the covariance matrix of the noise, and calculating the weights of the combined multiple channels according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise; and the output module is used for outputting a noise reduction result according to the weight of the combined multi-channel and the frequency domain signal.
Compared with the prior art, the technical scheme of the invention has the following advantages:
according to the method and the system for reducing noise of the multichannel audio-video conference based on deep learning, the covariance matrix of noise can be calculated more rapidly and effectively, and then the covariance matrix is brought into a traditional signal processing frame, so that the covariance matrix of noise can be converged rapidly, and the spectrum matrix of the noise can be calculated; in addition, the invention uses the physical characteristics of the signals to reduce dryness and uses the traditional signal processing framework with physical significance, so the restored original sound is more real.
Drawings
In order that the invention may be more readily understood, a more particular description of the invention will be rendered by reference to specific embodiments thereof that are illustrated in the appended drawings, in which
FIG. 1 is a flow chart of a method for reducing noise of a multichannel audio-video conference based on deep learning;
fig. 2 is a schematic diagram of a system for noise reduction in a multi-channel audio-video conference based on deep learning.
Detailed Description
Example 1
As shown in fig. 1, the embodiment provides a method for reducing noise of a multichannel audio/video conference based on deep learning, which includes the following steps: s1, acquiring original multi-channel signals, and converting the acquired time domain signals into frequency domain signals; s2, calculating noise probability existing on each frequency band by using a neural network, and calculating a covariance matrix of noise according to the noise probability; step S3: calculating eigenvectors of a covariance matrix of the noise through the covariance matrix of the noise, and calculating weights of the combined multiple channels according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise; step S4: and outputting a noise reduction result according to the weight of the combined multi-channel and the frequency domain signal.
In the method for reducing noise of the multichannel audio-video conference based on deep learning, in the step S1, original multichannel signals are collected, and the collected time domain signals are converted into frequency domain signals, so that subsequent processing of the signals is facilitated; in the step S2, the noise probability existing in each frequency band is calculated by using the neural network, and the covariance matrix of the noise is calculated by using the noise probability, so that the covariance matrix can be quickly converged, thereby being beneficial to calculating the spectrum matrix of the noise; in the step S3, the eigenvectors of the covariance matrix of the noise are calculated through the covariance matrix of the noise, and the weights of the combined multiple channels are calculated according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise, so that the recognition efficiency is high because the physical characteristics of the signals are utilized to reduce dryness; in the step S4, a noise reduction result is output according to the weight of the combined multi-channel and the frequency domain signal, which is not only beneficial to recovering the original sound and making it more realistic, but also has high speed and good use effect in actual use.
The method for collecting the original multichannel signals comprises the following steps: the original multichannel signals are collected through the microphone array, so that the collected signals are accurate and the speed is high. In addition, in this embodiment, the sampling rate is 16Khz.
The method for converting the acquired time domain signals into frequency domain signals comprises the following steps: the acquired time domain signal is converted into the frequency domain signal by fast fourier transform using a single filter or a plurality of filters. In this embodiment, a multi-filter bank is used, so that signals of each frequency band can be effectively reserved.
The method for calculating the noise probability existing on each frequency band by using the neural network comprises the following steps: the data marked in advance is input into the neural network, and noise probability existing on each frequency band is output after calculation through the neural network, so that the method is simple, and the calculation amount is small, so that the speed is high.
The method for calculating the covariance matrix of the noise comprises the following steps: if the covariance matrix of the noise is phi f The frequency domain signal is Y i,t ThenWherein Y is i,t Representing the frequency domain signal of the ith channel at time t, N representing the number of channels, +.>Is Y i,t Which represents the spectrum of noise. The eigenvector calculation method of the covariance matrix of the noise is phi f W f =W f Λ, wherein the eigenvector of the covariance matrix of the noise is W f The covariance matrix of the noise is phi f Λ represents a matrix of eigenvalues.
The method for calculating the weight of the combined multi-channel comprises the following steps:
wherein the weight of the merging multi-channel is +.> Is W f Is a conjugate transpose of (a). Due to covariance matrix phi of noise f Is brought into a traditional minimum variance filter, so that the calculation is simple and quick.
The method for outputting the noise reduction result according to the weight of the combined multi-channel and the frequency domain signal comprises the following steps:the invention uses the physical characteristics of the signals to reduce dryness and uses the traditional signal processing framework with physical significance, so the restored original sound is more real.
Example two
Based on the same inventive concept, the present embodiment provides a system for reducing noise of a multichannel audio-video conference based on deep learning, and the principle of solving the problem is similar to that of the method for reducing noise of the multichannel audio-video conference based on deep learning, and the repetition is omitted.
Referring to fig. 2, the system for noise reduction in a multi-channel audio/video conference based on deep learning according to the present embodiment includes:
the acquisition module is used for acquiring original multichannel signals and converting the acquired time domain signals into frequency domain signals;
the first calculation module is used for calculating noise probability existing on each frequency band by using the neural network, and calculating a covariance matrix of the noise by the noise probability;
the second calculation module is used for calculating the eigenvectors of the covariance matrix of the noise through the covariance matrix of the noise, and calculating the weights of the combined multiple channels according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise;
and the output module is used for outputting a noise reduction result according to the weight of the combined multi-channel and the frequency domain signal.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It is apparent that the above examples are given by way of illustration only and are not limiting of the embodiments. Other variations and modifications of the present invention will be apparent to those of ordinary skill in the art in light of the foregoing description. It is not necessary here nor is it exhaustive of all embodiments. While still being apparent from variations or modifications that may be made by those skilled in the art are within the scope of the invention.

Claims (6)

1. The method for reducing noise of the multichannel audio-video conference based on the deep learning is characterized by comprising the following steps of:
step S1: collecting original multichannel signals, and converting the collected time domain signals into frequency domain signals;
step S2: the method for calculating the covariance matrix of the noise by using the neural network to calculate the noise probability existing on each frequency band comprises the following steps: if the covariance matrix of the noise is phi f The frequency domain signal is Y i,t ThenWherein Y is i,t Representing the frequency domain signal of the ith channel at time t, N representing the number of channels, +.>Is Y i,t The eigenvector calculation method of the covariance matrix of the noise is phi f W f =W f Λ, wherein the eigenvector of the covariance matrix of the noise is W f The covariance matrix of the noise is phi f And a represents a matrix of characteristic values, and the method for calculating the weight of the combined multi-channel is as follows: />Wherein the weight of the merging multi-channel is +.> Is W f Is a conjugate transpose of (2);
step S3: calculating eigenvectors of a covariance matrix of the noise through the covariance matrix of the noise, and calculating weights of the combined multiple channels according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise;
step S4: and outputting a noise reduction result according to the weight of the combined multi-channel and the frequency domain signal.
2. The deep learning-based multichannel audio-video conference noise reduction method according to claim 1, wherein: the method for collecting the original multichannel signals comprises the following steps: the raw multichannel signal is acquired by a microphone array.
3. The method for noise reduction of a deep learning-based multichannel audio-video conference according to claim 1, wherein: the method for converting the acquired time domain signals into frequency domain signals comprises the following steps: the acquired time domain signal is converted into the frequency domain signal by fast fourier transform using a single filter or a plurality of filters.
4. The method for noise reduction of a deep learning-based multichannel audio-video conference according to claim 1, wherein: the method for calculating the noise probability existing on each frequency band by using the neural network comprises the following steps: and inputting the data marked in advance into the neural network, and outputting noise probability existing on each frequency band after calculation of the neural network.
5. The method for noise reduction of a deep learning-based multichannel audio-video conference according to claim 1, wherein: the method for outputting the noise reduction result according to the weight of the combined multi-channel and the frequency domain signal comprises the following steps:
6. a system for noise reduction in a multichannel audio-video conference based on deep learning, comprising:
the acquisition module is used for acquiring original multichannel signals and converting the acquired time domain signals into frequency domain signals;
the first calculating module is used for calculating noise probability existing on each frequency band by using a neural network, and calculating a covariance matrix of the noise by the noise probability, wherein the calculating method of the covariance matrix of the noise comprises the following steps: if the covariance matrix of the noise is phi f The frequency domain signal is Y i,t ThenWherein Y is i,t Representing the frequency domain signal of the ith channel at time t, N representing the number of channels, +.>Is Y i,t The eigenvector calculation method of the covariance matrix of the noise is phi f W f =W f Λ, wherein the eigenvector of the covariance matrix of the noise is W f The covariance matrix of the noise is phi f And a represents a matrix of characteristic values, and the method for calculating the weight of the combined multi-channel is as follows: />Wherein the weight of the merging multi-channel is +.> Is the conjugate transpose of Wf;
the second calculation module is used for calculating the eigenvectors of the covariance matrix of the noise through the covariance matrix of the noise, and calculating the weights of the combined multiple channels according to the covariance matrix of the noise and the eigenvectors of the covariance matrix of the noise;
and the output module is used for outputting a noise reduction result according to the weight of the combined multi-channel and the frequency domain signal.
CN201911378821.8A 2019-12-27 2019-12-27 Method and system for reducing noise of multichannel audio-video conference based on deep learning Active CN111028857B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911378821.8A CN111028857B (en) 2019-12-27 2019-12-27 Method and system for reducing noise of multichannel audio-video conference based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911378821.8A CN111028857B (en) 2019-12-27 2019-12-27 Method and system for reducing noise of multichannel audio-video conference based on deep learning

Publications (2)

Publication Number Publication Date
CN111028857A CN111028857A (en) 2020-04-17
CN111028857B true CN111028857B (en) 2024-01-19

Family

ID=70196500

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911378821.8A Active CN111028857B (en) 2019-12-27 2019-12-27 Method and system for reducing noise of multichannel audio-video conference based on deep learning

Country Status (1)

Country Link
CN (1) CN111028857B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113628633A (en) * 2021-10-14 2021-11-09 辰风策划(深圳)有限公司 Noise reduction method for multi-channel information transmission of enterprise multi-party meeting

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013054258A (en) * 2011-09-06 2013-03-21 Nippon Telegr & Teleph Corp <Ntt> Sound source separation device and method, and program
CN103517185A (en) * 2012-06-26 2014-01-15 鹦鹉股份有限公司 Method for suppressing noise in an acoustic signal for a multi-microphone audio device operating in a noisy environment
CN103811020A (en) * 2014-03-05 2014-05-21 东北大学 Smart voice processing method
CN106653047A (en) * 2016-12-16 2017-05-10 广州视源电子科技股份有限公司 Automatic gain control method and device for audio data
CN108831495A (en) * 2018-06-04 2018-11-16 桂林电子科技大学 A kind of sound enhancement method applied to speech recognition under noise circumstance
CN109994120A (en) * 2017-12-29 2019-07-09 福州瑞芯微电子股份有限公司 Sound enhancement method, system, speaker and storage medium based on diamylose
CN110136737A (en) * 2019-06-18 2019-08-16 北京拙河科技有限公司 A kind of voice de-noising method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9173025B2 (en) * 2012-02-08 2015-10-27 Dolby Laboratories Licensing Corporation Combined suppression of noise, echo, and out-of-location signals

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013054258A (en) * 2011-09-06 2013-03-21 Nippon Telegr & Teleph Corp <Ntt> Sound source separation device and method, and program
CN103517185A (en) * 2012-06-26 2014-01-15 鹦鹉股份有限公司 Method for suppressing noise in an acoustic signal for a multi-microphone audio device operating in a noisy environment
CN103811020A (en) * 2014-03-05 2014-05-21 东北大学 Smart voice processing method
CN106653047A (en) * 2016-12-16 2017-05-10 广州视源电子科技股份有限公司 Automatic gain control method and device for audio data
CN109994120A (en) * 2017-12-29 2019-07-09 福州瑞芯微电子股份有限公司 Sound enhancement method, system, speaker and storage medium based on diamylose
CN108831495A (en) * 2018-06-04 2018-11-16 桂林电子科技大学 A kind of sound enhancement method applied to speech recognition under noise circumstance
CN110136737A (en) * 2019-06-18 2019-08-16 北京拙河科技有限公司 A kind of voice de-noising method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
何希 ; 杨雪梅 ; 徐家品 ; .基于随机矩阵最大特征值分布的频谱感知算法.计算机测量与控制.(第02期),全文. *

Also Published As

Publication number Publication date
CN111028857A (en) 2020-04-17

Similar Documents

Publication Publication Date Title
CN110970053B (en) Multichannel speaker-independent voice separation method based on deep clustering
Grais et al. Raw multi-channel audio source separation using multi-resolution convolutional auto-encoders
WO2015196729A1 (en) Microphone array speech enhancement method and device
JP6482173B2 (en) Acoustic signal processing apparatus and method
JP2007017900A (en) Data-embedding device and method, and data-extracting device and method
US9530429B2 (en) Reverberation suppression apparatus used for auditory device
CN111863015A (en) Audio processing method and device, electronic equipment and readable storage medium
CN111429939A (en) Sound signal separation method of double sound sources and sound pickup
CN106664472A (en) Signal processing apparatus, signal processing method, and computer program
CN110503967B (en) Voice enhancement method, device, medium and equipment
CN116405823B (en) Intelligent audio denoising enhancement method for bone conduction earphone
US20240177726A1 (en) Speech enhancement
CN111028857B (en) Method and system for reducing noise of multichannel audio-video conference based on deep learning
CN107592600B (en) Pickup screening method and pickup device based on distributed microphones
CN113744715A (en) Vocoder speech synthesis method, device, computer equipment and storage medium
CN112908353A (en) Voice enhancement method for hearing aid by combining edge computing and cloud computing
Kates et al. Integrating cognitive and peripheral factors in predicting hearing-aid processing effectiveness
CN114023352B (en) Voice enhancement method and device based on energy spectrum depth modulation
CN110992966B (en) Human voice separation method and system
CN114189781A (en) Noise reduction method and system for double-microphone neural network noise reduction earphone
CN114283832A (en) Processing method and device for multi-channel audio signal
Donley et al. DARE-Net: Speech dereverberation and room impulse response estimation
Muhsina et al. Signal enhancement of source separation techniques
JP2008278406A (en) Sound source separation apparatus, sound source separation program and sound source separation method
JP3787103B2 (en) Speech processing apparatus, speech processing method, speech processing program

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: No. 229, Lingqiao Road, Haishu District, Ningbo, Zhejiang 315000

Applicant after: Suzhou Auditoryworks Co.,Ltd.

Address before: 215000 unit 2-b504, creative industry park, 328 Xinghu street, Suzhou Industrial Park, Jiangsu Province

Applicant before: Suzhou frog sound technology Co.,Ltd.

CB02 Change of applicant information
GR01 Patent grant
GR01 Patent grant