CN108806708A - Speech denoising method based on computational auditory scene analysis and a generative adversarial network model

Speech denoising method based on computational auditory scene analysis and a generative adversarial network model

Info

Publication number
CN108806708A
CN108806708A (application CN201810606145.4A)
Authority
CN
China
Prior art keywords
voice
discriminator
scene analysis
auditory scene
adversarial network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810606145.4A
Other languages
Chinese (zh)
Inventor
陈龙
张小博
张晓灿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 3 Research Institute
Original Assignee
CETC 3 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 3 Research Institute filed Critical CETC 3 Research Institute
Priority to CN201810606145.4A priority Critical patent/CN108806708A/en
Publication of CN108806708A publication Critical patent/CN108806708A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 - Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 - Noise filtering
    • G10L21/0272 - Voice signal separating
    • G10L21/0308 - Voice signal separating characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)

Abstract

The present invention relates to a speech denoising method based on computational auditory scene analysis and a generative adversarial network model, comprising: Step 1, processing noisy speech with the generator and discriminator of a generative adversarial network to obtain an intermediate result; Step 2, processing the intermediate result with a computational auditory scene analysis method to obtain a final result. The present invention can remove part of the noise from speech signals acquired in complex-channel background environments while keeping the speech component largely undistorted.

Description

Speech denoising method based on computational auditory scene analysis and a generative adversarial network model
Technical field
The present invention relates to speech denoising methods, and in particular to a speech denoising method based on computational auditory scene analysis and a generative adversarial network model.
Background art
Speech is the most important means by which humans exchange information; a segment of speech carries rich information such as the speaker's intention, identity, and emotion. Speech signals can propagate through various media such as air, water, and radio. During propagation, or because of limitations of the acquisition equipment, speech signals are usually disturbed by various kinds of noise. In certain professions, external noise is unavoidable, and in many cases the noise is complex in type and high in intensity. Such noise severely affects subsequent speech signal processing; for example, it reduces the accuracy of speech recognition. In addition, if such noisy speech data are processed manually, prolonged work can damage the auditory system of the operator.
Summary of the invention
The purpose of the present invention is to provide a speech denoising method based on computational auditory scene analysis and a generative adversarial network model, so as to remove part of the noise from speech signals acquired in complex-channel background environments while keeping the speech component undistorted.
The present invention provides a speech denoising method based on computational auditory scene analysis and a generative adversarial network model, comprising:
Step 1, processing noisy speech with the generator and discriminator of a generative adversarial network to obtain an intermediate result;
Step 2, processing the intermediate result with a computational auditory scene analysis method to obtain a final result.
Further, in step 1, the training process of the generative adversarial network includes:
1) inputting noisy data and clean data into the discriminator, requiring the discriminator to judge the pair as different, and adjusting the network parameters of the discriminator by back-propagation;
2) feeding the noisy data into the generator for noise reduction to obtain an output, then inputting the output together with the noisy data into the discriminator, requiring the discriminator to judge the pair as identical, and adjusting the network parameters of the discriminator by back-propagation;
3) fixing the discriminator parameters obtained in step 2) and adjusting the network parameters of the generator by back-propagation, the goal being that the generator output is judged as different.
Further, step 2 includes:
taking the intermediate result as the input of computational auditory scene analysis, performing mask estimation on the input signal, and re-synthesizing the intermediate result according to the estimation result to obtain the denoised speech data.
Compared with the prior art, the beneficial effects of the present invention are:
part of the noise in speech signals acquired in complex-channel background environments can be removed, while the speech component is kept largely undistorted.
Description of the drawings
Fig. 1 is a flow chart of the speech denoising method based on computational auditory scene analysis and a generative adversarial network model according to the present invention;
Fig. 2 is the network structure of the generator;
Fig. 3 is the training process of the generative adversarial network.
Detailed description of the embodiments
The present invention is described in detail below with reference to the embodiments shown in the accompanying drawings. It should be noted, however, that these embodiments do not limit the present invention; any equivalent transformation or substitution in function, method, or structure made by those of ordinary skill in the art according to these embodiments falls within the protection scope of the present invention.
This embodiment provides a speech denoising method based on computational auditory scene analysis (CASA) and a generative adversarial network (GAN) model, comprising:
Step 1, processing noisy speech with the generator and the discriminator of a generative adversarial network to obtain an intermediate result;
Step 2, processing the intermediate result with a computational auditory scene analysis method to obtain a final result.
The speech denoising method based on computational auditory scene analysis and a generative adversarial network model provided in this embodiment can remove part of the noise from speech signals acquired in complex-channel background environments while keeping the speech component largely undistorted.
In step 1, the training process of the generative adversarial network includes:
1) inputting noisy data and clean data into the discriminator, requiring the discriminator to judge the pair as different, and adjusting the network parameters of the discriminator by back-propagation;
2) feeding the noisy data into the generator for noise reduction to obtain an output, then inputting the output together with the noisy data into the discriminator, requiring the discriminator to judge the pair as identical, and adjusting the network parameters of the discriminator by back-propagation;
3) fixing the discriminator parameters obtained in step 2) and adjusting the network parameters of the generator by back-propagation, the goal being that the generator output is judged as different.
In this embodiment, step 2 includes:
taking the intermediate result as the input of computational auditory scene analysis, performing mask estimation on the input signal, and re-synthesizing the intermediate result according to the estimation result to obtain the denoised speech data.
The invention is described in further detail below.
This embodiment performs speech denoising based on computational auditory scene analysis and a generative adversarial network. CASA takes the ideal binary mask (IBM) or the ideal ratio mask (IRM) as its computational target, turning the speech denoising problem into a parameter-estimation and binary-classification problem. A GAN consists of a generator and a discriminator; its training process simulates a two-player zero-sum game in which the parameters of the generator and the discriminator are optimized alternately, the training objective being to learn an effective mapping model between the real data and the training data. As shown in Fig. 1, y(n) denotes the noisy speech of length n sampled at rate f_s; after GAN processing the intermediate output is ŷ(n), and after CASA processing the final result is x(n). In this embodiment the sampling rate f_s of all speech data is unified to 16 kHz. The GAN-based denoising part and the CASA-based denoising part are described in detail below.
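To make the two-stage flow of Fig. 1 concrete, the following is a minimal sketch. The callables gan_generator and casa_postprocess are placeholders standing in for the two stages described below; they are not names from the original disclosure.

```python
import numpy as np

FS = 16000  # all speech data are unified to a 16 kHz sampling rate in this embodiment

def denoise(y: np.ndarray, gan_generator, casa_postprocess) -> np.ndarray:
    """Two-stage denoising: GAN generator first, CASA mask estimation and re-synthesis second.

    y                -- noisy waveform y(n), float32, sampled at 16 kHz
    gan_generator    -- callable mapping a noisy waveform to a denoised estimate ŷ(n)
    casa_postprocess -- callable performing mask estimation and re-synthesis, returning x(n)
    """
    y = y.astype(np.float32)
    y_hat = gan_generator(y)          # step 1: intermediate result from the GAN generator
    x = casa_postprocess(y_hat, FS)   # step 2: CASA post-processing of the intermediate result
    return x
```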
1. GAN-based noise reduction
The essence of a GAN is a zero-sum game between the generator and the discriminator. By adjusting the parameters in this continuous game, the network gradually learns the mapping relationship between specific data, and after training this mapping can be applied to completely new data. For the speech denoising problem in complex-channel background environments, what the GAN needs to learn is the mapping relationship between y(n) and ŷ(n).
This method adopts the GAN structure proposed by Pascual et al., in which the generator G serves as the final denoising network, i.e., it completes the mapping from y(n) to ŷ(n); the discriminator D is used only in the training stage for adversarial training of G and can be removed completely in the test stage.
The network structure of G is shown in Fig. 2. It is similar to an auto-encoder and consists of two parts: the encoder in the lower half of Fig. 2 and the decoder in the upper half. This composition gives the network an end-to-end character: the input and the output of the network are speech signals of the same length, which avoids a complicated feature-extraction procedure. The encoder and the decoder have the same network structure, but the layers are arranged in reverse order, so the two are symmetric. All layers are fully convolutional, so the network contains no dense layers; this makes the network focus on the temporally local correlations between the input signal and the representations at all levels, and further reduces the number of training parameters.
G consists of 22 one-dimensional strided convolutional layers, each with a filter width of 31 and a stride of 2. The number of filters increases layer by layer while the temporal width decreases. The dimensions (samples × feature maps) from the input through the successive encoder layers are 16384 × 1, 8192 × 16, 4096 × 32, 2048 × 32, 1024 × 64, 512 × 64, 256 × 128, 128 × 128, 64 × 256, 32 × 256, 16 × 512, and 8 × 1024. The decoder network has the same filter widths and filter numbers as the encoder network. Besides the connections between adjacent layers, G also has skip (jump) connections between each encoder layer and the corresponding decoder layer, which bypass the compression bottleneck in the middle of the model. In this way low-level details are preserved and the speech waveform can be reconstructed more accurately. The skip connections pass finely processed speech information directly to the decoding stage and, to some extent, alleviate the vanishing-gradient problem, so that gradients can propagate deeper into the network model during back-propagation.
The network structure of the discriminator D is similar to the encoder part of G: it is a one-dimensional convolutional structure with the same topology as a convolutional classification network. The differences are: 1) the input of D has two channels, each with 16384 sampling points; 2) virtual batch normalization is applied before the LeakyReLU activations, with α = 0.3; 3) after the last activation layer there is a one-dimensional convolutional layer with a filter width of 1, which does not down-sample the hidden activations. In this way, the number of parameters of the fully connected layer is reduced from 8 × 1024 = 8192 to 8, and the way the 1024 channels are merged can be learned through the convolution parameters.
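The following PyTorch sketch illustrates this kind of fully convolutional encoder-decoder generator with skip connections and a two-channel convolutional discriminator (filter width 31, stride 2, channel progression 16 to 1024 on a 16384-sample input). It is a simplified illustration under assumptions: virtual batch normalization and the latent noise input of SEGAN are omitted, and the padding, PReLU/tanh activations, and skip-concatenation details are choices made here, not taken from the original disclosure.

```python
import torch
import torch.nn as nn

ENC_CHANNELS = [1, 16, 32, 32, 64, 64, 128, 128, 256, 256, 512, 1024]  # 11 strided conv layers

class Generator(nn.Module):
    """Fully convolutional encoder-decoder with skip connections (SEGAN-style sketch)."""
    def __init__(self):
        super().__init__()
        self.enc = nn.ModuleList(
            nn.Conv1d(ENC_CHANNELS[i], ENC_CHANNELS[i + 1], kernel_size=31, stride=2, padding=15)
            for i in range(len(ENC_CHANNELS) - 1)
        )
        # Decoder mirrors the encoder; its inputs are doubled by the skip concatenation,
        # except the first (bottleneck) layer, which only sees the latent code.
        dec_channels = list(reversed(ENC_CHANNELS))
        self.dec = nn.ModuleList()
        for i in range(len(dec_channels) - 1):
            in_ch = dec_channels[i] if i == 0 else dec_channels[i] * 2
            self.dec.append(
                nn.ConvTranspose1d(in_ch, dec_channels[i + 1], kernel_size=31,
                                   stride=2, padding=15, output_padding=1)
            )
        self.act = nn.PReLU()

    def forward(self, y):                      # y: (batch, 1, 16384) noisy waveform
        skips, h = [], y
        for layer in self.enc:
            h = self.act(layer(h))
            skips.append(h)
        skips = skips[:-1]                     # the bottleneck output is not reused as a skip
        for i, layer in enumerate(self.dec):
            if i > 0:                          # jump connection to the mirrored encoder feature
                h = torch.cat([h, skips[-i]], dim=1)
            h = layer(h)
            h = torch.tanh(h) if i == len(self.dec) - 1 else self.act(h)
        return h                               # (batch, 1, 16384) denoised estimate

class Discriminator(nn.Module):
    """Convolutional classifier with a two-channel input (candidate clean + noisy)."""
    def __init__(self):
        super().__init__()
        chans = [2, 16, 32, 32, 64, 64, 128, 128, 256, 256, 512, 1024]
        layers = []
        for i in range(len(chans) - 1):
            layers += [nn.Conv1d(chans[i], chans[i + 1], 31, stride=2, padding=15),
                       nn.LeakyReLU(0.3)]
        layers += [nn.Conv1d(1024, 1, kernel_size=1)]  # width-1 conv instead of a wide dense layer
        self.net = nn.Sequential(*layers, nn.Flatten(), nn.Linear(8, 1))

    def forward(self, pair):                   # pair: (batch, 2, 16384)
        return self.net(pair)                  # (batch, 1) decision score
```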
The training process of the network is shown in Fig. 3, where y denotes the noisy training data, x denotes the corresponding noise-free data, and ŷ denotes the data processed by G. The training speech data used by this method are the trainset part of the database published by Valentini et al., which contains 11572 clean utterances from 28 speakers. The method simulates speech data under complex-channel background conditions by adding Gaussian noise to these clean speech data; to reflect the complexity of real noise, noise is added at several different signal-to-noise ratios, as listed in Table 1. As the table shows, data at the highest (40 dB) and lowest (20 dB) SNRs account for a smaller share, while data at an SNR of 30 dB account for the largest share; this design better simulates the noise conditions of real scenes.
Table 1 Noise added to the training data
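As an illustration of how such noisy training data can be generated, the following minimal sketch adds white Gaussian noise to a clean waveform at a requested SNR; the function name and the NumPy-based implementation are assumptions for illustration, not part of the original disclosure.

```python
import numpy as np

def add_noise_at_snr(x, snr_db, rng=None):
    """Mix white Gaussian noise into a clean waveform x at the requested SNR (in dB)."""
    rng = rng if rng is not None else np.random.default_rng()
    p_signal = np.mean(x ** 2)                      # average signal power
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))  # noise power that gives the target SNR
    noise = rng.normal(0.0, np.sqrt(p_noise), size=x.shape)
    return x + noise

# e.g. add_noise_at_snr(clean, 30.0) simulates the 30 dB SNR condition of Table 1
```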
The training data are divided into batches of 100; for each batch the learning rate is set to 0.0002. The training process includes the following three steps, sketched in code further below:
1) Input the noisy data y and the clean data x into D, require D to judge the pair as different (label 1), and adjust the network parameters of D by back-propagation;
2) Feed the noisy data y into G for noise reduction to obtain the output ŷ, then input ŷ together with y into D, require D to judge the pair as identical (label 0), and adjust the network parameters of D by back-propagation;
3) Fix the network parameters of D obtained in the previous step and adjust the network parameters of G by back-propagation, the goal being to make D judge the pair as different (label 1).
One traversal of all training samples constitutes one epoch. After 86 epochs the training process is terminated; the network parameters of G are then fixed, and G is used as the final denoising network.
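A minimal sketch of this three-step adversarial loop, assuming the Generator and Discriminator modules sketched above and a data loader yielding (noisy, clean) waveform batches. The binary cross-entropy losses, the Adam optimizer, and the channel ordering inside each pair are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

def train_gan(G, D, loader, epochs=86, lr=2e-4, device="cpu"):
    """Three-step adversarial training: label 1 = 'different' (clean/noisy pair),
    label 0 = 'identical' (enhanced/noisy pair)."""
    G, D = G.to(device), D.to(device)
    bce = nn.BCEWithLogitsLoss()
    opt_g = optim.Adam(G.parameters(), lr=lr)
    opt_d = optim.Adam(D.parameters(), lr=lr)
    for _ in range(epochs):                           # one pass over all samples = one epoch
        for y, x in loader:                           # y: noisy, x: clean, shape (B, 1, 16384)
            y, x = y.to(device), x.to(device)
            ones = torch.ones(y.size(0), 1, device=device)
            zeros = torch.zeros(y.size(0), 1, device=device)

            # Step 1: (clean, noisy) pair -> D should output "different" (label 1)
            opt_d.zero_grad()
            loss_real = bce(D(torch.cat([x, y], dim=1)), ones)
            loss_real.backward()

            # Step 2: (enhanced, noisy) pair -> D should output "identical" (label 0)
            y_hat = G(y).detach()
            loss_fake = bce(D(torch.cat([y_hat, y], dim=1)), zeros)
            loss_fake.backward()
            opt_d.step()

            # Step 3: fix D, update G so that D judges its output as "different" (label 1)
            opt_g.zero_grad()
            loss_g = bce(D(torch.cat([G(y), y], dim=1)), ones)
            loss_g.backward()
            opt_g.step()
    return G                                          # after training only G is kept for denoising
```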
2. CASA-based noise reduction
After y(n) has been processed by G, the result ŷ(n) serves as the input of CASA. As shown in Fig. 1, mask estimation is first performed on the input signal, and ŷ(n) is then re-synthesized according to the estimation result, finally yielding the denoised speech data x(n).
Assume that ŷ(n) is composed of a pure speech signal s(n) and a noise signal l(n), i.e.
ŷ(n) = s(n) + l(n) (1)
Its time-frequency representation Y ∈ R^(m×n) can be decomposed into a sparse speech term S and a low-rank noise term L, i.e.
Y = S + L (2)
The above formula can be solved by the RPCA method:
min ||L||* + λ||S||1 s.t. Y = S + L (3)
where ||·||* denotes the nuclear norm, ||·||1 the l1 norm, and λ a weighting parameter.
Considering the physical meaning of the spectrogram, the two components after decomposition should be non-negative, so:
min ||L||* + λ||S||1 s.t. Y = S + L, S ≥ 0, L ≥ 0 (4)
However, the conditions of the above model are too strict, so a dense error term E is introduced:
Y = S + L + E (5)
By introducing auxiliary variables L+ and S+, formula (4) can be rewritten as formula (6), whose augmented Lagrangian is formula (7).
In formula (7), Ω_Y, Ω_S, and Ω_L are extended binary variables, and ρ is a scale parameter.
The objective function in formula (7) is separable, so it can be solved with the ADMM algorithm. Under the ADMM framework, all variables in formula (7) can be updated alternately by solving the corresponding subproblems. Under the constraints of the two auxiliary variables and the three binary variables, L_ρ can be minimized by gradient descent.
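The following sketch implements only the basic RPCA decomposition of formula (3) via ADMM, without the non-negativity constraints, dense error term, and auxiliary variables of formulas (4) to (7); the default values of λ and ρ and the stopping rule are assumptions.

```python
import numpy as np

def rpca_admm(Y, lam=None, rho=1.0, n_iter=200, tol=1e-6):
    """Decompose a magnitude spectrogram Y into a low-rank part L (noise) and a sparse
    part S (speech) by solving  min ||L||_* + lam * ||S||_1  s.t.  Y = L + S  (formula (3)).
    Basic ADMM sketch; the refinements of formulas (4)-(7) are omitted here."""
    m, n = Y.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))      # common default weight for the sparse term
    L = np.zeros_like(Y)
    S = np.zeros_like(Y)
    Z = np.zeros_like(Y)                    # scaled dual variable
    soft = lambda X, t: np.sign(X) * np.maximum(np.abs(X) - t, 0.0)
    for _ in range(n_iter):
        # L-update: singular value thresholding of (Y - S + Z)
        U, sig, Vt = np.linalg.svd(Y - S + Z, full_matrices=False)
        L = (U * soft(sig, 1.0 / rho)) @ Vt
        # S-update: elementwise soft thresholding
        S = soft(Y - L + Z, lam / rho)
        # dual update on the residual of the constraint Y = L + S
        R = Y - L - S
        Z = Z + R
        if np.linalg.norm(R) <= tol * np.linalg.norm(Y):
            break
    return L, S
```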
After the gammatone filterbank transform, the input signal ŷ(n) can be decomposed as described above into the sparse, low-rank, and dense parts. The IBM and IRM mask estimates can then be obtained from these components as in formula (8).
In this way, the noisy speech signal ŷ(n) can be re-synthesized by mask-weighted summation over the spectrum, realizing the separation of speech and noise and thereby achieving the goal of noise reduction.
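To illustrate the masking stage, the sketch below estimates an IBM or an IRM from the sparse (speech) and low-rank (noise) components and applies the mask to the mixture spectrogram. It reuses the rpca_admm helper sketched above; the use of an STFT instead of a gammatone filterbank and the exact mask definitions (energy-ratio IRM, magnitude-comparison IBM) are assumptions, since formula (8) is not reproduced in this text.

```python
import numpy as np
from scipy.signal import stft, istft

def casa_masking(y_hat, fs=16000, use_irm=True):
    """Mask-based re-synthesis of the GAN output y_hat(n)."""
    f, t, Yc = stft(y_hat, fs=fs, nperseg=512)      # complex time-frequency representation
    Y = np.abs(Yc)                                  # magnitude spectrogram
    L, S = rpca_admm(Y)                             # low-rank (noise) and sparse (speech) parts
    S, L = np.maximum(S, 0.0), np.maximum(L, 0.0)   # keep the physically meaningful non-negative parts
    if use_irm:                                     # soft mask (ideal ratio mask)
        mask = S ** 2 / (S ** 2 + L ** 2 + 1e-12)
    else:                                           # hard mask (ideal binary mask)
        mask = (S > L).astype(Y.dtype)
    _, x = istft(mask * Yc, fs=fs, nperseg=512)     # mask-weighted re-synthesis with the mixture phase
    return x
```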
The present invention is directed to the denoising of speech data acquired in complex-channel background environments: it can reduce the noise in speech data collected under complex noise conditions while keeping the speech component largely undistorted. The method can serve as a pre-processing stage for techniques such as manual speech recognition in interception environments, automatic speech recognition, voiceprint recognition, spoken keyword detection, and speech emotion analysis, reducing noise interference and improving recognition or detection accuracy. It can be applied to military fields such as information acquisition and analysis, as well as to civil fields such as big-data analysis.
The detailed descriptions listed above are only specific illustrations of feasible embodiments of the present invention and are not intended to limit its protection scope; any equivalent implementation or modification that does not depart from the technical spirit of the present invention shall fall within its protection scope.
It is obvious to those skilled in the art that the invention is not limited to the details of the above exemplary embodiments and that the present invention can be realized in other specific forms without departing from its spirit or essential characteristics. Therefore, from whichever point of view, the embodiments should be regarded as exemplary and non-restrictive. The scope of the present invention is defined by the appended claims rather than by the above description, and all changes falling within the meaning and range of equivalency of the claims are intended to be embraced by the present invention.

Claims (3)

1. A speech denoising method based on computational auditory scene analysis and a generative adversarial network model, characterized by comprising:
Step 1, processing noisy speech with the generator and discriminator of a generative adversarial network to obtain an intermediate result;
Step 2, processing the intermediate result with a computational auditory scene analysis method to obtain a final result.
2. The speech denoising method based on computational auditory scene analysis and a generative adversarial network model according to claim 1, characterized in that, in step 1, the training process of the generative adversarial network comprises:
1) inputting noisy data and clean data into the discriminator, requiring the discriminator to judge the pair as different, and adjusting the network parameters of the discriminator by back-propagation;
2) feeding the noisy data into the generator for noise reduction to obtain an output, then inputting the output together with the noisy data into the discriminator, requiring the discriminator to judge the pair as identical, and adjusting the network parameters of the discriminator by back-propagation;
3) fixing the discriminator parameters obtained in step 2) and adjusting the network parameters of the generator by back-propagation, the goal being that the generator output is judged as different.
3. The speech denoising method based on computational auditory scene analysis and a generative adversarial network model according to claim 2, characterized in that step 2 comprises:
taking the intermediate result as the input of computational auditory scene analysis, performing mask estimation on the input signal, and re-synthesizing the intermediate result according to the estimation result to obtain denoised speech data.
CN201810606145.4A 2018-06-13 2018-06-13 Voice de-noising method based on Computational auditory scene analysis and generation confrontation network model Pending CN108806708A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810606145.4A CN108806708A (en) 2018-06-13 2018-06-13 Voice de-noising method based on Computational auditory scene analysis and generation confrontation network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810606145.4A CN108806708A (en) 2018-06-13 2018-06-13 Voice de-noising method based on Computational auditory scene analysis and generation confrontation network model

Publications (1)

Publication Number Publication Date
CN108806708A true CN108806708A (en) 2018-11-13

Family

ID=64085675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810606145.4A Pending CN108806708A (en) 2018-06-13 2018-06-13 Voice de-noising method based on Computational auditory scene analysis and generation confrontation network model

Country Status (1)

Country Link
CN (1) CN108806708A (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9215527B1 (en) * 2009-12-14 2015-12-15 Cirrus Logic, Inc. Multi-band integrated speech separating microphone array processor with adaptive beamforming
US20120010881A1 (en) * 2010-07-12 2012-01-12 Carlos Avendano Monaural Noise Suppression Based on Computational Auditory Scene Analysis
CN102890930A (en) * 2011-07-19 2013-01-23 上海上大海润信息系统有限公司 Speech emotion recognizing method based on hidden Markov model (HMM) / self-organizing feature map neural network (SOFMNN) hybrid model
CN104064196A (en) * 2014-06-20 2014-09-24 哈尔滨工业大学深圳研究生院 Method for improving speech recognition accuracy on basis of voice leading end noise elimination
CN104538043A (en) * 2015-01-16 2015-04-22 北京邮电大学 Real-time emotion reminder for call
CN107452389A (en) * 2017-07-20 2017-12-08 大象声科(深圳)科技有限公司 A kind of general monophonic real-time noise-reducing method
CN107452405A (en) * 2017-08-16 2017-12-08 北京易真学思教育科技有限公司 A kind of method and device that data evaluation is carried out according to voice content
CN107563428A (en) * 2017-08-25 2018-01-09 西安电子科技大学 Classification of Polarimetric SAR Image method based on generation confrontation network
CN107945811A (en) * 2017-10-23 2018-04-20 北京大学 A kind of production towards bandspreading resists network training method and audio coding, coding/decoding method
CN107845389A (en) * 2017-12-21 2018-03-27 北京工业大学 A kind of sound enhancement method based on multiresolution sense of hearing cepstrum coefficient and depth convolutional neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PASCUAL, S. et al.: "SEGAN: Speech Enhancement Generative Adversarial Network", arXiv preprint, 30 June 2017 (2017-06-30), pages 1-5 *
陈龙 等 (CHEN, Long et al.): "面向无线电侦听的语音降噪方法" (Speech noise reduction method for radio monitoring), 《电声技术》 (Audio Engineering), vol. 42, no. 4, 30 April 2018 (2018-04-30), pages 25-30 *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111383651A (en) * 2018-12-29 2020-07-07 Tcl集团股份有限公司 Voice noise reduction method and device and terminal equipment
CN110310650A (en) * 2019-04-08 2019-10-08 清华大学 A kind of voice enhancement algorithm based on second-order differential microphone array
CN110363751B (en) * 2019-07-01 2021-08-03 浙江大学 Large intestine endoscope polyp detection method based on generation cooperative network
CN110363751A (en) * 2019-07-01 2019-10-22 浙江大学 A kind of big enteroscope polyp detection method based on generation collaborative network
CN110503976A (en) * 2019-08-15 2019-11-26 广州华多网络科技有限公司 Audio separation method, device, electronic equipment and storage medium
CN110503976B (en) * 2019-08-15 2021-11-23 广州方硅信息技术有限公司 Audio separation method and device, electronic equipment and storage medium
CN110751958A (en) * 2019-09-25 2020-02-04 电子科技大学 Noise reduction method based on RCED network
CN110751960A (en) * 2019-10-16 2020-02-04 北京网众共创科技有限公司 Method and device for determining noise data
CN110751960B (en) * 2019-10-16 2022-04-26 北京网众共创科技有限公司 Method and device for determining noise data
CN112133293A (en) * 2019-11-04 2020-12-25 重庆邮电大学 Phrase voice sample compensation method based on generation countermeasure network and storage medium
CN111583954A (en) * 2020-05-12 2020-08-25 中国人民解放军国防科技大学 Speaker independent single-channel voice separation method
CN111933187B (en) * 2020-09-21 2021-02-05 深圳追一科技有限公司 Emotion recognition model training method and device, computer equipment and storage medium
CN111933187A (en) * 2020-09-21 2020-11-13 深圳追一科技有限公司 Emotion recognition model training method and device, computer equipment and storage medium
CN112259068B (en) * 2020-10-21 2023-04-11 上海协格空调工程有限公司 Active noise reduction air conditioning system and noise reduction control method thereof
CN112259068A (en) * 2020-10-21 2021-01-22 上海协格空调工程有限公司 Active noise reduction air conditioning system and noise reduction control method thereof
CN112487914A (en) * 2020-11-25 2021-03-12 山东省人工智能研究院 ECG noise reduction method based on deep convolution generation countermeasure network
CN112487914B (en) * 2020-11-25 2021-08-31 山东省人工智能研究院 ECG noise reduction method based on deep convolution generation countermeasure network
CN112466320A (en) * 2020-12-12 2021-03-09 中国人民解放军战略支援部队信息工程大学 Underwater acoustic signal noise reduction method based on generation countermeasure network
CN112466320B (en) * 2020-12-12 2023-11-10 中国人民解放军战略支援部队信息工程大学 Underwater sound signal noise reduction method based on generation countermeasure network
CN113096673A (en) * 2021-03-30 2021-07-09 山东省计算中心(国家超级计算济南中心) Voice processing method and system based on generation countermeasure network
CN113096673B (en) * 2021-03-30 2022-09-30 山东省计算中心(国家超级计算济南中心) Voice processing method and system based on generation countermeasure network
CN113160844A (en) * 2021-04-27 2021-07-23 山东省计算中心(国家超级计算济南中心) Speech enhancement method and system based on noise background classification
CN113409377A (en) * 2021-06-23 2021-09-17 四川大学 Phase unwrapping method for generating countermeasure network based on jump connection
CN113409377B (en) * 2021-06-23 2022-09-27 四川大学 Phase unwrapping method for generating countermeasure network based on jump connection
CN115392325A (en) * 2022-10-26 2022-11-25 中国人民解放军国防科技大学 Multi-feature noise reduction modulation identification method based on cycleGan
CN115392325B (en) * 2022-10-26 2023-08-18 中国人民解放军国防科技大学 Multi-feature noise reduction modulation identification method based on CycleGan

Similar Documents

Publication Publication Date Title
CN108806708A (en) Voice de-noising method based on Computational auditory scene analysis and generation confrontation network model
CN110619885B (en) Method for generating confrontation network voice enhancement based on deep complete convolution neural network
Braun et al. A curriculum learning method for improved noise robustness in automatic speech recognition
CN109036465B (en) Speech emotion recognition method
DE112015004785B4 (en) Method for converting a noisy signal into an enhanced audio signal
CN109524020B (en) Speech enhancement processing method
CN108172238A (en) A kind of voice enhancement algorithm based on multiple convolutional neural networks in speech recognition system
CN105047194B (en) A kind of self study sound spectrograph feature extracting method for speech emotion recognition
CN107845389A (en) A kind of sound enhancement method based on multiresolution sense of hearing cepstrum coefficient and depth convolutional neural networks
Shah et al. Time-frequency mask-based speech enhancement using convolutional generative adversarial network
CN106653056A (en) Fundamental frequency extraction model based on LSTM recurrent neural network and training method thereof
CN109559736A (en) A kind of film performer's automatic dubbing method based on confrontation network
CN112802491B (en) Voice enhancement method for generating confrontation network based on time-frequency domain
Hui et al. Convolutional maxout neural networks for speech separation
CN109890043A (en) A kind of wireless signal noise-reduction method based on production confrontation network
CN107967920A (en) A kind of improved own coding neutral net voice enhancement algorithm
CN111429947A (en) Speech emotion recognition method based on multi-stage residual convolutional neural network
CN114428234A (en) Radar high-resolution range profile noise reduction identification method based on GAN and self-attention
CN110102051A (en) The plug-in detection method and device of game
CN107516065A (en) The sophisticated signal denoising method of empirical mode decomposition combination dictionary learning
CN106204482A (en) Based on the mixed noise minimizing technology that weighting is sparse
CN114863938B (en) Method and system for identifying bird language based on attention residual error and feature fusion
Zöhrer et al. Representation learning for single-channel source separation and bandwidth extension
CN114283829A (en) Voice enhancement method based on dynamic gate control convolution cyclic network
Nair et al. Mfcc based noise reduction in asr using kalman filtering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20181113