CN103559888A - Speech enhancement method based on non-negative low-rank and sparse matrix decomposition principle - Google Patents

Speech enhancement method based on non-negative low-rank and sparse matrix decomposition principle

Info

Publication number
CN103559888A
Authority
CN
China
Prior art keywords
matrix
rank
noisy speech
sparse matrix
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310548773.9A
Other languages
Chinese (zh)
Other versions
CN103559888B (en)
Inventor
孙成立
须明
王希敏
谢坚筱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
KEY LABORATORY OF SCIENCE AND TECHNOLOGY ON AVIONICS INTEGRATION TECHNOLOGIES
Original Assignee
KEY LABORATORY OF SCIENCE AND TECHNOLOGY ON AVIONICS INTEGRATION TECHNOLOGIES
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by KEY LABORATORY OF SCIENCE AND TECHNOLOGY ON AVIONICS INTEGRATION TECHNOLOGIES
Priority to CN201310548773.9A
Publication of CN103559888A
Application granted
Publication of CN103559888B
Legal status: Expired - Fee Related

Abstract

The invention discloses a speech enhancement method based on the non-negative low-rank and sparse matrix decomposition principle. The method comprises: a first step of smoothing, framing and applying the discrete Fourier transform to the noisy speech signal to obtain the noisy speech spectrum; a second step of taking the noisy speech magnitude spectrum of each frame as a column vector, arranging the columns in chronological order to form a noisy speech time-frequency matrix, and applying non-negative low-rank and sparse matrix decomposition to this matrix to obtain a non-negative low-rank matrix and a sparse matrix; and a third step of reconstructing the enhanced speech spectrum from the sparse matrix and the phase of the noisy speech and obtaining the enhanced speech in the time domain by the inverse Fourier transform. The method adapts well to different noise types, requires no endpoint detection or model training, has few parameters that are easy to tune, and performs well in strong-noise environments, and therefore has good application prospects.

Description

Speech enhancement method based on the non-negative low-rank and sparse matrix decomposition principle
Technical field
The present invention relates to the field of signal processing and is applicable to noise suppression for noisy speech; in particular, it relates to a speech enhancement method based on the non-negative low-rank and sparse matrix decomposition principle.
Background
Speech is the most natural and effective means by which humans exchange information. As society enters the information age, advanced speech processing technology is urgently needed to make human society more intelligent. As early as 2000, Bill Gates suggested that "the next ten years will be the era of speech". In recent years, as companies such as Apple, Google and Microsoft have successively launched intelligent speech services, the intelligent speech industry has become an emerging sector of information technology, and user awareness and market size have grown steadily. In particular, the voice assistant on Apple's recent smartphones and the speech "cloud" released by iFlytek have opened up broader applications for intelligent speech technology. However, speech communication and its applications are inevitably subject to interference from the surrounding environment, the communication medium and noise inside the communication equipment, which seriously hampers the practical use of intelligent speech technology.
Speech enhancement is an effective technique for dealing with noise contamination. It suppresses the interference of noise so that the distortion between the enhanced speech signal and the original clean speech signal is minimized. Over the past few decades many speech enhancement algorithms have emerged; typical ones include spectral subtraction, minimum mean-square error estimation of the spectral amplitude, subspace methods and wavelet denoising. In environments with a relatively high signal-to-noise ratio (SNR), speech enhancement has been solved effectively. However, because of the diversity of noise in real environments and the complexity of speech signals themselves, speech enhancement algorithms differ from one application environment to another, which makes research difficult, and speech enhancement under strong noise and in multi-noise environments has not yet been solved satisfactorily.
Many existing speech enhancement algorithms try to remove as much noise as possible by using probability density function models of the speech and noise signals. Research in recent years, however, shows that no single distribution fits all speech or noise, so more flexible mathematical models and model estimation algorithms are needed to adapt to the characteristics of the signal itself. In addition, noise estimation is an indispensable preliminary step in existing speech enhancement algorithms: it yields the noise power spectrum and the a priori SNR of the speech signal, which are critical to the enhancement result. Existing methods use voice endpoint detection to split the captured signal into noise-only segments and noisy-speech segments, and use the noise-only segments to estimate and update the noise estimator. This is a suboptimal approach. In practice the instantaneous noise in the noise-only segments does not fully match that in the noisy-speech segments, so the estimate always contains errors. Moreover, voice endpoint detection is still immature at low SNR and in non-stationary noise and is prone to misjudgment, which can leave substantial residual noise in the speech.
Recent research on compressed sensing theory shows that many real-world observations can be modeled as the sum of a low-rank component and a sparse component, and that low-rank and sparse matrix decomposition can recover the original data from data contaminated by large noise or outliers. Low-rank and sparse matrix decomposition has been applied in many scientific and technical fields, such as image enhancement, video object detection and data mining.
Stationary random noise and periodic noise are the two most common types of noise. A stationary random process is described by its first- and second-order statistics, and its mean and autocorrelation function do not change with time. Because the Fourier transform of the autocorrelation function of a random signal is its power spectrum, the time-frequency matrix of stationary random noise is a low-rank matrix of rank 1. Likewise, if the noise is periodic, its time-frequency matrix is nonzero only at certain fixed frequencies, so its column vectors are strongly correlated and the matrix is necessarily low-rank as well.
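The low-rank argument can be checked numerically on idealized matrices. The following is a minimal sketch, assuming a made-up 5-bin spectral profile and 8 frames; the values and variable names are illustrative only and do not come from the patent.

```python
import numpy as np

# Hypothetical stationary-noise magnitude profile over 5 frequency bins.
profile = np.array([0.9, 1.4, 2.0, 1.1, 0.6])

# Idealized stationary noise: every frame (column) has the same spectrum.
stationary = np.outer(profile, np.ones(8))        # 5 bins x 8 frames
rank_stationary = np.linalg.matrix_rank(stationary)

# Idealized periodic noise: energy only at two fixed frequency bins,
# each with its own frame-dependent gain; all other rows are zero.
periodic = np.zeros((5, 8))
periodic[1, :] = np.linspace(1.0, 2.0, 8)
periodic[3, :] = np.linspace(2.0, 1.0, 8)
rank_periodic = np.linalg.matrix_rank(periodic)

print(rank_stationary, rank_periodic)
```

Real noise only approximates these idealized cases, so its time-frequency matrix is approximately rather than exactly low-rank.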
In summary, the column vectors of the time-frequency matrix of background noise are strongly correlated, so the noise time-frequency matrix is low-rank. In contrast, most time-frequency values of a speech source signal are zero or close to zero, and only a small number of sample points have large values, so the speech signal is sparse to some degree and is well described by a sparse matrix. It is therefore natural to apply low-rank and sparse matrix decomposition theory to the speech enhancement problem. A Chinese patent (publication number CN102915742A) discloses a single-channel unsupervised speech and noise separation method based on low-rank and sparse matrix decomposition. That method first uses the short-time Fourier transform to convert the noisy speech waveform to the time-frequency domain and obtain the amplitude spectrum of the noisy speech; it then uses a low-rank and sparse matrix decomposition algorithm to decompose this amplitude spectrum into the sum of a noise amplitude spectrum, a speech amplitude spectrum and a residual noise amplitude spectrum; finally, it reconstructs the time-domain speech waveform from the speech amplitude spectrum by the inverse short-time Fourier transform. The shortcoming of that method is that no non-negativity constraint is imposed on the decomposition, so the speech amplitude spectrum separated from the noisy speech amplitude spectrum can contain negative values. An actual amplitude spectrum is a non-negative physical quantity and should never be negative. Negative amplitude values not only cause decomposition errors but also produce musical noise that is unpleasant to the ear, degrading the perceived speech quality.
The present invention provides a speech enhancement method based on the non-negative low-rank and sparse matrix decomposition principle. The method decomposes the noisy speech amplitude spectrum by non-negative low-rank and sparse matrix decomposition, so that the resulting speech amplitude spectrum is non-negative, which effectively improves the decomposition. The method is robust, requires no endpoint detection, and has few parameters that are easy to tune, and it is suitable for speech enhancement in strong-noise environments.
Summary of the invention
The technical problem addressed by the invention is to provide a speech enhancement method based on the non-negative low-rank and sparse matrix decomposition principle, which separates speech from noise in noisy speech by performing low-rank and sparse matrix decomposition in the time-frequency domain under low-rank and sparse constraints on the noise and speech together with a non-negativity constraint.
The invention adopts the following technical solution: a speech enhancement method based on the non-negative low-rank and sparse matrix decomposition principle, which separates the speech signal from noisy speech by non-negative low-rank and sparse matrix decomposition, with the following steps:
(1) Pre-process the discrete noisy speech signal; pre-processing comprises signal smoothing and framing;
(2) Apply the discrete Fourier transform to the framed noisy speech signal to obtain the noisy speech spectrum;
(3) In the frequency domain, take the spectral amplitude of each speech frame as a column vector and arrange the columns in chronological order, so that a number of speech frames form the noisy speech time-frequency matrix;
(4) Decompose the noisy speech time-frequency matrix with the non-negative low-rank and sparse matrix decomposition algorithm to obtain a non-negative low-rank matrix and a sparse matrix. The decomposition is expressed as:
$Y = L + S + E$, subject to $\mathrm{rank}(L) \le r$, $\|S\|_0 \le h$, $L \ge 0$, $S \ge 0$;
where Y is the noisy speech time-frequency matrix; L is the low-rank matrix, corresponding to the amplitude spectrum of the noise; S is the sparse matrix, corresponding to the amplitude spectrum of the speech; $\|S\|_0$ is the number of nonzero elements of S; $\mathrm{rank}(L)$ is the rank of L; E is the residual matrix; and r and h are the upper bounds of the low-rank and sparsity constraints;
(5) Reconstruct the enhanced speech spectrum from the sparse matrix S and the phase spectrum of the noisy speech, then obtain the enhanced speech in the time domain by the inverse Fourier transform.
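Steps (2) and (3) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `magnitude_phase_matrix` is hypothetical, the input is assumed to be already framed and windowed, and `np.fft.rfft` is used so that only the non-negative-frequency bins of each real-valued frame are kept, which is an assumption made here.

```python
import numpy as np

def magnitude_phase_matrix(frames):
    """frames: (n_frames, frame_len) windowed frames.

    Returns (magnitude, phase), each of shape (n_bins, n_frames),
    so that each column holds the spectrum of one frame in time order.
    """
    spectrum = np.fft.rfft(frames, axis=1)   # one DFT per frame
    spectrum = spectrum.T                     # columns = frames
    return np.abs(spectrum), np.angle(spectrum)

# Random data standing in for 11 windowed frames of 200 samples.
rng = np.random.default_rng(0)
frames = rng.standard_normal((11, 200))
Y, phase = magnitude_phase_matrix(frames)
```

The magnitude matrix `Y` is what the decomposition in step (4) operates on, while `phase` is set aside for the reconstruction in step (5).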
In step (1), the discrete noisy speech signal is pre-processed as follows:
(1) Smooth the signal with a P-point nearest-neighbor average, in order to smooth the amplitude waveform of the noisy speech;
(2) Frame the noisy speech signal; the window function is a Hamming window, the window length is 200 points, and the inter-frame shift (overlap) is 80 points.
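The pre-processing can be sketched as below. Several details are assumptions rather than statements from the patent: the P-point average is taken as a moving average centred on each sample (P = 3, as in the embodiment), the 80-point figure is interpreted as the hop between successive frames, and the function names `smooth` and `frame_signal` are hypothetical.

```python
import numpy as np

def smooth(y, p=3):
    """P-point moving average centred on each sample (zero-padded at the edges)."""
    return np.convolve(y, np.ones(p) / p, mode="same")

def frame_signal(y, frame_len=200, hop=80):
    """Split y into overlapping frames and apply a Hamming window to each."""
    n_frames = 1 + (len(y) - frame_len) // hop
    # Row i of idx holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return y[idx] * np.hamming(frame_len)

rng = np.random.default_rng(0)
y = rng.standard_normal(1000)          # stand-in for a noisy speech signal
frames = frame_signal(smooth(y))       # shape: (n_frames, frame_len)
```

If the 80 points are instead meant as the overlap between frames, the hop would be 120 samples and only the `hop` argument would change.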
The low-rank matrix L and the sparse matrix S are computed as follows:
(1) Initialization: $Y_0 = Y$; $L_0 = S_0 = [0]_{N \times K}$;
iteration counter $i = 1$; maximum number of iterations $i_{\max} = 10^3$; relative error threshold $\delta = 10^{-3}$;
(2) Update the low-rank matrix with NMF: $(W, H) = \mathrm{NMF}(Y_{i-1})$, $L_i = WH$; $W \in \mathbb{R}^{N \times r}$, $H \in \mathbb{R}^{r \times K}$;
where NMF denotes non-negative matrix factorization, W and H are the rank-r NMF factors, and the NMF cost function is the Itakura-Saito divergence;
(3) Update the sparse matrix with the soft-thresholding operator: $S_i = (Y_{i-1} - L_i + S_{i-1} > \lambda) \odot (Y_{i-1} - L_i + S_{i-1} - \lambda)$;
where $\odot$ denotes the element-wise product of matrices and $\lambda$ is a threshold constant that depends on the noise level; the recommended value is $\lambda = \sigma$, where $\sigma$ is the standard deviation of the noise;
(4) Update the sum matrix: $Y_i = L_i + S_i$;
(5) If i reaches the maximum number of iterations ($i = i_{\max}$) or the relative error criterion with threshold $\delta$ is met, stop iterating and output $L_i$ and $S_i$ as the estimates of L and S; otherwise set $i = i + 1$, return to step (2) and continue iterating.
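A single pass through steps (2) to (4) can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the Itakura-Saito NMF uses the multiplicative updates commonly applied for that divergence, and the function names, the small constant `eps`, the inner iteration count and the random initialization are all choices made here. The outer loop and the stopping rule of step (5) are left out.

```python
import numpy as np

def nmf_is(V, r, n_iter=200, eps=1e-12, seed=0):
    """Rank-r NMF of V under the Itakura-Saito divergence (multiplicative updates)."""
    rng = np.random.default_rng(seed)
    V = V + eps                                   # IS divergence needs V > 0
    W = rng.random((V.shape[0], r)) + eps
    H = rng.random((r, V.shape[1])) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH**2)) / (W.T @ (1.0 / WH) + eps)
        WH = W @ H + eps
        W *= ((V / WH**2) @ H.T) / ((1.0 / WH) @ H.T + eps)
    return W, H

def soft_threshold(T, lam):
    """(T > lam) * (T - lam): keep the excess over lam, zero elsewhere."""
    return (T > lam) * (T - lam)

def nlsmd_pass(Y_prev, S_prev, r, lam):
    """One pass of steps (2)-(4): returns L_i, S_i and Y_i = L_i + S_i."""
    W, H = nmf_is(Y_prev, r)
    L = W @ H                                     # non-negative by construction
    S = soft_threshold(Y_prev - L + S_prev, lam)  # non-negative by construction
    return L, S, L + S

rng = np.random.default_rng(1)
Y = rng.random((20, 15))                          # stand-in for a magnitude matrix
L1, S1, Y1 = nlsmd_pass(Y, np.zeros_like(Y), r=2, lam=0.1)
```

Both outputs are non-negative without any explicit clipping: L is a product of non-negative factors, and the soft-thresholding mask keeps only entries that exceed the threshold.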
In step (5), the enhanced speech spectrum is reconstructed from the sparse matrix and the phase spectrum of the noisy speech:
$\hat{X}(n,k) = S(n,k)\, e^{j \angle Y(n,k)}$
where $\angle Y(n,k)$ is the spectral phase of the noisy speech, S is the sparse matrix, $S(n,k)$ is the spectral amplitude value in the sparse matrix, $\hat{X}(n,k)$ is the reconstructed enhanced speech spectrum, n is the time frame index, and k is the frequency index.
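Combining the estimated magnitude with the noisy phase is a one-line operation. The sketch below uses a hypothetical function name `enhanced_spectrum` and small made-up arrays purely for illustration.

```python
import numpy as np

def enhanced_spectrum(S, noisy_phase):
    """Complex spectrum with magnitude S and the phase of the noisy speech."""
    return S * np.exp(1j * noisy_phase)

# Made-up 2x2 example: magnitudes from the sparse matrix, phases in radians.
S = np.array([[0.0, 2.0],
              [1.0, 0.5]])
noisy_phase = np.array([[0.3, -1.2],
                        [2.0, 0.0]])
X_hat = enhanced_spectrum(S, noisy_phase)
```

Entries where S is zero stay exactly zero in the complex spectrum, so the sparsity of S carries over to the enhanced spectrum.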
The speech enhancement method provided by the invention uses non-negative low-rank and sparse matrix decomposition, so that all elements of the resulting low-rank and sparse matrices are non-negative. The method requires no endpoint detection or model training, is robust, has few parameters that are easy to tune, and is particularly suitable for speech enhancement in strong-noise environments.
Brief description of the drawings
Fig. 1 is a block diagram of the speech enhancement system of the invention.
Detailed description
The invention is further described below with reference to the accompanying drawing. Referring to Fig. 1, the speech enhancement method based on the non-negative low-rank and sparse matrix decomposition principle comprises the following steps:
1) Pre-process 101 the noisy speech signal y(t). The pre-processing stage 101 comprises signal smoothing and framing, which prepare the signal for subsequent processing. Signal smoothing computes the current value of the noisy speech signal as the P-point nearest-neighbor average of y(t), in order to smooth the amplitude waveform of the noisy speech signal. In the present invention P = 3:
[Equation: P-point nearest-neighbor average of y(t), P = 3]
The window function used for framing is a Hamming window, the window length is 200 points, and the inter-frame shift (overlap) is 80 points;
2) Apply the DFT (discrete Fourier transform) 102 to the framed noisy speech signal to obtain the signal spectrum, which comprises the amplitude spectrum 104, $|Y(n,k)|$, and the phase spectrum 103, $\angle Y(n,k)$, where n is the frame index, n = 1, 2, ..., N; k is the frequency index, k = 1, 2, ..., K; N is the total number of time frames; and K is the number of Fourier transform points;
3) In the frequency domain, arrange the amplitude spectrum 104 of each speech frame in order as a column vector, so that the speech frames form an N × K noisy time-frequency matrix Y;
4) Apply NLSMD (non-negative low-rank and sparse matrix decomposition) 105 to the noisy time-frequency matrix Y to compute the non-negative low-rank matrix L and the non-negative sparse matrix S:
$Y = L + S + E$, subject to $\mathrm{rank}(L) \le r$, $\|S\|_0 \le h$, $L \ge 0$, $S \ge 0$
Here L corresponds to the amplitude spectrum of the noise; S corresponds to the amplitude spectrum of the speech; $\|S\|_0$ is the number of nonzero elements of S; E is the residual matrix; $\mathrm{rank}(L)$ is the rank of L; and r and h are the upper bounds of the low-rank and sparsity constraints. Comparative tests show that r = 1 to 3 gives good noise reduction.
NLSMD (non-negative low-rank and sparse matrix decomposition) 105 is computed as follows:
1. Initialization: $Y_0 = Y$; $L_0 = S_0 = [0]_{N \times K}$; iteration counter $i = 1$; maximum number of iterations $i_{\max} = 10^3$; relative error threshold $\delta = 10^{-3}$;
2. Update the low-rank matrix with non-negative matrix factorization: $(W, H) = \mathrm{NMF}(Y_{i-1})$, $L_i = WH$; $W \in \mathbb{R}^{N \times r}$, $H \in \mathbb{R}^{r \times K}$;
where $L_i$ is the estimate of L at the i-th iteration, NMF denotes non-negative matrix factorization, and W and H are the rank-r NMF factors. Because W and H are non-negative, $L_i$ is necessarily a non-negative matrix as well. The cost function of the NMF algorithm can be the Euclidean distance, the Kullback-Leibler divergence or the Itakura-Saito divergence. Comparative tests show that the Itakura-Saito divergence gives the best results, so the invention uses NMF based on the Itakura-Saito divergence to compute L.
3. Update the sparse matrix $S_i$ with the soft-thresholding operator: $S_i = (Y_{i-1} - L_i + S_{i-1} > \lambda) \odot (Y_{i-1} - L_i + S_{i-1} - \lambda)$;
where $\odot$ denotes the element-wise product of matrices and $\lambda$ is the threshold, whose value depends on the noise intensity; the recommended value is $\lambda = \sigma$, where $\sigma$ is the noise standard deviation.
4. Update the sum matrix: $Y_i = L_i + S_i$;
5. If i reaches the maximum number of iterations ($i = i_{\max}$) or the relative error criterion with threshold $\delta$ is met, stop iterating and output $L_i$ and $S_i$ as the estimates of L and S; otherwise set $i = i + 1$, return to step 2 and continue iterating;
5) Reconstruct the enhanced speech spectrum from the sparse matrix S and the noisy speech phase spectrum. Because the human ear is insensitive to the phase of sound, the phase $\angle Y(n,k)$ of the noisy speech spectrum can be used in place of the phase of the enhanced speech, giving the complex spectrum of the enhanced speech: $\hat{X}(n,k) = S(n,k)\, e^{j \angle Y(n,k)}$;
6) Unfold the complex spectrum matrix of the enhanced speech into a vector and apply the IDFT (inverse discrete Fourier transform) 106 to obtain the discrete-time representation of the enhanced speech:
[Equation: IDFT of the vectorized enhanced speech spectrum]
where vec(·) denotes the operation that concatenates the column vectors of the matrix into a one-dimensional vector in time-frame order.
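The return to the time domain can be sketched as below. Note the swapped-in technique: the text describes concatenating frames with vec, whereas this sketch uses overlap-add with window-sum normalization, a standard way to recombine overlapping Hamming-windowed frames. That choice, the function name `istft_overlap_add`, the use of `np.fft.irfft` and the 80-sample hop are assumptions made here, not details confirmed by the patent.

```python
import numpy as np

def istft_overlap_add(X, frame_len=200, hop=80):
    """X: (n_bins, n_frames) complex spectrum, columns = frames.

    Returns the time-domain signal obtained by an inverse DFT per frame,
    overlap-add, and division by the summed analysis window.
    """
    frames = np.fft.irfft(X.T, n=frame_len, axis=1)    # inverse DFT per frame
    window = np.hamming(frame_len)
    n_frames = frames.shape[0]
    out = np.zeros(frame_len + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + frame_len)
        out[seg] += frames[i]                          # overlap-add
        norm[seg] += window                            # summed Hamming windows
    return out / norm                                  # Hamming window is never zero

# Round trip on a constant signal: three windowed frames of ones.
X = np.fft.rfft(np.hamming(200) * np.ones((3, 200)), axis=1).T
x = istft_overlap_add(X)
```

With an unmodified spectrum this recovers the framed portion of the input signal; with the enhanced spectrum it yields the enhanced speech waveform.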

Claims (4)

1. A speech enhancement method based on the non-negative low-rank and sparse matrix decomposition principle, characterized in that the speech signal is separated from noisy speech by non-negative low-rank and sparse matrix decomposition, with the following steps:
(1) Pre-process the discrete noisy speech signal; pre-processing comprises signal smoothing and framing;
(2) Apply the discrete Fourier transform to the framed noisy speech signal to obtain the noisy speech spectrum;
(3) In the frequency domain, take the spectral amplitude of each speech frame as a column vector and arrange the columns in chronological order, so that a number of speech frames form the noisy speech time-frequency matrix;
(4) Decompose the noisy speech time-frequency matrix with the non-negative low-rank and sparse matrix decomposition algorithm to obtain a non-negative low-rank matrix and a sparse matrix. The decomposition is expressed as:
$Y = L + S + E$, subject to $\mathrm{rank}(L) \le r$, $\|S\|_0 \le h$, $L \ge 0$, $S \ge 0$;
where Y is the noisy speech time-frequency matrix; L is the low-rank matrix, corresponding to the amplitude spectrum of the noise; S is the sparse matrix, corresponding to the amplitude spectrum of the speech; $\|S\|_0$ is the number of nonzero elements of S; $\mathrm{rank}(L)$ is the rank of L; E is the residual matrix; and r and h are the upper bounds of the low-rank and sparsity constraints;
(5) Reconstruct the enhanced speech spectrum from the sparse matrix S and the phase spectrum of the noisy speech, then obtain the enhanced speech in the time domain by the inverse Fourier transform.
2. The speech enhancement method based on the non-negative low-rank and sparse matrix decomposition principle according to claim 1, characterized in that, in step (1), the discrete noisy speech signal is pre-processed as follows:
(1) Smooth the signal with a P-point nearest-neighbor average, in order to smooth the amplitude waveform of the noisy speech;
(2) Frame the noisy speech signal; the window function is a Hamming window, the window length is 200 points, and the inter-frame shift (overlap) is 80 points.
3. The non-negative low-rank and sparse matrix decomposition algorithm according to claim 1, characterized in that the low-rank matrix L and the sparse matrix S are computed as follows:
(1) Initialization: $Y_0 = Y$; $L_0 = S_0 = [0]_{N \times K}$;
iteration counter $i = 1$; maximum number of iterations $i_{\max} = 10^3$; relative error threshold $\delta = 10^{-3}$;
(2) Update the low-rank matrix with NMF: $(W, H) = \mathrm{NMF}(Y_{i-1})$, $L_i = WH$; $W \in \mathbb{R}^{N \times r}$, $H \in \mathbb{R}^{r \times K}$;
where NMF denotes non-negative matrix factorization, W and H are the rank-r NMF factors, and the NMF cost function is the Itakura-Saito divergence;
(3) Update the sparse matrix with the soft-thresholding operator: $S_i = (Y_{i-1} - L_i + S_{i-1} > \lambda) \odot (Y_{i-1} - L_i + S_{i-1} - \lambda)$;
where $\odot$ denotes the element-wise product of matrices and $\lambda$ is a threshold constant that depends on the noise level; the recommended value is $\lambda = \sigma$, where $\sigma$ is the standard deviation of the noise;
(4) Update the sum matrix: $Y_i = L_i + S_i$;
(5) If i reaches the maximum number of iterations ($i = i_{\max}$) or the relative error criterion with threshold $\delta$ is met, stop iterating and output $L_i$ and $S_i$ as the estimates of L and S; otherwise set $i = i + 1$, return to step (2) and continue iterating.
4. The speech enhancement method based on the non-negative low-rank and sparse matrix decomposition principle according to claim 1, characterized in that, in step (5), the enhanced speech spectrum is reconstructed from the sparse matrix and the phase spectrum of the noisy speech:
$\hat{X}(n,k) = |S(n,k)|\, e^{j \angle Y(n,k)}$
where $\angle Y(n,k)$ is the spectral phase of the noisy speech, S is the sparse matrix, $|S(n,k)|$ is the spectral amplitude value in the sparse matrix, $\hat{X}(n,k)$ is the reconstructed enhanced speech spectrum, n is the time frame index, and k is the frequency index.
CN201310548773.9A 2013-11-07 2013-11-07 Speech enhancement method based on non-negative low-rank and sparse matrix decomposition principle Expired - Fee Related CN103559888B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310548773.9A CN103559888B (en) 2013-11-07 2013-11-07 Speech enhancement method based on non-negative low-rank and sparse matrix decomposition principle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310548773.9A CN103559888B (en) 2013-11-07 2013-11-07 Speech enhancement method based on non-negative low-rank and sparse matrix decomposition principle

Publications (2)

Publication Number Publication Date
CN103559888A true CN103559888A (en) 2014-02-05
CN103559888B CN103559888B (en) 2016-10-05

Family

ID=50014118

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310548773.9A Expired - Fee Related CN103559888B (en) 2013-11-07 2013-11-07 Speech enhancement method based on non-negative low-rank and sparse matrix decomposition principle

Country Status (1)

Country Link
CN (1) CN103559888B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101441872A (en) * 2007-11-19 2009-05-27 三菱电机株式会社 Denoising acoustic signals using constrained non-negative matrix factorization
US20100254539A1 (en) * 2009-04-07 2010-10-07 Samsung Electronics Co., Ltd. Apparatus and method for extracting target sound from mixed source sound
US20110061516A1 (en) * 2009-09-14 2011-03-17 Electronics And Telecommunications Research Institute Method and system for separating musical sound source without using sound source database
CN102855884A (en) * 2012-09-11 2013-01-02 中国人民解放军理工大学 Speech time scale modification method based on short-term continuous nonnegative matrix decomposition
CN102915742A (en) * 2012-10-30 2013-02-06 中国人民解放军理工大学 Single-channel monitor-free voice and noise separating method based on low-rank and sparse matrix decomposition

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104021797A (en) * 2014-06-19 2014-09-03 南昌大学 Voice signal enhancement method based on frequency domain sparse constraint
CN106796803B (en) * 2014-10-14 2023-09-19 交互数字麦迪逊专利控股公司 Method and apparatus for separating speech data from background data in audio communication
CN104751855A (en) * 2014-11-25 2015-07-01 北京理工大学 Speech enhancement method in music background based on non-negative matrix factorization
CN104505100A (en) * 2015-01-06 2015-04-08 中国人民解放军理工大学 Non-supervision speech enhancement method based robust non-negative matrix decomposition and data fusion
CN104505100B (en) * 2015-01-06 2017-12-12 中国人民解放军理工大学 A kind of unsupervised sound enhancement method based on robust Non-negative Matrix Factorization and data fusion
US10276179B2 (en) 2017-03-06 2019-04-30 Microsoft Technology Licensing, Llc Speech enhancement with low-order non-negative matrix factorization
US10528147B2 (en) 2017-03-06 2020-01-07 Microsoft Technology Licensing, Llc Ultrasonic based gesture recognition
CN108573711A (en) * 2017-03-09 2018-09-25 中国科学院声学研究所 A kind of single microphone speech separating method based on NMF algorithms
US10984315B2 (en) 2017-04-28 2021-04-20 Microsoft Technology Licensing, Llc Learning-based noise reduction in data produced by a network of sensors, such as one incorporated into loose-fitting clothing worn by a person
CN107528648A (en) * 2017-10-23 2017-12-29 北京邮电大学 A kind of blind frequency spectrum sensing method and device based on low-rank sparse matrix decomposition
CN108899045A (en) * 2018-06-29 2018-11-27 中国航空无线电电子研究所 Subspace sound enhancement method based on constraint low-rank and sparse decomposition
CN108986834A (en) * 2018-08-22 2018-12-11 中国人民解放军陆军工程大学 The blind Enhancement Method of bone conduction voice based on codec framework and recurrent neural network
CN109036452A (en) * 2018-09-05 2018-12-18 北京邮电大学 A kind of voice information processing method, device, electronic equipment and storage medium
CN109215671B (en) * 2018-11-08 2022-12-02 西安电子科技大学 Voice enhancement system and method based on MFrSRRPCA algorithm
CN109215671A (en) * 2018-11-08 2019-01-15 西安电子科技大学 Speech-enhancement system and method based on MFrSRRPCA algorithm
WO2020113575A1 (en) * 2018-12-07 2020-06-11 广东省智能制造研究所 Sound classification method, device and medium based on semi-nonnegative matrix factorization with constraint
CN109658944B (en) * 2018-12-14 2020-08-07 中国电子科技集团公司第三研究所 Helicopter acoustic signal enhancement method and device
CN109658944A (en) * 2018-12-14 2019-04-19 中国电子科技集团公司第三研究所 Helicopter acoustic signal enhancement method and device
CN111863014A (en) * 2019-04-26 2020-10-30 北京嘀嘀无限科技发展有限公司 Audio processing method and device, electronic equipment and readable storage medium
WO2021159772A1 (en) * 2020-02-10 2021-08-19 腾讯科技(深圳)有限公司 Speech enhancement method and apparatus, electronic device, and computer readable storage medium
CN111402909A (en) * 2020-03-02 2020-07-10 东华大学 Speech enhancement method based on constant frequency domain transformation
CN111402909B (en) * 2020-03-02 2023-07-07 东华大学 Speech enhancement method based on constant frequency domain transformation
CN113129872A (en) * 2021-04-06 2021-07-16 新疆大学 Voice enhancement method based on deep compressed sensing
CN113129872B (en) * 2021-04-06 2023-03-14 新疆大学 Voice enhancement method based on deep compressed sensing

Also Published As

Publication number Publication date
CN103559888B (en) 2016-10-05

Similar Documents

Publication Publication Date Title
CN103559888B (en) Speech enhancement method based on non-negative low-rank and sparse matrix decomposition principle
CN106486131B (en) Method and device for speech denoising
Sigg et al. Speech enhancement using generative dictionary learning
CN100543842C (en) Method for background noise suppression based on multiple statistical models and minimum mean-square error
CN102915742B (en) Single-channel unsupervised speech and noise separation method based on low-rank and sparse matrix decomposition
Yang et al. Under-determined convolutive blind source separation combining density-based clustering and sparse reconstruction in time-frequency domain
Talmon et al. Single-channel transient interference suppression with diffusion maps
CN105190751B (en) Keyboard input detection and suppression
KR101305373B1 (en) Interested audio source cancellation method and voice recognition method thereof
CN111508518B (en) Single-channel speech enhancement method based on joint dictionary learning and sparse representation
Mavaddaty et al. A novel speech enhancement method by learnable sparse and low-rank decomposition and domain adaptation
WO2013138747A1 (en) System and method for anomaly detection and extraction
CN105489226A (en) Wiener filtering speech enhancement method for multi-taper spectrum estimation of pickup
KR20130057668A (en) Voice recognition apparatus based on cepstrum feature vector and method thereof
González et al. MMSE-based missing-feature reconstruction with temporal modeling for robust speech recognition
Jiang et al. An improved unsupervised single-channel speech separation algorithm for processing speech sensor signals
JP5726790B2 (en) Sound source separation device, sound source separation method, and program
Bavkar et al. PCA based single channel speech enhancement method for highly noisy environment
CN102509268B (en) Immune-clonal-selection-based nonsubsampled contourlet domain image denoising method
KR101568282B1 (en) Mask estimation method and apparatus in cluster based missing feature reconstruction
KR20170087211A (en) Feature compensation system and method for recognizing voice
Christensen et al. Robust subspace-based fundamental frequency estimation
Badiezadegan et al. A wavelet-based data imputation approach to spectrogram reconstruction for robust speech recognition
Ben et al. Chirp signal denoising based on convolution neural network
Wu et al. Time-Domain Mapping with Convolution Networks for End-to-End Monaural Speech Separation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20161005

Termination date: 20171107