CN109671447A - Two-channel underdetermined convolutive mixture blind signal separation method - Google Patents


Info

Publication number: CN109671447A
Authority: CN (China)
Prior art keywords: signals, signal, mixture, frequency domain, source
Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201811434791.3A
Other languages: Chinese (zh)
Inventors: 解元, 谢胜利, 谢侃, 吴宗泽
Current Assignee: Guangdong University of Technology (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Guangdong University of Technology
Priority date: 2018-11-28 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2018-11-28
Publication date: 2019-04-23

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L2021/02087 Noise filtering the noise being separate speech, e.g. cocktail party

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The present invention relates to a two-channel underdetermined convolutive mixture blind signal separation method, comprising the following steps: S1: acquire speech signals and music signals and synthesize two-channel underdetermined convolutive mixtures; S2: model the underdetermined convolutive mixture mathematically to obtain the mathematical expression of the underdetermined convolutive mixing model; S3: apply the Fourier transform to the observed signals to obtain the frequency-domain mixture x(f, n), and estimate the mixing matrix in the frequency domain; S4: use the estimated mixing matrix to separate the source signals in the frequency domain; S5: apply the inverse Fourier transform to the separated frequency-domain source signals to obtain the estimated source signals in the time domain. The invention uses parallel factor (PARAFAC) decomposition to estimate the mixing channel matrix, resolves the scaling and permutation ambiguities with the minimal distortion principle and K-means clustering, and then separates the source signals by Wiener filtering. Compared with other algorithms, the invention achieves better separation.

Description

Two-channel underdetermined convolutive mixture blind signal separation method
Technical field
The present invention relates to the technical field of blind signal processing, and in particular to a blind separation method for two-channel underdetermined convolutive mixtures.
Background art
Blind source separation (BSS) originates from the cocktail party problem: in an environment where several people speak at the same time, how can the voice of each speaker be separated, by means of machine learning, from the mixed sound signals received by the microphones? This is an extremely challenging problem in the field of signal processing.
Underdetermined convolutive blind separation is a more complicated case, in which the number of source signals exceeds the number of microphones, so the available information is limited and separation becomes very difficult. Moreover, in real environments the received signals are usually delayed in time, which leads to the more complicated convolutive mixture. To solve the blind separation problem for underdetermined convolutive mixtures, the currently popular approach is the time-frequency domain method: the time-domain mixture is transformed to the frequency domain by the short-time Fourier transform, and the source signals are reconstructed at each frequency bin. Methods proposed so far include the full-rank algorithm (Duong N Q K, Vincent E. Under-determined reverberant audio source separation using a full-rank spatial covariance model [M]. IEEE Press, 2010.), the EM NMF and MU NMF algorithms (Ozerov A, Févotte C. Multichannel Nonnegative Matrix Factorization in Convolutive Mixtures for Audio Source Separation [J]. IEEE Transactions on Audio Speech & Language Processing, 2010, 18(3): 550-563.), the GEM-MU NTF algorithm (Ozerov A, Févotte C, Blouet R, et al. Multichannel nonnegative tensor factorization with structured constraints for user-guided audio source separation [C] // IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2011: 257-260.), the GEM-MU NMF algorithm (Al-Tmeme A, Woo W L, Dlay S S, et al. Underdetermined Convolutive Source Separation using GEM-MU with Variational Approximated Optimum Model Order NMF2D [J]. IEEE/ACM Transactions on Audio Speech & Language Processing, 2017, PP(99): 1-1.), the weighted interleaved ICA algorithm (Nesta F, Omologo M. Convolutive Underdetermined Source Separation through Weighted Interleaved ICA and Spatio-temporal Source Correlation [C] // Latent Variable Analysis and Signal Separation, International Conference, LVA/ICA 2012, Tel Aviv, Israel, March 12-15, 2012. Proceedings. DBLP, 2012: 222-230.), the bin-wise clustering algorithm (Sawada H, Araki S, Makino S. Underdetermined Convolutive Blind Source Separation via Frequency Bin-Wise Clustering and Permutation Alignment [J]. IEEE Transactions on Audio Speech & Language Processing, 2010, 19(3): 516-527.), and the Bayes-risk minimization algorithm (Cho J, Chang D Y. Underdetermined convolutive BSS: Bayes risk minimization based on a mixture of super-Gaussian posterior approximation [J]. IEEE/ACM Transactions on Audio Speech & Language Processing, 2015, 23(5): 828-839.). However, separating the sources in the frequency domain is prone to scaling ambiguity and permutation (ordering) problems, which degrade the final separation result. The present invention focuses on improving source separation performance to obtain better separation results.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art and to provide a two-channel underdetermined convolutive mixture blind signal separation method whose separation performance is better than that of other algorithms.
To achieve the above object, the technical solution provided by the present invention is as follows:
The method estimates the source signals in two stages: first the mixing matrix is estimated, and then the source signals are separated. The specific steps are as follows:
S1: acquire speech signals and music signals, and synthesize two-channel underdetermined convolutive mixtures;
S2: model the underdetermined convolutive mixture mathematically to obtain the mathematical expression of the underdetermined convolutive mixing model;
S3: apply the Fourier transform to the observed signals to obtain the frequency-domain mixture x(f, n), and estimate the mixing matrix in the frequency domain;
S4: use the estimated mixing matrix to separate the source signals in the frequency domain;
S5: apply the inverse Fourier transform to the separated frequency-domain source signals to obtain the estimated source signals in the time domain.
Further, step S2 models the underdetermined convolutive mixture as follows:
Assume that n source signals s(t) = [s_1(t), ..., s_n(t)]^T are received by m microphones, producing the mixed signal x(t) = [x_1(t), ..., x_m(t)]^T.
In the mixing model, A ∈ R^(m×n) denotes the unknown mixing channel matrix, * denotes convolution, τ denotes the time delay, and n(t) = [n_1(t), ..., n_m(t)]^T ∈ R^m denotes Gaussian noise.
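As an illustration of the mixing model in step S2, the following Python sketch simulates a two-channel mixture of three sources. It assumes the standard convolutive form x_i(t) = Σ_j Σ_τ a_ij(τ) s_j(t − τ) + n_i(t), which is consistent with the symbols defined above but is not taken from the patent text; the function name `convolutive_mix`, the random 8-tap filters and the noise level are hypothetical.

```python
import numpy as np

def convolutive_mix(sources, filters, noise_std=0.0, rng=None):
    """Mix n sources into m channels with FIR filters plus Gaussian noise.

    sources: array (n, T); filters: array (m, n, L) of impulse responses.
    Implements x_i(t) = sum_j sum_tau a_ij(tau) s_j(t - tau) + n_i(t).
    """
    rng = np.random.default_rng() if rng is None else rng
    m, n, _ = filters.shape
    T = sources.shape[1]
    x = np.zeros((m, T))
    for i in range(m):
        for j in range(n):
            # Full convolution, truncated to the original signal length.
            x[i] += np.convolve(sources[j], filters[i, j])[:T]
    return x + noise_std * rng.standard_normal((m, T))

rng = np.random.default_rng(0)
s = rng.standard_normal((3, 1000))    # n = 3 hypothetical sources
A = rng.standard_normal((2, 3, 8))    # m = 2 microphones, 8-tap filters
x = convolutive_mix(s, A, noise_std=0.01, rng=rng)
```

With m = 2 and n = 3 the system has fewer observations than unknowns, which is what makes the problem underdetermined.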
Further, in step S3 the mixing matrix is estimated as follows:
The Fourier transform is applied to the mixed signal x(t) to obtain the frequency-domain mixture x(f, n), and the estimate of the mixing matrix is then updated iteratively by the CP tensor decomposition method.
In the update, R_x(f, n) = E[x(f, n) x^T(f, n)] is the autocorrelation matrix, ⊙ denotes the Khatri-Rao product, and A_f^* is the complex conjugate of A_f.
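The quantities used in step S3 can be sketched as follows, assuming the STFT coefficients x(f, n) are already available as a NumPy array and that the sources are mutually uncorrelated, so that each local autocorrelation matrix factors as R_x = A_f diag(σ) A_f^H. Under that assumption the vectorized R_x is a Khatri-Rao product applied to the source powers, which is the structure a CP decomposition exploits. The Hermitian transpose, the block averaging and the helper names are assumptions of this sketch, not the patent's update rule.

```python
import numpy as np

def local_covariances(X, block=10):
    """Estimate R_x(f, k) by averaging x x^H over blocks of frames.

    X: complex STFT array of shape (m, F, N); returns shape (F, K, m, m).
    """
    m, F, N = X.shape
    K = N // block
    Xb = X[:, :, :K * block].reshape(m, F, K, block)
    return np.einsum('afkt,bfkt->fkab', Xb, Xb.conj()) / block

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I, R) and B (J, R) -> (I*J, R)."""
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

# Hypothetical check at one frequency bin: m = 2 microphones, n = 3 sources.
rng = np.random.default_rng(0)
A_f = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
sigma = np.array([1.0, 0.5, 2.0])        # assumed source powers
R_x = (A_f * sigma) @ A_f.conj().T       # A_f diag(sigma) A_f^H
# The row-major vectorization of R_x equals (A_f ⊙ conj(A_f)) sigma.
vec_R = khatri_rao(A_f, A_f.conj()) @ sigma
```

Stacking `vec_R` for several time blocks gives a matrix whose columns share the same Khatri-Rao factor, so the mixing matrix can be fitted by alternating updates over the factors.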
Further, step S4 uses the mixing matrix estimated in step S3. First, the scaling and permutation ambiguities are resolved in the frequency domain using the minimal distortion principle and K-means clustering. Then the source signals are separated by Wiener filtering.
In the Wiener filter, R_x^(-1)(f, n) is the inverse of R_x(f, n).
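A minimal sketch of the Wiener filtering in step S4 at a single time-frequency point is given below. It assumes the common multichannel form ŝ = diag(σ) A_f^H R_x^(-1) x, where σ holds the source powers. This form, the function name `wiener_separate` and the toy values are assumptions; the patent's own formula is not reproduced here.

```python
import numpy as np

def wiener_separate(x, A, sigma, R_x):
    """Estimate n sources from an m-channel observation at one (f, n) point.

    x: (m,) observation, A: (m, n) mixing matrix, sigma: (n,) source powers,
    R_x: (m, m) autocorrelation matrix of the observation.
    """
    # solve(R_x, x) applies the inverse of R_x without forming it explicitly.
    return sigma * (A.conj().T @ np.linalg.solve(R_x, x))

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
sigma = np.array([1.0, 0.2, 0.05])                  # assumed source powers
s = np.sqrt(sigma) * (rng.standard_normal(3) + 1j * rng.standard_normal(3))
x = A @ s
# Model covariance plus a small noise floor to keep R_x invertible.
R_x = (A * sigma) @ A.conj().T + 1e-6 * np.eye(2)
s_hat = wiener_separate(x, A, sigma, R_x)
```

Because there are more sources than microphones, the estimate is not exact in general; the filter distributes the observation among the sources according to their assumed powers.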
Further, step S5 applies the inverse Fourier transform to the separated frequency-domain source signals to obtain the estimated source signals in the time domain; an objective function is defined for this step.
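Step S5 can be illustrated by an STFT and inverse STFT pair implemented with weighted overlap-add. The Hann window, the frame length and the hop size are assumptions, and `stft` and `istft` are hypothetical single-channel helpers rather than the patent's implementation.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Short-time Fourier transform of a 1-D signal -> array (F, N)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[k * hop:k * hop + frame_len] * window
                       for k in range(n_frames)], axis=1)
    return np.fft.rfft(frames, axis=0)

def istft(S, frame_len=256, hop=128):
    """Inverse STFT by weighted overlap-add; S has shape (F, N)."""
    window = np.hanning(frame_len)
    n_frames = S.shape[1]
    T = frame_len + hop * (n_frames - 1)
    y = np.zeros(T)
    norm = np.zeros(T)
    frames = np.fft.irfft(S, n=frame_len, axis=0)   # (frame_len, N)
    for k in range(n_frames):
        seg = slice(k * hop, k * hop + frame_len)
        y[seg] += frames[:, k] * window
        norm[seg] += window ** 2
    # Normalize by the accumulated squared window; guard against zeros at the edges.
    return y / np.maximum(norm, 1e-12)

rng = np.random.default_rng(2)
x = rng.standard_normal(2048)
y = istft(stft(x))    # reconstruction of x, except at the two edge samples
```

In the separation pipeline the same `istft` would be applied to each separated source spectrogram in turn.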
Compared with the prior art, the principle and advantages of this solution are as follows:
This solution estimates the source signals in two stages: first the mixing matrix is estimated, and then the source signals are separated. A mathematical tool, parallel factor (PARAFAC) decomposition, is used to estimate the mixing channel matrix; the minimal distortion principle and K-means clustering resolve the scaling and permutation ambiguities; and the source signals are then separated by Wiener filtering. Experiments verify that the separation performance of this solution is better than that of other algorithms.
Brief description of the drawings
Fig. 1 is a flow chart of the two-channel underdetermined convolutive mixture blind signal separation method of the present invention;
Fig. 2 shows the waveforms of the source signals;
Fig. 3 shows the waveforms of the mixture channels;
Fig. 4 shows the waveforms of the separated source signals;
Fig. 5 compares the separation performance for music signals;
Fig. 6 compares the separation performance for speech signals.
Detailed description of the embodiments
The present invention is further described below with reference to a specific embodiment:
In the two-channel underdetermined convolutive mixture blind signal separation method of this embodiment, the source signals are estimated in two stages: first the mixing matrix is estimated, and then the source signals are separated. The specific steps are as follows:
S1: Acquire a two-channel mixture of three music signals; then acquire four groups of speech source signals and synthesize from them two groups of underdetermined convolutive mixtures, namely a two-channel mixture of three sources and a two-channel mixture of four sources. The distance between the two microphones is 1 m and the reverberation time is RT60 = 250 ms.
S2: Model the mixed signal. n source signals s(t) = [s_1(t), ..., s_n(t)]^T, with n = 3 or 4, are received by 2 microphones and mixed, giving the mixed signal x(t) = [x_1(t), ..., x_m(t)]^T with m = 2.
In the mixing model, A ∈ R^(m×n) denotes the unknown mixing channel matrix, τ denotes the time delay, and n(t) = [n_1(t), ..., n_m(t)]^T ∈ R^m denotes Gaussian noise.
S3: Apply the Fourier transform to the mixed signal x(t) to obtain the frequency-domain mixture x(f, n), and update the estimate of the mixing matrix iteratively by the CP tensor decomposition method.
In the update, R_x(f, n) = E[x(f, n) x^T(f, n)] is the autocorrelation matrix, ⊙ denotes the Khatri-Rao product, and A_f^* is the complex conjugate of A_f.
S4: Using the mixing matrix estimated in step S3, first resolve the scaling and permutation ambiguities in the frequency domain with the minimal distortion principle and K-means clustering, and then separate the source signals by Wiener filtering.
In the Wiener filter, R_x^(-1)(f, n) is the inverse of R_x(f, n).
S5: Apply the inverse Fourier transform to the separated frequency-domain source signals to obtain the estimated source signals in the time domain; an objective function is defined for this step.
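The scaling and permutation fix in step S4 can be sketched as follows. Scaling is handled here by normalizing each estimated mixing column to the first microphone, one common way of applying the minimal distortion principle to a mixing matrix, and the permutation is handled by a K-means-style loop that keeps a one-to-one assignment of columns to clusters at every frequency. Both choices, the real-valued `features` input and all function names are assumptions of this sketch, not the patent's procedure.

```python
import itertools
import numpy as np

def fix_scale(A):
    """Rescale each column so its first (reference-microphone) entry equals 1."""
    return A / A[0:1, :]

def align_permutations(features, n_iter=10):
    """Align source order across frequency bins.

    features: (F, n, d) real feature vectors, one per source per frequency.
    Returns perms (F, n): perms[f, k] is the column at bin f assigned to cluster k.
    """
    F, n, _ = features.shape
    perms = np.tile(np.arange(n), (F, 1))
    all_perms = [list(p) for p in itertools.permutations(range(n))]
    for _ in range(n_iter):
        # Centroid step: mean feature of each cluster under the current assignment.
        aligned = features[np.arange(F)[:, None], perms]
        centroids = aligned.mean(axis=0)
        # Assignment step: best one-to-one matching at every frequency bin.
        for f in range(F):
            costs = [np.sum((features[f, p] - centroids) ** 2) for p in all_perms]
            perms[f] = all_perms[int(np.argmin(costs))]
    return perms

# Toy data: 6 frequency bins, 2 sources, 2-D features; bins 1 and 4 are swapped.
features = np.tile(np.array([[0.0, 0.0], [10.0, 10.0]]), (6, 1, 1))
features[[1, 4]] = features[[1, 4], ::-1]
perms = align_permutations(features)
aligned = features[np.arange(6)[:, None], perms]
```

The exhaustive search over permutations is only practical for a small number of sources, such as the three or four sources considered in this embodiment.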
The feasibility and superiority of this embodiment are illustrated below by three specific simulation experiments. All experiments were programmed and run under Ubuntu 15.04 on an Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40 GHz with 32.00 GB of memory, using Matlab R2016b.
First, consider the two-channel mixture of three music signals. The data come from the public "SiSEC 2013" dataset (http://www.sisec.wiki.irisa.fr). The waveforms of the source signals are shown in Fig. 2, the waveforms of the mixture channels in Fig. 3, and the waveforms of the separated signals in Fig. 4. The signal-to-distortion ratio (SDR) is used as the performance measure; a larger SDR indicates better separation. Compared with several popular algorithms, the method of this embodiment clearly performs better, as shown in Fig. 5.
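For reference, a simplified signal-to-distortion ratio can be computed as below. This sketch treats every deviation from the reference as distortion; the evaluation used in the experiments may follow a more detailed definition, so the function `sdr_db` is only an illustrative assumption.

```python
import numpy as np

def sdr_db(reference, estimate):
    """Simplified SDR in dB: reference energy over residual energy."""
    residual = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(residual ** 2))

reference = np.ones(100)
estimate = 1.1 * reference                # a 10 % amplitude error
value = sdr_db(reference, estimate)       # about 20 dB
```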
Next, consider the two-channel mixtures of three speech signals and of four speech signals. The separation performance is compared with several currently popular algorithms in Fig. 6. The method proposed in this embodiment clearly achieves better separation performance.
The embodiments described above are only preferred embodiments of the present invention and do not limit its scope of implementation; any change made according to the shape and principle of the present invention shall therefore fall within the scope of protection of the present invention.

Claims (5)

1. A two-channel underdetermined convolutive mixture blind signal separation method, characterized in that the method estimates the source signals in two stages, first estimating the mixing matrix and then separating the source signals, with the following specific steps:
S1: acquire speech signals and music signals, and synthesize two-channel underdetermined convolutive mixtures;
S2: model the underdetermined convolutive mixture mathematically to obtain the mathematical expression of the underdetermined convolutive mixing model;
S3: apply the Fourier transform to the observed signals to obtain the frequency-domain mixture x(f, n), and estimate the mixing matrix in the frequency domain;
S4: use the estimated mixing matrix to separate the source signals in the frequency domain;
S5: apply the inverse Fourier transform to the separated frequency-domain source signals to obtain the estimated source signals in the time domain.
2. The two-channel underdetermined convolutive mixture blind signal separation method according to claim 1, characterized in that step S2 models the underdetermined convolutive mixture as follows:
Assume that n source signals s(t) = [s_1(t), ..., s_n(t)]^T are received by m microphones, producing the mixed signal x(t) = [x_1(t), ..., x_m(t)]^T.
In the mixing model, A ∈ R^(m×n) denotes the unknown mixing channel matrix, * denotes convolution, τ denotes the time delay, and n(t) = [n_1(t), ..., n_m(t)]^T ∈ R^m denotes Gaussian noise.
3. The two-channel underdetermined convolutive mixture blind signal separation method according to claim 1, characterized in that in step S3 the mixing matrix is estimated as follows:
The Fourier transform is applied to the mixed signal x(t) to obtain the frequency-domain mixture x(f, n), and the estimate of the mixing matrix is updated iteratively by the CP tensor decomposition method.
In the update, R_x(f, n) = E[x(f, n) x^T(f, n)] is the autocorrelation matrix, ⊙ denotes the Khatri-Rao product, and A_f^* is the complex conjugate of A_f.
4. The two-channel underdetermined convolutive mixture blind signal separation method according to claim 1, characterized in that step S4 uses the mixing matrix estimated in step S3, first resolving the scaling and permutation ambiguities in the frequency domain with the minimal distortion principle and K-means clustering, and then separating the source signals by Wiener filtering.
In the Wiener filter, R_x^(-1)(f, n) is the inverse of R_x(f, n).
5. The two-channel underdetermined convolutive mixture blind signal separation method according to claim 1, characterized in that step S5 applies the inverse Fourier transform to the separated frequency-domain source signals to obtain the estimated source signals in the time domain; an objective function is defined for this step.
CN201811434791.3A 2018-11-28 2018-11-28 Two-channel underdetermined convolutive mixture blind signal separation method Pending CN109671447A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811434791.3A CN109671447A (en) 2018-11-28 2018-11-28 Two-channel underdetermined convolutive mixture blind signal separation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811434791.3A CN109671447A (en) 2018-11-28 2018-11-28 Two-channel underdetermined convolutive mixture blind signal separation method

Publications (1)

Publication Number Publication Date
CN109671447A true CN109671447A (en) 2019-04-23

Family

ID=66143308

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811434791.3A Pending CN109671447A (en) 2018-11-28 2018-11-28 Two-channel underdetermined convolutive mixture blind signal separation method

Country Status (1)

Country Link
CN (1) CN109671447A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101667425A (en) * 2009-09-22 2010-03-10 山东大学 Method for blind source separation of convolutively mixed speech signals
CN102222508A (en) * 2011-07-12 2011-10-19 大连理工大学 Matrix-transformation-based method for underdetermined blind source separation
CN106570527A (en) * 2016-11-07 2017-04-19 广东工业大学 Single-channel blind source separation method for two-period signal aliasing
CN106887238A (en) * 2017-03-01 2017-06-23 中国科学院上海微系统与信息技术研究所 Blind acoustic signal separation method based on an improved independent vector analysis algorithm

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
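方标等: "多通道盲反卷积算法综述" [A survey of multichannel blind deconvolution algorithms], 《信号处理》 *
李剑等: "基于平行因子分解的频域盲解卷积算法" [Frequency-domain blind deconvolution algorithm based on parallel factor decomposition], 《计算机与网络》 *
艾小凡等: "基于张量正则分解的时频混叠信号欠定盲分离方法" [Underdetermined blind separation method for time-frequency overlapped signals based on canonical tensor decomposition], 《航空学报》 *
邱珊等: "基于空间协方差矩阵的欠定卷积盲源分离" [Underdetermined convolutive blind source separation based on spatial covariance matrices], 《邵阳学院学报(自然科学版)》 *
闵苏等: "基于非负矩阵分解的欠定卷积盲源分离方法" [Underdetermined convolutive blind source separation method based on non-negative matrix factorization], 《桂林电子科技大学学报》 *
陆凤波等: "一种时频混叠的欠定混合通信信号盲分离算法" [A blind separation algorithm for underdetermined mixtures of time-frequency overlapped communication signals], 《国防科技大学学报》 *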

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110336574A (en) * 2019-07-11 2019-10-15 中国人民解放军战略支援部队信息工程大学 Method and device for recovering source signals
CN110491408A (en) * 2019-07-16 2019-11-22 广东工业大学 Music signal underdetermined aliasing blind separation method based on sparse element analysis
CN110491408B (en) * 2019-07-16 2021-12-24 广东工业大学 Music signal underdetermined aliasing blind separation method based on sparse element analysis
CN110706709A (en) * 2019-08-30 2020-01-17 广东工业大学 Multi-channel convolution aliasing voice channel estimation algorithm combined with video signal
CN110706709B (en) * 2019-08-30 2021-11-19 广东工业大学 Multi-channel convolution aliasing voice channel estimation method combined with video signal
CN110708094A (en) * 2019-09-12 2020-01-17 广东石油化工学院 PLC signal filtering method and system utilizing Gibbs effect
CN110956978A (en) * 2019-11-19 2020-04-03 广东工业大学 Sparse blind separation method based on underdetermined convolution aliasing model

Similar Documents

Publication Publication Date Title
CN109671447A (en) Two-channel underdetermined convolutive mixture blind signal separation method
Pedersen et al. Convolutive blind source separation methods
Sawada et al. Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment
Li et al. Multiple-speaker localization based on direct-path features and likelihood maximization with spatial sparsity regularization
Grais et al. Raw multi-channel audio source separation using multi-resolution convolutional auto-encoders
Mazur et al. An approach for solving the permutation problem of convolutive blind source separation based on statistical signal models
Sun et al. A speaker-dependent approach to separation of far-field multi-talker microphone array speech for front-end processing in the CHiME-5 challenge
Quan et al. Multi-channel narrow-band deep speech separation with full-band permutation invariant training
Xiao et al. Beamforming networks using spatial covariance features for far-field speech recognition
Sawada et al. Blind extraction of a dominant source signal from mixtures of many sources [audio source separation applications]
Tachioka et al. Coupled Initialization of Multi-Channel Non-Negative Matrix Factorization Based on Spatial and Spectral Information.
CN110265060B (en) Speaker number automatic detection method based on density clustering
Cobos et al. Two-microphone separation of speech mixtures based on interclass variance maximization
Wang et al. UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-determined Training Mixtures
Jafari et al. Sparse coding for convolutive blind audio source separation
Peng et al. Competing Speaker Count Estimation on the Fusion of the Spectral and Spatial Embedding Space.
Jafari et al. An adaptive stereo basis method for convolutive blind audio source separation
Ihara et al. Multichannel speech separation and localization by frequency assignment
Ukai et al. Multistage SIMO-model-based blind source separation combining frequency-domain ICA and time-domain ICA
Jang et al. Independent vector analysis using non-spherical joint densities for the separation of speech signals
Wang et al. A robust blind source separation algorithm based on non-negative matrix factorization and frequency-sliding generalized cross-correlation
SHIRAISHI et al. Blind source separation by multilayer neural network classifiers for spectrogram analysis
Yang et al. Boosting spatial information for deep learning based multichannel speaker-independent speech separation in reverberant environments
Kühne et al. Time-frequency masking: Linking blind source separation and robust speech recognition
JP2008026625A (en) Multi-bin independent component analysis and blind sound source separation device using the same

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190423