CN113823316B - Voice signal separation method for sound source close to position - Google Patents

Voice signal separation method for sound source close to position Download PDF

Info

Publication number
CN113823316B
CN113823316B CN202111125927.4A CN202111125927A CN113823316B CN 113823316 B CN113823316 B CN 113823316B CN 202111125927 A CN202111125927 A CN 202111125927A CN 113823316 B CN113823316 B CN 113823316B
Authority
CN
China
Prior art keywords
signal
time
separation matrix
voice
separation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111125927.4A
Other languages
Chinese (zh)
Other versions
CN113823316A (en
Inventor
廖乐乐
卢晶
陈锴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CN202111125927.4A priority Critical patent/CN113823316B/en
Publication of CN113823316A publication Critical patent/CN113823316A/en
Application granted granted Critical
Publication of CN113823316B publication Critical patent/CN113823316B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272Voice signal separating
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272Voice signal separating
    • G10L21/028Voice signal separating using properties of sound source
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a voice signal separation method aiming at a sound source close to the position. The method comprises the following steps: step 1, acquiring a mixed voice time-frequency domain signal to be processed; step 2, initializing a separation matrix of each frequency band; step 3, performing joint optimization on the separation matrixes of all the frequency bands; step 4, performing size normalization on the separation matrix; step 5, estimating the separated time-frequency domain voice signals; and 6, recovering a time domain voice signal from the separated time-frequency domain voice signal. The method can help the separation algorithm to obtain better voice signal separation effect under the unfavorable condition that the sound source positions are close.

Description

Voice signal separation method for sound source close to position
Technical Field
The invention relates to the technical field of voice processing, in particular to a voice signal separation technology.
Background
The voice separation technology can separate the original sound source signals from the mixed signals of a plurality of sound sources, is an important task in the field of voice signal processing, and plays an important role in various application scenes such as intelligent home systems, video conference systems, voice recognition systems and the like.
In a multichannel speech signal processing scheme, independent Vector Analysis (IVA) establishes the association of each frequency component of a source signal through a joint probability distribution model, thereby constructing an overall cost function. Auxiliary function based IVA (AuxIVA) and Independent low-rank matrix analysis (ILRMA) are considered to be the most advanced methods of separating convolutionally mixed audio signals. The AuxIVA algorithm utilizes the optimization skill of the localization-minimization (MM) to deduce Iterative Projection (IP) iteration rules, and can rapidly and stably optimize the separation matrix. The optimization of AuxIVA can also be combined with other more flexible signal models. ILRMA is a signal model fused with the optimization strategy of AuxIVA and MNMF, and ensures that the cost after each iteration is not increased while utilizing the strong representation capability of MNMF.
The separation effect of the IVA is independent of the sound source position in the ideal case, however, in the actual case, due to the presence of noise, when the sound source position approaches, the separation effect of the algorithm is significantly reduced, which limits the application of the separation algorithm in practice to a great extent. How to improve the separation effect of the close-to-sound source is a considerable technical problem.
Disclosure of Invention
In order to solve the technical problems, the invention provides a voice signal separation method aiming at a sound source close to the position, which can obviously improve the separation effect of voice signals.
The invention adopts the technical scheme that:
a voice signal separation method for a sound source positioned close to the sound source comprises the following steps:
step 1, acquiring a mixed voice time-frequency domain signal to be processed;
step 2, initializing a separation matrix of each frequency band for the mixed voice time-frequency domain signal;
step 3, joint optimization is carried out on the separation matrixes of all the frequency bands so as to solve the sequencing uncertainty;
step 4, performing size normalization on the optimized separation matrix;
step 5, estimating a time-frequency domain voice signal according to the separation matrix processed in the step 4;
and 6, recovering the time domain voice signal from the time-frequency domain voice signal estimated in the step 5.
Further, the specific steps of the step 1 are as follows: and obtaining a time domain signal of the mixed voice to be processed by using a signal acquisition system, and performing short-time Fourier transform on the time domain signal to obtain a time-frequency domain signal of the mixed voice to be processed.
Further, in the step 2, the separation matrix of each frequency band is initialized by using an identity matrix, the diagonal element of the matrix is 1, and the rest elements are 0.
Further, in the step 3, the specific step of performing joint optimization on the separation matrix of all the frequency bands is as follows: selecting a source signal distribution model to obtain a cost function; (2) Selecting an optimization method for the cost function to obtain an updating rule of the separation matrix; (3) And iterating the separation matrix by using the updating rule until convergence to obtain the separation matrix after each frequency band is optimized.
Further, in the step 4, the separation matrix is subjected to the size normalization according to the minimum distortion criterion.
Further, the specific steps of the step 5 are as follows: multiplying the separation matrix obtained in the step 4 with the mixed voice time-frequency domain signal to be processed, and estimating a separated time-frequency domain voice signal.
Further, the specific steps of the step 6 are as follows: and (5) performing short-time Fourier inverse transformation on the time-frequency domain voice signals estimated in the step (5) to obtain separated time domain voice signals.
The invention realizes an improved voice signal separation method aiming at sound sources close to each other. The method has the advantages that the separation effect of the sound source position close to the scene is obviously improved, meanwhile, the block ordering problem of IVA under certain conditions is relieved, and the separation effect under the scene with the sound source far away is also improved.
Drawings
FIG. 1 is a flow chart of a method for separating speech signals according to the present invention;
FIG. 2 is a schematic diagram of a sound source approach scene to which the present invention is applicable;
fig. 3 is a graph comparing SDR improvement values at different reverberation times for the original AuxIVA method, the improved AuxIVA method of the present invention, the original ILRMA method, and the improved ILRMA method of the present invention.
Fig. 4 is a graph of SIR rise values at different reverberation times for the original AuxIVA method, the improved AuxIVA method of the present invention, the original ilmma method, and the improved ilmma method of the present invention.
Detailed Description
The invention mainly aims at the voice separation method of the position close to the sound source, which mainly comprises the following parts:
1. signal acquisition
1) And convoluting and mixing the pure source signal with the room impulse response, and adding diffusion noise to obtain a mixed signal.
2) Performing short-time Fourier transform on signals
If the mixed signal acquired by the mth microphone is x m (t) performing short-time Fourier transform on the signal to obtain a time-frequency domain, ignoring a time frame index t, and expressing the signal of the kth frequency band asThe signals picked up by a total of M microphones form a mixed signal vector +.>The superscript T denotes a transpose operation.
2. Iterative algorithm
The nth source signal vector is denoted s n N is the source signal indicator and n=1, 2, …, N is the total number of source signals. The separation matrix is denoted by W, and the nth row of the separation matrix isThe superscript H denotes the conjugate transpose, the superscript K denotes the kth frequency band, and k=1, 2, …, K are the total number of frequency bands. />Representing a set of all band separation matrices, detW k Is a determinant of the separation matrix within the kth frequency band. Source signal vector s n The corresponding estimated signal is denoted y n ,/>A t-th frame representing an n-th estimated signal in a k-th frequency band. Neglecting the time frame index,/->For separation purposes, the estimated signals are made as independent as possible, and the cost function is constructed by using mutual information as an independence measure.
1) If a laplace source signal distribution model is selected, the mutual information cost function is properly modified to be suitable for a scene with a close sound source position, and the final cost function can be written as follows:
wherein Mean sample, +.>Is in the form of y n || 2 As a function of the argument, f represents the probability density distribution function of the source signal. The auxiliary function is constructed by adopting the optimization skill of the localization-minimization (MM):
wherein Is an auxiliary variable. Let->Optimal conditions for obtaining solutions
Where q is another source signal indicator. The iteration rule is then:
g' (. Cndot.) represents the first derivative of G (. Cndot.), e n Representing a unit vector, the nth element is 1 and the remaining elements are 0. For the laplace distribution, G (|) y n || 2 )=||y n || 2 ,G'(||y n || 2 ) =1. Initializing the separation matrix as an identity matrix, and then iterating until convergence according to the rules of formulas (4) - (7) to obtain an optimized separation matrix.
2) If MNMF is selected as the source signal distribution model, the cost functions of IVA and MNMF are fused, and the cost functions are properly modified to be suitable for the scene with the close sound source position, the final cost functions can be written as follows:
wherein ,tkl,n and vlt,n The basis and the activation parameters of different sound sources, respectively, l being an indicator of the basis. The following iteration rule is obtained by adopting the optimization skill of the localization-minimization (MM):
wherein the model parameter t kl,n and vlt,n The update rules of (a) are respectively:
wherein Representing the sample average, l' is a new indicator of basis. Initializing the separation matrix as an identity matrix, and then iterating until convergence according to the rules of formulas (9) - (14) to obtain an optimized separation matrix.
3. Size is regular
In order to solve the uncertainty of the recovered signal amplitude, the separation matrix obtained after convergence needs to be subjected to amplitude normalization. According to MDP, the optimized separation matrix is subjected to the following treatment:
W k ←(W k (W k ) H ) -1/2 W k (15)
4. reconstructing a target signal
1) Estimating time-frequency domain target signals
The final separation matrix obtained from equation (15) can be estimated from the following equation for each band-separated speech signal:
y k =W k x k (16)
2) Reconstructing a time domain target signal
Finally, the separated time-frequency domain voice signals are transformed into time domains through short-time inverse Fourier transform, and signals of the time domains are recovered.
Examples
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the accompanying drawings.
1. Test sample and objective evaluation criterion
The clean speech signal in this embodiment is selected from the TIMIT dataset (cut and spliced to form a speech signal for each segment 10s long) at a sampling rate of 16kHz. The room impulse response was generated with an image model (j.b. allen and d.a. berkley, "Image method for efficiently simulating small-room games," j.acoust.soc.am., vol.65, pp.943-950,1979.), the room size was 7m x 5m x 2.75m, and the reverberation times were set to 0ms, 100ms, 300ms, 500ms, 700ms, respectively. As shown in fig. 2,2 microphones are used in this embodiment to receive signals from 2 sound sources. The distance between the two microphones is 2.5cm, and the center is at [4,1,1.5] (m). The sound source and the microphone are positioned at the same horizontal plane, the two sound sources are respectively positioned at 45 degrees and 60 degrees, and the distance from the center of the array is 1m. The clean speech signal was convolutionally mixed with the room impulse response and a diffuse noise of signal to noise ratio (SNR) of 30dB was added as described in literature (e.a. habets and s. Gannot, "Generating sensor signals in isotropic noise fields," JASA, vol.122, no.6, pp.3464-3470,2007.), yielding 100 different mixed signals. All algorithms are processed in the time-frequency domain, and the short-time Fourier transform uses a Hanning window of 2048 points and an overlap ratio of 3/4.
The present embodiment uses signal to distortion ratio (SDR) and signal to interference ratio (SIR) as objective evaluation criteria, and subtracts the output SDR value (sdr_out)/SIR value (sir_out) after the algorithm processing from the SDR value (sdr_in)/SIR value (sir_in) of the input mixed signal to obtain the SDR boost value (sdramp)/SIR boost value (SIRimp) after the algorithm processing, that is, sdramp=sdr_out-sdr_in, and sirimp=sir_out-sir_in.
2. Specific implementation flow of method
Referring to fig. 1, a time-domain mixed speech signal is input and subjected to short-time fourier transform to obtain a time spectrum, and a separation matrix of each frequency band is initialized to an identity matrix. In the modified AuxIVA algorithm (denoted AuxIVA-imp), use is made ofCarrying out iterative optimization on formulas (4) - (7); in the modified ilmra algorithm (denoted ilmra-imp), iterative optimization was performed using equations (9) - (14). After iteration convergence, the final separation matrix W is obtained by adopting the formula (15) to carry out the size normalization k Substituting the time-domain speech signal into the formula (16) to obtain the separated speech time-frequency spectrum estimation, and finally performing short-time inverse Fourier transform on the estimated speech time-frequency spectrum to obtain the separated time-domain speech signal.
To demonstrate the performance of the method of the present invention, the present example compares the original AuxIVA algorithm (denoted AuxIVA-ori) (N.Ono., "Stable and fast update rules for independent vector analysis based on auxiliary function technique," in Proc. IEEE WASPAA, pp.189-192,2011.) and the ILRMA algorithm (denoted ILRMA-ori) (D.Kitamura, N.Ono, H.Sawada, H.Kameoka, and H.Saruwatari, "Determined blind source separation unifying independent vector analysis and nonnegative matrix factorization," IEEE Trans. Audio, spech, lang. Process., vol.24, no.9, pp.1626-1641,2016.) with the modified methods of the present invention AuxIVA-imp, ILRMA-imp. FIG. 3 shows the results of 100 tests at different reverberation times for the average SDRimp; figure 4 shows the results of 100 tests with average SIRimp at different reverberation times.
It can be found that the method of the invention can separate more effectively under noise-containing conditions than the original algorithm in a scene where the sound source is close, and has more obvious advantages under the condition of middle and low reverberation.

Claims (7)

1. A method of separating speech signals for a sound source located close to the sound source, the method comprising the steps of:
step 1, acquiring a mixed voice time-frequency domain signal to be processed;
step 2, initializing a separation matrix of each frequency band for the mixed voice time-frequency domain signal;
step 3, joint optimization is carried out on the separation matrixes of all the frequency bands so as to solve the sequencing uncertainty; the method comprises the following specific steps:
step 31, selecting a source signal distribution model to obtain a cost function;
when the Laplace distribution is selected as a source signal distribution model, the cost function is:
wherein ,representing sample averages, G (·) is a scoring function determined by the source signal model; n is the source signal index and n=1, 2, …, N is the total number of source signals; k is a frequency index and k=1, 2, …, K is the total number of frequency bands; />Represents the nth estimated signal in the kth frequency band, detW k Is a determinant of a separation matrix within the kth frequency band;
when multi-channel non-negative matrix factorization is selected as a source signal model, the cost function is:
wherein t is a time frame index, t kl,n and vlt,n Respectively the basis and the activation parameters of different sound sources, i is the index of the basis, N is the index of the source signal and n=1, 2, …, N is the total number of source signals; k is a frequency index and k=1, 2, …, K is the total number of frequency bands;a t-th frame, detW, representing an nth estimated signal in a kth frequency band k Is a determinant of a separation matrix within the kth frequency band;
step 32, adopting a localization-minimization optimization method for the cost function to obtain an updating rule of the separation matrix;
step 33, iterating the separation matrix by using the updating rule until convergence to obtain a separation matrix after each frequency band is optimized;
step 4, performing size normalization on the optimized separation matrix;
step 5, estimating a time-frequency domain voice signal according to the separation matrix processed in the step 4;
and 6, recovering the time domain voice signal from the time-frequency domain voice signal estimated in the step 5.
2. The method for separating a voice signal from a sound source according to claim 1, wherein the specific steps of step 1 are as follows: and obtaining a time domain signal of the mixed voice to be processed by using a signal acquisition system, and performing short-time Fourier transform on the time domain signal to obtain a time-frequency domain signal of the mixed voice to be processed.
3. The method according to claim 1, wherein in the step 2, the separation matrix of each frequency band is initialized by using an identity matrix, the diagonal element of the matrix is 1, and the remaining elements are 0.
4. The method according to claim 1, wherein in step 32, when a laplace distribution is selected as the source signal distribution model, an update rule for obtaining the separation matrix is:
wherein Representing a separation matrix W k In (2), the superscript H denotes the conjugate transpose, x k Representing a mixed signal vector in the kth frequency band, is->M represents the total number of microphones, G' (. Cndot.) represents the first derivative of G (. Cndot.), G (r) n )=r n ,G'(r n )=1;e n Representing a unit vector, the nth element being 1, the remaining elements being 0;
when multi-channel non-negative matrix factorization is selected as a source signal model, the update rule of the obtained separation matrix is as follows:
wherein tkl,n and vlt,n The update rules of (a) are respectively:
wherein ,represents sample average, e n Representing a unit vector, the nth element being 1, the remaining elements being 0,l' new indicators of the base; />Representing a separation matrix W k The superscript H denotes the conjugate transpose.
5. The method for separating voice signals from sound sources according to claim 1, wherein in the step 4, the separation matrix is subjected to the size normalization according to the minimum distortion criterion, and the specific steps are as follows:
W k ←(W k (W k ) H ) -1/2 W k
where K is a frequency index, k=1, 2, …, K is the total number of frequency bands; w (W) k The separation matrix of the kth frequency band is represented, and the superscript H represents the conjugate transpose.
6. The method for separating a speech signal from a sound source according to claim 5, wherein the specific steps of step 5 are as follows: separating matrix W obtained in step 4 k With the mixed speech time-frequency domain signal x to be processed k Multiplying to estimate the separated time-frequency domain voice signal y k
7. The method for separating a voice signal from a sound source according to claim 1, wherein the specific steps of step 6 are as follows: and (5) performing short-time Fourier inverse transformation on the time-frequency domain voice signals estimated in the step (5) to obtain separated time domain voice signals.
CN202111125927.4A 2021-09-26 2021-09-26 Voice signal separation method for sound source close to position Active CN113823316B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111125927.4A CN113823316B (en) 2021-09-26 2021-09-26 Voice signal separation method for sound source close to position

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111125927.4A CN113823316B (en) 2021-09-26 2021-09-26 Voice signal separation method for sound source close to position

Publications (2)

Publication Number Publication Date
CN113823316A CN113823316A (en) 2021-12-21
CN113823316B true CN113823316B (en) 2023-09-12

Family

ID=78915482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111125927.4A Active CN113823316B (en) 2021-09-26 2021-09-26 Voice signal separation method for sound source close to position

Country Status (1)

Country Link
CN (1) CN113823316B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114220453B (en) * 2022-01-12 2022-08-16 中国科学院声学研究所 Multi-channel non-negative matrix decomposition method and system based on frequency domain convolution transfer function
CN116866123B (en) * 2023-07-13 2024-04-30 中国人民解放军战略支援部队航天工程大学 Convolution blind separation method without orthogonal limitation

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104333523A (en) * 2014-10-14 2015-02-04 集美大学 NPCA-based post nonlinear blind source separation method
WO2016050725A1 (en) * 2014-09-30 2016-04-07 Thomson Licensing Method and apparatus for speech enhancement based on source separation
WO2016152511A1 (en) * 2015-03-23 2016-09-29 ソニー株式会社 Sound source separating device and method, and program
CN108597531A (en) * 2018-03-28 2018-09-28 南京大学 A method of improving binary channels Blind Signal Separation by more sound source activity detections
CN109584900A (en) * 2018-11-15 2019-04-05 昆明理工大学 A kind of blind source separation algorithm of signals and associated noises
CN110010148A (en) * 2019-03-19 2019-07-12 中国科学院声学研究所 A kind of blind separation method in frequency domain and system of low complex degree
CN111259327A (en) * 2020-01-15 2020-06-09 桂林电子科技大学 Subgraph processing-based optimization method for consistency problem of multi-agent system
CN112037813A (en) * 2020-08-28 2020-12-04 南京大学 Voice extraction method for high-power target signal
CN112185411A (en) * 2019-07-03 2021-01-05 南京人工智能高等研究院有限公司 Voice separation method, device, medium and electronic equipment
CN112820312A (en) * 2019-11-18 2021-05-18 北京声智科技有限公司 Voice separation method and device and electronic equipment

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016050725A1 (en) * 2014-09-30 2016-04-07 Thomson Licensing Method and apparatus for speech enhancement based on source separation
CN104333523A (en) * 2014-10-14 2015-02-04 集美大学 NPCA-based post nonlinear blind source separation method
WO2016152511A1 (en) * 2015-03-23 2016-09-29 ソニー株式会社 Sound source separating device and method, and program
CN108597531A (en) * 2018-03-28 2018-09-28 南京大学 A method of improving binary channels Blind Signal Separation by more sound source activity detections
CN109584900A (en) * 2018-11-15 2019-04-05 昆明理工大学 A kind of blind source separation algorithm of signals and associated noises
CN110010148A (en) * 2019-03-19 2019-07-12 中国科学院声学研究所 A kind of blind separation method in frequency domain and system of low complex degree
CN112185411A (en) * 2019-07-03 2021-01-05 南京人工智能高等研究院有限公司 Voice separation method, device, medium and electronic equipment
CN112820312A (en) * 2019-11-18 2021-05-18 北京声智科技有限公司 Voice separation method and device and electronic equipment
CN111259327A (en) * 2020-01-15 2020-06-09 桂林电子科技大学 Subgraph processing-based optimization method for consistency problem of multi-agent system
CN112037813A (en) * 2020-08-28 2020-12-04 南京大学 Voice extraction method for high-power target signal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Performance Based Cost Functions for End-to-End Speech Separation;Shrikant Venkataramani;《2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)》;全文 *

Also Published As

Publication number Publication date
CN113823316A (en) 2021-12-21

Similar Documents

Publication Publication Date Title
US9668066B1 (en) Blind source separation systems
US8874439B2 (en) Systems and methods for blind source signal separation
CN111133511B (en) sound source separation system
CN113823316B (en) Voice signal separation method for sound source close to position
US8848933B2 (en) Signal enhancement device, method thereof, program, and recording medium
CN106251877A (en) Voice Sounnd source direction method of estimation and device
CN106847301A (en) A kind of ears speech separating method based on compressed sensing and attitude information
CN109671447A (en) A kind of binary channels is deficient to determine Convolution Mixture Signals blind signals separation method
Nesta et al. Robust Automatic Speech Recognition through On-line Semi Blind Signal Extraction
Luo et al. Implicit filter-and-sum network for multi-channel speech separation
Li et al. An EM algorithm for audio source separation based on the convolutive transfer function
CN114283832B (en) Processing method and device for multichannel audio signal
CN112037813B (en) Voice extraction method for high-power target signal
CN112201276B (en) TC-ResNet network-based microphone array voice separation method
Dmour et al. A new framework for underdetermined speech extraction using mixture of beamformers
Aroudi et al. Cognitive-driven convolutional beamforming using EEG-based auditory attention decoding
JP6910609B2 (en) Signal analyzers, methods, and programs
Yoshioka et al. Dereverberation by using time-variant nature of speech production system
CN114863944B (en) Low-delay audio signal overdetermined blind source separation method and separation device
CN112820312A (en) Voice separation method and device and electronic equipment
CN114220453B (en) Multi-channel non-negative matrix decomposition method and system based on frequency domain convolution transfer function
Jafari et al. Sparse coding for convolutive blind audio source separation
Talagala et al. Binaural localization of speech sources in the median plane using cepstral HRTF extraction
CN113393850A (en) Parameterized auditory filter bank for end-to-end time domain sound source separation system
JP4714892B2 (en) High reverberation blind signal separation apparatus and method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant