CN108962276A - Speech separation method and device - Google Patents

Speech separation method and device

Info

Publication number
CN108962276A
CN108962276A
Authority
CN
China
Prior art keywords
signal
separation
iteration
separated
residual coefficients
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810820474.9A
Other languages
Chinese (zh)
Other versions
CN108962276B (en)
Inventor
代金良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sinwt Science & Technology Co ltd
Original Assignee
Beijing Three Hearing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Three Hearing Technology Co Ltd
Priority to CN201810820474.9A
Publication of CN108962276A
Application granted
Publication of CN108962276B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0272 Voice signal separating
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques using neural networks
    • G10L2021/02087 Noise filtering, the noise being separate speech, e.g. cocktail party

Landscapes

  • Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Telephone Function (AREA)

Abstract

The invention discloses a speech separation method, comprising: obtaining the to-be-separated speech data of each signal channel; for each preset sampling instant, performing separation processing on the to-be-separated speech data using a blind source separation algorithm to obtain P separated signals; calculating the cross residual coefficient between the current separated signal and each of the other separated signals among the P separated signals, and judging whether the cross residual coefficient is less than a first preset threshold; if not, performing echo cancellation, using an echo cancellation algorithm, on all separated signals whose cross residual coefficients are not less than the first preset threshold, and taking the set consisting of the processed separated signals and all separated signals whose cross residual coefficients are less than the first preset threshold as the target separated signals; if so, taking the separated signal as a target separated signal. An embodiment of the invention further provides a speech separation device. With embodiments of the invention, the cross-signal residual in the speech signal can be reduced.

Description

Speech separation method and device
Technical field
The present invention relates to a speech processing method and device, and more particularly to a speech separation method and device.
Background art
In the field of speech processing, it is often necessary to perform speech separation on a signal in which several people speak at the same time, so as to obtain each speaker's individual voice signal. The quality of the separation is measured by how little residual speech from the other speakers remains after separation. This problem is known academically as the "cocktail party problem"; it has long hindered human-machine speech interaction, and so far no product or scheme that works stably in real environments is available.
At present, common speech separation algorithms mainly include: neural network algorithms, maximum entropy algorithms, minimum mutual information algorithms, maximum likelihood algorithms, independent component analysis, genetic algorithms, machine learning, and beamforming algorithms based on microphone arrays.
However, limited by their underlying theory, existing algorithms generally achieve unsatisfactory separation, and the cross-signal residual is large.
Summary of the invention
The technical problem to be solved by the present invention is to provide a speech separation method and device, so as to solve the prior-art problem that the cross-signal residual is large.
The present invention solves the above technical problem through the following technical schemes:
An embodiment of the invention provides a speech separation method, the method comprising:
obtaining the to-be-separated speech data of each signal channel, wherein the to-be-separated speech data contains speech produced by at least two people speaking at the same time;
for each preset sampling instant, performing separation processing on the to-be-separated speech data using a blind source separation algorithm to obtain P separated signals;
for each separated signal, calculating the cross residual coefficient between the current separated signal and each of the other separated signals among the P separated signals, and judging whether the cross residual coefficient is less than a first preset threshold;
if not, performing echo cancellation, using an echo cancellation algorithm, on all separated signals whose cross residual coefficients are not less than the first preset threshold, and taking the set consisting of the processed separated signals and all separated signals whose cross residual coefficients are less than the first preset threshold as the target separated signals;
if so, taking the separated signal as a target separated signal.
Optionally, the blind source separation algorithm includes one of, or a combination of: nonlinear principal component analysis, independent component analysis, a neural network algorithm, a maximum entropy algorithm, a minimum mutual information algorithm, and a maximum likelihood algorithm.
Optionally, performing separation processing on the to-be-separated speech data using a blind source separation algorithm comprises:
for each channel of to-be-separated speech data, establishing, according to the NPCA criterion, a cost function for the to-be-separated speech data: J(W) = E{||x(t) - W^T g(Wx(t))||^2}, wherein
J(W) is the cost of the separation matrix at time t; E{.} is the expectation operator; x(t) is the observed signal of the signal channel corresponding to each microphone; W is the separation matrix; (.)^T is the transpose operation; g(.) is a nonlinear function; t is the current time;
minimizing the cost function to obtain the iterative estimate of the separation matrix:
W(t+1) = W(t) + θ*z(t)[x^T(t) - z^T(t)W(t)], wherein
W(t+1) is the separation matrix at time t+1; W(t) is the separation matrix at time t; θ is the iteration step size, and the step size θ(t) at time t is obtained from the step size θ(t-1) at time t-1, a constant ρ, and the gradient ∇J(t) of the cost J(t) at time t; z(t) is the output of the nonlinear function, z(t) = g(W(t)x(t));
iteratively computing the separation matrix of the next instant with the formula W(t+1) = W(t) + θ*z(t)[x^T(t) - z^T(t)W(t)] until the separation matrix converges, to obtain the target separation matrix of each channel of to-be-separated speech data;
obtaining the separated signal of the to-be-separated speech data with the formula y(t) = Wx(t), wherein y(t) is the separated signal of the current observed signal.
Optionally, calculating the cross residual coefficient between the current separated signal and each of the other separated signals among the P separated signals comprises:
calculating the cross residual coefficient ξ_{i,j} of the current separated signal with respect to each other separated signal among the P separated signals from the mixing coefficients and the sound-source signals, wherein
ξ_{i,j} is the cross residual coefficient between the current separated signal of the i-th channel and another separated signal among the P separated signals; i is the channel number of the current separated signal; j is the channel number of the other separated signal among the P separated signals; a_{i,k} is the mixing coefficient between the separated signal of the i-th channel and the k-th separated signal; a_{j,k} is the mixing coefficient between the separated signal of the j-th channel and the k-th separated signal; y_k is the sound-source signal of the k-th channel; Σ is the summation operator.
Optionally, performing echo cancellation, using an echo cancellation algorithm, on all separated signals whose cross residual coefficients are not less than the first preset threshold comprises:
for each separated signal among the separated signals whose cross residual coefficients are not less than the first preset threshold, taking the current separated signal as the near-end signal, and taking the other signals among the separated signals whose cross residual coefficients are not less than the first preset threshold as far-end signals;
obtaining the error signal with the formula e(n) = d(n) - Σ_{k=0}^{N-1} ŵ_k(n)x(n-k), wherein e(n) is the error signal; d(n) is the desired output signal; N is the duration of each audio frame, whose value is the filter length; k is the index of the sampling point within the audio frame; ŵ_k(n) is the estimated filter coefficient of the k-th sampling point at the n-th iteration; n is the iteration number; x(n-k) is the observed signal at the (n-k)-th iteration;
updating the iteration step size μ(n), wherein μ(n), the step size at the n-th iteration, is obtained from the variance σ_v^2 of the near-end signal, the observed signals x(n-i), and the misalignment Λ(n) at the n-th iteration; N is the duration of each audio frame, whose value is the filter length, and k ∈ (0, N);
updating the estimated filter coefficients with a normalized update of the form ŵ_k(n+1) = ŵ_k(n) + μ(n)e(n)x*(n-k)/Σ_{i=0}^{N-1}|x(n-i)|^2, wherein ŵ_k(n+1) is the estimated filter coefficient at the (n+1)-th iteration; μ(n) is the iteration step size; ŵ_k(n) is the estimated filter coefficient at the n-th iteration; N is the duration of each audio frame, whose value is the filter length; x(n-i) is the observed signal at the (n-i)-th iteration; x*(n-k) is the conjugate of the observed signal at the (n-k)-th iteration; |.| is the modulus function;
calculating the desired signal at the n-th iteration with the formula d(n) = v(n) + Σ_k w_k(n)x(n-k), wherein v(n) is the near-end signal; w_k(n) is the theoretical value of the filter coefficient of the k-th sampling point at the n-th iteration; x(n-k) is the observed signal at the (n-k)-th iteration;
judging whether the desired signal at the n-th iteration has converged; if so, returning to the step of taking the current separated signal as the near-end signal; if not, taking the desired signal at the n-th iteration as the signal after echo cancellation.
An embodiment of the invention provides a speech separation device, the device comprising:
a first obtaining module, configured to obtain the to-be-separated speech data of each signal channel, wherein the to-be-separated speech data contains speech produced by at least two people speaking at the same time;
a second obtaining module, configured to, for each preset sampling instant, perform separation processing on the to-be-separated speech data using a blind source separation algorithm to obtain P separated signals;
a computing module, configured to, for each separated signal, calculate the cross residual coefficient between the current separated signal and each of the other separated signals among the P separated signals, and judge whether the cross residual coefficient is less than a first preset threshold;
a cancellation module, configured to, when the judgment result of the computing module is no, perform echo cancellation, using an echo cancellation algorithm, on all separated signals whose cross residual coefficients are not less than the first preset threshold, and take the set consisting of the processed separated signals and all separated signals whose cross residual coefficients are less than the first preset threshold as the target separated signals;
a setting module, configured to, when the judgment result of the computing module is yes, take the separated signal as a target separated signal.
Optionally, the blind source separation algorithm includes one of, or a combination of: nonlinear principal component analysis, independent component analysis, a neural network algorithm, a maximum entropy algorithm, a minimum mutual information algorithm, and a maximum likelihood algorithm.
Optionally, the second obtaining module is further configured to:
for each channel of to-be-separated speech data, establish, according to the NPCA criterion, a cost function for the to-be-separated speech data: J(W) = E{||x(t) - W^T g(Wx(t))||^2}, wherein
J(W) is the cost of the separation matrix at time t; E{.} is the expectation operator; x(t) is the observed signal of the signal channel corresponding to each microphone; W is the separation matrix; (.)^T is the transpose operation; g(.) is a nonlinear function; t is the current time;
minimize the cost function to obtain the iterative estimate of the separation matrix:
W(t+1) = W(t) + θ*z(t)[x^T(t) - z^T(t)W(t)], wherein
W(t+1) is the separation matrix at time t+1; W(t) is the separation matrix at time t; θ is the iteration step size, and the step size θ(t) at time t is obtained from the step size θ(t-1) at time t-1, a constant ρ, and the gradient ∇J(t) of the cost J(t) at time t; z(t) is the output of the nonlinear function, z(t) = g(W(t)x(t));
iteratively compute the separation matrix of the next instant with the formula W(t+1) = W(t) + θ*z(t)[x^T(t) - z^T(t)W(t)] until the separation matrix converges, to obtain the target separation matrix of each channel of to-be-separated speech data;
obtain the separated signal of the to-be-separated speech data with the formula y(t) = Wx(t), wherein y(t) is the separated signal of the current observed signal.
Optionally, the computing module is further configured to:
calculate the cross residual coefficient ξ_{i,j} of the current separated signal with respect to each other separated signal among the P separated signals from the mixing coefficients and the sound-source signals, wherein
ξ_{i,j} is the cross residual coefficient between the current separated signal of the i-th channel and another separated signal among the P separated signals; i is the channel number of the current separated signal; j is the channel number of the other separated signal among the P separated signals; a_{i,k} is the mixing coefficient between the separated signal of the i-th channel and the k-th separated signal; a_{j,k} is the mixing coefficient between the separated signal of the j-th channel and the k-th separated signal; y_k is the sound-source signal of the k-th channel; Σ is the summation operator.
Optionally, the cancellation module is further configured to:
for each separated signal among the separated signals whose cross residual coefficients are not less than the first preset threshold, take the current separated signal as the near-end signal, and take the other signals among the separated signals whose cross residual coefficients are not less than the first preset threshold as far-end signals;
obtain the error signal with the formula e(n) = d(n) - Σ_{k=0}^{N-1} ŵ_k(n)x(n-k), wherein e(n) is the error signal; d(n) is the desired output signal; N is the duration of each audio frame, whose value is the filter length; k is the index of the sampling point within the audio frame; ŵ_k(n) is the estimated filter coefficient of the k-th sampling point at the n-th iteration; n is the iteration number; x(n-k) is the observed signal at the (n-k)-th iteration;
update the iteration step size μ(n), wherein μ(n), the step size at the n-th iteration, is obtained from the variance σ_v^2 of the near-end signal, the observed signals x(n-i), and the misalignment Λ(n) at the n-th iteration; N is the duration of each audio frame, whose value is the filter length, and k ∈ (0, N);
update the estimated filter coefficients with a normalized update of the form ŵ_k(n+1) = ŵ_k(n) + μ(n)e(n)x*(n-k)/Σ_{i=0}^{N-1}|x(n-i)|^2, wherein ŵ_k(n+1) is the estimated filter coefficient at the (n+1)-th iteration; μ(n) is the iteration step size; ŵ_k(n) is the estimated filter coefficient at the n-th iteration; N is the duration of each audio frame, whose value is the filter length; x(n-i) is the observed signal at the (n-i)-th iteration; x*(n-k) is the conjugate of the observed signal at the (n-k)-th iteration; |.| is the modulus function;
calculate the desired signal at the n-th iteration with the formula d(n) = v(n) + Σ_k w_k(n)x(n-k), wherein v(n) is the near-end signal; w_k(n) is the theoretical value of the filter coefficient of the k-th sampling point at the n-th iteration; x(n-k) is the observed signal at the (n-k)-th iteration;
judge whether the desired signal at the n-th iteration has converged; if so, return to the step of taking the current separated signal as the near-end signal; if not, take the desired signal at the n-th iteration as the signal after echo cancellation.
Compared with the prior art, the present invention has the following advantages:
With embodiments of the present invention, the cross signals remaining in the separated signals can be regarded as echoes of the other sound sources, and an echo cancellation algorithm is then applied to each separated signal, so that the separation effect is improved and the cross-signal residual in the target signal is reduced.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a speech separation method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a speech separation device provided by an embodiment of the present invention.
Specific embodiment
It elaborates below to the embodiment of the present invention, the present embodiment carries out under the premise of the technical scheme of the present invention Implement, the detailed implementation method and specific operation process are given, but protection scope of the present invention is not limited to following implementation Example.
An embodiment of the invention provides a speech separation method and device. The speech separation method provided by the embodiment of the invention is introduced first.
It should first be noted that embodiments of the invention have a wide range of application scenarios, for example: (1) Traditionally, surveillance of public places offers only video monitoring; sound monitoring is impractical because several speakers may talk at the same time in a public place, on top of various background noises, background music, and so on. With embodiments of the invention, both audio and video monitoring can be realized in the security field. (2) Meeting transcription systems that produce meeting minutes in real time have appeared in the industry, but such systems fail when several people speak at once (for example, during a lively discussion), and existing speech recognition systems cannot cope at all with recognizing several speakers' voices. Embodiments of the invention can be applied to intelligent meeting systems. (3) Embodiments of the invention can be applied to general speech denoising: through source separation, the channels carrying normal speech are kept and the channels without normal speech are removed, realizing speech denoising.
Fig. 1 is a schematic flowchart of a speech separation method provided by an embodiment of the present invention. As shown in Fig. 1, the method includes:
S101: obtaining the to-be-separated speech data of each signal channel, wherein the to-be-separated speech data contains speech produced by at least two people speaking at the same time.
Specifically, at least two microphones arranged at different positions are used to obtain the speech data of two or more people speaking at the same time, and each microphone obtains one channel of to-be-separated speech data; for example, microphone-1 obtains to-be-separated speech data-1, microphone-2 obtains to-be-separated speech data-2, and microphone-3 obtains to-be-separated speech data-3.
It can be understood that each channel of to-be-separated speech data corresponds to one signal channel.
S102: for each preset sampling instant, performing separation processing on the to-be-separated speech data using a blind source separation algorithm to obtain P separated signals.
Specifically, the blind source separation algorithm includes one of, or a combination of: nonlinear principal component analysis, independent component analysis, a neural network algorithm, a maximum entropy algorithm, a minimum mutual information algorithm, and a maximum likelihood algorithm.
Specifically, with each channel of to-be-separated speech data from step S101 as input, the following method can be used to obtain the P separated signals for each preset sampling instant. For each channel of to-be-separated speech data, a cost function for the to-be-separated speech data can be established according to the NPCA (nonlinear principal component analysis) criterion:
J(W) = E{||x(t) - W^T g(Wx(t))||^2}, wherein J(W) is the cost of the separation matrix at time t; E{.} is the expectation operator; x(t) is the observed signal of the signal channel corresponding to each microphone; W is the separation matrix; (.)^T is the transpose operation; g(.) is a nonlinear function; t is the current time.
Minimizing the cost function gives the iterative estimate of the separation matrix:
W(t+1) = W(t) + θ*z(t)[x^T(t) - z^T(t)W(t)], wherein
W(t+1) is the separation matrix at time t+1; W(t) is the separation matrix at time t; θ is the iteration step size, and the step size θ(t) at time t is obtained from the step size θ(t-1) at time t-1, a constant ρ, and the gradient ∇J(t) of the cost J(t) at time t; z(t) is the output of the nonlinear function, z(t) = g(W(t)x(t)).
Using the formula W(t+1) = W(t) + θ*z(t)[x^T(t) - z^T(t)W(t)], the separation matrix of the next instant is computed iteratively until the separation matrix converges, yielding the target separation matrix of each channel of to-be-separated speech data.
Using the formula y(t) = Wx(t), the separated signal of the to-be-separated speech data is obtained, wherein y(t) is the separated signal of the current observed signal.
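The NPCA iteration above can be sketched in code. The following is a minimal, illustrative batch implementation under stated assumptions: the observations are PCA-whitened first, the nonlinear function g is taken to be tanh (the patent does not fix a particular g), and the single-sample update W(t+1) = W(t) + θ*z(t)[x^T(t) - z^T(t)W(t)] is averaged over all samples for stability. The fixed step size, the toy source model, and the mixing matrix are illustrative choices, not the patent's.

```python
import numpy as np

def whiten(x):
    """PCA-whiten observations so the NPCA update can assume E[xx^T] = I."""
    x = x - x.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(np.cov(x))
    return (E / np.sqrt(d)) @ E.T @ x

def npca_separate(x, theta=0.1, n_iter=2000, seed=0):
    """Batch form of the NPCA iteration
    W(t+1) = W(t) + theta * z(t) [x^T(t) - z^T(t) W(t)], z(t) = tanh(W x(t)),
    averaged over all samples; x is a (P, T) whitened observation matrix."""
    P, T = x.shape
    rng = np.random.default_rng(seed)
    W = np.linalg.qr(rng.standard_normal((P, P)))[0]   # random orthogonal start
    for _ in range(n_iter):
        z = np.tanh(W @ x)                             # nonlinear outputs z(t)
        W = W + theta * (z @ x.T - (z @ z.T) @ W) / T  # averaged NPCA update
    return W

# toy demo: two strongly sub-Gaussian sources through a fixed mixing matrix
rng = np.random.default_rng(1)
s = np.sign(rng.standard_normal((2, 5000))) * rng.uniform(0.5, 1.0, (2, 5000))
A = np.array([[1.0, 0.6], [0.4, 1.0]])
x = whiten(A @ s)
W = npca_separate(x)
y = W @ x  # y(t) = W x(t): the separated signals

C = np.abs(np.corrcoef(np.vstack([y, s]))[:2, 2:])
print(np.round(C, 2))  # each row should have one dominant entry
```

With sub-Gaussian sources such as the bimodal signals above, each row of y should track exactly one source up to sign and permutation, which is the converged behavior the method relies on before the cross residual coefficients of step S103 are computed.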
For example, P separated signals can be obtained for the 1st sampling instant, P separated signals for the 2nd sampling instant, and so on up to the n-th sampling instant.
It should be emphasized that the aforementioned observed signal refers to each channel of to-be-separated speech data.
S103: for each separated signal, calculating the cross residual coefficient between the current separated signal and each of the other separated signals among the P separated signals, and judging whether the cross residual coefficient is less than a first preset threshold; if not, executing step S104; if so, executing step S105.
Specifically, each obtained separated signal can be denoted s_i(n), wherein i is the signal-channel position corresponding to the separated signal, i.e. the microphone index, and n is the index of the sampling instant in each signal channel. The inventors found that, in practical applications, any separated signal at a given instant can be regarded as being mixed with cross residual signals from the other P-1 channels. Therefore, the cross residual coefficient ξ_{i,j} between the current separated signal and each of the other separated signals among the P separated signals can be calculated from the mixing coefficients and the sound-source signals, wherein
ξ_{i,j} is the cross residual coefficient between the current separated signal of the i-th channel and another separated signal among the P separated signals; i is the channel number of the current separated signal; j is the channel number of the other separated signal among the P separated signals; a_{i,k} is the mixing coefficient between the separated signal of the i-th channel and the k-th separated signal (it can be understood that when k equals i, the value of a_{i,k} is 1); a_{j,k} is the mixing coefficient between the separated signal of the j-th channel and the k-th separated signal; y_k is the sound-source signal of the k-th channel; Σ is the summation operator.
When a cross residual coefficient is not less than the first preset threshold, for example 0.0125, step S104 is executed; when the cross residual coefficient is less than the first preset threshold, step S105 is executed.
In practical applications, t_i(n) can denote a separated signal that needs echo cancellation, wherein i is the signal-channel index corresponding to the separated signal, i.e. the microphone index, n is the index of the sampling instant in each signal channel, and i ∈ (1, 2, ..., Q), Q ≤ P.
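The closed form of the cross residual coefficient is not recoverable from this text, so the sketch below substitutes a hypothetical proxy: the normalized cross-correlation magnitude between separated channels, thresholded exactly as steps S103-S105 describe. The function names, the demo threshold of 0.1, and the leakage setup are illustrative assumptions (the patent's own example threshold for its coefficient is 0.0125).

```python
import numpy as np

def cross_residual_matrix(y):
    """Hypothetical stand-in for the patent's cross residual coefficient:
    the normalized cross-correlation magnitude between each pair of
    separated channels y[i], y[j] (the patent's exact formula uses the
    mixing coefficients a_{i,k}, a_{j,k} and the source signals y_k)."""
    P = y.shape[0]
    xi = np.zeros((P, P))
    for i in range(P):
        for j in range(P):
            if i != j:
                num = abs(np.dot(y[i], y[j]))
                den = np.linalg.norm(y[i]) * np.linalg.norm(y[j])
                xi[i, j] = num / den
    return xi

def needs_echo_cancellation(y, threshold=0.1):
    """Flag channels whose largest cross residual is not less than the
    threshold; flagged channels go to echo cancellation (step S104),
    the rest pass through as target signals (step S105)."""
    xi = cross_residual_matrix(y)
    return xi.max(axis=1) >= threshold

# demo: channels 0 and 1 share leakage, channel 2 is clean
rng = np.random.default_rng(0)
clean = rng.standard_normal((3, 4000))
y = clean.copy()
y[0] += 0.2 * clean[1]          # residual of source 1 left in channel 0
flags = needs_echo_cancellation(y)
print(flags)
```

The leaky pair is flagged for post-processing while the clean channel is passed through untouched, which is the computational saving the patent points out over running echo cancellation on every separated signal.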
S104: performing echo cancellation, using an echo cancellation algorithm, on all separated signals whose cross residual coefficients are not less than the first preset threshold, and taking the set consisting of the processed separated signals and all separated signals whose cross residual coefficients are less than the first preset threshold as the target separated signals.
Specifically, the echo cancellation algorithm includes the frequency-domain MDF algorithm. In practical applications, each separated signal t_i(n) that needs echo cancellation is taken in turn as the near-end signal, the other separated signals that need echo cancellation are taken as far-end signals, and the echo cancellation algorithm is applied.
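The near-end/far-end role rotation described here is a simple loop: each flagged signal t_i takes a turn as the near-end signal while the remaining flagged signals act as far-end references. A minimal sketch, with placeholder signal labels:

```python
def rotate_roles(flagged):
    """For each flagged separated signal t_i, yield it as the near-end
    signal together with all other flagged signals as far-end signals,
    as in the description of step S104."""
    for i, near in enumerate(flagged):
        far = [s for j, s in enumerate(flagged) if j != i]
        yield near, far

# demo with three placeholder labels standing in for t_1(n), t_2(n), t_3(n)
demo = ["t1", "t2", "t3"]
pairs = [(near, far) for near, far in rotate_roles(demo)]
print(pairs)
```

Each flagged channel is thus cleaned against every other flagged channel exactly once per pass.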
Specifically, the performing, using an echo cancellation algorithm, echo cancellation processing on the separated signals whose cross residual coefficients are not less than the first preset threshold comprises:
for each separated signal among the separated signals whose cross residual coefficients are not less than the first preset threshold, taking the current separated signal as the near-end signal, and taking the other signals, among the separated signals whose cross residual coefficients are not less than the first preset threshold, other than the current separated signal, as the far-end signals;
obtaining the error signal using the formula e(n) = d(n) − ∑_{k=0}^{N−1} ŵ_k(n)·x(n−k), wherein e(n) is the error signal; d(n) is the desired output signal; N is the duration corresponding to each audio frame, whose value is the filter length; k is the index of a sampling point within the audio frame; ŵ_k(n) is the filter coefficient corresponding to the k-th sampling point at the n-th iteration; n is the iteration number; x(n−k) is the observation signal at the (n−k)-th iteration;
updating the iteration step size using the formula μ(n) = Λ(n)·∑_{i=0}^{N−1}|x(n−i)|² / (Λ(n)·∑_{i=0}^{N−1}|x(n−i)|² + σ_v²), wherein
μ(n) is the iteration step size at the n-th iteration; σ_v² is the variance of the near-end signal; N is the duration corresponding to each audio frame, whose value is the filter length, and k ∈ (0, N); x(n−i) is the observation signal at the (n−i)-th iteration; Λ(n) is the misalignment at the n-th iteration;
updating the estimates of the filter coefficients using the formula ŵ_k(n+1) = ŵ_k(n) + μ(n)·e(n)·x*(n−k) / ∑_{i=0}^{N−1}|x(n−i)|², wherein
ŵ_k(n+1) is the estimate of the filter coefficient at the (n+1)-th iteration; μ(n) is the iteration step size; ŵ_k(n) is the estimate of the filter coefficient at the n-th iteration; N is the duration corresponding to each audio frame, whose value is the filter length; x(n−i) is the observation signal at the (n−i)-th iteration; x*(n−k) is the conjugate of the observation signal at the (n−k)-th iteration; |·| is the modulus function;
calculating the desired signal at the n-th iteration using the formula d(n) = v(n) + ∑_k w_k(n)·x(n−k), wherein v(n) is the near-end signal; w_k(n) is the theoretical value of the filter coefficient corresponding to the k-th sampling point at the n-th iteration; x(n−k) is the observation signal at the (n−k)-th iteration;
judging whether the desired signal at the n-th iteration converges; if so, returning to and executing the step of taking the current separated signal as the near-end signal; if not, taking the desired signal at the n-th iteration as the signal after echo elimination.
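The adaptive cancellation loop described above (error computation, step-size update, coefficient update) follows the normalized-LMS pattern. A minimal sketch under that reading, with the patent's variable step size simplified to a fixed regularized NLMS step (the function name, tap count, and step value are illustrative assumptions, not taken from the patent):

```python
import numpy as np

def nlms_echo_cancel(far, near, taps=64, mu=0.5, eps=1e-8):
    """Subtract the echo of `far` (the other separated signals, used as the
    far-end reference) from `near` (the current separated signal, used as the
    near-end signal) with an NLMS adaptive filter.
    Returns the error signal e(n) = near(n) - w(n)^T x(n)."""
    w = np.zeros(taps)
    out = np.zeros_like(near)
    x = np.zeros(taps)                   # sliding window of the reference signal
    for n in range(len(near)):
        x[1:] = x[:-1]
        x[0] = far[n]
        y = w @ x                        # echo estimate
        e = near[n] - y                  # error signal e(n)
        w += mu * e * x / (x @ x + eps)  # normalized coefficient update
        out[n] = e
    return out
```

When `near` really is an echo of `far`, the residual energy of the returned error signal drops sharply once the filter has adapted, which is the behavior the selective post-processing relies on.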
The separated signals after the echo cancellation processing, together with the separated signals all of whose cross residual coefficients are less than the first preset threshold, that is, the set of separated signals that do not need echo cancellation processing, are taken as the target separated signals.
S105: taking the separated signals as the target separated signals.
The separated signals whose cross residual coefficients are less than the first preset threshold, that is, the set of separated signals that do not need echo cancellation processing, are taken as the target separated signals.
It should be noted that the P separated signals at the (n+1)-th moment are also processed according to the method described above. The target separated signals at each moment are thus finally obtained.
With the embodiment of the present invention shown in Fig. 1, the cross signals remaining in the separated signals can be regarded as echoes of the other sound sources, and an echo cancellation algorithm is then used to perform echo cancellation processing on each such separated signal, so that the separation effect is improved and the cross-signal residue in the target signal is reduced.
In addition, if echo cancellation were used to post-process every separated signal after blind source separation, a great amount of extra computation would be incurred. With the embodiment of the present invention, which signals after blind source separation are suitable for echo cancellation processing can be effectively judged, improving the separation effect and thereby effectively improving the working efficiency of the whole system.
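The selective post-processing that yields this saving can be sketched as follows. Here `cross_residual` is a hypothetical stand-in (a normalized cross-correlation magnitude) for the patent's cross residual coefficient, and all function names are illustrative:

```python
import numpy as np

def cross_residual(a, b):
    """Hypothetical stand-in for the patent's cross residual coefficient:
    the normalized cross-correlation magnitude between two separated signals."""
    return abs(np.dot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def select_targets(signals, threshold, cancel):
    """Run `cancel(current, others)` only on separated signals whose cross
    residual with some other separated signal reaches `threshold`; pass the
    rest through unchanged. Returns the target separated signals."""
    out = []
    for i, s in enumerate(signals):
        others = [t for j, t in enumerate(signals) if j != i]
        if any(cross_residual(s, t) >= threshold for t in others):
            out.append(cancel(s, others))   # residue too high: cancel echoes
        else:
            out.append(s)                   # already clean enough: keep as-is
    return out
```

Signals whose cross residuals all fall below the threshold skip the adaptive filtering entirely, which is where the computational saving comes from.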
Corresponding to the embodiment of the present invention shown in Fig. 1, an embodiment of the present invention further provides a speech separation device.
Fig. 2 is a structural schematic diagram of a speech separation device provided by an embodiment of the present invention. As shown in Fig. 2, the device includes:
a first obtaining module 201, configured to obtain the to-be-separated voice data of each signal channel, wherein the to-be-separated voice data contains voice data generated when at least two people speak simultaneously;
a second obtaining module 202, configured to, for each preset sampling moment, perform separation processing on the to-be-separated voice data using a blind source separation algorithm to obtain P separated signals;
a computing module 203, configured to, for each separated signal, calculate the cross residual coefficients between the current separated signal and the other separated signals, among the P separated signals, other than the current separated signal, and judge whether the cross residual coefficients are less than a first preset threshold;
a cancellation module 204, configured to, when the judgment result of the computing module is no, perform echo cancellation processing, using an echo cancellation algorithm, on the separated signals whose cross residual coefficients are not less than the first preset threshold, and take the processed separated signals, together with the set of separated signals whose cross residual coefficients are all less than the first preset threshold, as the target separated signals;
a setting module 205, configured to, when the judgment result of the computing module is yes, take the separated signals as the target separated signals.
With the embodiment of the present invention shown in Fig. 2, the cross signals remaining in the separated signals can be regarded as echoes of the other sound sources, and an echo cancellation algorithm is then used to perform echo cancellation processing on each such separated signal, so that the separation effect is improved and the cross-signal residue in the target signal is reduced.
In a specific implementation of the embodiment of the present invention, the blind source separation algorithm includes one or a combination of: nonlinear principal component analysis, independent component analysis, a neural network algorithm, a maximum entropy algorithm, a minimum mutual information algorithm, and a maximum likelihood algorithm.
In a specific implementation of the embodiment of the present invention, the second obtaining module 202 is further configured to:
for each piece of to-be-separated voice data, establish, using the NPCA criterion, the cost function J(W) = E{‖x(t) − Wᵀ·g(W·x(t))‖²} for the to-be-separated voice data, wherein
J(W) is the cost of the separation matrix at moment t; E{·} is the expectation operator; x(t) is the observation signal observed on the signal channel corresponding to each microphone; W is the separation matrix; (·)ᵀ is the transpose operation; g(·) is the nonlinear function; t is the current moment;
perform minimization processing on the cost function to obtain the iterative estimate of the separation matrix:
W(t+1) = W(t) + θ·z(t)·[xᵀ(t) − zᵀ(t)·W(t)], wherein
W(t+1) is the separation matrix at moment t+1; W(t) is the separation matrix at moment t; θ is the iteration step size, with θ(t) = θ(t−1) − ρ·∇_θ(J(t)|_{θ=θ(t−1)}), where θ(t) is the iteration step size at moment t, θ(t−1) is the iteration step size at moment t−1, ρ is a constant, ∇_θ is the gradient operator, and J(t) is the cost at moment t; z(t) is the nonlinear output, z(t) = g(W(t)·x(t));
iteratively calculate, using the formula W(t+1) = W(t) + θ·z(t)·[xᵀ(t) − zᵀ(t)·W(t)], the separation matrix at the next moment until the separation matrix converges, obtaining the target separation matrix of each piece of to-be-separated voice data;
obtain, using the formula y(t) = W·x(t), the separated signal of the to-be-separated voice data, wherein y(t) is the separated signal of the current observation signal.
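The NPCA iteration above can be sketched directly from the update rule W(t+1) = W(t) + θ·z(t)·[xᵀ(t) − zᵀ(t)·W(t)]. The nonlinearity g is not fixed by the text, so tanh is assumed here, and the step size is held constant rather than adapted; function and parameter names are illustrative:

```python
import numpy as np

def npca_separate(X, theta=0.01, iters=200, seed=0):
    """Sketch of the NPCA update W <- W + theta * z (x^T - z^T W),
    with z(t) = g(W x(t)) and g = tanh assumed as the nonlinearity.
    X is a (channels, samples) array of mixed observations."""
    rng = np.random.default_rng(seed)
    m, T = X.shape
    W = np.eye(m) + 0.1 * rng.standard_normal((m, m))  # initial separation matrix
    for _ in range(iters):
        for t in range(T):
            x = X[:, t:t + 1]          # column vector x(t)
            z = np.tanh(W @ x)         # z(t) = g(W x(t))
            W = W + theta * z @ (x.T - z.T @ W)
    return W, W @ X                    # separation matrix and y(t) = W x(t)
```

In the patent's scheme the loop would run until W stops changing between iterations (convergence), rather than for a fixed `iters` count.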
In a specific implementation of the embodiment of the present invention, the computing module 203 is further configured to:
calculate, using the formula c_{i,j} = ∑_k a_{i,k}·a_{j,k}·y_k², the cross residual coefficients between the current separated signal and the other separated signals, among the P separated signals, other than the current separated signal, wherein
c_{i,j} is the cross residual coefficient between the current separated signal of the i-th channel and the other separated signals, among the P separated signals, other than the current separated signal; i is the channel number of the current separated signal; j is the channel number of the other separated signals, among the P separated signals, other than the current separated signal; a_{i,k} is the mixing coefficient between the separated signal of the i-th channel and the k-th separated signal; a_{j,k} is the mixing coefficient between the separated signal of the j-th channel and the k-th separated signal; y_k is the sound source signal of the k-th channel; ∑ is the summation function.
In a specific implementation of the embodiment of the present invention, the cancellation module 204 is further configured to:
for each separated signal among the separated signals whose cross residual coefficients are not less than the first preset threshold, take the current separated signal as the near-end signal, and take the other signals, among the separated signals whose cross residual coefficients are not less than the first preset threshold, other than the current separated signal, as the far-end signals;
obtain the error signal using the formula e(n) = d(n) − ∑_{k=0}^{N−1} ŵ_k(n)·x(n−k), wherein e(n) is the error signal; d(n) is the desired output signal; N is the duration corresponding to each audio frame, whose value is the filter length; k is the index of a sampling point within the audio frame; ŵ_k(n) is the filter coefficient corresponding to the k-th sampling point at the n-th iteration; n is the iteration number; x(n−k) is the observation signal at the (n−k)-th iteration;
update the iteration step size using the formula μ(n) = Λ(n)·∑_{i=0}^{N−1}|x(n−i)|² / (Λ(n)·∑_{i=0}^{N−1}|x(n−i)|² + σ_v²), wherein
μ(n) is the iteration step size at the n-th iteration; σ_v² is the variance of the near-end signal; N is the duration corresponding to each audio frame, whose value is the filter length, and k ∈ (0, N); x(n−i) is the observation signal at the (n−i)-th iteration; Λ(n) is the misalignment at the n-th iteration;
update the estimates of the filter coefficients using the formula ŵ_k(n+1) = ŵ_k(n) + μ(n)·e(n)·x*(n−k) / ∑_{i=0}^{N−1}|x(n−i)|², wherein
ŵ_k(n+1) is the estimate of the filter coefficient at the (n+1)-th iteration; μ(n) is the iteration step size; ŵ_k(n) is the estimate of the filter coefficient at the n-th iteration; N is the duration corresponding to each audio frame, whose value is the filter length; x(n−i) is the observation signal at the (n−i)-th iteration; x*(n−k) is the conjugate of the observation signal at the (n−k)-th iteration; |·| is the modulus function;
calculate the desired signal at the n-th iteration using the formula d(n) = v(n) + ∑_k w_k(n)·x(n−k), wherein v(n) is the near-end signal; w_k(n) is the theoretical value of the filter coefficient corresponding to the k-th sampling point at the n-th iteration; x(n−k) is the observation signal at the (n−k)-th iteration;
judge whether the desired signal at the n-th iteration converges; if so, return to and execute the step of taking the current separated signal as the near-end signal; if not, take the desired signal at the n-th iteration as the signal after echo elimination.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modifications, equivalent replacements, improvements, and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (10)

1. A speech separation method, characterized in that the method comprises:
obtaining the to-be-separated voice data of each signal channel, wherein the to-be-separated voice data contains voice data generated when at least two people speak simultaneously;
for each preset sampling moment, performing separation processing on the to-be-separated voice data using a blind source separation algorithm to obtain P separated signals;
for each separated signal, calculating the cross residual coefficients between the current separated signal and the other separated signals, among the P separated signals, other than the current separated signal, and judging whether the cross residual coefficients are less than a first preset threshold;
if not, performing echo cancellation processing, using an echo cancellation algorithm, on the separated signals whose cross residual coefficients are not less than the first preset threshold, and taking the processed separated signals, together with the set of separated signals whose cross residual coefficients are all less than the first preset threshold, as the target separated signals;
if so, taking the separated signals as the target separated signals.
2. The speech separation method according to claim 1, characterized in that the blind source separation algorithm comprises one or a combination of: nonlinear principal component analysis, independent component analysis, a neural network algorithm, a maximum entropy algorithm, a minimum mutual information algorithm, and a maximum likelihood algorithm.
3. The speech separation method according to claim 1, characterized in that the performing separation processing on the to-be-separated voice data using a blind source separation algorithm comprises:
for each piece of to-be-separated voice data, establishing, using the NPCA criterion, the cost function J(W) = E{‖x(t) − Wᵀ·g(W·x(t))‖²} for the to-be-separated voice data, wherein
J(W) is the cost of the separation matrix at moment t; E{·} is the expectation operator; x(t) is the observation signal observed on the signal channel corresponding to each microphone; W is the separation matrix; (·)ᵀ is the transpose operation; g(·) is the nonlinear function; t is the current moment;
performing minimization processing on the cost function to obtain the iterative estimate of the separation matrix:
W(t+1) = W(t) + θ·z(t)·[xᵀ(t) − zᵀ(t)·W(t)], wherein
W(t+1) is the separation matrix at moment t+1; W(t) is the separation matrix at moment t; θ is the iteration step size, with θ(t) = θ(t−1) − ρ·∇_θ(J(t)|_{θ=θ(t−1)}), where θ(t) is the iteration step size at moment t, θ(t−1) is the iteration step size at moment t−1, ρ is a constant, ∇_θ is the gradient operator, and J(t) is the cost at moment t; z(t) is the nonlinear output, z(t) = g(W(t)·x(t));
iteratively calculating, using the formula W(t+1) = W(t) + θ·z(t)·[xᵀ(t) − zᵀ(t)·W(t)], the separation matrix at the next moment until the separation matrix converges, obtaining the target separation matrix of each piece of to-be-separated voice data;
obtaining, using the formula y(t) = W·x(t), the separated signal of the to-be-separated voice data, wherein y(t) is the separated signal of the current observation signal.
4. The speech separation method according to claim 1, characterized in that the calculating the cross residual coefficients between the current separated signal and the other separated signals, among the P separated signals, other than the current separated signal comprises:
calculating, using the formula c_{i,j} = ∑_k a_{i,k}·a_{j,k}·y_k², the cross residual coefficients between the current separated signal and the other separated signals, among the P separated signals, other than the current separated signal, wherein
c_{i,j} is the cross residual coefficient between the current separated signal of the i-th channel and the other separated signals, among the P separated signals, other than the current separated signal; i is the channel number of the current separated signal; j is the channel number of the other separated signals, among the P separated signals, other than the current separated signal; a_{i,k} is the mixing coefficient between the separated signal of the i-th channel and the k-th separated signal; a_{j,k} is the mixing coefficient between the separated signal of the j-th channel and the k-th separated signal; y_k is the sound source signal of the k-th channel; ∑ is the summation function.
5. The speech separation method according to claim 1, characterized in that the performing echo cancellation processing, using an echo cancellation algorithm, on the separated signals whose cross residual coefficients are not less than the first preset threshold comprises:
for each separated signal among the separated signals whose cross residual coefficients are not less than the first preset threshold, taking the current separated signal as the near-end signal, and taking the other signals, among the separated signals whose cross residual coefficients are not less than the first preset threshold, other than the current separated signal, as the far-end signals;
obtaining the error signal using the formula e(n) = d(n) − ∑_{k=0}^{N−1} ŵ_k(n)·x(n−k), wherein e(n) is the error signal; d(n) is the desired output signal; N is the duration corresponding to each audio frame, whose value is the filter length; k is the index of a sampling point within the audio frame; ŵ_k(n) is the filter coefficient corresponding to the k-th sampling point at the n-th iteration; n is the iteration number; x(n−k) is the observation signal at the (n−k)-th iteration;
updating the iteration step size using the formula μ(n) = Λ(n)·∑_{i=0}^{N−1}|x(n−i)|² / (Λ(n)·∑_{i=0}^{N−1}|x(n−i)|² + σ_v²), wherein
μ(n) is the iteration step size at the n-th iteration; σ_v² is the variance of the near-end signal; N is the duration corresponding to each audio frame, whose value is the filter length, and k ∈ (0, N); x(n−i) is the observation signal at the (n−i)-th iteration; Λ(n) is the misalignment at the n-th iteration;
updating the estimates of the filter coefficients using the formula ŵ_k(n+1) = ŵ_k(n) + μ(n)·e(n)·x*(n−k) / ∑_{i=0}^{N−1}|x(n−i)|², wherein
ŵ_k(n+1) is the estimate of the filter coefficient at the (n+1)-th iteration; μ(n) is the iteration step size; ŵ_k(n) is the estimate of the filter coefficient at the n-th iteration; N is the duration corresponding to each audio frame, whose value is the filter length; x(n−i) is the observation signal at the (n−i)-th iteration; x*(n−k) is the conjugate of the observation signal at the (n−k)-th iteration; |·| is the modulus function;
calculating the desired signal at the n-th iteration using the formula d(n) = v(n) + ∑_k w_k(n)·x(n−k), wherein v(n) is the near-end signal; w_k(n) is the theoretical value of the filter coefficient corresponding to the k-th sampling point at the n-th iteration; x(n−k) is the observation signal at the (n−k)-th iteration;
judging whether the desired signal at the n-th iteration converges; if so, returning to and executing the step of taking the current separated signal as the near-end signal; if not, taking the desired signal at the n-th iteration as the signal after echo elimination.
6. A speech separation device, characterized in that the device comprises:
a first obtaining module, configured to obtain the to-be-separated voice data of each signal channel, wherein the to-be-separated voice data contains voice data generated when at least two people speak simultaneously;
a second obtaining module, configured to, for each preset sampling moment, perform separation processing on the to-be-separated voice data using a blind source separation algorithm to obtain P separated signals;
a computing module, configured to, for each separated signal, calculate the cross residual coefficients between the current separated signal and the other separated signals, among the P separated signals, other than the current separated signal, and judge whether the cross residual coefficients are less than a first preset threshold;
a cancellation module, configured to, when the judgment result of the computing module is no, perform echo cancellation processing, using an echo cancellation algorithm, on the separated signals whose cross residual coefficients are not less than the first preset threshold, and take the processed separated signals, together with the set of separated signals whose cross residual coefficients are all less than the first preset threshold, as the target separated signals;
a setting module, configured to, when the judgment result of the computing module is yes, take the separated signals as the target separated signals.
7. The speech separation device according to claim 6, characterized in that the blind source separation algorithm comprises one or a combination of: nonlinear principal component analysis, independent component analysis, a neural network algorithm, a maximum entropy algorithm, a minimum mutual information algorithm, and a maximum likelihood algorithm.
8. The speech separation device according to claim 6, characterized in that the second obtaining module is further configured to:
for each piece of to-be-separated voice data, establish, using the NPCA criterion, the cost function J(W) = E{‖x(t) − Wᵀ·g(W·x(t))‖²} for the to-be-separated voice data, wherein
J(W) is the cost of the separation matrix at moment t; E{·} is the expectation operator; x(t) is the observation signal observed on the signal channel corresponding to each microphone; W is the separation matrix; (·)ᵀ is the transpose operation; g(·) is the nonlinear function; t is the current moment;
perform minimization processing on the cost function to obtain the iterative estimate of the separation matrix:
W(t+1) = W(t) + θ·z(t)·[xᵀ(t) − zᵀ(t)·W(t)], wherein
W(t+1) is the separation matrix at moment t+1; W(t) is the separation matrix at moment t; θ is the iteration step size, with θ(t) = θ(t−1) − ρ·∇_θ(J(t)|_{θ=θ(t−1)}), where θ(t) is the iteration step size at moment t, θ(t−1) is the iteration step size at moment t−1, ρ is a constant, ∇_θ is the gradient operator, and J(t) is the cost at moment t; z(t) is the nonlinear output, z(t) = g(W(t)·x(t));
iteratively calculate, using the formula W(t+1) = W(t) + θ·z(t)·[xᵀ(t) − zᵀ(t)·W(t)], the separation matrix at the next moment until the separation matrix converges, obtaining the target separation matrix of each piece of to-be-separated voice data;
obtain, using the formula y(t) = W·x(t), the separated signal of the to-be-separated voice data, wherein y(t) is the separated signal of the current observation signal.
9. The speech separation device according to claim 6, characterized in that the computing module is further configured to:
calculate, using the formula c_{i,j} = ∑_k a_{i,k}·a_{j,k}·y_k², the cross residual coefficients between the current separated signal and the other separated signals, among the P separated signals, other than the current separated signal, wherein
c_{i,j} is the cross residual coefficient between the current separated signal of the i-th channel and the other separated signals, among the P separated signals, other than the current separated signal; i is the channel number of the current separated signal; j is the channel number of the other separated signals, among the P separated signals, other than the current separated signal; a_{i,k} is the mixing coefficient between the separated signal of the i-th channel and the k-th separated signal; a_{j,k} is the mixing coefficient between the separated signal of the j-th channel and the k-th separated signal; y_k is the sound source signal of the k-th channel; ∑ is the summation function.
10. The speech separation device according to claim 6, characterized in that the cancellation module is further configured to:
for each separated signal among the separated signals whose cross residual coefficients are not less than the first preset threshold, take the current separated signal as the near-end signal, and take the other signals, among the separated signals whose cross residual coefficients are not less than the first preset threshold, other than the current separated signal, as the far-end signals;
obtain the error signal using the formula e(n) = d(n) − ∑_{k=0}^{N−1} ŵ_k(n)·x(n−k), wherein e(n) is the error signal; d(n) is the desired output signal; N is the duration corresponding to each audio frame, whose value is the filter length; k is the index of a sampling point within the audio frame; ŵ_k(n) is the filter coefficient corresponding to the k-th sampling point at the n-th iteration; n is the iteration number; x(n−k) is the observation signal at the (n−k)-th iteration;
update the iteration step size using the formula μ(n) = Λ(n)·∑_{i=0}^{N−1}|x(n−i)|² / (Λ(n)·∑_{i=0}^{N−1}|x(n−i)|² + σ_v²), wherein
μ(n) is the iteration step size at the n-th iteration; σ_v² is the variance of the near-end signal; N is the duration corresponding to each audio frame, whose value is the filter length, and k ∈ (0, N); x(n−i) is the observation signal at the (n−i)-th iteration; Λ(n) is the misalignment at the n-th iteration;
update the estimates of the filter coefficients using the formula ŵ_k(n+1) = ŵ_k(n) + μ(n)·e(n)·x*(n−k) / ∑_{i=0}^{N−1}|x(n−i)|², wherein
ŵ_k(n+1) is the estimate of the filter coefficient at the (n+1)-th iteration; μ(n) is the iteration step size; ŵ_k(n) is the estimate of the filter coefficient at the n-th iteration; N is the duration corresponding to each audio frame, whose value is the filter length; x(n−i) is the observation signal at the (n−i)-th iteration; x*(n−k) is the conjugate of the observation signal at the (n−k)-th iteration; |·| is the modulus function;
calculate the desired signal at the n-th iteration using the formula d(n) = v(n) + ∑_k w_k(n)·x(n−k), wherein v(n) is the near-end signal; w_k(n) is the theoretical value of the filter coefficient corresponding to the k-th sampling point at the n-th iteration; x(n−k) is the observation signal at the (n−k)-th iteration;
judge whether the desired signal at the n-th iteration converges; if so, return to and execute the step of taking the current separated signal as the near-end signal; if not, take the desired signal at the n-th iteration as the signal after echo elimination.
CN201810820474.9A 2018-07-24 2018-07-24 Voice separation method and device Active CN108962276B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810820474.9A CN108962276B (en) 2018-07-24 2018-07-24 Voice separation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810820474.9A CN108962276B (en) 2018-07-24 2018-07-24 Voice separation method and device

Publications (2)

Publication Number Publication Date
CN108962276A true CN108962276A (en) 2018-12-07
CN108962276B CN108962276B (en) 2020-11-17

Family

ID=64464704

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810820474.9A Active CN108962276B (en) 2018-07-24 2018-07-24 Voice separation method and device

Country Status (1)

Country Link
CN (1) CN108962276B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020223952A1 (en) * 2019-05-09 2020-11-12 广东省智能制造研究所 Semi-non-negative matrix factorization-based sound signal separation method
CN113362847A (en) * 2021-05-26 2021-09-07 北京小米移动软件有限公司 Audio signal processing method and device and storage medium
CN113470689A (en) * 2021-08-23 2021-10-01 杭州国芯科技股份有限公司 Voice separation method

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101894558A (en) * 2010-08-04 2010-11-24 华为技术有限公司 Lost frame recovering method and equipment as well as speech enhancing method, equipment and system
CN101917527A (en) * 2010-09-02 2010-12-15 杭州华三通信技术有限公司 Method and device of echo elimination
CN101964670A (en) * 2009-07-21 2011-02-02 雅马哈株式会社 Echo suppression method and apparatus thereof
CN102142259A (en) * 2010-01-28 2011-08-03 三星电子株式会社 Signal separation system and method for automatically selecting threshold to separate sound source
CN103188184A (en) * 2012-12-17 2013-07-03 中国人民解放军理工大学 NPCA (Nonlinear Principal Component Analysis)-based self-adaptive variable step size blind source separation method
US20140105410A1 (en) * 2012-10-12 2014-04-17 Huawei Technologies Co., Ltd. Echo cancellation method and device
CN103780522A (en) * 2014-01-08 2014-05-07 西安电子科技大学 Non-orthogonal joint diagonalization instantaneous blind source separation method based on double iteration
CN105845148A (en) * 2016-03-16 2016-08-10 重庆邮电大学 Convolution blind source separation method based on frequency point correction
CN106057210A (en) * 2016-07-01 2016-10-26 山东大学 Quick speech blind source separation method based on frequency point selection under binaural distance
CN106898361A (en) * 2017-03-16 2017-06-27 杭州电子科技大学 Single channel blind source separation method based on feedback variation Mode Decomposition
CN107316650A (en) * 2016-04-26 2017-11-03 诺基亚技术有限公司 Method, device and the computer program of the modification of the feature associated on the audio signal with separating
US20180182412A1 (en) * 2016-12-28 2018-06-28 Google Inc. Blind source separation using similarity measure
CN108231087A (en) * 2017-12-14 2018-06-29 宁波升维信息技术有限公司 A kind of single channel blind source separating method


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LI HONG ET AL.: "An adaptive echo cancellation method based on a blind signal separation", 《2010 INTERNATIONAL CONFERENCE ON ELECTRICAL AND CONTROL ENGINEERING》 *
MUHAMMAD Z. IKRAM ET AL.: "BLIND SOURCE SEPARATION AND ACOUSTIC ECHO CANCELLATION: A UNIFIED FRAMEWORK", 《2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS,SPEECH AND SIGNAL PROCESSING(ICASSP)》 *


Also Published As

Publication number Publication date
CN108962276B (en) 2020-11-17

Similar Documents

Publication Publication Date Title
Buchner et al. TRINICON: A versatile framework for multichannel blind signal processing
Ortega-García et al. Overview of speech enhancement techniques for automatic speaker recognition
CN108962276A (en) A kind of speech separating method and device
CN1914683A (en) Methods and apparatus for blind separation of multichannel convolutive mixtures in the frequency domain
WO2020256257A3 (en) Combined learning method and device using transformed loss function and feature enhancement based on deep neural network for speaker recognition that is robust to noisy environment
CN1261759A (en) Adding blind source separate technology to hearing aid
US20090043588A1 (en) Sound-source separation system
Braun et al. Task splitting for dnn-based acoustic echo and noise removal
Wang et al. NN3A: Neural network supported acoustic echo cancellation, noise suppression and automatic gain control for real-time communications
CN106782592B (en) System and method for eliminating echo and howling of network sound transmission
KR100446626B1 (en) Noise suppression method and apparatus
CN110992966B (en) Human voice separation method and system
Takeda et al. Exploiting known sound source signals to improve ICA-based robot audition in speech separation and recognition
CN108564961A (en) A kind of voice de-noising method of mobile communication equipment
KR101936242B1 (en) Apparatus and method for noise removal, and recording medium thereof
Krueger et al. Bayesian Feature Enhancement for ASR of Noisy Reverberant Real-World Data.
Rincón-Trujillo et al. Analysis of Speech Separation Methods based on Deep Learning.
Lakra et al. Selective noise filtering of speech signals using an adaptive neuro-fuzzy inference system as a frequency pre-classifier
Kokkinakis et al. Multichannel speech separation using adaptive parameterization of source PDFs
CN116648747A (en) Apparatus for providing a processed audio signal, method for providing a processed audio signal, apparatus for providing a neural network parameter and method for providing a neural network parameter
Adasme et al. Proposed Integration Algorithm to Optimize the Separation of Audio Signals Using the ICA and Wavelet Transform
Jadhav et al. Blind source separation: trends of new age-a review
Nezamdoust et al. Frequency-Domain Functional Links For Nonlinear Feedback Cancellation In Hearing Aids
Aoulass et al. Noise Reduction using DUET algorithm for dual-microphone mobile station
Harding et al. Mask estimation based on sound localisation for missing data speech recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201027

Address after: 310000 station 07, room 704, building 8, No. 20, Keji Garden Road, Baiyang street, Qiantang New District, Hangzhou City, Zhejiang Province

Applicant after: Hangzhou audiometry Technology Co.,Ltd.

Address before: 100176 FC-3, 6th floor, No. 5 Building, 2 Ronghua South Road, Daxing Economic and Technological Development Zone, Beijing

Applicant before: BEIJING SINWT SCIENCE & TECHNOLOGY Co.,Ltd.

GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: Room 703-1, building 8, No.20, kekeyuan Road, Qiantang New District, Hangzhou City, Zhejiang Province

Patentee after: Hangzhou audiometry Technology Co.,Ltd.

Address before: 310000 station 07, room 704, building 8, No. 20, Keji Garden Road, Baiyang street, Qiantang New District, Hangzhou City, Zhejiang Province

Patentee before: Hangzhou audiometry Technology Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20231213

Address after: 100176 1201-09, 12 / F, building 2, yard 1, No. 29, Kechuang 13th Street, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Patentee after: BEIJING SINWT SCIENCE & TECHNOLOGY Co.,Ltd.

Address before: Room 703-1, building 8, No.20, kekeyuan Road, Qiantang New District, Hangzhou City, Zhejiang Province

Patentee before: Hangzhou audiometry Technology Co.,Ltd.