CN106782595B - Robust blocking matrix method for reducing voice leakage - Google Patents


Info

Publication number
CN106782595B
Authority
CN
China
Prior art keywords
signal
module
noise
blocking matrix
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611218157.7A
Other languages
Chinese (zh)
Other versions
CN106782595A (en
Inventor
曹裕行
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Shanghai Intelligent Technology Co Ltd
Original Assignee
Unisound Shanghai Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unisound Shanghai Intelligent Technology Co Ltd filed Critical Unisound Shanghai Intelligent Technology Co Ltd
Priority to CN201611218157.7A priority Critical patent/CN106782595B/en
Publication of CN106782595A publication Critical patent/CN106782595A/en
Application granted granted Critical
Publication of CN106782595B publication Critical patent/CN106782595B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Abstract

The invention discloses a robust blocking matrix method for reducing voice leakage, comprising the following steps: inputting a sound signal; obtaining a target speech signal from the sound signal using a fixed beam module; removing the target speech signal from the sound signal using a blocking matrix module to obtain a noise signal; estimating, using the fixed beam module, the prior probability that the target speech signal is present in the noise signal; updating the noise signal in the blocking matrix module according to the prior probability, so that any target speech signal remaining in the noise signal is removed and an updated noise signal is obtained; and using a cancellation module to remove the noise signal output by the blocking matrix module from the target speech signal output by the fixed beam module, forming the output signal. Before the cancellation module removes the residual noise from the target speech signal, the blocking matrix parameters of the blocking matrix module are updated so that target speech that leaked into the noise signal is removed, which strengthens the blocking matrix module's ability to block the target speech signal.

Description

Robust blocking matrix method for reducing voice leakage
Technical Field
The invention relates to the field of voice recognition, in particular to a robust blocking matrix method for reducing voice leakage.
Background
Microphone-array speech enhancement is widely used in communication, human-machine interaction, speech recognition systems and the like. Among such techniques the Generalized Sidelobe Cancellation (GSC) method is the most widely used, because it is easy to implement and performs well. The GSC has an upper path and a lower path: the upper path is a fixed beam module (FBF) that estimates a reference signal of the target speech, and the lower path consists of a blocking matrix module (BM) and a cancellation module (MC) that remove the residual noise in the fixed beam output. The blocking matrix module removes the target speech signal to obtain a noise signal.
In many practical systems, the main cause of GSC performance degradation is speech leakage in the BM module: the BM fails to block the target speech signal completely, so the leaked speech is subtracted from the speech signal in the FBF output and part of the target speech is cancelled. Conventional BM designs use either an adaptive BM or a fixed difference matrix. The performance of a difference matrix drops sharply when the microphone array has errors or the target direction is misestimated, while an adaptive BM is sensitive to the step size of the adaptive weight update, and its convergence is a significant problem.
Disclosure of Invention
The invention aims to solve the technical problem of providing a robust blocking matrix method for reducing voice leakage, which can greatly reduce the voice leakage condition.
In order to achieve the technical effect, the invention discloses a robust blocking matrix method for reducing voice leakage, which comprises the following steps:
providing a sound signal;
inputting the sound signal into a fixed beam module and a blocking matrix module of a generalized side lobe cancellation structure, wherein the generalized side lobe cancellation structure is provided with a first channel and a second channel which are connected in parallel, the fixed beam module is positioned on the first channel, and the blocking matrix module is positioned on the second channel; the second path is also provided with a cancellation module, the input of the cancellation module is connected with the output of the blocking matrix module, and the output of the cancellation module is connected with the output of the fixed beam module;
acquiring a target voice signal from the input voice signal by using the fixed beam module, and outputting the target voice signal;
eliminating a target voice signal from the input voice signal by using the blocking matrix module to obtain a noise signal;
estimating, with the fixed beam module, a prior probability of a target speech signal being present in the noise signal;
the blocking matrix module updates the noise signal according to the prior probability, eliminates a target voice signal existing in the noise signal, obtains an updated noise signal and outputs the updated noise signal;
and eliminating the noise signal output by the blocking matrix module from the target voice signal output by the fixed beam module by using the eliminating module to form an output signal and output the output signal.
With the above technical scheme, the invention has the following beneficial effects. Before the cancellation module cancels the noise signal output by the blocking matrix module against the target speech signal output by the fixed beam module to remove the residual noise in the target speech signal, the prior probability that the target speech signal is present in the noise signal output by the blocking matrix module is estimated, and the blocking matrix parameters of the blocking matrix module are updated accordingly. This removes the target speech that leaked into the noise signal and strengthens the blocking matrix module's ability to block the target speech signal. It avoids the situation in which the blocking matrix module does not completely block the target speech signal, so that the leaked target speech is subtracted from, and cancels, the target speech signal in the fixed beam module. Voice leakage is thereby greatly reduced.
The robust blocking matrix method for reducing the voice leakage is further improved in that the voice two-state model of the voice signal is as follows:
$H_0: X = N$
$H_1: X = S + N$ (Formula 1)
where the state $H_0$ represents that only noise is present, $N$ is the noise signal, the state $H_1$ represents that both the noise signal and the target speech signal are present, and $S$ is the target speech signal.
The robust blocking matrix method for reducing voice leakage is further improved in that the sound signal is a microphone input signal, and the fixed beam module obtains the target speech signal from the microphone input signal and outputs it; the output $Y_{FBF}$ of the fixed beam module is:
$Y_{FBF} = \sum_{i=1}^{M} w_i x_i$ (Formula 2)
where $M$ is the number of microphones, $x_i$ is the $i$-th microphone input signal, $w$ is the weight vector of the fixed beam module, and $w_i$ is the weight of the $i$-th fixed beam.
The robust blocking matrix method for reducing voice leakage is further improved in that the weight w of the fixed beam module is obtained by calculation through a delay summation method or a minimum sidelobe method.
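As a rough illustration of the fixed beam output as a weighted sum over microphones, the following Python sketch uses uniform delay-and-sum weights $w_i = 1/M$. It assumes real-valued time-domain samples whose channels are already time-aligned toward the target; the function names and the test data are hypothetical and are not taken from the patent.

```python
def delay_and_sum_weights(num_mics):
    """Uniform weights w_i = 1/M, the delay-and-sum choice for pre-aligned channels."""
    if num_mics < 1:
        raise ValueError("need at least one microphone")
    return [1.0 / num_mics] * num_mics


def fixed_beam_output(weights, mic_signals):
    """Return Y_FBF[t] = sum_i w_i * x_i[t] for every sample index t."""
    if len(weights) != len(mic_signals):
        raise ValueError("one weight per microphone is required")
    num_samples = len(mic_signals[0])
    if any(len(ch) != num_samples for ch in mic_signals):
        raise ValueError("all channels must have the same length")
    return [
        sum(w * ch[t] for w, ch in zip(weights, mic_signals))
        for t in range(num_samples)
    ]


# Hypothetical data: the same target on three aligned microphones plus
# channel-specific noise that was chosen to sum to zero across the array.
target = [1.0, -0.5, 0.25, 0.0]
noise = [[0.3, 0.0, -0.3, 0.3], [-0.3, 0.3, 0.0, -0.6], [0.0, -0.3, 0.3, 0.3]]
mics = [[s + n for s, n in zip(target, ch)] for ch in noise]
y_fbf = fixed_beam_output(delay_and_sum_weights(3), mics)
```

With this artificial noise the beam output reproduces the target up to rounding; real noise is only attenuated, which is why the lower GSC path is needed.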
The robust blocking matrix method for reducing voice leakage is further improved in that the blocking matrix module eliminates a target voice signal from the input microphone input signal to obtain a noise signal and outputs the noise signal; the output Z of the blocking matrix module is:
$Z = BX$ (Formula 3)
where $Z = [z_1\ z_2\ \dots\ z_N]$ is the output signal of the blocking matrix module; $X = [x_1\ x_2\ \dots\ x_M]$ is the microphone input signal; and $B$ is the blocking matrix of the blocking matrix module.
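To make the blocking operation $Z = BX$ concrete, here is a small Python sketch that builds an adjacent-difference blocking matrix and applies it to one snapshot of microphone values. The difference construction, the choice $N = M - 1$, and the function names are assumptions for illustration; the patent text does not spell out the entries of $B$.

```python
def difference_blocking_matrix(num_mics):
    """Build an (M-1) x M matrix whose r-th row computes x_r - x_{r+1}."""
    if num_mics < 2:
        raise ValueError("need at least two microphones")
    rows = []
    for r in range(num_mics - 1):
        row = [0.0] * num_mics
        row[r], row[r + 1] = 1.0, -1.0
        rows.append(row)
    return rows


def apply_blocking_matrix(matrix, snapshot):
    """Compute Z = B X for one snapshot X holding one value per microphone."""
    return [sum(b * x for b, x in zip(row, snapshot)) for row in matrix]


B = difference_blocking_matrix(3)
# Perfectly matched channels: the target cancels in every row.
ideal = apply_blocking_matrix(B, [0.8, 0.8, 0.8])
# A 10% gain mismatch on the third microphone lets target speech leak into Z.
leaky = apply_blocking_matrix(B, [0.8, 0.8, 0.72])
```

The second call shows the failure mode the method targets: a small array mismatch leaves a nonzero target component in the supposedly noise-only output.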
The robust blocking matrix method for reducing voice leakage is further improved in that estimating, using the output $Y_{FBF}$ of the fixed beam module and the conditional prior probability, the prior probability that the target speech signal is present in the noise signal $Z$ comprises:
estimating, with the controlled recursive averaging algorithm, the probability $P(H_1 \mid Y_{FBF})$ that the target speech signal is present in $Y_{FBF}$, in order to determine the prior probability $P(H_1)$ that the target speech signal is present in $Z$:
$P(H_1)_k = \lambda P(H_1)_{k-1} + (1-\lambda) P(H_1 \mid Y_{FBF})$ (Formula 4)
where
[Equation image: Figure GDA0002442002770000031]
$H_1$ is the speech presence state, $\lambda$ is the smoothing coefficient, and $k$ is the frame index;
the prior probability $P(H_0)$ that the target speech signal is absent from $Z$ is then given by
$P(H_0) = 1 - P(H_1)$. (Formula 6)
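The recursion for the prior (Formula 4) and its complement (Formula 6) can be sketched in a few lines of Python. The smoothing value `lam=0.9`, the initial prior of 0.5, and the per-frame probabilities are assumptions chosen only for illustration; the patent does not state a value for the smoothing coefficient.

```python
def smooth_speech_prior(prev_prior, p_h1_given_fbf, lam=0.9):
    """One step of P(H1)_k = lam * P(H1)_{k-1} + (1 - lam) * P(H1 | Y_FBF)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    p_h1 = lam * prev_prior + (1.0 - lam) * p_h1_given_fbf
    p_h0 = 1.0 - p_h1  # prior probability that speech is absent
    return p_h1, p_h0


# Hypothetical per-frame speech presence probabilities from the FBF output.
frame_probs = [0.1, 0.9, 0.95, 0.2]
prior = 0.5  # assumed initial prior
history = []
for p in frame_probs:
    prior, absent = smooth_speech_prior(prior, p)
    history.append((prior, absent))
```

A value of `lam` close to 1 makes the prior react slowly to single frames, which trades tracking speed for stability.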
The robust blocking matrix method for reducing voice leakage is further improved in that the blocking matrix module updates the noise signal according to the prior probability, eliminates a target voice signal existing in the noise signal, and obtains an updated noise signal, and the method comprises the following steps:
Step one: solve for the conditional prior probability $P(H_1 \mid Z)$ that the target speech signal is present in $Z$
a. Solve for the a posteriori signal-to-noise ratio $\gamma$
[Equation image: Figure GDA0002442002770000032]
where [symbol image: Figure GDA0002442002770000033] is an estimate of the noise signal;
b. Solve for the a priori signal-to-noise ratio $\varepsilon$ using the decision-directed method
[Equation image: Figure GDA0002442002770000034]
where $\eta$ is a smoothing coefficient with a value of 0.92, $\gamma_{old}$ is the a posteriori signal-to-noise ratio of the previous frame, $G_{H_1}$ is the speech gain in state $H_1$, and MAX is the maximum function;
c. Solve for the speech presence likelihood ratio GLR
[Equation image: Figure GDA0002442002770000041]
where
[Equation image: Figure GDA0002442002770000042]
d. Solve for the conditional prior probability $P(H_1 \mid BM)$
[Equation image: Figure GDA0002442002770000043]
Step two: correct the signal-to-noise ratios and update the speech gain
a. Correct the signal-to-noise ratios using the prior probability $P(H_1)$
[Equation image: Figure GDA0002442002770000044]
[Equation image: Figure GDA0002442002770000045]
where [symbol image: Figure GDA0002442002770000046] is the corrected a posteriori signal-to-noise ratio and [symbol image: Figure GDA0002442002770000047] is the corrected a priori signal-to-noise ratio;
b. Update the speech gain $G_{H_1}$
[Equation image: Figure GDA0002442002770000048]
where
[Equation image: Figure GDA0002442002770000049]
exp is the exponential operator, e is the natural constant, and x is the integration variable;
Step three: estimate the dynamic noise smoothing coefficient
[Equation image: Figure GDA00024420027700000410]
where $\alpha$ is 0.92;
Step four: estimate the noise
[Equation image: Figure GDA00024420027700000411]
[Equation image: Figure GDA00024420027700000412]
where E is the expectation operator, estimated using the following equation:
[Equation image: Figure GDA0002442002770000051]
where $k$ is the frame index, $\varepsilon$ is the a priori signal-to-noise ratio, and $P(H_0 \mid BM) = 1 - P(H_1 \mid BM)$;
Step five: calculate the speech gain
The updated speech gain Gain is estimated using the optimally modified log-spectral amplitude (OM-LSA) estimation method
[Equation image: Figure GDA0002442002770000052]
where Gmin is the lower bound on the gain when speech is absent, with a value of 0.01, [symbol image: Figure GDA0002442002770000053] is the speech gain in state $H_1$, and [symbol image: Figure GDA0002442002770000054] is the speech gain in state $H_0$;
Step six: calculate the updated noise signal $Z'$
$Z' = Z(1 - \mathrm{Gain})$. (Formula 17)
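The signal-to-noise ratio, speech presence, and noise update steps above can be sketched in Python for a single time-frequency bin. The patent's own formulas are images that are not reproduced in this text, so the sketch substitutes the textbook forms from the decision-directed and OM-LSA literature; they may differ in detail from the patent's equations (in particular, the patent's SNR correction in step two is not modelled). All function names and numeric inputs are hypothetical; only the values 0.92 for the smoothing coefficients come from the text.

```python
import math


def posterior_snr(z_power, noise_power, floor=1e-12):
    """gamma = |Z|^2 / (noise power estimate), guarded against division by zero."""
    return z_power / max(noise_power, floor)


def decision_directed_prior_snr(gamma, gamma_old, g_h1_old, eta=0.92):
    """Textbook decision-directed estimate of the a priori SNR."""
    return eta * g_h1_old ** 2 * gamma_old + (1.0 - eta) * max(gamma - 1.0, 0.0)


def speech_presence_probability(gamma, eps, p_h1):
    """P(H1 | observation) from a likelihood ratio weighted by the prior P(H1)."""
    p_h1 = min(max(p_h1, 1e-6), 1.0 - 1e-6)    # keep the prior odds finite
    v = min(gamma * eps / (1.0 + eps), 500.0)  # cap to avoid overflow in exp
    glr = (p_h1 / (1.0 - p_h1)) * math.exp(v) / (1.0 + eps)
    return glr / (1.0 + glr)


def update_noise_power(noise_power, z_power, p_speech, alpha=0.92):
    """Recursive noise update that slows down when speech is likely present."""
    alpha_dyn = alpha + (1.0 - alpha) * p_speech  # dynamic smoothing coefficient
    return alpha_dyn * noise_power + (1.0 - alpha_dyn) * z_power


# Hypothetical values for a single time-frequency bin of the BM output.
gamma = posterior_snr(z_power=4.0, noise_power=1.0)
eps = decision_directed_prior_snr(gamma, gamma_old=3.0, g_h1_old=0.7)
p_speech = speech_presence_probability(gamma, eps, p_h1=0.5)
noise = update_noise_power(1.0, 4.0, p_speech)
```

When `p_speech` approaches 1 the dynamic coefficient approaches 1 as well, so the noise estimate is frozen while leaked speech dominates the bin.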
Drawings
Fig. 1 is a functional block diagram of a robust blocking matrix method for reducing voice leakage according to the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention.
The main task of speech enhancement is to suppress background noise and interference, making subsequent processing more robust to input noise. A traditional single-channel speech enhancement algorithm has only one input channel and no reference signal, so it can suppress noise and enhance speech only by exploiting the time-domain and frequency-domain statistics of the noisy speech signal. However, speech is often buried in noise and interference in both the time and frequency domains and is difficult to separate from them accurately, which leaves relatively little room for improving such algorithms. Microphone arrays opened a new direction for speech enhancement: a beamforming algorithm exploits the difference in spatial position between the target speech and the interference, together with the correlation between the microphone signals, to suppress background noise and interference whose direction of arrival differs from that of the speech, thereby enhancing the speech. This approach has gradually become a research hotspot in the field of speech enhancement.
In the existing beamforming algorithm, an adaptive beamforming algorithm adopting a Generalized Sidelobe Cancellation (GSC) structure plays an important role.
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Referring to fig. 1, fig. 1 is a functional block diagram of a robust blocking matrix method for reducing voice leakage according to the present invention, and is also a diagram of a generalized sidelobe canceling structure.
The generalized sidelobe canceling structure (GSC) is divided into an upper path and a lower path: a first path 101 and a second path 102, connected in parallel with each other. In the figure, the first path 101 is the upper path and the second path 102 is the lower path. The structure mainly comprises three functional modules: a fixed beam module (FBF) 11, a blocking matrix module (BM) 12, and a cancellation module (MC, a multi-input filter) 13. The fixed beam module (FBF) 11 is located on the first path 101, and the blocking matrix module (BM) 12 and the cancellation module (MC) 13 are located on the second path 102. The input of the fixed beam module (FBF) 11 is connected to the input of the blocking matrix module (BM) 12, the output of the blocking matrix module (BM) 12 is connected to the input of the cancellation module (MC) 13, and the output of the cancellation module (MC) 13 is connected to the output of the fixed beam module (FBF) 11. At the node where the outputs of the cancellation module (MC) 13 and the fixed beam module (FBF) 11 meet, a "+/-" (addition/subtraction) operation is performed.
The fixed beam module (FBF) is used for estimating a reference signal of the target voice, the FBF adopts a filter with a fixed coefficient to filter original channel signals, and adds the filtered channel signals, so that interference and noise of an incoming wave direction different from the target voice signal are suppressed, and the primary enhancement of the target voice signal is realized.
The blocking matrix module (BM) removes the target speech signal to obtain a noise signal. The BM uses the FBF output as a reference signal and adaptively filters each original channel signal to remove its target speech component, yielding N noise signals (N is the number of microphones). The adaptive filters in this process may be CCAFs (coefficient-constrained adaptive filters).
Finally, the cancellation module (MC) removes the residual noise in the fixed beam output. The MC uses the N noise signals to perform further adaptive noise reduction on the FBF output, enhancing the target speech signal again to obtain the final output. The adaptive filters in this process may be NCAFs (norm-constrained adaptive filters).
The invention provides a robust blocking matrix method for reducing voice leakage. It addresses the problem in the current Generalized Sidelobe Cancellation (GSC) method that the blocking matrix (BM) module does not completely block the target speech signal, so that the leaked target speech is subtracted from the target speech signal of the fixed beam module (FBF) and cancels part of it.
The specific implementation method of the robust blocking matrix method for reducing the voice leakage comprises the following steps:
s001: providing a sound signal, wherein the sound signal is a speech signal containing noise;
s002: inputting the sound signal into a fixed beam module 11(FBF) and a blocking matrix module 12(BM) of a generalized side lobe cancellation structure, where the generalized side lobe cancellation structure has a first path 101 and a second path 102 connected in parallel, the fixed beam module 11 is located in the first path 101, and the blocking matrix module 12 is located in the second path 102; the second path 102 is further provided with a cancellation Module (MC)13, an input of the cancellation module 13 is connected to an output of the blocking matrix module 12, and an output of the cancellation module 13 is connected to an output of the fixed beam module 11;
s003: acquiring a target voice signal from the input voice signal by using the fixed beam module 11, and outputting the target voice signal;
s004: eliminating a target voice signal from the input voice signal by using a blocking matrix module 12 to obtain a noise signal;
s005: estimating the prior probability that the target speech signal is present in the noise signal by using the fixed beam module 11;
s006: the blocking matrix module 12 updates the noise signal according to the prior probability, removes the target speech signal present in the noise signal, obtains the updated noise signal and outputs it;
s007: the noise signal output by the blocking matrix module 12 is removed from the target speech signal output by the fixed beam module 11 by the cancellation module 13, and the output signal is formed and output.
Taking a microphone input signal as an example of a sound signal, inputting the microphone input signal into a generalized sidelobe canceling structure, and performing voice enhancement on the input microphone input signal by using the robust blocking matrix method of the present invention, specifically as follows:
(I) A microphone input signal is input;
the two-state speech model of the microphone input signal is:
$H_0: X = N$
$H_1: X = S + N$ (Formula 1)
where the state $H_0$ represents that only noise is present, $N$ is the noise signal, the state $H_1$ represents that both the noise signal and the target speech signal are present, and $S$ is the target speech signal.
(II) The fixed beam module 11 (FBF) obtains the target speech signal from the microphone input signal and outputs it;
the output $Y_{FBF}$ of the fixed beam module (FBF) is:
$Y_{FBF} = \sum_{i=1}^{M} w_i x_i$ (Formula 2)
where $M$ is the number of microphones, $x_i$ is the $i$-th microphone input signal, $w$ is the weight vector of the fixed beam module, and $w_i$ is the weight of the $i$-th fixed beam; the weight $w$ of the fixed beam module can be calculated with the delay-and-sum method or the minimum-sidelobe method.
(III) eliminating a target voice signal from an input microphone input signal by a blocking matrix module (BM) to obtain a noise signal and outputting the noise signal;
the output Z of the blocking matrix module (BM) is:
$Z = BX$ (Formula 3)
where $Z = [z_1\ z_2\ \dots\ z_N]$ is the output signal (noise signal) of the blocking matrix module; $X = [x_1\ x_2\ \dots\ x_M]$ is the microphone input signal; and $B$ is the blocking matrix of the blocking matrix module, obtained by the commonly used difference method.
(IV) Using the output $Y_{FBF}$ of the fixed beam module (FBF), estimate the prior probability $P(H_1)$ that the target speech signal is present in the output signal $Z$ (noise signal) of the blocking matrix module (BM), as follows:
estimate, with the controlled recursive averaging algorithm, the probability $P(H_1 \mid Y_{FBF})$ that the target speech signal is present in $Y_{FBF}$, in order to determine the prior probability $P(H_1)$ that the target speech signal is present in $Z$:
$P(H_1)_k = \lambda P(H_1)_{k-1} + (1-\lambda) P(H_1 \mid Y_{FBF})$ (Formula 4)
where
[Equation image: Figure GDA0002442002770000082]
$H_1$ is the speech presence state, $\lambda$ is the smoothing coefficient, and $k$ is the frame index;
the controlled recursive averaging algorithm is described in Israel Cohen, "Noise Spectrum Estimation in Adverse Environments: Improved Minima Controlled Recursive Averaging", IEEE Transactions on Speech and Audio Processing, vol. 11, no. 5, September 2003, pp. 466-475. The article describes the operating principle of the controlled recursive averaging algorithm in detail.
The prior probability $P(H_0)$ that the target speech signal is absent from the output signal $Z$ (noise signal) of the blocking matrix module (BM) is then given by
$P(H_0) = 1 - P(H_1)$. (Formula 6)
(V) The blocking matrix module (BM) updates the noise signal it outputs according to the prior probability $P(H_1)$ estimated by the fixed beam module (FBF), removing the target speech signal still present in the noise signal to obtain an updated noise signal. The specific process is as follows:
Step one: solve for the conditional prior probability $P(H_1 \mid Z)$ that the target speech signal is present in $Z$
a. Solve for the a posteriori signal-to-noise ratio $\gamma$
[Equation image: Figure GDA0002442002770000091]
where [symbol image: Figure GDA0002442002770000092] is an estimate of the noise signal;
b. Solve for the a priori signal-to-noise ratio $\varepsilon$ using the decision-directed method
[Equation image: Figure GDA0002442002770000093]
where $\eta$ is a smoothing coefficient with a value of 0.92, $\gamma_{old}$ is the a posteriori signal-to-noise ratio of the previous frame, $G_{H_1}$ is the speech gain in state $H_1$, and MAX is the maximum function;
c. Solve for the speech presence likelihood ratio GLR
[Equation image: Figure GDA0002442002770000094]
where
[Equation image: Figure GDA0002442002770000095]
and exp is the exponential operator.
d. Solve for the conditional prior probability $P(H_1 \mid BM)$
[Equation image: Figure GDA0002442002770000096]
Step two: correct the signal-to-noise ratios and update the speech gain
a. Correct the signal-to-noise ratios using the prior probability $P(H_1)$
[Equation image: Figure GDA0002442002770000097]
[Equation image: Figure GDA0002442002770000098]
where [symbol image: Figure GDA0002442002770000099] is the corrected a posteriori signal-to-noise ratio and [symbol image: Figure GDA00024420027700000910] is the corrected a priori signal-to-noise ratio;
b. Update the speech gain $G_{H_1}$
[Equation image: Figure GDA00024420027700000911]
where
[Equation image: Figure GDA0002442002770000101]
exp is the exponential operator, e is the natural constant, and x is the integration variable;
Step three: estimate the dynamic noise smoothing coefficient
[Equation image: Figure GDA0002442002770000102]
where $\alpha$ is 0.92;
Step four: estimate the noise
[Equation image: Figure GDA0002442002770000103]
[Equation image: Figure GDA0002442002770000104]
where E is the expectation operator, estimated using the following equation:
[Equation image: Figure GDA0002442002770000105]
where $k$ is the frame index, $\varepsilon$ is the a priori signal-to-noise ratio, and $P(H_0 \mid BM) = 1 - P(H_1 \mid BM)$;
Step five: calculate the speech gain
The updated speech gain Gain is estimated using the optimally modified log-spectral amplitude (OM-LSA) estimation method
[Equation image: Figure GDA0002442002770000106]
where Gmin is the lower bound on the gain when speech is absent, with a value of 0.01 (-20 dB, since 10 × log10(0.01) = -20, where dB denotes decibels);
[symbol image: Figure GDA0002442002770000107] is the speech gain in state $H_1$, and
[symbol image: Figure GDA0002442002770000108] is the speech gain in state $H_0$; to prevent excessive attenuation, however, $G_{H_0}$ is usually set to Gmin as the lower bound on the gain under $H_0$.
The OM-LSA (optimally modified log-spectral amplitude) estimation method is described in Israel Cohen and Baruch Berdugo, "Speech enhancement for non-stationary noise environments", J. Acoust. Soc. Am. 87(2), February 1990, Acoustical Society of America, pp. 820-857. The article describes the implementation principle of the OM-LSA method in detail.
Step six: calculate the updated noise signal $Z'$
$Z' = Z(1 - \mathrm{Gain})$. (Formula 17)
By adopting the method, the blocking matrix module updates the noise signal according to the prior probability, eliminates the target voice signal existing in the noise signal and finally outputs the updated noise signal.
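Steps five and six can be sketched as follows. Because the patent's gain formula is an image that is not reproduced in this text, the sketch assumes the standard OM-LSA combination, Gain = $G_{H_1}^{p} \cdot G_{min}^{1-p}$ with $p$ the conditional speech presence probability, and then applies Formula 17 per time-frequency bin. Function names and sample values are hypothetical.

```python
import math

G_MIN = 0.01  # lower gain bound from the text; 10 * log10(0.01) = -20 dB


def omlsa_gain(g_h1, p_speech, g_min=G_MIN):
    """Assumed OM-LSA form: geometric mix of the H1 gain and the floor G_min."""
    g_h1 = max(g_h1, g_min)  # never let the H1 gain fall below the floor
    return (g_h1 ** p_speech) * (g_min ** (1.0 - p_speech))


def remove_leaked_speech(z_bins, gains):
    """Formula 17 per bin: Z' = Z * (1 - Gain)."""
    return [z * (1.0 - g) for z, g in zip(z_bins, gains)]


# Two hypothetical bins: speech certainly absent, then certainly present.
z = [1.0, 2.0]
gains = [omlsa_gain(0.8, 0.0), omlsa_gain(0.8, 1.0)]
z_updated = remove_leaked_speech(z, gains)
g_min_db = 10.0 * math.log10(G_MIN)
```

In the noise-only bin almost all of $Z$ is kept, while in the speech-dominated bin most of $Z$ is removed, which is the intended suppression of leaked target speech.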
(VI) The cancellation module removes the noise signal output by the blocking matrix module from the target speech signal output by the fixed beam module to form the output signal, which is then output.
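The cancellation step can be illustrated with a single-tap, norm-constrained NLMS update, in the spirit of the NCAF option mentioned for the MC module. This is a minimal sketch under stated assumptions: real-valued samples, one coefficient per noise channel, and hypothetical parameter values (`mu`, `max_norm_sq`); the patent does not specify the filter length, step size, or norm bound.

```python
def ncaf_step(weights, noise_refs, y_fbf, mu=0.5, max_norm_sq=1.0, eps=1e-8):
    """One canceller update: subtract the noise estimate, adapt, clamp the norm."""
    estimate = sum(w * z for w, z in zip(weights, noise_refs))
    output = y_fbf - estimate                      # "+/-" node of the GSC
    power = sum(z * z for z in noise_refs) + eps   # NLMS normalisation term
    updated = [w + mu * output * z / power for w, z in zip(weights, noise_refs)]
    norm_sq = sum(w * w for w in updated)
    if norm_sq > max_norm_sq:                      # norm constraint
        scale = (max_norm_sq / norm_sq) ** 0.5
        updated = [w * scale for w in updated]
    return output, updated


# Noise-only toy case: the FBF output contains 0.5 times the reference noise.
weights = [0.0]
outputs = []
for t in range(60):
    n = 1.0 if t % 2 == 0 else -1.0
    out, weights = ncaf_step(weights, [n], 0.5 * n)
    outputs.append(out)
```

In this toy case the coefficient settles near 0.5 and the residual output decays toward zero; the norm clamp limits how far the coefficients can grow if target speech leaks into the noise references.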
In the robust blocking matrix method for reducing voice leakage, before the cancellation module cancels the noise signal output by the blocking matrix module against the target speech signal output by the fixed beam module to remove the residual noise in the target speech signal, the prior probability that the target speech signal is present in the noise signal output by the blocking matrix module is estimated and the blocking matrix parameters of the blocking matrix module are updated accordingly. This removes the target speech that leaked into the noise signal and strengthens the blocking matrix module's ability to block the target speech signal. It avoids the situation in which the blocking matrix module does not completely block the target speech signal, so that the leaked target speech is subtracted from, and cancels, the target speech signal in the fixed beam module, and thereby greatly reduces voice leakage.
It should be noted that the structures, proportions, sizes, and the like shown in the drawings of this specification serve only to accompany the disclosure so that those skilled in the art can understand and read it. They are not intended to limit the conditions under which the invention can be implemented and therefore have no substantive technical significance. Any structural modification, change of proportion, or adjustment of size that does not affect the effects and purposes achievable by the invention still falls within the scope of the invention. In addition, terms such as "upper", "lower", "left", "right", "middle" and "one" are used in this specification for clarity of description only and are not intended to limit the scope of the invention, and the relative relationships between these terms are not to be construed as limiting the scope of the invention.
Although the present invention has been described with reference to a preferred embodiment, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (2)

1. A robust blocking matrix method for reducing voice leakage includes the following steps:
providing a sound signal;
inputting the sound signal into a fixed beam module and a blocking matrix module of a generalized side lobe cancellation structure, wherein the generalized side lobe cancellation structure is provided with a first channel and a second channel which are connected in parallel, the fixed beam module is positioned on the first channel, and the blocking matrix module is positioned on the second channel; the second path is also provided with a cancellation module, the input of the cancellation module is connected with the output of the blocking matrix module, and the output of the cancellation module is connected with the output of the fixed beam module;
acquiring a target voice signal from the input voice signal by using the fixed beam module, and outputting the target voice signal;
eliminating a target voice signal from the input voice signal by using the blocking matrix module to obtain a noise signal;
estimating, with the fixed beam module, a prior probability of a target speech signal being present in the noise signal;
the blocking matrix module updates the noise signal according to the prior probability, eliminates a target voice signal existing in the noise signal, obtains an updated noise signal and outputs the updated noise signal;
eliminating the noise signal output by the blocking matrix module from the target voice signal output by the fixed beam module by using the eliminating module to form an output signal and outputting the output signal;
the two-state speech model of the sound signal is:
H0: X = N
H1: X = S + N (formula one)
where state H0 is the state in which only noise is present, N is the noise signal, state H1 is the state in which both the noise signal and the target speech signal are present, S is the target speech signal, X = [x_1 x_2 … x_M] is the microphone input signal, and M is the number of microphones;
the fixed beam module acquires a target speech signal from the microphone input signal and outputs the target speech signal; the output Y_FBF of the fixed beam module is:
Figure FDA0002442002760000011
where x_i is the i-th microphone input signal, w is the weight of the fixed beam module, and w_i is the weight of the i-th fixed beam;
the blocking matrix module eliminates the target speech signal from the microphone input signal to obtain a noise signal and outputs the noise signal; the output Z of the blocking matrix module is:
Z = B X (formula three)
where Z = [z_1 z_2 … z_n] is the output signal of the blocking matrix module, and B is the blocking matrix of the blocking matrix module;
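The fixed-beam path and the blocking-matrix path described above can be sketched for a single frame or frequency bin as follows. This is only a minimal sketch under assumptions not stated in the claim: the fixed-beam output is taken to be the weighted sum of the microphone signals, and B is taken to be a simple adjacent-difference blocking matrix, which cancels a target that arrives identically at every microphone. All function names are hypothetical.

```python
def fixed_beam_output(weights, x):
    """Weighted sum of the M microphone signals (assumed form of Y_FBF)."""
    if len(weights) != len(x):
        raise ValueError("need one weight per microphone")
    return sum(w * xi for w, xi in zip(weights, x))


def adjacent_difference_blocking_matrix(m):
    """(M-1) x M matrix with rows [.., 1, -1, ..] (an assumed choice of B)."""
    rows = []
    for r in range(m - 1):
        row = [0.0] * m
        row[r] = 1.0
        row[r + 1] = -1.0
        rows.append(row)
    return rows


def apply_blocking_matrix(b, x):
    """Z = B X (formula three) for one frame or frequency bin."""
    return [sum(bij * xj for bij, xj in zip(row, x)) for row in b]


# A target of amplitude 1.0 reaches all three microphones identically;
# each microphone also picks up a different noise value.
noise = [0.25, -0.5, 0.125]
x = [1.0 + n for n in noise]

y_fbf = fixed_beam_output([1.0 / 3.0] * 3, x)
z = apply_blocking_matrix(adjacent_difference_blocking_matrix(3), x)
print("Y_FBF:", y_fbf)
print("Z:", z)
```

In this idealized case the target cancels exactly in Z and only noise differences remain. With real steering errors or reverberation some target speech leaks into Z, which is the leakage the remaining steps of the claim are meant to reduce.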
estimating, from the output Y_FBF of the fixed beam module, the prior probability that the target speech signal is present in the noise signal Z comprises:
estimating the probability P(H1 | Y_FBF) that the target speech signal is present in Y_FBF by a controlled recursive averaging algorithm, in order to determine the prior probability P(H1) that the target speech signal is present in Z:
P(H1)_k = λ P(H1)_{k-1} + (1 - λ) P(H1 | Y_FBF) (formula four)
where
Figure FDA0002442002760000021
H1 is the speech-presence state, λ is the smoothing coefficient, and k is the frame index;
the prior probability P(H0) that the target speech signal is absent from Z is then obtained from
P(H0) = 1 - P(H1) (formula six);
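The recursion in formula four and the complement in formula six can be sketched as follows. The per-frame probabilities P(H1 | Y_FBF) are taken as given inputs here, because the formula that produces them appears only as a figure; the default smoothing coefficient and the initial prior are illustrative assumptions, and the function names are hypothetical.

```python
def update_speech_prior(prev_prior, p_h1_given_yfbf, lam):
    """Formula four: P(H1)_k = lam * P(H1)_{k-1} + (1 - lam) * P(H1 | Y_FBF)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("smoothing coefficient must lie in [0, 1]")
    return lam * prev_prior + (1.0 - lam) * p_h1_given_yfbf


def track_priors(per_frame_probabilities, lam=0.9, initial_prior=0.5):
    """Return one (P(H1), P(H0)) pair per frame.

    lam=0.9 and initial_prior=0.5 are assumed values, not taken from the claim.
    """
    priors = []
    p_h1 = initial_prior
    for p in per_frame_probabilities:
        p_h1 = update_speech_prior(p_h1, p, lam)
        priors.append((p_h1, 1.0 - p_h1))  # formula six: P(H0) = 1 - P(H1)
    return priors


# Speech is detected in the fixed-beam output from the third frame onward.
for frame, (p_h1, p_h0) in enumerate(track_priors([0.0, 0.0, 1.0, 1.0, 1.0])):
    print(frame, round(p_h1, 4), round(p_h0, 4))
```

A larger λ makes the prior react more slowly to a change in the fixed-beam detection, trading responsiveness for stability.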
the process by which the blocking matrix module updates the noise signal according to the prior probability, eliminates the target speech signal present in the noise signal, and obtains the updated noise signal comprises the following steps:
step one: solving the conditional prior probability P(H1 | Z) that the target speech signal is present in Z
a. solving the a posteriori signal-to-noise ratio γ
Figure FDA0002442002760000022
where
Figure FDA0002442002760000023
is an estimate of the noise signal;
b. solving the a priori signal-to-noise ratio ε by the decision-directed method
Figure FDA0002442002760000025
where η is a smoothing coefficient with a value of 0.92, γ_old is the a posteriori signal-to-noise ratio of the previous frame, G_H1 is the speech gain in state H1, and MAX is the maximum function;
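The two signal-to-noise ratios in step one can be sketched as follows. The formulas themselves appear only as figures, so this sketch assumes the commonly used forms: the a posteriori SNR as the ratio of the instantaneous power of Z to the estimated noise power, and the decision-directed a priori SNR as a smoothed combination of the previous frame's gain-weighted a posteriori SNR and the floored current excess. It uses the same quantities the claim names (η, γ_old, G_H1, MAX), but it is not confirmed to match the patent's figure term by term. The function names are hypothetical.

```python
def posterior_snr(z, noise_power):
    """Assumed form: gamma = |Z|^2 / (estimated noise power)."""
    if noise_power <= 0.0:
        raise ValueError("noise power estimate must be positive")
    return abs(z) ** 2 / noise_power


def decision_directed_prior_snr(gamma, gamma_old, gain_h1_old, eta=0.92):
    """Assumed decision-directed form:

    eps = eta * G_H1^2 * gamma_old + (1 - eta) * MAX(gamma - 1, 0)

    eta = 0.92 is the smoothing coefficient stated in the claim.
    """
    return eta * gain_h1_old ** 2 * gamma_old + (1.0 - eta) * max(gamma - 1.0, 0.0)


gamma_now = posterior_snr(3 + 4j, 5.0)  # |3+4j|^2 = 25, noise power 5
eps_now = decision_directed_prior_snr(gamma_now, gamma_old=2.0, gain_h1_old=0.5)
print(gamma_now, eps_now)
```

The MAX term keeps the instantaneous contribution non-negative when the current frame's power falls below the noise estimate.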
c. solving the speech-presence likelihood ratio GLR
Figure FDA0002442002760000024
where
Figure FDA0002442002760000031
d. solving the conditional prior probability P(H1 | BM)
Figure FDA0002442002760000032
step two: correcting the signal-to-noise ratios and updating the speech gain
a. correcting the signal-to-noise ratios using the prior probability P(H1)
Figure FDA0002442002760000033
where
Figure FDA0002442002760000034
is the corrected a posteriori signal-to-noise ratio, and
Figure FDA0002442002760000035
is the corrected a priori signal-to-noise ratio;
b. updating the speech gain G_H1
Figure FDA0002442002760000036
where
Figure FDA0002442002760000037
exp is the exponential operator, e is the natural constant, and x is the integration variable;
step three: estimating the dynamic noise smoothing coefficient
Figure FDA0002442002760000038
where α = 0.92;
step four: estimating the noise
Figure FDA0002442002760000039
Figure FDA00024420027600000310
where E is the expectation operator, estimated using the following equation:
Figure FDA00024420027600000311
where k is the frame index, ε is the a priori signal-to-noise ratio, and P(H0 | BM) = 1 - P(H1 | BM);
step five: calculating the speech gain
estimating the updated speech gain Gain by the optimally modified log-spectral amplitude (OM-LSA) estimation method
Figure FDA0002442002760000041
where Gmin is the lower bound on the gain when speech is absent, with a value of 0.01,
Figure FDA0002442002760000042
is the speech gain in state H1, and
Figure FDA0002442002760000043
is the speech gain in state H0;
step six: calculating the updated noise signal Z'
Z' = Z (1 - Gain) (formula seventeen).
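Steps five and six can be sketched as follows. Formula seventeen is implemented as written. The gain combination is an assumption, since the claim's gain formula appears only as a figure: the sketch uses the usual OM-LSA geometric weighting of the state-H1 gain and a state-H0 gain fixed at the floor Gmin, weighted by the speech-presence probability. The function names are hypothetical.

```python
def omlsa_gain(gain_h1, p_speech, g_min=0.01):
    """Assumed OM-LSA combination: Gain = G_H1^p * Gmin^(1 - p).

    g_min = 0.01 is the lower bound stated in the claim; treating the
    state-H0 gain as equal to Gmin is an assumption of this sketch.
    """
    if not 0.0 <= p_speech <= 1.0:
        raise ValueError("speech-presence probability must lie in [0, 1]")
    return (gain_h1 ** p_speech) * (g_min ** (1.0 - p_speech))


def update_noise_reference(z, gain):
    """Formula seventeen: Z' = Z * (1 - Gain), applied to each element of Z."""
    return [zi * (1.0 - gain) for zi in z]


# When speech is almost certainly present in Z, most of Z is removed;
# when speech is almost certainly absent, Z passes nearly unchanged.
z = [2.0, -4.0]
print(update_noise_reference(z, omlsa_gain(0.8, 0.99)))
print(update_noise_reference(z, omlsa_gain(0.8, 0.01)))
```

Because Gain estimates the speech fraction of Z, multiplying by (1 - Gain) keeps the noise fraction, so the reference passed to the cancellation module carries less leaked target speech.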
2. The robust blocking matrix method for reducing voice leakage according to claim 1, characterized in that the weight w of the fixed beam module is calculated by a delay-and-sum method or a minimum-sidelobe method.
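As an illustration of the delay-and-sum option named in claim 2, the sketch below computes narrowband weights for a uniform linear array. The array geometry, the far-field plane-wave model, the sign convention of the phase term, and every function and parameter name are assumptions made for illustration; the claim does not specify them.

```python
import cmath
import math


def steering_delays(num_mics, spacing_m, angle_rad, c=343.0):
    """Arrival delay at each microphone of a uniform linear array (assumed model)."""
    return [i * spacing_m * math.sin(angle_rad) / c for i in range(num_mics)]


def delay_and_sum_weights(num_mics, spacing_m, angle_rad, freq_hz, c=343.0):
    """Weights that undo each microphone's delay and average the result."""
    delays = steering_delays(num_mics, spacing_m, angle_rad, c)
    return [cmath.exp(2j * math.pi * freq_hz * tau) / num_mics for tau in delays]


# A unit-amplitude plane wave from 30 degrees at 1 kHz, four mics 5 cm apart.
freq, angle = 1000.0, math.radians(30.0)
taus = steering_delays(4, 0.05, angle)
x = [cmath.exp(-2j * math.pi * freq * tau) for tau in taus]  # delayed copies
w = delay_and_sum_weights(4, 0.05, angle, freq)
y = sum(wi * xi for wi, xi in zip(w, x))
print(abs(y))
```

With the weights steered at the true arrival angle the delayed copies add in phase, so the magnitude of the output is approximately that of the source; signals from other directions add with mismatched phases and are attenuated.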
CN201611218157.7A 2016-12-26 2016-12-26 Robust blocking matrix method for reducing voice leakage Active CN106782595B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611218157.7A CN106782595B (en) 2016-12-26 2016-12-26 Robust blocking matrix method for reducing voice leakage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611218157.7A CN106782595B (en) 2016-12-26 2016-12-26 Robust blocking matrix method for reducing voice leakage

Publications (2)

Publication Number Publication Date
CN106782595A CN106782595A (en) 2017-05-31
CN106782595B true CN106782595B (en) 2020-06-09

Family

ID=58925084

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611218157.7A Active CN106782595B (en) 2016-12-26 2016-12-26 Robust blocking matrix method for reducing voice leakage

Country Status (1)

Country Link
CN (1) CN106782595B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107301869B (en) * 2017-08-17 2021-01-29 珠海全志科技股份有限公司 Microphone array pickup method, processor and storage medium thereof
CN109473118B (en) * 2018-12-24 2021-07-20 思必驰科技股份有限公司 Dual-channel speech enhancement method and device
CN111341340A (en) * 2020-02-28 2020-06-26 重庆邮电大学 Robust GSC method based on coherence and energy ratio

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080075362A (en) * 2007-02-12 2008-08-18 인하대학교 산학협력단 A method for obtaining an estimated speech signal in noisy environments
CN101976565A (en) * 2010-07-09 2011-02-16 瑞声声学科技(深圳)有限公司 Dual-microphone-based speech enhancement device and method
CN103106390A (en) * 2011-11-11 2013-05-15 索尼公司 Information processing apparatus, information processing method, and program
CN103236260A (en) * 2013-03-29 2013-08-07 京东方科技集团股份有限公司 Voice recognition system
KR20160116440A (en) * 2015-03-30 2016-10-10 한국전자통신연구원 SNR Extimation Apparatus and Method of Voice Recognition System

Also Published As

Publication number Publication date
CN106782595A (en) 2017-05-31

Similar Documents

Publication Publication Date Title
US8731210B2 (en) Audio processing methods and apparatuses utilizing the same
KR101610656B1 (en) System and method for providing noise suppression utilizing null processing noise subtraction
JP5444472B2 (en) Sound source separation apparatus, sound source separation method, and program
CN108922554B (en) LCMV frequency invariant beam forming speech enhancement algorithm based on logarithmic spectrum estimation
CN106782595B (en) Robust blocking matrix method for reducing voice leakage
EP3007170A1 (en) Robust noise cancellation using uncalibrated microphones
US8462962B2 (en) Sound processor, sound processing method and recording medium storing sound processing program
US11812237B2 (en) Cascaded adaptive interference cancellation algorithms
US10348887B2 (en) Double talk detection for echo suppression in power domain
US20130322655A1 (en) Method and device for microphone selection
CN106653043B (en) Reduce the Adaptive beamformer method of voice distortion
CN108630216B (en) MPNLMS acoustic feedback suppression method based on double-microphone model
CN112530451A (en) Speech enhancement method based on denoising autoencoder
WO2007123048A1 (en) Adaptive array control device, method, and program, and its applied adaptive array processing device, method, and program
TWI465121B (en) System and method for utilizing omni-directional microphones for speech enhancement
KR102517939B1 (en) Capturing far-field sound
CN110140346B (en) Acoustic echo cancellation
US20190035414A1 (en) Adaptive post filtering
Kalamani et al. Modified noise reduction algorithm for speech enhancement
KR102040986B1 (en) Method and apparatus for noise reduction in a portable terminal having two microphones
CN109326297B (en) Adaptive post-filtering
CN102655558B (en) Double-end pronouncing robust structure and acoustic echo cancellation method
WO2002003749A2 (en) Adaptive microphone array system with preserving binaural cues
CN113362846A (en) Voice enhancement method based on generalized sidelobe cancellation structure
US10692514B2 (en) Single channel noise reduction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20170929

Address after: 200233 Shanghai City, Xuhui District Guangxi 65 No. 1 Jinglu room 702 unit 03

Applicant after: YUNZHISHENG (SHANGHAI) INTELLIGENT TECHNOLOGY CO.,LTD.

Address before: 200233 Shanghai, Qinzhou, North Road, No. 82, building 2, layer 1198,

Applicant before: SHANGHAI YUZHIYI INFORMATION TECHNOLOGY Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A robust blocking matrix method for reducing speech leakage

Effective date of registration: 20201201

Granted publication date: 20200609

Pledgee: Bank of Hangzhou Limited by Share Ltd. Shanghai branch

Pledgor: YUNZHISHENG (SHANGHAI) INTELLIGENT TECHNOLOGY Co.,Ltd.

Registration number: Y2020310000047

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20220307

Granted publication date: 20200609

Pledgee: Bank of Hangzhou Limited by Share Ltd. Shanghai branch

Pledgor: YUNZHISHENG (SHANGHAI) INTELLIGENT TECHNOLOGY CO.,LTD.

Registration number: Y2020310000047

PC01 Cancellation of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A Robust Blocking Matrix Method for Reducing Speech Leakage

Effective date of registration: 20230210

Granted publication date: 20200609

Pledgee: Bank of Hangzhou Limited by Share Ltd. Shanghai branch

Pledgor: YUNZHISHENG (SHANGHAI) INTELLIGENT TECHNOLOGY CO.,LTD.

Registration number: Y2023310000028

PE01 Entry into force of the registration of the contract for pledge of patent right
PC01 Cancellation of the registration of the contract for pledge of patent right

Granted publication date: 20200609

Pledgee: Bank of Hangzhou Limited by Share Ltd. Shanghai branch

Pledgor: YUNZHISHENG (SHANGHAI) INTELLIGENT TECHNOLOGY CO.,LTD.

Registration number: Y2023310000028

PC01 Cancellation of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A Robust Blocking Matrix Method for Reducing Speech Leakage

Granted publication date: 20200609

Pledgee: Bank of Hangzhou Limited by Share Ltd. Shanghai branch

Pledgor: YUNZHISHENG (SHANGHAI) INTELLIGENT TECHNOLOGY CO.,LTD.

Registration number: Y2024310000165

PE01 Entry into force of the registration of the contract for pledge of patent right