CN108597532A - Hidden method for acoustic based on MVDR - Google Patents

Hidden method for acoustic based on MVDR

Info

Publication number
CN108597532A
CN108597532A (application CN201810221809.5A)
Authority
CN
China
Prior art keywords
mvdr
signal
array
sound
sound source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810221809.5A
Other languages
Chinese (zh)
Inventor
陈景东
梁菲菲
王雪瀚
黄海
聂玮奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Linjing Acoustics Technology Jiangsu Co Ltd
Original Assignee
Linjing Acoustics Technology Jiangsu Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Linjing Acoustics Technology Jiangsu Co Ltd filed Critical Linjing Acoustics Technology Jiangsu Co Ltd
Publication of CN108597532A publication Critical patent/CN108597532A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L21/0224Processing in the time domain
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00Circuits for transducers, loudspeakers or microphones
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166Microphone arrays; Beamforming
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00Signal processing covered by H04R, not provided for in its groups

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)

Abstract

The invention discloses an MVDR-based acoustic concealment method. The method first uses the short-time Fourier transform (STFT) to decompose the time-domain signals received by a sensor array into frequency-domain sub-band signals, and constructs an MVDR-based concealment filter on each sub-band so that the acoustic signal of a sound source within the concealment distance passes through the concealment filter without attenuation; the estimated signal is finally obtained through the inverse STFT. The invention achieves a good concealment effect.

Description

MVDR-based acoustic concealment method
Technical Field
The invention relates to acoustic concealment (sound-hiding) techniques for microphone arrays, and in particular to an MVDR-based acoustic concealment method.
Background
Research on acoustic concealment has a long history. In the field of acoustic sensing, two approaches have mainly been explored: bone-conduction microphones and ultrasonic microphones. In the field of acoustic signal processing, signal separation techniques and differential microphone arrays have been explored.
A bone-conduction microphone collects sound and converts it into an electrical signal by sensing the slight vibration of the bones of the head and neck caused by speaking. Because it differs from a conventional microphone, which picks up sound conducted through the air, it can transmit speech clearly even in very noisy environments. Bone conduction has had practical applications for over a hundred years, but the voice quality of early bone-conduction microphones was poor, especially at high frequencies, so they could only be used to assist conventional microphones, for example for voice endpoint detection to improve the performance of single-channel speech noise reduction. In the last twenty years bone-conduction microphones have attracted real attention and their performance has improved greatly; several bone-conduction communication headsets have appeared on the market. A bone-conduction microphone developed in May 2013 by Beijing Meiersion Science and Technology Development Co., Ltd. was the first such independently developed domestic product and has since been commercialized. Recently, other acoustic sensors based on a similar principle have drawn attention, such as the DAIKIN-D Talk Mic headset, which uses a highly sensitive microphone to pick up the speaker's vocal-cord vibration and convert it into an electrical signal, a pickup mechanism very similar to bone conduction. Although bone-conduction microphone technology has made breakthroughs, such communication headsets have not yet become widespread, mainly because of four problems: 1) poor dynamic performance; 2) high cost; 3) poor packaging; 4) poor sound quality.
More than a decade ago, a group of scientists at Bell Laboratories in the United States designed an ultrasonic microphone consisting of a small ultrasonic transmitter and a broadband conventional microphone. In operation, the transmitter emits a periodic sequence of wideband pulses with frequencies between 20 kHz and 70 kHz. After the pulses are reflected by the speaker's vocal tract, the microphone receives the reflected signal, and a back-end digital signal processing stage estimates the shape parameters of the vocal tract from the transmitted and reflected signals and synthesizes the speech being spoken. Because the device works in the ultrasonic band, it is not disturbed by audio signals in the audible frequency range and can be used to address voice communication in cocktail-party-like environments. The Bell Laboratories scientists built a working prototype: feeding the received and synthesized vowels to a speech recognition system gave a recognition rate of 95%, and preliminary listening experiments confirmed that the quality of the synthesized vowels was close to that of a conventional microphone. Of course, many problems remain before such a microphone becomes practical; the biggest is that, for phonemes whose vocal-tract characteristics are not prominent, such as nasals, the quality and intelligibility of the synthesized speech are low.
Acoustic concealment can in theory be viewed as a sub-problem of signal separation or enhancement. In a complex acoustic environment, a signal picked up by a microphone from a given sound source is almost always contaminated. According to the mechanism by which the contamination arises, noise in speech signal processing is classified into four types: ambient noise, echo, reverberation and interference. To separate the sound source from the noise, a specific method is used for each type:
Ambient noise (Noise): environmental noise is unavoidable and ubiquitous, and its presence seriously degrades the quality and intelligibility of the speech signal and the human ear's perception of spatial information. Ambient noise is usually relatively stationary, i.e., the statistical properties of the noise at the current time can be replaced by those of the noise at earlier times. From the statistics of the noisy signal and of the noise, a filter can be designed to filter the observed signal so that the speech signal is enhanced and the background noise is suppressed. Noise reduction can use either a single-channel or a multi-channel pickup system, corresponding to single-channel and multi-channel noise reduction techniques, respectively. In contrast to single-channel noise reduction, which introduces speech distortion while suppressing noise, multi-channel noise reduction reduces speech distortion for the same output signal-to-noise ratio.
Echo (Echo): acoustic echo is generated by the acoustic coupling between a microphone and a loudspeaker, and its presence severely affects multi-party duplex interaction. Its characteristic is that the source signal is known: as long as the acoustic propagation channel from the loudspeaker to the microphone can be estimated, the echo component of the observed signal picked up by the microphone can be estimated and subtracted from the picked-up signal. This technique is known as echo cancellation.
Reverberation (Reverberation): reverberation is caused by interface reflections (multipath effects) in a room. Reflections are classified into early and late reflections. Early reflections (typically within 40 ms) usually carry useful information, such as the size of the room, which can be estimated by analyzing their structure; early reflections can also reinforce the harmonic components of music and improve the listening experience. Late reflections, however, cause spectral distortion, which reduces speech quality and intelligibility and blurs the location information of the sound source. In voice communication systems the late reflections produce reverberation that seriously degrades communication quality, so dereverberation techniques are needed. One approach first blindly estimates the channel and then removes the reverberation by equalization; another is super-directive array beamforming, which extracts the sound-source signal from the desired direction while suppressing signals from other directions. Since reverberation arrives from all directions, a super-directive array can suppress it to some extent.
Interference (Interference) from other sound sources: an interfering signal is produced by a point noise source and arrives from a particular direction in space. In voice communication there are often other people and other sound sources nearby, so multiple sources are unavoidable at each end and the signals from different sources interfere with one another. The typical technique for interference suppression is beamforming, whose essence is to form a spatial filter and steer the direction of maximum response toward the desired source; the degree of interference suppression depends on the array response in the direction of the interference. Speech noise reduction, source separation and beamforming all aim to separate the desired signal from other interfering signals, so these techniques can be used for concealment processing. However, the separation performance achievable with current techniques is limited and cannot meet the requirements of concealment applications.
The signal separation techniques mentioned above require microphone arrays. Microphone array research has been carried out for more than forty years, during which many array geometries and processing methods have been developed. According to their response to the sound field, these arrays fall into two broad categories: additive microphone arrays (AMAs) and differential microphone arrays (DMAs). An additive array is usually large; each microphone measures the sound pressure of the field, and the array beamformer also responds to the sound-pressure field. Much of the existing literature deals with additive arrays and their processing methods. By comparison, a differential array responds to the spatial differential of the sound-pressure field; it has the advantages of small size, good frequency consistency of the beam pattern, and maximal directivity for a given number of elements.
Disclosure of Invention
The aim of the invention is to provide an MVDR-based acoustic concealment method with a good concealment effect.
To solve the above technical problem, the invention adopts the following technical solution: the time-domain signals received by the sensor array are decomposed into sub-band signals with the short-time Fourier transform, and an MVDR-based concealment filter is constructed on each sub-band so that sound signals from a sound source within the concealment distance pass through the concealment filter without attenuation; the estimated signal is finally obtained through the inverse STFT.
Further, the method comprises the following steps:
S1: construct the steering vector d_s(ω, r_s, θ_s) according to the structure of the array, the number of array elements, the position of the sound source and other parameters;
S2: divide the signal received by each sensor of the microphone array, y_m(k) = x_m(k) + v_m(k), m = 1, 2, …, M, into short frames with a certain overlap ratio (the frame length may range from a few milliseconds to tens of milliseconds), and apply the short-time Fourier transform to each frame of each of the M channels to obtain Y_m(ω, i), where i denotes the i-th frame; then construct
y(ω, i) = [Y_1(ω, i)  Y_2(ω, i)  …  Y_M(ω, i)]^T;
S3: the time-domain signals received by the sensor array are thereby decomposed into sub-band signals by the short-time Fourier transform;
S4: on the sub-band with frequency ω, construct the MVDR-based concealment filter h(ω);
S5: on that sub-band, process the signal of the i-th frame with the concealment filter h(ω), giving Z(ω, i) = h^H(ω) y(ω, i);
S6: transform Z(ω, i) with the inverse short-time Fourier transform and the overlap-add method to obtain the beamformed time-domain signal z(k).
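The following is a minimal sketch of this sub-band processing pipeline (steps S2/S3, S5 and S6), under the assumption that a concealment filter h(ω) has already been designed for every frequency bin as described later in the text. It uses scipy.signal.stft/istft for the framing and overlap-add steps; the function name, sampling rate and frame parameters are illustrative choices, not values prescribed by the patent.

```python
import numpy as np
from scipy.signal import stft, istft

def subband_conceal(y, h, fs=16000, frame_len=512, overlap=0.75):
    """Apply a per-bin concealment filter to a multichannel recording.

    y : (M, N) array of time-domain sensor signals y_m(k)
    h : (F, M) complex array, one filter h(w) per STFT bin (F = frame_len//2 + 1)
    """
    noverlap = int(frame_len * overlap)
    # S2/S3: overlapping frames + STFT of each of the M channels -> Y_m(w, i)
    _, _, Y = stft(y, fs=fs, nperseg=frame_len, noverlap=noverlap)   # shape (M, F, I)
    # S5: per-bin beamforming, Z(w, i) = h(w)^H y(w, i)
    Z = np.einsum('fm,mfi->fi', h.conj(), Y)
    # S6: inverse STFT with overlap-add -> beamformed time-domain signal z(k)
    _, z = istft(Z, fs=fs, nperseg=frame_len, noverlap=noverlap)
    return z
```

With the default Hann window and 75% overlap the overlap-add condition is satisfied, so with a trivial filter that selects one channel the chain simply reconstructs that reference channel.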
Further, the specific steps of decomposing the time-domain signals received by the sensor array into sub-band signals with the short-time Fourier transform are as follows:
Assume that the spacing between two adjacent microphones is δ. Because the concealment technique picks up a sound source at close range, an ideal near-field sound source and interference act on the microphone array in the acoustic environment; the distances from the sound source to the microphones are r_{s,1}, r_{s,2}, …, r_{s,M}. Taking the center of the array as the reference point, let the distance of the sound source from the reference point be r_s and its angle of incidence be θ_s. The distance from the sound source to the m-th microphone can then be expressed as
r_{s,m} = sqrt(r_s^2 + ψ_m^2 - 2 r_s ψ_m cos θ_s),
where ψ_m = (m - (M + 1)/2) δ is the position of the m-th microphone relative to the array center.
Assume that at discrete time k the signal emitted by the sound source is x(k). Neglecting absorption losses during propagation, the signal picked up by the m-th microphone has only a phase delay with respect to the source signal, together with an amplitude attenuation inversely proportional to the distance, and can be expressed as
y_m(k) = x_m(k) + v_m(k) = (1/r_{s,m}) x(k - τ_m) + v_m(k),   (1)
where x_m(k) denotes the source signal picked up by the m-th microphone, v_m(k) denotes the noise signal picked up by the m-th microphone, and τ_m denotes the time delay of the m-th microphone relative to the sound source.
Since the wavefront of a near-field ideal sound source is spherical, τ_m can be expressed as
τ_m = r_{s,m} / c,
where c denotes the speed of sound in air.
In the frequency domain, equation (1) becomes
Y_m(ω) = (1/r_{s,m}) e^{-j ω r_{s,m} / c} X(ω) + V_m(ω),
where ω/c denotes the wavenumber, ω = 2πf denotes the angular frequency, f denotes the frequency, j denotes the imaginary unit, and Y_m(ω), X_m(ω), V_m(ω) denote the Fourier transforms of y_m(k), x_m(k), v_m(k), respectively.
Further, according to the structure of the array, the number of array elements, the position of the sound source and other parameters, a steering vector of length M is constructed:
d_s(ω, r, θ) = [ (1/r_1) e^{-j ω r_1 / c}  (1/r_2) e^{-j ω r_2 / c}  …  (1/r_M) e^{-j ω r_M / c} ]^T,
where the superscript T denotes the transpose operation and r_m is the distance from a source at (r, θ) to the m-th microphone. Letting r = r_s and θ = θ_s, the signals picked up by the M microphones are written in vector form as
y(ω) = [Y_1(ω)  Y_2(ω)  …  Y_M(ω)]^T = d_s(ω, r_s, θ_s) X(ω) + v(ω).
y(ω) is processed by the array, and the resulting output signal is
Z(ω) = h^H(ω) y(ω),
where Z(ω) is an estimate of the source signal X(ω), the superscript H denotes the conjugate transpose operation, and h(ω) = [H_1(ω)  H_2(ω)  …  H_M(ω)]^T is the weighting vector applied by the microphone array to the input signal, i.e., the filter coefficients of the microphone array.
Further, in S4, the MVDR-based concealment filter is constructed on the sub-band with frequency ω as follows. The MVDR concealment filter is obtained by leaving the close-range source signal unattenuated while minimizing the variance of the residual noise at the array output, i.e.,
h_MVDR(ω) = argmin_{h(ω)} h^H(ω) R_vv(ω) h(ω)  subject to  h^H(ω) d_s(ω, r_s, θ_s) = 1,   (2)
where R_vv(ω) is the noise correlation matrix. Equation (2) can be solved with the Lagrange multiplier method, giving
h_MVDR(ω) = R_vv^{-1}(ω) d_s(ω, r_s, θ_s) / [ d_s^H(ω, r_s, θ_s) R_vv^{-1}(ω) d_s(ω, r_s, θ_s) ].
Further, assuming the noise is isotropic, the (m, n)-th element of its normalized correlation matrix can be written as
[Γ_dn(ω)]_{m,n} = sin(ω τ_0 (n - m)) / (ω τ_0 (n - m)),
where τ_0 = δ/c. There are two extreme cases: 1) if ω τ_0 is very large, i.e., at high frequency or large spacing, the noise signals received by two sensors are nearly uncorrelated and the isotropic noise is close to spatially white noise; 2) if ω τ_0 is very small, i.e., at low frequency or small spacing, the noise signals received by two sensors are nearly coherent and the isotropic noise is close to point-source noise.
The MVDR-based concealment filter involves a matrix inversion. When the number of microphone elements is large, the matrix becomes ill-conditioned and the filter becomes highly unstable. To avoid this, the conventional diagonal-loading technique is adopted: a scaled identity matrix is added to the matrix, and the diagonally loaded matrix can be expressed as [Γ_dn(ω) + εI], where I is the M × M identity matrix and ε is the loading coefficient. The coefficients of the MVDR-based concealment filter then become
h_MVDR(ω) = [Γ_dn(ω) + εI]^{-1} d_s(ω, r_s, θ_s) / { d_s^H(ω, r_s, θ_s) [Γ_dn(ω) + εI]^{-1} d_s(ω, r_s, θ_s) }.
the invention has the beneficial effects that: the invention is a small microphone array on a platform, the core of the invention is that time domain signals received by a sensor array are decomposed into sub-band signals by utilizing short-time Fourier transform, and a proper sound hiding filter is constructed on each sub-band, so that sound signals from a sound source within a sound hiding distance pass through the sound hiding filter without attenuation.
Drawings
To more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required for the embodiments are briefly described below. The drawings described below are obviously only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a diagram of the acoustic concealment model.
FIG. 2 is a schematic diagram of a system for processing speech with a small microphone array.
Fig. 3 is a diagram of the pick-up and processing of near-field acoustic signals by a uniform linear array.
Fig. 4 shows the concealment attenuation (a) and the concealment frequency response (b) of the 8-element MVDR-based concealment method.
Fig. 5 shows the beam patterns of the 8-element MVDR-based concealment method at 1300 Hz and 3300 Hz (blue, red, purple and black indicate r = 5 cm, 10 cm, 30 cm and 50 cm, respectively).
Fig. 6 shows the signal-to-interference ratios of the 8-element MVDR-based concealment method for a sound source at 0° in the end-fire direction with interference also in the end-fire direction.
Detailed Description
The technical solution of the present invention is described clearly and completely in the following detailed description.
In the MVDR-based acoustic concealment method, the time-domain signals received by the sensor array are decomposed into sub-band signals with the short-time Fourier transform, and a suitable concealment filter is constructed on each sub-band so that sound signals from a sound source within the concealment distance pass through the concealment filter without attenuation; the estimated signal is finally obtained through the inverse STFT.
The method of the invention comprises the following steps:
S1: construct the steering vector d_s(ω, r_s, θ_s) according to the structure of the array, the number of array elements, the position of the sound source and other parameters;
S2: divide the signal received by each sensor of the microphone array, y_m(k) = x_m(k) + v_m(k), m = 1, 2, …, M, into short frames with a certain overlap ratio (the frame length may range from a few milliseconds to tens of milliseconds), and apply the short-time Fourier transform to each frame of each of the M channels to obtain Y_m(ω, i), where i denotes the i-th frame; then construct
y(ω, i) = [Y_1(ω, i)  Y_2(ω, i)  …  Y_M(ω, i)]^T.
S3: decompose the time-domain signals received by the sensor array into sub-band signals with the short-time Fourier transform. Assume that the spacing between two adjacent microphones is δ. Because the concealment technique picks up a sound source at close range, an ideal near-field sound source and interference act on the microphone array in the acoustic environment; the distances from the sound source to the microphones are r_{s,1}, r_{s,2}, …, r_{s,M}. Taking the center of the array as the reference point, let the distance of the sound source from the reference point be r_s and its angle of incidence be θ_s. The distance from the sound source to the m-th microphone can then be expressed as
r_{s,m} = sqrt(r_s^2 + ψ_m^2 - 2 r_s ψ_m cos θ_s),
where ψ_m = (m - (M + 1)/2) δ is the position of the m-th microphone relative to the array center.
Assume that at discrete time k the signal emitted by the sound source is x(k). Neglecting absorption losses during propagation, the signal picked up by the m-th microphone has only a phase delay with respect to the source signal, together with an amplitude attenuation inversely proportional to the distance, and can be expressed as
y_m(k) = x_m(k) + v_m(k) = (1/r_{s,m}) x(k - τ_m) + v_m(k),   (1)
where x_m(k) denotes the source signal picked up by the m-th microphone, v_m(k) denotes the noise signal picked up by the m-th microphone, and τ_m denotes the time delay of the m-th microphone relative to the sound source.
Since the wavefront of a near-field ideal sound source is spherical, τ_m can be expressed as
τ_m = r_{s,m} / c,
where c denotes the speed of sound in air.
In the frequency domain, equation (1) becomes
Y_m(ω) = (1/r_{s,m}) e^{-j ω r_{s,m} / c} X(ω) + V_m(ω),
where ω/c denotes the wavenumber, ω = 2πf denotes the angular frequency, f denotes the frequency, j denotes the imaginary unit, and Y_m(ω), X_m(ω), V_m(ω) denote the Fourier transforms of y_m(k), x_m(k), v_m(k), respectively.
According to the structure of the array, the number of array elements, the position of the sound source and other parameters, a steering vector of length M is constructed:
d_s(ω, r, θ) = [ (1/r_1) e^{-j ω r_1 / c}  (1/r_2) e^{-j ω r_2 / c}  …  (1/r_M) e^{-j ω r_M / c} ]^T,
where the superscript T denotes the transpose operation and r_m is the distance from a source at (r, θ) to the m-th microphone. Letting r = r_s and θ = θ_s, the signals picked up by the M microphones are written in vector form as
y(ω) = [Y_1(ω)  Y_2(ω)  …  Y_M(ω)]^T = d_s(ω, r_s, θ_s) X(ω) + v(ω).
y(ω) is processed by the array, and the resulting output signal is
Z(ω) = h^H(ω) y(ω),
where Z(ω) is an estimate of the source signal X(ω), the superscript H denotes the conjugate transpose operation, and h(ω) = [H_1(ω)  H_2(ω)  …  H_M(ω)]^T is the weighting vector applied by the microphone array to the input signal, i.e., the filter coefficients of the microphone array.
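As an illustration of the near-field model above, the sketch below builds the steering vector d_s(ω, r, θ) and the array output Z(ω) = h^H(ω) y(ω) for a uniform linear array. The element-position convention ψ_m = (m - (M+1)/2)δ about the array center and the default speed of sound of 343 m/s are assumptions of this sketch rather than values stated in the patent.

```python
import numpy as np

def nearfield_steering_vector(omega, r, theta, M, delta, c=343.0):
    """d_s(omega, r, theta) for a uniform linear array (theta in radians)."""
    m = np.arange(1, M + 1)
    psi = (m - (M + 1) / 2.0) * delta                  # element positions about the array center
    r_m = np.sqrt(r**2 + psi**2 - 2.0 * r * psi * np.cos(theta))   # source-to-element distances
    # amplitude falls off as 1/r_m, phase delay is omega * r_m / c
    return np.exp(-1j * omega * r_m / c) / r_m

def array_output(h, y):
    """Z(omega) = h(omega)^H y(omega) for one frequency bin (vdot conjugates h)."""
    return np.vdot(h, y)
```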
Before deriving the concealment filter, several important measures of concealment performance are introduced.
The component of the array output that is related to the source signal is h^H(ω) d_s(ω, r_s, θ_s) X(ω). The response of the array to the source signal is therefore h^H(ω) d_s(ω, r_s, θ_s).
This response depends on three variables: ω, r_s and θ_s. Fixing two of them and letting the response vary with the third yields three measures: 1) with ω and θ_s fixed, the response of the array to the source signal as a function of r_s is called the concealment attenuation function; 2) with r_s and θ_s fixed, the response as a function of ω is called the concealment frequency response function; 3) with ω and r_s fixed, the response as a function of θ_s is called the beam pattern.
Concealment attenuation function
The concealment attenuation function describes the gain of the array, at different distances, for a single-frequency source signal; it is defined as
A(ω, r) = |h^H(ω) d_s(ω, r, θ_s)|.
Concealment frequency response function
The concealment frequency response function describes the gain of the array for a broadband signal; for a source at distance r in the direction θ_s it is defined as
F(ω, r) = |h^H(ω) d_s(ω, r, θ_s)|.
Beam pattern
The beam pattern describes the sensitivity of the array to signals arriving from different directions; it is defined as
B(ω, θ) = |h^H(ω) d_s(ω, r_s, θ)|.
signal to interference ratio and signal to interference ratio gain
The signal-to-noise ratio measures the relative magnitude of the source signal and the noise signal. By comparing the input-output signal-to-noise ratio, the performance of the beamformer can be measured. The invention relates toThe heart is the interference noise. The input signal-to-interference ratio of the array is defined as:
wherein, X0(ω) is the sound source signal picked up by the reference position,V0(ω) is the interference signal picked up by the reference position,rnrepresenting the distance of the interference noise from the center of the array.Respectively represent X0(ω),V0(ω) power.
The output signal-to-interference ratio of the array is expressed as:
wherein theta isnIndicating the direction of the interference noise, phiX(ω)、ΦVAnd (ω) represents the power of X (ω) and V (ω), respectively.
Thus, the signal-to-interference ratio gain is:
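The performance measures above can be evaluated numerically as in the sketch below, which reuses the nearfield_steering_vector helper from the previous sketch; the function names and the 1/r reference-point amplitudes used in the SIR gain follow the near-field model assumed here and are illustrative, not the patent's notation.

```python
import numpy as np
# assumes nearfield_steering_vector(...) from the earlier sketch is in scope

def response(h, omega, r, theta, M, delta):
    """|h^H(omega) d_s(omega, r, theta)|."""
    return abs(np.vdot(h, nearfield_steering_vector(omega, r, theta, M, delta)))

def concealment_attenuation(h, omega, theta_s, distances, M, delta):
    """Response versus source distance r (omega and theta_s fixed)."""
    return np.array([response(h, omega, r, theta_s, M, delta) for r in distances])

def beampattern(h, omega, r_s, angles, M, delta):
    """Response versus incidence angle theta (omega and r_s fixed)."""
    return np.array([response(h, omega, r_s, theta, M, delta) for theta in angles])

def sir_gain(h, omega, r_s, theta_s, r_n, theta_n, M, delta):
    """Output SIR divided by input SIR for a point interferer at (r_n, theta_n)."""
    g_s = response(h, omega, r_s, theta_s, M, delta)    # array gain toward the source
    g_n = response(h, omega, r_n, theta_n, M, delta)    # array gain toward the interference
    # with the 1/r model, the reference-point amplitudes are 1/r_s and 1/r_n
    return (g_s**2 / g_n**2) * (r_s**2 / r_n**2)
```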
s4: on the subband with frequency ω, the MVDR-based implicit filter method is constructed as follows: the MVDR hidden sound filter is obtained by not attenuating the close-distance sound source signal and minimizing the variance of the residual noise at the output end of the array, and the mathematical expression is as follows:
hMVDR=argminh(ω)hHRvvh subject to hH(ω)ds(ω,rs,θs)=1, (2)
equation (2) can be solved using the Lagrange multiplier method to obtain
Further, assuming that the noise is isotropic noise, the (m, n) th element of its normalized correlation matrix can be written as follows:
wherein,there are two extreme cases: 1) if ω τ0Very large, i.e. at high frequency or large spacing, the noise signals received by the two sensors are nearly uncorrelated, and the isotropic noise is close to spatial white noise; 2) if ω τ0Very small, i.e. at low frequencies or small spacings
Next, noise signals received by the two sensors are close to coherent, and isotropic noise is close to point source noise;
the MVDR-based implicit acoustic filter has a matrix inversion operation, when the number of microphone elements is large, the matrix is ill-conditioned, the filter is extremely unstable, and in order to avoid the inversion instability, the traditional method is adopted: the diagonal loading technique is to add the matrix and a diagonal matrix, and the matrix after diagonal loading can be expressed as [ gamma ] (gamma)dn(ω)+εI]Where I is an identity matrix of size M and ε is a loading coefficient, in which case MVDR-basedThe coefficients of the implicit sound filter are:
S5: on that sub-band, the signal of the i-th frame is processed with the concealment filter h(ω), giving Z(ω, i) = h^H(ω) y(ω, i).
S6: Z(ω, i) is transformed with the inverse short-time Fourier transform and the overlap-add method to obtain the beamformed time-domain signal z(k).
To demonstrate the effect of the invention, the following specific example verifies the correctness of the proposed algorithm. The concealment attenuation function, the concealment frequency response function and the beam patterns are MATLAB simulation results; the signal-to-interference ratios were measured in the fully anechoic chamber of the Center of Intelligent Acoustics and Immersive Communications at Northwestern Polytechnical University. The experimental conditions are as follows:
The microphone array is a uniform linear array with M = 8 microphones and an element spacing of δ = 1.1 cm.
Sound source position: θ_s = 0°, r_s = 5 cm.
Fig. 4(a) shows the concealment attenuation of the 8-element MVDR-based concealment method. It can be seen that the point-source signal is attenuated rapidly within 20 cm, and the closer the distance the faster the attenuation; beyond 20 cm the attenuation is almost inversely proportional to the distance. This shows that the concealment radius r_0 is about 20 cm: when the sound source distance satisfies r_s < r_0, the 8-element MVDR-based concealment method can conceal interference from beyond r_0.
Fig. 4(b) shows the concealment frequency response of the 8-element MVDR-based concealment method. It can be seen that, at 2000 Hz, the responses to a point source at 5 cm, 10 cm, 30 cm and 50 cm correspond to attenuations of approximately 0 dB, 19 dB, 33 dB and 38 dB, respectively. This shows that, when the sound source is at 5 cm, the 8-element MVDR-based concealment method can conceal interference at 10 cm, 30 cm and 50 cm in the end-fire direction.
Fig. 5 shows the beam patterns of the 8-element MVDR-based concealment method at 1300 Hz and 3300 Hz. It can be seen that, at different frequencies, when the sound source is at 5 cm the method can conceal interference at 10 cm, 30 cm and 50 cm from any direction, and the concealment of interference away from the end-fire direction is even better.
Fig. 6 shows the signal-to-interference ratios of the 8-element MVDR-based concealment method for a sound source at 0° in the end-fire direction with interference also in the end-fire direction. It can be seen that, when the interference is at 50 cm, the signal-to-interference ratio is improved by about 10 dB for the speech interference in Fig. 6(a), about 6 dB for the white Gaussian noise interference in Fig. 6(b), about 7 dB for the chirp interference in Fig. 6(c), and about 7 dB for the single-frequency interference in Fig. 6(d). This shows that the 8-element MVDR-based concealment method has a good concealment effect.
The above is merely a description of preferred embodiments of the present invention and does not limit its concept or scope. Various modifications and improvements made by those skilled in the art to the technical solution of the present invention without departing from its design concept shall fall within the protection scope of the present invention; the technical content claimed is set forth in the claims.

Claims (10)

1. An MVDR-based acoustic concealment method, characterized in that: the method decomposes the time-domain signals received by a sensor array into sub-band signals with the short-time Fourier transform and constructs an MVDR-based concealment filter on each sub-band, so that sound signals from a sound source within the concealment distance pass through the concealment filter without attenuation, and finally obtains the estimated signal through the inverse STFT.
2. The MVDR-based acoustic concealment method of claim 1, wherein the method comprises the following steps:
S1: construct the steering vector d_s(ω, r_s, θ_s) according to the structure of the array, the number of array elements, the position of the sound source and other parameters;
S2: divide the signal received by each sensor of the microphone array, y_m(k) = x_m(k) + v_m(k), m = 1, 2, …, M, into short frames with a certain overlap ratio (the frame length may range from a few milliseconds to tens of milliseconds), and apply the short-time Fourier transform to each frame of each of the M channels to obtain Y_m(ω, i), where i denotes the i-th frame; then construct
y(ω, i) = [Y_1(ω, i)  Y_2(ω, i)  …  Y_M(ω, i)]^T;
S3: the time-domain signals received by the sensor array are thereby decomposed into sub-band signals by the short-time Fourier transform;
S4: on the sub-band with frequency ω, construct the MVDR-based concealment filter h(ω);
S5: on that sub-band, process the signal of the i-th frame with the concealment filter h(ω), giving Z(ω, i) = h^H(ω) y(ω, i);
S6: transform Z(ω, i) with the inverse short-time Fourier transform and the overlap-add method to obtain the beamformed time-domain signal z(k).
3. The MVDR-based acoustic concealment method of claim 1, wherein the specific steps of decomposing the time-domain signals received by the sensor array into sub-band signals with the short-time Fourier transform are as follows: assume that the spacing between two adjacent microphones is δ; because the concealment technique picks up a sound source at close range, an ideal near-field sound source and interference act on the microphone array in the acoustic environment, and the distances from the sound source to the microphones are r_{s,1}, r_{s,2}, …, r_{s,M}; taking the center of the array as the reference point, let the distance of the sound source from the reference point be r_s and its angle of incidence be θ_s; the distance from the sound source to the m-th microphone can then be expressed as
r_{s,m} = sqrt(r_s^2 + ψ_m^2 - 2 r_s ψ_m cos θ_s),
where ψ_m = (m - (M + 1)/2) δ is the position of the m-th microphone relative to the array center;
assume that at discrete time k the signal emitted by the sound source is x(k); neglecting absorption losses during propagation, the signal picked up by the m-th microphone has only a phase delay with respect to the source signal, together with an amplitude attenuation inversely proportional to the distance, and can be expressed as
y_m(k) = x_m(k) + v_m(k) = (1/r_{s,m}) x(k - τ_m) + v_m(k),   (1)
where x_m(k) denotes the source signal picked up by the m-th microphone, v_m(k) denotes the noise signal picked up by the m-th microphone, and τ_m denotes the time delay of the m-th microphone relative to the sound source;
since the wavefront of a near-field ideal sound source is spherical, τ_m can be expressed as
τ_m = r_{s,m} / c,
where c denotes the speed of sound in air;
in the frequency domain, equation (1) becomes
Y_m(ω) = (1/r_{s,m}) e^{-j ω r_{s,m} / c} X(ω) + V_m(ω),
where ω/c denotes the wavenumber, ω = 2πf denotes the angular frequency, f denotes the frequency, j denotes the imaginary unit, and Y_m(ω), X_m(ω), V_m(ω) denote the Fourier transforms of y_m(k), x_m(k), v_m(k), respectively.
4. The MVDR-based acoustic concealment method of claim 3, wherein a steering vector of length M is constructed according to the structure of the array, the number of array elements, the position of the sound source and other parameters:
d_s(ω, r, θ) = [ (1/r_1) e^{-j ω r_1 / c}  (1/r_2) e^{-j ω r_2 / c}  …  (1/r_M) e^{-j ω r_M / c} ]^T,
where the superscript T denotes the transpose operation and r_m is the distance from a source at (r, θ) to the m-th microphone; letting r = r_s and θ = θ_s, the signals picked up by the M microphones are written in vector form as
y(ω) = [Y_1(ω)  Y_2(ω)  …  Y_M(ω)]^T = d_s(ω, r_s, θ_s) X(ω) + v(ω);
y(ω) is processed by the array, and the resulting output signal is
Z(ω) = h^H(ω) y(ω),
where Z(ω) is an estimate of the source signal X(ω), the superscript H denotes the conjugate transpose operation, and h(ω) = [H_1(ω)  H_2(ω)  …  H_M(ω)]^T is the weighting vector applied by the microphone array to the input signal, i.e., the filter coefficients of the microphone array.
5. The MVDR-based acoustic concealment method of claim 1, wherein, in S4, the MVDR-based concealment filter is constructed on the sub-band with frequency ω as follows: the MVDR concealment filter is obtained by leaving the close-range source signal unattenuated while minimizing the variance of the residual noise at the array output, i.e.,
h_MVDR(ω) = argmin_{h(ω)} h^H(ω) R_vv(ω) h(ω)  subject to  h^H(ω) d_s(ω, r_s, θ_s) = 1,   (2)
where R_vv(ω) is the noise correlation matrix; equation (2) can be solved with the Lagrange multiplier method, giving
h_MVDR(ω) = R_vv^{-1}(ω) d_s(ω, r_s, θ_s) / [ d_s^H(ω, r_s, θ_s) R_vv^{-1}(ω) d_s(ω, r_s, θ_s) ].
6. The MVDR-based acoustic concealment method of claim 5, wherein, assuming the noise is isotropic, the (m, n)-th element of its normalized correlation matrix can be written as
[Γ_dn(ω)]_{m,n} = sin(ω τ_0 (n - m)) / (ω τ_0 (n - m)),
where τ_0 = δ/c; there are two extreme cases: 1) if ω τ_0 is very large, i.e., at high frequency or large spacing, the noise signals received by two sensors are nearly uncorrelated and the isotropic noise is close to spatially white noise; 2) if ω τ_0 is very small, i.e., at low frequency or small spacing, the noise signals received by two sensors are nearly coherent and the isotropic noise is close to point-source noise;
the MVDR-based concealment filter involves a matrix inversion; when the number of microphone elements is large, the matrix becomes ill-conditioned and the filter becomes highly unstable; to avoid this, the conventional diagonal-loading technique is adopted, in which a scaled identity matrix is added to the matrix; the diagonally loaded matrix can be expressed as [Γ_dn(ω) + εI], where I is the M × M identity matrix and ε is the loading coefficient, and the coefficients of the MVDR-based concealment filter become
h_MVDR(ω) = [Γ_dn(ω) + εI]^{-1} d_s(ω, r_s, θ_s) / { d_s^H(ω, r_s, θ_s) [Γ_dn(ω) + εI]^{-1} d_s(ω, r_s, θ_s) }.
7. The MVDR-based acoustic concealment method of claim 1, wherein the microphone array includes, but is not limited to, a uniform linear microphone array, a non-uniform linear microphone array, or a circular microphone array.
8. The MVDR-based acoustic concealment method of claim 1, wherein the microphone array is either a small microphone array or a large microphone array.
9. The MVDR-based acoustic concealment method of claim 1, wherein the method is used to process narrowband signals or wideband signals.
10. The MVDR-based acoustic concealment method of claim 1, wherein the concealment radius of the microphone array increases as the number of microphones and the inter-element spacing increase.
CN201810221809.5A 2017-03-19 2018-03-18 Hidden method for acoustic based on MVDR Pending CN108597532A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710163190.2A CN107170462A (en) 2017-03-19 2017-03-19 Hidden method for acoustic based on MVDR
CN2017101631902 2017-03-19

Publications (1)

Publication Number Publication Date
CN108597532A true CN108597532A (en) 2018-09-28

Family

ID=59848862

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201710163190.2A Withdrawn CN107170462A (en) 2017-03-19 2017-03-19 Hidden method for acoustic based on MVDR
CN201810221809.5A Pending CN108597532A (en) 2017-03-19 2018-03-18 Hidden method for acoustic based on MVDR

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201710163190.2A Withdrawn CN107170462A (en) 2017-03-19 2017-03-19 Hidden method for acoustic based on MVDR

Country Status (1)

Country Link
CN (2) CN107170462A (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110018465B (en) * 2018-01-09 2020-11-06 中国科学院声学研究所 MVDR beam forming method based on full-phase preprocessing
CN108717855B (en) * 2018-04-27 2020-07-28 深圳市沃特沃德股份有限公司 Noise processing method and device
CN112420068B (en) * 2020-10-23 2022-05-03 四川长虹电器股份有限公司 Quick self-adaptive beam forming method based on Mel frequency scale frequency division
CN116013239B (en) * 2022-12-07 2023-11-17 广州声博士声学技术有限公司 Active noise reduction algorithm and device for air duct

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103856866A (en) * 2012-12-04 2014-06-11 西北工业大学 Low-noise differential microphone array
CN104464739A (en) * 2013-09-18 2015-03-25 华为技术有限公司 Audio signal processing method and device and difference beam forming method and device
CN103491397A (en) * 2013-09-25 2014-01-01 歌尔声学股份有限公司 Method and system for achieving self-adaptive surround sound
CN104902418A (en) * 2014-03-07 2015-09-09 奥迪康有限公司 Multi-microphone method for estimation of target and noise spectral variances
EP2916320A1 (en) * 2014-03-07 2015-09-09 Oticon A/s Multi-microphone method for estimation of target and noise spectral variances
CN104980865A (en) * 2014-04-03 2015-10-14 奥迪康有限公司 Binaural hearing assistance system comprising binaural noise reduction

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110148420A (en) * 2019-06-30 2019-08-20 桂林电子科技大学 A kind of audio recognition method suitable under noise circumstance

Also Published As

Publication number Publication date
CN107170462A (en) 2017-09-15

Similar Documents

Publication Publication Date Title
JP5323995B2 (en) System, method, apparatus and computer readable medium for dereverberation of multi-channel signals
CN108597532A (en) Hidden method for acoustic based on MVDR
US7366662B2 (en) Separation of target acoustic signals in a multi-transducer arrangement
US9485574B2 (en) Spatial interference suppression using dual-microphone arrays
US8098844B2 (en) Dual-microphone spatial noise suppression
US9100734B2 (en) Systems, methods, apparatus, and computer-readable media for far-field multi-source tracking and separation
CN108337605A (en) The hidden method for acoustic formed based on Difference Beam
TW201030733A (en) Systems, methods, apparatus, and computer program products for enhanced active noise cancellation
US8615392B1 (en) Systems and methods for producing an acoustic field having a target spatial pattern
CN113782046B (en) Microphone array pickup method and system for long-distance voice recognition
Yamamoto et al. Spherical microphone array post-filtering for reverberation suppression using isotropic beamformings
Šarić et al. Performance analysis of MVDR beamformer applied on an end-fire microphone array composed of unidirectional microphones
CN112017684B (en) Closed space reverberation elimination method based on microphone array
Li et al. Speech Enhancement Using Robust Generalized Sidelobe Canceller with Multi-Channel Post-Filtering in Adverse Environments
Zhang et al. Speech enhancement using compact microphone array and applications in distant speech acquisition
Zhang et al. A frequency domain approach for speech enhancement with directionality using compact microphone array.
Zhang et al. A compact-microphone-array-based speech enhancement algorithm using auditory subbands and probability constrained postfilter
Datla Implementation and evaluation of spectral subtraction (SS) with minimum statistics and wiener beamformer combination
Li et al. A Subband Feedback Controlled Generalized Sidelobe Canceller in Frequency Domain with Multi-Channel Postfilter
Tashev Defeating ambient noise: practical approaches for noise reduction and suppression
Nilsson Suppression of reverberation in hearing aids
Tashev Microsoft Research Redmond, USA

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180928

RJ01 Rejection of invention patent application after publication