Microphone array sound source localization signal processing method
Technical field
The invention belongs to the field of audio signal processing and array signal processing, and in particular relates to a microphone array sound source localization signal processing method.
Background art
Current microphone array localization algorithms fall roughly into three categories: localization based on time difference of arrival (TDOA), steered response power (SRP), and algorithms based on high-resolution spectral estimation. Algorithms based on high-resolution spectral estimation were initially applied to the localization of narrowband sources and were later gradually extended by many researchers to the wideband source localization problem. When extended to wideband signal estimation, the signal must either be divided into multiple subbands in the frequency domain, or frequency focusing must be applied so that it can be processed as a narrowband signal. Such algorithms offer very high localization resolution, but the wideband-to-narrowband conversion greatly increases the computational load; in practice their performance degrades sharply because the number of sources is unknown and the noise environment does not satisfy the ideal white Gaussian noise assumption.
The core of localization algorithms based on time difference of arrival (TDOA) is the accurate estimation of the acoustic propagation delay, which is generally obtained by cross-correlation or generalized cross-correlation between microphone signals. The sound source position is then determined by applying a geometric algorithm. TDOA-based localization algorithms have relatively low computational cost, good real-time performance, and low hardware cost; they have therefore attracted wide attention and become widely used methods for sound source localization. In these methods, the accuracy of the delay estimate determines the accuracy of the localization, and ambient noise and room reverberation both affect this accuracy.
The SRP method divides the space into a grid, with a hypothetical sound source at each grid point. For each hypothetical source, the delay difference to each pair of microphones at designated positions can be computed; summing the cross-correlation values of all microphone pairs at the corresponding delay differences yields the steered response power, and the hypothetical source position at which the response power reaches its maximum is the estimate of the true source position. The sound source localization method combining steered response power and phase transform (SRP-PHAT) combines the inherent robustness and short-time analysis of the steered response power method with the insensitivity of the phase transform to the signal environment in delay estimation, giving the localization system a certain degree of noise and reverberation immunity. However, the performance of the SRP-PHAT method still degrades sharply in harsh environments (strong noise interference, severe reverberation).
Summary of the invention
The object of the present invention is to solve the problem that, in the prior art, the localization accuracy of the SRP-PHAT method is severely affected by ambient noise and reverberation conditions and degrades sharply.
To achieve the above object, the present invention discloses a microphone array sound source localization signal processing method, comprising:
Step 1) the measurement space for estimating the sound source position is divided into Q grid points, each grid point having three-dimensional coordinates (xq, yq, zq), q=1, ..., Q; the signals of M microphones are sampled, and the delay difference from each grid point to every two different microphone signals is calculated;
Step 2) the current frame of data of the M microphone channels is acquired, and the delay value of each microphone pair is calculated; the weighting value wq of the q-th grid point is calculated from these delay values and the delay differences of step 1); then the SRP-PHAT value pq of the q-th grid point is calculated, and the grid point corresponding to the maximum value of wqpq among the Q grid points is found, so as to obtain the grid point coordinates of the estimated sound source position corresponding to this frame of data.
As an improvement of the above method, step 1) includes:
Step 1-1) an array of M microphones is distributed in three-dimensional space, the coordinates of the i1-th microphone being (xi1, yi1, zi1), i1=1, ..., M;
Step 1-2) in the measurement space, all possible sound source positions are divided into Q grid points with three-dimensional coordinates (xq, yq, zq), q=1, ..., Q;
Step 1-3) each microphone corresponds to one channel; the sampling frequency of the signal is fs, the sample length of each channel per frame is L, and the sampled signal of each channel is xi1(n), i1=1, ..., M, n=1, ..., L; the number of Fourier transform points is 2L-1;
Step 1-4) calculate the delay difference Δτi1i2(q) from grid point (xq, yq, zq) to the i1-th and i2-th channels:

Δτi1i2(q) = [ √((xq-xi1)² + (yq-yi1)² + (zq-zi1)²) - √((xq-xi2)² + (yq-yi2)² + (zq-zi2)²) ] / c

wherein i2=1, ..., M, i2 ≠ i1, and c is the speed of sound.
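For illustration only, the precomputation of step 1-4) can be sketched in Python with NumPy as follows; the array names mic_pos and grid_pos and the function name grid_delay_differences are chosen here for the example and are not part of the claimed method:

```python
import numpy as np

def grid_delay_differences(mic_pos, grid_pos, c=343.0):
    """Delay difference (in seconds) from every grid point to every microphone pair.

    mic_pos:  (M, 3) array of microphone coordinates (xi1, yi1, zi1)
    grid_pos: (Q, 3) array of grid point coordinates (xq, yq, zq)
    Returns delta_tau with shape (M, M, Q), where delta_tau[i1, i2, q] = Δτi1i2(q).
    """
    # Propagation distance from each grid point to each microphone: shape (M, Q)
    dist = np.linalg.norm(mic_pos[:, None, :] - grid_pos[None, :, :], axis=2)
    # Pairwise difference of propagation times, divided by the speed of sound c
    return (dist[:, None, :] - dist[None, :, :]) / c
```

This table depends only on the array geometry and the chosen grid, so it can be computed once and stored, as described in the method.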
As an improvement of the above method, step 2) includes:
Step 2-1) calculate separately the 2L-1 point fast Fourier transform (FFT) of each microphone channel signal xi1(n), i1=1, ..., M, n=1, ..., L, obtaining Xi1(k), i1=1, ..., M, k=1, ..., 2L-1;
Step 2-2) calculate the phase transform (PHAT) cross-correlation value Ri1i2(l) of the i1-th and i2-th microphone channels:

Ri1i2(l) = Σ(k=1,...,2L-1) [ Xi1(k) X*i2(k) / ( |Xi1(k)| |Xi2(k)| ) ] e^(j2πkl/(2L-1))

wherein Xi1(k) is the frequency-domain representation of the i1-th channel received signal xi1(n), i1=1, ..., M, n=1, ..., L, computed with a 2L-1 point fast Fourier transform (FFT); Xi2(k) is the frequency-domain representation of the i2-th channel received signal xi2(n), i2=1, ..., M, n=1, ..., L; X*i2(k) is the conjugate of Xi2(k); |Xi1(k)| is the magnitude of Xi1(k); l=1, ..., L;
Step 2-3) from Ri1i2(l), calculate the delay value τ̂i1i2 between the i1-th and i2-th microphone channels as the position of the maximum of Ri1i2(l);
Step 2-4) calculate the standard deviation between Δτi1i2(q) and τ̂i1i2 over all microphone pairs, and take its inverse as the weighting value wq of each grid point;
Step 2-5) calculate the steered response power and phase transform (SRP-PHAT) value pq of each grid point;
Step 2-6) calculate the weighted steered response power and phase transform (SRP-PHAT) value wqpq of the q-th grid point, find the maximum among the Q values of wqpq, and obtain the grid point corresponding to this maximum value;
Step 2-7) obtain, from the grid point corresponding to the maximum value of wqpq, the sound source position corresponding to this frame of data.
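For illustration only, one frame of step 2) can be sketched in Python with NumPy as follows, assuming the delay-difference table of step 1) is already available; the names frames, delta_tau, and locate_frame are chosen for the example, and the circular-lag handling is one possible realization rather than the only one:

```python
import numpy as np

def locate_frame(frames, delta_tau, fs):
    """One frame of the weighted SRP-PHAT search.

    frames:    (M, L) array, current frame of every microphone channel
    delta_tau: (M, M, Q) delay differences Δτi1i2(q) in seconds (from step 1)
    fs:        sampling frequency in Hz
    Returns the index q of the grid point with the largest weighted SRP-PHAT value.
    """
    M, L = frames.shape
    nfft = 2 * L - 1
    X = np.fft.fft(frames, n=nfft, axis=1)             # step 2-1: 2L-1 point FFT

    Q = delta_tau.shape[2]
    p = np.zeros(Q)                                     # SRP-PHAT value pq per grid point
    mismatch = []                                       # Δτi1i2(q) minus estimated delay, per pair

    lags = np.fft.fftfreq(nfft) * nfft                  # signed lag (in samples) of each IFFT bin
    for i1 in range(M):
        for i2 in range(i1 + 1, M):
            cross = X[i1] * np.conj(X[i2])
            cross /= np.abs(cross) + 1e-12              # PHAT weighting
            r = np.real(np.fft.ifft(cross))             # step 2-2: PHAT cross-correlation
            tau_hat = lags[np.argmax(r)] / fs           # step 2-3: delay estimate in seconds
            mismatch.append(delta_tau[i1, i2, :] - tau_hat)
            # step 2-5: accumulate the cross-correlation at the hypothesised delay
            lag_idx = np.rint(delta_tau[i1, i2, :] * fs).astype(int) % nfft
            p += r[lag_idx]

    sigma = np.std(np.stack(mismatch), axis=0)          # step 2-4: std over all pairs
    w = 1.0 / (sigma + 1e-12)                           # weight wq = inverse standard deviation
    return int(np.argmax(w * p))                        # steps 2-6 and 2-7: weighted peak
```

The sketch sums each microphone pair only once (i2 > i1); summing over all ordered pairs, as in the text, scales every SRP-PHAT value by the same factor and does not change the location of the maximum.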
The present invention has the following advantages:
1. The present invention discloses a microphone array sound source localization signal processing method that uses a weighted SRP-PHAT localization scheme: the inverse of the standard deviation between the delay values estimated from the PHAT cross-correlation and the delay values corresponding to each search point is used as a weight on the SRP-PHAT value when computing the response power of the spatial grid points; this method can further improve the accuracy of sound source localization;
2. In the present invention, the closer the relative delay values between the sound source position and the microphones are to the delay values computed by the PHAT cross-correlation method, the larger the response power value;
3. The present invention solves the problem that, in the prior art, the localization accuracy of the SRP-PHAT method is severely affected by ambient noise and reverberation conditions and degrades sharply.
Brief description of the drawings
Fig. 1 is a flow chart of the signal processing method of the present invention.
Detailed description of the embodiments
The present invention will be described in detail below with reference to the drawings and specific embodiments.
An array of M microphones is distributed in three-dimensional space, the coordinates of the i1-th microphone being (xi1, yi1, zi1), i1=1, ..., M. According to the estimation accuracy required of the system, all possible sound source positions in the measurement space can be reduced to the lattice points of a three-dimensional grid. Assume the space is divided into Q grid points in total, with coordinates (xq, yq, zq), q=1, ..., Q. The sampling rate of the signal is fs, and the sample length of each channel per frame is L.
The weighted SRP-PHAT sound source localization method disclosed by the invention determines the estimated sound source position by searching the grid for the position with the largest weighted SRP-PHAT value:

(x̂s, ŷs, ẑs) = (xq̂, yq̂, zq̂),  q̂ = argmax(q=1,...,Q) wqpq    (1)

wherein pq is the SRP-PHAT value of search point (xq, yq, zq), calculated as follows:

pq = Σ(i1=1,...,M) Σ(i2=1,...,M, i2≠i1) Ri1i2(Δτi1i2(q))    (2)

wherein the PHAT cross-correlation value Ri1i2(Δτi1i2(q)) is the value of Ri1i2(l) at the lag corresponding to the delay difference Δτi1i2(q), calculated as follows:

Ri1i2(l) = Σ(k=1,...,2L-1) [ Xi1(k) X*i2(k) / ( |Xi1(k)| |Xi2(k)| ) ] e^(j2πkl/(2L-1))    (3)
wherein Xi1(k) is the frequency-domain representation of the i1-th channel received signal xi1(n), i1=1, ..., M, n=1, ..., L, computed with a 2L-1 point FFT; Xi2(k) is the frequency-domain representation of the i2-th channel received signal xi2(n), i2=1, ..., M, n=1, ..., L; X*i2(k) is the conjugate of Xi2(k); |Xi1(k)| is the magnitude of Xi1(k); l=1, ..., L;
wherein Δτi1i2(q) is the delay difference from grid point (xq, yq, zq) to the i1-th and i2-th channels, calculated as:

Δτi1i2(q) = [ √((xq-xi1)² + (yq-yi1)² + (zq-zi1)²) - √((xq-xi2)² + (yq-yi2)² + (zq-zi2)²) ] / c    (4)

wherein i2=1, ..., M, i2 ≠ i1, and c is the speed of sound.
The weighting value wq is calculated as follows:

wq = 1 / σq    (5)

wherein σq is the standard deviation, over all microphone pairs (i1, i2), of the differences Δτi1i2(q) - τ̂i1i2, and τ̂i1i2 is the delay value estimated from the position of the maximum of Ri1i2(τ):

τ̂i1i2 = argmax(τ) Ri1i2(τ)    (6)
Embodiment
An array of M microphones is distributed in three-dimensional space, the coordinates of the i1-th microphone being (xi1, yi1, zi1), i1=1, ..., M. According to the estimation accuracy required of the system, all possible sound source positions in the measurement space can be reduced to the lattice points of a three-dimensional grid. Assume the space is divided into Q grid points in total, with coordinates (xq, yq, zq), q=1, ..., Q. Each microphone corresponds to one channel; the sampling frequency of the signal is fs, the sample length of each channel per frame is L, and the sampled signal of each channel is denoted xi1(n), i1=1, ..., M, n=1, ..., L. The number of Fourier transform points is 2L-1.
As shown in Fig. 1, the specific steps of the signal processing method disclosed by the invention are as follows:
Step 1) according to the microphone position coordinates and the search grid point coordinates, calculate with formula (4) the delay difference from each grid point to each microphone pair, and store it for later use. This step is performed only once;
Step 2) process each frame of data to obtain the estimated sound source position for that frame.
The specific steps of processing each frame of data are as follows:
Step 2-1) calculate separately the 2L-1 point fast Fourier transform (FFT) of each channel signal xi1(n), i1=1, ..., M, n=1, ..., L, obtaining Xi1(k), i1=1, ..., M, k=1, ..., 2L-1;
Step 2-2) calculate with formula (3) the PHAT cross-correlation value Ri1i2(l) of the signals of every pair of microphone channels;
Step 2-3) calculate with formula (6), using the PHAT cross-correlation value Ri1i2(τ), the delay estimate τ̂i1i2 between every pair of channels;
Step 2-4) calculate with formula (5) the standard deviation between Δτi1i2(q) and τ̂i1i2 to obtain the weighting value wq of each grid point;
Step 2-5) calculate with formula (2) the SRP-PHAT value pq of each grid point;
Step 2-6) calculate with formula (1) the weighted SRP-PHAT value wqpq of all grid points, and find the grid point corresponding to the maximum value;
Step 2-7) obtain, from the grid point corresponding to the maximum value of wqpq, the sound source position corresponding to this frame of data.
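For illustration only, the two sketches given earlier could be combined per frame as follows; the array geometry, grid spacing, and sampling rate are arbitrary example values, and frames stands in for one frame of sampled microphone signals:

```python
import numpy as np

# Example geometry: M = 4 microphones on a small layout (coordinates in metres)
mic_pos = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                    [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]])

# Example search grid over the measurement space
xs = np.linspace(-2.0, 2.0, 21)
zs = np.linspace(0.5, 2.5, 11)
grid_pos = np.array([[x, y, z] for x in xs for y in xs for z in zs])

delta_tau = grid_delay_differences(mic_pos, grid_pos)   # step 1), performed once

fs = 16000
frames = np.random.randn(4, 1024)                        # stand-in for one frame of samples
q_hat = locate_frame(frames, delta_tau, fs)              # step 2), performed per frame
x_s, y_s, z_s = grid_pos[q_hat]                          # estimated sound source position
```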
In the weighted SRP-PHAT microphone array sound source localization signal processing method disclosed by the invention, the inverse of the standard deviation between the delay values estimated from the PHAT cross-correlation and the delay values corresponding to each search point is used as a weight on the SRP-PHAT value when computing the response power of the spatial grid points. The guiding principle is that if a grid point is the correct sound source position, its relative delay values with respect to the microphone pairs are closer to the delay values computed by the PHAT cross-correlation method, which in turn makes the response power value of that point larger. Using this method, the accuracy of sound source localization can be further improved.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present invention and are not limiting. Although the present invention has been described in detail with reference to the embodiments, those skilled in the art should understand that modifications or equivalent replacements of the technical solution of the present invention that do not depart from the spirit and scope of the technical solution of the present invention are all intended to be covered by the scope of the claims of the present invention.