CN104142492A - SRP-PHAT multi-source spatial positioning method - Google Patents
- Publication number
- CN104142492A (application CN201410366922.4A)
- Authority
- CN
- China
- Prior art keywords
- omega
- tau
- source
- microphone
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S5/00—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
- G01S5/18—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
- G01S5/20—Position of source determined by a plurality of spaced direction-finders
Abstract
The invention provides an SRP-PHAT multi-source spatial positioning method. First, the number and spatial positions of all microphones in a uniform circular microphone array are assumed to remain unchanged during data acquisition; the isotropic microphones are evenly distributed on a circle of radius r lying in the x-y plane, the direction of arrival of a plane wave s is expressed in polar coordinates, and the origin of the coordinate system is placed at the centre of the circular array. The multiple source signals are divided into non-overlapping sets of time-frequency points, so that each time-frequency window contains only one active source signal and the weak W-disjoint orthogonality condition is satisfied. A Hamming window is chosen, the steered response power function is computed, and an objective function is obtained by the SRP-PHAT algorithm; the beam is steered to scan all possible receiving directions, and the direction of maximum output power gives the direction of a sound source. As a result, DOA estimation of multiple sound sources has good separation performance in acoustic environments with strong noise and moderate reverberation, the true peaks stand out clearly, and high positioning accuracy is achieved.
Description
Technical field
The present invention relates to a spatial positioning method, and in particular to an SRP-PHAT multi-source spatial positioning method, applicable in systems such as video conferencing, speech enhancement, hearing aids, hands-free telephony and intelligent robots.
Background technology
Sound source localization technology has broad application prospects in systems such as video conferencing, speech enhancement, hearing aids, hands-free telephony and intelligent robots, and has received increasing attention in recent years.
The steered response power with phase transform weighting (SRP-PHAT) sound source localization algorithm has become the mainstream algorithm. It combines the advantages of steered beamforming and GCC-PHAT and is robust under low signal-to-noise-ratio conditions. It performs well for single-source localization, but its greatest drawback is the large amount of computation, which limits its application in real-time systems.
Many researchers have attempted to reduce the computational load of its core steered-response-power search. For example, a secondarily accelerated SRP-PHAT localization algorithm uses a vertically arranged array to convert the two-dimensional spatial search into a one-dimensional search, and adopts a hierarchical, coarse-to-fine search strategy over the one-dimensional space. As another example, an improved joint SRP-PHAT speech localization algorithm uses an orthogonal linear microphone array to reduce the two-dimensional search space to corresponding one-dimensional spaces, then performs a hierarchical search in each one-dimensional space to find the SRP maximum and determine the source position.
In practice it is usually necessary to estimate the positions of multiple sound sources. The existing W-disjoint orthogonality hypothesis based on the sparsity of speech signals is not satisfied for multiple sources, so such methods have low spatial resolution and are easily affected by reverberation; in particular, under reverberant and noisy conditions they cannot distinguish two sources whose directions are close together. Multi-source localization therefore has important theoretical significance and practical value.
Summary of the invention
The present invention overcomes the shortcomings of the prior art and provides an SRP-PHAT multi-source spatial positioning method that can distinguish, under reverberant and noisy conditions, multiple signal sources whose directions are close together, with good positioning performance.
To solve the above technical problems, the present invention is realized through the following technical solutions:
An SRP-PHAT multi-source spatial positioning method, characterized by comprising the following steps:
1) computing the spatial coordinates under the assumed conditions: first assume that the number and spatial positions of all microphones of the uniform circular microphone array remain unchanged during data acquisition, that the source-to-microphone distances satisfy the requirements of the sound-field model, and that the physical properties of all microphones are identical; the isotropic microphones are evenly distributed on a circle of radius r lying in the x-y plane; the direction of arrival of the plane wave s is expressed in polar coordinates, the origin of the coordinate system is placed at the centre of the circular array, and the pitch angle of the signal is θ ∈ [0, π/2] and its azimuth φ ∈ [0, 2π];
2) dividing the multiple source signals into non-overlapping sets of time-frequency points, so that each time-frequency window contains only one active source signal and the weak W-disjoint orthogonality condition is satisfied; a Hamming window is chosen, and when WDO_M = 1 the sources are exactly W-disjoint orthogonal;
3) computing, by the SRP-PHAT algorithm, the phase-transform steered response power function over all microphone pairs to obtain an objective function; the beamformer steers the beam to scan all possible receiving directions, and the direction with maximum beam output power gives the direction of a sound source.
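The scan-and-pick-maximum idea of step 3) can be sketched for a single narrowband source on a uniform circular array; the array radius, frequency and source azimuth below are illustrative assumptions, not values taken from the patent:

```python
import numpy as np

# Hypothetical parameters (not from the patent): 8-mic circle, radius 0.1 m.
N, r, c, f = 8, 0.10, 343.0, 1000.0
omega = 2 * np.pi * f
psi = 2 * np.pi * np.arange(N) / N          # microphone azimuths

def delays(phi, theta=np.pi / 2):
    """Plane-wave delay of each microphone relative to the array centre."""
    return -(r / c) * np.sin(theta) * np.cos(phi - psi)

phi_true = np.deg2rad(74.0)
x = np.exp(-1j * omega * delays(phi_true))  # narrowband phasors at the mics

# Steer the beam over all candidate azimuths and keep the power maximum.
grid = np.deg2rad(np.arange(0.0, 360.0, 1.0))
power = np.array([np.abs(np.sum(x * np.exp(1j * omega * delays(p))))**2
                  for p in grid])
phi_est = np.rad2deg(grid[np.argmax(power)])
print(phi_est)                              # -> 74.0
```

The same scan is performed over (θ, φ) jointly in the method proper, with a broadband SRP-PHAT objective instead of a single-frequency power.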
Further, said step 2) comprises:
First, two important performance criteria are introduced: (1) to what extent the mask retains the sound source of interest; (2) to what extent the mask suppresses interfering sources;
Consider dividing the multiple source signals into non-overlapping sets of time-frequency points, with only one active source signal in each time-frequency window, approximately satisfying

S_j(t, ω) S_k(t, ω) ≈ 0 for all j ≠ k;

The time-frequency mask is defined as

M_j(t, ω) = 1 if S_j(t, ω) ≠ 0, and M_j(t, ω) = 0 otherwise;

By estimating the time-frequency mask corresponding to each source, a given source j can thereby be obtained from the mixture:

S_j(t, ω) ≈ M_j(t, ω) X(t, ω),

where M_j is the indicator function of the support of source j, and S_j(t, ω) and X(t, ω) are the time-frequency representations of s_j(t) and x(t) respectively;
For a given time-frequency mask M, the preserved signal ratio PSR_M is defined as:

PSR_M = ||M S_j||² / ||S_j||²;

PSR_M estimates the percentage of the energy of source S_j that is retained after the mask is applied;
At the same time define

Z_j(t, ω) = Σ_{k≠j} S_k(t, ω),

where Z_j is the sum of the interference active alongside source S_j;
The signal-to-interference ratio after applying the time-frequency mask M is defined as:

SIR_M = ||M S_j||² / ||M Z_j||²,

where SIR_M estimates the signal-to-interference ratio of the separated signal after applying the time-frequency mask M;
From PSR_M and SIR_M the approximate W-disjoint orthogonality WDO_M can be estimated:

WDO_M = PSR_M − PSR_M / SIR_M;

Because speech signals have sparse time-frequency representations, the power of the time-frequency representation is concentrated in a small fraction of the time-frequency points and the product of the representations of two sources is usually small, so the weak W-disjoint orthogonality condition is satisfied; in particular, when WDO_M = 1 the sources are exactly W-disjoint orthogonal.
Further, said step 3) uses the SRP-PHAT algorithm for a dual-microphone pair:
For an array consisting of only two microphones m_i and m_j, the delay with which a signal arriving from azimuth φ and pitch angle θ reaches the two microphones is Δτ_ij(θ, φ). The TDOA can be estimated by generalized cross-correlation (GCC), expressed as:

P(r) = R_{s_i s_j}(Δτ_ij(θ, φ)),

where P(r) is the spatial likelihood function of the three-dimensional position vector r, obtained by evaluating the generalized cross-correlation over all possible θ and φ. In the frequency domain the generalized cross-correlation function R_{s_i s_j}(Δτ_ij(θ, φ)) can be expressed as:

R_{s_i s_j}(τ) = (1/2π) ∫ Ψ_ij(ω) S_i(ω) S_j*(ω) e^{jωτ} dω,

where Ψ_ij(ω) is the weighting function and S_i(ω) S_j*(ω) is the cross-spectral density function;
The phase transform (PHAT) is a typical such weighting;
The phase weighting function is defined as:

Ψ_ij(ω) = 1 / |S_i(ω) S_j*(ω)|.

By choosing a suitable weighting function, the steered response power of the delay-and-sum beamformer satisfies an optimal signal-to-noise-ratio (SNR) criterion, and the generalized cross-correlation R_{s_i s_j}(Δτ_ij(θ, φ)) exhibits a single peak within the bounded range of τ, corresponding to the TDOA of propagation to microphones m_i and m_j.
Further, said step 3) uses the SRP-PHAT algorithm for the sound sources of a circular microphone array:
The generalized cross-correlations of all microphone pairs are summed:

P(Δτ_1, Δτ_2, …, Δτ_N) = Σ_{i=1}^{N} Σ_{j=i+1}^{N} R_{s_i s_j}(Δτ_i − Δτ_j),

where Δτ_1, Δτ_2, …, Δτ_N are the steerable delays of the N microphones, Δτ_i = τ_i − τ_0 for i = 1 … N, and τ_0 is the reference delay estimate, taken as the minimum of all microphone delays.
Further, said step 3) uses the SRP-PHAT algorithm for multiple sources on a circular microphone array:
When there are two or more sound sources, the SRP-PHAT peak of one source mixes into that of another, spurious peaks appear at some points, and it becomes difficult to find the local maxima simultaneously;
Using the approximate W-disjoint orthogonality of speech signals, the relative delay of each source signal to the microphone array is estimated in the time-frequency domain, taking the short-time Fourier transform as the approximate W-disjoint orthogonal transform;
The frequency-domain representation of the signal model at the i-th microphone is assumed to be:

X_i(t, ω) = Σ_n S_n(t, ω) e^{-jωΔτ_{n,i}} + N_i(t, ω);

Given a window function W, the short-time Fourier transform of s_j is S_j, so that

S_j(t, ω) = Σ_τ W(τ − t) s_j(τ) e^{-jωτ};

By choosing an appropriate window function and window size, under the approximate W-disjoint orthogonality hypothesis only one sound source n is active at any time-frequency point, and its cross-spectrum is:

X_i(t, ω) X_j*(t, ω) ≈ |S_n(t, ω)|² e^{-jω(Δτ_{n,i} − Δτ_{n,j})}.

The delay difference Δτ_{n,i} − Δτ_{n,j} between microphones i and j can then be obtained from the cross-power spectrum.
Compared with the prior art, the beneficial effects of the invention are as follows:
Theoretical analysis and simulation experiments show that the SRP-PHAT multi-source spatial positioning method of the present invention, a joint approximate W-disjoint orthogonality SRP-PHAT algorithm based on a circular array, gives DOA estimation of multiple sound sources good separation performance in acoustic environments with strong noise and moderate reverberation, makes the true peaks stand out clearly, and achieves high positioning accuracy.
1. Existing research on uniform circular arrays mostly addresses single-source localization, and multi-source localization with circular arrays has received relatively little attention; the circular array offers higher spatial resolution.
2. On the basis of the approximate W-disjoint orthogonality hypothesis, the SRP-PHAT algorithm gives the DOA estimation of multiple sources good separation performance in acoustic environments with strong noise and moderate reverberation, makes the true peaks stand out clearly, and has high positioning accuracy.
3. The method effectively solves the problem of false spectral peaks, and 3 signal sources can be distinguished.
4. The method is suitable for localization under moderate reverberation.
Accompanying drawing explanation
The accompanying drawings are provided for a further understanding of the present invention; together with the embodiments they serve to explain the present invention and do not limit it. In the drawings:
Fig. 1 is the geometry of the uniform circular array;
Fig. 2 is the beamforming principle of the uniform circular array;
Fig. 3 is the WDO ratio (80%) in the 3-source case;
Fig. 4 is the WDO ratio (90%) in the 3-source case;
Fig. 5 is the time-frequency analysis |S_1^W(t, ω)| of sound source s_1(t);
Fig. 6 is the time-frequency analysis |S_2^W(t, ω)| of sound source s_2(t);
Fig. 7 is the time-frequency analysis of the product |S_1^W(t, ω) S_2^W(t, ω)|;
Fig. 8 is a block diagram of the implementation of the method;
Fig. 9 is the uniform circular array;
Fig. 10 is a two-dimensional image of two-source localization at a signal-to-noise ratio of 20 dB;
Fig. 11 is a two-dimensional image of two-source localization at a signal-to-noise ratio of 30 dB;
Fig. 12 shows the azimuths of two sources detected by the circular array at a signal-to-noise ratio of 20 dB;
Fig. 13 shows the azimuths of two sources detected by the circular array at a signal-to-noise ratio of 30 dB;
Fig. 14 is a three-dimensional plot of the two detected source directions at a signal-to-noise ratio of 30 dB;
Fig. 15 is a two-dimensional image of three-source localization at a signal-to-noise ratio of 30 dB;
Fig. 16 shows the azimuths of three sources detected by the circular array with the improved method, at a signal-to-noise ratio of 30 dB;
Fig. 17 shows the azimuths of three sources detected by the circular array with the conventional method, at a signal-to-noise ratio of 30 dB;
Fig. 18 shows the signal waveforms received by the 8-element microphone array;
Fig. 19 shows the relationship between signal-to-noise ratio and angular error for different reverberation times T60.
Embodiment
The preferred embodiments of the present invention are described below with reference to the accompanying drawings; it should be understood that the preferred embodiments described here are only intended to describe and explain the present invention, not to limit it.
Step one: the localization model and uniform circular array beamforming.
A uniform circular array determines the spatial coordinate system, as shown in Fig. 1: isotropic microphones are evenly distributed on a circle of radius R lying in the x-y plane. Polar coordinates are used to express the direction of arrival of the plane wave s; the origin of the coordinate system is placed at the centre of the circular array and serves as the system reference point; the pitch angle of the signal is θ ∈ [0, π/2] and its azimuth φ ∈ [0, 2π]. Here r is the distance from the sound source to the centre of the circular array, and r_i is the distance from the sound source to microphone m_i.
Suppose the source signal is:

s(t) = e^{jω_0 t},  (1)

where ω_0 is the angular frequency of the source signal, ω_0 = 2πf, C is the wave speed, C = 384 m/s, and f is the frequency of the source (Hz).
The signal received by the i-th microphone is

f_i(r, t) = s(t − Δτ_i).  (2)
As shown in Fig. 1, by the geometry of the array,

r_i = sqrt( r² + R² − 2 r R sinθ cos(φ − φ_i) ),  (3)

where r_i is the distance from the i-th microphone to the source, r is the distance from the source to the centre of the circular microphone array, R is the radius of the circular array, θ is the pitch angle of the source, φ is the azimuth of the source, and φ_i, i = 0, 1, 2, …, N−1, is the azimuth of the i-th microphone.
Thus the time delay of each microphone before superposition is

Δτ_i = (r_i − r) / C,  (4)

where C is the wave speed, C = 384 m/s.
As shown in Fig. 2, the delay-and-sum beamformer sums the time-shifted signals captured by all the microphones. Superposing the contributions of each source at a far-field point gives the far-field pattern function of the ring array:

y(t, θ, φ) = Σ_{i=0}^{N−1} f_i(r, t + Δτ_i).  (5)

Substituting (4) into (5) gives

y(t, θ, φ) = e^{jω_0 t} Σ_{i=0}^{N−1} e^{jω_0 (Δτ_i − (1/C) k̄ᵀ p_i)},  (6)

where k̄ is the unit wave-number vector of the source, T denotes vector transposition, p_i is the position vector of the i-th microphone, Δτ_i = τ_i − τ_0, and τ_0 is the reference delay estimate, taken as the minimum of all microphone delays.
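As a small numerical check of the delay model above, the per-microphone delays of equations (3)-(4) and their normalization to the earliest microphone can be computed as follows; the array radius, source distance and source direction are illustrative assumptions, not values from the patent:

```python
import numpy as np

# Illustrative values: 8 mics, R = 0.1 m, and the patent's C = 384 m/s.
N, R, C = 8, 0.10, 384.0
phi_i = 2 * np.pi * np.arange(N) / N          # microphone azimuths

r = 2.0                                        # source-to-centre distance (m)
theta, phi = np.deg2rad(46.0), np.deg2rad(74.0)

# Source-to-microphone distances, eq. (3), and the delays of eq. (4).
r_i = np.sqrt(r**2 + R**2 - 2 * r * R * np.sin(theta) * np.cos(phi - phi_i))
tau = (r_i - r) / C
dtau = tau - tau.min()                         # reference = earliest microphone
print(dtau.min() == 0.0, bool(np.all(dtau >= 0.0)))   # -> True True
```

Taking the minimum delay as the reference makes every steerable delay non-negative and bounded by the array diameter divided by the wave speed.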
Step two: the approximate W-disjoint orthogonality hypothesis.
The masking effect of the human ear is usually divided into frequency masking and temporal masking. Time-frequency masking methods assume that the source signals are sparse and separable, i.e. that they satisfy W-disjoint orthogonality.
Suppose the signal x(t) is composed of N source signals, expressed as

x(t) = Σ_{j=1}^{N} s_j(t).

Suppose there exists a linear transform T, called the mapping from s_j to S_j, written S_j = T(s_j), with the following properties:
(1) the transform T is invertible, i.e. T^{-1}(Ts) = T(T^{-1}s) = s;
(2) Λ_j ∩ Λ_k = ∅ for j ≠ k, where Λ_j is the support of S_j, Λ_j = supp S_j := {λ : S_j(λ) ≠ 0}, i.e. the intersection of the sets Λ_j and Λ_k is empty.
If conditions (1) and (2) above are satisfied, all the mixed signals in the set S can be effectively separated.
Given a window function W, if

S_j^W(t, ω) S_k^W(t, ω) = 0 for all t, ω and j ≠ k,  (7)

then the two sources S_j and S_k are said to satisfy W-disjoint orthogonality.
However, the W-disjoint orthogonality hypothesis is not satisfied by the signals studied here: the result of expression (7) is seldom zero.
For this reason, two important performance criteria are introduced: (1) to what extent the mask retains the sound source of interest; (2) to what extent the mask suppresses interfering sources.
Consider dividing the multiple source signals into non-overlapping sets of time-frequency points, with only one active source signal in each time-frequency window, approximately satisfying

S_j(t, ω) S_k(t, ω) ≈ 0 for all j ≠ k.

The time-frequency mask is defined as

M_j(t, ω) = 1 if S_j(t, ω) ≠ 0, and M_j(t, ω) = 0 otherwise.

By estimating the time-frequency mask corresponding to each source, a given source j can thereby be obtained from the mixture:

S_j(t, ω) ≈ M_j(t, ω) X(t, ω),

where M_j is the indicator function of the support of source j, and S_j(t, ω) and X(t, ω) are the time-frequency representations of s_j(t) and x(t) respectively.
For a given time-frequency mask M, the preserved signal ratio PSR_M is defined as

PSR_M = ||M S_j||² / ||S_j||².

PSR_M estimates the percentage of the energy of source S_j retained after the mask is applied.
At the same time define

Z_j(t, ω) = Σ_{k≠j} S_k(t, ω),

where Z_j is the sum of the interference active alongside source S_j.
The signal-to-interference ratio after applying the time-frequency mask M is defined as

SIR_M = ||M S_j||² / ||M Z_j||²,

where SIR_M estimates the signal-to-interference ratio of the separated signal after applying the time-frequency mask M.
From PSR_M and SIR_M the approximate W-disjoint orthogonality WDO_M can be estimated:

WDO_M = PSR_M − PSR_M / SIR_M.

Because speech signals have sparse time-frequency representations, the power of the time-frequency representation is concentrated in a small fraction of the time-frequency points and the product of the representations of two sources is usually small, so the weak W-disjoint orthogonality condition is satisfied. The higher the approximate W-disjoint orthogonality, the better the separation; to obtain a good time-frequency masking effect, the choice of window function type and size plays a crucial role. In particular, when WDO_M = 1 the sources are exactly W-disjoint orthogonal.
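As a sketch of computing WDO_M from the definitions above — the synthetic spectrogram magnitudes and mask rule are illustrative assumptions — note that PSR_M − PSR_M/SIR_M simplifies to (||M S_j||² − ||M Z_j||²) / ||S_j||²:

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.normal(size=(64, 32))       # |STFT| magnitudes of the target source
Z = rng.normal(size=(64, 32))       # |STFT| magnitudes of the interference
S[32:, :] = 0.0                     # make the two supports disjoint:
Z[:32, :] = 0.0                     #   S lives in rows 0-31, Z in rows 32-63
M = (np.abs(S) > np.abs(Z)).astype(float)   # binary time-frequency mask

def wdo(M, S, Z):
    # WDO_M = PSR_M - PSR_M / SIR_M = (||M S||^2 - ||M Z||^2) / ||S||^2
    return (np.sum((M * S)**2) - np.sum((M * Z)**2)) / np.sum(S**2)

print(wdo(M, S, Z))                 # -> 1.0 for perfectly disjoint sources
```

With overlapping supports the mask inevitably loses target energy or lets interference through, and WDO_M falls below 1.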
According to the experiments of Scott Rickard (Scott Rickard, Radu Balan and Justinian Rosca. Real-time time-frequency based blind source separation. Proceedings ICA2001, pp. 651-656, December 2001), the WDO ratios for different numbers of sources at 0 dB are as follows:
N | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
WDO | 93.6 | 88.0 | 83.4 | 79.2 | 75.6 | 72.3 | 69.3 | 66.6 | 64 |
As shown in Fig. 3 and Fig. 4, simulation of the 3-source case verifies this: the abscissa is the WDO value and the ordinate is the number of speech-signal samples, and it can be seen that in the 3-source case more than 80% of the signal is orthogonal.
As shown in Fig. 5, Fig. 6 and Fig. 7, the near-orthogonality condition is further verified for 2 sound sources: time-frequency analyses of the signals s_1(t) and s_2(t) give |S_1^W(t, ω)| and |S_2^W(t, ω)| respectively, and the product |S_1^W(t, ω) S_2^W(t, ω)| is analysed at the same time; the abscissa is time and the ordinate is frequency. The window function W(t) is a Hamming window of length 64 ms. As Figs. 5-7 show, |S_1^W(t, ω) S_2^W(t, ω)| contains very little of the |S_1^W| and |S_2^W| components, which proves that the source signals satisfy approximate W-disjoint orthogonality.
Step three: the SRP-PHAT localization method for multiple sources on a circular array, combined with approximate W-disjoint orthogonality.
The SRP-PHAT algorithm computes the phase-transform steered response power function over all microphone pairs to obtain an objective function; an optimal beamformer is designed and the beam is steered to scan all possible receiving directions, and the direction of maximum beam output power gives the direction of a sound source.
1. The SRP-PHAT algorithm for two microphones.
For an array consisting of only two microphones m_i and m_j, the delay with which a signal arriving from azimuth φ and pitch angle θ reaches the two microphones is Δτ_ij(θ, φ). The TDOA can be estimated by generalized cross-correlation (GCC), expressed as:

P(r) = R_{s_i s_j}(Δτ_ij(θ, φ)),

where P(r) is the spatial likelihood function of the three-dimensional position vector r, obtained by evaluating all possible θ and φ. In the frequency domain the generalized cross-correlation function R_{s_i s_j}(Δτ_ij(θ, φ)) can be expressed as:

R_{s_i s_j}(τ) = (1/2π) ∫ Ψ_ij(ω) S_i(ω) S_j*(ω) e^{jωτ} dω,

where Ψ_ij(ω) is the weighting function and S_i(ω) S_j*(ω) is the cross-spectral density function.
The phase transform (PHAT) is a typical such weighting.
The phase weighting function is defined as:

Ψ_ij(ω) = 1 / |S_i(ω) S_j*(ω)|.

By choosing a suitable weighting function, the steered response power of the delay-and-sum beamformer satisfies an optimal signal-to-noise-ratio (SNR) criterion, and the generalized cross-correlation R_{s_i s_j}(Δτ_ij(θ, φ)) exhibits a single peak within the bounded range of τ, corresponding to the TDOA of propagation to microphones m_i and m_j. In sound source localization this algorithm has a degree of noise immunity, reverberation resistance and robustness.
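A minimal numerical sketch of GCC-PHAT for one microphone pair, assuming an ideal integer-sample delay and no noise (the signal length and delay are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 1024, 5                        # signal length, true delay (samples)
x_i = rng.normal(size=n)              # signal at microphone m_i
x_j = np.roll(x_i, d)                 # m_j receives it d samples later

Xi, Xj = np.fft.rfft(x_i), np.fft.rfft(x_j)
cross = Xi * np.conj(Xj)              # cross-spectral density S_i S_j*
R = np.fft.irfft(cross / np.abs(cross), n=n)   # PHAT: keep phase only
lag = int(np.argmax(R))
lag = lag - n if lag > n // 2 else lag         # map to signed lag
print(-lag)                           # -> 5, delay of m_j relative to m_i
```

Dividing the cross-spectrum by its magnitude discards the spectral amplitude and keeps only phase, which is what sharpens the correlation into the single peak described above.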
2. The circular-array SRP-PHAT algorithm.
The generalized cross-correlations of all microphone pairs are summed:

P(Δτ_1, Δτ_2, …, Δτ_N) = Σ_{i=1}^{N} Σ_{j=i+1}^{N} R_{s_i s_j}(Δτ_i − Δτ_j),

where Δτ_1, Δτ_2, …, Δτ_N are the steerable delays of the N microphones, Δτ_i = τ_i − τ_0 for i = 1 … N, and τ_0 is the reference delay estimate, taken as the minimum of all microphone delays.
As the number of microphones increases, the two-microphone SRP-PHAT method extends naturally to the circular-microphone SRP-PHAT method.
3. The circular-array SRP-PHAT algorithm for multiple sources.
When there are two or more sound sources, the SRP-PHAT peak of one source mixes into that of another, spurious peaks appear at some points, and it becomes difficult to find the local maxima simultaneously.
Using the approximate W-disjoint orthogonality of speech signals described above, the relative delay of each source signal to the microphone array is estimated in the time-frequency domain.
The short-time Fourier transform is used as the approximate W-disjoint orthogonal transform.
The frequency-domain representation of the signal model at the i-th microphone is assumed to be:

X_i(t, ω) = Σ_n S_n(t, ω) e^{-jωΔτ_{n,i}} + N_i(t, ω).

Given a window function W, the short-time Fourier transform of s_j is S_j, so that

S_j(t, ω) = Σ_τ W(τ − t) s_j(τ) e^{-jωτ}.

By choosing an appropriate window function and window size, under the approximate W-disjoint orthogonality hypothesis only one sound source n is active at any time-frequency point, and its cross-spectrum is:

X_i(t, ω) X_j*(t, ω) ≈ |S_n(t, ω)|² e^{-jω(Δτ_{n,i} − Δτ_{n,j})}.

The delay difference Δτ_{n,i} − Δτ_{n,j} between microphones i and j can then be obtained from the cross-power spectrum.
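The last relation can be checked numerically for a single time-frequency bin; the bin frequency, delays and source coefficient below are illustrative assumptions:

```python
import numpy as np

f = 500.0                               # frequency of this STFT bin (Hz)
omega = 2 * np.pi * f
tau_i, tau_j = 1.25e-3, 0.50e-3         # propagation delays to mics i, j (s)

S = 2.0 + 1.0j                          # source STFT coefficient in the bin
Xi = S * np.exp(-1j * omega * tau_i)    # bin values at the two microphones
Xj = S * np.exp(-1j * omega * tau_j)

cross = Xi * np.conj(Xj)                # = |S|^2 exp(-j*omega*(tau_i - tau_j))
dtau = -np.angle(cross) / omega         # valid while |omega * dtau| < pi
print(dtau)                             # -> 7.5e-04 = tau_i - tau_j
```

Because the phase is only known modulo 2π, this per-bin estimate is unambiguous only for delay differences smaller than half a period of the bin frequency.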
Embodiment 1: localization of two sound sources.
1. Selection of the uniform circular array localization model.
The simulation experiments are run under different signal-to-noise ratios and reverberation environments. The uniform circular array is placed in a room of 7 m × 8 m × 3.5 m, and the spatial positions of its 8 microphones are [3.25, -1.6, 1.5], [3.25, 1.1, 1.5], [1.87, 3.75, 1.5], [1.0, 3.75, 1.5], [3.25, 1.8, 1.5], [3.25, -1.0, 1.5], [2.2, -3.75, 1.5] and [0.6, -3.75, 1.5].
2. Selection of the sound sources.
The sources are randomly generated speech signals, with signal-to-noise ratios of 0-30 dB. The random interference signals are Gaussian, used to simulate an air-conditioner fan and noise from outside the window; the interfering noise power is at most 10 dB, and the corresponding reverberation time is determined by the reflection coefficients of the room walls, floor and ceiling.
3. A short-time Fourier transform (STFT) is applied to the signals received by the array.
Given a window function W, the short-time Fourier transform of s_j is S_j, so that

S_j(t, ω) = Σ_τ W(τ − t) s_j(τ) e^{-jωτ}.

To obtain a good time-frequency masking effect, the choice of window function type and size plays a crucial role in performance. Here the window function is a Hamming window, with a window size of 1024 points.
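A minimal STFT along these lines (the sampling rate, test tone and hop size are illustrative assumptions; the window is the 1024-point Hamming window mentioned above):

```python
import numpy as np

fs, nwin, hop = 16000, 1024, 512
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440.0 * t)      # a 1 s test tone standing in for speech
w = np.hamming(nwin)                    # the 1024-point Hamming window

frames = [w * x[k:k + nwin] for k in range(0, len(x) - nwin + 1, hop)]
X = np.array([np.fft.rfft(f) for f in frames])   # STFT: (frames, bins)
print(X.shape)                          # -> (30, 513)
```

Each row of X is one windowed spectrum S_j(t, ω); the mask and cross-spectrum computations of the previous steps operate on these rows bin by bin.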
4. The generalized cross-correlation with phase transform is computed.
By choosing a suitable window function, a good separation effect is obtained and approximate W-disjoint orthogonality is satisfied. On this basis the generalized cross-correlation can be computed.
In the frequency domain the generalized cross-correlation function R_{s_i s_j}(Δτ_ij(θ, φ)) can be expressed as:

R_{s_i s_j}(τ) = (1/2π) ∫ Ψ_ij(ω) S_i(ω) S_j*(ω) e^{jωτ} dω,

where the weighting function Ψ_ij(ω) is:

Ψ_ij(ω) = 1 / |S_i(ω) S_j*(ω)|.

The generalized cross-correlations of all microphone pairs are summed:

P(Δτ_1, Δτ_2, …, Δτ_N) = Σ_{i=1}^{N} Σ_{j=i+1}^{N} R_{s_i s_j}(Δτ_i − Δτ_j),

where Δτ_1, Δτ_2, …, Δτ_N are the steerable delays of the N microphones, Δτ_i = τ_i − τ_0 for i = 1 … N, and τ_0 is the reference delay estimate, taken as the minimum of all microphone delays.
Once the maximum of P(Δτ_1, Δτ_2, …, Δτ_N) is found, the pitch angle θ and azimuth φ of the sound source are determined.
5. Results of the above steps.
Figs. 10 and 11 show the source wave-field images for the circular array at signal-to-noise ratios of 20 dB and 30 dB respectively. In the figures, the microphone positions are marked, ○ denotes an estimated sound source, and * denotes the position of an interference.
Fig. 10 shows two sources at spatial positions [0.59, 2.08, 1.5] and [0.29, -1.37, 1.5], with a signal-to-noise ratio of 20 dB. The random interference signals are Gaussian, used to simulate an air-conditioner fan and noise from outside the window, at spatial positions [2, -4, 1.5] and [3.5, -3.2, 1.5]; the interfering noise power is at most 10 dB, and the corresponding reverberation time is determined by the reflection coefficients of the room walls, floor and ceiling.
Fig. 11 shows two sources at spatial positions [1.5, 2.1, 1.5] and [2.1, 0.8, 1.5], with a signal-to-noise ratio of 30 dB; the simulated air-conditioner fan and outside-window noise are far away from the two sources.
Direction estimation uses the SRP-PHAT algorithm combined with approximate W-disjoint orthogonality, with a Hamming window of size 1024 points. Under the same background-noise environment, the higher the signal-to-noise ratio of the signal, the higher the positioning accuracy.
Figs. 12 and 13 show the measured source azimuths. In Fig. 12 the azimuths are φ1 = 74° and φ2 = -78°; although the two signals are close in azimuth and the signal-to-noise ratio is low, the 2 sources can essentially be distinguished: spectral peaks appear at the true directions, no false spectral peaks occur, and the target directions are estimated correctly. In Fig. 13 the measured azimuths are φ1 = 17° and φ2 = 52°; although the two signals are close in azimuth, because the signal-to-noise ratio is high and the angles differ considerably, the 2 sources are completely distinguished. As the signal-to-noise ratio increases, the estimation error becomes smaller and the estimation accuracy higher; the larger the angular difference between the two signals, the more accurate the estimate, and once the angular difference is large enough the accuracy levels off.
The azimuth and pitch angles shown in Fig. 14 are (φ1 = 74°, θ1 = 46°) and (φ2 = -78°, θ2 = 0°).
Embodiment 2: localization of three sound sources.
When the number of sources increases to 3 and the signal-to-noise ratio is low, the problem of false spectral peaks cannot be solved well. Under high signal-to-noise-ratio conditions the false-peak problem is essentially solved and multiple sources are resolved well.
The specific implementation steps are the same as in Embodiment 1 and are omitted here.
Fig. 15 shows the two-dimensional image of three-source localization at a signal-to-noise ratio of 30 dB.
Figs. 16 and 17 show the source azimuths measured, under a relatively high signal-to-noise ratio, by the method proposed here and by the conventional SRP-PHAT method respectively. The SRP-PHAT method based on approximate W-disjoint orthogonality effectively solves the problem of false spectral peaks and can distinguish the 3 signal sources, whereas the conventional SRP-PHAT method produces false spectral peaks and cannot distinguish the 3 valid signals.
Fig. 18 shows the source signals received by the 8-element microphone array; it can be seen that the interference source has a larger effect on microphone No. 7, which is close to it. Fig. 19 shows the curves relating signal-to-noise ratio to direction-angle error for different reverberation times T60, with RT60 chosen as 300 ms, 450 ms and 600 ms. As T60 increases, the estimation error grows and the estimation accuracy falls; under strong reverberation it is difficult to resolve the target direction, so this method is suitable for localization under moderate reverberation.
From simulation result, can find out, adopt the SRP-PHAT algorithm of even ring array to there is good positioning performance.Particularly, when SNR is higher, when reverberation is moderate, locating effect is better
The separated orthogonality hypothesis of W-the present invention is directed to based on the sparse property of voice signal does not meet many sound sources, two key properties of signal to noise ratio (S/N ratio) after signal retention rate and time-frequency masking after introducing voice signal time-frequency masking, derived approximate W-separated orthogonality hypothesis condition, many sound-source signals are divided into the time frequency point sets of non-overlapping copies, each set only comprises the time frequency component of single source signal, at time-frequency domain, estimates that each sound-source signal arrives the relative time delay of microphone array.Estimate that source signal arrives the relative time delay of microphone array.Special employing has the more circular array of high spatial resolution, realized and the high-resolution of the position angle of many sound-source signals, the angle of pitch having been estimated simultaneously, realize the space orientation of sound-source signal, overcome the three-dimensional fix problem that existing sound localization method cannot effectively be realized a plurality of aliasing sound sources.
Finally, it should be noted that the above are only preferred embodiments of the present invention and do not limit it. Although the present invention has been described in detail with reference to the embodiments, those skilled in the art may still modify the technical solutions recorded in the foregoing embodiments, or replace some of their technical features with equivalents; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (5)
1. An SRP-PHAT multi-source spatial positioning method, characterized by comprising the following steps:
1) Establishing the spatial coordinates under the assumed conditions: it is first assumed that the number and spatial positions of all microphones of the uniform circular microphone array remain unchanged during data acquisition, that the source-to-microphone distance satisfies the requirements of the sound-field model, and that the physical properties of all microphones are identical; the isotropic microphones are evenly distributed on a circle of radius r lying in the x-y plane; the arrival direction of the plane wave s is expressed in polar coordinates, with the origin of the coordinate system at the centre of the circular array; the elevation angle of the signal is θ ∈ [0, π/2] and the azimuth is φ ∈ [0, 2π];
2) The multiple sound-source signals are divided into non-overlapping sets of time-frequency points, so that each time-frequency window contains only one active source signal and the weak W-disjoint orthogonality condition is satisfied; a Hamming window is chosen, and when WDO_M = 1 the sources are W-disjoint orthogonal;
3) Using the SRP-PHAT algorithm, the phase-transform-weighted steered-response power function over all microphone pairs is computed to obtain an objective function; the beamformer steers the beam to scan all possible receiving directions, and the direction with maximum beam output power gives the direction of the sound source.
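The geometry fixed in step 1) determines the steering delay of each array element for any candidate direction, which is what the beam scan in step 3) varies. A minimal sketch in NumPy, assuming θ is measured from the z-axis (so a source in the array plane has θ = π/2); the element count, radius and sound speed are illustrative values, not the patent's:

```python
import numpy as np

def uca_delays(theta, phi, n_mics=8, radius=0.05, c=343.0):
    """Per-element arrival delays (seconds) of a plane wave at a uniform
    circular array in the x-y plane, relative to the array centre.
    theta is taken here as the angle from the z-axis; n_mics, radius and
    c are assumed values for illustration."""
    mic_angles = 2.0 * np.pi * np.arange(n_mics) / n_mics  # element azimuths on the circle
    # A wavefront from (theta, phi) reaches element n earlier or later by the
    # projection of the element position onto the propagation direction:
    return -(radius / c) * np.sin(theta) * np.cos(phi - mic_angles)
```

Diametrically opposite elements see delays of equal magnitude and opposite sign, which is what the pairwise cross-correlations of the later claims exploit.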
2. The SRP-PHAT multi-source spatial positioning method according to claim 1, characterized in that said step 2) comprises:
First, two important performance criteria are introduced: (1) to what extent the mask retains the sound source of interest; (2) to what extent the mask suppresses the interfering sources;
The multiple source signals are divided into non-overlapping sets of time-frequency points, so that each time-frequency window contains only one active source signal, approximately satisfying S_i(t, ω) S_j(t, ω) ≈ 0 for all i ≠ j;
The time-frequency mask of source j is defined as the indicator function M_j of the support of source j, equal to 1 on the time-frequency points where source j is dominant and 0 elsewhere; by estimating the mask corresponding to each source, a given source j can thus be recovered from the mixture as S_j(t, ω) ≈ M_j(t, ω) X(t, ω), where S_j(t, ω) and X(t, ω) are the time-frequency representations of s_j(t) and x(t) respectively;
For a given time-frequency mask M, the preserved-signal ratio PSR_M is defined as PSR_M = ‖M S_j‖² / ‖S_j‖²; PSR_M evaluates the percentage of the energy of source S_j that is retained after the mask is applied;
The interference z_j(t) is defined at the same time as the sum of the sources active while source S_j is active, with time-frequency representation Z_j(t, ω);
The signal-to-interference ratio after applying the time-frequency mask M is defined as SIR_M = ‖M S_j‖² / ‖M Z_j‖²; SIR_M evaluates the signal-to-interference ratio of the separated signal after the mask M is applied;
From PSR_M and SIR_M, the approximate W-disjoint orthogonality WDO_M can be evaluated as WDO_M = PSR_M − PSR_M / SIR_M; because speech has a sparse time-frequency representation (a small fraction of its time-frequency coefficients carries the overwhelming share of the total power), the product of the time-frequency representations of different sources is usually small in magnitude, so the weak W-disjoint orthogonality condition is satisfied; in particular, when WDO_M = 1 the sources are exactly W-disjoint orthogonal.
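The two masking criteria and the combined WDO measure of this claim follow the standard masking-performance definitions from the time-frequency masking literature. A small sketch, assuming the target STFT S_j and the summed-interference STFT Z_j are available as arrays; the function and variable names are mine, not the patent's:

```python
import numpy as np

def wdo_metrics(S_j, Z_j, M):
    """Preserved-signal ratio, signal-to-interference ratio and the
    approximate W-disjoint orthogonality score of a binary T-F mask M,
    given target STFT S_j and summed-interference STFT Z_j."""
    psr = np.sum(np.abs(M * S_j) ** 2) / np.sum(np.abs(S_j) ** 2)      # energy of S_j kept by the mask
    sir = np.sum(np.abs(M * S_j) ** 2) / np.sum(np.abs(M * Z_j) ** 2)  # target vs. interference under the mask
    wdo = psr - psr / sir                                              # WDO_M; 1 only for a perfect mask
    return psr, sir, wdo
```

A mask that keeps all of S_j (PSR_M = 1) but lets some interference through scores below 1, which is the sense in which WDO_M measures how close the mixture is to W-disjoint orthogonal.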
3. The SRP-PHAT multi-source spatial positioning method according to claim 1, characterized in that in said step 3), for the dual-microphone SRP-PHAT algorithm:
For an array with only two microphones, m_i and m_j, a signal arriving from azimuth φ and elevation θ reaches the two microphones with relative delay Δτ_ij(θ, φ); the TDOA can be estimated by generalized cross-correlation (GCC), expressed as P(r) = R_{s_i s_j}(Δτ_ij(θ, φ)), where P(r) is the spatial likelihood function of the three-dimensional space vector r, obtained by evaluating all possible θ and φ; in the frequency domain the generalized cross-correlation function can be expressed as R_{s_i s_j}(Δτ_ij(θ, φ)) = ∫ ψ_ij(ω) S_i(ω) S_j*(ω) e^{jωΔτ_ij(θ, φ)} dω,
where ψ_ij(ω) is a weighting function and S_i(ω) S_j*(ω) is the cross-spectral density;
The phase transform (PHAT) is a typical such weighting; the phase weighting function is defined as ψ_ij(ω) = 1 / |S_i(ω) S_j*(ω)|;
By choosing a suitable weighting function, the delay-and-sum steered response power satisfies the optimal signal-to-noise ratio (SNR) criterion; the generalized cross-correlation R_{s_i s_j}(Δτ_ij(θ, φ)) then exhibits a single peak within the bounded range of τ, corresponding to the TDOA of the propagation to microphones m_i and m_j.
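The pairwise GCC-PHAT step of this claim can be sketched as follows; the FFT length, the small guard constant and the function name are my choices, not the patent's:

```python
import numpy as np

def gcc_phat(x_i, x_j, fs, max_tau=None):
    """Estimate the delay (seconds) of x_j relative to x_i by
    PHAT-weighted generalized cross-correlation."""
    n = 2 * len(x_i)                           # zero-pad so the correlation is linear, not circular
    X_i = np.fft.rfft(x_i, n=n)
    X_j = np.fft.rfft(x_j, n=n)
    cross = X_j * np.conj(X_i)                 # cross-spectral density
    cross /= np.abs(cross) + 1e-12             # PHAT weighting: keep only the phase
    r = np.fft.irfft(cross, n=n)
    max_shift = n // 2 if max_tau is None else min(n // 2, int(max_tau * fs))
    r = np.concatenate((r[-max_shift:], r[:max_shift + 1]))  # reorder lags to [-max_shift, +max_shift]
    return (np.argmax(np.abs(r)) - max_shift) / fs
```

With a clean integer-sample shift the peak lands on the true lag; under reverberation the PHAT whitening is what keeps that peak sharp, as the claim notes.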
4. The SRP-PHAT multi-source spatial positioning method according to claim 1, characterized in that in said step 3), for the circular-array SRP-PHAT algorithm:
The generalized cross-correlations of all microphone pairs are summed:

P(Δτ_1, Δτ_2, …, Δτ_N) = Σ_{i=1}^{N} Σ_{j=i+1}^{N} R_{s_i s_j}(Δτ_i − Δτ_j),

where Δτ_1, Δτ_2, …, Δτ_N are the steering delays of the N microphones, with Δτ_i = τ_i − τ_0, i = 1, …, N, and the reference delay τ_0 is taken as the minimum of all the microphone delays.
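The pairwise summation above evaluates, for one candidate direction, the PHAT-weighted steered-response power of the whole array; the per-mic delays would come from a steering model such as the array geometry. A sketch under assumed input shapes (the guard constant and names are mine):

```python
import numpy as np

def srp_phat_power(frames_fft, steer_delays, freqs):
    """Steered-response power with PHAT weighting for one candidate
    direction: phase-align and sum the cross-spectra of all mic pairs.
    frames_fft: (n_mics, n_bins) one-sided spectra of the current frame;
    steer_delays: per-mic candidate delays (s); freqs: bin frequencies (Hz)."""
    n_mics = frames_fft.shape[0]
    power = 0.0
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            cross = frames_fft[i] * np.conj(frames_fft[j])
            cross /= np.abs(cross) + 1e-12                 # PHAT: unit-magnitude cross-spectrum
            dt = steer_delays[i] - steer_delays[j]         # pairwise steering delay
            power += np.real(np.sum(cross * np.exp(2j * np.pi * freqs * dt)))
    return power
```

Scanning candidate directions on a (θ, φ) grid and taking the argmax of this power is the beam-scan step of claim 1.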
5. The SRP-PHAT multi-source spatial positioning method according to claim 1, characterized in that in said step 3), for the multi-source circular-array SRP-PHAT algorithm:
When there are two or more sound sources, the SRP-PHAT peak of one source mixes into that of another, spurious peaks can arise at some points, and the true local maxima become difficult to find;
Using the approximate W-disjoint orthogonality of speech signals, the relative delay of each source signal arriving at the microphone array is estimated in the time-frequency domain, with the short-time Fourier transform serving as the approximate W-disjoint orthogonal transform;
Suppose the frequency-domain representation of the signal model at microphone i is X_i(t, ω) = Σ_n S_n(t, ω) e^{−jωΔτ_{n,i}} + V_i(t, ω); for a given window function W, the short-time Fourier transform of s_j is S_j; by choosing an appropriate window function and window size, under the approximate W-disjoint orthogonality assumption only one sound source is active at any time-frequency point, so the cross-spectrum of microphones i and j at that point is X_i(t, ω) X_j*(t, ω) ≈ |S_n(t, ω)|² e^{−jω(Δτ_{n,i} − Δτ_{n,j})} for the dominant source n;
The delay Δτ_{n,i} − Δτ_{n,j} between microphone i and microphone j can then be obtained from the cross-power spectrum.
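The final step of this claim reads the per-bin delay off the phase of the cross-power spectrum: under approximate W-disjoint orthogonality each bin's phase belongs to a single source. A sketch with assumed names, valid only while the phase stays unwrapped (|2πf·Δτ| < π):

```python
import numpy as np

def bin_delays(X_i, X_j, freqs):
    """Per-bin inter-microphone delay estimates from the cross-power
    spectrum phase of two STFT frames X_i, X_j (freqs in Hz)."""
    cross = X_i * np.conj(X_j)                           # cross-power spectrum, one value per T-F bin
    with np.errstate(divide="ignore", invalid="ignore"):
        return -np.angle(cross) / (2.0 * np.pi * freqs)  # phase -> delay (undefined at f = 0)
```

Clustering these per-bin delays by source is what turns the single-source estimator into the multi-source one described above.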
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410366922.4A CN104142492B (en) | 2014-07-29 | 2014-07-29 | A kind of SRP PHAT multi-source space-location methods |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410366922.4A CN104142492B (en) | 2014-07-29 | 2014-07-29 | A kind of SRP PHAT multi-source space-location methods |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104142492A true CN104142492A (en) | 2014-11-12 |
CN104142492B CN104142492B (en) | 2017-04-05 |
Family
ID=51851720
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410366922.4A Expired - Fee Related CN104142492B (en) | 2014-07-29 | 2014-07-29 | A kind of SRP PHAT multi-source space-location methods |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104142492B (en) |
Cited By (38)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104898086A (en) * | 2015-05-19 | 2015-09-09 | 南京航空航天大学 | Sound intensity estimation sound source orientation method applicable for minitype microphone array |
CN104936091A (en) * | 2015-05-14 | 2015-09-23 | 科大讯飞股份有限公司 | Intelligent interaction method and system based on circle microphone array |
CN105044675A (en) * | 2015-07-16 | 2015-11-11 | 南京航空航天大学 | Fast SRP sound source positioning method |
CN105467364A (en) * | 2015-11-20 | 2016-04-06 | 百度在线网络技术(北京)有限公司 | Method and apparatus for localizing target sound source |
CN105489219A (en) * | 2016-01-06 | 2016-04-13 | 广州零号软件科技有限公司 | Indoor space service robot distributed speech recognition system and product |
CN106093864A (en) * | 2016-06-03 | 2016-11-09 | 清华大学 | A kind of microphone array sound source space real-time location method |
CN106448722A (en) * | 2016-09-14 | 2017-02-22 | 科大讯飞股份有限公司 | Sound recording method, device and system |
CN106950542A (en) * | 2016-01-06 | 2017-07-14 | 中兴通讯股份有限公司 | The localization method of sound source, apparatus and system |
CN107063437A (en) * | 2017-04-12 | 2017-08-18 | 中广核研究院有限公司北京分公司 | Nuclear power station noise-measuring system based on microphone array |
CN107102296A (en) * | 2017-04-27 | 2017-08-29 | 大连理工大学 | A kind of sonic location system based on distributed microphone array |
CN107271963A (en) * | 2017-06-22 | 2017-10-20 | 广东美的制冷设备有限公司 | The method and apparatus and air conditioner of auditory localization |
CN107290711A (en) * | 2016-03-30 | 2017-10-24 | 芋头科技(杭州)有限公司 | A kind of voice is sought to system and method |
CN107918108A (en) * | 2017-11-14 | 2018-04-17 | 重庆邮电大学 | A kind of uniform circular array 2-d direction finding method for quick estimating |
CN108089153A (en) * | 2016-11-23 | 2018-05-29 | 杭州海康威视数字技术股份有限公司 | A kind of sound localization method, apparatus and system |
CN108089152A (en) * | 2016-11-23 | 2018-05-29 | 杭州海康威视数字技术股份有限公司 | A kind of apparatus control method, apparatus and system |
CN108198568A (en) * | 2017-12-26 | 2018-06-22 | 太原理工大学 | A kind of method and system of more auditory localizations |
CN108510987A (en) * | 2018-03-26 | 2018-09-07 | 北京小米移动软件有限公司 | Method of speech processing and device |
CN108549052A (en) * | 2018-03-20 | 2018-09-18 | 南京航空航天大学 | A kind of humorous domain puppet sound intensity sound localization method of circle of time-frequency-spatial domain joint weighting |
CN108872939A (en) * | 2018-04-29 | 2018-11-23 | 桂林电子科技大学 | Interior space geometric profile reconstructing method based on acoustics mirror image model |
CN109254266A (en) * | 2018-11-07 | 2019-01-22 | 苏州科达科技股份有限公司 | Sound localization method, device and storage medium based on microphone array |
CN109633551A (en) * | 2019-01-08 | 2019-04-16 | 中国电子科技集团公司第三研究所 | A kind of acoustic array of detectable a variety of acoustic targets |
CN109997375A (en) * | 2016-11-09 | 2019-07-09 | 西北工业大学 | Concentric circles difference microphone array and associated beam are formed |
CN110376551A (en) * | 2019-07-04 | 2019-10-25 | 浙江大学 | A kind of TDOA localization method based on the distribution of acoustical signal time-frequency combination |
CN110544490A (en) * | 2019-07-30 | 2019-12-06 | 南京林业大学 | sound source positioning method based on Gaussian mixture model and spatial power spectrum characteristics |
CN110703199A (en) * | 2019-10-22 | 2020-01-17 | 哈尔滨工程大学 | Quaternary cross array high-precision azimuth estimation method based on compass compensation |
CN110726972A (en) * | 2019-10-21 | 2020-01-24 | 南京南大电子智慧型服务机器人研究院有限公司 | Voice sound source positioning method using microphone array under interference and high reverberation environment |
CN111060872A (en) * | 2020-03-17 | 2020-04-24 | 深圳市友杰智新科技有限公司 | Sound source positioning method and device based on microphone array and computer equipment |
CN111798869A (en) * | 2020-09-10 | 2020-10-20 | 成都启英泰伦科技有限公司 | Sound source positioning method based on double microphone arrays |
CN111833901A (en) * | 2019-04-23 | 2020-10-27 | 北京京东尚科信息技术有限公司 | Audio processing method, audio processing apparatus, audio processing system, and medium |
CN111880148A (en) * | 2020-08-07 | 2020-11-03 | 北京字节跳动网络技术有限公司 | Sound source positioning method, device, equipment and storage medium |
CN111929645A (en) * | 2020-09-23 | 2020-11-13 | 深圳市友杰智新科技有限公司 | Method and device for positioning sound source of specific human voice and computer equipment |
CN112379330A (en) * | 2020-11-27 | 2021-02-19 | 浙江同善人工智能技术有限公司 | Multi-robot cooperative 3D sound source identification and positioning method |
CN112684412A (en) * | 2021-01-12 | 2021-04-20 | 中北大学 | Sound source positioning method and system based on pattern clustering |
CN113470682A (en) * | 2021-06-16 | 2021-10-01 | 中科上声(苏州)电子有限公司 | Method, device and storage medium for estimating speaker orientation by microphone array |
CN113655440A (en) * | 2021-08-09 | 2021-11-16 | 西南科技大学 | Self-adaptive compromising pre-whitening sound source positioning method |
CN113936687A (en) * | 2021-12-17 | 2022-01-14 | 北京睿科伦智能科技有限公司 | Method for real-time voice separation voice transcription |
CN115150712A (en) * | 2022-06-07 | 2022-10-04 | 中国第一汽车股份有限公司 | Vehicle-mounted microphone system and automobile |
CN115295000A (en) * | 2022-10-08 | 2022-11-04 | 深圳通联金融网络科技服务有限公司 | Method, device and equipment for improving speech recognition accuracy under multi-object speaking scene |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090279714A1 (en) * | 2008-05-06 | 2009-11-12 | Samsung Electronics Co., Ltd. | Apparatus and method for localizing sound source in robot |
CN101762806A (en) * | 2010-01-27 | 2010-06-30 | 华为终端有限公司 | Sound source locating method and apparatus thereof |
KR20140015893A (en) * | 2012-07-26 | 2014-02-07 | 삼성테크윈 주식회사 | Apparatus and method for estimating location of sound source |
Non-Patent Citations (2)
Title |
---|
DAVID AYLLO´N ET AL.: "Real-time phase-isolation algorithm for speech separation", 《19TH EUROPEAN SIGNAL PROCESSING CONFERENCE》 * |
M. SWARTLING ET AL.: "Source Localization for Multiple Speech Sources Using Low Complexity Non-Parametric Source Separation and Clustering", 《SIGNAL PROCESSING》 * |
Cited By (60)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104936091A (en) * | 2015-05-14 | 2015-09-23 | 科大讯飞股份有限公司 | Intelligent interaction method and system based on circle microphone array |
CN104936091B (en) * | 2015-05-14 | 2018-06-15 | 讯飞智元信息科技有限公司 | Intelligent interactive method and system based on circular microphone array |
CN104898086A (en) * | 2015-05-19 | 2015-09-09 | 南京航空航天大学 | Sound intensity estimation sound source orientation method applicable for minitype microphone array |
CN105044675A (en) * | 2015-07-16 | 2015-11-11 | 南京航空航天大学 | Fast SRP sound source positioning method |
CN105467364A (en) * | 2015-11-20 | 2016-04-06 | 百度在线网络技术(北京)有限公司 | Method and apparatus for localizing target sound source |
CN105467364B (en) * | 2015-11-20 | 2019-03-29 | 百度在线网络技术(北京)有限公司 | A kind of method and apparatus positioning target sound source |
CN106950542A (en) * | 2016-01-06 | 2017-07-14 | 中兴通讯股份有限公司 | The localization method of sound source, apparatus and system |
CN105489219A (en) * | 2016-01-06 | 2016-04-13 | 广州零号软件科技有限公司 | Indoor space service robot distributed speech recognition system and product |
CN107290711A (en) * | 2016-03-30 | 2017-10-24 | 芋头科技(杭州)有限公司 | A kind of voice is sought to system and method |
CN106093864A (en) * | 2016-06-03 | 2016-11-09 | 清华大学 | A kind of microphone array sound source space real-time location method |
CN106448722A (en) * | 2016-09-14 | 2017-02-22 | 科大讯飞股份有限公司 | Sound recording method, device and system |
CN106448722B (en) * | 2016-09-14 | 2019-01-18 | 讯飞智元信息科技有限公司 | The way of recording, device and system |
CN109997375A (en) * | 2016-11-09 | 2019-07-09 | 西北工业大学 | Concentric circles difference microphone array and associated beam are formed |
WO2018095166A1 (en) * | 2016-11-23 | 2018-05-31 | 杭州海康威视数字技术股份有限公司 | Device control method, apparatus and system |
CN108089152A (en) * | 2016-11-23 | 2018-05-29 | 杭州海康威视数字技术股份有限公司 | A kind of apparatus control method, apparatus and system |
US10816633B2 (en) | 2016-11-23 | 2020-10-27 | Hangzhou Hikvision Digital Technology Co., Ltd. | Device control method, apparatus and system |
CN108089153A (en) * | 2016-11-23 | 2018-05-29 | 杭州海康威视数字技术股份有限公司 | A kind of sound localization method, apparatus and system |
CN108089152B (en) * | 2016-11-23 | 2020-07-03 | 杭州海康威视数字技术股份有限公司 | Equipment control method, device and system |
CN107063437A (en) * | 2017-04-12 | 2017-08-18 | 中广核研究院有限公司北京分公司 | Nuclear power station noise-measuring system based on microphone array |
CN107102296B (en) * | 2017-04-27 | 2020-04-14 | 大连理工大学 | Sound source positioning system based on distributed microphone array |
CN107102296A (en) * | 2017-04-27 | 2017-08-29 | 大连理工大学 | A kind of sonic location system based on distributed microphone array |
CN107271963A (en) * | 2017-06-22 | 2017-10-20 | 广东美的制冷设备有限公司 | The method and apparatus and air conditioner of auditory localization |
CN107918108A (en) * | 2017-11-14 | 2018-04-17 | 重庆邮电大学 | A kind of uniform circular array 2-d direction finding method for quick estimating |
CN108198568B (en) * | 2017-12-26 | 2020-10-16 | 太原理工大学 | Method and system for positioning multiple sound sources |
CN108198568A (en) * | 2017-12-26 | 2018-06-22 | 太原理工大学 | A kind of method and system of more auditory localizations |
CN108549052A (en) * | 2018-03-20 | 2018-09-18 | 南京航空航天大学 | A kind of humorous domain puppet sound intensity sound localization method of circle of time-frequency-spatial domain joint weighting |
CN108549052B (en) * | 2018-03-20 | 2021-04-13 | 南京航空航天大学 | Time-frequency-space domain combined weighted circular harmonic domain pseudo-sound strong sound source positioning method |
US10930304B2 (en) | 2018-03-26 | 2021-02-23 | Beijing Xiaomi Mobile Software Co., Ltd. | Processing voice |
CN108510987A (en) * | 2018-03-26 | 2018-09-07 | 北京小米移动软件有限公司 | Method of speech processing and device |
CN108510987B (en) * | 2018-03-26 | 2020-10-23 | 北京小米移动软件有限公司 | Voice processing method and device |
CN108872939A (en) * | 2018-04-29 | 2018-11-23 | 桂林电子科技大学 | Interior space geometric profile reconstructing method based on acoustics mirror image model |
CN108872939B (en) * | 2018-04-29 | 2020-09-29 | 桂林电子科技大学 | Indoor space geometric outline reconstruction method based on acoustic mirror image model |
CN109254266A (en) * | 2018-11-07 | 2019-01-22 | 苏州科达科技股份有限公司 | Sound localization method, device and storage medium based on microphone array |
CN109633551A (en) * | 2019-01-08 | 2019-04-16 | 中国电子科技集团公司第三研究所 | A kind of acoustic array of detectable a variety of acoustic targets |
CN111833901A (en) * | 2019-04-23 | 2020-10-27 | 北京京东尚科信息技术有限公司 | Audio processing method, audio processing apparatus, audio processing system, and medium |
CN111833901B (en) * | 2019-04-23 | 2024-04-05 | 北京京东尚科信息技术有限公司 | Audio processing method, audio processing device, system and medium |
CN110376551B (en) * | 2019-07-04 | 2021-05-04 | 浙江大学 | TDOA (time difference of arrival) positioning method based on acoustic signal time-frequency joint distribution |
CN110376551A (en) * | 2019-07-04 | 2019-10-25 | 浙江大学 | A kind of TDOA localization method based on the distribution of acoustical signal time-frequency combination |
CN110544490B (en) * | 2019-07-30 | 2022-04-05 | 南京工程学院 | Sound source positioning method based on Gaussian mixture model and spatial power spectrum characteristics |
CN110544490A (en) * | 2019-07-30 | 2019-12-06 | 南京林业大学 | sound source positioning method based on Gaussian mixture model and spatial power spectrum characteristics |
CN110726972A (en) * | 2019-10-21 | 2020-01-24 | 南京南大电子智慧型服务机器人研究院有限公司 | Voice sound source positioning method using microphone array under interference and high reverberation environment |
CN110703199A (en) * | 2019-10-22 | 2020-01-17 | 哈尔滨工程大学 | Quaternary cross array high-precision azimuth estimation method based on compass compensation |
CN111060872A (en) * | 2020-03-17 | 2020-04-24 | 深圳市友杰智新科技有限公司 | Sound source positioning method and device based on microphone array and computer equipment |
CN111060872B (en) * | 2020-03-17 | 2020-06-23 | 深圳市友杰智新科技有限公司 | Sound source positioning method and device based on microphone array and computer equipment |
CN111880148A (en) * | 2020-08-07 | 2020-11-03 | 北京字节跳动网络技术有限公司 | Sound source positioning method, device, equipment and storage medium |
CN111798869B (en) * | 2020-09-10 | 2020-11-17 | 成都启英泰伦科技有限公司 | Sound source positioning method based on double microphone arrays |
CN111798869A (en) * | 2020-09-10 | 2020-10-20 | 成都启英泰伦科技有限公司 | Sound source positioning method based on double microphone arrays |
CN111929645A (en) * | 2020-09-23 | 2020-11-13 | 深圳市友杰智新科技有限公司 | Method and device for positioning sound source of specific human voice and computer equipment |
CN112379330A (en) * | 2020-11-27 | 2021-02-19 | 浙江同善人工智能技术有限公司 | Multi-robot cooperative 3D sound source identification and positioning method |
CN112379330B (en) * | 2020-11-27 | 2023-03-10 | 浙江同善人工智能技术有限公司 | Multi-robot cooperative 3D sound source identification and positioning method |
CN112684412A (en) * | 2021-01-12 | 2021-04-20 | 中北大学 | Sound source positioning method and system based on pattern clustering |
CN112684412B (en) * | 2021-01-12 | 2022-09-13 | 中北大学 | Sound source positioning method and system based on pattern clustering |
CN113470682A (en) * | 2021-06-16 | 2021-10-01 | 中科上声(苏州)电子有限公司 | Method, device and storage medium for estimating speaker orientation by microphone array |
CN113470682B (en) * | 2021-06-16 | 2023-11-24 | 中科上声(苏州)电子有限公司 | Method, device and storage medium for estimating speaker azimuth by microphone array |
CN113655440A (en) * | 2021-08-09 | 2021-11-16 | 西南科技大学 | Self-adaptive compromising pre-whitening sound source positioning method |
CN113936687B (en) * | 2021-12-17 | 2022-03-15 | 北京睿科伦智能科技有限公司 | Method for real-time voice separation voice transcription |
CN113936687A (en) * | 2021-12-17 | 2022-01-14 | 北京睿科伦智能科技有限公司 | Method for real-time voice separation voice transcription |
CN115150712A (en) * | 2022-06-07 | 2022-10-04 | 中国第一汽车股份有限公司 | Vehicle-mounted microphone system and automobile |
CN115295000A (en) * | 2022-10-08 | 2022-11-04 | 深圳通联金融网络科技服务有限公司 | Method, device and equipment for improving speech recognition accuracy under multi-object speaking scene |
CN115295000B (en) * | 2022-10-08 | 2023-01-03 | 深圳通联金融网络科技服务有限公司 | Method, device and equipment for improving speech recognition accuracy under multi-object speaking scene |
Also Published As
Publication number | Publication date |
---|---|
CN104142492B (en) | 2017-04-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104142492A (en) | SRP-PHAT multi-source spatial positioning method | |
Chen et al. | Acoustic source localization and beamforming: theory and practice | |
Chen et al. | Maximum-likelihood source localization and unknown sensor location estimation for wideband signals in the near-field | |
CN111123192B (en) | Two-dimensional DOA positioning method based on circular array and virtual extension | |
CN111474521B (en) | Sound source positioning method based on microphone array in multipath environment | |
Zou et al. | Multisource DOA estimation based on time-frequency sparsity and joint inter-sensor data ratio with single acoustic vector sensor | |
Long et al. | Acoustic source localization based on geometric projection in reverberant and noisy environments | |
Huleihel et al. | Spherical array processing for acoustic analysis using room impulse responses and time-domain smoothing | |
Padois et al. | Acoustic source localization using a polyhedral microphone array and an improved generalized cross-correlation technique | |
Schwartz et al. | Multi-speaker DOA estimation in reverberation conditions using expectation-maximization | |
He et al. | Closed-form DOA estimation using first-order differential microphone arrays via joint temporal-spectral-spatial processing | |
CN109696657A (en) | A kind of coherent sound sources localization method based on vector hydrophone | |
Dang et al. | A feature-based data association method for multiple acoustic source localization in a distributed microphone array | |
KR20090128221A (en) | Method for sound source localization and system thereof | |
Wang et al. | Robust direct position determination methods in the presence of array model errors | |
Liu et al. | Research on acoustic source localization using time difference of arrival measurements | |
Dang et al. | Multiple sound source localization based on a multi-dimensional assignment model | |
Çöteli et al. | Multiple sound source localization with rigid spherical microphone arrays via residual energy test | |
Liu et al. | A multiple sources localization method based on TDOA without association ambiguity for near and far mixed field sources | |
Pertilä | Acoustic source localization in a room environment and at moderate distances | |
Wang et al. | 3-D sound source localization with a ternary microphone array based on TDOA-ILD algorithm | |
Zavala et al. | Generalized inverse beamforming investigation and hybrid estimation | |
Zhang et al. | Three‐Dimension Localization of Wideband Sources Using Sensor Network | |
Sun et al. | Indoor sound source localization and number estimation using infinite Gaussian mixture models | |
Pasha et al. | Forming ad-hoc microphone arrays through clustering of acoustic room impulse responses |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20170405 | Termination date: 20200729