CN106093864A - A kind of microphone array sound source space real-time location method - Google Patents

Info

Publication number
CN106093864A
Authority
CN
China
Prior art keywords
sound source
candidate point
microphone array
controlled power
power response
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610391351.9A
Other languages
Chinese (zh)
Other versions
CN106093864B (en)
Inventor
杨毅
孙甲松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Application filed by Tsinghua University
Priority to CN201610391351.9A
Publication of CN106093864A
Application granted
Publication of CN106093864B
Legal status: Active

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01S: RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00: Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18: Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a real-time spatial sound source localization method using a microphone array. A microphone array serves as the signal acquisition and output device, and the steered response power-phase transform (SRP-PHAT) method is used to provide preliminary candidate points for the spatial position of the sound source. The candidate points are first screened using prior knowledge, and the SRP-PHAT method is used to compute the steered power response output of the candidate points. The search boundary is then redefined by an improved stochastic region contraction, which improves the efficiency of the SRP-PHAT method. Finally, the steered power response of the remaining candidate points is computed and the position of the maximum is taken as the final estimate of the source position. The localization principle is clear and the real-time performance is good; experiments show that the localization error in a plane can be kept at the centimeter level, which is better than prior-art methods. The method has higher computation speed and robustness and can be applied to scenarios that require real-time sound source localization, such as smart homes and intelligent robots.

Description

A real-time spatial sound source localization method using a microphone array
Technical field
The invention belongs to the field of speech technology and in particular relates to a real-time spatial sound source localization method using a microphone array.
Background
Speech processing generally requires knowledge of the spatial position of the sound source in order to support interactive technologies such as voiceprint recognition and speech content recognition. For example, when a user talks to an intelligent robot, the robot must determine where the person is so that it can turn and approach. The user's direction can be determined from sound alone even in the dark, so the robot can move toward the speaker automatically and provide the necessary service and help. In video conferencing, the conference camera usually has to be steered automatically toward the current speaker, and the video has to switch between different speakers.
The device commonly used for sound source localization is the microphone array, usually defined as a device in which multiple microphones (typically more than three) are arranged according to a specified geometry and capture sound signals in full synchrony. The main microphone array localization methods at present are: localization based on the time difference of arrival (Time Delay of Arrival, TDOA), which first estimates the differences in the arrival time of the source signal at different array elements and then determines the source position from the geometry of the array; localization based on steered response power (Steered Response Power, SRP); and localization based on high-resolution spectral estimation (High-resolution Spectral Estimation).
1. Time difference of arrival (Time Delay of Arrival, TDOA) method:
This method first obtains, by time delay estimation, the differences in the arrival time of the source signal at different array elements, and then determines the source position from the geometry of the microphone array. The commonly used delay estimator is the generalized cross-correlation (Generalized Cross Correlation, GCC) method. Let $X_i(\omega)$ and $X_j(\omega)$ denote the Fourier transforms of the signal $x_i(t)$ received by the i-th microphone and the signal $x_j(t)$ received by the j-th microphone, respectively. Then:
$$\Psi_{ij}(\omega) = X_i(\omega)\, X_j^*(\omega)$$
$$R_{ij}(\tau) = \int_{-\infty}^{\infty} W_{ij}(\omega)\, \Psi_{ij}(\omega)\, e^{\mathrm{i}\omega\tau}\, d\omega$$
$$\hat{\tau}_{ij} = \arg\max_{\tau} R_{ij}(\tau)$$
where $\Psi_{ij}(\omega)$ is the cross-power spectral density of the two signals, $R_{ij}(\tau)$ is their cross-correlation function, $\hat{\tau}_{ij}$ is the estimated delay of the source between the i-th and the j-th microphone, and $W_{ij}(\omega)$ is a frequency-domain weighting coefficient. With $W_{ij}(\omega) = 1$ this is the basic cross-correlation. Commonly used frequency-domain weighting functions include the phase transform (PhAse Transform, PHAT), ROTH, maximum likelihood and SCOT weightings. PHAT is the most common: in generalized cross-correlation the delay information is carried by the phase of the cross-power spectrum and is independent of its amplitude, and PHAT weighting removes the influence of the amplitude so that the correlation peak becomes sharper. The PHAT weighting function is $W_{ij}(\omega) = 1/|\Psi_{ij}(\omega)|$.
TDOA-based localization is simple in principle and computationally efficient, but its delay estimation performance degrades sharply under strong noise or reverberation, so it is unsuitable for low signal-to-noise-ratio or reverberant acoustic scenes. Other delay estimation methods exist, for example methods using joint entropy and mutual information from information theory, iterative methods, or methods that account for the asymmetry of different channels. Their results under strong reverberation and noise are still unsatisfactory, and some of them, such as the iterative methods, are slow.
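The GCC-PHAT delay estimate described above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: it assumes discrete, equal-length frames, a circular correlation, and a naive O(n^2) DFT so that only the standard library is needed. The function names `dft` and `gcc_phat_delay` and the toy signals are hypothetical.

```python
import cmath
import random

def dft(x, sign=-1):
    """Naive DFT (sign=-1) or unnormalised inverse DFT (sign=+1)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def gcc_phat_delay(x_i, x_j, max_lag):
    """Lag in samples that maximises the PHAT-weighted cross-correlation.

    A positive result means x_i is a delayed copy of x_j.
    """
    n = len(x_i)
    cross = [a * b.conjugate() for a, b in zip(dft(x_i), dft(x_j))]   # Psi_ij
    phat = [c / abs(c) if abs(c) > 1e-12 else 0.0 for c in cross]     # W_ij = 1/|Psi_ij|
    r = dft(phat, sign=+1)                                            # R_ij(tau), circular
    return max(range(-max_lag, max_lag + 1), key=lambda lag: r[lag % n].real)

# Toy check: x_i is x_j circularly delayed by 3 samples.
random.seed(0)
x_j = [random.uniform(-1, 1) for _ in range(32)]
x_i = [x_j[(t - 3) % 32] for t in range(32)]
estimated = gcc_phat_delay(x_i, x_j, max_lag=8)
```

In a real system an FFT and zero padding would replace the naive transform; the sketch only shows where the PHAT weight enters.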
2. Steered response power-phase transform (Steered Response Power-PhAse Transform, SRP-PHAT) method
The SRP-PHAT localization method searches the whole space for the point with the maximum steered power output; the finer the grid, the higher the search resolution. Its localization principle is as follows:
Assume the source is located at s, and let $\tau_i$ denote the propagation delay from the source to the i-th microphone. The SRP-PHAT output $P_{PHAT}(s)$ for the source position s is:
$$P_{PHAT}(s) = \sum_{i=1}^{m} \sum_{j=1}^{m} \int_{-\infty}^{\infty} \frac{\Psi_{ij}(\omega)}{|\Psi_{ij}(\omega)|}\, e^{\mathrm{j}\omega\tau_{ji}}\, d\omega$$
where $\Psi_{ij}(\omega) = X_i(\omega)\, X_j^*(\omega)$ is the cross-power spectral density of the two signals and $\tau_{ji}$ is the difference between the delays from the source at s to the j-th and the i-th microphone. The estimated source position $\hat{s}$ is:
$$\hat{s} = \arg\max_{s} P_{PHAT}(s)$$
In other words, SRP-PHAT hypothesizes that a source exists at some position in space, computes the phase-weighted steered power output for that position, and finally takes the position s that maximizes $P_{PHAT}(s)$ as the estimated source position. Because a source position must be hypothesized first, locating the source accurately generally requires a grid search that traverses every grid position in space. Assume the search space c has length, width and height X, Y and Z and the grid spacing is δ. The number of grid points N is then:
$$N = \left(\frac{X}{\delta} + 1\right)\left(\frac{Y}{\delta} + 1\right)\left(\frac{Z}{\delta} + 1\right)$$
With X = 3 m, Y = 3 m, Z = 2 m and grid spacing δ = 0.01 m, the number of grid points is N = 301 × 301 × 201 = 18210801, so $P_{PHAT}(s)$ has to be evaluated at more than 18 million points. Even in two dimensions, 301 × 301 = 90601 evaluations are required, so the efficiency is very low.
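The grid count can be checked with a short sketch. It assumes each axis contributes length/δ + 1 nodes (both end faces counted), which is the reading that reproduces the 301 × 301 × 201 figure in the text; the function name `grid_point_count` is hypothetical.

```python
def grid_point_count(x_len, y_len, z_len, spacing):
    """Number of grid nodes in a box, counting the nodes on both end faces of each axis."""
    def nodes(length):
        # round() guards against floating-point error in length / spacing
        return round(length / spacing) + 1
    return nodes(x_len) * nodes(y_len) * nodes(z_len)

n_3d = grid_point_count(3.0, 3.0, 2.0, 0.01)   # 18210801, the figure quoted in the text
n_2d = grid_point_count(3.0, 3.0, 0.0, 0.01)   # 90601, the two-dimensional case
```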
3. Stochastic region contraction method
The SRP-PHAT-based stochastic region contraction (Stochastic Region Contraction, SRC) method is an algorithm proposed by Do H and Silverman H F in 2007 to improve the efficiency of SRP-PHAT. Its main idea is to sample the space at random, select the subset of points with the largest PHAT steered power response, redefine the search boundary from this subset, and repeat the operation in the new region until the global maximum is found. The robustness of the algorithm rests on the probability of missing the peak during random sampling being almost zero, i.e. the probability that no sample falls on the source position is almost zero. Let the source volume be $V_s$ and the room volume be $V_{room}$. The probability that a single random sample hits the peak is $p_{hit} = V_s/V_{room}$, and the probability that it misses is $p_{miss} = 1 - V_s/V_{room}$. If the first round draws M points in total, the probability that all of them miss the peak is $P_{miss}(M) = (1 - V_s/V_{room})^M$. To keep the miss probability of the first round below a threshold $p_{thre}$, the number of samples M must satisfy $P_{miss}(M) \le p_{thre}$, which gives $M \ge \ln p_{thre} / \ln(1 - V_s/V_{room})$. In general, $p_{thre} = 0.005$ or $p_{thre} = 0.001$ is sufficient for robustness. However, because the sample points in this algorithm are completely random, the delay-related vectors must be computed afresh for every random sample, so each sample costs more than the evaluation of its steered power response alone.
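The sample-count condition can be sketched as follows, assuming independent uniform samples so that the miss probability after M draws is (1 - Vs/Vroom)^M. The function name and the example volumes (a 0.001 m³ source region in an 18 m³ room) are illustrative assumptions, not values from the source.

```python
import math

def min_random_samples(v_source, v_room, p_thre):
    """Smallest M such that (1 - v_source / v_room) ** M <= p_thre."""
    p_hit = v_source / v_room
    # log1p(-p_hit) = ln(1 - p_hit), computed accurately for small p_hit
    return math.ceil(math.log(p_thre) / math.log1p(-p_hit))

# Hypothetical numbers: 10 cm cube source region, 3 m x 3 m x 2 m room.
m_needed = min_random_samples(0.001, 18.0, 0.005)
```

Both logarithms are negative, so dividing flips the inequality and the result is a lower bound on M.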
4. High-resolution spectral estimation method
Localization based on high-resolution spectral estimation borrows localization techniques from radar arrays. Radar signals, however, are mostly narrowband, whereas speech signals are comparatively wideband, so this approach does not perform well.
Sound source localization is widely used in human-computer interaction, but traditional methods cannot achieve high robustness and real-time operation at the same time.
In summary, microphone-array sound source localization has been studied extensively. Compared with other methods, localization based on steered response power is resistant to ambient noise and reverberation, but its poor real-time performance makes it hard to use in practical scenarios.
Summary of the invention
To overcome the shortcomings of the prior art described above, the object of the invention is to provide a real-time spatial sound source localization method using a microphone array. Its localization principle is clear and its real-time performance is good; experiments show that its localization error in a plane can be kept at the centimeter level, which is better than prior-art methods. The method offers higher computation speed and robustness and can be applied to scenarios that require real-time sound source localization, such as smart homes and intelligent robots.
To achieve this object, the invention adopts the following technical solution:
A real-time spatial sound source localization method using a microphone array, comprising the following steps:
First, using a microphone array as the signal acquisition and output device, preliminarily provide the candidate points for the spatial position of the sound source using the steered response power-phase transform (SRP-PHAT) method;
Second, perform a preliminary screening of the candidate points using prior knowledge, and compute the steered power response output of the candidate points with the SRP-PHAT method;
Then, redefine the search boundary by an improved stochastic region contraction (Stochastic Region Contraction, SRC), improving the efficiency of the SRP-PHAT method;
Finally, compute the steered power response of the remaining candidate points and take the position of the maximum as the final estimate of the source position.
Preliminarily providing the candidate points for the spatial position of the sound source means determining all candidate spatial positions of the source, as follows:
Assume the length, width and height of the space c to be searched are X, Y and Z and the grid spacing is δ. The number of grid points is then $N = (X/\delta + 1)(Y/\delta + 1)(Z/\delta + 1)$, i.e. there are N candidate points for the spatial position of the sound source.
The preliminary screening of candidate points means a first reduction of the number of candidate points, as follows:
Assume the microphone array contains m microphones. The time-delay-of-arrival vector from a source at s to the microphone array is $TDOA_s = [\tau_{1,2,s}, \tau_{1,3,s}, \tau_{1,4,s}, \tau_{2,3,s}, \tau_{2,4,s}, \tau_{3,4,s}, \ldots]^T$, where $\tau_{i,j,s} = (d_{i,s} - d_{j,s})/v$ is the delay of the source at s between the i-th and the j-th microphone, v is the speed of sound in air, $d_{i,s}$ is the physical distance from the source at s to the i-th microphone, and $d_{j,s}$ is the physical distance from the source at s to the j-th microphone;
The sample number difference (Sample number Difference, SD) vector is defined as:
$$SD_s = [sd_{1,2,s}, sd_{1,3,s}, sd_{1,4,s}, sd_{2,3,s}, sd_{2,4,s}, sd_{3,4,s}, \ldots]^T$$
When the signal sampling frequency is fs:
$$sd_{i,j,s} = \mathrm{round}(f_s\, \tau_{i,j,s})$$
where round rounds each element to the nearest integer. If several candidate points yield identical SD vectors, only one of them (any one) is kept and the others are deleted, to avoid repeated computation;
Next, the sample number difference between every two candidate points s1 and s2 is obtained:
$$SD_{s1,s2} = \mathrm{abs}(SD_{s1} - SD_{s2})$$
where abs takes the element-wise absolute value. Among all candidate points satisfying $\max(SD_{s1,s2}) \le threshold$, only one (any one) is kept and the others are deleted, to avoid repeated computation. The threshold is defined as:
$$threshold = \left(\frac{1}{5} \times \frac{\lambda}{v} \times f_s\right)$$
where λ is the wavelength of the sound. At a sampling frequency fs of 16000 Hz, setting threshold = 1 effectively reduces the number of candidate points.
In practice, the computation of the TDOA and SD vectors and the selection of candidate points are carried out in advance and stored in a lookup table, so that during localization the values are only looked up by index and need not be recomputed. Building the lookup table in advance is also key to the speed-up.
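The screening and lookup-table step can be sketched as below. This is one possible reading, not the patent's stated procedure: the source does not say how to pick representatives when similarity is not transitive, so the sketch uses a greedy pass that keeps a candidate only if its SD vector differs from every kept vector by more than the threshold in at least one component. The speed of sound (343 m/s), the microphone layout, the grid and all function names are assumptions.

```python
import math
from itertools import combinations

def sd_vector(point, mics, fs, v=343.0):
    """Sample number differences round(fs * (d_i - d_j) / v) for all pairs i < j."""
    d = [math.dist(point, m) for m in mics]
    return tuple(round(fs * (d[i] - d[j]) / v)
                 for i, j in combinations(range(len(mics)), 2))

def prune_candidates(candidates, mics, fs, threshold=1):
    """Greedy screening; returns the lookup table as a list of (SD vector, point)."""
    table = []
    for p in candidates:
        sd = sd_vector(p, mics, fs)
        # keep p only if it is "far" (max component difference > threshold) from all kept entries
        if all(max(abs(a - b) for a, b in zip(sd, kept_sd)) > threshold
               for kept_sd, _ in table):
            table.append((sd, p))
    return table

# Hypothetical four-microphone square array and a coarse plane of candidates 1 m away.
mics = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.0)]
grid = [(0.25 * ix, 0.25 * iy, 1.0) for ix in range(13) for iy in range(13)]
table = prune_candidates(grid, mics, fs=16000, threshold=1)
```

Because the table depends only on the array geometry, the grid and the sampling rate, it can be built once offline and reused for every frame.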
The steered power response output of a candidate point is computed as follows:
Assume a candidate point obtained from the screening is located at spatial position s. Let $X_i(\omega)$ and $X_j(\omega)$ denote the Fourier transforms of the signal $x_i(t)$ received by the i-th microphone and the signal $x_j(t)$ received by the j-th microphone, let $\tau_i$ denote the propagation delay from the source to the i-th microphone, and let $\tau_{ji}$ denote the difference between the delays from the source at s to the j-th and the i-th microphone. By the definition of the SRP-PHAT method, the steered power response output $P_{PHAT}(s)$ for the source position s is:
$$P_{PHAT}(s) = \sum_{i=1}^{m} \sum_{j=1}^{m} \int_{-\infty}^{\infty} \frac{\Psi_{ij}(\omega)}{|\Psi_{ij}(\omega)|}\, e^{\mathrm{j}\omega\tau_{ji}}\, d\omega$$
where $\Psi_{ij}(\omega) = X_i(\omega)\, X_j^*(\omega)$ is the cross-power spectral density of the two signals and m is the total number of microphones. $P_{PHAT}(s)$ is computed for every candidate point that survives the screening.
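One observation that may help when evaluating the double sum, offered as a sketch rather than as part of the source: assuming the delay differences are antisymmetric ($\tau_{ij} = -\tau_{ji}$) and the integral is taken over a finite analysis band, the (i, j) and (j, i) terms are complex conjugates and the i = j terms do not depend on s. The symbol $T_{ij}(s)$ is introduced here only for this derivation.

```latex
\begin{aligned}
T_{ij}(s) &= \int \frac{\Psi_{ij}(\omega)}{|\Psi_{ij}(\omega)|}\, e^{\mathrm{j}\omega\tau_{ji}}\, d\omega, \\
T_{ji}(s) &= \int \frac{\Psi_{ij}^{*}(\omega)}{|\Psi_{ij}(\omega)|}\, e^{-\mathrm{j}\omega\tau_{ji}}\, d\omega = T_{ij}^{*}(s)
  \quad \text{since } \Psi_{ji} = \Psi_{ij}^{*},\ \tau_{ij} = -\tau_{ji}, \\
P_{PHAT}(s) &= \sum_{i=1}^{m} T_{ii} + 2 \sum_{i<j} \operatorname{Re}\, T_{ij}(s).
\end{aligned}
```

Under these assumptions only the m(m-1)/2 pairs with i < j need to be evaluated, which is the same number of entries as in the TDOA vector (six for four microphones).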
Redefining the search boundary means searching for the global maximum with a fast stochastic region contraction algorithm, as follows:
Instead of random sampling, the steered power response output is computed for the screened candidate points (whose TDOA and SD vectors can be read directly from the lookup table of the second step). The N largest values are selected, M of their corresponding candidate points are chosen at random, and the search boundary is redefined from these points. The selection and search are then repeated within the new region until the required precision is met.
The choice of N and of the subsequent random-sampling size M depends on the shape of the particular microphone array, the size of the room and similar factors. There are many strategies for choosing M and N; the following may be used:
Mode one: N is set to a fixed value, and M is also set to a fixed value. For example, with a four-element microphone array, efficiency and accuracy are generally best at N = 100.
Mode two: in each round, from the steered power response outputs of the previous N candidate points, pick the candidate points whose output exceeds the mean; let their number be N'. If N - N' ≤ N', randomly pick N - N' of these candidate points as candidates; if N - N' > N', keep all N' candidate points. This ensures that the mean keeps increasing after each region contraction. The selection and search stop when the number of steered power response evaluations for these candidate points exceeds a given threshold.
The localization result obtained in this way meets general precision requirements. For more precise localization, either a fine grid search can be carried out in a small range around the source position finally determined, or, using the lookup table, the regions corresponding to several of the largest steered power response outputs are obtained first, all candidate points in the grid cells near these regions are found, their steered power responses are computed, and the position of the maximum is taken as the final estimate of the source position.
Compared with the prior art, the invention has the following beneficial effects:
(1) The proposed preliminary screening of candidate points using prior information greatly reduces the cost of computing the steered power response output at candidate points and is applicable to a variety of scenarios;
(2) The proposed storage of the position-related vectors of candidate points in a lookup table in advance is simple in principle, has low computational cost, and effectively improves real-time performance;
(3) The two proposed methods for a further precise search of the source position based on the candidate points further improve the resolution and accuracy of localization with low computational complexity, making them suitable for devices and scenarios with modest hardware.
The computational cost of the proposed method when localizing a source signal in space is lower than that of the prior art. The method is widely applicable and responds in real time, and is suited to scenarios such as intelligent robots and smart homes that require accurate localization by sound.
Brief description of the drawings
Fig. 1 is an overall schematic diagram of the real-time spatial sound source localization method using a microphone array.
Detailed description
Embodiments of the invention are described in detail below with reference to the drawing and examples.
As shown in Fig. 1, the complete computation procedure of the embodiment consists of the following steps:
1. Determine all candidate points for the spatial position of the sound source
Assume the length, width and height of the space c to be searched are X, Y and Z and the grid spacing is δ. The number of grid points N is then:
$$N = \left(\frac{X}{\delta} + 1\right)\left(\frac{Y}{\delta} + 1\right)\left(\frac{Z}{\delta} + 1\right) \quad \text{(formula 1)}$$
That is, there are N candidate points for the spatial position of the sound source.
2. Compute the time-delay-of-arrival vector and the sample number difference vector
Assume the microphone array contains m ≥ 4 microphones. The time-delay-of-arrival vector $TDOA_s$ from a source at s to the microphone array is:
$$TDOA_s = [\tau_{1,2,s}, \tau_{1,3,s}, \tau_{1,4,s}, \tau_{2,3,s}, \tau_{2,4,s}, \tau_{3,4,s}, \ldots]^T \quad \text{(formula 2)}$$
where $\tau_{i,j,s} = (d_{i,s} - d_{j,s})/v$ is the delay of the source at s between the i-th and the j-th microphone, v is the speed of sound in air, $d_{i,s}$ is the physical distance from the source at s to the i-th microphone, and $d_{j,s}$ is the physical distance from the source at s to the j-th microphone.
The sample number difference vector $SD_s$ is defined as:
$$SD_s = [sd_{1,2,s}, sd_{1,3,s}, sd_{1,4,s}, sd_{2,3,s}, sd_{2,4,s}, sd_{3,4,s}, \ldots]^T \quad \text{(formula 3)}$$
When the signal sampling frequency is fs:
$$sd_{i,j,s} = \mathrm{round}(f_s\, \tau_{i,j,s}) \quad \text{(formula 4)}$$
where round rounds each element to the nearest integer and fs is the sampling frequency. The sample number difference between every two candidate points s1 and s2 is then obtained:
$$SD_{s1,s2} = \mathrm{abs}(SD_{s1} - SD_{s2}) \quad \text{(formula 5)}$$
where abs takes the element-wise absolute value.
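Formulas 2 to 5 translate almost directly into code. The sketch below assumes a speed of sound of 343 m/s and a hypothetical four-microphone square array; the function names are illustrative, and pairs are ordered (1,2), (1,3), (1,4), (2,3), (2,4), (3,4) as in formula 2.

```python
import math
from itertools import combinations

def tdoa_vector(point, mics, v=343.0):
    """Formula 2: tau_{i,j,s} = (d_{i,s} - d_{j,s}) / v for every pair i < j."""
    d = [math.dist(point, m) for m in mics]
    return [(d[i] - d[j]) / v for i, j in combinations(range(len(mics)), 2)]

def sd_vector(point, mics, fs, v=343.0):
    """Formulas 3 and 4: delays rounded to whole samples at sampling rate fs."""
    return [round(fs * tau) for tau in tdoa_vector(point, mics, v)]

def sd_difference(sd_a, sd_b):
    """Formula 5: element-wise absolute difference of two SD vectors."""
    return [abs(a - b) for a, b in zip(sd_a, sd_b)]

# Hypothetical array centred on the origin; a point on its axis is equidistant from all microphones.
mics = [(-0.05, -0.05, 0.0), (0.05, -0.05, 0.0), (-0.05, 0.05, 0.0), (0.05, 0.05, 0.0)]
on_axis = sd_vector((0.0, 0.0, 1.0), mics, fs=16000)
off_axis = sd_vector((1.0, 0.5, 1.0), mics, fs=16000)
diff = sd_difference(on_axis, off_axis)
```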
3. Delete some of the candidate points
In the computation of step 2, if several candidate points yield identical sample number difference vectors SD from formula 4, only one of them (any one) is kept and the others are deleted, to avoid repeated computation. Formula 5 is then applied to the remaining candidate points: among all candidate points satisfying $\max(SD_{s1,s2}) \le threshold$, only one (any one) is kept and the others are deleted, to avoid repeated computation. The threshold is defined here as:
$$threshold = \left(\frac{1}{5} \times \frac{\lambda}{v} \times f_s\right)$$
where λ is the wavelength of the sound. At a sampling frequency fs of 16000 Hz, setting threshold = 1 effectively reduces the number of candidate points.
In addition, in practice the computation of the TDOA and SD vectors and the selection of candidate points are carried out in advance and stored in a lookup table, so that during localization the values are only looked up by index and need not be recomputed. Building the lookup table in advance is also key to the speed-up.
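A worked reading of the threshold in step 3, under the assumption (not stated in the source) that λ/v is the period of a sound component of frequency f = v/λ, so that the threshold is one fifth of that period expressed in samples:

```latex
\[
threshold = \frac{1}{5}\cdot\frac{\lambda}{v}\cdot f_s = \frac{f_s}{5f},
\qquad
f_s = 16000\ \mathrm{Hz},\ threshold = 1
\;\Rightarrow\;
f = \frac{16000}{5} = 3200\ \mathrm{Hz}.
\]
```

Under this reading, threshold = 1 merges candidate points whose pairwise delays differ by no more than one sample, which is one fifth of the period of a 3200 Hz component.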
4. Compute the steered power response output of the candidate points
Assume a candidate point obtained after the above steps is located at spatial position s. Let $X_i(\omega)$ and $X_j(\omega)$ denote the Fourier transforms of the signal $x_i(t)$ received by the i-th microphone and the signal $x_j(t)$ received by the j-th microphone, let $\tau_i$ denote the propagation delay from the source to the i-th microphone, and let $\tau_{ji}$ denote the difference between the delays from the source at s to the j-th and the i-th microphone. By the definition of the SRP-PHAT method, the steered power output $P_{PHAT}(s)$ for the source position s is:
$$P_{PHAT}(s) = \sum_{i=1}^{m} \sum_{j=1}^{m} \int_{-\infty}^{\infty} \frac{\Psi_{ij}(\omega)}{|\Psi_{ij}(\omega)|}\, e^{\mathrm{j}\omega\tau_{ji}}\, d\omega$$
where $\Psi_{ij}(\omega) = X_i(\omega)\, X_j^*(\omega)$ is the cross-power spectral density of the two signals. $P_{PHAT}(s)$ is computed for all candidate points obtained from the screening above.
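A discrete sketch of this step, under stated assumptions: frames are equal-length and treated as circular, only pairs i < j are summed (the remaining terms are conjugates or constants), and the steering lag for pair (i, j) is the stored sample number difference sd_{i,j,s}, with the sign chosen so that it cancels the phase of the cross-spectrum for a true source position. The naive DFT, the function names and the toy delays are all illustrative, not taken from the source.

```python
import cmath
import random
from itertools import combinations

def dft(x, sign=-1):
    """Naive DFT (sign=-1) or unnormalised inverse DFT (sign=+1)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def gcc_phat(x_i, x_j):
    """Circular PHAT-weighted cross-correlation r_ij[lag] of two frames."""
    cross = [a * b.conjugate() for a, b in zip(dft(x_i), dft(x_j))]
    phat = [c / abs(c) if abs(c) > 1e-12 else 0.0 for c in cross]
    return [value.real for value in dft(phat, sign=+1)]

def srp_phat(frames, sd_vectors):
    """Score each candidate by summing r_ij at its stored lags (pairs ordered i < j)."""
    n = len(frames[0])
    pairs = list(combinations(range(len(frames)), 2))
    gcc = {pair: gcc_phat(frames[pair[0]], frames[pair[1]]) for pair in pairs}
    return [sum(gcc[pair][lag % n] for pair, lag in zip(pairs, sd)) for sd in sd_vectors]

# Toy data: one source signal reaching four microphones with integer sample delays.
random.seed(1)
src = [random.uniform(-1, 1) for _ in range(64)]
delays = [0, 2, 5, 3]
frames = [[src[(t - d) % 64] for t in range(64)] for d in delays]
true_sd = [delays[i] - delays[j] for i, j in combinations(range(4), 2)]
candidates = [true_sd, [0] * 6, [1, -1, 2, 0, 3, -2]]
scores = srp_phat(frames, candidates)
best = max(range(len(candidates)), key=scores.__getitem__)
```

Computing each pair's correlation once and then reading it at the lags stored for each candidate is what makes the lookup table useful: the per-candidate cost reduces to m(m-1)/2 array reads.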
5. Determine M and N and redefine the search boundary
From the $P_{PHAT}(s)$ values computed in the preceding steps, select the N largest, choose M of their corresponding candidate points at random, and redefine the search boundary from these points.
The choice of N and of the subsequent random-sampling size M depends on the shape of the particular microphone array, the size of the room and similar factors. There are many strategies for choosing M and N; the simplest is to fix both at set values. For example, with a four-element microphone array, efficiency and accuracy are generally best at N = 100. Another strategy for choosing N is as follows: in each round, from the $P_{PHAT}(s)$ values of the previous N candidate points, pick the candidate points whose value exceeds the mean; let their number be N'. If N - N' ≤ N', randomly pick N - N' of these candidate points as candidates; if N - N' > N', keep all N' candidate points. This ensures that the mean keeps increasing after each region contraction. The selection and search stop when the number of $P_{PHAT}(s)$ evaluations for these candidate points exceeds a given threshold.
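The second strategy can be written down literally as follows. This is a sketch of the rule exactly as worded in the text; the function name and the toy scores are hypothetical, and the behaviour when all scores are equal (no point exceeds the mean, so an empty list is returned) is a consequence of that literal reading rather than something the source discusses.

```python
import random

def select_above_mean(points, scores, rng=random):
    """Keep the above-mean points, subsampled as the text prescribes."""
    n = len(points)
    mean = sum(scores) / n
    above = [p for p, s in zip(points, scores) if s > mean]
    n_prime = len(above)
    if n - n_prime <= n_prime:
        # N - N' <= N': draw N - N' of the above-mean points at random
        return rng.sample(above, n - n_prime)
    # N - N' > N': keep all N' above-mean points
    return above

few_high = select_above_mean(["a", "b", "c", "d"], [1, 2, 3, 10])   # one point above the mean 4
many_high = select_above_mean(["a", "b", "c", "d"], [0, 5, 5, 5],
                              rng=random.Random(0))                 # three points above the mean 3.75
```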
6. Repeat the selection and search until the required precision is met, then output the candidate points
After the above steps, a new search region is obtained. The selection and search are repeated within this region until the required precision is met, and the candidate points that meet the requirement are then output. The localization result obtained in this way meets general precision requirements.
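The repeated selection and contraction can be sketched as a loop over a shrinking bounding box. Several details here are assumptions rather than the source's method: the new boundary is taken as the axis-aligned bounding box of the M chosen points, the current best point is always kept among the M (a safeguard added so the maximum cannot be cut out of the region), and `toy_score` merely stands in for $P_{PHAT}(s)$.

```python
import random

def contract_search(candidates, score, n_top=100, m_keep=20,
                    precision=0.05, max_rounds=50, rng=random):
    """Shrink the search box around high-scoring candidates; return the best point found."""
    cache = {p: score(p) for p in candidates}        # response of every screened candidate
    region = list(candidates)
    for _ in range(max_rounds):
        ranked = sorted(region, key=cache.get, reverse=True)[:n_top]
        # Assumed safeguard: keep the best point, draw the remaining M - 1 at random.
        chosen = [ranked[0]] + rng.sample(ranked[1:], min(m_keep - 1, len(ranked) - 1))
        dims = range(len(chosen[0]))
        lo = [min(p[d] for p in chosen) for d in dims]
        hi = [max(p[d] for p in chosen) for d in dims]
        region = [p for p in region if all(l <= c <= h for c, l, h in zip(p, lo, hi))]
        if all(h - l <= precision for l, h in zip(lo, hi)):
            break                                     # box is small enough
    return max(region, key=cache.get)

# Toy example: a 31 x 31 planar grid and a single-peaked stand-in for the response.
grid = [(0.1 * ix, 0.1 * iy) for ix in range(31) for iy in range(31)]
toy_score = lambda p: -((p[0] - 1.2) ** 2 + (p[1] - 2.1) ** 2)
best = contract_search(grid, toy_score, rng=random.Random(0))
```

With the safeguard in place the returned point is always the best-scoring candidate; without it, a random draw could in principle exclude the peak, which is the robustness concern discussed for SRC in the background section.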
7. Achieve finer localization with the grid method or the lookup table
For more precise localization, either a fine grid search can be carried out in a small range around the source position finally determined, or, using the lookup table, the regions corresponding to several of the largest $P_{PHAT}(s)$ values are obtained first, all candidate points in the grid cells near these regions are found, their steered power responses are computed, and the position of the maximum is taken as the final estimate of the source position.
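The first refinement option, a fine grid in a small range around the coarse result, might look like the sketch below. The half-width, the step and the stand-in score function are illustrative assumptions; in practice the score would be the $P_{PHAT}(s)$ evaluation.

```python
import itertools

def refine_on_local_grid(center, score, half_width=0.05, step=0.01):
    """Evaluate a fine grid centred on the coarse estimate and return its best point."""
    k = round(half_width / step)
    offsets = [i * step for i in range(-k, k + 1)]
    local = [tuple(c + o for c, o in zip(center, offset))
             for offset in itertools.product(offsets, repeat=len(center))]
    return max(local, key=score)

# Toy example: coarse estimate (1.2, 2.1), true peak of the stand-in score at (1.23, 2.08).
fine_score = lambda p: -((p[0] - 1.23) ** 2 + (p[1] - 2.08) ** 2)
refined = refine_on_local_grid((1.2, 2.1), fine_score)
```

The number of evaluations is (2k + 1) raised to the number of dimensions, so the fine search stays cheap only because the coarse stage has already narrowed the range.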

Claims (10)

1. A real-time spatial sound source localization method using a microphone array, characterized by comprising the following steps:
First, using a microphone array as the signal acquisition and output device, preliminarily providing the candidate points for the spatial position of the sound source using the steered response power-phase transform (SRP-PHAT) method;
Second, performing a preliminary screening of the candidate points using prior knowledge, and computing the steered power response output of the candidate points with the SRP-PHAT method;
Then, redefining the search boundary by an improved stochastic region contraction (Stochastic Region Contraction, SRC), improving the efficiency of the SRP-PHAT method;
Finally, computing the steered power response of the remaining candidate points and taking the position of the maximum as the final estimate of the source position.
2. The method according to claim 1, characterized in that preliminarily providing the candidate points for the spatial position of the sound source means determining all candidate spatial positions of the source, as follows:
Assume the length, width and height of the space c to be searched are X, Y and Z and the grid spacing is δ; the number of grid points is then $N = (X/\delta + 1)(Y/\delta + 1)(Z/\delta + 1)$, i.e. there are N candidate points for the spatial position of the sound source.
3. The method according to claim 1, characterized in that the preliminary screening of candidate points means a first reduction of the number of candidate points, as follows:
Assume the microphone array contains m microphones; the time-delay-of-arrival vector from a source at s to the microphone array is $TDOA_s = [\tau_{1,2,s}, \tau_{1,3,s}, \tau_{1,4,s}, \tau_{2,3,s}, \tau_{2,4,s}, \tau_{3,4,s}, \ldots]^T$, where $\tau_{i,j,s} = (d_{i,s} - d_{j,s})/v$ is the delay of the source at s between the i-th and the j-th microphone, v is the speed of sound in air, $d_{i,s}$ is the physical distance from the source at s to the i-th microphone, and $d_{j,s}$ is the physical distance from the source at s to the j-th microphone;
The sample number difference (Sample number Difference, SD) vector is defined as:
$$SD_s = [sd_{1,2,s}, sd_{1,3,s}, sd_{1,4,s}, sd_{2,3,s}, sd_{2,4,s}, sd_{3,4,s}, \ldots]^T$$
When the signal sampling frequency is fs:
$$sd_{i,j,s} = \mathrm{round}(f_s\, \tau_{i,j,s})$$
where round rounds each element to the nearest integer; if several candidate points yield identical SD vectors, only one of them (any one) is kept and the others are deleted, to avoid repeated computation;
Next, the sample number difference between every two candidate points s1 and s2 is obtained:
$$SD_{s1,s2} = \mathrm{abs}(SD_{s1} - SD_{s2})$$
where abs takes the element-wise absolute value; among all candidate points satisfying $\max(SD_{s1,s2}) \le threshold$, only one (any one) is kept and the others are deleted, to avoid repeated computation; the threshold is defined as:
$$threshold = \left(\frac{1}{5} \times \frac{\lambda}{v} \times f_s\right)$$
where λ is the wavelength of the sound.
4. The method according to claim 3, characterized in that when the sampling frequency fs is 16000 Hz, threshold = 1 is set.
5. The method according to claim 3, characterized in that the computation of the TDOA and SD vectors and the selection of candidate points are carried out in advance and stored in a lookup table, and during localization the values are looked up directly by index without recomputation.
6. The method according to claim 3, characterized in that the steered power response output of a candidate point is computed as follows:
Assume a candidate point obtained from the screening is located at spatial position s; let $X_i(\omega)$ and $X_j(\omega)$ denote the Fourier transforms of the signal $x_i(t)$ received by the i-th microphone and the signal $x_j(t)$ received by the j-th microphone, let $\tau_i$ denote the propagation delay from the source to the i-th microphone, and let $\tau_{ji}$ denote the difference between the delays from the source at s to the j-th and the i-th microphone; by the definition of the SRP-PHAT method, the steered power response output $P_{PHAT}(s)$ for the source position s is:
$$P_{PHAT}(s) = \sum_{i=1}^{m} \sum_{j=1}^{m} \int_{-\infty}^{\infty} \frac{\Psi_{ij}(\omega)}{|\Psi_{ij}(\omega)|}\, e^{\mathrm{j}\omega\tau_{ji}}\, d\omega$$
where $\Psi_{ij}(\omega) = X_i(\omega)\, X_j^*(\omega)$ is the cross-power spectral density of the two signals and m is the total number of microphones; $P_{PHAT}(s)$ is computed for every candidate point that survives the screening.
7. The microphone array sound source space real-time location method according to claim 1, characterized in that redefining the search boundary means searching for the global maximum with a fast stochastic region contraction algorithm, as follows:
For the candidate points obtained by screening, the corresponding controlled power response outputs are calculated and the N largest values are selected; M points are then randomly chosen from the corresponding candidate points, and the search boundary is redefined from these points. This selection and search is then repeated within the new region until the required accuracy is met.
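The contraction loop described above can be illustrated with the following sketch over a discrete candidate set. Several details are assumptions made for the example rather than statements from the source: the new boundary is the axis-aligned bounding box of the picked points, the current best point is always among the M picked points (so the region never loses it), and the loop stops when the box is no larger than `tol` in every dimension. All names are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def bounding_box(points: Sequence[Point]) -> Tuple[Point, Point]:
    """Axis-aligned box (lower corner, upper corner) enclosing the points."""
    dims = range(len(points[0]))
    lo = tuple(min(p[d] for p in points) for d in dims)
    hi = tuple(max(p[d] for p in points) for d in dims)
    return lo, hi


def contract_region(candidates: Sequence[Point],
                    response: Callable[[Point], float],
                    n_best: int, m_pick: int, tol: float,
                    rng: random.Random, max_iter: int = 50) -> Point:
    """Shrink the search region around high-response candidates."""
    region: List[Point] = list(candidates)
    for _ in range(max_iter):
        # Keep the N candidates with the largest response.
        best = sorted(region, key=response, reverse=True)[:n_best]
        rest = best[1:]
        # Pick M of them at random; the top point is always included (assumption).
        picked = [best[0]] + rng.sample(rest, min(max(m_pick - 1, 0), len(rest)))
        lo, hi = bounding_box(picked)
        if max(h - l for l, h in zip(lo, hi)) <= tol:
            break  # region is small enough
        shrunk = [p for p in region
                  if all(l <= x <= h for x, l, h in zip(p, lo, hi))]
        if len(shrunk) == len(region):
            break  # no further contraction possible
        region = shrunk
    return max(region, key=response)
```

In practice `response` would evaluate the controlled power response output at a candidate point; here it is left as an arbitrary callable.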
8. The microphone array sound source space real-time location method according to claim 7, characterized in that the choice of the value of N, and the subsequent determination of the random sampling value M, depend on the specific microphone array shape and room size, and the value of N is chosen in one of the following ways:
Mode one: N is a fixed value, and M is also a fixed value;
Mode two: each time, based on the controlled power response outputs of the previous N candidate points, the candidate points whose controlled power response output is greater than the mean are picked out, their total number being N'. If N-N' ≤ N', then N-N' of these candidate points are randomly selected as candidate points; if N-N' > N', all N' candidate points are retained. Selection and search stop when the number of controlled power response output computations for these candidate points exceeds a given threshold.
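One selection step of mode two can be sketched as below. The translated claim is ambiguous in places, so this follows a literal reading of it; the function name and signature are hypothetical, and the stopping criterion on the number of response computations is left to the caller.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def select_mode_two(points: Sequence[Point],
                    responses: Sequence[float],
                    rng: random.Random) -> List[Point]:
    """Pick the next candidate set from the previous N candidates (literal reading)."""
    n = len(points)
    if n == 0:
        return []
    mean = sum(responses) / n
    # Candidates whose response exceeds the mean; their count is N'.
    above = [p for p, r in zip(points, responses) if r > mean]
    n_prime = len(above)
    if n - n_prime <= n_prime:
        # N - N' <= N': randomly keep N - N' of the above-mean candidates.
        return rng.sample(above, n - n_prime)
    # N - N' > N': keep all N' above-mean candidates.
    return above
```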
9. The microphone array sound source space real-time location method according to claim 8, characterized in that in mode one, for a four-element microphone array, N = 100 is selected.
10. The microphone array sound source space real-time location method according to claim 1, characterized in that a grid method is used for a precise search within a small range near the finally determined sound source result; or, according to the look-up table, the regions corresponding to several controlled power response output maxima are first obtained, all candidate points in the grids adjacent to these regions are found, the controlled power responses of these candidate points are then calculated, and the position of the maximum is chosen as the final sound source position estimate.
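The first alternative, a fine grid search in a small neighborhood of the coarse estimate, might be sketched as follows. The cubic neighborhood, the uniform step, and the names `grid_refine`, `radius`, and `step` are assumptions made for illustration; `response` stands in for the controlled power response output at a point.

```python
from itertools import product
from typing import Callable, Tuple

Point = Tuple[float, ...]


def grid_refine(center: Point, radius: float, step: float,
                response: Callable[[Point], float]) -> Point:
    """Evaluate a uniform grid around `center` and return the best grid point."""
    n = int(round(radius / step))
    offsets = [k * step for k in range(-n, n + 1)]
    best, best_val = center, response(center)
    # Visit every grid point within +/- radius of the center in each dimension.
    for delta in product(offsets, repeat=len(center)):
        p = tuple(c + d for c, d in zip(center, delta))
        val = response(p)
        if val > best_val:
            best, best_val = p, val
    return best
```

The cost grows with the number of grid points, which is why the refinement is confined to a small range around the estimate.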
CN201610391351.9A 2016-06-03 2016-06-03 A kind of microphone array sound source space real-time location method Active CN106093864B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610391351.9A CN106093864B (en) 2016-06-03 2016-06-03 A kind of microphone array sound source space real-time location method

Publications (2)

Publication Number Publication Date
CN106093864A true CN106093864A (en) 2016-11-09
CN106093864B CN106093864B (en) 2018-04-17

Family

ID=57447696

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610391351.9A Active CN106093864B (en) 2016-06-03 2016-06-03 A kind of microphone array sound source space real-time location method

Country Status (1)

Country Link
CN (1) CN106093864B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090279714A1 (en) * 2008-05-06 2009-11-12 Samsung Electronics Co., Ltd. Apparatus and method for localizing sound source in robot
CN101762806A (en) * 2010-01-27 2010-06-30 华为终端有限公司 Sound source locating method and apparatus thereof
KR20140015893A (en) * 2012-07-26 2014-02-07 삼성테크윈 주식회사 Apparatus and method for estimating location of sound source
CN104142492A (en) * 2014-07-29 2014-11-12 佛山科学技术学院 SRP-PHAT multi-source spatial positioning method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HOANG DO et al.: "A Real-Time SRP-PHAT Source Location Implementation using Stochastic Region Contraction (SRC) on a Large-Aperture Microphone Array", 《ACOUSTICS, SPEECH AND SIGNAL PROCESSING, 2007. ICASSP 2007. IEEE INTERNATIONAL CONFERENCE ON》 *
YUAN Xiaokun et al.: "A Survey of Improved SRP-PHAT Algorithms" (SRP-PHAT的改进算法综述), 《电声技术》 (Audio Engineering) *

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107167770A (en) * 2017-06-02 2017-09-15 厦门大学 A kind of microphone array sound source locating device under the conditions of reverberation
CN107621625A (en) * 2017-06-23 2018-01-23 桂林电子科技大学 Sound localization method based on double micro-microphone battle arrays
CN107621625B (en) * 2017-06-23 2020-07-17 桂林电子科技大学 Sound source positioning method based on double micro microphones
CN109669158B (en) * 2017-10-16 2021-04-20 杭州海康威视数字技术股份有限公司 Sound source positioning method, system, computer equipment and storage medium
CN109669158A (en) * 2017-10-16 2019-04-23 杭州海康威视数字技术股份有限公司 A kind of sound localization method, system, computer equipment and storage medium
US11079462B2 (en) 2017-10-24 2021-08-03 International Business Machines Corporation Facilitation of efficient signal source location employing a coarse algorithm and high-resolution computation
CN108549052A (en) * 2018-03-20 2018-09-18 南京航空航天大学 A kind of humorous domain puppet sound intensity sound localization method of circle of time-frequency-spatial domain joint weighting
CN108549052B (en) * 2018-03-20 2021-04-13 南京航空航天大学 Time-frequency-space domain combined weighted circular harmonic domain pseudo-sound strong sound source positioning method
CN108872939A (en) * 2018-04-29 2018-11-23 桂林电子科技大学 Interior space geometric profile reconstructing method based on acoustics mirror image model
CN108872939B (en) * 2018-04-29 2020-09-29 桂林电子科技大学 Indoor space geometric outline reconstruction method based on acoustic mirror image model
CN108417036A (en) * 2018-05-07 2018-08-17 北京中电慧声科技有限公司 Vehicle whistle sound localization method and device in intelligent transportation system
CN109709517A (en) * 2018-12-10 2019-05-03 东南大学 SRP-PHAT auditory localization trellis search method based on simulated annealing
CN109741609A (en) * 2019-02-25 2019-05-10 南京理工大学 A kind of motor vehicle whistle sound monitoring method based on microphone array
CN109741609B (en) * 2019-02-25 2021-05-04 南京理工大学 Motor vehicle whistling monitoring method based on microphone array
CN110082724B (en) * 2019-05-31 2021-09-21 浙江大华技术股份有限公司 Sound source positioning method, device and storage medium
CN110082724A (en) * 2019-05-31 2019-08-02 浙江大华技术股份有限公司 A kind of sound localization method, device and storage medium
CN110544490B (en) * 2019-07-30 2022-04-05 南京工程学院 Sound source positioning method based on Gaussian mixture model and spatial power spectrum characteristics
CN110544490A (en) * 2019-07-30 2019-12-06 南京林业大学 sound source positioning method based on Gaussian mixture model and spatial power spectrum characteristics
CN111273231A (en) * 2020-03-23 2020-06-12 桂林电子科技大学 Indoor sound source positioning method based on different microphone array topological structure analysis
CN111443329A (en) * 2020-03-25 2020-07-24 北京东方振动和噪声技术研究所 Sound source positioning method and device, computer storage medium and electronic equipment
CN111856400B (en) * 2020-07-29 2021-04-09 中北大学 Underwater target sound source positioning method and system
CN111856400A (en) * 2020-07-29 2020-10-30 中北大学 Underwater target sound source positioning method and system
CN111929645A (en) * 2020-09-23 2020-11-13 深圳市友杰智新科技有限公司 Method and device for positioning sound source of specific human voice and computer equipment
CN113419216B (en) * 2021-06-21 2023-10-31 南京信息工程大学 Multi-sound source positioning method suitable for reverberant environment
CN113419216A (en) * 2021-06-21 2021-09-21 南京信息工程大学 Multi-sound-source positioning method suitable for reverberation environment
CN113791386A (en) * 2021-08-06 2021-12-14 浙江大华技术股份有限公司 Method, device and equipment for positioning sound source and computer readable storage medium
CN113791386B (en) * 2021-08-06 2024-03-29 浙江大华技术股份有限公司 Sound source positioning method, device, equipment and computer readable storage medium
CN115184868A (en) * 2022-07-04 2022-10-14 杭州爱谱科技有限公司 Method for positioning three-dimensional position of noise source
CN115452141A (en) * 2022-11-08 2022-12-09 杭州兆华电子股份有限公司 Non-uniform acoustic imaging method
CN117368847A (en) * 2023-12-07 2024-01-09 深圳市好兄弟电子有限公司 Positioning method and system based on microphone radio frequency communication network
CN117368847B (en) * 2023-12-07 2024-03-15 深圳市好兄弟电子有限公司 Positioning method and system based on microphone radio frequency communication network

Also Published As

Publication number Publication date
CN106093864B (en) 2018-04-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant