CN1952684A

CN1952684A - Method and device for localization of sound source by microphone

Info

Publication number: CN1952684A
Application number: CN 200510116434
Authority: CN
Inventors: 崔玮玮; 魏建强
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2005-10-20
Filing date: 2005-10-20
Publication date: 2007-04-25

Abstract

A method for locating a sound source is disclosed, comprising the steps of: receiving, by a receiving device, the sound signal emitted by the sound source; estimating the time difference with which the sound signal arrives at the receiving device; estimating the energy ratio of the sound signals respectively received by the receiving device; and determining the position of the sound source from the estimated time difference and energy ratio. The dual-microphone localization method resembles the binaural localization mechanism of humans: it uses the energy and the time-delay information of the sound simultaneously, which reduces the number of array elements, and it does not need to transmit or receive a reference signal of a special type.

Description

Method and apparatus for localizing a sound source using microphones

Technical field

The present invention relates to a method and apparatus for localizing a sound source using microphones. In particular, it relates to a method and apparatus that localize a sound source using two microphones and are suitable for devices with little internal space.

Background technology

Microphone-based sound source localization uses a microphone array to determine the position of an emitting sound source. A microphone array is a device in which a plurality of microphones are placed at different spatial positions so that together they form a particular receiving system.

Localization techniques based on microphone arrays are widely used today. In video conferencing, for example, sound source localization is used to detect the position of the speaker and to steer the camera automatically. In hearing aids, a microphone array serves as a pre-processing stage for speech enhancement: by detecting the position of the sound source it points the array toward the direction of interest and thereby suppresses interference and noise from other directions. A digital pen based on array localization can replace the mouse as a new wireless input device.

Current microphone-array localization techniques mainly locate the sound source from estimates of the time differences with which the sound arrives at the individual microphones of the array.

Existing sound source localization methods fall broadly into two classes according to how they are implemented: single-step and two-step localization algorithms. Single-step methods are further divided into methods based on a steered beamformer (Steered-Beamformer) and methods based on high-resolution spectral analysis (High-Resolution Spectral Analysis). In the former, the signal collected by each microphone array element is shifted in time. The purpose of this time shift is to compensate for the time difference τ_ij between the arrival of the source signal at different array elements. The compensated signals are then summed and averaged to obtain the array output. In each microphone channel, the signal components corresponding to the target speaker then add in phase, whereas components from sources in other directions add with a certain phase difference. This phase difference attenuates the signals from those directions in the array output, so the direction of maximum array output can be taken as the direction of the sound source.

Methods based on high-resolution spectral analysis mainly involve the autoregressive model (AutoRegressive, AR), minimum variance spectral estimation (Minimum Variance, MV) and various methods based on eigen-analysis techniques, for example the Multiple Signal Classification (MUSIC) method. These methods either cannot handle reverberation and directional noise because of their model assumptions, or are computationally too expensive, and they are seldom used in practical localization systems.

Two-step localization algorithms are currently the most widely used. Among them, methods that localize by the time difference of arrival (Time Delay Of Arrival, TDOA) of the sound signal at the individual microphones of the array have received particular attention. Such a method first estimates the TDOA values for several spatially separated microphone pairs and then uses these estimates, together with the geometry, to determine the source position.

U.S. Patent Application Publication US20040100868 A1 discloses a system and method for identifying and locating an acoustic event. The acoustic positioning system disclosed there needs a sensor network made up of multiple array elements. The absolute propagation time of the signal is determined with synchronized clocks embedded in the sensors, and each sensor must contain a GPS module for synchronization. The accuracy of the estimated position depends on the precision of the synchronized clocks. After the propagation times are obtained, the system localizes by triangulation. The publication is directed mainly at gunshot events and stores only a few specific signal types in its system, so it can identify and locate only a limited variety of sound sources.

WO/2003/044516 discloses an apparatus and method for sound detection and defect location. It gives a new two-step localization method, namely a localization algorithm based on the interaural level difference (Interaural Level Difference, ILD). The method first estimates the energy ratio of the signals received by two microphones. As sound propagates, the attenuation of the signal energy obeys the inverse-square law. When the propagation distances from the source to the individual microphones of the array differ, the received energies therefore differ as well, and this information can be used to determine the source position.

In the positioning equipment of WO/2003/044516, two sensors are fixed on a support that can rotate about a central axis. The signals they receive are amplified and multiplied to produce the output. When the sound source lies exactly on the perpendicular bisector of the support, the signal reaches the two microphones simultaneously, and the strictly time-aligned signals produce the maximum output. This localization method has the following problems. It can determine only the direction of the sound source, not its position. It uses the maximum of the output signal as the indication of the source direction and does not consider the influence of reflections and directional interference noise; once the reflections or the directional interference exceed the main signal component, it can no longer determine the source direction correctly. Finally, because the source direction is found by searching the whole space, the computational complexity is too high, and a moving sound source cannot be tracked.

U.S. Patent Application Publication US20010031053 A1 discloses a binaural signal processing technique. The positioning system disclosed there performs sound source localization by beamforming. It weights each channel signal according to the propagation distance of the signal and then takes the direction of maximum output as the estimated direction. Like the previous method, it can determine only the direction of the sound source and cannot provide an exact position.

U.S. Pat. 6703570 B1, entitled "Digital pen using ultrasonic tracking", discloses a localization technique based on two sensors. It emits an ultrasonic signal and an infrared signal simultaneously and uses the infrared signal as the reference. The disclosure assumes that the propagation time of light is negligible, so the difference between the ultrasonic and infrared propagation times is the absolute ultrasonic propagation time, and the position is then found by triangulation. However, the method needs additional infrared transmitting and receiving devices, which greatly increases the cost of the equipment. In addition, the positioning system is very sensitive to light intensity: when the brightness of a fluorescent lamp exceeds a certain limit, it can no longer be used.

In summary, to determine a point in a two-dimensional plane, prior-art localization methods generally need at least three microphones, or need equipment that transmits and receives a reference signal of a special type. The use of such localization methods is restricted in devices of limited volume.

Summary of the invention

The object of the present invention is to provide a dual-microphone sound source localization method and apparatus that use the time delay and the energy information of the signals received by the microphones simultaneously, so that the number of elements in the microphone array can be reduced to two. The method and apparatus of the invention are very effective for voice communication devices whose volume is restricted.

According to one aspect of the invention, there is provided a method for localizing a sound source position, comprising the steps of: receiving, by a receiving device, the sound signal emitted by the sound source; estimating the time difference with which the sound signal arrives at the receiving device; estimating the energy ratio of the sound signals respectively received by the receiving device; and determining the position of the sound source from the estimated time difference and the energy ratio.

According to another aspect of the invention, there is provided a method for localizing a sound source position using two microphones, comprising the steps of: receiving, by the two microphones respectively, the sound signal emitted by the sound source; estimating the time difference with which the sound signal arrives at the two microphones; estimating the energy ratio of the sound signals respectively received by the two microphones; and determining the position of the sound source from the estimated time difference and the energy ratio.

According to a further aspect of the invention, there is provided a device for localizing a sound source position, comprising: a receiving device for receiving the sound signal emitted by the sound source; a time-difference-of-arrival estimating device for estimating the time difference with which the sound signal arrives at the receiving device; an interaural-level-difference estimating device for estimating the energy ratio of the sound signals respectively received by the receiving device; and a position calculating device for determining the position of the sound source from the estimated time difference and the energy ratio.

According to a further aspect of the invention, there is provided a device for localizing a sound source position using two microphones, comprising: a receiving device for receiving the sound signal emitted by the sound source; a time-difference-of-arrival estimating device for estimating the time difference with which the sound signal arrives at the two microphones; an interaural-level-difference estimating device for estimating the energy ratio of the sound signals respectively received by the two microphones; and a position calculating device for determining the position of the sound source from the estimated time difference and the energy ratio.

The dual-microphone sound source localization method and device according to the invention resemble the binaural localization mechanism of humans: they use the energy and the time-delay information of the sound simultaneously and thereby reduce the number of array elements. Furthermore, the localization method of the invention differs from existing triangulation methods in that it does not need to transmit or receive a reference signal of a special type.

Description of drawings

Fig. 1 is a schematic diagram of an array of two microphones receiving a signal according to the present invention;

Fig. 2 is a schematic diagram showing the distances from the sound source to each microphone;

Fig. 3 is a flowchart of the generalized cross-correlation (GCC) method for time-difference estimation;

Fig. 4 is a schematic diagram showing the difference between the propagation distances of the signal to the two microphones;

Fig. 5 is a schematic diagram of determining the locus of the sound source by hyperbolic positioning;

Fig. 6 is a schematic diagram showing how two circles determine the position of the sound source;

Fig. 7 is a schematic diagram of the case in which the sound arrives at the two microphones with equal energy;

Fig. 8 is a structural diagram of a device that localizes a sound source with an array of two microphones according to an embodiment of the invention;

Fig. 9 is a flowchart of processing the sound signals to obtain the coordinates of the sound source according to an embodiment of the invention;

Fig. 10 is a schematic diagram of the environment in which localization according to the invention is simulated;

Fig. 11 is a schematic diagram of the definition of the sound source direction angle;

Fig. 12 (a) to (c) are schematic diagrams of the simulation results for a sound source located in different directions from the array, with the reflection coefficient varied from 0.1 to 0.9.

Embodiment

Preferred embodiments of the invention are described in detail below with reference to the drawings. Details and functions that are unnecessary for the invention are omitted from the description so that they do not obscure the understanding of the invention.

Fig. 1 is a schematic diagram of an array of two microphones receiving a sound signal according to the present invention, and is described first. In Fig. 1 a person is used as the example of the emitting sound source. It should be understood that the sound may equally be emitted by other sources such as audio equipment. The sound signal emitted by the person is converted into electric signals by microphone 1 and microphone 2. The sound signal arriving at microphone 1 is denoted x₁(t), and the sound signal arriving at microphone 2 is denoted x₂(t). For an array of two microphones, the model of the received source signal is given by formula (1).

x_{i}(t) = \frac{s(t - τ_{i})}{d_{i}} + n_{i}(t); i = 1, 2 - - - (1)

Here s(t) is the source signal and n_i(t) is additive white noise. d_i and τ_i are respectively the distance and the time delay from the sound source to the i-th microphone. Formula (1) takes the inverse-square law of signal propagation into account: the energy of the sound decays in inverse proportion to the square of the propagation distance.

Because the ILD method is mainly concerned with the energy of the signal, the time-delay information can be ignored. Suppose the sound signal is present in the time interval [0, W] and is stationary. The signal energy received by the i-th microphone is then the integral over this interval, so the energy received by each microphone can be expressed as in formula (2).

E_{i} = \int_{0}^{W} x_{i}^{2}(t) dt = \frac{1}{d_{i}^{2}} \int_{0}^{W} s^{2}(t) dt + \int_{0}^{W} n_{i}^{2}(t) dt; i = 1, 2 - - - (2)

From formula (2), the relation between energy and distance follows as formula (3).

E_{1} d_{1}^{2} = E_{2} d_{2}^{2} + η - - - (3)

where

η = \int_{0}^{W} [d_{1}^{2} n_{1}^{2}(t) - d_{2}^{2} n_{2}^{2}(t)] dt

is an error term with zero mean. If (x_i, y_i) denotes the position coordinates of the i-th microphone and (x_s, y_s) the position coordinates of the sound source, the distance from the sound source to the i-th microphone is given by formula (4).

d_{i} = \sqrt{{(x_{i} - x_{s})}^{2} + {(y_{i} - y_{s})}^{2}} - - - (4)

Fig. 2 is a schematic diagram of the distances from the sound source to each microphone.

Substituting formula (4) into formula (3) and neglecting the influence of noise gives

E_{1} [{(x_{1} - x_{s})}^{2} + {(y_{1} - y_{s})}^{2}] = E_{2} [{(x_{2} - x_{s})}^{2} + {(y_{2} - y_{s})}^{2}] - - - (5)
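As a hedged illustration of formulas (2) and (3), the sketch below computes discrete frame energies and the ratio γ = sqrt(E₁/E₂) that the description uses later. The function names, the sinusoidal test signal and the two distances are assumptions made for this example only; the noise term is set to zero so that E₁d₁² = E₂d₂² holds exactly.

```python
import math

def signal_energy(samples):
    """Discrete counterpart of formula (2): sum of squared samples over the frame."""
    return sum(v * v for v in samples)

def energy_ratio(x1, x2):
    """gamma = sqrt(E1 / E2), the quantity used in formula (7)."""
    return math.sqrt(signal_energy(x1) / signal_energy(x2))

# Hypothetical noise-free test: amplitudes fall off as 1/d, as in formula (1).
source = [math.sin(0.3 * n) for n in range(200)]
d1, d2 = 1.0, 2.0                      # assumed source-to-microphone distances in metres
x1 = [v / d1 for v in source]
x2 = [v / d2 for v in source]

gamma = energy_ratio(x1, x2)           # equals d2 / d1 when there is no noise
residual = signal_energy(x1) * d1 ** 2 - signal_energy(x2) * d2 ** 2  # eta in formula (3)
```

With real recordings the residual would not vanish, since η collects the noise energy of both channels.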

On the other hand, a TDOA-based locator determines the position by estimating the time difference of arrival (TDOA) at the two microphones. The influence of path attenuation can be ignored here.

Fig. 3 shows the flowchart of the generalized cross-correlation (GCC) method for time-difference estimation. As shown in Fig. 3, at step S301 the sound signals x₁(t) and x₂(t) received by the microphones are Fourier transformed to obtain X₁(ω) and X₂(ω). At step S302, the cross-power spectrum of the two received signals is calculated. At step S303, the generalized cross-correlation function R₁₂(τ) is calculated. Finally, at step S304, the time difference of arrival is estimated by detecting the maximum of R₁₂(τ).
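The four steps of Fig. 3 can be sketched in a few lines of Python. This is a minimal illustration and not the patented implementation: it assumes the PHAT weighting that the description mentions for the flowchart of Fig. 9, uses a naive O(N²) DFT from the standard library so that it stays self-contained, and models the delays as circular shifts of a hypothetical random test signal. All function names are illustrative.

```python
import cmath
import random

def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(N^2)); adequate for short frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for f in range(n):
        acc = sum(x[k] * cmath.exp(sign * 2j * cmath.pi * f * k / n) for k in range(n))
        out.append(acc / n if inverse else acc)
    return out

def gcc_phat_delay(x1, x2):
    """Estimate tau_12 = tau_1 - tau_2 in samples from two equal-length frames."""
    n = len(x1)
    X1, X2 = dft(x1), dft(x2)                                   # step S301
    cross = [a * b.conjugate() for a, b in zip(X1, X2)]          # step S302
    # PHAT weighting (assumed): keep only the phase of the cross-power spectrum.
    weighted = [g / abs(g) if abs(g) > 1e-12 else 0j for g in cross]
    r12 = [v.real for v in dft(weighted, inverse=True)]          # step S303
    peak = max(range(n), key=lambda k: r12[k])                   # step S304
    return peak if peak <= n // 2 else peak - n                  # map to a signed lag

def delayed(signal, d):
    """Circularly delay a frame by d samples (test helper, not part of the method)."""
    n = len(signal)
    return [signal[(k - d) % n] for k in range(n)]

rng = random.Random(0)
s = [rng.gauss(0.0, 1.0) for _ in range(64)]
x1 = delayed(s, 2)    # microphone 1 hears the source 2 samples late
x2 = delayed(s, 5)    # microphone 2 hears it 5 samples late
tau12_samples = gcc_phat_delay(x1, x2)
```

Dividing the sample lag by the sampling rate gives τ₁₂ in seconds. A production version would use an FFT and sub-sample peak interpolation, neither of which the description specifies.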

Once the time delay of the signal has been estimated, hyperbolic positioning can be applied using formula (6).

\sqrt{{(x_{1} - x_{s})}^{2} + {(y_{1} - y_{s})}^{2}} - \sqrt{{(x_{2} - x_{s})}^{2} + {(y_{2} - y_{s})}^{2}} = c τ_{12} - - - (6)

Here τ₁₂ = τ₁ - τ₂ is the delay difference between the two microphones and c is the speed of sound, approximately 340 m/s at normal temperature. The quantity cτ₁₂ is illustrated in Fig. 4.

Because the locus of points whose distances to two fixed points differ by a constant is a hyperbola, the solution obtained by hyperbolic positioning can be illustrated as in Fig. 5.

Rearranging formula (5) gives formula (7).

\sqrt{{(x_{1} - x_{s})}^{2} + {(y_{1} - y_{s})}^{2}} = \frac{1}{γ} \sqrt{{(x_{2} - x_{s})}^{2} + {(y_{2} - y_{s})}^{2}} - - - (7)

where

γ = \sqrt{E_{1} / E_{2}}

Substituting formula (7) into formula (6) and simplifying gives formula (8).

{(x_{s} - x_{1})}^{2} + {(y_{s} - y_{1})}^{2} = {(\frac{c τ_{12}}{1 - γ})}^{2} - - - (8)

Similarly, formula (9) is obtained.

{(x_{s} - x_{2})}^{2} + {(y_{s} - y_{2})}^{2} = {(\frac{c τ_{12} γ}{1 - γ})}^{2} - - - (9)

The position coordinates of the sound source are obtained by solving the system of equations (8) and (9). The two equations contain the two unknown source coordinates x_s and y_s. Geometrically, formulas (8) and (9) define two circles centered at (x₁, y₁) and (x₂, y₂) with radii cτ₁₂/(1-γ) and cτ₁₂γ/(1-γ) respectively, and their intersection is the position of the sound source, as shown in Fig. 6.

From the positional relationship of the two circles in Fig. 6, the necessary condition for an intersection to exist is given by formula (10):
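To make the two circles concrete, the following sketch evaluates the radii of formulas (8) and (9) for ideal, noise-free measurements. The geometry reuses the planar coordinates of the simulation section of this description (source at (1, 3), microphones at (2, 3.75) and (2, 4)); the helper name and the way τ₁₂ and γ are synthesized from the true distances are assumptions for illustration.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, the value quoted in the description

def circle_radii(tau12, gamma, c=SPEED_OF_SOUND):
    """Radii r1, r2 of the circles in formulas (8) and (9)."""
    if gamma == 1.0:
        raise ValueError("gamma == 1: the circles degenerate and the radii are undefined")
    r1 = c * tau12 / (1.0 - gamma)
    r2 = gamma * r1
    return r1, r2

# Ideal measurements synthesized from a known geometry (illustrative only).
mic1, mic2, src = (2.0, 3.75), (2.0, 4.0), (1.0, 3.0)
d1 = math.hypot(src[0] - mic1[0], src[1] - mic1[1])
d2 = math.hypot(src[0] - mic2[0], src[1] - mic2[1])
tau12 = (d1 - d2) / SPEED_OF_SOUND     # formula (6)
gamma = d2 / d1                        # noise-free form of formula (7)

r1, r2 = circle_radii(tau12, gamma)    # should reproduce d1 and d2
```

In the noise-free case the radii coincide with the true distances d₁ and d₂, which is exactly why the intersection of the circles marks the source.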

(\frac{c τ_{12}}{1 - γ}) | 1 - γ | = | d_{1} - d_{2} | \leq d \leq (d_{1} + d_{2}) = (\frac{c τ_{12}}{1 - γ}) (1 + γ) - - - (10)

Here d is the distance between the two microphones. If τ₁₂ < 0, that is, if the sound reaches microphone 1 earlier than microphone 2, then γ > 1, so τ₁₂ and (1-γ) are either both positive or both negative. If we further assume E₁ ≠ E₂, formula (10) can be written as formula (11).

c | τ_{12} | \leq d \leq c | τ_{12} | \frac{1 + γ}{| 1 - γ |} - - - (11)

Because the difference between the distances from the sound source to the two microphones cannot exceed the distance between the two microphones, and in view of the triangle inequality, we have 0 < c|τ₁₂| ≤ d and

d \leq d_{1} + d_{2} = c | τ_{12} | \frac{1 + γ}{| 1 - γ |}

Inequality (11) is therefore satisfied automatically. However, if E₁ = E₂, the circle and the hyperbola determined by equations (5) and (6) both degenerate into a straight line, namely the perpendicular bisector of the microphone pair. In this case the position of the sound source cannot be determined and only its direction can be detected, as shown in Fig. 7.
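With noisy estimates, condition (11) is no longer guaranteed, so a practical system might test it before attempting to intersect the circles. The check below is a sketch under that assumption; the function name and the decision to treat γ = 1 as "no unique position" are illustrative choices, not part of the description.

```python
import math

def intersection_exists(tau12, gamma, mic_distance, c=340.0):
    """Check condition (11): c|tau12| <= d <= c|tau12| (1 + gamma) / |1 - gamma|."""
    if gamma <= 0.0 or gamma == 1.0:
        return False                    # degenerate case: only a direction is available
    if tau12 * (1.0 - gamma) <= 0.0:
        return False                    # tau12 and (1 - gamma) must have the same sign
    lower = c * abs(tau12)
    upper = lower * (1.0 + gamma) / abs(1.0 - gamma)
    return lower <= mic_distance <= upper

# Consistent measurements from an assumed geometry: d1 = 1.25 m, d2 = sqrt(2) m, d = 0.25 m.
ok = intersection_exists((1.25 - math.sqrt(2.0)) / 340.0, math.sqrt(2.0) / 1.25, 0.25)
# A delay that implies a range difference larger than the microphone spacing.
bad = intersection_exists(0.001, 0.5, 0.25)
```

Estimates that fail the check could be discarded or re-estimated on the next frame; the description leaves this handling open.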

Next, the closed-form solution is derived. The equations (8) and (9) can be written as the system (12).

{(x_{s} - x_{1})}^{2} + {(y_{s} - y_{1})}^{2} = {(c τ_{12} \cdot \frac{1}{1 - γ})}^{2} = r_{1}^{2}

{(x_{s} - x_{2})}^{2} + {(y_{s} - y_{2})}^{2} = {(c τ_{12} \cdot \frac{γ}{1 - γ})}^{2} = r_{2}^{2} - - - (12)

Expanding the squares and rearranging gives formula (13).

x_{i} x_{s} + y_{i} y_{s} = \frac{1}{2} (K_{i} - r_{i}^{2} + R_{s}^{2}); i = 1, 2 - - - (13)

where

K_{i} = x_{i}^{2} + y_{i}^{2} (i = 1, 2)

and R_s is defined by formula (14).

R_{s} = \sqrt{x_{s}^{2} + y_{s}^{2}} - - - (14)

Writing equation (13) in matrix form gives formula (15).

[\begin{matrix} x_{1} & y_{1} \\ x_{2} & y_{2} \end{matrix}] \times [\begin{matrix} x_{s} \\ y_{s} \end{matrix}] = \frac{1}{2} \left( R_{s}^{2} [\begin{matrix} 1 \\ 1 \end{matrix}] + [\begin{matrix} K_{1} - r_{1}^{2} \\ K_{2} - r_{2}^{2} \end{matrix}] \right) - - - (15)

Multiplying by the inverse matrix gives formula (16).

[\begin{matrix} x_{s} \\ y_{s} \end{matrix}] = {[\begin{matrix} x_{1} & y_{1} \\ x_{2} & y_{2} \end{matrix}]}^{- 1} \times \frac{1}{2} \left( R_{s}^{2} [\begin{matrix} 1 \\ 1 \end{matrix}] + [\begin{matrix} K_{1} - r_{1}^{2} \\ K_{2} - r_{2}^{2} \end{matrix}] \right) - - - (16)

To simplify, define formulas (17) and (18).

a = [\begin{matrix} a_{1} \\ a_{2} \end{matrix}] = \frac{1}{2} {[\begin{matrix} x_{1} & y_{1} \\ x_{2} & y_{2} \end{matrix}]}^{- 1} [\begin{matrix} 1 \\ 1 \end{matrix}] - - - (17)

and

b = [\begin{matrix} b_{1} \\ b_{2} \end{matrix}] = \frac{1}{2} {[\begin{matrix} x_{1} & y_{1} \\ x_{2} & y_{2} \end{matrix}]}^{- 1} [\begin{matrix} K_{1} - r_{1}^{2} \\ K_{2} - r_{2}^{2} \end{matrix}] - - - (18)

The coordinates of the sound source can then be expressed in terms of R_s, as shown in formula (19).

x = [\begin{matrix} x_{s} \\ y_{s} \end{matrix}] = [\begin{matrix} b_{1} + a_{1} R_{s}^{2} \\ b_{2} + a_{2} R_{s}^{2} \end{matrix}] - - - (19)

Substituting formula (19) into formula (14) gives an equation in R_s, as shown in formula (20).

(a_{1}^{2} + a_{2}^{2}) R_{s}^{4} + [2 (a_{1} b_{1} + a_{2} b_{2}) - 1] R_{s}^{2} + b_{1}^{2} + b_{2}^{2} = 0 - - - (20)

Solving this equation gives formula (21),

R_{s}^{2} = \frac{[1 - 2 (a_{1} b_{1} + a_{2} b_{2})] \pm \sqrt{{[1 - 2 (a_{1} b_{1} + a_{2} b_{2})]}^{2} - 4 (a_{1}^{2} + a_{2}^{2}) (b_{1}^{2} + b_{2}^{2})}}{2 (a_{1}^{2} + a_{2}^{2})} - - - (21)

A positive root gives the square of the distance from the sound source to the origin. Substituting the root back into formula (19) yields the position coordinates of the sound source. Which root is taken as the final result is decided according to the actual application environment.
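The closed-form procedure of formulas (12) to (19) can be sketched as follows. This is an illustrative implementation, not the patent's own code: the function name is hypothetical, the 2×2 inverse is written out by hand, and the quadratic in R_s² is obtained by substituting (19) directly into (14), which gives the coefficient 2(a₁b₁ + a₂b₂) - 1 for the R_s² term. The microphone matrix is invertible only if the two microphones and the coordinate origin are not collinear, which is an assumption of this sketch.

```python
import math

def locate(mic1, mic2, tau12, gamma, c=340.0):
    """Return the candidate source positions from tau12 (seconds) and gamma."""
    (x1, y1), (x2, y2) = mic1, mic2
    r1 = c * tau12 / (1.0 - gamma)            # radius of circle (8)
    r2 = gamma * r1                           # radius of circle (9)
    det = x1 * y2 - x2 * y1
    if abs(det) < 1e-12:
        raise ValueError("microphones and origin are collinear; shift the origin")
    # Inverse of [[x1, y1], [x2, y2]].
    i11, i12, i21, i22 = y2 / det, -y1 / det, -x2 / det, x1 / det
    k1 = x1 * x1 + y1 * y1 - r1 * r1          # K1 - r1^2
    k2 = x2 * x2 + y2 * y2 - r2 * r2          # K2 - r2^2
    a1, a2 = 0.5 * (i11 + i12), 0.5 * (i21 + i22)                      # formula (17)
    b1, b2 = 0.5 * (i11 * k1 + i12 * k2), 0.5 * (i21 * k1 + i22 * k2)  # formula (18)
    # Quadratic qa*u^2 + qb*u + qc = 0 in u = Rs^2.
    qa = a1 * a1 + a2 * a2
    qb = 2.0 * (a1 * b1 + a2 * b2) - 1.0
    qc = b1 * b1 + b2 * b2
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0.0:
        return []                             # circles do not intersect
    roots = [(-qb + sign * math.sqrt(disc)) / (2.0 * qa) for sign in (1.0, -1.0)]
    return [(b1 + a1 * u, b2 + a2 * u) for u in roots if u >= 0.0]     # formula (19)

# Ideal measurements for a source at (1, 3) and microphones at (2, 3.75), (2, 4).
d1, d2 = 1.25, math.sqrt(2.0)
candidates = locate((2.0, 3.75), (2.0, 4.0), (d1 - d2) / 340.0, d2 / d1)
```

For this geometry the two candidates are the true source and its mirror image across the array axis, which is the ambiguity the description resolves with prior knowledge of the region of interest.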

The structure of a device that localizes a sound source with two microphones according to an embodiment of the invention is described below with reference to Fig. 8. As an example, the device may be arranged inside the microphone, or it may be arranged outside the microphone and connected to it so that the direction of the microphone is controlled through a corresponding control device. The sound source localization device according to the invention comprises a receiving unit 81, a time difference of arrival (TDOA) estimation unit 82, an interaural level difference (ILD) estimation unit 83 and a position calculation unit 84. The receiving unit 81 receives the sound signals x₁(t) and x₂(t) emitted by the sound source and supplies them to the TDOA estimation unit 82 and the ILD estimation unit 83. The TDOA estimation unit 82 estimates the time difference τ₁₂ with which the sound signal arrives at the two microphones in the manner described above. The ILD estimation unit 83 estimates, in the manner described above, the energy ratio of the signals arriving at the two microphones,

γ = \sqrt{E_{1} / E_{2}}

The results obtained by the TDOA estimation unit 82 and the ILD estimation unit 83 are then supplied to the position calculation unit 84. The position calculation unit 84 calculates the position coordinates of the sound source in the manner described above and thereby obtains the position of the sound source.

Fig. 9 is a flowchart of processing the sound signals to obtain the coordinates of the sound source according to an embodiment of the invention. In the branch that computes the time difference, at step S91 the signals s₁(t) and s₂(t) from the sound source are each Fourier transformed. In the branch that computes the energy, at step S96 the inverse-square law is applied to the signals s₁(t) and s₂(t) respectively. After step S91, at step S92, the Fourier-transformed signals are multiplied in a multiplier. At step S93, the multiplier output is processed with a GCC method, for example with phase transform (PHAT) weighting, to obtain the corresponding TDOA estimate. At step S94, the result is inverse Fourier transformed. Next, at step S95, the maximum of the inverse-Fourier-transformed signal is detected, which gives the time difference τ₁₂ of arrival at the two microphones.

In the other branch, after the inverse-square-law calculation of step S96, the energy of the sound received by each microphone is calculated at step S97 (as in formula (2) above). Then, at step S98, formula (7) above is used to calculate

γ = \sqrt{E_{1} / E_{2}}

which gives the energy ratio. Finally, at step S99, the results of steps S95 and S98 are used to set up the equations and calculate the coordinates of the sound source.

The sound source positions obtained by the method of the invention were evaluated by simulation. Feasibility was first verified in a reflection-free environment. The sound source used is a female voice signal sampled at 8 kHz. The simulated environment is a rectangular room (4 m × 6 m × 3 m); reflections are assumed to be specular, and all walls are assumed to have the same, frequency-independent reflection coefficient. Fig. 10 shows a schematic diagram of the simulated environment. The coordinate system takes one corner of the room as the origin and the three edges meeting at that corner as the axes. As shown in Fig. 10, the initial position of the sound source is (1, 3, 1) and the positions of the two microphones are (2, 3.75, 1) and (2, 4, 1) (all coordinates in m). For the localization algorithm described in the previous section, with TDOA estimation by the GCC method with phase transform (Phase Transform, PHAT) weighting, the simulation result is shown in Fig. 10: the hyperbola corresponds to the TDOA estimate and the circle is the locus given by the ILD method. The actual positions of the sound source ("*") and of the microphones ("°") are also marked in the figure. For a linear array, the localization result always contains two points that are mirror images of each other with respect to the array (the two intersections of the circle and the hyperbola in Fig. 10); this is an inherent shortcoming of linear arrays. To remove this mirror ambiguity, the localization region can be limited to the range of interest, or the superfluous point can be excluded by means of prior information. For example, when the array is fixed to a wall, only points in the region in front of the array need to be located, and the other side is of no concern (it lies outside the wall, where no sound source is present). In addition, the intersection point in Fig. 10 coincides well with the actual sound source position, which demonstrates the validity of the proposed method.
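The mirror ambiguity can be resolved with a simple region test, sketched below. The function name, the candidate list and the choice of region are assumptions for illustration: the microphones of the simulation lie on the line x = 2, and the sketch supposes that only the side x ≤ 2 is of interest.

```python
def select_in_region(candidates, x_range, y_range):
    """Keep only candidate positions inside the rectangular region of interest."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    return [(x, y) for x, y in candidates
            if x_min <= x <= x_max and y_min <= y <= y_max]

# The true source and its mirror image across the array line x = 2 (illustrative values).
candidates = [(3.0, 3.0), (1.0, 3.0)]
kept = select_in_region(candidates, x_range=(0.0, 2.0), y_range=(0.0, 6.0))
```

If both candidates fall inside the region, the ambiguity remains and additional prior information is needed.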

Simulation tests under various reflection conditions are given below. The room impulse response is generated by the image method proposed by Allen and Berkley, and the reverberation time is estimated with the Sabine formula as follows

R T_{60} = 55 \frac{V}{c \times Se} - - - (22)

Se = effective absorbing area = α₁S₁ + α₂S₂ + α₃S₃ + … (23)

where V (m³) is the volume of the room and Se (m²) is the effective sound absorption area. For a known reflection coefficient β_i, the absorption coefficient α_i is obtained from the following relation

α_{i} = 1 - β_{i}^{2} - - - (24)

In this experiment the two microphones are placed at (2, 3.2, 1) and (2, 2.8, 1). Because reverberation affects the localization result differently for sources at different positions, a randomly generated Gaussian source is placed 1 m from the center of the array along the 10°, 45° and 80° directions, where the angle is defined as the angle between the line joining the source to the array center and the array normal, as shown in Fig. 11. The reflection coefficient is varied from 0.1 (RT₆₀ = 109 ms) to 0.9 (RT₆₀ = 568 ms), and the results obtained are shown in Fig. 12.
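The two reverberation times quoted above can be reproduced from formulas (22) to (24) for the 4 m × 6 m × 3 m room, assuming the same reflection coefficient on all six surfaces and c = 340 m/s. The function name is illustrative.

```python
def rt60_sabine(dims, beta, c=340.0):
    """Reverberation time in seconds from formulas (22)-(24) for a rectangular room."""
    lx, ly, lz = dims
    volume = lx * ly * lz                              # V in m^3
    surface = 2.0 * (lx * ly + lx * lz + ly * lz)      # total wall, floor and ceiling area
    alpha = 1.0 - beta ** 2                            # formula (24)
    effective_area = alpha * surface                   # formula (23) with equal alpha_i
    return 55.0 * volume / (c * effective_area)        # formula (22)

room = (4.0, 6.0, 3.0)
rt60_low = rt60_sabine(room, 0.1)     # weakly reflecting walls
rt60_high = rt60_sabine(room, 0.9)    # strongly reflecting walls
```

Rounded to the nearest millisecond, the two values agree with the 109 ms and 568 ms stated in the text.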

As can be seen from Fig. 12, under low-reverberation conditions the method of the invention obtains a very accurate sound source position. Once the reverberation increases beyond a certain level, however, the localization performance begins to deteriorate. Compared with TDOA-based hyperbolic localization, the energy-based ILD method is more easily affected by reverberation. When the direction of the sound source approaches 0° (see Fig. 12), the signal energies received by the two microphones differ only slightly; as the reverberation increases, the information obtained from the energy is no longer reliable, and a small error can change the energy ratio γ substantially.
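The loss of energy information near 0° can be seen from the ideal, noise-free energy ratio alone. The sketch below evaluates γ = d₂/d₁ for the three test directions with the microphones at (2, 3.2) and (2, 2.8). The placement of the source on the x < 2 side of the array and the sign convention of the angle are assumptions, since the description defines only the magnitude of the angle; the function name is illustrative.

```python
import math

def ideal_gamma(angle_deg, mic1, mic2, radius=1.0):
    """Noise-free gamma = sqrt(E1/E2) = d2/d1 for a source at the given angle."""
    cx = (mic1[0] + mic2[0]) / 2.0
    cy = (mic1[1] + mic2[1]) / 2.0
    theta = math.radians(angle_deg)
    # ASSUMPTION: the array lies along y, its normal points toward -x,
    # and positive angles tilt the source toward microphone 1.
    sx = cx - radius * math.cos(theta)
    sy = cy + radius * math.sin(theta)
    d1 = math.hypot(sx - mic1[0], sy - mic1[1])
    d2 = math.hypot(sx - mic2[0], sy - mic2[1])
    return d2 / d1

gammas = {a: ideal_gamma(a, (2.0, 3.2), (2.0, 2.8)) for a in (10, 45, 80)}
```

At 10° the ratio stays within about 10 % of 1, so the denominator 1 - γ in formulas (8) and (9) is small and a modest error in the measured energies shifts the circle radii considerably.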

More special situation, when sound source is positioned on two Mikes' the perpendicular bisector, E ₁=E ₁And τ ₁₂=0, hyperbolic curve and circle all deteriorate to straight line, and described algorithm can not be oriented the particular location of sound source this moment.In this case, little angle of linear array deflection can be avoided this specific position, utilize diamylose gram location algorithm to position again.

Be noted that the present invention is that example is described the method for utilizing two sound receiving elements that sound source is positioned with the microphone.Be appreciated that scope of the present invention is not limited to microphone as the sound receiving element, its basic thought can be used other sound receiver.

The present invention obtains the position of the sound source from the intersection of two curves, which can be computed by solving a closed-form expression. The method requires less computation than spatial-search and iterative methods and is therefore faster, so it can be used to track moving sources. It uses the relative time difference of the signal's arrival at the two microphones as the localization parameter, so no synchronized clock is needed and no synchronization error arises.
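A minimal sketch of such a closed-form solution in two dimensions is given below. It is not the exact expression of the present description but an illustration under stated assumptions: the microphones sit at (-a, 0) and (a, 0), the time difference is (d1 - d2)/c with c = 343 m/s, the energy ratio is E1/E2 = (d2/d1)² (inverse-square decay, no reverberation), and the source is taken to lie in the half-plane y ≥ 0. The function name `locate_source` is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def locate_source(tdoa, energy_ratio, half_spacing):
    """Closed-form 2-D source position from one time difference and one
    energy ratio.

    Assumed conventions: microphones at (-a, 0) and (a, 0) with
    a = half_spacing; tdoa = (d1 - d2) / c; energy_ratio = E1 / E2
    = (d2 / d1)**2; the source lies in the half-plane y >= 0.
    """
    k = math.sqrt(energy_ratio)  # k = d2 / d1
    if math.isclose(k, 1.0, abs_tol=1e-9):
        # Source on the perpendicular bisector: the curves degenerate.
        raise ValueError("degenerate geometry: energy ratio is 1")
    delta = SPEED_OF_SOUND * tdoa  # delta = d1 - d2
    d1 = delta / (1.0 - k)         # from d1 - k * d1 = delta
    d2 = k * d1
    if d1 <= 0.0:
        raise ValueError("time difference and energy ratio are inconsistent")
    # Subtracting the two circle equations gives d1^2 - d2^2 = 4 * a * x.
    x = (d1 ** 2 - d2 ** 2) / (4.0 * half_spacing)
    y_squared = d1 ** 2 - (x + half_spacing) ** 2
    if y_squared < -1e-9:
        raise ValueError("no real intersection for these measurements")
    return x, math.sqrt(max(y_squared, 0.0))


# Usage: build ideal measurements for a known source and recover it.
a = 0.2
true_x, true_y = 0.6, 0.8
d1 = math.hypot(true_x + a, true_y)
d2 = math.hypot(true_x - a, true_y)
print(locate_source((d1 - d2) / SPEED_OF_SOUND, (d2 / d1) ** 2, a))
```

Because the solution involves only a few arithmetic operations and one square root, no search grid or iteration is needed, which is consistent with the speed advantage described above. The explicit check for an energy ratio of 1 corresponds to the perpendicular-bisector case, where the position cannot be resolved.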

Algorithm of the present invention has been obtained good result under low conditioned reflex, although the positional precision under high reverberation situation is not very satisfactory, it still can obtain DOA estimation accurately; On the other hand, general applied environment can both satisfy our accurate positioning requirements.

In addition, compared with other localization systems that also consist of two microphones, the algorithm of the present invention does not need to transmit a specific type of reference signal, such as infrared.

Although the present invention has been described with a certain degree of detail according to its preferred embodiments, the content disclosed in the preferred embodiments may be changed in its structural details, and the combination and order of the components may be changed in any way, without departing from the scope and spirit set forth in the claims of the present invention.

Claims

1. A method for localizing the position of a sound source, comprising the steps of:

receiving, by receiving devices, a sound signal emitted by the sound source;

estimating the time difference with which said sound signal arrives at said receiving devices;

estimating the energy ratio of said sound signal as respectively received by said receiving devices;

determining the position of said sound source according to the estimated time difference and said energy ratio.

2. The method according to claim 1, wherein the number of said receiving devices used to localize said sound source is 2.

3. method according to claim 1 and 2, the step of wherein estimating the described mistiming comprises utilizes the broad sense cross correlation function to calculate the step of described mistiming.

4. The method according to claim 3, further comprising detecting the maximum of the result of the generalized cross-correlation function to estimate the time difference with which said sound signal arrives at said receiving devices.

5. method according to claim 1 and 2 further comprises the step of the described voice signal that receives being carried out Fourier transform.

6. A method for localizing the position of a sound source using two microphones, comprising the steps of:

receiving, by the two microphones respectively, a sound signal emitted by the sound source;

estimating the time difference with which said sound signal arrives at said two microphones;

estimating the energy ratio of said sound signal as respectively received by said two microphones;

7. The method according to claim 6, wherein the step of estimating said time difference comprises a step of calculating said time difference using a generalized cross-correlation function.

8. The method according to claim 7, further comprising detecting the maximum of the result of the generalized cross-correlation function to estimate the time difference with which said sound signal arrives at said microphones.

9. The method according to claim 6, further comprising a step of performing a Fourier transform on the received sound signal.

10. A device for localizing the position of a sound source, comprising:

receiving devices for receiving a sound signal emitted by the sound source;

a time-difference-of-arrival estimating device for estimating the time difference with which said sound signal arrives at said receiving devices;

an interaural level difference estimating device for estimating the energy ratio of said sound signal as respectively received by said receiving devices; and

a localization calculating device for determining the position of said sound source according to the estimated time difference and said energy ratio.

11. The device according to claim 10, wherein the number of said receiving devices is 2.

12. The device according to claim 10 or 11, wherein said receiving devices are microphones.

13. A device for localizing the position of a sound source using two microphones, comprising:

receiving devices for receiving a sound signal emitted by the sound source;

a time-difference-of-arrival estimating device for estimating the time difference with which said sound signal arrives at said two microphones;

an interaural level difference estimating device for estimating the energy ratio of said sound signal as respectively received by said two microphones; and