CN111239691B - Multi-sound-source tracking method for restraining main sound source - Google Patents

Multi-sound-source tracking method for restraining main sound source

Info

Publication number: CN111239691B
Application number: CN202010184264.2A
Authority: CN (China)
Legal status: Active (granted; the legal status is an assumption, not a legal conclusion)
Original language: Chinese (zh)
Other versions: CN111239691A
Inventors: 蔡卫平, 黄印君, 刘瑞娟
Current and original assignee: Jiujiang Vocational and Technical College

Classifications

    • G — PHYSICS
    • G01 — MEASURING; TESTING
    • G01S — RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00 — Position-fixing by co-ordinating two or more direction or position line determinations; position-fixing by co-ordinating two or more distance determinations
    • G01S5/18 — Position-fixing by co-ordinating two or more direction or position line determinations using ultrasonic, sonic, or infrasonic waves
    • G01S5/22 — Position of source determined by co-ordinating a plurality of position lines defined by path-difference measurements

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

A multi-sound-source tracking method that suppresses the primary sound source comprises: framing and windowing the speech signals received by a microphone array; generating an initial particle group from the initial state of each sound source; predicting new particle states from the state equation and computing an observation value for each particle state; identifying the primary source from the localization-function value at each sound source's position estimate at the previous time step; computing the distance between each particle of the weaker sound source and the primary source and constructing an attenuation coefficient from that distance; multiplying the state observations of weaker-source particles near the primary source by the attenuation coefficient to reduce their values; constructing a pseudo-likelihood function for each sound source from the localization function and the attenuation coefficient, and computing each sound source's particle weights at the current time step from the pseudo-likelihood function; and normalizing the particle weights and estimating each sound source's current position from the particle weights and particle states. The invention keeps tracking two sound sources even when they are close to each other or their trajectories cross, and can be widely applied in fields such as robot audition and audio surveillance.

Description

Multi-sound-source tracking method for restraining main sound source
Technical Field
The invention relates to the technical field of microphone-array-based multi-sound-source tracking, and in particular to a multi-sound-source tracking method that suppresses the primary sound source.
Background
Microphone-array-based speech source localization and tracking is widely used in digital hearing aids, robot audition, intelligent surveillance, and related fields. Early localization and tracking techniques targeted a single sound source, for applications such as video conferencing and vehicle-mounted hands-free voice communication, and after years of research single-source algorithms can achieve high accuracy. In recent years, as the application domain of sound source tracking has expanded, localization and tracking of multiple sound sources must be considered in many scenarios.
Current microphone-array sound source tracking algorithms fall mainly into two classes: one represented by Kalman filtering and its refinements, the other by particle filtering. The former can only be used when several assumptions are satisfied; the latter has a wider range of application and higher accuracy, and now dominates tracking research. Single-source tracking in indoor environments has been studied extensively, while multi-source tracking has received less attention. Tracking multiple sound sources in a reverberant environment is much harder than the single-source case: besides the effects of reverberation and noise on tracking accuracy, interference between the sources seriously degrades the algorithm, and when the source trajectories come close or cross, conventional tracking algorithms have difficulty distinguishing them and lose targets.
During particle filtering, each particle group moves with its target trajectory. When two targets come close, some particles of the weaker sound source fall into the region of the stronger source's spatial-spectrum peak and are given excessive weight, so the position estimate of the weaker source is biased toward the stronger source. As the iteration proceeds, the particle distribution of the weaker source becomes increasingly similar to that of the stronger source, and the estimated trajectory of the former nearly coincides with that of the latter. Even when the distance between the sources increases again, a conventional tracking algorithm can no longer track the weaker source, and the target is lost.
Disclosure of Invention
To overcome these shortcomings of the prior art, the invention provides a particle-filter-based tracking algorithm for multiple speech sources. The algorithm keeps track of two sound sources even when their trajectories cross or come close.
To this end, the invention provides a multi-sound-source tracking method that suppresses the primary sound source; the stronger of the sources is called the primary source. During tracking, the weights of weaker-source particles close to the primary source are multiplied by a suitable attenuation coefficient so that they do not become too large; this reduces the position-estimation error and prevents the estimated trajectories from merging. The method comprises the following steps:
s1, establishing a two-dimensional rectangular coordinate system, determining the coordinates of each array element in the microphone array, framing and windowing sound source signals received by the microphone array, and storing the sound source signals into a buffer area;
s2, generating an initial particle group of each sound source according to the initial state of each sound source:
In the rectangular coordinate system, the state vector of the $i$-th sound source at frame $t$ is $\mathbf{s}_t^{(i)} = [x_t^{(i)}, y_t^{(i)}, \dot{x}_t^{(i)}, \dot{y}_t^{(i)}]^T$, whose first two elements are the sound source coordinates and whose last two are its velocity. For a state vector $\mathbf{s}_t^{(i)}$, the corresponding source coordinate vector is denoted $\mathbf{r}_t^{(i)} = [x_t^{(i)}, y_t^{(i)}]^T$. With $N$ the number of particles and $N_s$ the number of sound sources, the initial particle group is

$$\{\mathbf{s}_0^{(i,j)}, w_0^{(i,j)}\}_{j=1}^{N}, \quad i = 1{:}N_s,$$

where $\mathbf{s}_0^{(i,j)} = \mathbf{s}_0^{(i)}$ is the initial state of the sound source and $w_0^{(i,j)} = 1/N$ is the initial particle weight.
S3, predicting the new particle state of each sound source according to the state equation:
The state equation can be written $\mathbf{s}_t^{(i,j)} = f(\mathbf{s}_{t-1}^{(i,j)}, \mathbf{u}_t)$. Specifically, the Langevin equation can be used as the state equation of the sound source:

$$\dot{\mathbf{r}}_t^{(i,j)} = a\,\dot{\mathbf{r}}_{t-1}^{(i,j)} + b\,\mathbf{u}_t, \qquad \mathbf{r}_t^{(i,j)} = \mathbf{r}_{t-1}^{(i,j)} + T\,\dot{\mathbf{r}}_t^{(i,j)},$$

where $\mathbf{u}_t$ is a two-dimensional Gaussian random vector with mean $[0, 0]^T$ and second-order identity covariance matrix, $T$ is the duration of one signal frame (i.e. the state-update interval), $a = e^{-\beta T}$, and $b = \bar{v}\sqrt{1 - a^2}$, with $\beta$ and $\bar{v}$ constants set according to the motion state of the sound source.
S4, calculating the observation value of each sound source particle state at the current time by using the positioning function according to the received signal frame:
Denote the received signal frame of the microphone array $\mathbf{x}_t = [x_1(n), x_2(n), \ldots, x_M(n)]^T$, where $M$ is the number of array elements. The steered response power with phase transform (SRP-PHAT) function is used as the localization function:

$$P_t(\mathbf{r}) = \sum_{l=1}^{M-1} \sum_{m=l+1}^{M} R_{lm}\big(\tau_{lm}(\mathbf{r})\big),$$

where $\mathbf{r}$ is a hypothetical source coordinate, $l$ and $m$ are array-element indices, and $R_{lm}(\tau)$ is the PHAT-weighted generalized cross-correlation function of an element pair, defined as

$$R_{lm}(\tau) = \frac{1}{K} \sum_{k=0}^{K-1} \frac{X_l(k) X_m^*(k)}{|X_l(k) X_m^*(k)|}\, e^{j\omega\tau},$$

where $X_m(k)$ is the $K$-point discrete Fourier transform of the $m$-th element's received signal $x_m(n)$, $(\cdot)^*$ denotes complex conjugation, and $\omega$ is the analog angular frequency. $\tau_{lm}(\mathbf{r})$ is the time difference of arrival (TDOA) of the hypothetical source between an element pair,

$$\tau_{lm}(\mathbf{r}) = \frac{\|\mathbf{r} - \mathbf{r}_l\| - \|\mathbf{r} - \mathbf{r}_m\|}{c},$$

where $\mathbf{r}_m$ is the coordinate of the $m$-th element, $c$ is the speed of sound (342 m/s), and $\|\cdot\|$ denotes the vector 2-norm. The particle-state observation computed from the localization function is

$$z_t^{(i,j)} = P_t\big(\mathbf{r}_t^{(i,j)}\big).$$
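The SRP-PHAT evaluation of S4 can be sketched as follows. This is a simplified one-sided-FFT sketch that scores a set of candidate points; all names are illustrative, and the usage below checks only that a zero-TDOA candidate outscores an off-axis one for identical signals on two microphones.

```python
import numpy as np

def srp_phat(frames, mic_xy, points, fs=8000, c=342.0):
    """SRP-PHAT localization function of S4, evaluated at candidate points.
    frames: (M, L) windowed time-domain frame per microphone.
    mic_xy: (M, 2) element coordinates; points: (P, 2) hypothetical sources."""
    M, L = frames.shape
    X = np.fft.rfft(frames, axis=1)               # one-sided spectra
    freqs = np.fft.rfftfreq(L, d=1.0 / fs)        # bin frequencies in Hz
    scores = np.zeros(len(points))
    for l in range(M - 1):
        for m in range(l + 1, M):
            cross = X[l] * np.conj(X[m])
            phat = cross / np.maximum(np.abs(cross), 1e-12)  # PHAT weighting
            d_l = np.linalg.norm(points - mic_xy[l], axis=1)
            d_m = np.linalg.norm(points - mic_xy[m], axis=1)
            tau = (d_l - d_m) / c                 # TDOA of each candidate point
            steer = np.exp(2j * np.pi * freqs[None, :] * tau[:, None])
            scores += (steer @ phat).real / L     # accumulate R_lm(tau_lm(r))
    return scores

rng = np.random.default_rng(1)
sig = rng.standard_normal(512)
frames = np.stack([sig, sig])                     # identical signals: true TDOA 0
mic_xy = np.array([[0.0, 0.0], [1.0, 0.0]])
pts = np.array([[0.5, 1.0], [0.0, 0.0]])          # equidistant point vs. off-axis point
scores = srp_phat(frames, mic_xy, pts)
```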
S5, constructing an attenuation coefficient from the distance between each particle of the weaker sound source and the primary source, and multiplying it by the particle-state observation to obtain the attenuated observation $\tilde{z}_t^{(i,j)}$;
S6, constructing a pseudo-likelihood function of each sound source according to the positioning function and the attenuation coefficient:
The pseudo-likelihood function is denoted $\tilde{p}(\mathbf{x}_t \mid \mathbf{s}_t^{(i,j)})$ and given by

$$\tilde{p}\big(\mathbf{x}_t \mid \mathbf{s}_t^{(i,j)}\big) = \max\big(\tilde{z}_t^{(i,j)},\ 0\big)^r,$$

where $\max(\cdot)$ ensures that the likelihood function is non-negative, and $r$ is a positive real number that adjusts the shape of the likelihood function to improve the performance of the tracking algorithm;
s7, calculating the weight of each sound source particle at the current moment according to the pseudo-likelihood function:
$$w_t^{(i,j)} = w_{t-1}^{(i,j)}\,\tilde{p}\big(\mathbf{x}_t \mid \mathbf{s}_t^{(i,j)}\big)$$
s8, normalizing the particle weight of each sound source:
$$\bar{w}_t^{(i,j)} = \frac{w_t^{(i,j)}}{\sum_{j=1}^{N} w_t^{(i,j)}}$$
s9, estimating the position of each sound source at the current moment according to the weight of the particles and the state of the particles;
$$\hat{\mathbf{r}}_t^{(i)} = \sum_{j=1}^{N} \bar{w}_t^{(i,j)}\, \mathbf{r}_t^{(i,j)}$$
S10, resampling from the existing particle group $\{\mathbf{s}_t^{(i,j)}, \bar{w}_t^{(i,j)}\}_{j=1}^{N}$ according to the weights to obtain the resampled particle group $\{\tilde{\mathbf{s}}_t^{(i,j)}, 1/N\}_{j=1}^{N}$;
S11, the resampled particles and their weights are stored, and the process returns to step S3.
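Steps S8 to S10 for one source can be sketched as follows. The patent does not name a resampling scheme, so systematic resampling is used here purely as an assumed, common choice.

```python
import numpy as np

def estimate_and_resample(particles, weights, rng=None):
    """S8-S10 for one source: normalize the weights, form the weighted
    position estimate, then resample to uniform weights 1/N.  Systematic
    resampling is an assumption; the patent names no scheme."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                   # S8: normalization
    r_hat = w @ particles[:, :2]                      # S9: position estimate
    N = len(w)
    positions = (rng.random() + np.arange(N)) / N
    idx = np.searchsorted(np.cumsum(w), positions)    # S10: systematic resampling
    return r_hat, particles[idx], np.full(N, 1.0 / N)

parts = np.arange(20, dtype=float).reshape(5, 4)
w = np.array([0.0, 1.0, 0.0, 0.0, 0.0])
r_hat, new_parts, new_w = estimate_and_resample(parts, w, rng=np.random.default_rng(0))
```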
Further, attenuating the state observations of the weaker sound source's particles in step S5 comprises the following steps:
S5.1, computing the localization-function value at each sound source's previous position estimate, from the previous received signal frame:

$$P_{t-1}\big(\hat{\mathbf{r}}_{t-1}^{(i)}\big), \quad i = 1{:}N_s,$$

where $\hat{\mathbf{r}}_{t-1}^{(i)}$ is the position estimate of the $i$-th sound source at time $t-1$;
S5.2, designating the source with the larger localization-function value as the primary source, i.e. the primary-source index at time $t$ is

$$h_t = \arg\max_i P_{t-1}\big(\hat{\mathbf{r}}_{t-1}^{(i)}\big).$$

Once $h_t$ is known, the primary source is easily associated with its particles: the particle group of the $i$-th source, $\{\mathbf{s}_t^{(i,j)}, w_t^{(i,j)}\}_{j=1}^{N}$, belongs to the primary source if $i = h_t$, and otherwise does not;
S5.3, computing the distance between each particle of the weaker sound source and the primary source's previous position estimate:

$$d_t^{(i,j)} = \big\|\mathbf{r}_t^{(i,j)} - \hat{\mathbf{r}}_{t-1}^{(h_t)}\big\|, \quad i \neq h_t,$$

where $\mathbf{r}_t^{(i,j)}$ is the position of the $j$-th particle of the $i$-th sound source at time $t$ and $\hat{\mathbf{r}}_{t-1}^{(h_t)}$ is the primary-source position estimate at time $t-1$;
S5.4, constructing an attenuation coefficient from the distance of step S5.3, and multiplying the weaker source's state observations by it to obtain the attenuated observations:

$$\tilde{z}_t^{(i,j)} = \Big(1 - \mu\, e^{-d_t^{(i,j)}/z}\Big)\, z_t^{(i,j)}, \quad i \neq h_t,$$

where $\mu$ is a constant less than 1 whose larger values give a larger attenuation amplitude, and $z$ is a constant determining the attenuation rate, with larger values giving larger attenuation at the same distance.
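The suppression step S5.3–S5.4 can be sketched as follows. The exact functional form in the patent survives only as an image; the exponential form α = 1 − μ·exp(−d/z) used here is an assumption consistent with the stated behavior (α near 1 for d ≥ 0.5 m with μ = 0.8, z = 0.15 m, and α = 1 − μ at d = 0).

```python
import numpy as np

def attenuate_observations(z_obs, particle_xy, primary_xy, mu=0.8, z=0.15):
    """S5: scale the weaker source's particle observations by an attenuation
    coefficient that shrinks them near the primary source.  The form
    alpha = 1 - mu*exp(-d/z) is an assumed reconstruction."""
    d = np.linalg.norm(np.asarray(particle_xy) - np.asarray(primary_xy), axis=-1)
    alpha = 1.0 - mu * np.exp(-d / z)         # attenuation coefficient
    return alpha * np.asarray(z_obs), alpha

z_att, alpha = attenuate_observations(
    np.ones(3),
    np.array([[0.0, 0.0], [0.15, 0.0], [0.5, 0.0]]),  # particles at d = 0, 0.15, 0.5 m
    np.zeros(2))                                       # primary source at origin
```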
The invention has the beneficial effects that:
(1) Compared with the prior art, the method suppresses the primary sound source during tracking and thus avoids the estimated trajectory of the weaker source merging with that of the primary source when the two sources come close. Specifically, in step S5 the state observations of the weaker source's particles near the primary source are multiplied by a suitable attenuation coefficient, which reduces those particles' weights. When the two sources approach each other or their trajectories cross, the lowered weights prevent the weaker source's particles from being attracted by the primary source, preserve the independence of each source's particle group, and keep both sources tracked.
(2) The method places no restriction on the shape of the microphone array, so it applies to any array geometry; nor does it restrict the motion trajectory of the sound source, so it handles curvilinear source motion.
Drawings
FIG. 1 is a main flow chart of the method of the present invention.
Fig. 2 is a flowchart of method step S5 of the present invention.
Fig. 3 is a schematic diagram of the microphone positions and sound source trajectories according to the present invention.
FIG. 4 is a schematic diagram illustrating the comparison between the tracking trajectory of the non-intersecting object and the real trajectory by the method of the present invention.
FIG. 5 is a schematic diagram illustrating the comparison between the tracking trajectory and the real trajectory of the intersecting object by the method of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
Example:
The simulated room measures 4 m × 4 m × 2.7 m. As shown in Figs. 1, 2 and 3, 8 microphones are mounted on the surrounding walls at a height of 1.464 m. The semicircles in the figure are the sound source trajectories; the sources move in the directions indicated by the arrows, at the same height as the microphones. The sound source signals are two segments of male speech of about 3.6 s each, taken from the TIMIT database, with sampling frequency $f_s = 8$ kHz. The microphone signals are generated by the image method, with added white Gaussian noise at an SNR of 20 dB and a reverberation time $T_{60} = 0.132$ s. The frame length is $L = 512$ points with no overlap, and a Hanning window is applied. The specific steps are as follows:
s1, establishing a two-dimensional rectangular coordinate system, determining the coordinates of each array element in the microphone array, framing and windowing sound source signals received by the microphone array, and storing the sound source signals into a buffer area;
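The framing and windowing of S1 can be sketched as follows, using the embodiment's parameters (L = 512 points, no overlap, Hanning window); the function name is illustrative.

```python
import numpy as np

def frame_and_window(x, frame_len=512):
    """S1: split a microphone signal into non-overlapping frames and apply a
    Hanning window, as in the embodiment (L = 512, no overlap)."""
    n_frames = len(x) // frame_len
    frames = np.reshape(x[:n_frames * frame_len], (n_frames, frame_len))
    return frames * np.hanning(frame_len)

frames = frame_and_window(np.arange(1200, dtype=float))
```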
s2, generating an initial particle group of each sound source according to the initial state of each sound source;
In the rectangular coordinate system, the state vector of the $i$-th sound source at frame $t$ is $\mathbf{s}_t^{(i)} = [x_t^{(i)}, y_t^{(i)}, \dot{x}_t^{(i)}, \dot{y}_t^{(i)}]^T$, whose first two elements are the source coordinates and whose last two are its velocity; the corresponding source coordinate vector is $\mathbf{r}_t^{(i)} = [x_t^{(i)}, y_t^{(i)}]^T$. With $N$ the number of particles ($N = 50$ in this example) and $N_s = 2$ the number of sound sources, the initial particle group is

$$\{\mathbf{s}_0^{(i,j)}, w_0^{(i,j)}\}_{j=1}^{N}, \quad i = 1{:}N_s,$$

where $\mathbf{s}_0^{(i,j)} = \mathbf{s}_0^{(i)}$ is the initial state of the sound source (the initial position is assumed known and the initial velocity is 0) and $w_0^{(i,j)} = 1/N$ is the initial particle weight.
S3, predicting the new particle state of each sound source according to the state equation;
The state equation can be written $\mathbf{s}_t^{(i,j)} = f(\mathbf{s}_{t-1}^{(i,j)}, \mathbf{u}_t)$. Specifically, the Langevin equation is used:

$$\dot{\mathbf{r}}_t^{(i,j)} = a\,\dot{\mathbf{r}}_{t-1}^{(i,j)} + b\,\mathbf{u}_t, \qquad \mathbf{r}_t^{(i,j)} = \mathbf{r}_{t-1}^{(i,j)} + T\,\dot{\mathbf{r}}_t^{(i,j)},$$

where $\mathbf{u}_t$ is a two-dimensional Gaussian random vector with mean $[0, 0]^T$ and second-order identity covariance matrix, and $T$ is the duration of one signal frame (the state-update interval); in this example $T = L/f_s = 64$ ms. The parameters are $a = e^{-\beta T}$ and $b = \bar{v}\sqrt{1 - a^2}$; matching a normal human walking speed, this example takes $\beta = 10$ Hz and sets $\bar{v}$ accordingly.
s4, calculating the observed value of each sound source particle state at the current moment by using a positioning function according to the received signal frame;
Denote the received signal frame of the microphone array $\mathbf{x}_t = [x_1(n), x_2(n), \ldots, x_M(n)]^T$, where $M$ is the number of array elements ($M = 8$ in this example). The steered response power with phase transform (SRP-PHAT) function is used as the localization function:

$$P_t(\mathbf{r}) = \sum_{l=1}^{M-1} \sum_{m=l+1}^{M} R_{lm}\big(\tau_{lm}(\mathbf{r})\big),$$

where $\mathbf{r}$ is a hypothetical source coordinate, $l$ and $m$ are array-element indices, and $R_{lm}(\tau)$ is the PHAT-weighted generalized cross-correlation function of an element pair, defined as

$$R_{lm}(\tau) = \frac{1}{K} \sum_{k=0}^{K-1} \frac{X_l(k) X_m^*(k)}{|X_l(k) X_m^*(k)|}\, e^{j\omega\tau},$$

where $X_m(k)$ is the $K$-point discrete Fourier transform of the $m$-th element's received signal $x_m(n)$ ($K = 512$ in this example), $(\cdot)^*$ denotes complex conjugation, and $\omega = 2\pi k f_s / K$ is the analog angular frequency. $\tau_{lm}(\mathbf{r})$ is the time difference of arrival (TDOA) of the hypothetical source between an element pair,

$$\tau_{lm}(\mathbf{r}) = \frac{\|\mathbf{r} - \mathbf{r}_l\| - \|\mathbf{r} - \mathbf{r}_m\|}{c},$$

where $\mathbf{r}_m$ is the coordinate of the $m$-th element, $c$ is the speed of sound (342 m/s), and $\|\cdot\|$ denotes the vector 2-norm. The particle-state observation computed from the localization function is

$$z_t^{(i,j)} = P_t\big(\mathbf{r}_t^{(i,j)}\big).$$
S5, constructing an attenuation coefficient from the distance between each particle of the weaker sound source and the primary source, and multiplying it by the particle-state observation to obtain the attenuated observation $\tilde{z}_t^{(i,j)}$. The specific procedure is:
S5.1, computing the localization-function value at each sound source's previous position estimate, from the previous received signal frame:

$$P_{t-1}\big(\hat{\mathbf{r}}_{t-1}^{(i)}\big), \quad i = 1{:}N_s,$$

where $\hat{\mathbf{r}}_{t-1}^{(i)}$ is the position estimate of the $i$-th sound source at time $t-1$;
S5.2, designating the source with the larger localization-function value as the primary source, i.e. the primary-source index at time $t$ is

$$h_t = \arg\max_i P_{t-1}\big(\hat{\mathbf{r}}_{t-1}^{(i)}\big).$$

Once $h_t$ is known, the primary source is easily associated with its particles: the particle group of the $i$-th source, $\{\mathbf{s}_t^{(i,j)}, w_t^{(i,j)}\}_{j=1}^{N}$, belongs to the primary source if $i = h_t$, and otherwise does not;
S5.3, computing the distance between each particle of the weaker sound source and the primary source's previous position estimate:

$$d_t^{(i,j)} = \big\|\mathbf{r}_t^{(i,j)} - \hat{\mathbf{r}}_{t-1}^{(h_t)}\big\|, \quad i \neq h_t,$$

where $\mathbf{r}_t^{(i,j)}$ is the position of the $j$-th particle of the $i$-th sound source at time $t$ and $\hat{\mathbf{r}}_{t-1}^{(h_t)}$ is the primary-source position estimate at time $t-1$;
S5.4, constructing an attenuation coefficient from the distance of step S5.3 and multiplying the weaker source's state observations by it:

$$\tilde{z}_t^{(i,j)} = \Big(1 - \mu\, e^{-d_t^{(i,j)}/z}\Big)\, z_t^{(i,j)}, \quad i \neq h_t,$$

where $\mu$ is a constant less than 1 whose larger values give a larger attenuation amplitude, and $z$ is a parameter determining the attenuation rate, with larger values giving larger attenuation at the same distance. When the sources are close, many particles of the weaker source have small $d_t^{(i,j)}$; their observations should not be attenuated to near zero but kept at an appropriate proportion so that they can still contribute to the state estimate, so $\mu$ must not be too close to 1. Nor can $\mu$ be too small, or the influence of the primary source is not attenuated enough. Considering that the distance between speakers is usually greater than 0.5 m, the attenuation coefficient should be close to 1 when $d_t^{(i,j)} \geq 0.5$ m. This example takes $\mu = 0.8$ and $z = 0.15$ m;
s6, constructing a pseudo-likelihood function of each sound source according to the positioning function and the attenuation coefficient;
The pseudo-likelihood function is denoted $\tilde{p}(\mathbf{x}_t \mid \mathbf{s}_t^{(i,j)})$ and given by

$$\tilde{p}\big(\mathbf{x}_t \mid \mathbf{s}_t^{(i,j)}\big) = \max\big(\tilde{z}_t^{(i,j)},\ 0\big)^r,$$

where $\max(\cdot)$ ensures that the likelihood function is non-negative, and $r$ is a positive real number that adjusts the shape of the likelihood function to improve the performance of the tracking algorithm; $r = 3$ in this embodiment.
S7, calculating the weight of each sound source particle at the current moment according to the pseudo-likelihood function;
$$w_t^{(i,j)} = w_{t-1}^{(i,j)}\,\tilde{p}\big(\mathbf{x}_t \mid \mathbf{s}_t^{(i,j)}\big)$$
s8, normalizing the particle weight of each sound source;
$$\bar{w}_t^{(i,j)} = \frac{w_t^{(i,j)}}{\sum_{j=1}^{N} w_t^{(i,j)}}$$
s9, estimating the position of each sound source at the current moment according to the weight of the particles and the state of the particles;
$$\hat{\mathbf{r}}_t^{(i)} = \sum_{j=1}^{N} \bar{w}_t^{(i,j)}\, \mathbf{r}_t^{(i,j)}$$
S10, resampling from the existing particle group $\{\mathbf{s}_t^{(i,j)}, \bar{w}_t^{(i,j)}\}_{j=1}^{N}$ according to the weights to obtain the resampled particle group $\{\tilde{\mathbf{s}}_t^{(i,j)}, 1/N\}_{j=1}^{N}$;
S11, the resampled particles and their weights are stored, and the process returns to step S3.
To assess the tracking performance of the method, two evaluation metrics are defined: root mean square error (RMSE) and tracking loss rate. Both are computed separately for each sound source. The root mean square error is defined as

$$\mathrm{RMSE}_i = \sqrt{\frac{1}{K_s} \sum_{t=1}^{K_s} \big\|\mathbf{r}_t^{(i)} - \hat{\mathbf{r}}_t^{(i)}\big\|^2},$$

where $i$ is the source index, $\mathbf{r}_t^{(i)}$ is the true position of the $i$-th sound source at frame $t$, $\hat{\mathbf{r}}_t^{(i)}$ is its estimate, and $K_s$ is the number of signal frames.

If at any time during a tracking run the error $\|\mathbf{r}_t^{(i)} - \hat{\mathbf{r}}_t^{(i)}\|$ exceeds a set threshold, target $i$ is considered lost in that run. With $N_{\mathrm{track}}$ the number of tracking runs and $N_{\mathrm{loss}}$ the number of target losses, the tracking loss percentage (TLP) is defined as

$$\mathrm{TLP} = \frac{N_{\mathrm{loss}}}{N_{\mathrm{track}}} \times 100\%.$$
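The two metrics can be sketched as follows. The loss threshold value is not reproduced in this text, so the default below is an assumption for illustration only.

```python
import numpy as np

def rmse_and_loss(true_xy, est_xy, loss_threshold=0.5):
    """RMSE over a run and a loss flag for one source.  true_xy, est_xy:
    (Ks, 2) trajectories.  loss_threshold (metres) is an assumed value."""
    err = np.linalg.norm(np.asarray(true_xy) - np.asarray(est_xy), axis=1)
    rmse = float(np.sqrt(np.mean(err ** 2)))
    lost = bool(np.any(err > loss_threshold))
    return rmse, lost

def tracking_loss_percentage(loss_flags):
    """TLP = N_loss / N_track * 100%."""
    flags = np.asarray(loss_flags, dtype=bool)
    return 100.0 * flags.sum() / flags.size

rmse, lost = rmse_and_loss([[0.0, 0.0], [1.0, 0.0]], [[0.0, 0.0], [1.0, 0.3]])
tlp = tracking_loss_percentage([True, False, False, False])
```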
Case of non-intersecting sound source trajectories:
As shown in Fig. 2, the trajectories of both sound sources are semicircles of radius 0.75 m. Source $S_1$ starts at [1.2, 3] with circle center [1.95, 3]; source $S_2$ starts at [1.2, 1.8] with circle center [1.95, 1.8]. The two sources move in uniform circular motion over 3.6 s, keeping a distance of 1.2 m. Tracking is performed with the conventional algorithm and with the algorithm of the invention; Fig. 4 shows one representative tracking result, in which the dotted lines are the true trajectories and the solid lines the estimated ones.
As can be seen from Fig. 3, when the source trajectories do not intersect and remain far apart, the method tracks both sources well. To examine its performance further, tracking experiments were run at different source distances with the proposed method and the conventional method, with 30 runs per condition; the results are shown in Table 1, where $D_s$ is the distance between the sources and $\overline{\mathrm{RMSE}}$ denotes the mean root mean square error, averaged over the successfully tracked runs, which reflects the tracking accuracy of the algorithm.
[Table 1: mean RMSE and tracking loss rate at different source distances $D_s$ for the two methods; the table image is not reproduced here.]
As Table 1 shows, when the sources are far apart the two methods have similar accuracy, and the proposed method's loss rate is slightly lower than the conventional method's; when the sources are close, the conventional method's loss rate is high, because at close range the weaker source's particles are easily "attracted" by the primary source, losing the target. By adjusting the particle observations in time, the proposed method effectively reduces the loss rate, and its tracking accuracy is higher than the conventional method's.
Case of crossing sound source trajectories:
The trajectory of source $S_1$ is still a semicircle of radius 0.75 m, starting at [1.2, 2.4] with circle center [1.95, 2.4]; the trajectory of source $S_2$ is a straight line from starting point [1.8, 3.1] to end point [1.9, 0.9]. Fig. 5 shows a typical tracking result.
As can be seen from Fig. 5, the method can still track both targets continuously when the source trajectories cross. This shows that attenuating the observations of the weaker source's particles near the primary source effectively prevents them from being "attracted" by it. The 30 runs were also performed for the crossing case; the results are shown in Table 2.
[Table 2: mean RMSE and tracking loss rate for the crossing-trajectory case; the table image is not reproduced here.]
As Table 2 shows, the conventional method's loss rate for source $S_2$ rises sharply. This indicates that at the crossing instant $S_1$ is the primary source and $S_2$ the weaker one, and that after the target trajectories cross the conventional method can hardly keep tracking the weaker source. Table 2 also shows that the proposed method markedly reduces the tracking loss rate of the weaker source.

Claims (1)

1. A multi-source tracking method for suppressing a primary sound source, comprising the steps of:
s1, establishing a two-dimensional rectangular coordinate system, determining the coordinates of each array element in the microphone array, framing and windowing sound source signals received by the microphone array, and storing the sound source signals into a buffer area;
s2, generating an initial particle group of each sound source according to the initial state of each sound source:
in the rectangular coordinate system, the state vector of the $i$-th sound source at frame $t$ is $\mathbf{s}_t^{(i)} = [x_t^{(i)}, y_t^{(i)}, \dot{x}_t^{(i)}, \dot{y}_t^{(i)}]^T$, whose first two elements are the sound source coordinates and whose last two are its velocity; the corresponding source coordinate vector is denoted $\mathbf{r}_t^{(i)} = [x_t^{(i)}, y_t^{(i)}]^T$; with $N$ the number of particles and $N_s$ the number of sound sources, the initial particle group is

$$\{\mathbf{s}_0^{(i,j)}, w_0^{(i,j)}\}_{j=1}^{N}, \quad i = 1{:}N_s,$$

where $\mathbf{s}_0^{(i,j)} = \mathbf{s}_0^{(i)}$ is the initial state of the sound source and $w_0^{(i,j)} = 1/N$ is the initial particle weight;
S3, predicting the new particle state of each sound source according to the state equation:
the state equation can be written $\mathbf{s}_t^{(i,j)} = f(\mathbf{s}_{t-1}^{(i,j)}, \mathbf{u}_t)$; specifically, the Langevin equation can be used as the state equation of the sound source:

$$\dot{\mathbf{r}}_t^{(i,j)} = a\,\dot{\mathbf{r}}_{t-1}^{(i,j)} + b\,\mathbf{u}_t, \qquad \mathbf{r}_t^{(i,j)} = \mathbf{r}_{t-1}^{(i,j)} + T\,\dot{\mathbf{r}}_t^{(i,j)},$$

where $\mathbf{u}_t$ is a two-dimensional Gaussian random vector with mean $[0, 0]^T$ and second-order identity covariance matrix, $T$ is the duration of one signal frame (the state-update interval), $a = e^{-\beta T}$, and $b = \bar{v}\sqrt{1 - a^2}$, with $\beta$ and $\bar{v}$ constants set according to the motion state of the sound source;
S4, calculating the observed value of each sound source's particle states at the current time from the received signal frame by means of the localization function:
the received signal frame of the microphone array is denoted x_t = [x_1(n), x_2(n), ..., x_M(n)]^T, where M is the number of array elements of the microphone array; the steered response power with phase transform (SRP-PHAT) function is used as the localization function, with the expression
P_t(r) = Σ_{l=1}^{M-1} Σ_{m=l+1}^{M} R_{lm}(τ_{lm}(r)),
where r is a hypothetical sound source coordinate, l and m are array-element indices, and R_{lm}(τ) is the generalized cross-correlation function of array-element pair (l, m), defined as
R_{lm}(τ) = Σ_{k=0}^{K-1} (X_l(k)·X_m*(k)) / |X_l(k)·X_m*(k)| · e^{jωτ},
where X_m(k) is the K-point discrete Fourier transform of the m-th array element's received signal x_m(n), * denotes complex conjugation, and ω is the analog angular frequency of the k-th frequency bin;
τ_{lm}(r) = (||r - r_l|| - ||r - r_m||) / c
is the time difference of arrival (TDOA) of the hypothetical sound source to array-element pair (l, m), where r_m denotes the coordinate of the m-th array element, c is the speed of sound (342 m/s), and ||·|| denotes the 2-norm of a vector; the observed value of particle state s_{t,n}^(i) calculated with the localization function is P_t(r_{t,n}^(i));
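A minimal Python sketch of the SRP-PHAT localization function of step S4, assuming one windowed frame per microphone and a planar (2-D) array geometry; the function name and the small regularization constant are illustrative:

```python
import numpy as np

def srp_phat(frames, mic_pos, points, fs, c=342.0):
    """SRP-PHAT localization function of step S4 at candidate points.

    frames: (M, n) array, one windowed frame per microphone;
    mic_pos: (M, 2) microphone coordinates; points: (P, 2) candidate
    source coordinates. Sums the PHAT-weighted cross-correlation of
    every microphone pair, evaluated at that pair's TDOA for each point.
    """
    M, n = frames.shape
    X = np.fft.rfft(frames, n)                       # per-microphone spectra
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)           # Hz per bin
    power = np.zeros(len(points))
    for l in range(M - 1):
        for m in range(l + 1, M):
            cross = X[l] * np.conj(X[m])
            cross /= np.abs(cross) + 1e-12           # PHAT weighting
            # TDOA of each candidate point to the (l, m) pair
            tau = (np.linalg.norm(points - mic_pos[l], axis=1)
                   - np.linalg.norm(points - mic_pos[m], axis=1)) / c
            # evaluate the GCC at tau by direct inverse transform
            power += np.real(np.exp(2j * np.pi * np.outer(tau, freqs)) @ cross)
    return power
```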
S5, constructing an attenuation coefficient from the distance between the particles of each weak sound source and the main source, and multiplying the particle state observed values by this attenuation coefficient to obtain the attenuated particle state observed values P̃_t(r_{t,n}^(i));
S6, constructing the pseudo-likelihood function of each sound source from the localization function and the attenuation coefficient:
denoting the pseudo-likelihood function by p(x_t | s_{t,n}^(i)), its expression is
p(x_t | s_{t,n}^(i)) = max(P̃_t(r_{t,n}^(i)), 0)^r,
where the function max(·) ensures that the likelihood function is non-negative, and r is a positive real number that adjusts the shape of the likelihood function to improve the performance of the tracking algorithm;
S7, calculating the weight of each sound source's particles at the current time from the pseudo-likelihood function:
w̃_{t,n}^(i) = w_{t-1,n}^(i) · p(x_t | s_{t,n}^(i));
S8, normalizing the particle weights of each sound source:
w_{t,n}^(i) = w̃_{t,n}^(i) / Σ_{n=1}^{N} w̃_{t,n}^(i);
S9, estimating the position of each sound source at the current time from the particle weights and particle states:
ŝ_t^(i) = Σ_{n=1}^{N} w_{t,n}^(i) · s_{t,n}^(i),
with the position estimate r̂_t^(i) given by the first two elements of ŝ_t^(i);
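Steps S6 through S9 can be combined into one small update routine for a single source's particle set; the exponent r and the guard against an all-zero weight vector are illustrative choices, not specified by the patent:

```python
import numpy as np

def update_and_estimate(particles, weights, power, r_exp=2.0):
    """Steps S6-S9: pseudo-likelihood, weight update, normalization, estimate.

    power: localization-function value (after any attenuation) at each
    particle's position; the pseudo-likelihood is max(power, 0)**r.
    """
    likelihood = np.maximum(power, 0.0) ** r_exp     # S6: non-negative, shaped by r
    new_w = weights * likelihood                     # S7: weight update
    total = new_w.sum()
    if total > 0:                                    # S8: normalize
        new_w = new_w / total
    else:                                            # degenerate case: fall back to uniform
        new_w = np.full(len(weights), 1.0 / len(weights))
    estimate = new_w @ particles                     # S9: weighted-mean state
    return new_w, estimate
```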
S10, resampling according to the weights from the existing particle set {s_{t,n}^(i), w_{t,n}^(i)}_{n=1}^{N} to obtain the resampled particle set {s̃_{t,n}^(i), 1/N}_{n=1}^{N};
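Step S10 does not name a particular resampling scheme; systematic resampling is one common choice and is sketched below, returning N particles with uniform weights 1/N:

```python
import numpy as np

def systematic_resample(particles, weights, rng=None):
    """Step S10: resample the particle set according to the weights.

    Draws N stratified positions on [0, 1) and maps each onto the
    cumulative weight distribution; the survivors get weight 1/N.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n    # stratified positions in [0, 1)
    idx = np.searchsorted(np.cumsum(weights), positions)
    idx = np.clip(idx, 0, n - 1)                     # guard against rounding at 1.0
    return particles[idx], np.full(n, 1.0 / n)
```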
S11, storing the resampled particles and their weights, and returning to step S3;
attenuating the state observed values of the weak-sound-source particles in step S5 comprises the following steps:
S5.1, calculating the value of the localization function at each sound source's position estimate of the previous time from the signal frame received at the previous time, with the expression
P_{t-1}(r̂_{t-1}^(i)), i = 1, ..., N_s,
where r̂_{t-1}^(i) represents the position estimate of the i-th sound source at time t-1;
S5.2, determining the source with the largest localization-function value as the main source, i.e. the main-source index at time t is
h_t = argmax_i P_{t-1}(r̂_{t-1}^(i));
once the main-source index is obtained, the main source is easily associated with its particles: for the particle set {s_{t,n}^(i), w_{t,n}^(i)}_{n=1}^{N} of the i-th source, if i = h_t the current source is the main source, otherwise it is a weak source;
S5.3, calculating the distance between each weak sound source's particles and the main source's position estimate of the previous time:
d_{t,n}^(i) = ||r_{t,n}^(i) - r̂_{t-1}^(h_t)||, i ≠ h_t,
where r_{t,n}^(i) denotes the position of the n-th particle of the i-th sound source at time t, and r̂_{t-1}^(h_t) denotes the main-source position estimate at time t-1;
S5.4, constructing the attenuation coefficient from the distance of step S5.3, and multiplying the state observed values of the weak-sound-source particles by it to obtain the attenuated particle state observed values:
P̃_t(r_{t,n}^(i)) = [1 - μ·exp(-d_{t,n}^(i) / z)] · P_t(r_{t,n}^(i)), i ≠ h_t,
where μ is a constant less than 1 whose larger values give a larger attenuation amplitude, and z is a constant determining the attenuation rate whose larger values give a larger attenuation amplitude at the same distance.
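The attenuation of steps S5.3 and S5.4 can be sketched as follows; the factor 1 - μ·exp(-d/z) is a reconstruction consistent with the stated behaviour of μ and z (both monotonicities hold), and the values of μ and z are illustrative:

```python
import numpy as np

def attenuate(power, particle_pos, main_src_est, mu=0.8, z=0.5):
    """Steps S5.3-S5.4: attenuate a weak source's observations near the main source.

    particle_pos: (N, 2) weak-source particle positions;
    main_src_est: (2,) main-source position estimate at the previous time.
    Particles close to the main source are attenuated most (factor 1 - mu
    at zero distance); far particles are left nearly unchanged.
    """
    d = np.linalg.norm(particle_pos - main_src_est, axis=1)  # S5.3: distances
    return (1.0 - mu * np.exp(-d / z)) * power               # S5.4: attenuated values
```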
CN202010184264.2A 2020-03-08 2020-03-08 Multi-sound-source tracking method for restraining main sound source Active CN111239691B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010184264.2A CN111239691B (en) 2020-03-08 2020-03-08 Multi-sound-source tracking method for restraining main sound source


Publications (2)

Publication Number Publication Date
CN111239691A CN111239691A (en) 2020-06-05
CN111239691B true CN111239691B (en) 2022-03-08

Family

ID=70870674


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115662383B (en) * 2022-12-22 2023-04-14 杭州爱华智能科技有限公司 Method and system for deleting main sound source, method, system and device for identifying multiple sound sources

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4686532A (en) * 1985-05-31 1987-08-11 Texas Instruments Incorporated Accurate location sonar and radar
CN104991573A (en) * 2015-06-25 2015-10-21 北京品创汇通科技有限公司 Locating and tracking method and apparatus based on sound source array

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5403896B2 (en) * 2007-10-31 2014-01-29 株式会社東芝 Sound field control system
CN103152820B (en) * 2013-02-06 2015-08-12 长安大学 A kind of wireless sensor network acoustic target iteration localization method
JP6030012B2 (en) * 2013-03-21 2016-11-24 株式会社東芝 Direction measuring apparatus, direction measuring program, and direction measuring method
EP3134709A4 (en) * 2014-04-22 2018-01-03 BASF (China) Company Ltd. Detector for optically detecting at least one object
CN104820993B (en) * 2015-03-27 2017-12-01 浙江大学 It is a kind of to combine particle filter and track the underwater weak signal target tracking for putting preceding detection
CN108828524B (en) * 2018-06-03 2021-04-06 桂林电子科技大学 Particle filter sound source tracking and positioning method based on Delaunay triangulation
CN109407515A (en) * 2018-12-17 2019-03-01 厦门理工学院 A kind of interference observer design method suitable for non-minimum phase system




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant