Disclosure of Invention
The invention aims to provide an adjustable-response time-frequency spectrum sound source localization method and system that overcome the low positioning accuracy and low positioning speed of conventional localization methods and improve both.
In order to achieve the purpose, the invention provides the following scheme:
a method for tunable response time-frequency spectrum sound source localization, the localization method comprising the steps of:
calculating the cross-correlation coefficient of the vibration signals acquired by every two sensors in the sensor array;
performing time-frequency transformation on each cross-correlation coefficient to obtain a time-frequency spectrum of the cross-correlation coefficient;
obtaining positioning space energy distribution containing signal frequency information by utilizing a delay and sum beam forming principle according to the time-frequency spectrum of each cross-correlation coefficient;
determining a positioning characteristic vector corresponding to the maximum value of the energy distribution of the positioning space of the vibration signals with different frequencies in a frequency constraint range by adopting a constraint maximum likelihood estimation mode, and constructing a positioning characteristic matrix;
and clustering and weighting fusion are carried out on the positioning feature vectors in the positioning feature matrix to obtain a sound source positioning result.
Optionally, obtaining the positioning space energy distribution containing signal frequency information from the time-frequency spectrum of each cross-correlation coefficient by the delay-and-sum beamforming principle specifically includes computing:

$$P(x, f) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \mathrm{ST}_{R_{i,j}}(\tau(x), f) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \int s_i(t)\, \mathrm{ST}_{s_j}(t + \tau(x), f)\, dt$$

where $P(x, f)$ denotes the positioning space energy distribution; $\tau(x)$ denotes the time difference between the spatial position $x$ and the i-th and j-th sensors; $f$ denotes the vibration signal frequency; $\mathrm{ST}_{R_{i,j}}(\tau(x), f)$ denotes the value of the time-frequency transform of the (i, j)-th cross-correlation coefficient $R_{i,j}$; $s_i(t)$ denotes the vibration signal collected by the i-th sensor; $\mathrm{ST}_{s_j}(t + \tau(x), f)$ denotes the time-frequency transform of the vibration signal $s_j(t + \tau(x))$ collected by the j-th sensor; $t$ denotes the time at which the i-th sensor acquired the vibration signal; and $N$ denotes the number of sensors in the sensor array.
Optionally, determining, by constrained maximum likelihood estimation, the positioning feature vector corresponding to the maximum of the positioning space energy distribution for vibration signals of different frequencies within the frequency constraint range, and constructing the positioning feature matrix, specifically includes:
solving, by constrained maximum likelihood estimation, the formula

$$\hat{x}_m = \arg\max_{x} P(x, f_m), \quad \text{s.t. } f_m \in [f_{\min}, f_{\max}]$$

to obtain the positioning feature matrix $A = [a_1, a_2, \ldots, a_m, \ldots, a_M]$;
where $\hat{x}_m$ denotes the position vector of the maximum of the positioning space energy distribution when the vibration signal frequency is $f_m$; $P(x, f_m)$ denotes the positioning space energy distribution when the vibration signal frequency is $f_m$; $a_m = [f_m, \hat{x}_m^{T}, E_m]^{T}$ denotes the positioning feature vector corresponding to the vibration signal frequency $f_m$; $E_m = P(\hat{x}_m, f_m)$ denotes the spatially adjustable response spectral energy corresponding to $f_m$, that is, the energy at the position vector $\hat{x}_m$ in the positioning space energy distribution when the vibration signal frequency is $f_m$; $[f_{\min}, f_{\max}]$ denotes the frequency band range; $f_{\min}$ denotes the minimum effective frequency; and $f_{\max}$ denotes the maximum effective frequency.
Optionally, clustering and weighted fusion of the positioning feature vectors in the positioning feature matrix to obtain the sound source localization result specifically includes:
taking the positioning feature vector in the positioning feature matrix whose vibration signal frequency is closest to the main frequency as the main clustering center;
performing two-class dynamic clustering on the positioning feature vectors in the positioning feature matrix using the main clustering center, and taking the class in which the main clustering center is located as the data fusion sample set;
performing weighted fusion on the positioning feature vectors in the data fusion sample set using the formula

$$\Phi = \sum_{n=1}^{L} w_n a_n, \qquad w_n = \frac{\beta_n}{\sum_{l=1}^{L} \beta_l}$$

to obtain the sound source localization result;
where $\Phi$ denotes the sound source localization result; $w_n$ denotes the weight of the n-th positioning feature vector $a_n$ in the data fusion sample set; $\beta_n$ denotes the inverse of the Euclidean distance between the n-th positioning feature vector in the data fusion sample set and the main clustering center; $\beta_l$ denotes the same quantity for the l-th positioning feature vector; and $L$ denotes the number of positioning feature vectors in the data fusion sample set.
A tunable response time-frequency spectral sound source localization system, the localization system comprising:
the cross-correlation coefficient calculation module is used for calculating the cross-correlation coefficient of the vibration signals acquired by every two sensors in the sensor array;
the time-frequency transformation module is used for respectively carrying out time-frequency transformation on each cross-correlation coefficient to obtain a time-frequency spectrum of the cross-correlation coefficient;
the positioning space energy distribution determining module is used for obtaining positioning space energy distribution containing signal frequency information by utilizing the principle of delay summation beam forming according to the cross-correlation coefficient after each time frequency transformation;
the positioning feature matrix construction module is used for determining a positioning feature vector corresponding to the maximum value of the energy distribution of the positioning space of the vibration signals with different frequencies in the frequency constraint range by adopting a constraint maximum likelihood estimation mode, and constructing a positioning feature matrix;
and the sound source positioning module is used for clustering and weighting the positioning characteristic vectors in the positioning characteristic matrix to obtain a sound source positioning result.
Optionally, the positioning space energy distribution determining module specifically includes:
a positioning space energy distribution determining submodule, configured to obtain the positioning space energy distribution containing signal frequency information from the time-frequency spectrum of each cross-correlation coefficient by the delay-and-sum beamforming principle, as follows:

$$P(x, f) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \mathrm{ST}_{R_{i,j}}(\tau(x), f) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \int s_i(t)\, \mathrm{ST}_{s_j}(t + \tau(x), f)\, dt$$

where $P(x, f)$ denotes the positioning space energy distribution; $\tau(x)$ denotes the time difference between the spatial position $x$ and the i-th and j-th sensors; $f$ denotes the vibration signal frequency; $\mathrm{ST}_{R_{i,j}}(\tau(x), f)$ denotes the value of the time-frequency transform of the (i, j)-th cross-correlation coefficient $R_{i,j}$; $s_i(t)$ denotes the vibration signal collected by the i-th sensor; $\mathrm{ST}_{s_j}(t + \tau(x), f)$ denotes the time-frequency transform of the vibration signal $s_j(t + \tau(x))$ collected by the j-th sensor; $t$ denotes the time at which the i-th sensor acquired the vibration signal; and $N$ denotes the number of sensors in the sensor array.
Optionally, the positioning feature matrix construction module specifically includes:
a positioning feature matrix construction submodule, configured to solve, by constrained maximum likelihood estimation, the formula

$$\hat{x}_m = \arg\max_{x} P(x, f_m), \quad \text{s.t. } f_m \in [f_{\min}, f_{\max}]$$

to obtain the positioning feature matrix $A = [a_1, a_2, \ldots, a_m, \ldots, a_M]$;
where $\hat{x}_m$ denotes the position vector of the maximum of the positioning space energy distribution when the vibration signal frequency is $f_m$; $P(x, f_m)$ denotes the positioning space energy distribution when the vibration signal frequency is $f_m$; $a_m = [f_m, \hat{x}_m^{T}, E_m]^{T}$ denotes the positioning feature vector corresponding to the vibration signal frequency $f_m$; $E_m = P(\hat{x}_m, f_m)$ denotes the spatially adjustable response spectral energy corresponding to $f_m$, that is, the energy at the position vector $\hat{x}_m$ in the positioning space energy distribution when the vibration signal frequency is $f_m$; $[f_{\min}, f_{\max}]$ denotes the frequency band range; $f_{\min}$ denotes the minimum effective frequency; and $f_{\max}$ denotes the maximum effective frequency.
Optionally, the sound source positioning module specifically includes:
a main clustering center determining submodule, configured to take the positioning feature vector in the positioning feature matrix whose vibration signal frequency is closest to the main frequency as the main clustering center;
a data fusion sample set acquisition submodule, configured to perform two-class dynamic clustering on the positioning feature vectors in the positioning feature matrix using the main clustering center, and to take the class in which the main clustering center is located as the data fusion sample set;
a weighted fusion submodule, configured to perform weighted fusion on the positioning feature vectors in the data fusion sample set using the formula

$$\Phi = \sum_{n=1}^{L} w_n a_n, \qquad w_n = \frac{\beta_n}{\sum_{l=1}^{L} \beta_l}$$

to obtain the sound source localization result;
where $\Phi$ denotes the sound source localization result; $w_n$ denotes the weight of the n-th positioning feature vector $a_n$ in the data fusion sample set; $\beta_n$ denotes the inverse of the Euclidean distance between the n-th positioning feature vector in the data fusion sample set and the main clustering center; $\beta_l$ denotes the same quantity for the l-th positioning feature vector; and $L$ denotes the number of positioning feature vectors in the data fusion sample set.
Compared with the prior art, the invention has the beneficial effects that:
The invention provides an adjustable-response time-frequency spectrum sound source localization method and system. The localization method comprises: calculating the cross-correlation coefficient of the vibration signals acquired by every two sensors in the sensor array; performing time-frequency transformation on each cross-correlation coefficient to obtain its time-frequency spectrum; obtaining a positioning space energy distribution containing signal frequency information from the time-frequency spectrum of each cross-correlation coefficient by the delay-and-sum beamforming principle; determining, by constrained maximum likelihood estimation, the positioning feature vector corresponding to the maximum of the positioning space energy distribution for vibration signals of different frequencies within the frequency constraint range, and constructing a positioning feature matrix; and performing clustering and weighted fusion on the positioning feature vectors in the positioning feature matrix to obtain the sound source localization result. Because the cross-correlation coefficients are time-frequency transformed before delay-and-sum beamforming is applied, the resulting positioning space energy distribution carries frequency information. Constrained maximum likelihood estimation over this distribution is then used to construct the positioning feature matrix, which overcomes the inaccurate positioning caused by ignoring frequency constraints, and the clustering and weighted fusion step further improves positioning accuracy and speed.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in fig. 1 and 2, the present invention provides a method for localization of a time-frequency-spectrum sound source with adjustable response, the localization method comprising the steps of:
step 101, calculating the cross-correlation coefficient of the vibration signals collected by every two sensors in the sensor array.
Vibration data generated by the sound source are collected by the sensor array, and the cross-correlation $R_{i,j}(\tau)$ of the signals of each sensor pair (i, j) is calculated:

$$R_{i,j}(\tau) = \int s_i(t)\, s_j(t + \tau)\, dt \tag{1}$$

where i and j denote the i-th and j-th sensors, $t$ denotes the time of signal acquisition, $\tau$ denotes the time delay, $s_i(t)$ denotes the signal collected by the i-th sensor, and $s_j(t + \tau)$ denotes the signal collected by the j-th sensor shifted by the delay $\tau$.
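As an illustration of step 101, the following is a minimal sketch that assumes the cross-correlation of formula (1) is approximated by the discrete sum of $s_i[t]\, s_j[t + \tau]$ over sampled signals. The function name `pairwise_cross_correlation` and the dictionary layout are hypothetical and are not part of the described method.

```python
import numpy as np
from itertools import combinations

def pairwise_cross_correlation(signals):
    """Return {(i, j): (lags, R_ij)} for every sensor pair i < j.

    signals: array of shape (N, T), one row per sensor, same sampling rate.
    R_ij[k] approximates sum_t s_i[t] * s_j[t + lags[k]] (lags in samples).
    """
    signals = np.asarray(signals, dtype=float)
    n_sensors, n_samples = signals.shape
    lags = np.arange(-(n_samples - 1), n_samples)
    result = {}
    for i, j in combinations(range(n_sensors), 2):
        # np.correlate(a, v, "full")[k] = sum_n a[n + lag] * v[n], so passing
        # (s_j, s_i) gives sum_t s_j[t + lag] * s_i[t], as in the definition.
        r = np.correlate(signals[j], signals[i], mode="full")
        result[(i, j)] = (lags, r)
    return result
```

With this sign convention, a signal that reaches sensor j later than sensor i produces a peak of $R_{i,j}$ at a positive lag.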
Step 102, performing time-frequency transformation on each cross-correlation coefficient to obtain the time-frequency spectrum of the cross-correlation coefficient.
The SRP-PHAT (steered response power with phase transform) method, which can be obtained from formula (1), is currently the localization method with the best resistance to noise and multipath interference. Its localization principle is:

$$P(x) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} R_{i,j}(\tau(x))$$

where $P(x)$ represents the positioning space energy distribution and

$$\tau(x) = \frac{\|x - x_j\|_2 - \|x - x_i\|_2}{v}$$

is the calculated time difference between the spatial position $x$ and the i-th and j-th sensors, with the sign chosen so that $R_{i,j}$ in formula (1) peaks at $\tau(x)$ for a source located at $x$. Here $x$ represents the spatial position of the sound source, $x_i$ and $x_j$ are the positions of the i-th and j-th sensors, $\|\cdot\|_2$ is the 2-norm (distance) of a vector, and $v$ is the wave propagation velocity. The spatial position that maximizes $P(x)$,

$$\hat{x} = \arg\max_{x} P(x),$$

is the location of the sound source. Although the SRP-PHAT method has a certain resistance to noise, it uses the full-band information of the signals, and noise interference produces a large number of local extrema in $P(x)$. As a result, the sound source positioning accuracy is reduced and the optimization converges slowly.
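The delay-and-sum principle described above can be sketched as follows. This is a simplified illustration that evaluates a plain steered response on a list of candidate positions and omits the PHAT weighting; the names `tdoa` and `srp_map`, the sampling rate `fs`, and the sign convention $\tau(x) = (\|x - x_j\|_2 - \|x - x_i\|_2)/v$ are assumptions made for the example.

```python
import numpy as np

def tdoa(x, xi, xj, v):
    """Time difference tau_ij(x) = (||x - x_j|| - ||x - x_i||) / v."""
    x, xi, xj = (np.asarray(a, dtype=float) for a in (x, xi, xj))
    return (np.linalg.norm(x - xj) - np.linalg.norm(x - xi)) / v

def srp_map(corr, sensors, candidates, v, fs):
    """Sum R_ij(tau_ij(x)) over sensor pairs for every candidate position x.

    corr: {(i, j): (lags, R_ij)} with ascending integer lags in samples.
    Returns one energy value per candidate position.
    """
    power = np.zeros(len(candidates))
    for k, x in enumerate(candidates):
        for (i, j), (lags, r) in corr.items():
            tau_samples = tdoa(x, sensors[i], sensors[j], v) * fs
            # Linear interpolation between integer lags (an assumption).
            power[k] += np.interp(tau_samples, lags, r)
    return power
```

The candidate with the largest returned value plays the role of the maximizer of $P(x)$. Because every candidate is evaluated over the full band, noise-induced local maxima all compete with the true peak, which is the weakness the text attributes to the full-band approach.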
To overcome these shortcomings of the SRP-PHAT method, the invention first applies a time-frequency transform to $R_{i,j}(\tau)$ and then obtains an adjustable spatial response spectrum by delay-and-sum beamforming in the time-frequency domain. The time-frequency transform of $R_{i,j}(\tau)$ uses the S-transform:

$$\mathrm{ST}_{R_{i,j}}(\tau, f) = \int s_i(t)\, \mathrm{ST}_{s_j}(t + \tau, f)\, dt$$

where $\mathrm{ST}_{s_j}(t, f)$ represents the synchrosqueezed S-transform of the signal $s_j(t)$, and $\mathrm{ST}_{R_{i,j}}(\tau, f)$ represents the time-frequency transform of the cross-correlation $R_{i,j}(\tau)$.
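For orientation, a basic discrete S-transform (Stockwell transform) can be computed with the FFT as sketched below. This is the plain S-transform, not the synchrosqueezed variant named in the text, and the function name `s_transform` is hypothetical.

```python
import numpy as np

def s_transform(x):
    """Discrete S-transform of a real signal.

    Returns a complex array of shape (T // 2 + 1, T): rows are frequency
    bins n (frequency n * fs / T), columns are time (or lag) samples.
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    X = np.fft.fft(x)
    # Integer frequency offsets in FFT order: 0, 1, ..., -2, -1.
    m = np.fft.fftfreq(T) * T
    out = np.zeros((T // 2 + 1, T), dtype=complex)
    out[0, :] = x.mean()  # the zero-frequency row is the signal mean
    for n in range(1, T // 2 + 1):
        # Frequency-domain Gaussian window whose width scales with n.
        gauss = np.exp(-2.0 * np.pi**2 * m**2 / n**2)
        # np.roll(X, -n)[k] == X[(k + n) % T], i.e. the spectrum shifted by n.
        out[n, :] = np.fft.ifft(np.roll(X, -n) * gauss)
    return out
```

Applying such a transform along the lag axis of $R_{i,j}(\tau)$ gives a spectrum indexed by lag and frequency, which is the kind of input the delay-and-sum step in the time-frequency domain requires.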
Step 103, obtaining the positioning space energy distribution containing signal frequency information from the time-frequency spectrum of each cross-correlation coefficient by the delay-and-sum beamforming principle.
Step 103 specifically includes: obtaining, from each time-frequency transformed cross-correlation coefficient and by the delay-and-sum beamforming principle, the positioning space energy distribution containing signal frequency information as follows:

$$P(x, f) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \mathrm{ST}_{R_{i,j}}(\tau(x), f) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \int s_i(t)\, \mathrm{ST}_{s_j}(t + \tau(x), f)\, dt$$

where $P(x, f)$ denotes the positioning space energy distribution; $\tau(x)$ denotes the time difference between the spatial position $x$ and the i-th and j-th sensors; $f$ denotes the vibration signal frequency; $\mathrm{ST}_{R_{i,j}}(\tau(x), f)$ denotes the value of the time-frequency transform of the (i, j)-th cross-correlation coefficient $R_{i,j}$; $s_i(t)$ denotes the vibration signal collected by the i-th sensor; $\mathrm{ST}_{s_j}(t + \tau(x), f)$ denotes the time-frequency transform of the vibration signal $s_j(t + \tau(x))$ collected by the j-th sensor; $t$ denotes the time at which the i-th sensor acquired the vibration signal; and $N$ denotes the number of sensors in the sensor array.
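Specifically, let the position of the i-th sensor be $x_i = [x_i, y_i, z_i]^{T}$. From the principle of delay-and-sum beamforming, one obtains:

$$P(x, f) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \mathrm{ST}_{R_{i,j}}(\tau(x), f), \qquad \tau(x) = \frac{\|x - x_j\|_2 - \|x - x_i\|_2}{v} \tag{3}$$

where $\mathrm{ST}_{R_{i,j}}(\tau, f)$ is the time-frequency transform of the cross-correlation $R_{i,j}(\tau)$ and $v$ is the wave propagation velocity. The adjustable spatial response spectrum represented by equation (3) is a function not only of the spatial position variable $x$ but also of the signal frequency $f$. Thus, better positioning performance can be obtained by constraining the signal frequency $f$.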
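A minimal sketch of evaluating such a frequency-dependent response on a set of candidate positions is given below. It assumes the time-frequency spectra of the cross-correlations are already available as arrays indexed by frequency bin and integer lag, uses nearest-lag lookup, and sums magnitudes so that the result is real valued; these choices and the name `tunable_response` are assumptions of the example, not details stated in the text.

```python
import numpy as np

def tunable_response(tf_spectra, sensors, candidates, v, fs):
    """Return P of shape (n_candidates, n_freq_bins).

    tf_spectra: {(i, j): (lags, S)} where S[k, l] is the time-frequency
                value of R_ij at frequency bin k and lag lags[l]; lags are
                consecutive integers (samples) in ascending order.
    """
    sensors = np.asarray(sensors, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    n_freq = next(iter(tf_spectra.values()))[1].shape[0]
    P = np.zeros((len(candidates), n_freq))
    for (i, j), (lags, S) in tf_spectra.items():
        # tau_ij(x) in samples for all candidate positions at once.
        d_i = np.linalg.norm(candidates - sensors[i], axis=1)
        d_j = np.linalg.norm(candidates - sensors[j], axis=1)
        tau = (d_j - d_i) / v * fs
        # Nearest available lag index, clipped to the valid range.
        idx = np.clip(np.rint(tau).astype(int) - lags[0], 0, len(lags) - 1)
        # Using the magnitude of the spectrum is an assumption.
        P += np.abs(S[:, idx]).T
    return P
```

Each column of the returned array is a spatial energy map for one frequency bin, so restricting attention to a band of interest amounts to selecting columns.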
Step 104, determining, by constrained maximum likelihood estimation, the positioning feature vector corresponding to the maximum of the positioning space energy distribution for vibration signals of different frequencies within the frequency constraint range, and constructing a positioning feature matrix.
Step 104 specifically includes: solving, by constrained maximum likelihood estimation, the formula

$$\hat{x}_m = \arg\max_{x} P(x, f_m), \quad \text{s.t. } f_m \in [f_{\min}, f_{\max}]$$

to obtain the positioning feature matrix $A = [a_1, a_2, \ldots, a_m, \ldots, a_M]$; where $\hat{x}_m$ denotes the position vector of the maximum of the positioning space energy distribution when the vibration signal frequency is $f_m$; $P(x, f_m)$ denotes the positioning space energy distribution when the vibration signal frequency is $f_m$; $a_m = [f_m, \hat{x}_m^{T}, E_m]^{T}$ denotes the positioning feature vector corresponding to the vibration signal frequency $f_m$; $E_m = P(\hat{x}_m, f_m)$ denotes the spatially adjustable response spectral energy corresponding to $f_m$, that is, the energy at the position vector $\hat{x}_m$ in the positioning space energy distribution when the vibration signal frequency is $f_m$; $[f_{\min}, f_{\max}]$ denotes the frequency band range; $f_{\min}$ denotes the minimum effective frequency; and $f_{\max}$ denotes the maximum effective frequency.
Specifically, signals collected by a sensor array are often affected by the surrounding environment; in complex environments in particular, the signal-to-noise ratio of the collected signals is low. Moreover, the useful content of the collected array signals is usually concentrated in a main frequency band, and not all frequency bands carry useful information. Therefore, better positioning results can be obtained by constraining the signal frequency $f$ so that only the adjustable spatial response spectral energy in the main frequency band of the signal is considered when maximizing formula (3). Spectrum analysis is performed on the array signals to obtain their frequency band range $[f_{\min}, f_{\max}]$ and main frequency $f_0$, which are used as constraints in solving the maximum likelihood estimate of $P(x, f)$. The maximization problem can then be converted into the constrained optimization problem

$$\hat{x}_i = \arg\max_{x} P(x, f_i), \quad \text{s.t. } f_i \in [f_{\min}, f_{\max}] \tag{4}$$

where $f_i$ represents a frequency within the range $[f_{\min}, f_{\max}]$ (here the subscript i indexes frequencies, not sensors). This constrained optimization problem can be solved by swarm intelligence optimization methods such as the genetic algorithm, yielding a series of positioning results over the frequency band together with their corresponding features, expressed as

$$A = [a_1, a_2, \ldots, a_i, \ldots, a_M] \tag{5}$$

where $a_i$ is the vector constructed from the frequency $f_i$ within the frequency band, its corresponding positioning spatial position $\hat{x}_i$, and its corresponding spatially adjustable response spectral energy $E_i$, that is,

$$a_i = [f_i, \hat{x}_i^{T}, E_i]^{T}$$

where

$$E_i = P(\hat{x}_i, f_i).$$

The above method yields a series of candidate samples of the sound source position that contain spatial position information as well as frequency and energy information. This is the key information available for positioning, and obtaining it provides the basis for the subsequent high-precision fusion positioning.
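The construction of the feature matrix can be illustrated with the sketch below. It replaces the genetic algorithm mentioned in the text with an exhaustive search over a precomputed grid of candidate positions, which is only a stand-in for illustration, and it stores each feature vector as a row $[f, \hat{x}, E]$ rather than as a column. The function name `build_feature_matrix` and the argument layout are hypothetical.

```python
import numpy as np

def build_feature_matrix(P, candidates, freqs, f_min, f_max):
    """Rows are a_m = [f_m, x_hat_m..., E_m] for every f_m in [f_min, f_max].

    P:          array (n_candidates, n_freq_bins), energy per position and bin
    candidates: array (n_candidates, dim) of candidate positions
    freqs:      array (n_freq_bins,) with the frequency of each bin
    """
    P = np.asarray(P, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    rows = []
    for k in np.flatnonzero((freqs >= f_min) & (freqs <= f_max)):
        # Grid search stands in for the genetic algorithm (an assumption).
        best = int(np.argmax(P[:, k]))
        rows.append(np.concatenate(([freqs[k]], candidates[best], [P[best, k]])))
    return np.array(rows)
```

Each row pairs one in-band frequency with the position that maximizes the response at that frequency and with the energy reached there, which is the information the subsequent clustering and fusion step consumes.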
Step 105, performing clustering and weighted fusion on the positioning feature vectors in the positioning feature matrix to obtain the sound source localization result.
Step 105 specifically includes: taking the positioning feature vector in the positioning feature matrix whose vibration signal frequency is closest to the main frequency as the main clustering center; performing two-class dynamic clustering on the positioning feature vectors in the positioning feature matrix using the main clustering center, and taking the class in which the main clustering center is located as the data fusion sample set; and performing weighted fusion on the positioning feature vectors in the data fusion sample set using the formula

$$\Phi = \sum_{n=1}^{L} w_n a_n, \qquad w_n = \frac{\beta_n}{\sum_{l=1}^{L} \beta_l}$$

to obtain the sound source localization result; where $\Phi$ denotes the sound source localization result; $w_n$ denotes the weight of the n-th positioning feature vector $a_n$ in the data fusion sample set; $\beta_n$ denotes the inverse of the Euclidean distance between the n-th positioning feature vector in the data fusion sample set and the main clustering center; $\beta_l$ denotes the same quantity for the l-th positioning feature vector; and $L$ denotes the number of positioning feature vectors in the data fusion sample set.
Specifically, step 104 has already obtained the key information that determines the positioning performance, but a high-precision positioning result cannot be obtained from it directly; data mining is required to achieve high-precision positioning. Although positioning results within the signal frequency band are obtained in step 104, the frequencies that best characterize the positioning performance are not completely consistent across the signals collected by different sensors, because the propagation paths of the waves to the different sensors differ. Meanwhile, because of noise interference, the spatial position corresponding to the maximum spatial response spectral energy may not be the best positioning result. Based on these two points, the invention first searches A for the sample $a_m$ whose frequency $f_m$ is closest to the main frequency $f_0$ and uses it as a clustering center for dynamic clustering. Two-class dynamic clustering is then performed on the obtained sample set A; the class whose clustering center is not $a_m$ is removed, and the class whose clustering center is $a_m$ (denoted B) is kept as the sample set for data fusion. For each sample $a_n$ in B, the inverse $\beta_n$ of its Euclidean distance to $a_m$ is calculated:

$$\beta_n = \frac{1}{d_{nm}}$$

where $d_{nm} = \|a_n - a_m\|_2$, $a_n \in B$. From $\beta_n$, the weighting coefficient $w_n$ for weighted fusion is constructed:

$$w_n = \frac{\beta_n}{\sum_{l=1}^{L} \beta_l}$$

Using the constructed weighting coefficients, the samples $a_n$ in B are fused by weighting to obtain the final positioning vector $\Phi$:

$$\Phi = \sum_{n=1}^{L} w_n a_n$$

The spatial position corresponding to $\Phi$ is the final positioning result.
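The clustering and fusion step can be sketched as follows. The text does not specify the dynamic clustering algorithm, so this example assumes a two-class k-means-style iteration whose first center starts at $a_m$ and whose second center starts at the sample farthest from $a_m$. It also adds a small constant `eps` to every distance, because the distance from $a_m$ to itself is zero and its inverse is otherwise undefined. Both choices, and the name `cluster_and_fuse`, are assumptions of the example.

```python
import numpy as np

def cluster_and_fuse(A, f0, eps=1e-6, n_iter=20):
    """A: rows a_n = [f_n, x_hat_n..., E_n]. Returns (phi, B)."""
    A = np.asarray(A, dtype=float)
    # Main clustering center: the sample whose frequency is closest to f0.
    main = int(np.argmin(np.abs(A[:, 0] - f0)))
    a_m = A[main]
    # Second initial center: the sample farthest from a_m (an assumption).
    far = int(np.argmax(np.linalg.norm(A - a_m, axis=1)))
    centres = np.vstack([a_m, A[far]])
    for _ in range(n_iter):
        dist = np.linalg.norm(A[:, None, :] - centres[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        labels[main] = 0  # a_m always stays in the main class
        new = np.vstack([
            A[labels == c].mean(axis=0) if np.any(labels == c) else centres[c]
            for c in (0, 1)
        ])
        if np.allclose(new, centres):
            break
        centres = new
    B = A[labels == 0]  # data fusion sample set
    # beta_n = 1 / d_nm; eps guards the zero distance of a_m to itself.
    beta = 1.0 / (np.linalg.norm(B - a_m, axis=1) + eps)
    w = beta / beta.sum()
    return w @ B, B
```

With rows laid out as $[f, \hat{x}, E]$, the fused position is `phi[1:-1]`. Note that a very small `eps` lets $a_m$ dominate the weights, so in practice `eps` would need to be chosen relative to the scale of the feature vectors.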
The invention also provides an adjustable-response time-frequency spectrum sound source localization system, which comprises:
the cross-correlation coefficient calculation module is used for calculating the cross-correlation coefficient of the vibration signals acquired by every two sensors in the sensor array;
the time-frequency transformation module is used for respectively carrying out time-frequency transformation on each cross-correlation coefficient to obtain a time-frequency spectrum of the cross-correlation coefficient;
the positioning space energy distribution determining module is used for obtaining positioning space energy distribution containing signal frequency information by utilizing the principle of delay summation beam forming according to the time-frequency spectrum of each cross-correlation coefficient;
The positioning space energy distribution determining module specifically includes: a positioning space energy distribution determining submodule, configured to obtain, from each time-frequency transformed cross-correlation coefficient and by the delay-and-sum beamforming principle, the positioning space energy distribution containing signal frequency information as follows:

$$P(x, f) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \mathrm{ST}_{R_{i,j}}(\tau(x), f) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \int s_i(t)\, \mathrm{ST}_{s_j}(t + \tau(x), f)\, dt$$

where $P(x, f)$ denotes the positioning space energy distribution; $\tau(x)$ denotes the time difference between the spatial position $x$ and the i-th and j-th sensors; $f$ denotes the vibration signal frequency; $\mathrm{ST}_{R_{i,j}}(\tau(x), f)$ denotes the value of the time-frequency transform of the (i, j)-th cross-correlation coefficient $R_{i,j}$; $s_i(t)$ denotes the vibration signal collected by the i-th sensor; $\mathrm{ST}_{s_j}(t + \tau(x), f)$ denotes the time-frequency transform of the vibration signal $s_j(t + \tau(x))$ collected by the j-th sensor; $t$ denotes the time at which the i-th sensor acquired the vibration signal; and $N$ denotes the number of sensors in the sensor array.
And the positioning feature matrix construction module is used for determining a positioning feature vector corresponding to the maximum value of the energy distribution of the positioning space of the vibration signals with different frequencies in the frequency constraint range by adopting a constraint maximum likelihood estimation mode, and constructing a positioning feature matrix.
The positioning feature matrix construction module specifically includes: a positioning feature matrix construction submodule, configured to solve, by constrained maximum likelihood estimation, the formula

$$\hat{x}_m = \arg\max_{x} P(x, f_m), \quad \text{s.t. } f_m \in [f_{\min}, f_{\max}]$$

to obtain the positioning feature matrix $A = [a_1, a_2, \ldots, a_m, \ldots, a_M]$;
where $\hat{x}_m$ denotes the position vector of the maximum of the positioning space energy distribution when the vibration signal frequency is $f_m$; $P(x, f_m)$ denotes the positioning space energy distribution when the vibration signal frequency is $f_m$; $a_m = [f_m, \hat{x}_m^{T}, E_m]^{T}$ denotes the positioning feature vector corresponding to the vibration signal frequency $f_m$; $E_m = P(\hat{x}_m, f_m)$ denotes the spatially adjustable response spectral energy corresponding to $f_m$, that is, the energy at the position vector $\hat{x}_m$ in the positioning space energy distribution when the vibration signal frequency is $f_m$; $[f_{\min}, f_{\max}]$ denotes the frequency band range; $f_{\min}$ denotes the minimum effective frequency; and $f_{\max}$ denotes the maximum effective frequency.
And the sound source positioning module is used for clustering and weighting the positioning characteristic vectors in the positioning characteristic matrix to obtain a sound source positioning result.
The sound source positioning module specifically includes: a main clustering center determining submodule, configured to take the positioning feature vector in the positioning feature matrix whose vibration signal frequency is closest to the main frequency as the main clustering center; a data fusion sample set acquisition submodule, configured to perform two-class dynamic clustering on the positioning feature vectors in the positioning feature matrix using the main clustering center, and to take the class in which the main clustering center is located as the data fusion sample set; and a weighted fusion submodule, configured to perform weighted fusion on the positioning feature vectors in the data fusion sample set using the formula

$$\Phi = \sum_{n=1}^{L} w_n a_n, \qquad w_n = \frac{\beta_n}{\sum_{l=1}^{L} \beta_l}$$

to obtain the sound source localization result; where $\Phi$ denotes the sound source localization result; $w_n$ denotes the weight of the n-th positioning feature vector $a_n$ in the data fusion sample set; $\beta_n$ denotes the inverse of the Euclidean distance between the n-th positioning feature vector in the data fusion sample set and the main clustering center; $\beta_l$ denotes the same quantity for the l-th positioning feature vector; and $L$ denotes the number of positioning feature vectors in the data fusion sample set.
The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the others, and for the same or similar parts the embodiments may be referred to one another.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help in understanding the method and the core concept of the present invention. A person skilled in the art may, following the idea of the present invention, make changes to the specific embodiments and to the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.