AU2020287677A1

AU2020287677A1 - Sound source position estimation method, readable storage medium and computer device

Info

Publication number: AU2020287677A1
Application number: AU2020287677A
Authority: AU
Inventors: Xinming JIA; Xianbin SUN; Zhen Wang; Yi Zheng
Original assignee: Qingdao University of Technology; Institute of Oceanographic Instrumentation Shandong Academy of Sciences
Current assignee: Qingdao University of Technology; Institute of Oceanographic Instrumentation Shandong Academy of Sciences
Priority date: 2019-06-06
Filing date: 2020-05-13
Publication date: 2021-03-04
Anticipated expiration: 2040-05-13
Also published as: WO2020244359A1; AU2020287677B2; CN110146846A; CN110146846B

Abstract

Disclosed is a sound source position estimation method. The method comprises: receiving, by a single vector hydrophone, multichannel signals sent by a sound source in the sea; fusing the received multichannel signals into an instantaneous single-channel sound intensity signal by means of combined sliding of a fixed window and a dynamic window, and dividing the instantaneous single-channel sound intensity signal into signal segments containing a sufficient amount of information, so that the data quantity is reduced and the operating speed is improved while the information amount is guaranteed; performing self-supplementation on the signal by means of an expectation maximization algorithm, and expanding distances between the signal segments, such that the resolution between the signal segments is improved and information lost during interception in the previous step is supplemented to a certain extent; and estimating the position of the sound source by means of a recurrent neural network using the self-supplemented equal-length signals. The method needs only a single vector hydrophone to collect signals, which reduces deployment difficulty and usage costs and expands the application range. Further disclosed are a readable storage medium and a computer device.

Description

SOUND SOURCE POSITION ESTIMATION METHOD, READABLE STORAGE MEDIUM, AND COMPUTER DEVICE

BACKGROUND

Technical Field

The present disclosure relates to the field of sound source position estimation technologies, and in particular, to a sound source position estimation method, a readable storage medium, and a computer device.

Related Art

The description in this section merely provides background information related to the present disclosure, and does not necessarily constitute the prior art.

With the rapid and continuous growth of China's economy, the population also continues to increase, and people's demand for resources and expectations for consumption levels have risen greatly. People are developing and utilizing the limited resources on land to the greatest extent while also facing resource shortages. Therefore, new fields and new resources need to be researched and developed on the basis of maximizing the use of existing resources. The ocean accounts for more than 70% of the surface area of the earth, so developing and utilizing marine resources is crucial. In recent years, countries around the world have competed fiercely for marine resources, and how to develop and make full use of them has become a major issue attracting much attention. China is in a unique position, with a total national territorial area of approximately 9.6 million square kilometers, of which the ocean area accounts for approximately 3% of the total area, and most of the ocean waters are shallow sea waters. Therefore, technical research on shallow sea waters is of great significance for human survival and development.

Compared with the deep sea environment, the temporal and spatial variability and uncertainty of the shallow sea environment have a more severe impact on signal propagation. In addition, both reflected signals from the shallow sea floor and human activities in the shallow sea cause aliasing of a target signal, further affecting sound source position estimation. Therefore, sound source target estimation in the shallow sea environment remains a research difficulty in this field. Compared with a conventional sound pressure hydrophone, a vector hydrophone can acquire a sound pressure signal and triaxial acoustic particle velocity signals in orthogonal directions at a common point, and is of great practical value.

The inventor of the present disclosure finds during research that: currently, a single vector hydrophone is mostly used to estimate an azimuth direction of arrival (DOA) and a pitch DOA of a target, and to determine a position distance of the target, cross-estimation needs to be performed by using a vector hydrophone array due to an inherent defect of the single vector hydrophone, namely, insufficient distance resolution. Moreover, in actual engineering application of the single vector hydrophone, on the one hand, it is difficult for an actual parameter to meet a requirement of an ideal electroacoustic parameter characteristic due to limitations of process conditions, restricting the accuracy of position estimation of the single vector hydrophone; on the other hand, the single vector hydrophone is easily influenced by the environment to cause uncertain changes in attitude, so that acquisition of a real position of the target is further influenced. Such reasons result in insufficient application of the single vector hydrophone in target position estimation.

SUMMARY

To overcome the shortcomings in the related art, the present disclosure provides a sound source position estimation method, a readable storage medium, and a computer device. Compared with a conventional sound source estimation model in which a complex vector hydrophone array needs to be deployed to receive a signal, in the sound source estimation method, only a single vector hydrophone is required to acquire a signal, thereby reducing deployment difficulty and use costs, and expanding the application scope.

To achieve the foregoing objective, the present disclosure uses the following technical solutions:

According to a first aspect, the present disclosure provides a sound source position estimation method.

The sound source position estimation method includes the following steps:

receiving, by a single vector hydrophone, multichannel signals from a sound source in the ocean;

fusing the received multichannel signals into an instantaneous single-channel sound intensity signal through fixed-dynamic window combined sliding, and dividing the received multichannel signals into signal segments including a sufficient information amount, to reduce a data amount and improve an operation speed while ensuring the information amount;

performing signal self-supplement by using the expectation maximization algorithm, and increasing a distance between the signal segments, to improve a resolution between the signal segments and supplement information lost in the previous step of capture to some extent;

estimating, through a recurrent neural network, a position of the sound source by using equal-length signals obtained after the self-supplement.

In some possible implementations, the multichannel signals are four-channel signals, including three acoustic particle velocity signals in orthogonal directions: an acoustic particle velocity $v_x$ in an x-axis direction, an acoustic particle velocity $v_y$ in a y-axis direction, and an acoustic particle velocity $v_z$ in a z-axis direction, and a scalar sound pressure signal $p$.

As a further limitation, the multichannel signals are fused into the instantaneous single-channel sound intensity signal by using a fixed window, dynamic windows of all lengths are traversed, a steepest ascent section of a Shannon entropy is found, and an optimum dynamic window is determined; the instantaneous single-channel sound intensity signal in the fixed window is then dynamically captured as unequal-length signals based on the Shannon entropy by using the optimum dynamic window, and signal self-supplement is performed on the captured unequal-length signals by using the expectation maximization algorithm.

As a still further limitation, the dividing the received multichannel signals into signal segments including a sufficient information amount through fixed-dynamic window combined sliding specifically includes the following steps:

401: specifying a fixed window length $l_f$ and an initial window start point $t_{f_1}$ for the acquired four-channel signals $p$, $v_x$, $v_y$, and $v_z$;

402: fusing, by using a fixed window $W_{f_i}$ whose window length and start point are respectively $l_f$ and $t_{f_i}$, four-channel information in the window, to obtain an instantaneous single-channel sound intensity signal $I_{W_{f_i}}$ whose length is $l_f$;

403: capturing a dynamic window $W_{d_i}$, whose window length and start point are respectively $l_{s_i}$ and $t_{f_i}$, in the instantaneous single-channel sound intensity signal $I_{W_{f_i}}$, where the length of the signal in the dynamic window $W_{d_i}$ is as short as possible provided that the signal includes a sufficient information amount; and

404: returning to 402, updating a start point $t_{f_{i+1}}$ of a fixed window $W_{f_{i+1}}$ based on a signal overlap rate $q$, and performing a cyclic operation.
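The cyclic operation of steps 401 to 404 can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `fixed_window_starts`, `segment`, and `pick_length` are hypothetical, the rule "next start = previous start + fixed window length × (1 − q)" is one plausible reading of the overlap rate q and is not stated in the text, and the per-window fusion of step 402 is replaced here by an already fused single-channel signal for brevity.

```python
import numpy as np

def fixed_window_starts(n_samples, l_f, q):
    """Yield start indices of successive fixed windows of length l_f.

    Assumption: with overlap rate q in [0, 1), consecutive windows are
    shifted by round(l_f * (1 - q)) samples.
    """
    if not 0.0 <= q < 1.0:
        raise ValueError("q must lie in [0, 1)")
    hop = max(1, int(round(l_f * (1.0 - q))))
    start = 0
    while start + l_f <= n_samples:
        yield start
        start += hop

def segment(signal, l_f, q, pick_length):
    """Loop over fixed windows and cut a shorter dynamic window from each."""
    segments = []
    for t_f in fixed_window_starts(len(signal), l_f, q):
        window = signal[t_f:t_f + l_f]   # fixed window (step 402, fusion omitted)
        l_s = pick_length(window)        # dynamic window length (step 403)
        segments.append(window[:l_s])    # same start point as the fixed window
    return segments

signal = np.arange(20.0)
starts = list(fixed_window_starts(len(signal), l_f=8, q=0.5))
segs = segment(signal, l_f=8, q=0.5, pick_length=lambda w: 3)
```

In practice `pick_length` would be the entropy-based rule of step 403; a constant is used here only to keep the sketch short.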

As a still further limitation, in step 402, synchronous sliding is performed in channel signals through a time window of a fixed size, signals are extracted, and the information is fused into the instantaneous single-channel sound intensity signal by using a cross-spectrum method; and the step specifically includes the following steps:

501: specifying a fixed window length $l_f$ and a window start point $t_{f_i}$ according to a signal fusion degree;

502: capturing signal segments of a window size $l_f$ by using the same start point $t_{f_i}$ in the signal channels of the sound pressure $p$ and the vibration velocities $v_x$, $v_y$, and $v_z$ in the axial directions, where corresponding window signals are $W_{p_i}$, $W_{v_{x_i}}$, $W_{v_{y_i}}$, and $W_{v_{z_i}}$; and

503: calculating, based on the cross-spectrum method, the instantaneous single-channel sound intensity signal $I_{W_{f_i}}$ obtained after the window signals are fused, to implement multi-sensor information fusion, where a calculation formula of the instantaneous single-channel sound intensity signal obtained after the fusion is:

$$I = p(t)\,v_x(t)\,\frac{\mathrm{Re}[S_{pv_x}]}{\mathrm{Re}[S_{p^2}(f)]} + p(t)\,v_y(t)\,\frac{\mathrm{Re}[S_{pv_y}]}{\mathrm{Re}[S_{p^2}(f)]} + p(t)\,v_z(t)\,\frac{\mathrm{Re}[S_{pv_z}]}{\mathrm{Re}[S_{p^2}(f)]}$$

where $S_{pv_x}$, $S_{pv_y}$, and $S_{pv_z}$ are respectively cross-spectrum functions of three components x, y, and z, $S_{p^2}(f)$ is a spectrum function of $p^2$, $f$ is a frequency, Re[ ] indicates Laplace transform, $\theta$ and $\varphi$ are respectively a pitch direction of arrival (DOA) and an azimuth DOA of a sound source relative to a vector hydrophone, where the pitch DOA and the azimuth DOA respectively take the XOY plane and the x-axis as 0°, and $p(t)$, $v_x(t)$, $v_y(t)$, and $v_z(t)$ are respectively a sound pressure signal and acoustic particle velocity signals in different directions that are received by the vector hydrophone at a moment $t$.
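A minimal sketch of the fusion in step 503 is given below, with several assumptions that are not in the text: `fuse_intensity` is a hypothetical name, the cross-spectra are computed with an FFT and summed over all frequency bins to obtain one scalar weight per axis, and Re[·] is interpreted as the real part of the cross-spectrum. For a noise-free plane wave with velocity channels proportional to the pressure, the three weights reduce to the direction cosines and the fused signal equals the squared pressure.

```python
import numpy as np

def fuse_intensity(p, vx, vy, vz):
    """Fuse four channels into one instantaneous intensity signal.

    Assumption: each cross-spectrum is reduced to a scalar by summing its
    real part over all FFT bins of the window.
    """
    P = np.fft.rfft(p)
    auto = np.sum((P * np.conj(P)).real)            # Re[S_{p^2}] summed over f
    fused = np.zeros_like(p, dtype=float)
    for v in (vx, vy, vz):
        V = np.fft.rfft(v)
        cross = np.sum((P * np.conj(V)).real)       # Re[S_{pv}] summed over f
        fused += p * v * cross / auto               # weighted p(t) * v(t)
    return fused

# Synthetic plane wave: pitch 20 degrees, azimuth 35 degrees (illustrative values).
fs, n = 1000.0, 512
t = np.arange(n) / fs
theta, phi = np.deg2rad(20.0), np.deg2rad(35.0)
p = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)
vx = p * np.cos(theta) * np.cos(phi)
vy = p * np.cos(theta) * np.sin(phi)
vz = p * np.sin(theta)
fused = fuse_intensity(p, vx, vy, vz)
```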

As a still further limitation, in step 403, the dynamic windows of all lengths are traversed for the instantaneous single-channel sound intensity signal $I_{W_{f_i}}$ in the fixed window, and the steepest ascent section of the Shannon entropy is found as the optimum dynamic window $W_{d_i}$; and the step specifically includes the following steps:

601: traversing the entire fixed window $W_{f_i}$ of the captured instantaneous single-channel sound intensity signal $I_{W_{f_i}}$ starting from the start point $t_{f_i}$ of the fixed window, and calculating a Shannon entropy of signals of all lengths by using the following formula, to construct a Shannon entropy signal $S_{I_i}$:

$$\mathrm{Shannon}(X) = \sum_{i=1}^{m} p(x_i)\log\frac{1}{p(x_i)} = -\sum_{i=1}^{m} p(x_i)\log p(x_i)$$

where $x_i$ is a possible value of a random event $X$, $\mathrm{Shannon}(X)$ is a Shannon entropy included in the random event $X$, $m$ is a total quantity of random events, and $p(x_i)$ is a probability of occurrence of $x_i$;

602: finding a steepest ascent section of $S_{I_i}$ according to a derivation result $S'_{I_i}$ of $S_{I_i}$, marking the length as $l_{s_i}$, and jumping to step 604;

603: marking the length as $l_{s_i}$ if no steepest ascent section is found in $S_{I_i}$, and considering that the signal in the fixed window is an invalid signal or a noise signal, where the following two conditions are satisfied:

when $S_{I_i}$ is relatively small, it is considered that the signal is a null signal or Shannon entropy content of the signal in the fixed window is insufficient, and $l_{s_i} = l_0$, where $l_0$ is a preset minimum captured length; and

when $S_{I_i}$ is relatively large, it is considered that the signal is a noise signal or a valid signal including a relatively high Shannon entropy, and $l_{s_i} = l_1$, where $l_1$ is a preset maximum captured length;

604: capturing a signal segment whose length is $l_{s_i}$ starting from the start point $t_{f_i}$ in the fixed window $W_{f_i}$ as the dynamic window $W_{d_i}$, and marking an end time of the window.
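Steps 601 to 604 can be illustrated with the following sketch. It is an assumption-laden reading, not the disclosed implementation: probabilities are estimated from a 16-bin amplitude histogram, the "steepest ascent" is taken as the largest one-sample increase of the entropy curve, and `low_entropy` is a hypothetical threshold separating the "relatively small" and "relatively large" cases of step 603. The names `shannon_entropy` and `dynamic_length` are hypothetical.

```python
import numpy as np

def shannon_entropy(samples, edges):
    """Shannon entropy (bits) of samples, with probabilities from a histogram."""
    counts, _ = np.histogram(samples, bins=edges)
    prob = counts[counts > 0] / counts.sum()
    return float(-np.sum(prob * np.log2(prob)))

def dynamic_length(window, l_0, l_1, n_bins=16, low_entropy=1.0):
    """Pick the dynamic window length from the entropy of growing prefixes."""
    window = np.asarray(window, dtype=float)
    if window.max() == window.min():
        return l_0                                   # constant (null) signal
    edges = np.linspace(window.min(), window.max(), n_bins + 1)
    lengths = np.arange(l_0, l_1 + 1)
    # Step 601: entropy of every prefix length forms the entropy signal.
    entropy = np.array([shannon_entropy(window[:n], edges) for n in lengths])
    slope = np.diff(entropy)                         # step 602: derivative
    if slope.size == 0 or slope.max() <= 0.0:
        # Step 603: no ascent found, fall back to the preset lengths.
        return l_0 if entropy[-1] < low_entropy else l_1
    return int(lengths[np.argmax(slope) + 1])        # length just after the steepest rise

# 40 silent samples followed by a rising ramp (illustrative data).
window = np.concatenate([np.zeros(40), np.linspace(0.1, 1.0, 24)])
l_s = dynamic_length(window, l_0=8, l_1=64)
```

With this test signal the entropy stays at zero while the prefix is silent and rises fastest at the first non-silent sample, so the selected length is one sample past the silent part.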

As a still further limitation, signal self-supplement is performed by using the expectation maximization algorithm, segmented unequal-length signals are equivalent to observation data $X$, supplemented equal-length signals are equivalent to complete data $Y$, a supplementary signal is equivalent to unobserved data $Z$, and a result is iterated by using the expectation maximization algorithm to obtain a maximum value $\theta^*$ of a parameter $\theta$, that is, when a maximum value of a maximum likelihood function $L(\theta)$ based on $Y$ is obtained, optimal solutions of a mean $\mu$ and a variance $\sigma^2$ of a complete dataset are obtained, and an unknown dataset $Z$ is obtained based on an observed dataset $X$, to supplement the complete dataset $Y$, which specifically includes the following steps:

701: assuming that an iteration quantity $t = 0$, initializing a parameter vector $\theta^{(0)}$, $\theta$ being a parameter vector formed by the mean and the variance of the dataset $Y$, and calculating an initial maximum likelihood function

$$L(\theta) = L(x^{(1)}, \ldots, x^{(n)}; \theta) = p(x^{(1)};\theta)\cdots p(x^{(n)};\theta) = \prod_{i=1}^{n} p(x^{(i)};\theta);$$

702: obtaining $Q_i^{(t)}(z^{(i)})$ from $\theta^{(t)}$, and ensuring that when $\theta^{(t)}$ is specified, the equal sign in $\ln(E(X)) \ge E[\ln(X)]$ is satisfied, to establish a lower bound of $L(\theta^{(t)})$:

$$Q_i^{(t)}(z^{(i)}) = p(z^{(i)} \mid x^{(i)}; \theta^{(t)});$$

703: fixing $Q_i^{(t)}(z^{(i)})$ and taking $\theta$ as a variable, taking a derivative of $L(\theta^{(t)})$ in step 702, and obtaining $\theta^{(t+1)}$ according to the formula

$$L(\theta^{(t+1)}) \ge \sum_{i}\sum_{z^{(i)}} Q_i^{(t)}(z^{(i)}) \ln\frac{p(x^{(i)}, z^{(i)}; \theta^{(t+1)})}{Q_i^{(t)}(z^{(i)})} \ge \sum_{i}\sum_{z^{(i)}} Q_i^{(t)}(z^{(i)}) \ln\frac{p(x^{(i)}, z^{(i)}; \theta^{(t)})}{Q_i^{(t)}(z^{(i)})} = L(\theta^{(t)});$$ and

704: if the convergence condition set by the threshold $\delta$ is satisfied, ending the iterative calculation; otherwise, assuming that $t = t + 1$, and returning to step 702, where the threshold $\delta$ is a specified very small value;

where $Q_i$ represents a distribution of unknown data $Z$, $p(x^{(i)}, z^{(i)}; \theta^{(t)})$ is a probability of occurrence of $(x^{(i)}, z^{(i)})$ under a condition $\theta^{(t)}$, the superscript $i$ indicates an $i$-th value of a corresponding parameter, $\delta$ is the threshold and is an initially specified small value used as a standard for ending the iteration, and $E[\,]$ is a mathematical expectation.
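To make the self-supplement idea concrete, the sketch below pads one short segment to a target length with a textbook EM iteration. It is a simplified stand-in rather than the disclosed procedure: the samples are assumed to be i.i.d. Gaussian with parameters (mean, variance), the stopping rule compares successive parameter values with a tolerance `tol`, the missing samples are filled with their conditional expectation (the estimated mean), and `em_gaussian_fill` is a hypothetical name.

```python
import numpy as np

def em_gaussian_fill(x, target_len, tol=1e-9, max_iter=200):
    """Pad x to target_len using EM under an i.i.d. Gaussian assumption."""
    x = np.asarray(x, dtype=float)
    n_missing = target_len - x.size
    if n_missing < 0:
        raise ValueError("target_len must be at least len(x)")
    mu, var = 0.0, 1.0                         # initial parameter vector
    for _ in range(max_iter):
        # E-step: expected sufficient statistics of the unobserved samples Z
        # given the current (mu, var).
        s1 = x.sum() + n_missing * mu
        s2 = np.sum(x ** 2) + n_missing * (mu ** 2 + var)
        # M-step: maximize the expected complete-data log-likelihood.
        new_mu = s1 / target_len
        new_var = max(s2 / target_len - new_mu ** 2, 1e-12)
        converged = abs(new_mu - mu) + abs(new_var - var) < tol
        mu, var = new_mu, new_var
        if converged:
            break
    # Supplement the segment with the expected value of the missing samples.
    filled = np.concatenate([x, np.full(n_missing, mu)])
    return filled, mu, var

short_segment = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
filled, mu_hat, var_hat = em_gaussian_fill(short_segment, target_len=10)
```

Under this simple model the iteration converges to the mean and variance of the observed samples, which gives a quick sanity check on the update equations.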

As a still further limitation, the estimating, through a recurrent neural network, a position of the sound source by using equal-length signals obtained after the self-supplement specifically includes: using the signal segments supplemented by the expectation maximization algorithm as an input, outputting azimuth DOAs and distances of the sound source in different signal segments, and performing cross-validation on estimation results of the different signal segments, to implement accurate positioning of the position of the sound source.
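The data flow of this step can be sketched with a tiny Elman-style recurrent network. Everything concrete here is an assumption for illustration: the text does not specify the network architecture, the weights below are random and untrained (so the outputs carry no physical meaning), `rnn_forward` is a hypothetical name, and the median across segments is only a simple stand-in for the cross-validation of segment-wise estimates.

```python
import numpy as np

def rnn_forward(segment, w_in, w_rec, w_out, b_h, b_out):
    """Run one equal-length segment through a single-layer recurrent cell."""
    h = np.zeros(w_rec.shape[0])
    for sample in segment:
        # Hidden state update from the current sample and the previous state.
        h = np.tanh(w_in * sample + w_rec @ h + b_h)
    return w_out @ h + b_out                   # [azimuth, distance] (untrained)

rng = np.random.default_rng(0)
hidden = 8
params = dict(
    w_in=rng.normal(scale=0.5, size=hidden),
    w_rec=rng.normal(scale=0.5, size=(hidden, hidden)),
    w_out=rng.normal(scale=0.5, size=(2, hidden)),
    b_h=np.zeros(hidden),
    b_out=np.zeros(2),
)

# Five supplemented segments of equal length 32 (random placeholder data).
segments = rng.normal(size=(5, 32))
estimates = np.array([rnn_forward(s, **params) for s in segments])
fused = np.median(estimates, axis=0)           # combine segment-wise estimates
```

Equal segment lengths are what allow the segments to be stacked into one array and processed by the same network.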

According to a second aspect, the present disclosure provides a computer-readable storage medium, storing a computer program, where when the program is executed by a processor, steps in the sound source position estimation method of the present disclosure are implemented.

According to a third aspect, the present disclosure provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of being run on the processor, where when the processor executes the program, steps in the sound source position estimation method of the present disclosure are implemented.

Compared with the prior art, the present disclosure has the following beneficial effects:

1. Compared with the conventional sound source estimation model, the sound source position estimation method of the present disclosure avoids a problem of deploying a complex vector hydrophone array to receive a signal. In the sound source estimation method of this application, only a single vector hydrophone is required to acquire a signal, thereby reducing deployment difficulty and use costs, and expanding the application scope.

2. In the sound source position estimation method of the present disclosure, a short-duration signal sample is divided into a large quantity of signal segments through dynamic window-fixed window combined sliding. Through mutual validation of the signal segments, the accuracy and stability of position estimation are improved. The data amount is reduced and the operation speed is improved while ensuring the information amount.

3. In the sound source position estimation method of the present disclosure, large samples are required to train a network only in an early stage, and complex operations are not required during use, thereby implementing real-time tracking of high-speed and highly maneuverable target trajectories.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a relationship between a single vector hydrophone and a sound source position according to Embodiment 1 of the present disclosure.

FIG. 2 is a flowchart of a sound source position estimation method according to Embodiment 1 of the present disclosure.

FIG. 3 is a flowchart of fixed-dynamic window combined sliding according to Embodiment 1 of the present disclosure.

FIG. 4 is a curve graph of an estimation result of a position of an ultra-low frequency sound source according to Embodiment 1 of the present disclosure.

DETAILED DESCRIPTION

It should be noted that the following detailed descriptions are all exemplary and are intended to provide a further description of the present disclosure. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which the present disclosure belongs.

It should be noted that the terms used herein are merely used for describing specific implementations, and are not intended to limit exemplary implementations of the present disclosure. As used herein, the singular form is intended to include the plural form, unless the context clearly indicates otherwise. In addition, it should further be understood that terms "comprise" and/or "include" used in this specification indicate that there are features, steps, operations, devices, components, and/or combinations thereof.

Embodiment 1:

As shown in FIG. 1 and FIG. 2, Embodiment 1 of the present disclosure provides a sound source position estimation method, including the following steps:

receiving, by a single vector hydrophone, multichannel signals from a sound source in the ocean, where the multichannel signals are four-channel signals, including three acoustic particle velocity signals in orthogonal directions: an acoustic particle velocity $v_x$ in an x-axis direction, an acoustic particle velocity $v_y$ in a y-axis direction, and an acoustic particle velocity $v_z$ in a z-axis direction, and a scalar sound pressure signal $p$;

The multichannel signals are fused into the instantaneous single-channel sound intensity signal by using the fixed window, dynamic windows of all lengths are traversed, a steepest ascent section of a Shannon entropy is found, and an optimum dynamic window is determined; the instantaneous single-channel sound intensity signal in the fixed window is then dynamically captured as unequal-length signals based on the Shannon entropy by using the optimum dynamic window, and signal self-supplement is performed on the captured unequal-length signals by using the expectation maximization algorithm.

As shown in FIG. 3, the dividing the received multichannel signals into signal segments including a sufficient information amount through fixed-dynamic window combined sliding specifically includes the following steps:

301: specifying a fixed window length $l_f$ and an initial window start point $t_{f_1}$ for the acquired four-channel signals $p$, $v_x$, $v_y$, and $v_z$;

302: fusing, by using a fixed window $W_{f_i}$ whose window length and start point are respectively $l_f$ and $t_{f_i}$, four-channel information in the window, to obtain an instantaneous single-channel sound intensity signal $I_{W_{f_i}}$ whose length is $l_f$;

303: capturing a dynamic window $W_{d_i}$, whose window length and start point are respectively $l_{s_i}$ and $t_{f_i}$, in the instantaneous single-channel sound intensity signal $I_{W_{f_i}}$, where the length of the signal in the dynamic window $W_{d_i}$ is as short as possible provided that the signal includes a sufficient information amount; and

304: returning to 302, updating a start point $t_{f_{i+1}}$ of a fixed window $W_{f_{i+1}}$ based on a signal overlap rate $q$, and performing a cyclic operation.

In step 302, synchronous sliding is performed in channel signals through a time window of a fixed size, signals are extracted, and the information is fused into the instantaneous single-channel sound intensity signal by using a cross-spectrum method; and the step specifically includes the following steps:

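401: specifying a fixed window length $l_f$ and a window start point $t_{f_i}$ according to a signal fusion degree;

402: capturing signal segments of a window size $l_f$ by using the same start point $t_{f_i}$ in the signal channels of the sound pressure $p$ and the vibration velocities $v_x$, $v_y$, and $v_z$ in the axial directions, where corresponding window signals are $W_{p_i}$, $W_{v_{x_i}}$, $W_{v_{y_i}}$, and $W_{v_{z_i}}$; and

403: calculating, based on the cross-spectrum method, the instantaneous single-channel sound intensity signal $I_{W_{f_i}}$ obtained after the window signals are fused, to implement multi-sensor information fusion, where a calculation formula of the instantaneous single-channel sound intensity signal obtained after the fusion is:

$$I = p(t)\,v_x(t)\,\frac{\mathrm{Re}[S_{pv_x}]}{\mathrm{Re}[S_{p^2}(f)]} + p(t)\,v_y(t)\,\frac{\mathrm{Re}[S_{pv_y}]}{\mathrm{Re}[S_{p^2}(f)]} + p(t)\,v_z(t)\,\frac{\mathrm{Re}[S_{pv_z}]}{\mathrm{Re}[S_{p^2}(f)]} \tag{1}$$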

A specific derivation process of the instantaneous single-channel sound intensity signal is:

Assuming that a sound signal $p$ propagates in an isotropic noise field, a vector hydrophone $Q$ receives the signal, and outputs of the vector hydrophone have the following relationships:

Sound pressure:

$$p(t) = p_s(t) + p_n(t) \tag{2}$$

Acoustic particle velocity component x:

$$v_x(t) = v_{xs}(t) + v_{xn}(t) \tag{3}$$

Acoustic particle velocity component y:

$$v_y(t) = v_{ys}(t) + v_{yn}(t) \tag{4}$$

Acoustic particle velocity component z:

$$v_z(t) = v_{zs}(t) + v_{zn}(t) \tag{5}$$

In the foregoing formulas, subscripts "s" and "n" respectively represent signal and noise. If noise sources are independent of each other and a mean is 0, a sound intensity in the x direction is:

$$I_x = p(t)\,v_x(t) = [p_s(t) + p_n(t)]\,[v_{xs}(t) + v_{xn}(t)] = p_s(t)\,v_{xs}(t) + p_s(t)\,v_{xn}(t) + p_n(t)\,v_{xs}(t) + p_n(t)\,v_{xn}(t) = p_s(t)\,v_{xs}(t) \tag{6}$$

Similarly:

$$I_y = p_s(t)\,v_{ys}(t), \qquad I_z = p_s(t)\,v_{zs}(t) \tag{7}$$

It can be learned from the foregoing formula that the sound intensity obtained from the outputs $p$, $v_x$, $v_y$, and $v_z$ of the vector hydrophone does not include noise energy, that is, the vector hydrophone can resist isotropic noise.
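The noise-rejection argument can be checked numerically. The sketch below assumes that the intensity is evaluated as a time average over many samples (the cross terms only vanish on average), and uses illustrative values: a 5 Hz tone, a velocity channel equal to 0.8 times the pressure, and unit-variance Gaussian noise on both channels.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
t = np.arange(n) / 1000.0                     # 200 s sampled at 1 kHz

p_s = np.sin(2 * np.pi * 5 * t)               # signal part of the pressure
v_xs = 0.8 * p_s                              # signal part of the x velocity
p_n = rng.normal(scale=1.0, size=n)           # independent zero-mean noise
v_xn = rng.normal(scale=1.0, size=n)

# Time-averaged intensity from the noisy outputs versus the signal-only value.
intensity_noisy = np.mean((p_s + p_n) * (v_xs + v_xn))
intensity_clean = np.mean(p_s * v_xs)
```

Although each channel has a signal-to-noise ratio below 0 dB here, the two averages agree closely because the three cross terms involving noise average out.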

At a moment $t$, a sound pressure signal and acoustic particle velocity signals in different directions that are received by the vector hydrophone are respectively $p(t)$, $v_x(t)$, $v_y(t)$, and $v_z(t)$.

An approximate spatial position of the target is estimated by using the cross-spectrum method. First, a cross-correlation operation is performed on the sound pressure P and each of the acoustic particle velocity components, to obtain the following cross-correlation functions:

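$$R_{pv_x}(\tau) = \int_0^T p(t)\,v_x(t-\tau)\,dt = \int_0^T p(t)\,p(t-\tau)\cos\theta\cos\varphi\,dt$$

$$R_{pv_y}(\tau) = \int_0^T p(t)\,v_y(t-\tau)\,dt = \int_0^T p(t)\,p(t-\tau)\cos\theta\sin\varphi\,dt$$

$$R_{pv_z}(\tau) = \int_0^T p(t)\,v_z(t-\tau)\,dt = \int_0^T p(t)\,p(t-\tau)\sin\theta\,dt \tag{8}$$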

Then, Fourier transform is performed on the foregoing cross-correlation functions, to obtain cross-spectrum functions of the cross-correlation functions:

$$S_{pv_x}(f) = S_{p^2}(f)\cos\theta\cos\varphi$$

$$S_{pv_y}(f) = S_{p^2}(f)\cos\theta\sin\varphi$$

$$S_{pv_z}(f) = S_{p^2}(f)\sin\theta \tag{9}$$

Herein, $S_{p^2}(f)$ is a spectrum function of $p^2$, $f$ is a frequency, Re[ ] indicates Laplace transform, and $\theta$ and $\varphi$ are respectively a pitch direction of arrival (DOA) and an azimuth DOA of a sound source relative to a vector hydrophone, where the pitch DOA and the azimuth DOA respectively take the XOY plane and the x-axis as 0°.

In this case, the azimuth DOA and pitch DOA of the target are:

$$\varphi = \arctan\frac{S_{pv_y}(f)}{S_{pv_x}(f)}$$

$$\theta = \arctan\frac{S_{pv_z}(f)}{\sqrt{S_{pv_x}^2(f) + S_{pv_y}^2(f)}} \tag{10}$$
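A rough numerical illustration of the DOA estimate follows. The assumptions are: cross-spectra are computed with an FFT and their real parts are summed over all bins (a broadband simplification), `arctan2` is used instead of a plain arctangent so that the quadrant is resolved, and `estimate_doa` is a hypothetical name. The test signal is a plane wave with pitch 20° and azimuth 35° plus independent noise on every channel.

```python
import numpy as np

def estimate_doa(p, vx, vy, vz):
    """Estimate azimuth and pitch (radians) from pressure-velocity cross-spectra."""
    P = np.fft.rfft(p)
    # Real part of each cross-spectrum, summed over frequency (assumption).
    s_x, s_y, s_z = (np.sum((P * np.conj(np.fft.rfft(v))).real)
                     for v in (vx, vy, vz))
    azimuth = np.arctan2(s_y, s_x)
    pitch = np.arctan2(s_z, np.hypot(s_x, s_y))
    return azimuth, pitch

rng = np.random.default_rng(3)
fs, n = 1000.0, 4096
t = np.arange(n) / fs
theta, phi = np.deg2rad(20.0), np.deg2rad(35.0)
p_s = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

def noise():
    return rng.normal(scale=0.1, size=n)

p = p_s + noise()
vx = p_s * np.cos(theta) * np.cos(phi) + noise()
vy = p_s * np.cos(theta) * np.sin(phi) + noise()
vz = p_s * np.sin(theta) + noise()
azimuth, pitch = estimate_doa(p, vx, vy, vz)
azimuth_deg, pitch_deg = np.rad2deg(azimuth), np.rad2deg(pitch)
```

Note that this yields only the two angles; as the text explains, the distance is obtained later through the recurrent neural network.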

The sound intensity of the vector hydrophone may be obtained with reference to formulas (6), (7), and (10):

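$$I = I_x\cos\theta\cos\varphi + I_y\cos\theta\sin\varphi + I_z\sin\theta$$

$$= p_s(t)\,v_{xs}(t)\cos\theta\cos\varphi + p_s(t)\,v_{ys}(t)\cos\theta\sin\varphi + p_s(t)\,v_{zs}(t)\sin\theta$$

$$= p(t)\,v_x(t)\,\frac{\mathrm{Re}[S_{pv_x}]}{\mathrm{Re}[S_{p^2}(f)]} + p(t)\,v_y(t)\,\frac{\mathrm{Re}[S_{pv_y}]}{\mathrm{Re}[S_{p^2}(f)]} + p(t)\,v_z(t)\,\frac{\mathrm{Re}[S_{pv_z}]}{\mathrm{Re}[S_{p^2}(f)]} \tag{11}$$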

In step 303, the dynamic windows of all lengths are traversed for the instantaneous single-channel sound intensity signal $I_{W_{f_i}}$ in the fixed window, and the steepest ascent section of the Shannon entropy is found as the optimum dynamic window $W_{d_i}$; and the step specifically includes the following steps:

601: traversing the entire fixed window $W_{f_i}$ of the captured instantaneous single-channel sound intensity signal $I_{W_{f_i}}$ starting from the start point $t_{f_i}$ of the fixed window, and calculating a Shannon entropy of signals of all lengths by using the following formula, to construct a Shannon entropy signal $S_{I_i}$:

$$\mathrm{Shannon}(X) = \sum_{i=1}^{m} p(x_i)\log\frac{1}{p(x_i)} = -\sum_{i=1}^{m} p(x_i)\log p(x_i) \tag{12}$$

where $x_i$ is a possible value of a random event $X$, $\mathrm{Shannon}(X)$ is a Shannon entropy included in the random event $X$, $m$ is a total quantity of random events, and $p(x_i)$ is a probability of occurrence of $x_i$;

602: finding a steepest ascent section of $S_{I_i}$ according to a derivation result $S'_{I_i}$ of $S_{I_i}$, marking the length as $l_{s_i}$, and jumping to step 604;

603: marking the length as $l_{s_i}$ if no steepest ascent section is found in $S_{I_i}$, and considering that the signal in the fixed window is an invalid signal or a noise signal, where the following two conditions are satisfied:

when $S_{I_i}$ is relatively small, it is considered that the signal is a null signal or Shannon entropy content of the signal in the fixed window is insufficient, and $l_{s_i} = l_0$, where $l_0$ is a preset minimum captured length; and

when $S_{I_i}$ is relatively large, it is considered that the signal is a noise signal or a valid signal including a relatively high Shannon entropy, and $l_{s_i} = l_1$, where $l_1$ is a preset maximum captured length;

604: capturing a signal segment whose length is $l_{s_i}$ starting from the start point $t_{f_i}$ in the fixed window $W_{f_i}$ as the dynamic window $W_{d_i}$, and marking an end time of the window.

To address the shortcoming that the signals obtained after fixed-dynamic window segmentation have different lengths, signal self-supplement is performed by using the expectation maximization algorithm (EM algorithm). The segmented unequal-length signals are equivalent to observation data $X$, the supplemented equal-length signals are equivalent to complete data $Y$, and the supplementary signal is equivalent to unobserved data $Z$. The equal lengths of the signals facilitate subsequent calculation and comparison.

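A result is iterated by using the expectation maximization algorithm to obtain a maximum value $\theta^*$ of a parameter $\theta$, that is, when a maximum value of a maximum likelihood function $L(\theta)$ based on $Y$ is obtained, optimal solutions of a mean $\mu$ and a variance $\sigma^2$ of a complete dataset are obtained, and an unknown dataset $Z$ is obtained based on an observed dataset $X$, to supplement the complete dataset $Y$, which specifically includes the following steps:

701: assuming that an iteration quantity $t = 0$, initializing a parameter vector $\theta^{(0)}$, $\theta$ being a parameter vector formed by the mean and the variance of the dataset $Y$, and calculating an initial maximum likelihood function

$$L(\theta) = L(x^{(1)}, \ldots, x^{(n)}; \theta) = p(x^{(1)};\theta)\cdots p(x^{(n)};\theta) = \prod_{i=1}^{n} p(x^{(i)};\theta);$$

702: obtaining $Q_i^{(t)}(z^{(i)})$ from $\theta^{(t)}$, and ensuring that when $\theta^{(t)}$ is specified, the equal sign in $\ln(E(X)) \ge E[\ln(X)]$ is satisfied, to establish a lower bound of $L(\theta^{(t)})$;

703: fixing $Q_i^{(t)}(z^{(i)})$ and taking $\theta$ as a variable, taking a derivative of $L(\theta^{(t)})$ in step 702, and obtaining $\theta^{(t+1)}$ according to the formula

$$L(\theta^{(t+1)}) \ge \sum_{i}\sum_{z^{(i)}} Q_i^{(t)}(z^{(i)}) \ln\frac{p(x^{(i)}, z^{(i)}; \theta^{(t+1)})}{Q_i^{(t)}(z^{(i)})} \ge \sum_{i}\sum_{z^{(i)}} Q_i^{(t)}(z^{(i)}) \ln\frac{p(x^{(i)}, z^{(i)}; \theta^{(t)})}{Q_i^{(t)}(z^{(i)})} = L(\theta^{(t)});$$ and

704: if the convergence condition set by the threshold $\delta$ is satisfied, ending the iterative calculation; otherwise, assuming that $t = t + 1$, and returning to step 702, where the threshold $\delta$ is a specified very small value.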

A specific iteration process is as follows:

Assuming that $Z$ represents missing data, that is, unobserved data, $X$ is observed data, referred to as incomplete data, a sum of the missing data $Z$ and the incomplete data $X$ is defined as complete data $Y$, and $X$ is a function of $Y$, there are the following relations:

$$[L(\theta) = \ln p(X \mid \theta) \to \max] \Rightarrow [\ln p(Y \mid \theta) \to \max] \Rightarrow [\theta \to \theta^*] \tag{13}$$

$$L(\theta) = L(x^{(1)}, \ldots, x^{(n)}; \theta) = p(x^{(1)};\theta)\cdots p(x^{(n)};\theta) = \prod_{i=1}^{n} p(x^{(i)};\theta) \tag{14}$$

{ (15)

$p(X \mid \theta)$ is a probability density function of an observed dataset, $p(Y \mid \theta)$ is a probability density function of a complete dataset, and $\mu$ and $\sigma^2$ are respectively a mean and a variance of a probability density function.

To find a maximum value of the likelihood function $L(\theta)$ is to find $\theta$ in a parameter space $\Theta$ to maximize the likelihood function when sample points $\{x^{(1)}, \ldots, x^{(n)}\}$ are fixed, specifically:

$$\theta^* = \arg\max_{\theta \in \Theta} L(\theta) \tag{16}$$

Since $L(\theta)$ and $\ln L(\theta)$ have extreme values at the same $\theta$, a logarithmic operation is performed on the likelihood function:

$$\ln(L(\theta)) = \sum_{i=1}^{n} \ln p(x^{(i)};\theta) \tag{17}$$

A maximum likelihood estimate $\theta^*$ of $\theta$ may be obtained by solving the following equation:

$$\frac{d}{d\theta}\ln L(\theta) = 0 \tag{18}$$

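Therefore, formula (13) can be transformed into:

$$L(\theta) = \sum_{i}\sum_{z^{(i)}} Q_i(z^{(i)}) \ln\frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})} \tag{19}$$

$$Q_i(z^{(i)}) := p(z^{(i)} \mid x^{(i)}; \theta) \tag{20}$$

$Q_i$ represents a distribution of unknown data $Z$, and satisfies the following conditions:

$$\sum_{z} Q_i(z) = 1, \qquad Q_i(z) \ge 0 \tag{21}$$

From related definitions of a mathematical expectation and Jensen's inequality:

$$E[f(x)] = \sum_{i} f(x_i)\,p(x_i) \tag{22}$$

$$\ln(E(X)) \ge E[\ln(X)] \tag{23}$$

It is learned with reference to formula (19) that:

$$L(\theta) = \sum_{i=1}^{n} \ln E\left[\frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})}\right] \tag{24}$$

It is learned with reference to formulas (19) and (20) that, in a $t$-th iteration:

$$L(\theta^{(t+1)}) \ge \sum_{i}\sum_{z^{(i)}} Q_i^{(t)}(z^{(i)}) \ln\frac{p(x^{(i)}, z^{(i)}; \theta^{(t+1)})}{Q_i^{(t)}(z^{(i)})} \ge \sum_{i}\sum_{z^{(i)}} Q_i^{(t)}(z^{(i)}) \ln\frac{p(x^{(i)}, z^{(i)}; \theta^{(t)})}{Q_i^{(t)}(z^{(i)})} = L(\theta^{(t)}) \tag{25}$$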

Formula (25) may be seen as a process of finding a lower bound of $L(\theta)$. Through continuous iteration, the lower bound is increased until it converges to the vicinity of the likelihood function $L(\theta)$ when the parameter $\theta$ reaches a maximum value $\theta^*$. In this case, the iteration ends.
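The monotone behavior described above, in which each iteration cannot decrease the log-likelihood, can be observed on a generic latent-variable example. The sketch below is not the self-supplement model of this disclosure: it uses a two-component one-dimensional Gaussian mixture (a standard EM setting chosen only for illustration), with hypothetical names `gauss` and `em_mixture` and synthetic data.

```python
import numpy as np

def gauss(x, mu, var):
    """Gaussian density evaluated elementwise."""
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def em_mixture(x, n_iter=50):
    """EM for a two-component Gaussian mixture, recording the log-likelihood."""
    w = np.array([0.5, 0.5])
    mu = np.array([x.min(), x.max()])
    var = np.array([x.var(), x.var()])
    history = []
    for _ in range(n_iter):
        joint = w * gauss(x[:, None], mu, var)          # p(x_i, z_i = k; theta)
        history.append(np.sum(np.log(joint.sum(axis=1))))
        # E-step: Q_i(z) is the posterior of the latent label given x_i.
        q = joint / joint.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances with Q fixed.
        nk = q.sum(axis=0)
        w = nk / x.size
        mu = (q * x[:, None]).sum(axis=0) / nk
        var = (q * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return mu, history

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.0, 300)])
mu_hat, history = em_mixture(x)
increments = np.diff(history)
```

Each entry of `increments` should be non-negative up to rounding, which is the numerical counterpart of the inequality chain in formula (25).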

The estimating, through a recurrent neural network, a position of the sound source by using equal-length signals obtained after the self-supplement specifically includes: using the signal segments supplemented by the expectation maximization algorithm as an input, outputting azimuth DOAs and distances of the sound source in different signal segments, and performing cross-validation on estimation results of the different signal segments, to implement accurate positioning of the position of the sound source.

To further describe an implementation process of the method, the method is tested by using a signal acquired by a single vector hydrophone deployed at a position while a ship is sailing. Testing shows that a sound source position can be located in a very short time by using the method, with an accuracy of 1.5 m. Compared with a conventional method, only a single vector hydrophone is required, and both the positioning accuracy and the stability are improved. An estimation result is shown in FIG. 4.

Embodiment 2:

Embodiment 2 of the present disclosure provides a computer-readable storage medium, storing a computer program, where when the program is executed by a processor, steps in the sound source position estimation method in Embodiment 1 of the present disclosure are implemented.

Embodiment 3:

Embodiment 3 of the present disclosure provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of being run on the processor, where when the processor executes the program, steps in the sound source position estimation method in Embodiment 1 of the present disclosure are implemented.

The foregoing descriptions are merely exemplary embodiments of the present disclosure, but are not intended to limit the present disclosure. The present disclosure may include various modifications and changes for a person skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present disclosure shall fall within the protection scope of the present disclosure.

Claims

What is claimed is:

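1. A sound source position estimation method, comprising the following steps:

receiving, by a single vector hydrophone, multichannel signals from a sound source in the ocean;

fusing the received multichannel signals into an instantaneous single-channel sound intensity signal through fixed-dynamic window combined sliding, and dividing the received multichannel signals into signal segments comprising a sufficient information amount;

performing signal self-supplement by using the expectation maximization algorithm, and increasing a distance between the signal segments; and

estimating, through a recurrent neural network, a position of the sound source by using equal-length signals obtained after the self-supplement.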

2. The sound source position estimation method according to claim 1, wherein the multichannel signals are four-channel signals, comprising three acoustic particle velocity signals in orthogonal directions: an acoustic particle velocity $v_x$ in an x-axis direction, an acoustic particle velocity $v_y$ in a y-axis direction, and an acoustic particle velocity $v_z$ in a z-axis direction, and a scalar sound pressure signal $p$.

3. The sound source position estimation method according to claim 2, wherein the multichannel signals are fused into the instantaneous single-channel sound intensity signal by using a fixed window, dynamic windows of all lengths are traversed, a steepest ascent section of a Shannon entropy is found, an optimum dynamic window is determined, the instantaneous single-channel sound intensity signal in the fixed window is dynamically captured as unequal-length signals based on the Shannon entropy by using the optimum dynamic window, and signal self-supplement is performed on the captured unequal-length signals by using the expectation maximization algorithm.

4. The sound source position estimation method according to claim 3, wherein the dividing the received multichannel signals into signal segments comprising a sufficient information amount through fixed-dynamic window combined sliding specifically comprises the following steps:

401: specifying a fixed window length $l_f$ and an initial window start point $t_f$ for the acquired four-channel signals $p$, $v_x$, $v_y$, and $v_z$;

402: fusing, by using a fixed window $W_f$ whose window length and start point are respectively $l_f$ and $t_f$, four-channel information in the window, to obtain an instantaneous single-channel sound intensity signal $I_i$ whose length is $l_f$;

403: capturing a dynamic window $W_{di}$ whose window length and start point are respectively $l_i$ and $t_f$ in the instantaneous single-channel sound intensity signal $I_i$, wherein the length of the signal in the dynamic window $W_{di}$ is as short as possible provided that the signal comprises a sufficient information amount; and

404: returning to 402, updating a start point $t_f$ of a fixed window based on a signal overlap rate $q$, and performing a cyclic operation.
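As a non-limiting illustration, the cyclic operation of steps 401 to 404 may be sketched as follows. The sketch assumes that the overlap rate q means each new fixed window starts (1 - q) times the window length after the previous one, which the claim does not state explicitly. The function names and the two callbacks `fuse` and `pick_length` are hypothetical placeholders for the fusion of step 402 and the dynamic-window selection of step 403.

```python
def fixed_window_starts(n_samples, l_f, q):
    """Start indices t_f of all full fixed windows (the update of step 404)."""
    if l_f <= 0 or not 0 <= q < 1:
        raise ValueError("need l_f > 0 and 0 <= q < 1")
    # Assumed meaning of the overlap rate q: consecutive windows share q * l_f samples.
    step = max(1, round(l_f * (1 - q)))
    return list(range(0, n_samples - l_f + 1, step))


def segment_signals(p, vx, vy, vz, l_f, q, fuse, pick_length):
    """Run steps 401-404; `fuse` and `pick_length` are hypothetical callbacks."""
    segments = []
    for t_f in fixed_window_starts(len(p), l_f, q):
        # Step 402: cut the same fixed window out of every channel and fuse it.
        windows = [channel[t_f:t_f + l_f] for channel in (p, vx, vy, vz)]
        intensity = fuse(*windows)
        # Step 403: keep the shortest prefix judged to carry enough information.
        l_i = pick_length(intensity)
        segments.append((t_f, intensity[:l_i]))
    return segments
```

Because each fixed window is processed independently, the segments returned for different start points generally have unequal lengths, which is why a later equal-length supplement step is needed.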

5. The sound source position estimation method according to claim 4, wherein in step 402, synchronous sliding is performed in channel signals through a time window of a fixed size, signals are extracted, and the information is fused into the instantaneous single-channel sound intensity signal by using a cross-spectrum method; and the step specifically comprises the following steps:

501: specifying a fixed window length $l_f$ and a window start point $t_f$ according to a signal fusion degree;

502: capturing signal segments of a window size $l_f$ by using the same start point $t_f$ in the signal channels of the sound pressure $p$ and the vibration velocities $v_x$, $v_y$, and $v_z$ in the axial directions, wherein corresponding window signals are $p_i$, $v_{xi}$, $v_{yi}$, and $v_{zi}$; and

503: calculating, based on the cross-spectrum method, the instantaneous single-channel sound intensity signal $I_i$ obtained after the window signals are fused, to implement multi-sensor information fusion, wherein a calculation formula of the instantaneous single-channel sound intensity signal obtained after the fusion is:

$$I_i = p(t)\,v_x(t)\,\frac{\mathrm{Re}[S_{pv_x}]}{\mathrm{Re}[S_p(f)]} + p(t)\,v_y(t)\,\frac{\mathrm{Re}[S_{pv_y}]}{\mathrm{Re}[S_p(f)]} + p(t)\,v_z(t)\,\frac{\mathrm{Re}[S_{pv_z}]}{\mathrm{Re}[S_p(f)]}$$

wherein $S_{pv_x}$, $S_{pv_y}$, and $S_{pv_z}$ are respectively cross-spectrum functions of three components x, y, and z, $S_p(f)$ is a spectrum function of $p$, $f$ is a frequency, Re[ ] indicates Laplace transform, $\theta$ and $\varphi$ are respectively a pitch direction of arrival (DOA) and an azimuth DOA of a sound source relative to a vector hydrophone, wherein the pitch DOA and the azimuth DOA respectively take the XOY plane and the x

6. The sound source position estimation method according to claim 4, wherein in step 403, the dynamic windows of all lengths are traversed for the instantaneous single-channel sound intensity signal $I_i$ in the fixed window, and the steepest ascent section of the Shannon entropy is found as the optimum dynamic window $W_{di}$; and the step specifically comprises the following steps:

601: traversing the entire fixed window $W_f$ of the captured instantaneous single-channel sound intensity signal $I_i$ and applying the following formula, to construct a Shannon entropy signal $S_i$,

$$\mathrm{Shannon}(X) = \sum_{i=1}^{m} p(x_i)\,\log\frac{1}{p(x_i)} = -\sum_{i=1}^{m} p(x_i)\,\log p(x_i)$$

wherein $x_i$ is a possible value of a random event $X$, $\mathrm{Shannon}(X)$ is a Shannon entropy comprised in the random event $X$, $m$ is a total quantity of random events, and $p(x_i)$ is a probability of occurrence of $x_i$;

602: finding a steepest ascent section of $S_i$ according to a derivative $S'_i$ of $S_i$, marking the length as $l_i$, and jumping to step 604;

603: marking the length as $l_i$ if no steepest ascent section is found in $S_i$, considering that the signal in the fixed window $W_f$ is an invalid signal or a noise signal, wherein the following two conditions are satisfied:

when $S'_i$ is relatively small, it is considered that the signal is a null signal or Shannon entropy content of the signal in the fixed window is insufficient, and $l_i = l_0$, wherein $l_0$ is a preset minimum captured length; and

when $S'_i$ is relatively large, it is considered that the signal is a noise signal comprising a relatively high Shannon entropy, and $l_i = l_1$, wherein $l_1$ is a preset maximum captured length;

604: capturing a signal segment whose length is $l_i$ starting from the start point $t_f$ in the fixed window $W_f$ as the dynamic window $W_{di}$, and marking an end time of the window as $t_{de}$.
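As a non-limiting illustration, the entropy-driven choice of the dynamic window length may be sketched as follows. Several details are assumptions rather than part of the claim: the probabilities are estimated from a histogram with `bins` equal-width bins, the logarithm is taken to base 2, the "steepest ascent section" is read as the point where the first difference of the entropy curve is largest, and `min_slope` is a hypothetical threshold below which the window is treated as carrying no usable rise. The sketch simplifies step 603 by returning the minimum length when no rise is found and by clamping the result to the preset minimum and maximum lengths.

```python
import math


def shannon_entropy(samples, bins=8):
    """Histogram estimate of -sum p(x_i) * log2 p(x_i) for a list of samples."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return 0.0  # a constant signal carries no information
    counts = [0] * bins
    for s in samples:
        k = min(int((s - lo) / (hi - lo) * bins), bins - 1)
        counts[k] += 1
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts if c)


def dynamic_window_length(intensity, l0, l1, min_slope=1e-3, bins=8):
    """Pick a capture length from the entropy curve of growing prefixes."""
    # Entropy signal: entropy[n - 1] is the entropy of the first n samples.
    entropy = [shannon_entropy(intensity[:n], bins)
               for n in range(1, len(intensity) + 1)]
    # First difference of the entropy signal (a discrete derivative).
    slopes = [b - a for a, b in zip(entropy, entropy[1:])]
    if not slopes or max(slopes) < min_slope:
        return l0  # no steep rise found: fall back to the minimum length
    k = slopes.index(max(slopes))
    # slopes[k] is the rise from prefix length k + 1 to k + 2; clamp to [l0, l1].
    return max(l0, min(k + 2, l1))
```

For example, a window that is constant for four samples and then starts to vary yields its largest entropy rise at the fifth sample, so the captured length ends there unless the preset bounds override it.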

7. The sound source position estimation method according to claim 1, wherein signal self-supplement is performed by using the expectation maximization algorithm, segmented unequal-length signals are equivalent to observation data $X$, supplemented equal-length signals are equivalent to complete data $Y$, a supplementary signal is equivalent to unobserved data $Z$, and a result is iterated by using the expectation maximization algorithm to obtain a maximum value $\theta^*$ of a parameter $\theta$, that is, when a maximum value of a maximum likelihood function $L(\theta)$ based on $Y$ is obtained, optimal solutions of a mean $\mu$ and a variance $\sigma^2$ of a complete dataset are obtained, and an unknown dataset $Z$ is obtained based on an observed dataset $X$, to supplement the complete dataset $Y$, which specifically comprises the following steps:

701: assuming that an iteration quantity $t = 0$, initializing a parameter vector $\theta^{(0)}$, and calculating an initial maximum likelihood function

$$L(\theta^{(0)}) = L(x_1, \ldots, x_n; \theta^{(0)}) = p(x_1; \theta^{(0)}) \cdots p(x_n; \theta^{(0)}) = \prod_{i=1}^{n} p(x_i; \theta^{(0)});$$

702: obtaining $Q^{(t)}(z^{(i)})$ from $\theta^{(t)}$, and ensuring that when $\theta$ is specified, the equal sign in $\ln(E(X)) \geq E[\ln(X)]$ is satisfied, to establish a lower bound of $L(\theta)$;

703: fixing $Q^{(t)}(z^{(i)})$ and taking $\theta^{(t)}$ as a variable, taking a derivative of $L(\theta^{(t)})$ in step 702, and obtaining $\theta^{(t+1)}$ according to the formula

$$L(\theta^{(t+1)}) \geq \sum_{i} \sum_{z^{(i)}} Q^{(t)}(z^{(i)}) \ln \frac{p(x^{(i)}, z^{(i)}; \theta^{(t)})}{Q^{(t)}(z^{(i)})} = L(\theta^{(t)});$$ and

704: if $|L(\theta^{(t+1)}) - L(\theta^{(t)})| < \varepsilon$, ending the iterative calculation; otherwise, assuming that $t = t + 1$, and returning to step 702, wherein a threshold $\varepsilon$ is a specified very small value; wherein $Q$ represents a distribution of unknown data $Z$, $p(x^{(i)}, z^{(i)}; \theta^{(t)})$ is a probability of occurrence of $x^{(i)}$, $z^{(i)}$ under a condition $\theta^{(t)}$, the superscript $i$ indicates an $i$-th value of a corresponding parameter, $\varepsilon$ is the threshold and is an initially specified small value used as a standard for ending the iteration, and $E[\ ]$ is a mathematical expectation.
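As a non-limiting illustration, the self-supplement iteration may be sketched for the simplest case in which the samples are modeled as independent draws from a single Gaussian with mean μ and variance σ². That model, the function name `em_supplement`, the starting point (μ, σ²) = (0, 1), and the choice to fill each supplementary sample with its expected value are all assumptions made for this sketch. The stopping rule follows step 704 and compares successive log-likelihood values of the observed data against a small threshold `eps`.

```python
import math


def em_supplement(x, target_len, eps=1e-9, max_iter=1000):
    """Pad observed samples x to target_len using a Gaussian EM iteration."""
    n_missing = target_len - len(x)
    if not x or n_missing < 0:
        raise ValueError("x must be non-empty and no longer than target_len")
    sum_x = sum(x)
    sum_xx = sum(v * v for v in x)
    mu, var = 0.0, 1.0  # theta^(0): an arbitrary starting point

    def log_likelihood(mu, var):
        return -0.5 * sum(math.log(2 * math.pi * var) + (v - mu) ** 2 / var
                          for v in x)

    previous = log_likelihood(mu, var)
    for _ in range(max_iter):
        # E-step: expected first and second moments of each unobserved sample.
        e_z, e_zz = mu, mu * mu + var
        # M-step: re-estimate mean and variance over the complete data.
        mu = (sum_x + n_missing * e_z) / target_len
        var = max((sum_xx + n_missing * e_zz) / target_len - mu * mu, 1e-12)
        current = log_likelihood(mu, var)
        if abs(current - previous) < eps:  # stopping rule of step 704
            break
        previous = current
    # Supplement the observed samples with the expected value of the missing ones.
    return list(x) + [mu] * n_missing, mu, var
```

Under this model the iteration converges to the sample mean and the maximum likelihood variance of the observed samples, so a three-sample segment [1, 2, 3] padded to length 5 ends with μ close to 2 and σ² close to 2/3.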

8. The sound source position estimation method according to claim 1, wherein the estimating, through a recurrent neural network, a position of the sound source by using equal-length signals obtained after the self-supplement specifically comprises: using the signal segments supplemented by using the expectation maximization algorithm as an input, outputting azimuth DOAs and distances of the sound source in different signal segments, and performing cross-validation on estimation results of the different signal segments, to implement accurate positioning of the position of the sound source.
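As a non-limiting illustration, the data flow of this estimation step may be sketched with a small Elman-style recurrent network that reads one equal-length segment sample by sample and outputs an azimuth and a distance. The network layout, the function names, and all weights are hypothetical. In practice the weights would come from training, which is not shown. The claim does not specify how the cross-validation across segments is carried out, so the per-component median used here is only one possible reading, and it ignores the wrap-around of azimuth angles.

```python
import math
from statistics import median


def rnn_forward(segment, w_in, w_rec, w_out, b_out):
    """Elman-style RNN: scalar input per time step, tanh hidden state."""
    hidden = [0.0] * len(w_in)
    for x in segment:
        # New hidden state from the current sample and the previous hidden state.
        hidden = [
            math.tanh(w_in[j] * x + sum(w * h for w, h in zip(w_rec[j], hidden)))
            for j in range(len(hidden))
        ]
    # Linear read-out after the last sample: [azimuth, distance].
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_out, b_out)]


def combine_estimates(estimates):
    """Per-component median over the segment-wise (azimuth, distance) outputs."""
    azimuths, distances = zip(*estimates)
    return median(azimuths), median(distances)
```

Taking a median rather than a mean keeps a single outlying segment estimate from shifting the combined result.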

9. A computer-readable storage medium, storing a computer program, wherein when the program is executed by a processor, steps in the sound source position estimation method according to any one of claims 1 to 8 are implemented.

10. A computer device, comprising a memory, a processor, and a computer program stored in the memory and capable of being run on the processor, wherein when the processor executes the program, steps in the sound source position estimation method according to any one of claims 1 to 8 are implemented.