CN113189571A - Sound source passive ranging method based on tone feature extraction and deep learning - Google Patents


Info

Publication number
CN113189571A
CN113189571A
Authority
CN
China
Prior art keywords
spectral
tone
neural network
deep neural
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010037014.6A
Other languages
Chinese (zh)
Other versions
CN113189571B (en)
Inventor
肖旭
倪海燕
王同
苏林
任群言
马力
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Acoustics CAS
Original Assignee
Institute of Acoustics CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Acoustics CAS filed Critical Institute of Acoustics CAS
Priority to CN202010037014.6A priority Critical patent/CN113189571B/en
Publication of CN113189571A publication Critical patent/CN113189571A/en
Application granted granted Critical
Publication of CN113189571B publication Critical patent/CN113189571B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 11/00 Systems for determining distance or velocity not using reflection or reradiation
    • G01S 11/14 Systems for determining distance or velocity not using reflection or reradiation using ultrasonic, sonic, or infrasonic waves
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A 90/30 Assessment of water resources

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)

Abstract

The invention discloses a passive sound source ranging method based on timbre feature extraction and deep learning, comprising the following steps: extracting time-domain features, spectral features based on the short-time Fourier transform, auditory spectral features based on equivalent rectangular bandwidth, and harmonic spectral features based on a sinusoidal harmonic model from the real-time acoustic signal; extracting a set of timbre descriptors from each feature to form a 68-dimensional timbre descriptor vector; and inputting the 68-dimensional timbre descriptor vector into a pre-trained deep neural network, which outputs a probability distribution over candidate distances, the distance with the maximum probability being taken as the predicted value. The method achieves a ranging accuracy above 95% over the 1-10 km range, with a peak accuracy of 99.54%.

Description

Sound source passive ranging method based on tone feature extraction and deep learning
Technical Field
The invention relates to the field of underwater acoustics, and in particular to a passive sound source ranging method based on timbre feature extraction and deep learning.
Background
Passive sound source ranging is a core function of sonar systems and has long been a challenge for underwater acousticians. Because the ocean is a time-varying, space-varying, complex acoustic channel, traditional matched-field methods often suffer from environmental mismatch and excessive computational cost. In recent years, deep learning, as an emerging data-driven approach, has offered a new route to underwater passive ranging thanks to its strong feature-extraction capability and its advantages in handling complex, high-dimensional, nonlinear data.
The extraction and construction of features are key links in deep-learning-based passive localization of underwater targets. The timbre of an acoustic signal carries abundant information about the source and the underwater sound field; building a ranging model from timbre features extracted from underwater acoustic signals together with a deep neural network enables effective identification of the source distance.
Disclosure of Invention
The invention aims to overcome the above technical shortcomings and provides a method for passive sound source ranging based on timbre feature extraction and deep learning. The method extracts time-domain waveform features, time-domain envelope features, short-time Fourier transform (STFT)-based spectral features, auditory spectral features based on equivalent rectangular bandwidth, and harmonic spectral features based on a sinusoidal harmonic model from the acoustic signal using MATLAB; on this basis it extracts a complete set of timbre descriptors, takes them as the model input, and estimates the source distance with a deep neural network.
To achieve the above object, the present invention provides a passive sound source ranging method based on timbre feature extraction and deep learning, comprising:
extracting time-domain features, spectral features based on the short-time Fourier transform, auditory spectral features based on equivalent rectangular bandwidth, and harmonic spectral features based on a sinusoidal harmonic model from the real-time acoustic signal;
extracting a set of timbre descriptors from each feature to form a 68-dimensional timbre descriptor vector;
inputting the 68-dimensional timbre descriptor vector into a pre-trained deep neural network, which outputs a probability distribution over candidate distances; the distance with the maximum probability is taken as the predicted value.
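As a minimal sketch of the final readout step (the 200 output nodes and the 1-10 km target range come from the text below; the uniform spacing of candidate distances is an assumption for illustration), the maximum-probability decision can be written as:

```python
import numpy as np

def predict_distance(probs, r_min_km=1.0, r_max_km=10.0):
    """Map the network's softmax outputs to a distance estimate.

    probs: array of class probabilities, one per candidate distance.
    Returns the candidate distance whose probability is maximal.
    """
    probs = np.asarray(probs, dtype=float)
    # Candidate ranges assumed uniformly spaced over [r_min, r_max]
    distances = np.linspace(r_min_km, r_max_km, len(probs))
    return distances[np.argmax(probs)]

# Example: a distribution peaked at one of 200 distance bins
p = np.zeros(200)
p[100] = 1.0
print(predict_distance(p))
```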
As an improvement of the above method, the time-domain features include time-domain waveform features and time-domain envelope features; the timbre descriptors extracted from the time-domain features include attack time, decay time, release time, log-attack time, attack slope, decrease slope, temporal centroid, effective duration, frequency modulation, amplitude modulation, zero-crossing rate, and RMS energy envelope;
the timbre descriptors extracted from the spectral features based on the short-time Fourier transform include: spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral slope, spectral decrease, spectral roll-off, spectral flux, and spectral energy;
the timbre descriptors extracted from the auditory spectral features based on equivalent rectangular bandwidth include: spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral slope, spectral decrease, spectral roll-off, spectral flux, and spectral energy;
the timbre descriptors extracted from the harmonic spectral features based on a sinusoidal harmonic model include: spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral slope, spectral decrease, spectral roll-off, spectral flux, and spectral energy.
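Most of the listed spectral descriptors are standard shape statistics of a magnitude-spectrum frame. As an illustrative sketch (not the patent's own implementation), the spectral centroid and spectral spread of one frame can be computed as:

```python
import numpy as np

def spectral_centroid_spread(freqs, mag):
    """Spectral centroid (amplitude-weighted mean frequency) and spectral
    spread (amplitude-weighted standard deviation around the centroid)
    of a single magnitude-spectrum frame."""
    freqs = np.asarray(freqs, dtype=float)
    mag = np.asarray(mag, dtype=float)
    p = mag / mag.sum()                              # normalize to a distribution
    centroid = np.sum(freqs * p)
    spread = np.sqrt(np.sum(((freqs - centroid) ** 2) * p))
    return centroid, spread
```

The other moment-based descriptors (skewness, kurtosis) follow the same pattern with higher-order central moments of the normalized spectrum.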
As an improvement of the above method, the input layer of the deep neural network takes the 68-dimensional timbre descriptor vector as input;
the hidden layers of the deep neural network use the hyperbolic tangent activation function;
the output layer of the deep neural network uses 200 Softmax nodes, corresponding to the probability distribution over candidate distances.
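A minimal sketch of this architecture, assuming fully connected layers; the 68-dimensional input and 200-node softmax output are stated in the text, while the hidden width of 128 is an assumption for illustration:

```python
import numpy as np

def forward(x, weights):
    """Forward pass: tanh hidden layers, softmax output layer.
    `weights` is a list of (W, b) pairs, one per layer."""
    h = np.asarray(x, dtype=float)
    for W, b in weights[:-1]:
        h = np.tanh(h @ W + b)            # hyperbolic-tangent hidden layers
    W, b = weights[-1]
    logits = h @ W + b
    e = np.exp(logits - logits.max())     # numerically stable softmax
    return e / e.sum()

# Shape check: 68-d input -> one 128-unit hidden layer -> 200-way softmax
rng = np.random.default_rng(0)
weights = [(rng.normal(0, 0.1, (68, 128)), np.zeros(128)),
           (rng.normal(0, 0.1, (128, 200)), np.zeros(200))]
p = forward(rng.normal(size=68), weights)
print(p.shape)
```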
As an improvement of the above method, the method further comprises training the deep neural network, specifically:
establishing a training set: the transmitted signal is a broadband signal s(t); a Pekeris waveguide serves as the environment model, and the target distance range is 1-10 km; keeping the transmitting conditions and water-column conditions fixed, the receiving range is the only variable in the simulated underwater acoustic environment, and the KRAKEN sound-field model is used to obtain the corresponding received signals y_i(t), i = 1, 2, ..., N, where N is the number of signals; white Gaussian noise n(t) is added to the received signals, with the SNR ranging from 1 to 10 dB;
computing a frame-wise feature sequence for each training-set sample, calculating each timbre descriptor to form a 68-dimensional timbre descriptor vector, and inputting the vectors into the deep neural network;
iteratively minimizing the loss function via the backpropagation algorithm: the mean square error (MSE) serves as the cost function, the model parameters are updated with the Adam algorithm, a dropout strategy regularizes the network parameters, and the initial weights are drawn from a truncated normal distribution with standard deviation 0.1.
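The truncated normal initialization can be sketched as follows; the text specifies only the standard deviation of 0.1, so the 2-standard-deviation truncation bound (the convention used by common deep-learning initializers) is an assumption:

```python
import numpy as np

def truncated_normal(shape, std=0.1, rng=None):
    """Draw initial weights from a truncated normal distribution:
    values farther than 2 standard deviations from zero are redrawn."""
    rng = np.random.default_rng() if rng is None else rng
    w = rng.normal(0.0, std, size=shape)
    bad = np.abs(w) > 2 * std
    while bad.any():                      # redraw out-of-range values only
        w[bad] = rng.normal(0.0, std, size=int(bad.sum()))
        bad = np.abs(w) > 2 * std
    return w
```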
The invention also provides a passive sound source ranging system based on timbre feature extraction and deep learning, comprising a trained deep neural network, a timbre feature extraction module, a timbre descriptor calculation module, and a distance prediction module;
the timbre feature extraction module extracts time-domain features, spectral features based on the short-time Fourier transform, auditory spectral features based on equivalent rectangular bandwidth, and harmonic spectral features based on a sinusoidal harmonic model from the real-time acoustic signal;
the timbre descriptor calculation module extracts a set of timbre descriptors from each feature to form a 68-dimensional timbre descriptor vector;
the distance prediction module inputs the 68-dimensional timbre descriptor vector into the trained deep neural network, which outputs a probability distribution over candidate distances; the distance with the maximum probability is taken as the predicted value.
As an improvement of the above system, the time-domain features include time-domain waveform features and time-domain envelope features; the timbre descriptors extracted from the time-domain features include attack time, decay time, release time, log-attack time, attack slope, decrease slope, temporal centroid, effective duration, frequency modulation, amplitude modulation, zero-crossing rate, and RMS energy envelope;
the timbre descriptors extracted from the spectral features based on the short-time Fourier transform include: spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral slope, spectral decrease, spectral roll-off, spectral flux, and spectral energy;
the timbre descriptors extracted from the auditory spectral features based on equivalent rectangular bandwidth include: spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral slope, spectral decrease, spectral roll-off, spectral flux, and spectral energy;
the timbre descriptors extracted from the harmonic spectral features based on a sinusoidal harmonic model include: spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral slope, spectral decrease, spectral roll-off, spectral flux, and spectral energy.
As an improvement of the above system, the input layer of the deep neural network takes the 68-dimensional timbre descriptor vector as input;
the hidden layers of the deep neural network use the hyperbolic tangent activation function;
the output layer of the deep neural network uses 200 Softmax nodes, corresponding to the probability distribution over candidate distances.
As an improvement of the above system, the deep neural network is trained as follows:
establishing a training set: the transmitted signal is a broadband signal s(t); a Pekeris waveguide serves as the environment model, and the target distance range is 1-10 km; keeping the transmitting conditions and water-column conditions fixed, the receiving range is the only variable in the simulated underwater acoustic environment, and the KRAKEN sound-field model is used to obtain the corresponding received signals y_i(t), i = 1, 2, ..., N, where N is the number of signals; white Gaussian noise n(t) is added to the received signals, with the SNR ranging from 1 to 10 dB;
computing a frame-wise feature sequence for each training-set sample, calculating each timbre descriptor to form a 68-dimensional timbre descriptor vector, and inputting the vectors into the deep neural network;
iteratively minimizing the loss function via the backpropagation algorithm: the mean square error (MSE) serves as the cost function, the model parameters are updated with the Adam algorithm, a dropout strategy regularizes the network parameters, and the initial weights are drawn from a truncated normal distribution with standard deviation 0.1.
The invention has the following advantages:
1. the method achieves a ranging accuracy above 95% over the 1-10 km range, with a peak accuracy of 99.54%;
2. once trained, the model completes a ranging task within 0.1 s, enabling real-time operation;
3. the method builds its model from data, avoiding theoretical sound-field modeling of an unknown environment; this minimizes the influence of environmental mismatch and improves the model's generality. The multi-dimensional perceptual features constructed in the time and frequency domains capture abundant information about the source and the underwater sound field, benefiting learning efficiency and stability; and the trained model performs only lightweight computation at prediction time, facilitating real-time processing of data.
Drawings
FIG. 1 is a schematic diagram of environmental parameters of a KRAKEN sound field model;
FIG. 2 is a schematic diagram of a portion of an acoustic signal generated by the KRAKEN model;
FIG. 3 is a schematic diagram of a feature construction and tone feature extraction process;
FIG. 4 is a schematic view of a feature space;
FIG. 5 shows the ranging accuracy of data in the training set and the test set during the network training process;
FIG. 6 is a schematic diagram illustrating the influence of signal frequency and depth on the model ranging accuracy;
FIG. 7 is a MSE curve for signals of different bandwidths at a center frequency of 500 Hz;
FIG. 8 is a MSE curve for signals of different bandwidths at a center frequency of 1000 Hz;
FIG. 9 is a MSE curve for signals of different time lengths at a center frequency of 500 Hz;
FIG. 10 shows MSE curves for signals of different time lengths at a center frequency of 1000 Hz.
Detailed Description
The technical solution of the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
The invention provides a passive sound source ranging method based on timbre feature extraction and deep learning, which proceeds in the following steps.
Step 1), KRAKEN sound-field model calculation:
The transmitted signal is a broadband signal s(t), and a typical Pekeris waveguide serves as the environment model (fig. 1); the target distance is 1-10 km. Keeping the transmitting conditions and water-column conditions fixed, the receiving range is taken as the only variable in the simulated underwater acoustic environment, and the KRAKEN sound-field model is used to compute the corresponding received signals y_i(t), i = 1, 2, ..., N. White Gaussian noise n(t) is added to the received signals, with SNR from 1 to 10 dB, and the signals are split in a fixed proportion into a training set and a test set for the neural network.
Step 2), timbre feature extraction:
Time-domain waveform features, time-domain envelope features, short-time Fourier transform (STFT)-based spectral features, auditory spectral features based on equivalent rectangular bandwidth, and harmonic spectral features based on a sinusoidal harmonic model are extracted from the acoustic signal; on this basis a 68-dimensional timbre descriptor vector is extracted for each acoustic sample and used as the model input. The feature extraction flow is shown in fig. 3.
Step 3), deep neural network:
The deep neural network iteratively minimizes the loss function via the backpropagation (BP) algorithm. The mean square error is used as the cost function, the Adam optimization algorithm trains the network, and dropout regularizes the network parameters to reduce overfitting.
In the experiments, simulation data were generated with the KRAKEN sound-field tool under Pekeris waveguide environment parameters; fig. 1 depicts the environment parameters used. The simulation data comprise continuous-wave (CW) signals at 50 Hz, 150 Hz, and 300 Hz and linear frequency-modulated (LFM) signals with center frequencies of 500 Hz, 1000 Hz, and 2000 Hz, bandwidths of 100-1000 Hz, and durations of 0.2-1.0 s. Gaussian noise was added to the simulated received signals (SNR from 1 dB to 10 dB). The receiving ranges are distributed over 1-10 km and the depths over 5-145 m; the training set accounts for 80% of the total sample set (16080 samples), and the remaining 20% (4020 samples) serve as the test set. Timbre features were extracted from the generated samples; statistics of the frame-wise feature sequences were computed, and the mean and variance were taken as input features. In total, 68-dimensional timbre descriptor vectors for 20100 samples were obtained as the input features of the deep neural network, as shown in fig. 3. The extracted feature space is shown in fig. 4, and the meaning of each feature is given in Table 1:
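The mean-and-variance pooling of the frame-wise descriptor sequences described above can be sketched as:

```python
import numpy as np

def frame_stats(feature_seq):
    """Collapse a frame-wise descriptor sequence (n_frames x n_descriptors)
    into per-descriptor mean and variance, concatenated into one vector,
    as the text describes for building the fixed-length network input."""
    f = np.asarray(feature_seq, dtype=float)
    return np.concatenate([f.mean(axis=0), f.var(axis=0)])
```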
TABLE 1 timbre characteristics
The network is trained with the Adam optimization algorithm, with an initial learning rate of 0.03 and the MSE cost function; dropout regularization disables 5% of the neurons in each iteration; the initial weights are drawn from a truncated normal distribution with standard deviation 0.1; the hidden layers use the hyperbolic tangent activation function; and the output layer uses 200 Softmax nodes corresponding to the probability distribution over distances. The number of training iterations is set to 20000.
For a transmitted CW signal (f = 150 Hz, z_s = 35 m), fig. 5 shows the prediction results after 20000 iterations: the prediction accuracy of the deep neural network on the test set reaches 99.54%, enabling effective identification of the sound source distance. The experimental results show that the estimation accuracy on the test set exceeds 95% under all tested conditions, with stable prediction performance, demonstrating that the method is effective.
Comparing training efficiency across different transmitted signals shows that the algorithm's performance is robust to waveform parameters and source depth (fig. 6), and that models for transient transmitted signals with small bandwidth and long duration train more efficiently (figs. 7, 8, 9, and 10).
The invention also provides a passive sound source ranging system based on timbre feature extraction and deep learning, comprising a trained deep neural network, a timbre feature extraction module, a timbre descriptor calculation module, and a distance prediction module;
the timbre feature extraction module extracts time-domain features, spectral features based on the short-time Fourier transform, auditory spectral features based on equivalent rectangular bandwidth, and harmonic spectral features based on a sinusoidal harmonic model from the real-time acoustic signal;
the timbre descriptor calculation module extracts a set of timbre descriptors from each feature to form a 68-dimensional timbre descriptor vector;
the distance prediction module inputs the 68-dimensional timbre descriptor vector into the trained deep neural network, which outputs a probability distribution over candidate distances; the distance with the maximum probability is taken as the predicted value.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the invention. Although the invention has been described in detail with reference to the embodiments, those skilled in the art will understand that various changes and equivalent substitutions may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (8)

1. A passive sound source ranging method based on timbre feature extraction and deep learning, comprising the following steps:
extracting time-domain features, spectral features based on the short-time Fourier transform, auditory spectral features based on equivalent rectangular bandwidth, and harmonic spectral features based on a sinusoidal harmonic model from the real-time acoustic signal;
extracting a set of timbre descriptors from each feature to form a 68-dimensional timbre descriptor vector;
inputting the 68-dimensional timbre descriptor vector into a pre-trained deep neural network, which outputs a probability distribution over candidate distances; the distance with the maximum probability is taken as the predicted value.
2. The passive sound source ranging method based on timbre feature extraction and deep learning as claimed in claim 1, wherein the time-domain features comprise time-domain waveform features and time-domain envelope features; the timbre descriptors extracted from the time-domain features include attack time, decay time, release time, log-attack time, attack slope, decrease slope, temporal centroid, effective duration, frequency modulation, amplitude modulation, zero-crossing rate, and RMS energy envelope;
the timbre descriptors extracted from the spectral features based on the short-time Fourier transform include: spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral slope, spectral decrease, spectral roll-off, spectral flux, and spectral energy;
the timbre descriptors extracted from the auditory spectral features based on equivalent rectangular bandwidth include: spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral slope, spectral decrease, spectral roll-off, spectral flux, and spectral energy;
the timbre descriptors extracted from the harmonic spectral features based on a sinusoidal harmonic model include: spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral slope, spectral decrease, spectral roll-off, spectral flux, and spectral energy.
3. The passive sound source ranging method based on timbre feature extraction and deep learning as claimed in claim 2, wherein the input layer of the deep neural network takes the 68-dimensional timbre descriptor vector as input;
the hidden layers of the deep neural network use the hyperbolic tangent activation function;
the output layer of the deep neural network uses 200 Softmax nodes, corresponding to the probability distribution over candidate distances.
4. The passive sound source ranging method based on timbre feature extraction and deep learning as claimed in claim 3, further comprising training the deep neural network, specifically:
establishing a training set: the transmitted signal is a broadband signal s(t); a Pekeris waveguide serves as the environment model, and the target distance range is 1-10 km; keeping the transmitting conditions and water-column conditions fixed, the receiving range is the only variable in the simulated underwater acoustic environment, and the KRAKEN sound-field model is used to obtain the corresponding received signals y_i(t), i = 1, 2, ..., N, where N is the number of signals; white Gaussian noise n(t) is added to the received signals, with the SNR ranging from 1 to 10 dB;
computing a frame-wise feature sequence for each training-set sample, calculating each timbre descriptor to form a 68-dimensional timbre descriptor vector, and inputting the vectors into the deep neural network;
iteratively minimizing the loss function via the backpropagation algorithm: the mean square error (MSE) serves as the cost function, the model parameters are updated with the Adam algorithm, a dropout strategy regularizes the network parameters, and the initial weights are drawn from a truncated normal distribution with standard deviation 0.1.
5. A passive sound source ranging system based on timbre feature extraction and deep learning, comprising a trained deep neural network, a timbre feature extraction module, a timbre descriptor calculation module, and a distance prediction module;
the timbre feature extraction module extracts time-domain features, spectral features based on the short-time Fourier transform, auditory spectral features based on equivalent rectangular bandwidth, and harmonic spectral features based on a sinusoidal harmonic model from the real-time acoustic signal;
the timbre descriptor calculation module extracts a set of timbre descriptors from each feature to form a 68-dimensional timbre descriptor vector;
the distance prediction module inputs the 68-dimensional timbre descriptor vector into the trained deep neural network, which outputs a probability distribution over candidate distances; the distance with the maximum probability is taken as the predicted value.
6. The passive sound source ranging system based on timbre feature extraction and deep learning as claimed in claim 5, wherein the time-domain features comprise time-domain waveform features and time-domain envelope features; the timbre descriptors extracted from the time-domain features include attack time, decay time, release time, log-attack time, attack slope, decrease slope, temporal centroid, effective duration, frequency modulation, amplitude modulation, zero-crossing rate, and RMS energy envelope;
the timbre descriptors extracted from the spectral features based on the short-time Fourier transform include: spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral slope, spectral decrease, spectral roll-off, spectral flux, and spectral energy;
the timbre descriptors extracted from the auditory spectral features based on equivalent rectangular bandwidth include: spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral slope, spectral decrease, spectral roll-off, spectral flux, and spectral energy;
the timbre descriptors extracted from the harmonic spectral features based on a sinusoidal harmonic model include: spectral centroid, spectral spread, spectral skewness, spectral kurtosis, spectral slope, spectral decrease, spectral roll-off, spectral flux, and spectral energy.
7. The passive sound source ranging system based on timbre feature extraction and deep learning as claimed in claim 6, wherein the input layer of the deep neural network takes the 68-dimensional timbre descriptor vector as input;
the hidden layers of the deep neural network use the hyperbolic tangent activation function;
the output layer of the deep neural network uses 200 Softmax nodes, corresponding to the probability distribution over candidate distances.
8. The sound source passive ranging system based on timbre feature extraction and deep learning as claimed in claim 7, wherein the specific process of training the deep neural network is as follows:
establishing a training set: the transmitting signal adopts a broadband signal s (t), a Pekeris waveguide is used as an environment model, and the target distance range is 1-10 km; for the transmitting signal s (t), the transmitting condition and the water body condition are kept unchanged, the receiving distance is used as the only variable in the simulated underwater acoustic environment, and the KRAKEN sound field model is used for obtaining the corresponding receiving signal yi(t), i is 1,2 … N, and N is the number of signals; introducing Gaussian white noise n (t) into a received signal, wherein the substance range of the SNR is as follows: 1-10 dB;
each sample of the training set is divided into frames to obtain a feature sequence; each timbre descriptor is computed and assembled into a 68-dimensional timbre descriptor vector, which is input to the deep neural network;
the loss function is iteratively minimized through the back-propagation algorithm: the mean square error (MSE) is taken as the cost function, the model parameters are updated with the Adam algorithm, and a drop-out strategy is used to regularize the network parameters; the initial weights are generated from a truncated normal distribution with the standard deviation set to 0.1.
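The two update rules named in the claim, the Adam parameter update and drop-out regularization, can be sketched in NumPy as follows. The learning rate, moment coefficients, and the toy scalar demo are illustrative assumptions; the patent fixes only the use of Adam, the MSE cost, drop-out, and the 0.1 truncated-normal initialization:

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.02, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: moment estimates, bias correction, then the parameter step."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def dropout(h, rate, rng, train=True):
    """Inverted drop-out: randomly zero activations during training, rescale the rest."""
    if not train or rate == 0.0:
        return h
    mask = rng.random(h.shape) >= rate
    return h * mask / (1.0 - rate)

# demo: Adam minimising the scalar MSE-style loss (w - 3)^2
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 1001):
    g = 2.0 * (w - 3.0)          # gradient of the loss
    w, m, v = adam_step(w, g, m, v, t)
```

In training, `dropout` would be applied to the hidden activations only in the forward pass of the training phase, and disabled (`train=False`) at inference time.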
CN202010037014.6A 2020-01-14 2020-01-14 Sound source passive ranging method based on tone feature extraction and deep learning Active CN113189571B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010037014.6A CN113189571B (en) 2020-01-14 2020-01-14 Sound source passive ranging method based on tone feature extraction and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010037014.6A CN113189571B (en) 2020-01-14 2020-01-14 Sound source passive ranging method based on tone feature extraction and deep learning

Publications (2)

Publication Number Publication Date
CN113189571A true CN113189571A (en) 2021-07-30
CN113189571B CN113189571B (en) 2023-04-07

Family

ID=76972469

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010037014.6A Active CN113189571B (en) 2020-01-14 2020-01-14 Sound source passive ranging method based on tone feature extraction and deep learning

Country Status (1)

Country Link
CN (1) CN113189571B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103226197A (en) * 2013-04-16 2013-07-31 哈尔滨工程大学 Underwater target echo classification method based on timbre parameter model
JP2017107141A (en) * 2015-12-09 2017-06-15 日本電信電話株式会社 Sound source information estimation device, sound source information estimation method and program
CN109975816A (en) * 2019-03-11 2019-07-05 武汉理工大学 A kind of sensor data fusion method of miniature underwater robot

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
LUDWIG HOUÉGNIGAN et al.: "Machine and deep learning approaches to localization and range estimation of underwater acoustic sources", 《2017 IEEE/OES ACOUSTICS IN UNDERWATER GEOSCIENCES SYMPOSIUM (RIO ACOUSTICS)》 *
WENBO WANG et al.: "Deep transfer learning for source ranging: Deep-sea experiment results", 《THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA》 *
YI-NING LIU et al.: "Source Ranging Using Ensemble Convolutional Networks in the Direct Zone of Deep Water", 《CHIN. PHYS. LETT.》 *
NIU HAIQIANG et al.: "A review of machine learning methods for underwater acoustic passive localization", 《JOURNAL OF SIGNAL PROCESSING》 *
XIAO XU et al.: "Passive ranging of acoustic sources based on multi-domain feature extraction and deep learning", 《JOURNAL OF APPLIED ACOUSTICS》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116911817A (en) * 2023-09-08 2023-10-20 浙江智加信息科技有限公司 Paperless conference record archiving method and paperless conference record archiving system
CN116911817B (en) * 2023-09-08 2023-12-01 浙江智加信息科技有限公司 Paperless conference record archiving method and paperless conference record archiving system

Also Published As

Publication number Publication date
CN113189571B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN110133599B (en) Intelligent radar radiation source signal classification method based on long-time and short-time memory model
WO2020124902A1 (en) Supervised learning auditory attention-based voice extraction method and system, and apparatuses
CN109212526A (en) Distributive array target angle measurement method for high-frequency ground wave radar
CN111239692B (en) PRI (pulse repetition index) combined intra-pulse information radiation source signal identification method based on deep learning
US10902832B2 (en) Timbre fitting method and system based on time-varying multi-segment spectrum
CN111368892A (en) Generalized S transformation and SVM electric energy quality disturbance efficient identification method
CN114201987A (en) Active interference identification method based on self-adaptive identification network
CN113189571B (en) Sound source passive ranging method based on tone feature extraction and deep learning
CN112560342A (en) DNN-based atmospheric waveguide parameter estimation method
CN112036239A (en) Radar signal working mode identification method and system based on deep learning network
CN106559146A (en) A kind of signal generator and signal generating method
CN112086100A (en) Quantization error entropy based urban noise identification method of multilayer random neural network
Cai et al. Modulation recognition of radar signal based on an improved CNN model
CN113111786A (en) Underwater target identification method based on small sample training image convolutional network
CN111289991B (en) Multi-scene-based laser ranging method and device
CN112785052A (en) Wind speed and wind direction prediction method based on particle filter algorithm
CN107765259A (en) A kind of transmission line of electricity laser ranging Signal denoising algorithm that threshold value is improved based on Lifting Wavelet
CN116826735A (en) Broadband oscillation identification method and device for new energy station
CN115932773A (en) Target angle detection method, device, equipment and medium based on spectrum shape characteristics
CN115632970A (en) Method, device and storage medium for estimating communication interference signal bandwidth under non-Gaussian noise
CN115598714A (en) Time-space coupling neural network-based ground penetrating radar electromagnetic wave impedance inversion method
CN115204237A (en) Swin-transform-based short wave protocol signal automatic identification method
CN113109795B (en) Deep sea direct sound zone target depth estimation method based on deep neural network
CN112016684B (en) Electric power terminal fingerprint identification method of deep parallel flexible transmission network
CN114358046A (en) Multi-complexity-level complex electromagnetic interference environment simulation generation method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant