CN111027453B

CN111027453B - Automatic non-cooperative underwater target identification method based on Gaussian mixture model

Info

Publication number: CN111027453B
Application number: CN201911237928.0A
Authority: CN
Inventors: 曾向阳; 乔彦; 王海涛; 杨爽
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2019-12-06
Filing date: 2019-12-06
Publication date: 2022-05-17
Anticipated expiration: 2039-12-06
Also published as: CN111027453A

Abstract

The invention discloses an automatic identification method for non-cooperative underwater targets based on a Gaussian mixture model (GMM), which addresses the poor practicability of existing non-cooperative underwater target identification methods. When a target of unknown class is encountered, the method treats identification as an in-set/out-of-set decision: Mel-frequency cepstral coefficients (MFCCs), which describe the nonlinear characteristics of human hearing, are extracted from the known in-set data, a Gaussian mixture model is trained on them, and a suitable threshold is selected, thereby constructing a target identification system.

Description

Automatic non-cooperative underwater target identification method based on Gaussian mixture model

Technical Field

The invention relates to an underwater target identification method, in particular to a non-cooperative underwater target automatic identification method based on a Gaussian mixture model.

Background

The document 'Application progress of deep learning in passive recognition of underwater targets, Signal Processing, 2019, Vol. 35(9), pp. 1460-1475' discloses a passive underwater target recognition method based on deep learning. After the sample-data preprocessing and feature extraction steps of a typical pattern recognition system, a specific deep neural network structure is used to implement the classifier design and classification stages, and a certain number of samples are used to train the network; alternatively, the strong feature learning capability of deep neural networks is exploited directly, so that the network weakens or entirely replaces the feature extraction stage that is critical in the traditional approach. However, modeling with deep learning requires a sufficient amount of labeled data, and a lack of data seriously degrades the model. In practical application scenarios it is difficult to collect enough samples for each underwater target: underwater acoustic target data are expensive to collect, existing holdings are limited, and most targets are in a data-starved state.

Disclosure of Invention

To overcome the poor practicability of existing non-cooperative underwater target identification methods, the invention provides an automatic identification method for non-cooperative underwater targets based on a Gaussian mixture model. When a target of unknown class is encountered, the method treats identification as an in-set/out-of-set decision: Mel-frequency cepstral coefficients (MFCCs), which describe the nonlinear characteristics of human hearing, are extracted from the known in-set data, a Gaussian mixture model is trained, and a suitable threshold is selected, thereby constructing a target identification system. Samples whose label information is difficult to obtain are fed into this system to decide whether they belong to the known sample set, which gives a preliminary judgment of the class of the non-cooperative target while reducing the requirements on label information and on the amount of sample data. By combining MFCC feature parameters, which represent human auditory characteristics, with a Gaussian mixture model, the invention constructs an automatic identification system for non-cooperative underwater targets that needs no specific labels or classes for the samples when the model is trained and that reduces the amount of data to be collected for each underwater target. The system is computationally efficient and has good identification performance and robustness.

The technical scheme adopted by the invention for solving the technical problems is as follows: a non-cooperative underwater target automatic identification method based on a Gaussian mixture model is characterized by comprising the following steps:

Step one, the training samples observed in the set are preprocessed; the preprocessing comprises three parts, namely pre-emphasis, framing and windowing. The training samples are underwater acoustic target data.
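The three preprocessing operations of step one can be sketched in Python as follows. This is a minimal illustration only, not the implementation of the embodiment (which uses MATLAB). The pre-emphasis coefficient 0.97, the Hamming window, the hop length and all function names are assumptions; the text does not specify them.

```python
import math

def pre_emphasize(signal, alpha=0.97):
    """Pre-emphasis filter y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def frame_signal(signal, frame_len, hop_len):
    """Split a signal into frames of frame_len samples, advancing by hop_len samples.

    A trailing partial frame is dropped.
    """
    if len(signal) < frame_len:
        return []
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    return [signal[i * hop_len:i * hop_len + frame_len] for i in range(n_frames)]

def hamming_window(frame_len):
    """Hamming window (an assumed choice; the text only says 'windowing')."""
    if frame_len == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (frame_len - 1))
            for n in range(frame_len)]

def preprocess(signal, frame_len, hop_len, alpha=0.97):
    """Apply pre-emphasis, framing and windowing; return a list of windowed frames."""
    window = hamming_window(frame_len)
    frames = frame_signal(pre_emphasize(signal, alpha), frame_len, hop_len)
    return [[s * w for s, w in zip(frame, window)] for frame in frames]
```

For a signal sampled at 8000 Hz and a frame duration of 0.1 s, `frame_len` would be 800 samples; the hop length remains a free choice.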

Step two, MFCC features are extracted from the preprocessed training samples.
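A common MFCC pipeline for one windowed frame (power spectrum, triangular mel filter bank, logarithm, discrete cosine transform) can be sketched as below. The mel-scale formula, the 26 filters, the naive DFT and all function names are assumptions chosen for a self-contained illustration; the text does not fix these details, and a practical implementation would use an FFT.

```python
import math

def hz_to_mel(f_hz):
    """One widely used mel-scale mapping (assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive O(N^2) DFT power spectrum for bins 0..N/2 (illustration only)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(frame[t] * math.cos(2.0 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2.0 * math.pi * k * t / n) for t in range(n))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_bins):
            f = k * sample_rate / n_fft
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=26, n_coeffs=13):
    """Return n_coeffs cepstral coefficients for one windowed frame."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for filters that capture no energy.
    log_energy = [math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10)
                  for row in bank]
    # DCT-II of the log filter-bank energies.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energy))
            for c in range(n_coeffs)]
```

Stacking the coefficient vectors of all frames row by row gives a feature matrix with one row per frame and 13 columns when `n_coeffs=13`.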

Step three, a GMM is trained using the MFCC features extracted from the in-set data as feature vectors. For a feature vector, its mixture probability density in the GMM is defined as:

$$p(x \mid \lambda) = \sum_{i=1}^{M} \omega_i \, p_i(x) \tag{1}$$

where x is the feature vector, λ denotes the GMM, p(x|λ) is the likelihood of the feature vector under the model λ, M is the mixing degree (the number of mixture components) of the GMM, p_i(x) is the probability density of the i-th single Gaussian density function, and ω_i is the mixture weight of the i-th single Gaussian density function, which must satisfy:

$$\sum_{i=1}^{M} \omega_i = 1 \tag{2}$$

The mixture probability density is thus a weighted linear combination of the M densities p_i(x) in the GMM. p_i(x) is given by formula (3):

$$p_i(x) = \frac{1}{(2\pi)^{D/2}\,|\Sigma_i|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu_i)^{\mathsf{T}} \Sigma_i^{-1} (x - \mu_i) \right\} \tag{3}$$

where D is the dimension of the feature vector, μ_i is the D-dimensional mean vector of the i-th single Gaussian density function, and Σ_i is its D × D covariance matrix, with i = 1, 2, ..., M. The mixing degree M is selected in advance and is larger than the number of classes, while μ_i, Σ_i and ω_i are estimated with the expectation-maximization algorithm, i.e. the EM algorithm, which trains the GMM. The specific steps are as follows:

(a) For a given feature sequence X = {x₁, x₂, x₃, ..., x_T}, select the mixing degree M of the GMM to be trained, set initial values for the weight, mean and variance of the i-th Gaussian distribution, compute p(x_t|λ) according to formula (1), and then compute the log-likelihood function according to formula (4):

$$\log p(X \mid \lambda) = \sum_{t=1}^{T} \log p(x_t \mid \lambda) \tag{4}$$

(b) Using the given initial values of the weight, mean and variance, compute the posterior probability of each feature vector according to formula (5):

$$p(i \mid x_t, \lambda) = \frac{\omega_i \, p_i(x_t)}{\sum_{k=1}^{M} \omega_k \, p_k(x_t)} \tag{5}$$

(c) From the posterior probabilities computed in step (b), recompute the weight, mean and variance, and substitute them into formula (4) to obtain a new log-likelihood function.

For the i-th Gaussian density function, the number of feature vectors it contains is:

$$n_i = \sum_{t=1}^{T} p(i \mid x_t, \lambda) \tag{6}$$

the weight is:

$$\omega_i = \frac{n_i}{T} \tag{7}$$

the mean is:

$$\mu_i = \frac{1}{n_i} \sum_{t=1}^{T} p(i \mid x_t, \lambda)\, x_t \tag{8}$$

the variance is:

$$\Sigma_i = \frac{1}{n_i} \sum_{t=1}^{T} p(i \mid x_t, \lambda)\, (x_t - \mu_i)(x_t - \mu_i)^{\mathsf{T}} \tag{9}$$

(d) Repeat steps (b) and (c) until the log-likelihood function or the estimated parameters converge.

(e) The weight, mean and variance obtained at this point maximize p(X|λ); the parameter estimation of the GMM is complete and the GMM is trained.
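Steps (a) to (e) can be illustrated with a small EM loop. To keep the sketch short it handles scalar (one-dimensional) features only, whereas the method uses D-dimensional MFCC vectors; the variance floor, the tolerance, the iteration cap and all function names are assumptions rather than details from the text.

```python
import math

def gauss_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(data, weights, means, variances):
    """Sum over all samples of log p(x_t | lambda)."""
    return sum(
        math.log(sum(w * gauss_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data)

def em_step(data, weights, means, variances, var_floor=1e-6):
    """One E-step (posteriors) followed by one M-step (weight, mean, variance)."""
    posteriors = []
    for x in data:
        joint = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        posteriors.append([j / total for j in joint])
    new_w, new_m, new_v = [], [], []
    for i in range(len(weights)):
        n_i = sum(p[i] for p in posteriors)          # soft count of component i
        mean = sum(p[i] * x for p, x in zip(posteriors, data)) / n_i
        var = sum(p[i] * (x - mean) ** 2 for p, x in zip(posteriors, data)) / n_i
        new_w.append(n_i / len(data))
        new_m.append(mean)
        new_v.append(max(var, var_floor))            # floor is an added safeguard
    return new_w, new_m, new_v

def fit_gmm(data, weights, means, variances, tol=1e-6, max_iter=200):
    """Iterate EM until the log-likelihood gain falls below tol."""
    current = previous = log_likelihood(data, weights, means, variances)
    for _ in range(max_iter):
        weights, means, variances = em_step(data, weights, means, variances)
        current = log_likelihood(data, weights, means, variances)
        if current - previous < tol:
            break
        previous = current
    return weights, means, variances, current
```

EM converges to a local maximum of the likelihood, so the result depends on the initial values chosen in step (a).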

Step four, the preprocessing and MFCC feature extraction of steps one and two are applied to the unobserved in-set data and the out-of-set data, which serve as test samples.

Step five, the test samples are substituted into the trained GMM to compute the likelihood ratio. For a feature vector x representing the target to be recognized, the decision statistic is given by:

$$\Lambda(x) = \log p(x \mid \lambda) \tag{10}$$

Given a threshold, the log-likelihood ratio of the feature vector is compared with it: if the log-likelihood ratio is greater than the threshold, the sample is judged to belong to the set; otherwise it is judged to be out of set.
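The scoring and threshold decision of step five can be sketched as follows. The sketch assumes diagonal covariance matrices (each component described by a vector of per-dimension variances) purely to stay short; the general D × D case needs a determinant and a matrix inverse. Function names are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """log N(x; mean, diag(var)) for a D-dimensional feature vector."""
    log_det = sum(math.log(v) for v in var)
    quad = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    return -0.5 * (len(x) * math.log(2.0 * math.pi) + log_det + quad)

def gmm_log_likelihood(x, weights, means, variances):
    """Score log p(x | lambda) = log sum_i w_i p_i(x).

    Uses the log-sum-exp trick for numerical stability; weights must be positive.
    """
    terms = [math.log(w) + log_gaussian_diag(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def is_in_set(score, threshold):
    """Decision rule: in-set if the score exceeds the threshold."""
    return score > threshold
```

A feature vector close to one of the trained components receives a higher score than a distant one, so lowering the threshold accepts more samples as in-set.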

Step six, when the system is tested, its false alarm probability E_fa and miss probability E_miss are computed, defined as:

$$E_{fa} = \frac{n_{fa}}{n_{imposter}} \tag{11}$$

$$E_{miss} = \frac{n_{miss}}{n_{target}} \tag{12}$$

where n_fa is the number of times out-of-set data are misjudged as in-set data, n_imposter is the number of out-of-set data, n_miss is the number of times in-set data are missed and judged as out-of-set data, and n_target is the number of in-set data.

The detection error tradeoff curve, i.e. the DET curve, plots the correspondence between E_fa and E_miss obtained for different decision thresholds within the score range of the test set. The closer the DET curve lies to the lower left, i.e. to the origin, the smaller the corresponding E_fa and E_miss, the less likely errors are, and the better the model accuracy and the identification accuracy of the system. As the threshold decreases, E_fa increases and E_miss decreases, so the DET curve falls; when the threshold has decreased to the point where E_fa = E_miss, this common value is the equal error rate (EER) and the corresponding threshold is the equal error rate threshold. By plotting the DET curve, the equal error rate and the equal error rate threshold are found for use in the operational stage of the system.
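The error rates and the search for the equal error rate threshold can be sketched as a simple threshold sweep. Taking the candidate thresholds from the observed scores and picking the point where E_fa and E_miss are closest are illustrative choices, and the function names are hypothetical.

```python
def error_rates(target_scores, imposter_scores, threshold):
    """Return (E_fa, E_miss) for one threshold.

    A sample is accepted as in-set when its score is greater than the threshold.
    """
    n_fa = sum(1 for s in imposter_scores if s > threshold)
    n_miss = sum(1 for s in target_scores if s <= threshold)
    return n_fa / len(imposter_scores), n_miss / len(target_scores)

def det_points(target_scores, imposter_scores):
    """(threshold, E_fa, E_miss) for every observed score used as a threshold."""
    thresholds = sorted(set(target_scores) | set(imposter_scores))
    return [(t,) + error_rates(target_scores, imposter_scores, t)
            for t in thresholds]

def equal_error_point(target_scores, imposter_scores):
    """DET point at which E_fa and E_miss are closest to each other."""
    return min(det_points(target_scores, imposter_scores),
               key=lambda point: abs(point[1] - point[2]))
```

With in-set scores `[-10, -12, -11, -20]` and out-of-set scores `[-30, -25, -15, -22]`, the sweep finds E_fa = E_miss = 0.25 at a threshold of -20.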

The beneficial effects of the invention are as follows. When a target of unknown class is encountered, the method treats identification as an in-set/out-of-set decision: Mel-frequency cepstral coefficients (MFCCs), which describe the nonlinear characteristics of human hearing, are extracted from the known in-set data, a Gaussian mixture model is trained, and a suitable threshold is selected, thereby constructing a target identification system. Samples whose label information is difficult to obtain are fed into this system to decide whether they belong to the known sample set, which gives a preliminary judgment of the class of the non-cooperative target while reducing the requirements on label information and on the amount of sample data. By combining MFCC feature parameters, which represent human auditory characteristics, with a Gaussian mixture model, the invention constructs an automatic identification system for non-cooperative underwater targets that needs no specific labels or classes for the samples when the model is trained and that reduces the amount of data to be collected for each underwater target. The system is computationally efficient and has good identification performance and robustness.

The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.

Drawings

FIG. 1 is a flow chart of the non-cooperative underwater target automatic identification method based on the Gaussian mixture model.

Fig. 2 is a flow chart of MFCC feature extraction in fig. 1.

Fig. 3 is a diagram of the GMM calculation structure in the method of the present invention.

Fig. 4 is a DET curve plotted for an embodiment of the method of the present invention.

Detailed Description

Reference is made to fig. 1-4.

The data set used in this embodiment contains 3 classes of underwater acoustic targets. Fifteen recordings, each about 6 seconds long, are collected for each class, giving 45 sample files in total; the classes are denoted class I, class II and class III, and the sampling frequency is 8000 Hz. For testing, the class I and class II data serve as in-set data and the class III data as out-of-set data. Part of the class I and class II data is selected as the training set used to train and build the GMM, and the remaining class I and class II data together with the class III data form the test set used to test the trained GMM.

The automatic identification method of the non-cooperative underwater target based on the Gaussian mixture model comprises the following specific steps:

Step one: the training samples observed in the set are preprocessed; the preprocessing comprises three parts, namely pre-emphasis, framing and windowing. The training samples are underwater acoustic target data.

MATLAB is used as the implementation platform. Seven samples are selected from each of the class I and class II data, giving 14 training samples. Framing, windowing, filter bank construction and related processing are carried out with functions from the VOICEBOX toolbox to complete the preprocessing; the duration of each frame is 0.1 s.

Step two: MFCC features are extracted from the preprocessed training samples.

13-dimensional MFCC features are extracted from the preprocessed result, giving a 1667 × 13 feature matrix.

Step three: using the MFCC features extracted from the training samples, the fitgmdist function in MATLAB is called with the mixing degree M set to 5 to complete the parameter estimation and train the GMM. The mixture weights of the GMM form a 1 × 5 array, the means a 5 × 13 array, and the covariance a 13 × 5 array.

The GMM is trained with the MFCC features extracted from the in-set data as feature vectors. The mixture probability density of formula (1), the single Gaussian density of formula (3), and the EM procedure of steps (a) to (e), in which the weight, mean and variance are re-estimated from the posterior probabilities until the log-likelihood function or the estimated parameters converge, are as set out in step three of the Disclosure of Invention, here with M = 5 and D = 13.

Step four: the remaining samples of the 3 classes of underwater acoustic targets are used as test samples and undergo the same preprocessing and 13-dimensional MFCC feature extraction. The remaining 16 samples of the class I and class II data yield a 1905 × 13 feature matrix belonging to the set, and the 15 samples of class III yield an 1800 × 13 feature matrix belonging outside the set.
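The sample and frame counts of the embodiment can be cross-checked with a few lines of arithmetic. The check assumes recordings of exactly 6 s and a hop of half a frame (50 % overlap); neither is stated in the text, which says only that recordings are about 6 seconds long and frames are 0.1 s. Under these assumptions the estimates are 1666, 1904 and 1785 frames, close to the reported 1667, 1905 and 1800.

```python
SAMPLE_RATE_HZ = 8000
FRAME_LEN = 800            # 0.1 s at 8000 Hz
HOP_LEN = 400              # ASSUMPTION: 50 % overlap, not stated in the text
RECORDINGS_PER_CLASS = 15
TRAIN_PER_IN_SET_CLASS = 7

def frames_per_recording(n_samples, frame_len=FRAME_LEN, hop_len=HOP_LEN):
    """Number of full frames obtained from one recording."""
    if n_samples < frame_len:
        return 0
    return 1 + (n_samples - frame_len) // hop_len

# Two in-set classes (I and II) and one out-of-set class (III).
n_train = 2 * TRAIN_PER_IN_SET_CLASS
n_test_in_set = 2 * (RECORDINGS_PER_CLASS - TRAIN_PER_IN_SET_CLASS)
n_test_out_of_set = RECORDINGS_PER_CLASS

# ASSUMPTION: every recording is exactly 6 s long.
per_recording = frames_per_recording(6 * SAMPLE_RATE_HZ)

estimated_rows = {
    "train": n_train * per_recording,
    "test_in_set": n_test_in_set * per_recording,
    "test_out_of_set": n_test_out_of_set * per_recording,
}
reported_rows = {"train": 1667, "test_in_set": 1905, "test_out_of_set": 1800}
```

The small differences between estimated and reported rows would be consistent with recordings slightly longer than 6 s, but the text gives no exact durations.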

That is, the preprocessing and MFCC feature extraction of steps one and two are applied to the unobserved in-set data and the out-of-set data, which serve as test samples.

Step five: the test samples are substituted into the trained GMM to compute the likelihood ratio. For a feature vector x representing the target to be recognized, the decision statistic is given by:

$$\Lambda(x) = \log p(x \mid \lambda) \tag{10}$$

Step six: when the system is tested, its false alarm probability E_fa and miss probability E_miss are computed, defined as:

$$E_{fa} = \frac{n_{fa}}{n_{imposter}} \tag{11}$$

$$E_{miss} = \frac{n_{miss}}{n_{target}} \tag{12}$$

Claims

1. A non-cooperative underwater target automatic identification method based on a Gaussian mixture model is characterized by comprising the following steps:

step one, the training samples observed in the set are preprocessed, the preprocessing comprising three parts, namely pre-emphasis, framing and windowing; the training samples are underwater acoustic target data;

step two, extracting MFCC characteristics from the preprocessed training samples;

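step three, training a GMM model by using MFCC features extracted from the data in the set as feature vectors; for a feature vector, its mixture probability density in the GMM is defined as:

$$p(x \mid \lambda) = \sum_{i=1}^{M} \omega_i \, p_i(x) \tag{1}$$

where x is the feature vector, λ denotes the GMM, p(x|λ) is the likelihood of the feature vector under the model λ, M is the mixing degree of the GMM, p_i(x) is the probability density of the i-th single Gaussian density function, and ω_i is the mixture weight of the i-th single Gaussian density function, which must satisfy:

$$\sum_{i=1}^{M} \omega_i = 1 \tag{2}$$

the mixture probability density is a weighted linear combination of the M densities p_i(x) in the GMM; p_i(x) is represented by formula (3):

$$p_i(x) = \frac{1}{(2\pi)^{D/2}\,|\Sigma_i|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu_i)^{\mathsf{T}} \Sigma_i^{-1} (x - \mu_i) \right\} \tag{3}$$

where D is the dimension of the feature vector, μ_i is the D-dimensional mean vector of the i-th single Gaussian density function, and Σ_i is the D × D covariance matrix of the i-th single Gaussian density function, with i = 1, 2, ..., M; the mixing degree M is selected in advance and is larger than the number of classes, and μ_i, Σ_i and ω_i are estimated by the expectation maximization algorithm, i.e. the EM algorithm, to train the GMM model, with the following specific steps:

(a) for a given feature sequence X = {x₁, x₂, x₃, ..., x_T}, selecting the mixing degree M of the GMM to be trained, setting initial values for the weight, mean and variance of the i-th Gaussian distribution, computing p(x_t|λ) according to formula (1), and then computing the log-likelihood function according to formula (4):

$$\log p(X \mid \lambda) = \sum_{t=1}^{T} \log p(x_t \mid \lambda) \tag{4}$$

(b) computing the posterior probability of each feature vector according to formula (5) from the given initial values of the weight, mean and variance:

$$p(i \mid x_t, \lambda) = \frac{\omega_i \, p_i(x_t)}{\sum_{k=1}^{M} \omega_k \, p_k(x_t)} \tag{5}$$

(c) recomputing the weight, mean and variance from the posterior probabilities computed in step (b), and substituting them into formula (4) to compute a new log-likelihood function;

for the i-th Gaussian density function, the number of feature vectors contained is:

$$n_i = \sum_{t=1}^{T} p(i \mid x_t, \lambda) \tag{6}$$

the weight is:

$$\omega_i = \frac{n_i}{T} \tag{7}$$

the mean is:

$$\mu_i = \frac{1}{n_i} \sum_{t=1}^{T} p(i \mid x_t, \lambda)\, x_t \tag{8}$$

the variance is:

$$\Sigma_i = \frac{1}{n_i} \sum_{t=1}^{T} p(i \mid x_t, \lambda)\, (x_t - \mu_i)(x_t - \mu_i)^{\mathsf{T}} \tag{9}$$

(d) repeating steps (b) and (c) until the log-likelihood function or the estimated parameters converge;

(e) the weight, mean and variance obtained at this point maximize p(X|λ), which completes the parameter estimation of the GMM and trains the GMM model;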

step four, applying the preprocessing and MFCC feature extraction of step one and step two to the unobserved in-set data and the out-of-set data, which serve as test samples;

step five, substituting the test samples into the trained GMM and computing the likelihood ratio; for a feature vector x representing the target to be recognized, the decision statistic is given by:

$$\Lambda(x) = \log p(x \mid \lambda) \tag{10}$$

given a threshold, comparing the log-likelihood ratio of the feature vector with the threshold, wherein if the log-likelihood ratio is greater than the threshold, the sample belongs to the set, and otherwise the sample is out of set;

step six, when the system is tested, computing the false alarm probability E_fa and the miss probability E_miss of the system, defined as:

$$E_{fa} = \frac{n_{fa}}{n_{imposter}} \tag{11}$$

$$E_{miss} = \frac{n_{miss}}{n_{target}} \tag{12}$$

wherein n_fa is the number of times out-of-set data are misjudged as in-set data, n_imposter is the number of out-of-set data, n_miss is the number of times in-set data are missed and judged as out-of-set data, and n_target is the number of in-set data;

the detection error tradeoff curve, i.e. the DET curve, is a graph of the correspondence between E_fa and E_miss obtained for different decision thresholds within the score range of the test set; the closer the DET curve lies to the lower left, i.e. to the origin, the smaller the corresponding E_fa and E_miss, the less likely errors are, and the better the model accuracy and the identification accuracy of the system; as the threshold decreases, E_fa increases and E_miss decreases, so the DET curve falls; when the threshold has decreased to the point where E_fa = E_miss = EER, the corresponding threshold is the equal error rate threshold; by plotting the DET curve, the equal error rate and the equal error rate threshold are found for use in the operational stage of the system.