CN109919241B

CN109919241B - Hyperspectral unknown class target detection method based on probability model and deep learning

Info

Publication number: CN109919241B
Application number: CN201910200211.2A
Authority: CN
Inventors: 江天; 彭元喜; 张立雄; 宋明辉; 郝昊; 刘煜; 张俊; 李春潮; 余永涛; 张龙龙
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2019-03-15
Filing date: 2019-03-15
Publication date: 2020-09-29
Anticipated expiration: 2039-03-15
Also published as: CN109919241A

Abstract

The invention belongs to the field of hyperspectral intelligent perception and discloses a hyperspectral unknown class target detection method based on a probability model and deep learning. The method comprises: S1, inputting hyperspectral training data into a trained CNN classification model and outputting the activation vectors of all samples of each class; S2, accumulating and averaging the activation vectors of all correctly classified samples of the same class to obtain a mean activation vector, which represents the center of that class; S3, fitting a Weibull model for each class based on the activation vectors of all samples in the class and the mean activation vector of the class; S4, based on the Weibull fitting result of each class, inputting hyperspectral test data into a network formed by the CNN model and the probability model, and calculating the probability that the test data belongs to an unknown class. The method has a clear structure, is easy to implement, reduces the training requirement of the neural network learning model, and can markedly improve the detection of unknown class targets.

Description

Hyperspectral unknown class target detection method based on probability model and deep learning

Technical Field

The invention mainly relates to the field of hyperspectral intelligent sensing, in particular to a hyperspectral unknown class target detection method based on a probability model and deep learning.

Background

Hyperspectral data combine two-dimensional information on the spatial distribution of ground objects with one-dimensional information describing their spectral characteristics. The spectral resolution reaches the nanometer level, spectral information in dozens to hundreds of bands is provided for each pixel, and a complete, continuous spectral curve can be generated. Compared with visible-light and multispectral data, hyperspectral data carry richer ground-object spectral information and can reflect the fine spectral attributes of the objects under observation in detail. Based on the differences in spectral characteristics between ground objects, a computer or dedicated equipment runs a classification and identification program that assigns a class label to each pixel and divides the observed scene into regions of different classes; this is called hyperspectral image classification and identification. Hyperspectral image classification and identification is divided into supervised classification and identification and unsupervised classification and identification (including semi-supervised classification).

Hyperspectral image classification and identification methods based on deep learning are mostly supervised. A deep neural network is trained on labeled hyperspectral image data, and the trained network is then used to classify and identify hyperspectral images. Because the real environment is complex and dynamic, classes that are absent from the training samples will inevitably be encountered when the trained network classifies hyperspectral images. We define targets that are input during classification and recognition but do not belong to any class in the training samples as hyperspectral unknown class targets. Hyperspectral unknown class target detection determines the position and size of such targets and thereby serves the purpose of identifying unknown class targets.

At present, detection of unknown class targets with a deep neural network mostly relies on the network's SoftMax layer: when the probability of belonging to each known class, as calculated by the SoftMax layer, is below a given threshold, the input is judged to be an unknown class target. However, the accuracy of identifying and detecting unknown class targets is poor when only the SoftMax output is relied upon.
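The SoftMax-threshold baseline described above can be sketched as follows. This is a minimal illustration, not code from the patent: the function names and the threshold value 0.5 are assumptions chosen for the example.

```python
import math

def softmax(logits):
    """Numerically stable SoftMax over a list of activations."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_threshold_detect(logits, threshold=0.5):
    """Return the predicted class index, or None for 'unknown'.

    The input is flagged as unknown when every known-class probability
    is below `threshold` (an illustrative value, not from the source).
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None

confident = softmax_threshold_detect([4.0, 0.5, 0.2])  # one class dominates
ambiguous = softmax_threshold_detect([1.0, 1.1, 0.9])  # no class dominates
```

Because SoftMax always distributes a total probability of 1 over the known classes only, an input far from every class can still receive one large probability, which is one way to read the weakness noted in the text.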

Hyperspectral unknown class targets can also be detected with hyperspectral anomalous target detection techniques. Anomalous target detection in a hyperspectral image means searching the hyperspectral data cube for sparse pixels whose spectral shape is unknown; it requires no prior spectral information. There are existing results and patents in this field. Reference 1 (Chinese invention patent, application number CN201510593935) proposes a sparse RX anomalous target detection algorithm based on sparse representation theory. The algorithm first applies spatial preprocessing to make the target information of the hyperspectral image more prominent than the background information, then uses bilateral filtering to suppress the influence of noise; it next calculates a sparse difference index of the hyperspectral image using sparse representation theory, reconstructs the hyperspectral image data vectors with this index, and finally performs anomalous target detection with the RX method to obtain the detection result. The algorithm improves detection precision, false alarm rate and robustness to a certain extent and improves the performance of the RX anomaly detection algorithm, but its computation is relatively complex and its running time is long.

In general, few effective methods are currently available for hyperspectral unknown class target detection. A deep neural network that uses only a SoftMax layer to detect unknown class targets achieves low detection and identification precision. Hyperspectral anomalous target detection algorithms can also be used for this purpose, but their computation is complex and their running time is long. The invention provides a hyperspectral unknown class target detection method based on a probability model and deep learning. In practical use, the activation vector of the CNN is input into the previously established probability model, which measures the distance between this vector and the mean activation vectors of all known classes and judges whether the activation vector belongs to an unknown class target. The method addresses the low precision of deep neural networks that detect unknown class targets using only the SoftMax layer, and improves both the efficiency and the precision of hyperspectral unknown class target detection.

Disclosure of Invention

A target that is input to the hyperspectral classification and recognition system during classification and recognition and does not belong to any class in the training samples is called a hyperspectral unknown class target. Determining the position and size of such a target is called hyperspectral unknown class target detection. Hyperspectral anomalous target detection can also be used to detect unknown targets, but many anomalous target detection algorithms are computationally complex and take too long to run. A deep neural network that detects unknown class targets using only the SoftMax layer has low detection precision and cannot effectively detect and identify synthesized deceptive images.

To address these problems, the invention introduces a Weibull probability model calculation layer between the last layer of the deep neural network (the SoftMax layer) and the penultimate layer (the activation output layer of the CNN classification model). Using the Weibull probability model built from the correctly classified samples during training, the probability of belonging to each known class and the probability of belonging to the unknown class are calculated for the activation vector output by the CNN, and classification is performed according to the result.

A hyperspectral unknown class target detection method based on a probability model and deep learning comprises the following steps:

S1, inputting hyperspectral training data into a trained CNN classification network for classification, wherein the classification output is the activation vectors of all samples of each class, and the activation vector output by the CNN classification model is expressed as v(x) = (v₁(x), v₂(x), ..., v_N(x)), where v₁(x), v₂(x), ..., v_N(x) are the elements (components) of the activation vector;

S2, accumulating the activation vectors of all samples that belong to the same class and are correctly classified, and averaging them to obtain a mean activation vector, which represents the center of the class. The i-th input sample that is correctly classified into the j-th class by the CNN network is denoted x_i,j; the activation vector output for sample x_i,j by the CNN model is denoted v_j(x_i,j), and s_i,j = v_j(x_i,j) is taken. With N₁ the number of training samples correctly classified into the j-th class, the mean activation vector of the j-th class is:

$$\mu_j = \frac{1}{N_1} \sum_{i=1}^{N_1} s_{i,j}$$

where ∑ denotes summation: the N₁ activation vectors of class j are summed and divided by N₁ to obtain the mean activation vector. Repeating this calculation N times gives the mean activation vectors of the N classes;
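Step S2 can be illustrated with a short sketch. It assumes the activation vectors, true labels and predicted labels are already available as plain Python lists; the function name and the toy data are hypothetical, and class indices start at 0 in the code whereas the text numbers classes from 1.

```python
def mean_activation_vectors(activations, labels, predictions, num_classes):
    """Per-class mean activation vector over correctly classified samples."""
    dim = len(activations[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for vec, label, pred in zip(activations, labels, predictions):
        if label != pred:
            continue  # misclassified samples do not contribute to the mean
        counts[label] += 1
        for k, value in enumerate(vec):
            sums[label][k] += value
    # A class with no correctly classified sample has no mean (None).
    mavs = [
        [total / counts[j] for total in sums[j]] if counts[j] else None
        for j in range(num_classes)
    ]
    return mavs, counts

# Toy data: the fourth sample is misclassified and is therefore skipped.
activations = [[2.0, 0.0], [4.0, 2.0], [0.0, 3.0], [9.0, 9.0]]
labels = [0, 0, 1, 1]
predictions = [0, 0, 1, 0]
mavs, counts = mean_activation_vectors(activations, labels, predictions, num_classes=2)
```

Here `counts[j]` plays the role of N₁ for class j, and `mavs[j]` is the corresponding class center.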

S3, fitting a Weibull probability model for each class based on the activation vectors of the samples and the mean activation vector of each class, wherein the fitting result is the location parameter τ_j, scale parameter λ_j and shape parameter κ_j of the Weibull probability model of each class;

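wherein the probability density function of the Weibull distribution is

$$f(x;\tau_j,\lambda_j,\kappa_j) = \frac{\kappa_j}{\lambda_j}\left(\frac{x-\tau_j}{\lambda_j}\right)^{\kappa_j-1}\exp\!\left[-\left(\frac{x-\tau_j}{\lambda_j}\right)^{\kappa_j}\right],\quad x \ge \tau_j$$

where λ_j is the scale parameter, κ_j is the shape parameter, τ_j is the location parameter, and x is the random variable;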
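The three-parameter Weibull density described above (location τ_j, scale λ_j, shape κ_j) can be evaluated as in the following sketch. The standard textbook form of the density is assumed, and the function names are hypothetical; the cumulative distribution function is included because it is what a fitted model is usually queried with.

```python
import math

def weibull_pdf(x, tau, lam, kappa):
    """Three-parameter Weibull density: location tau, scale lam, shape kappa.

    Points at or below the location parameter are treated as outside the
    support, which also avoids raising 0 to a negative power when kappa < 1.
    """
    if x <= tau:
        return 0.0
    z = (x - tau) / lam
    return (kappa / lam) * z ** (kappa - 1.0) * math.exp(-(z ** kappa))

def weibull_cdf(x, tau, lam, kappa):
    """Cumulative distribution function of the same Weibull model."""
    if x <= tau:
        return 0.0
    return 1.0 - math.exp(-(((x - tau) / lam) ** kappa))

# With kappa = 1 and tau = 0 the model reduces to an exponential distribution.
density_at_two = weibull_pdf(2.0, tau=0.0, lam=2.0, kappa=1.0)
```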

wherein the parameters τ_j, κ_j, λ_j of the j-th class can be calculated with the FitHigh function of the libMR library for Python, as follows:

first, the mean activation vector μ_j calculated in S2 is used to calculate the farthest distance η_j, where η_j is the maximum of the distances from all activation vectors of class j to the mean activation vector μ_j;

next, a normalized vector is calculated using the mean activation vector: the mean activation vector μ_j is subtracted from the activation vector s_i,j, and the result is divided by the variance of the activation vectors of class j;

finally, the values calculated above are passed to the FitHigh function of the libMR library for Python to calculate τ_j, κ_j, λ_j of the j-th class, where j = 1, ..., N, and the resulting parameters τ_j, κ_j, λ_j are stored;
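The quantities that feed the Weibull fit in S3 can be sketched as follows. The Euclidean metric, the `tail_size` parameter and the function names are assumptions made for illustration; the source only speaks of "distance" and delegates the fit itself to libMR, which is deliberately not called here.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def class_distance_tail(class_activations, mav, tail_size):
    """Distances to the class center, their maximum (eta_j), and the
    `tail_size` largest distances, i.e. the extreme values to be fitted."""
    distances = [euclidean(vec, mav) for vec in class_activations]
    eta = max(distances)
    tail = sorted(distances)[-tail_size:]
    return distances, eta, tail

# Four activation vectors of one class; their mean is [2.0, 2.0].
class_activations = [[1.0, 1.0], [3.0, 1.0], [2.0, 1.0], [2.0, 5.0]]
mav = [2.0, 2.0]
distances, eta, tail = class_distance_tail(class_activations, mav, tail_size=2)
```

In a full pipeline the `tail` values would be handed to the Weibull fitting routine, and the fitted parameters stored per class.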

S4, based on the Weibull fitting result of each class, inputting the new activation vector of the hyperspectral test data into the Weibull distribution model of each class and calculating the probability of belonging to the unknown class, with the following specific steps:

(4a) the activation vector output after the test data is processed by the CNN network is v(x) = (v₁(x), v₂(x), ..., v_N(x)); the components of v(x) are sorted and the position index of each component is returned in descending order, where s(i) = argsort(v_j(x)) denotes sorting the activation vector components v_j(x) from small to large and returning the component indices, s(i) being an index into v_j(x); the adjustment coefficients are preset to ω_j = 1, j = 1, 2, 3, ..., N;

(4b) adjusting the top α classes by calculating an adjustment coefficient ω_s(i)(x) for each of them, where i = 1, 2, ..., α; τ_s(i), λ_s(i), κ_s(i) denote the location, scale and shape parameters of class s(i); and ω_s(i)(x) is the calculated adjustment coefficient. The α adjustment coefficients ω_s(i)(x), s(i) = 1, 2, ..., α, together with the remaining N − α unadjusted coefficients ω_j = 1, j = α+1, α+2, ..., N, form the N-component adjustment vector ω(x) = (ω₁(x), ω₂(x), ..., ω_α(x), 1, ..., 1)^T, where T denotes vector transposition;

(4c) adjusting the activation vector with the adjustment vector ω(x) obtained in step (4b), as follows:

$$\hat{v}(x) = v(x) \circ \omega(x)$$

where ∘ denotes the element-wise product of two vectors: corresponding elements of the two vectors are multiplied to give the corresponding element of the new vector; v(x) = (v₁(x), v₂(x), ..., v_N(x)) is the activation vector, ω(x) is the adjustment vector formed by the adjustment coefficients, and their element-wise product is the adjusted activation vector $\hat{v}(x)$;

(4d) calculating the unknown class activation adjustment component:

$$\hat{v}_0(x) = \sum_{i=1}^{\alpha} v_i(x)\,\bigl(1 - \omega_i(x)\bigr)$$

where ∑ denotes summation with i ranging from 1 to α, v_i(x) is the i-th element (component) of the activation vector, ω_i(x) is the adjustment coefficient, and the result is the unknown class activation component $\hat{v}_0(x)$;

(4e) taking the adjusted activation vector and the unknown class activation component as input to a SoftMax function and calculating the probability that the input test data belongs to an unknown class target, wherein the probability of belonging to the j-th class is:

$$P(y = j \mid x) = \frac{\exp\bigl(\hat{v}_j(x)\bigr)}{\sum_{i=0}^{N} \exp\bigl(\hat{v}_i(x)\bigr)}$$

where j ranges from 0 to N, $\hat{v}_j(x)$ for j = 1, ..., N is the j-th component of the adjusted activation vector, and $\hat{v}_0(x)$ is the unknown class activation component. The exponential with base e is taken of each component, the exponentials are summed, and the exponential of the j-th component is divided by this sum to give the probability P(y = j | x) that the input test sample belongs to the j-th class;

N + 1 probabilities are thus calculated;

the class number with the maximum probability among the N + 1 probabilities is taken:

y* = argmax_j P(y = j | x), j = 0, 1, 2, ..., N

where argmax_j returns the class number that attains the maximum among the N + 1 probabilities, and j = 0 is the unknown target class number; if the class number of the maximum probability equals 0, i.e., y* = 0, or the maximum probability P(y = y* | x) is less than a preset probability threshold, the test input x is judged to belong to an unknown class target.
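Steps (4a) to (4e) and the final decision rule can be sketched end to end as below. Several points are assumptions rather than statements from the source: the adjustment coefficient is taken as ω = 1 − ((α − i)/α)·F(d), where F is the Weibull cumulative distribution function and d is the distance of the test activation to the class center, in the style of published OpenMax implementations; the unknown class component is taken as the activation removed from the known classes; the distances are supplied by the caller; and all names and the threshold value are hypothetical.

```python
import math

def weibull_cdf(d, tau, lam, kappa):
    """CDF of a three-parameter Weibull model evaluated at distance d."""
    if d <= tau:
        return 0.0
    return 1.0 - math.exp(-(((d - tau) / lam) ** kappa))

def detect_unknown(v, distances, weibull_params, alpha, threshold):
    """Return (label, probabilities); label 0 means 'unknown class'.

    v              -- activation vector of the N known classes
    distances      -- distance of the sample to each class center (assumed given)
    weibull_params -- per-class (tau, lam, kappa) from the offline fit
    """
    n = len(v)
    # (4a) class indices in descending order of activation; omega preset to 1
    order = sorted(range(n), key=lambda j: v[j], reverse=True)
    omega = [1.0] * n
    # (4b) adjust only the top-alpha classes (coefficient form is an assumption)
    for i, j in enumerate(order[:alpha], start=1):
        tau, lam, kappa = weibull_params[j]
        omega[j] = 1.0 - (alpha - i) / alpha * weibull_cdf(distances[j], tau, lam, kappa)
    # (4c) element-wise product of activation and adjustment vectors
    v_hat = [vj * wj for vj, wj in zip(v, omega)]
    # (4d) activation removed from known classes goes to the unknown class;
    # unadjusted classes have omega = 1 and therefore contribute nothing
    v0 = sum(vj * (1.0 - wj) for vj, wj in zip(v, omega))
    # (4e) SoftMax over N + 1 activations; index 0 is the unknown class
    scores = [v0] + v_hat
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(n + 1), key=probs.__getitem__)
    label = 0 if best == 0 or probs[best] < threshold else best
    return label, probs

params = [(0.0, 1.0, 2.0)] * 3
# Same activations, but the first sample lies close to its class center
# and the second lies far from every center.
near_label, near_probs = detect_unknown([6.0, 1.0, 0.5], [0.1, 3.0, 3.0], params, alpha=3, threshold=0.5)
far_label, far_probs = detect_unknown([6.0, 1.0, 0.5], [3.0, 3.0, 3.0], params, alpha=3, threshold=0.5)
```

In this toy run the two inputs have identical activations, so a plain SoftMax would treat them the same; only the distance to the class centers separates the known sample (label 1) from the unknown one (label 0).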

Steps S1 to S3 establish the probability model of each known class from the training data; they can be computed offline on the training set and their results stored. Step S4 uses the established probability model to detect unknown class targets in the input data and processes test data online in real time.

Compared with the prior art, the invention has the advantages that:

1. The invention introduces the statistical Weibull distribution and adds a Weibull probability model calculation layer before the SoftMax layer of the deep neural network, making the network better suited to open set recognition; in addition, the invention both classifies targets and detects unknown targets, thereby improving the efficiency and precision of unknown target detection.

2. The invention reduces the training requirement of the neural network learning model; it applies deep learning to hyperspectral unknown target recognition and broadens the range of deep learning applications.

Drawings

FIG. 1 is a flow chart of a hyperspectral unknown object detection framework of the invention;

FIG. 2 is a schematic diagram of steps S1 to S3 of the hyperspectral unknown target detection algorithm of the invention (the part that establishes the probability model);

FIG. 3 is a schematic diagram of step S4 of the hyperspectral unknown target detection algorithm of the invention (the part that detects unknown targets).

Detailed Description

The invention will be described in further detail below with reference to the drawings and specific examples.

Fig. 1 shows the overall flowchart of unknown class target detection. Hyperspectral training data is input into a trained CNN classification model, and the output of the last fully connected layer of the model is used as the activation vector of each sample of each class. The activation vectors of the correctly classified samples in each class are accumulated and averaged, and the class is characterized by its mean activation vector. A Weibull probability model is fitted for each class using the activation vectors and the mean activation vector of the class; the fitting result is the location parameter τ_j, scale parameter λ_j and shape parameter κ_j of the Weibull probability model. Based on the Weibull fitting result of each class, the new activation vector of the hyperspectral test data is input into the Weibull distribution model of each class, and the probability of belonging to an unknown class is calculated.

FIG. 2 shows the calculation flow of the training part when the probabilistic model is built in S1 to S3, and the process only depends on the training set, so that the process can be calculated in advance.

S1, inputting hyperspectral training data into a trained CNN classification network for classification, wherein the classification output is the activation vectors of all samples of each class, and the activation vector output by the CNN classification model is expressed as v(x) = (v₁(x), v₂(x), ..., v_N(x)), where v₁(x), v₂(x), ..., v_N(x) are the elements (components) of the activation vector;

S2, accumulating the activation vectors of all samples that belong to the same class and are correctly classified, and averaging them to obtain a mean activation vector, which represents the center of the class. The i-th input sample that is correctly classified into the j-th class by the CNN network is denoted x_i,j; the activation vector output for sample x_i,j by the CNN model is denoted v_j(x_i,j), and s_i,j = v_j(x_i,j) is taken. With N₁ the number of training samples correctly classified into the j-th class, the mean activation vector of the j-th class is:

$$\mu_j = \frac{1}{N_1} \sum_{i=1}^{N_1} s_{i,j}$$

where ∑ denotes summation: the N₁ activation vectors of class j are summed and divided by N₁ to obtain the mean activation vector, and the mean activation vectors of the N classes are obtained in this way.

To realize hyperspectral unknown class detection, the probability model uses the theory of Meta-Recognition, that is, the output activation vector of the CNN is analyzed by means of Extreme Value Theory (EVT). EVT describes the statistical behavior of the extreme values of a probability distribution and defines three distribution functions to model that behavior mathematically; of these, the Weibull distribution fits the unknown class target detection scenario. When calculating the Weibull model of each class, the FitHigh function of the libMR library for Python can be used to complete the model fitting and obtain the model parameters.

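The probability density function of the Weibull distribution is

$$f(x;\tau_j,\lambda_j,\kappa_j) = \frac{\kappa_j}{\lambda_j}\left(\frac{x-\tau_j}{\lambda_j}\right)^{\kappa_j-1}\exp\!\left[-\left(\frac{x-\tau_j}{\lambda_j}\right)^{\kappa_j}\right],\quad x \ge \tau_j$$

where λ_j is the scale parameter, κ_j is the shape parameter, τ_j is the location parameter, and x is the random variable;

first, the mean activation vector μ_j calculated in S2 is used to calculate the farthest distance η_j, where η_j is the maximum of the distances from all activation vectors of class j to the mean activation vector μ_j;

next, a normalized vector is calculated using the mean activation vector;

finally, the values calculated above are used to calculate the parameters τ_j, κ_j, λ_j, and the resulting parameters are stored;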

Fig. 3 shows the calculation flow of the unknown target detection part. Because it processes real data, this calculation must be completed online; the step corresponds to the inference step in computer vision. In this algorithm the index of the unknown class target is 0. As shown in the figure, the inputs to the detection part are the output activation vector of the CNN (i.e., the activation vector of the test data) and the Weibull model parameters of each class calculated in the previous part.

In step (4b), i = 1, 2, ..., α; τ_s(i), λ_s(i), κ_s(i) denote the location, scale and shape parameters of class s(i); ω_s(i)(x) is the calculated adjustment coefficient; and the α adjustment coefficients ω_s(i)(x), s(i) = 1, 2, ..., α, together with the remaining N − α unadjusted coefficients ω_j = 1, j = α+1, α+2, ..., N, form the N-component adjustment vector ω(x) = (ω₁(x), ω₂(x), ..., ω_α(x), 1, ..., 1)^T, where T denotes vector transposition;

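(4d) calculating the unknown class activation adjustment component:

$$\hat{v}_0(x) = \sum_{i=1}^{\alpha} v_i(x)\,\bigl(1 - \omega_i(x)\bigr)$$

where ∑ denotes summation with i ranging from 1 to α, v_i(x) is the i-th element (component) of the activation vector, ω_i(x) is the adjustment coefficient, and the result is the unknown class activation component $\hat{v}_0(x)$;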

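(4e) taking the adjusted activation vector and the unknown class activation component as input to a SoftMax function:

$$P(y = j \mid x) = \frac{\exp\bigl(\hat{v}_j(x)\bigr)}{\sum_{i=0}^{N} \exp\bigl(\hat{v}_i(x)\bigr)}$$

where j ranges from 0 to N, $\hat{v}_j(x)$ for j = 1, ..., N is the j-th component of the adjusted activation vector, and $\hat{v}_0(x)$ is the unknown class activation component;

N + 1 probabilities are thus calculated;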

y* = argmax_j P(y = j | x), j = 0, 1, 2, ..., N

During the calculation, the joblib library for Python is used to store all intermediate data, including the model parameters of the Weibull distribution, the mean activation vectors (MAV) of all classes, and so on.
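Storing and reloading the offline results might look like the sketch below. The dictionary layout, file name and helper names are hypothetical; `joblib.dump` and `joblib.load` are the library's standard persistence calls, and the fallback to the standard `pickle` module is added only so that the example still runs where joblib is not installed.

```python
import os
import pickle
import tempfile

try:
    import joblib  # the library named in the text
except ImportError:  # fallback so the sketch still runs without joblib
    joblib = None

def save_model(model, path):
    """Persist the offline results (class centers and Weibull parameters)."""
    if joblib is not None:
        joblib.dump(model, path)
    else:
        with open(path, "wb") as handle:
            pickle.dump(model, handle)

def load_model(path):
    """Reload the stored results for the online detection step."""
    if joblib is not None:
        return joblib.load(path)
    with open(path, "rb") as handle:
        return pickle.load(handle)

# Illustrative values only: two classes, each with a mean activation
# vector and a (tau, lambda, kappa) triple.
model = {
    "mav": [[3.0, 1.0], [0.0, 3.0]],
    "weibull": [(0.0, 1.0, 2.0), (0.1, 0.8, 1.5)],
}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "openset_model.pkl")
    save_model(model, path)
    restored = load_model(path)
```

Keeping the class centers and Weibull parameters in one stored object matches the offline/online split: the training-side results are written once and only read during detection.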

The above description is only a preferred embodiment of the present invention. The protection scope of the present invention is not limited to the above embodiment; all technical solutions that follow the idea of the present invention fall within its protection scope. It should be noted that modifications and refinements made by those skilled in the art without departing from the principle of the invention are also considered to be within the protection scope of the invention.

Claims

1. A hyperspectral unknown class target detection method based on a probability model and deep learning, characterized by comprising the following steps:

S2, accumulating and averaging the activation vectors of all samples that belong to the same class and are correctly classified to obtain a mean activation vector, which represents the center of the class, wherein the i-th input sample that is correctly classified into the j-th class by the CNN network is denoted x_i,j, the activation vector output for sample x_i,j by the CNN model is denoted v_j(x_i,j), s_i,j = v_j(x_i,j) is taken, the number of training samples correctly classified into the j-th class is N₁, and the mean activation vector of the j-th class is calculated as:

$$\mu_j = \frac{1}{N_1} \sum_{i=1}^{N_1} s_{i,j}$$

S3, fitting a Weibull probability model for each class based on the activation vectors of the samples and the mean activation vector of each class, wherein the fitting result is the location parameter τ_j, scale parameter λ_j and shape parameter κ_j of the Weibull probability model of each class;

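wherein the probability density function of the Weibull distribution is

$$f(x;\tau_j,\lambda_j,\kappa_j) = \frac{\kappa_j}{\lambda_j}\left(\frac{x-\tau_j}{\lambda_j}\right)^{\kappa_j-1}\exp\!\left[-\left(\frac{x-\tau_j}{\lambda_j}\right)^{\kappa_j}\right],\quad x \ge \tau_j$$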

wherein the parameters τ_j, κ_j, λ_j of the j-th class are calculated with the FitHigh function of the libMR library for Python, as follows:

next, a normalized vector is calculated using the mean activation vector;

finally, the calculated values of μ_j, η_j and the normalized vectors are passed to the FitHigh function of the libMR library for Python to calculate τ_j, κ_j, λ_j of the j-th class, where j = 1, ..., N, and the resulting parameters τ_j, κ_j, λ_j are stored;

(4a) the activation vector output after the test data is processed by the CNN network is v(x) = (v₁(x), v₂(x), ..., v_N(x)); the components of v(x) are sorted and the position index of each component is returned in descending order, where s(i) = argsort(v_j(x)) denotes sorting the activation vector components v_j(x) from small to large and returning the component indices, s(i) being an index into v_j(x); the adjustment coefficients are preset to ω_j = 1, j = 1, 2, 3, ..., N;