CN112733736A

CN112733736A - Class imbalance hyperspectral image classification method based on enhanced oversampling

Info

Publication number: CN112733736A
Application number: CN202110042320.3A
Authority: CN
Inventors: 冯伟; 朱文涛; 全英汇; 王勇; 李强; 杨琦; 童莹萍
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2021-01-13
Filing date: 2021-01-13
Publication date: 2021-04-30

Abstract

The invention discloses a class-imbalanced hyperspectral image classification method based on enhanced oversampling, which comprises the following steps: acquiring a hyperspectral image to be classified and a set of hyperspectral images for training; reducing the dimensionality of each image by principal component analysis, then applying edge filling and block extraction to each dimension-reduced image; applying enhanced oversampling to the training sample set to correct the class imbalance; and building a convolutional neural network model, training it, and predicting the class of each pixel block to be classified. The method addresses the low classification accuracy caused by small sample sizes and imbalanced classes in hyperspectral image classification, and improves the classification accuracy.

Description

Class imbalance hyperspectral image classification method based on enhanced oversampling

Technical Field

The invention relates to the technical field of remote sensing image processing, and in particular to a class-imbalanced hyperspectral image classification method based on enhanced oversampling.

Background

A hyperspectral image is formed by collecting information about an object across many electromagnetic wavebands by means of infrared spectral imaging technology. Compared with a natural image that has only red, green, and blue channels, the spectral dimension of a hyperspectral image carries rich information that is extremely important for detecting, classifying, and identifying objects, so hyperspectral images are widely used in precision agriculture, environmental monitoring, urban planning, military reconnaissance, and other fields. An important part of hyperspectral data processing is hyperspectral image classification, which generally means land cover classification: each pixel in the image is assigned to a land cover class according to its semantics.

In recent years, deep learning has become a mainstream approach to big data analysis and has achieved major breakthroughs in tasks such as image classification, object detection, and natural language processing. Building on these successes, deep learning has been introduced into hyperspectral image classification with good results. Compared with traditional hand-crafted feature methods, deep learning extracts informative features from raw data through a hierarchy of layers. Specifically, shallow layers extract simple features such as texture and edge information, while deeper layers express more complex features; the learning process is fully automatic, which makes deep learning well suited to a wide range of situations. The convolutional neural network (CNN) is one of the most representative deep learning models, and many CNN-based hyperspectral image classification methods have been proposed in the remote sensing community, with better classification performance than traditional machine learning methods. However, CNN models usually have a large number of parameters and need a large amount of training data to perform well; when training samples are scarce, classification performance degrades and classification accuracy is seriously affected.

Classification is an important task in machine learning and data mining. Intuitively: given a set of training examples, each with a class label, determine the class labels of one or more unseen test examples. In the real world, rare instances occur infrequently, so some classes in the data contain far more instances than others, which gives rise to the class imbalance problem. Because minority classes have few instances, the classification rules that predict them tend to be rare, undiscovered, or ignored, so test samples belonging to minority classes are more prone to misclassification than those belonging to majority classes. In some applications, correctly classifying minority-class samples is more valuable than correctly classifying majority-class samples, so models that can classify imbalanced data effectively are particularly important.

Existing approaches to the class imbalance problem fall broadly into two categories: external and internal methods. Internal methods reduce sensitivity to class imbalance by modifying the algorithm, and can be divided into dedicated algorithms that learn directly from imbalanced data sets, recognition-based one-class classification algorithms, cost-sensitive learning algorithms, and ensemble learning algorithms. External methods preprocess the training data set with a sampling technique to balance it; they have received wide attention because they are simple to implement and reasonably accurate, and are mainly divided into undersampling and oversampling. Undersampling balances the data set by deleting majority-class samples, the main example being random undersampling (RUS), but randomly deleting majority-class samples may cause the classifier to miss important features of the majority classes and significantly degrade overall classification performance. Oversampling balances the data set by adding minority-class training samples; common methods are random oversampling (ROS) and the synthetic minority oversampling technique (SMOTE). ROS balances the data set by randomly copying minority-class samples in the training set, and overfits easily on very small data sets. SMOTE creates artificial data based on feature-space similarity between existing minority-class samples, which easily causes class overlap and introduces noise samples; its biggest limitation is that it cannot be applied directly to generating samples of multi-dimensional data such as images.

Disclosure of Invention

In view of the problems in the prior art, the invention aims to provide a class-imbalanced hyperspectral image classification method based on enhanced oversampling, so as to address the low classification accuracy caused by small sample sizes and imbalanced classes in hyperspectral image classification and to improve the classification accuracy.

To achieve this purpose, the invention adopts the following technical scheme.

The class-imbalanced hyperspectral image classification method based on enhanced oversampling comprises the following steps:

Step 1, acquiring a hyperspectral image to be classified and a set of hyperspectral images for training;

Step 2, reducing the dimensionality of the hyperspectral image to be classified and of each image in the training set by principal component analysis, to obtain the dimension-reduced image to be classified and the dimension-reduced training set; then performing edge filling and block extraction on each dimension-reduced hyperspectral image to obtain the corresponding pixel blocks to be classified and the training sample set;

Step 3, applying enhanced oversampling to the training sample set to correct the class imbalance and obtain the corresponding enhanced oversampled training sample set, wherein the training samples in the training sample set are training pixel blocks;

Step 4, building a convolutional neural network model and training it with the enhanced oversampled training sample set to obtain a trained convolutional neural network; then inputting the pixel blocks to be classified into the trained network for class prediction to obtain the corresponding predicted classes, completing the classification of the hyperspectral image to be classified.

Compared with the prior art, the invention has the beneficial effects that:

(1) the invention establishes a general sample balancing rule that is suitable for preprocessing multi-dimensional data such as images, and provides a solution for deep learning training with small samples;

(2) the enhanced oversampling method provided by the invention effectively reduces the potential overfitting of the ROS sampling technique and the noise samples introduced by the SMOTE sampling technique;

(3) the sample processing of the invention helps to build a model that classifies imbalanced data effectively, thereby greatly improving the classification accuracy of minority classes.

Drawings

The invention is described in further detail below with reference to the figures and specific embodiments.

FIG. 1 is a flow chart of the implementation of the method of the present invention;

FIG. 2 is a schematic diagram of the 2-fold ROS sampling process in the present invention;

FIG. 3 is a schematic diagram of the CNN network processing procedure according to an embodiment of the present invention.

Detailed Description

Embodiments of the present invention will be described in detail below with reference to examples, but it will be understood by those skilled in the art that the following examples are only illustrative of the present invention and should not be construed as limiting the scope of the present invention.

Referring to FIG. 1, the invention provides a class-imbalanced hyperspectral image classification method based on enhanced oversampling, which includes the following steps:

Step 1, acquiring a hyperspectral image to be classified and a set of hyperspectral images for training.

The land cover classes in the training images include the land cover classes in the hyperspectral image to be classified. The training images are hyperspectral images of size h × w × n acquired from a public remote sensing data platform; the set may contain one or more images, and the number of land cover classes is N.

In addition, the ground truth, i.e. the true class label, of each training image must be obtained from field surveys or from visual interpretation of high-resolution images.

Step 2, reducing the dimensionality of each hyperspectral image by principal component analysis, then performing edge filling and block extraction on each dimension-reduced image.

(2.1) performing principal component analysis (PCA) on each hyperspectral image and retaining k principal components:

PCA expresses all samples with a suitable hyperplane; the idea is to map the n-dimensional features onto k dimensions (k < n), which are new orthogonal features called principal components. The main calculation process is as follows:

1) centering each feature of the data by subtracting its mean;

2) computing the covariance matrix of the features;

3) computing the eigenvalues and eigenvectors of the covariance matrix;

4) selecting the k largest eigenvalues and forming a new matrix from the corresponding eigenvectors;

5) projecting the sample points onto the selected eigenvectors.

Hyperspectral images have high dimensionality; PCA dimension reduction lessens the adverse effect of training the classifier on limited samples and strengthens its generalization ability.
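The five PCA steps listed above can be sketched with NumPy as follows. This is a minimal illustration rather than the implementation used in the patent: the function name `pca_reduce`, the random toy cube, and the choice of `numpy.linalg.eigh` for the eigen-decomposition are all assumptions made for the example.

```python
import numpy as np

def pca_reduce(cube, k):
    """Reduce an (h, w, n) hyperspectral cube to (h, w, k) with PCA."""
    h, w, n = cube.shape
    X = cube.reshape(-1, n).astype(np.float64)   # one row per pixel, one column per band
    X -= X.mean(axis=0)                          # 1) centre each band
    cov = np.cov(X, rowvar=False)                # 2) n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # 3) eigenvalues (ascending) and eigenvectors
    top = np.argsort(eigvals)[::-1][:k]          # 4) indices of the k largest eigenvalues
    W = eigvecs[:, top]                          #    n x k projection matrix
    return (X @ W).reshape(h, w, k)              # 5) project every pixel

# Toy stand-in for an h x w x n image (hypothetical data).
rng = np.random.default_rng(0)
cube = rng.normal(size=(6, 5, 12))
reduced = pca_reduce(cube, k=3)
print(reduced.shape)
```

Because the projection uses eigenvectors of the covariance matrix, the k output channels are mutually uncorrelated and ordered by decreasing variance.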

(2.2) edge filling of the hyperspectral image:

first, the fill edge size p is calculated:

$$p = \left\lfloor \frac{s - 1}{2} \right\rfloor$$

wherein s is the pixel block window size and $\lfloor \cdot \rfloor$ represents rounding down;

then, edges of size p are filled around each hyperspectral image with a fixed pixel value of 0.

(2.3) traversing each hyperspectral image pixel by pixel in the spatial dimensions and extracting a pixel block of size s × s × k centered on each pixel as a sample; the number of pixel-block samples created per hyperspectral image is h × w.
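Steps (2.2) and (2.3) can be illustrated with the sketch below. It assumes an odd window size s, for which a fill edge of (s − 1)/2 pixels on every side yields exactly one block per original pixel; the function name `extract_patches` and the toy image are hypothetical.

```python
import numpy as np

def extract_patches(image, s):
    """Zero-pad an (h, w, k) image and cut one (s, s, k) block per pixel."""
    if s % 2 == 0:
        raise ValueError("this sketch assumes an odd window size s")
    h, w, k = image.shape
    p = (s - 1) // 2                             # fill edge size (assumed formula, odd s)
    padded = np.pad(image, ((p, p), (p, p), (0, 0)),
                    mode="constant", constant_values=0)
    patches = np.empty((h * w, s, s, k), dtype=image.dtype)
    for r in range(h):
        for c in range(w):
            # The block whose centre is original pixel (r, c).
            patches[r * w + c] = padded[r:r + s, c:c + s, :]
    return patches

# Toy 4 x 5 image with k = 2 channels (hypothetical data).
image = np.arange(4 * 5 * 2, dtype=np.float32).reshape(4, 5, 2)
patches = extract_patches(image, s=3)
print(patches.shape)
```

The centre of each block is the original pixel, and blocks near the border contain the zero fill values.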

Step 3, applying enhanced oversampling to the training sample set to obtain the enhanced oversampled training sample set.

(3.1) performing 2-fold ROS sampling on the samples in the training sample set: denote by M the number of samples in the class with the most samples in the training sample set, then randomly copy samples within each class until every class contains 2M samples, forming the oversampled training sample set.

In practice, the ROS sampling technique treats every class with fewer samples than the largest class as a minority class and, by random copying within the original sample set, raises the number of pixel blocks in each minority class to the maximum class size in the original sample set; 2-fold ROS sampling raises the number of samples in every class to 2 times the maximum class size in the original sample set. The number of samples per class in the sampled data set is shown in FIG. 2.
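A minimal sketch of 2-fold ROS on class labels is given below. It assumes that every original sample is kept and that the extra copies are drawn with replacement within each class; the text only specifies intra-class random copying up to 2M, so these details, the function name `ros_2x`, and the toy labels are assumptions.

```python
import random
from collections import Counter, defaultdict

def ros_2x(labels, seed=0):
    """Return sample indices after 2-fold random oversampling."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    target = 2 * max(len(v) for v in by_class.values())        # 2M
    out = []
    for idxs in by_class.values():
        out.extend(idxs)                                       # keep every original sample
        out.extend(rng.choices(idxs, k=target - len(idxs)))    # random intra-class copies
    rng.shuffle(out)
    return out

# Toy imbalanced label list (hypothetical): M = 6, so every class should reach 12.
labels = ["corn"] * 6 + ["oats"] * 1 + ["wheat"] * 3
resampled = ros_2x(labels)
counts = Counter(labels[i] for i in resampled)
print(counts)
```

Working on indices rather than on the pixel blocks themselves keeps the sketch small; the resampled index list can then be used to gather the corresponding blocks and labels.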

(3.2) performing data enhancement on the oversampled training sample set:

Exploiting the rotation invariance of remote sensing images, enhancement operations such as flipping and rotation are applied to the training pixel blocks, which helps avoid the potential overfitting caused by sample copying.

Half of the samples in the oversampled training sample set are randomly selected for data enhancement, i.e. vertical flipping, horizontal flipping, or rotation in the spatial dimensions, to form enhanced samples; the enhanced samples replace the samples at the corresponding positions in the training sample set, forming the enhanced oversampled training sample set.

The rotation rotates the sample by an angle chosen at random from the interval [-180°, 180°] in steps of 30°.
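The enhancement step can be sketched as follows. The text does not say how rotated pixels are interpolated or how the three operations are weighted, so this example assumes nearest-neighbour rotation about the block centre with zero fill and a uniform choice between the operations; `rotate_nearest`, `augment`, and `enhance_half` are hypothetical names.

```python
import numpy as np

def rotate_nearest(block, angle_deg):
    """Rotate a square (s, s, k) block about its centre; nearest neighbour, zero fill."""
    s = block.shape[0]
    c = (s - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    rows, cols = np.meshgrid(np.arange(s), np.arange(s), indexing="ij")
    # Inverse mapping: for each output pixel, locate the source pixel.
    src_r = np.rint(c + (rows - c) * np.cos(theta) - (cols - c) * np.sin(theta)).astype(int)
    src_c = np.rint(c + (rows - c) * np.sin(theta) + (cols - c) * np.cos(theta)).astype(int)
    valid = (src_r >= 0) & (src_r < s) & (src_c >= 0) & (src_c < s)
    out = np.zeros_like(block)
    out[valid] = block[src_r[valid], src_c[valid]]
    return out

def augment(block, rng):
    """Apply one randomly chosen operation: vertical flip, horizontal flip, or rotation."""
    op = rng.choice(["vflip", "hflip", "rotate"])
    if op == "vflip":
        return block[::-1, :, :]
    if op == "hflip":
        return block[:, ::-1, :]
    angle = rng.choice(np.arange(-180, 181, 30))   # multiples of 30 degrees in [-180, 180]
    return rotate_nearest(block, angle)

def enhance_half(samples, seed=0):
    """Replace a random half of the samples, in place, by their enhanced versions."""
    rng = np.random.default_rng(seed)
    out = samples.copy()
    chosen = rng.choice(len(samples), size=len(samples) // 2, replace=False)
    for i in chosen:
        out[i] = augment(samples[i], rng)
    return out, chosen

# Toy oversampled set of eight 5 x 5 x 3 blocks (hypothetical data).
samples = np.random.default_rng(1).normal(size=(8, 5, 5, 3))
enhanced, chosen = enhance_half(samples)
print(enhanced.shape, sorted(chosen.tolist()))
```

The set size is unchanged by this step: enhanced blocks overwrite the selected positions instead of being appended.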

Step 4, building a convolutional neural network model, training it with the enhanced oversampled training sample set, and predicting the class of the pixel blocks to be classified.

The convolutional neural network (CNN) is inspired by the structure of the visual system. Unlike fully connected networks, CNNs use local connections to extract contextual 2D spatial features from images. In addition, weight sharing markedly reduces the number of network parameters and improves computational efficiency. As described in the background, CNN models usually have a large number of parameters and need a large amount of training data to perform well; when training samples are scarce, classification performance degrades and classification accuracy is seriously affected. The preceding steps of the method address the small-sample training and class imbalance problems of CNN classification, so a CNN classifier is selected as the final classifier model.

The invention builds a hyperspectral image classification model based on a 2D-CNN, whose structure is shown in FIG. 3. The model has a simple structure consisting of an input layer, two convolutional layers, a first dropout layer, a fully connected layer, a second dropout layer, and a fully connected output layer, connected in sequence. Because the pixel blocks are spatially small, the pooling layers that accompany convolutional layers in a traditional CNN are removed to prevent excessive loss of spatial information.

In the 2D-CNN model, the input layer has the size of the pixel-block training samples in the enhanced oversampled training sample set obtained in step 3 and receives those samples. The first convolutional layer has 3k convolution kernels of size 3 × 3 (k is the number of spectral channels of the input pixel block), and the second convolutional layer has 9k convolution kernels of size 3 × 3; together they extract the classification features of the pixel block. To prevent overfitting, a dropout layer with a drop rate of 0.25 is added after the two convolutional layers. To meet the input requirement of the fully connected layer, the multi-dimensional feature map is flattened into a one-dimensional feature vector. The first fully connected layer has 6k neurons; after it extracts features, its output passes to the second dropout layer, which has a drop rate of 0.5. The second fully connected layer is the output layer of the model and has N neurons (the number of classes); a softmax activation outputs the predicted class probabilities of the sample, and the class label with the highest probability is taken as the final predicted class of the sample.
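To make the layer sizes concrete, the sketch below tabulates the output shape and trainable parameter count of each layer for given s, k, and N. The text does not state the stride or padding of the convolutions, so stride 1 and no padding are assumed here, along with one bias per kernel and per neuron; `cnn_layout` is a hypothetical helper, and the values s = 25, k = 30, N = 16 are those of the simulation experiment.

```python
def cnn_layout(s, k, n_classes):
    """List (layer name, output shape, trainable parameters) for the 2D-CNN."""
    layers = []
    h = w = s
    ch = k
    for name, filters in (("conv1", 3 * k), ("conv2", 9 * k)):
        h, w = h - 2, w - 2                        # 3 x 3 kernel, stride 1, no padding (assumed)
        params = 3 * 3 * ch * filters + filters    # kernel weights + biases
        ch = filters
        layers.append((name, (h, w, ch), params))
    layers.append(("dropout(0.25)", (h, w, ch), 0))
    flat = h * w * ch
    layers.append(("flatten", (flat,), 0))
    layers.append(("fc1", (6 * k,), flat * 6 * k + 6 * k))
    layers.append(("dropout(0.5)", (6 * k,), 0))
    layers.append(("fc_out+softmax", (n_classes,), 6 * k * n_classes + n_classes))
    return layers

layout = cnn_layout(s=25, k=30, n_classes=16)
for name, shape, params in layout:
    print(f"{name:16s} {str(shape):16s} {params:>12,d}")
total = sum(p for _, _, p in layout)
print("total parameters:", total)
```

Under these assumptions almost all parameters sit in the first fully connected layer, because the flattened feature map is large when no pooling layer is used.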

During model training, all model weights are randomly initialized and updated by the backpropagation algorithm; the loss function is the cross-entropy loss, and the optimizer is stochastic gradient descent. The model is trained for a number of epochs, each of which feeds all training samples into the network to update the weights; training is complete when the training loss curve converges and the model fits the training data. Finally, the unlabeled pixel blocks to be classified from step 2 are input into the trained model for prediction, giving the classification result of the hyperspectral image to be classified.
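The training rule described above (random initialization, cross-entropy loss, mini-batch stochastic gradient descent over several epochs) can be illustrated on a toy problem. The sketch below trains only a single softmax output layer on synthetic features, as a stand-in for the full CNN; the data, learning rate, batch size, and epoch count are all assumptions chosen for the example.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)           # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, y):
    """Mean negative log-probability assigned to the true class."""
    return -np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12))

rng = np.random.default_rng(0)
n, d, n_classes = 300, 8, 3
y = rng.integers(0, n_classes, size=n)
centres = rng.normal(scale=2.0, size=(n_classes, d))
X = centres[y] + rng.normal(size=(n, d))           # synthetic features (hypothetical)

W = rng.normal(scale=0.01, size=(d, n_classes))    # random initialisation
b = np.zeros(n_classes)
lr, batch = 0.01, 32
losses = [cross_entropy(softmax(X @ W + b), y)]    # loss before training
for epoch in range(20):
    order = rng.permutation(n)                     # each epoch visits every sample once
    for start in range(0, n, batch):
        idx = order[start:start + batch]
        probs = softmax(X[idx] @ W + b)
        grad = probs.copy()
        grad[np.arange(len(idx)), y[idx]] -= 1.0   # gradient of the loss w.r.t. the logits
        grad /= len(idx)
        W -= lr * X[idx].T @ grad                  # SGD update of the weights
        b -= lr * grad.sum(axis=0)                 # SGD update of the biases
    losses.append(cross_entropy(softmax(X @ W + b), y))

print(f"loss before training: {losses[0]:.3f}, after 20 epochs: {losses[-1]:.3f}")
```

In the full model the same gradient is propagated back through the fully connected and convolutional layers; watching the per-epoch loss list flatten out corresponds to the convergence criterion in the text.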

Simulation experiment

The effect of the present invention is further illustrated by the following specific example:

Indian Pines is a public hyperspectral image data set. It was acquired in 1992 by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over the Indian Pines test site in Indiana, USA, and a 145 × 145 pixel region was cropped and labeled for testing hyperspectral image classification. The data set has 200 usable bands and 21025 pixels in total; after the 10776 background pixels are removed, the remaining 10249 ground-object pixels are used for land cover classification. The spatial resolution of the data set is 20 m, so mixed pixels occur easily, which makes classification difficult; the samples are also distributed very unevenly across the 16 classes, which adds further difficulty. For these reasons, the data set is one of the most commonly used test data sets in hyperspectral image classification research. The number of samples of each land cover class is shown in Table 1.

TABLE 1 number of samples per type for Indian Pines dataset, training set, and test set

Because the Indian Pines data set is class-imbalanced, the invention uses it for the example test. In the experiment, one group trains and tests a basic CNN directly on the samples, and the other group follows the steps of the method of the invention. For each class, 5% of the samples are selected for training; the preprocessing retains 30 principal components; the pixel blocks are 25 pixels wide and 25 pixels high; and the CNN classifier is trained for 100 epochs. Accuracy is measured with the overall accuracy (OA), the average accuracy (AA), and the Kappa coefficient (Kappa), and the final experimental results are shown in Table 2.
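The three evaluation indexes can be computed from a confusion matrix as sketched below, using their standard definitions: OA is the fraction of correctly classified test samples, AA is the mean of the per-class accuracies, and Kappa corrects OA for chance agreement. The function name and the confusion matrix values are made up for illustration and are not results from the experiment.

```python
import numpy as np

def classification_scores(conf):
    """conf[i, j] = number of test samples of true class i predicted as class j."""
    conf = np.asarray(conf, dtype=np.float64)
    total = conf.sum()
    oa = np.trace(conf) / total                            # overall accuracy
    per_class = np.diag(conf) / conf.sum(axis=1)           # accuracy of each true class
    aa = per_class.mean()                                  # average accuracy
    pe = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / total ** 2   # chance agreement
    kappa = (oa - pe) / (1.0 - pe)
    return oa, aa, kappa

# Made-up 3-class confusion matrix with one small class.
conf = [[50, 0, 0],
        [5, 40, 5],
        [0, 2, 8]]
oa, aa, kappa = classification_scores(conf)
print(f"OA={oa:.4f}  AA={aa:.4f}  Kappa={kappa:.4f}")
```

AA weights every class equally, so it is the index most sensitive to errors on minority classes, which is why it is reported alongside OA for imbalanced data.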

TABLE 2 test set evaluation index for two methods

As Table 2 shows, every classification evaluation index of the invention is higher than that of the basic CNN classifier, and the classification accuracy of each class is greatly improved, which demonstrates that the invention works well in practice.

Although the present invention has been described in detail in this specification with reference to specific embodiments and illustrative embodiments, it will be apparent to those skilled in the art that modifications and improvements can be made thereto based on the present invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims

1. The classification method of the category unbalanced hyperspectral image based on the enhanced oversampling is characterized by comprising the following steps:

2. The class-imbalanced hyperspectral image classification method based on enhanced oversampling according to claim 1, wherein the land cover classes in the images of the hyperspectral image set to be trained include the land cover classes in the hyperspectral image to be classified.

3. The class-imbalanced hyperspectral image classification method based on enhanced oversampling according to claim 1, wherein the images in the hyperspectral image set to be trained are hyperspectral images of size h × w × n acquired from a public remote sensing data platform.

4. The class-imbalanced hyperspectral image classification method based on enhanced oversampling according to claim 3, wherein reducing the dimensionality of the hyperspectral image to be classified and of the images in the hyperspectral image set to be trained by principal component analysis specifically comprises: performing principal component analysis on each hyperspectral image, i.e. mapping the n-dimensional features of each hyperspectral image onto k dimensions, wherein k < n and the k-dimensional features are new orthogonal features.

5. The class-imbalanced hyperspectral image classification method based on enhanced oversampling according to claim 3, wherein the edge filling and block extraction performed on each dimension-reduced hyperspectral image specifically comprise:

first, the fill edge size p is calculated:

$$p = \left\lfloor \frac{s - 1}{2} \right\rfloor$$

wherein s is the pixel block window size and $\lfloor \cdot \rfloor$ represents rounding down;

secondly, an edge of size p is filled around each hyperspectral image with a fixed pixel value of 0;

finally, each hyperspectral image is traversed pixel by pixel in the two spatial dimensions, and a pixel block of size s × s × k is extracted as the corresponding sample; the number of pixel-block samples created per hyperspectral image is h × w.

6. The class-imbalanced hyperspectral image classification method based on enhanced oversampling according to claim 1, wherein applying enhanced oversampling to the training sample set specifically comprises:

(3.1) performing 2-fold ROS sampling on the samples in the training sample set: denoting by M the number of samples in the class with the most samples in the training sample set, then randomly copying samples within each class until every class contains 2M samples, forming an oversampled training sample set;

(3.2) performing data enhancement on the oversampled training sample set: randomly selecting half of the samples in the oversampled training sample set for data enhancement, and replacing the samples at the corresponding positions in the training sample set with the enhanced samples to form the enhanced oversampled training sample set.

7. The class-imbalanced hyperspectral image classification method based on enhanced oversampling according to claim 6, wherein the data enhancement is: vertical flipping, horizontal flipping, or rotation in the spatial dimensions to form an enhanced sample; wherein the rotation rotates the sample by an angle chosen at random from the interval [-180°, 180°] in steps of 30°.

8. The class-imbalanced hyperspectral image classification method based on enhanced oversampling according to claim 1, wherein the convolutional neural network model consists of an input layer, two convolutional layers, a first dropout layer, a fully connected layer, a second dropout layer, and a fully connected output layer, connected in sequence, wherein the size of the input layer is the size of the training samples in the enhanced oversampled training sample set obtained in step 3, the number of neurons in the fully connected output layer is N, and N is the total number of classes.

9. The class-imbalanced hyperspectral image classification method based on enhanced oversampling according to claim 8, wherein training the convolutional neural network with the oversampled training sample set specifically comprises: the weights of the convolutional neural network model are randomly initialized and updated by the backpropagation algorithm, the loss function is the cross-entropy loss, and the optimizer is stochastic gradient descent; the model is trained for a number of epochs, each of which feeds all training samples in the oversampled training sample set into the network to update the weights; when the training loss curve converges, the model fits the training data and the training process is complete.