CN114494777A - Hyperspectral image classification method and system based on 3D CutMix-Transformer - Google Patents

Hyperspectral image classification method and system based on 3D CutMix-Transformer

Info

Publication number
CN114494777A
CN114494777A
Authority
CN
China
Prior art keywords
data
cutmix
model
hyperspectral
teacher model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210082474.XA
Other languages
Chinese (zh)
Inventor
冯志玺
高雅晨
杨淑媛
陈帅
胡浩
彭同庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202210082474.XA
Publication of CN114494777A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention discloses a hyperspectral image classification method and system based on a 3D CutMix-Transformer. Hyperspectral data are divided into a labeled training data set and a labeled verification data set; a 3D CutMix model is pre-trained with a CNN; a region-level teacher model and a sample-level teacher model are trained with the enhanced data obtained by applying 3D CutMix to the training data set; and a student model is trained jointly by the two teacher models and a small labeled data set. In the invention, the CNN pre-trains the 3D CutMix, the 3D CutMix then augments the original labeled data set, and the self-supervision losses of the two teacher models together with their mutual cross pseudo-supervision loss are optimized, so that the co-trained student model is more robust. The generalization capability and accuracy of small-sample models in existing hyperspectral image classification technology are thereby enhanced, and the method can be used for hyperspectral image classification.

Description

Hyperspectral image classification method and system based on 3D CutMix-Transformer
Technical Field
The invention belongs to the technical field of hyperspectral remote sensing image classification, and particularly relates to a hyperspectral image classification method and system based on a 3D CutMix-Transformer.
Background
With the development of artificial intelligence technology, intelligent hyperspectral image classification has deeply influenced many aspects of modern life and is increasingly widely applied in fields such as precision agriculture, the military, the ocean, and disaster detection. Traditional remote sensing image analysis exploits the spatial information of the image, whereas the core of hyperspectral image analysis is spectral analysis. Hyperspectral remote sensing data form a spectral image cube, as shown in FIG. 1; their main characteristic is that spatial-dimension and spectral-dimension information are combined into a whole, adding one-dimensional spectral information compared with a single band, so that the surface spatial image and the ground-object spectral information corresponding to each pixel are acquired simultaneously. Hyperspectral image classification therefore enjoys a higher degree of information richness than traditional remote sensing image classification, which means that more features can be learned from hyperspectral data for intelligent image classification so as to improve classification accuracy.
At present, hyperspectral image classification suffers from few labeled samples and poor model robustness. Under a small number of samples, the accuracy and robustness of data-driven deep learning methods depend heavily on the data set: existing methods such as the CNN (Convolutional Neural Network) are inefficient and fragile when training samples are scarce, and perform poorly across data sets (obtained by different sensors). These problems have not yet been solved effectively.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the above-mentioned deficiencies in the prior art. The invention provides a hyperspectral image classification method and system based on a 3D CutMix-Transformer: 3D CutMix is used to enhance the hyperspectral image data and thereby amplify the data set, and a one-dimensional Transformer model is used for feature learning of the hyperspectral image, so that the classification accuracy is improved and the problems of few labeled samples and poor model robustness in existing hyperspectral image classification technology are solved.
The invention adopts the following technical scheme:
a hyperspectral image classification method based on 3D CutMix-transform comprises the following steps:
s1, dividing the hyperspectral data into a labeled training data set Train and a labeled verification data set Test;
s2, building a CNN convolutional neural network, inputting the labeled training data set Train divided in the step S1 into the CNN convolutional neural network, and building a 3D CutMix pre-training model based on the CNN;
s3, constructing a hyperspectral image classification region class teacher model M based on the 3D CutMix pre-training model of the step S2RLAnd a sample level teacher model MSL
S4, integrating the hyperspectral image classification region level teacher model M constructed in the step S3RLAnd a sample level teacher model MSLAnd step S1 dividing labeled training data set Train to jointly Train student model MSInputting the labeled verification data set Test divided in the step S1 into the trained student model MSIn the method, hyperspectral image classification based on a Transformer teacher student model is realized.
Specifically, in step S1, the labeled training data set Train accounts for 80% of the hyperspectral data, and the labeled verification data set Test accounts for 20% of the hyperspectral data.
Specifically, step S2 specifically includes:
s201, building a CNN (convolutional neural network);
s202, inputting the labeled training data set Train divided in the step S1 into a CNN convolutional neural network to obtain a classification result of each pixel element;
s203, analyzing the contribution rate of the classification result of each pixel element obtained in the step S202 and the band data contained in the classification result to obtain a band with the highest contribution rate, and selecting the band which does not exceed 5% of the number of the bands from the candidates;
and S204, designing the 3D CutMix according to the plurality of wave bands with the highest contribution rate of each pixel element classification result obtained in the step S203 to obtain a 3D Mask of hyperspectral data, and finally obtaining a pre-trained 3D Mask pre-training model.
Further, in step S201, the input layer of the CNN convolutional neural network takes standard data in mat format; the raw data of format (L × W) × H are decentralized and standardized, and the output result F × H serves as the input of the first convolutional layer, where F = L × W, L is the number of pixels along the long side of the hyperspectral data, W is the number of pixels along the wide side, H is the number of bands, and F is the total number of pixels; a 0-1 constraint is applied to the weights of the CNN input layer: if the input weight of X_i^(j) is greater than 0.5 it is set to 1, otherwise it is set to 0;
the convolutional layers use convolution kernels/filters of size 4 × 4 sliding one pixel at a time, one feature map uses the same convolution kernel, and the activation function is the ReLU function.
Further, in step S202, each hyperspectral image datum is represented as X_i^(j), the j-th pixel element, j = 1, 2, …, F, i = 1, 2, …, H, with corresponding label L^(j). The multi-class cross-entropy loss function adopted is:

L_ce = -∑_{i=0}^{C-1} y_i · log(p_i)

wherein p = [p_0, …, p_{C-1}] is a probability distribution, each element p_i representing the probability that the sample belongs to class i; y = [y_0, …, y_{C-1}] is the one-hot representation of the sample label, y_i = 1 when the sample belongs to class i and y_i = 0 otherwise; and C is the number of sample classes.
Specifically, step S3 specifically includes:
s301, randomly selecting two pixel elements X in hyperspectral data(m)And X(n)Performing 3D CutMix operation to obtain X _ Cut(m)And X _ Cut(n)
S302, repeating the step S301 for N times to obtain original hyperspectral data
Figure BDA0003486455760000032
Enhanced data obtained by 3D CutMix operation
Figure BDA0003486455760000033
j=1,2,…,F,i=1,2,...,H;
S303, inputting the enhanced data X_Cut_i^(j) obtained in step S302 into the region-level teacher model M_RL to obtain a pseudo label L1^(j) through a one-dimensional Transformer;
S304, inputting the original hyperspectral data X_i^(j) into the sample-level teacher model M_SL to obtain a pseudo label L2^(j) through a one-dimensional Transformer;
S305, computing the self-supervision losses of the pseudo label L1^(j) of step S303 and the pseudo label L2^(j) of step S304 against the real label respectively, computing the cross pseudo-supervision loss between the pseudo labels L1^(j) and L2^(j), and obtaining the objective function L_RL of the region-level teacher model M_RL and the objective function L_SL of the sample-level teacher model M_SL.
Further, in step S305, the objective function L_RL of the region-level teacher model M_RL and the objective function L_SL of the sample-level teacher model M_SL are respectively:

argmin(L_RL) = argmin{L_ce(L1, L) + L_ce(L1, L2)}
argmin(L_SL) = argmin{L_ce(L2, L) + L_ce(L1, L2)}

wherein L_RL and L_SL respectively denote the objective functions of the region-level teacher model M_RL and the sample-level teacher model M_SL, L_ce denotes the cross-entropy loss function, L_ce(L1, L) and L_ce(L2, L) denote the self-supervision losses against the real label L, and L_ce(L1, L2) denotes the cross pseudo-supervision loss between the pseudo labels produced by M_RL and M_SL.
Specifically, step S4 specifically includes:
s401, enabling the region level teacher model MRLAnd a sample level teacher model MSLPerforming an integration operation to average the output of the two complementary models to Mmean
S402, region level teacher model MRLAnd a sample level teacher model MSLIs integrated as a guide to supervise the student model MSTraining on unlabeled target data.
Specifically, in step S401, the integration of the region-level teacher model M_RL and the sample-level teacher model M_SL is expressed as:

M_mean(x_u) = E(M_RL(x_u), M_SL(x_u))

wherein E(·) denotes averaging the class-probability outputs q of the region-level teacher model M_RL and the sample-level teacher model M_SL with weight μ, and x_u denotes unlabeled samples.
The invention also provides a hyperspectral image classification system based on the 3D CutMix-Transformer, comprising:
the data module divides the hyperspectral data into a labeled training data set Train and a labeled verification data set Test;
the building module is used for building a CNN convolutional neural network, inputting a labeled training data set Train divided by the data module into the CNN convolutional neural network, and building a 3D CutMix pre-training model based on the CNN;
an enhancement module, constructing a hyperspectral image classification region-level teacher model M_RL and a sample-level teacher model M_SL based on the 3D CutMix pre-training model of the building module;
a classification module, integrating the hyperspectral image classification region-level teacher model M_RL and sample-level teacher model M_SL constructed by the enhancement module with the labeled training data set Train divided by the data module to jointly train a student model M_S, and inputting the labeled verification data set Test divided by the data module into the trained student model M_S, thereby realizing hyperspectral image classification based on a Transformer teacher-student model.
Compared with the prior art, the invention has at least the following beneficial effects:
the invention relates to a hyperspectral image classification method based on 3D CutMix-transform, which is characterized in that 3D CutMix is pre-trained through CNN; training a region-level teacher model and a sample-level teacher model by using the enhanced data obtained by performing 3D CutMix on the training data set; training a student model by using two teacher models and a small amount of labeled data sets; the 3D CutMix is used for enhancing hyperspectral data, the problem of insufficient training samples is solved, and the accuracy and robustness of the model are improved and enhanced; meanwhile, the global feature extraction capability and the parallel computing capability of the Transformer enable the feature extraction capability of the Transformer to be more comprehensive than that of the CNN which can only extract local information, and the computing capability of the GPU can be fully utilized.
Further, the data set is divided so that 80% of the data are used for training and the other 20% for verification; the 20% do not participate in training and are used to evaluate the performance of the model, so that the hyper-parameters can be validated and correctly selected through continuous training-verification iterations.
Further, a CNN-based 3D CutMix pre-training model is constructed. Pre-training on the hyperspectral data exploits the ease of construction and high accuracy of the CNN to obtain the contribution rate of each band within each pixel element; the bands in the top 5% of contribution rates then yield the 3D Mask of the hyperspectral data. Limiting the selection to 5% ensures that salient features are removed while distinguishability between pixel elements is preserved, which improves the robustness of the subsequent classification model while the hyperspectral data are enhanced.
Further, a classification model module is built around the Transformer network structure; in the Transformer, parallel computation is performed using the attention mechanism, which speeds up computation.
Further, the classification network uses a cross-entropy loss function. With a mean-square-error loss function the gradient is proportional to the derivative of the activation function; for the sigmoid activation function with input z and output σ(z), σ′(z) is small (almost 0) when σ(z) is large, so parameter updating is slow. For the cross-entropy loss function, by contrast, the gradient is independent of the activation-function derivative: when the gap between the predicted result σ(z) and the actual result y is larger, the gradient is larger and the parameters update faster; when the gap is smaller, the gradient is smaller and the parameters remain essentially stable. These are the two benefits brought by the cross-entropy loss function.
Further, the hyperspectral data are enhanced through the 3D Mask, and the enhanced data are used to train two one-dimensional Transformer teacher models. Teacher models trained on the enhanced data are more robust, and the pseudo labels produced by the two teacher models are correspondingly more credible.
Further, the objective function L_RL of the region-level teacher model M_RL and the objective function L_SL of the sample-level teacher model M_SL are set so that the two objective functions comprise the self-supervision losses of the pseudo labels obtained by each teacher model and the cross pseudo-supervision loss between them; the two teacher models thus supervise each other during training and continuously approach the real labels, giving them strong guiding capability.
Further, the outputs of the region-level teacher model M_RL and the sample-level teacher model M_SL are averaged to obtain M_mean, which serves as the guidance for training the student model M_S on unlabeled target data. The averaging integrates the output results of the two teacher models well, so the guidance of the student model is highly reliable, which improves the training accuracy of the student model on unlabeled target data.
Further, the outputs of the region-level teacher model M_RL and the sample-level teacher model M_SL are integrated while ensuring high accuracy, so that the student model M_S is strongly guided by the two teacher models and in turn performs well on unlabeled data sets.
In summary, the 3D CutMix technology is integrated into the teacher-student model for hyperspectral image classification by the deep learning method, and accurate classification of hyperspectral images is realized.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a schematic diagram of a hyperspectral data structure;
FIG. 2 is a schematic diagram of a CNN model training;
FIG. 3 is a training schematic of the region-level teacher model M_RL and the sample-level teacher model M_SL;
fig. 4 is a schematic diagram of the 3D CutMix operation.
FIG. 5 is a structural diagram of a Transformer model.
FIG. 6 is a training schematic of the student model M_S.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be understood that the terms "comprises" and/or "comprising" indicate the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, including such combinations; e.g., A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
It should be understood that although the terms first, second, third, etc. may be used to describe preset ranges, etc. in embodiments of the present invention, these preset ranges should not be limited to these terms. These terms are only used to distinguish preset ranges from each other. For example, the first preset range may also be referred to as a second preset range, and similarly, the second preset range may also be referred to as the first preset range, without departing from the scope of the embodiments of the present invention.
The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to a determination" or "when detected (a stated condition or event)" or "in response to a detection (a stated condition or event)", depending on the context.
Various structural schematics according to the disclosed embodiments of the invention are shown in the drawings. The figures are not drawn to scale, wherein certain details are exaggerated and possibly omitted for clarity of presentation. The shapes of various regions, layers and their relative sizes and positional relationships shown in the drawings are merely exemplary, and deviations may occur in practice due to manufacturing tolerances or technical limitations, and a person skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions, according to actual needs.
The invention provides a hyperspectral image classification method based on a 3D CutMix-Transformer: hyperspectral data are divided into a labeled training data set and a labeled verification data set; a 3D CutMix model is pre-trained with a CNN (Convolutional Neural Network); a region-level teacher model and a sample-level teacher model are trained with the enhanced data obtained by applying 3D CutMix to the training data set; and a student model is trained jointly by the two teacher models and a small labeled data set. The CNN pre-trains the 3D CutMix, the 3D CutMix then augments the original labeled data set, and the self-supervision losses of the two teacher models together with their mutual cross pseudo-supervision loss are optimized, so that the co-trained student model is more robust; the generalization capability and accuracy of small-sample models in existing hyperspectral image classification technology are thereby enhanced, and the method can be used for hyperspectral image classification.
The hyperspectral image classification method based on the 3D CutMix-Transformer of the invention comprises the following steps:
s1, dividing the hyperspectral data into a labeled training data set Train and a labeled verification data set Test;
converting (L multiplied by W) multiplied by H hyperspectral data into F multiplied by H, wherein F is L multiplied by W, L is the pixel number of the long edge of the hyperspectral data, W is the pixel number of the wide edge of the hyperspectral data, H is the number of wave bands of the hyperspectral data, and F is the total pixel number of the hyperspectral data, considering the classification of converting multidimensional data into one-dimensional data, the training efficiency of the model can be greatly improved, 80% of the multidimensional data is divided into a labeled training data set Train, and the rest 20% of the data set is divided into a labeled verification data set Test.
For example, if the input hyperspectral dataset is Indian Pines dataset, L is 145, W is 145, H is 200, and F is L × W is 21025. At this time, the hyperspectral data in the format of (145 × 145) × 200 is converted into one-dimensional data 21025 × 200, 80% of the one-dimensional data is divided into labeled training data sets Train (16820 × 200), and the remaining 20% of the one-dimensional data is divided into labeled verification data sets Test (4205 × 200).
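The flattening and split above can be sketched as follows (a minimal illustration; the file names and the use of scikit-learn's train_test_split are assumptions for demonstration, not part of the invention):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical Indian Pines-style cube of shape (L, W, H) = (145, 145, 200)
cube = np.load("indian_pines.npy")        # assumed file; any (L, W, H) array works
labels = np.load("indian_pines_gt.npy")   # assumed per-pixel labels of shape (L, W)

L_dim, W_dim, H = cube.shape
F = L_dim * W_dim                         # total number of pixels, 21025 here

# Flatten (L x W) x H into one-dimensional spectra: one row of H bands per pixel
X = cube.reshape(F, H).astype(np.float32)
y = labels.reshape(F)

# Decentralize and standardize each band (zero mean, unit variance)
X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

# 80% labeled training set Train, 20% labeled verification set Test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
print(X_train.shape, X_test.shape)        # (16820, 200) (4205, 200)
```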
S2, constructing a 3D CutMix pre-training model based on the CNN convolutional neural network;
s201, building a CNN as shown in FIG. 2;
the INPUT layer (INPUT) network INPUTs the standard data using mat format, and performs decentralized and standardized processing on the original data (L multiplied by W) multiplied by H, and outputs the result F multiplied by H as the INPUT of the first convolutional layer.
Meanwhile, a 0-1 constraint is applied to the weights of the CNN input layer: if the input weight of X_i^(j) is greater than 0.5 it is set to 1, otherwise it is set to 0.
The convolutional layers (Convolutions) all use convolution kernels/filters (kernel/filter) of size 4 × 4, each convolution kernel slides one pixel at a time (stride = 1), and one feature map uses the same convolution kernel. In addition, the activation function is the ReLU function.
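A minimal PyTorch sketch of such a pre-training CNN is shown below. The layer widths, the hard threshold on the per-band input weights, and the module names are illustrative assumptions; the patent fixes only the 4 × 4 kernels, stride 1, ReLU activation, and the 0-1 constraint on the input-layer weights:

```python
import torch
import torch.nn as nn

class BandSelectCNN(nn.Module):
    """Sketch: 0-1-constrained input weights over the H bands, then
    4x4 convolutions (stride 1) with ReLU and a linear classifier."""
    def __init__(self, num_bands=200, num_classes=16):
        super().__init__()
        self.band_weight = nn.Parameter(torch.rand(num_bands))  # one weight per band
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=4, stride=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, num_classes)

    def forward(self, x):
        # x: (batch, n_pixels, num_bands), a block of spectra in F x H layout
        mask = (self.band_weight > 0.5).float()   # 0-1 constraint on input weights
        # NOTE: the hard threshold blocks gradients to band_weight; a real
        # implementation would need a straight-through estimator here.
        x = (x * mask).unsqueeze(1)               # (batch, 1, n_pixels, num_bands)
        return self.fc(self.conv(x).flatten(1))

model = BandSelectCNN()
logits = model(torch.randn(2, 64, 200))           # two 64-pixel blocks
print(logits.shape)                               # torch.Size([2, 16])
```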
S202, inputting the labeled training data set Train divided in the step S1 into CNN to obtain a classification result of each pixel element;
each hyperspectral image data is represented as
Figure BDA0003486455760000092
The jth pixel element, j 1, 2, …, F, i 1, 2, and H, has the corresponding label L(j)Meanwhile, the multi-classification cross entropy loss function is adopted as follows:
Figure BDA0003486455760000101
wherein p ═ p0,...,pC-1]For a probability distribution, each element piRepresenting the probability of the sample belonging to the ith class; y ═ y0,...,yC-1]One-hot representation for the sample label, y when the sample belongs to the category ii1, otherwise yiC is the number of sample labels, 0.
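As a concrete check of the formula, a small numeric example (the values are arbitrary, for illustration only):

```python
import numpy as np

p = np.array([0.7, 0.2, 0.1])  # predicted probability distribution, C = 3
y = np.array([1, 0, 0])        # one-hot label: the sample belongs to class 0

L_ce = -np.sum(y * np.log(p))  # only the true class contributes
print(L_ce)                    # 0.35667 = -log(0.7)
```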
S203, analyzing the contribution rate of each band contained in each pixel element to the classification result obtained in step S202 to obtain the bands with the highest contribution rates;
the bands with the highest contribution rate to the classification result of each pixel element are selected as follows: a 0-1 constraint is applied to the weights of the CNN input layer, and after training is completed the bands whose weight is 1 are the candidates. From these candidates, the bands with the highest contribution rates are selected for each pixel element, on the principle that their number does not exceed 5% of the number of bands.
And S204, designing the 3D CutMix according to the plurality of wave bands with the highest contribution rate of each pixel element classification result obtained in the step S203 to obtain a 3D Mask of the hyperspectral data, as shown in FIG. 4.
Taking the Indian Pines dataset as an example, the 0-1 Mask matrix M_i^(j) is first initialized, j = 1, 2, …, 21025, i = 1, 2, …, 200. Then, according to step S203, for the j-th pixel element X^(j) of the hyperspectral data, the k bands with the highest contribution rate to the classification result are obtained from the 200 bands, k ≤ 200 × 5% = 10, and the k corresponding positions of M_i^(j) are set to 0, finally yielding the pre-trained 3D Mask pre-training model.
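The mask construction can be sketched as follows; how the per-pixel, per-band contribution rates are read out of the trained CNN is an implementation choice, so contrib is filled with random values here purely for illustration:

```python
import numpy as np

def build_3d_mask(contrib, k=10):
    """Build the 0-1 Mask matrix M of shape (F, H): for each pixel element j,
    set the k bands with the highest contribution rate to 0 (k <= 5% of H)."""
    F, H = contrib.shape
    assert k <= int(0.05 * H), "k must not exceed 5% of the number of bands"
    mask = np.ones((F, H), dtype=np.float32)
    top_k = np.argsort(contrib, axis=1)[:, -k:]   # indices of the k strongest bands
    mask[np.arange(F)[:, None], top_k] = 0.0      # zero those positions per pixel
    return mask

# Illustrative use with Indian Pines sizes (F = 21025 pixels, H = 200 bands)
mask = build_3d_mask(np.random.rand(21025, 200), k=10)
print(mask.shape, mask.sum(axis=1).min())         # (21025, 200) 190.0
```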
S3, constructing the 3D CutMix-Transformer-based hyperspectral image classification region-level teacher model M_RL and sample-level teacher model M_SL;
referring to FIG. 3, constructing the 3D CutMix-Transformer-based hyperspectral image classification region-level teacher model M_RL and sample-level teacher model M_SL specifically comprises the following steps:
s301, as shown in FIG. 4, two pixel elements X are randomly selected from the hyperspectral data(m)And X(n)Performing 3D CutMix operation to obtain X _ Cut(m)And X _ Cut(n)
The 3D CutMix is described as follows:
X_Cut^(m) = M^(m) ⊙ X^(m) + (1 − M^(m)) ⊙ X^(n)
X_Cut^(n) = M^(m) ⊙ X^(n) + (1 − M^(m)) ⊙ X^(m)
wherein M^(m) is the 3D Mask matrix, ⊙ denotes element-wise matrix multiplication, and X_Cut^(m) and X_Cut^(n) are the enhanced data obtained from the original hyperspectral data X^(m) and X^(n) through the 3D CutMix operation.
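A minimal sketch of this pairwise mixing, reusing the mask convention above (function and variable names are hypothetical):

```python
import numpy as np

def cutmix_3d(x_m, x_n, m_mask):
    """3D CutMix of two pixel spectra x_m, x_n (each of shape (H,)): the
    bands zeroed by the 0-1 mask row m_mask are filled in from the partner."""
    x_cut_m = m_mask * x_m + (1.0 - m_mask) * x_n
    x_cut_n = m_mask * x_n + (1.0 - m_mask) * x_m
    return x_cut_m, x_cut_n

H = 200
x_m, x_n = np.random.rand(H), np.random.rand(H)
m_mask = np.ones(H)
m_mask[[5, 17, 42]] = 0                       # toy mask: three bands are swapped
x_cut_m, x_cut_n = cutmix_3d(x_m, x_n, m_mask)
assert np.allclose(x_cut_m[5], x_n[5])        # the swapped band comes from x_n
```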
S302, repeating step S301 N times to obtain, from the original hyperspectral data X_i^(j), the enhanced data X_Cut_i^(j) produced by the 3D CutMix operation, j = 1, 2, …, F, i = 1, 2, …, H;
S303, inputting the enhanced data X_Cut_i^(j) obtained in step S302 into the region-level teacher model M_RL to obtain a pseudo label L1^(j) through a one-dimensional Transformer;
S304, inputting the original hyperspectral data X_i^(j) into the sample-level teacher model M_SL to obtain a pseudo label L2^(j) through a one-dimensional Transformer;
S305, computing the self-supervision losses of the pseudo labels L1^(j) and L2^(j) against the real label respectively, while computing the cross pseudo-supervision loss between the pseudo labels L1^(j) and L2^(j).
The adopted Transformer structure is shown in FIG. 5. The objective functions L_RL and L_SL of the region-level teacher model M_RL and the sample-level teacher model M_SL are respectively:

argmin(L_RL) = argmin{L_ce(L1, L) + L_ce(L1, L2)}
argmin(L_SL) = argmin{L_ce(L2, L) + L_ce(L1, L2)}

wherein L_RL and L_SL respectively denote the objective functions of the region-level teacher model M_RL and the sample-level teacher model M_SL, L_ce denotes the cross-entropy loss function, L_ce(L1, L) and L_ce(L2, L) denote the self-supervision losses against the real label L, and L_ce(L1, L2) denotes the cross pseudo-supervision loss between the pseudo label L1^(j) produced by M_RL and the pseudo label L2^(j) produced by M_SL.
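The two-teacher step can be sketched as below. The tiny one-dimensional Transformer (each band treated as a token) and all hyper-parameters are assumptions; the patent fixes only that each teacher is a one-dimensional Transformer, that the pseudo labels L1 and L2 are compared against the real label L, and that they cross-supervise each other. Applying the cross term as CE(logits1, L2) and CE(logits2, L1) is one common reading of L_ce(L1, L2):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OneDTransformer(nn.Module):
    """Minimal one-dimensional Transformer classifier over a spectrum of H bands."""
    def __init__(self, num_bands=200, d_model=64, num_classes=16):
        super().__init__()
        self.embed = nn.Linear(1, d_model)                   # each band -> one token
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.fc = nn.Linear(d_model, num_classes)

    def forward(self, x):                                    # x: (batch, H)
        tokens = self.embed(x.unsqueeze(-1))                 # (batch, H, d_model)
        return self.fc(self.encoder(tokens).mean(dim=1))     # (batch, num_classes)

m_rl, m_sl = OneDTransformer(), OneDTransformer()

x = torch.randn(8, 200)                         # original spectra
x_cut = torch.randn(8, 200)                     # 3D CutMix-enhanced spectra
labels = torch.randint(0, 16, (8,))             # real labels L

logits1 = m_rl(x_cut)                           # region-level teacher on enhanced data
logits2 = m_sl(x)                               # sample-level teacher on original data
pseudo1, pseudo2 = logits1.argmax(1), logits2.argmax(1)   # pseudo labels L1, L2

# Self-supervision against the real label plus cross pseudo-supervision
loss_rl = F.cross_entropy(logits1, labels) + F.cross_entropy(logits1, pseudo2)
loss_sl = F.cross_entropy(logits2, labels) + F.cross_entropy(logits2, pseudo1)
(loss_rl + loss_sl).backward()
```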
S4, integrating the hyperspectral image classification region-level teacher model M_RL and sample-level teacher model M_SL constructed in step S3 with the labeled training data set Train divided in step S1 to jointly train the student model M_S, and inputting the labeled verification data set Test divided in step S1 into the trained student model M_S, realizing hyperspectral image classification based on a Transformer teacher-student model.
S401, performing an integration operation on the region-level teacher model M_RL and the sample-level teacher model M_SL, averaging the outputs of the two complementary models to obtain M_mean;
the integration (knowledge distillation) of the region-level teacher model M_RL and the sample-level teacher model M_SL is expressed as:

M_mean(x_u) = E(M_RL(x_u), M_SL(x_u))

where E(·) denotes averaging the class-probability outputs q of the two teachers with weight μ, and x_u denotes a large number of unlabeled samples.
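Continuing the sketch above, E(·) can be read as a μ-weighted mean of the two teachers' softmax outputs (the exact weighting is not spelled out in the text, so μ = 0.5 here is an assumption):

```python
import torch
import torch.nn.functional as F

def teacher_ensemble(m_rl, m_sl, x_u, mu=0.5):
    """M_mean: mu-weighted average of the two teachers' class probabilities q
    on unlabeled samples x_u; the teachers only guide, so no gradients."""
    with torch.no_grad():
        q_rl = F.softmax(m_rl(x_u), dim=1)
        q_sl = F.softmax(m_sl(x_u), dim=1)
    return mu * q_rl + (1.0 - mu) * q_sl

x_u = torch.randn(8, 200)                      # unlabeled spectra
q_mean = teacher_ensemble(m_rl, m_sl, x_u)     # (8, num_classes), rows sum to 1
```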
S402, using the integration of the region-level teacher model M_RL and the sample-level teacher model M_SL as a stronger guide to supervise the training of the student model M_S on unlabeled target data; in addition, the student model M_S is also supervised by the labels of a small number of authentic labeled samples.
The student model M_S acquires knowledge by minimizing the KL divergence between its output and the integrated output of M_RL and M_SL; at the same time, the student model M_S also receives supervision from the small number of labels Y_l.
The objective function L_S of the student model M_S is:

L_S = λ_kl · L_KL(M_S(X_u), E(M_RL(X_u), M_SL(X_u))) + λ_ce · L_ce(M_S(X_l), Y_l)

wherein L_KL denotes the KL-divergence loss, λ_kl and λ_ce are respectively the KL-divergence weight and the cross-entropy loss weight, X_u denotes the large amount of unlabeled data, and X_l and Y_l denote the small amount of labeled data and its labels; the specific flow is shown in FIG. 6.
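Continuing the sketches above (reusing OneDTransformer and teacher_ensemble), the student objective can be written as follows; placing the student distribution inside the KL term follows the textual description, and the loss weights are assumed values:

```python
import torch
import torch.nn.functional as F

student = OneDTransformer()                     # same assumed architecture
lambda_kl, lambda_ce = 1.0, 1.0                 # assumed loss weights

x_u = torch.randn(64, 200)                      # large unlabeled batch X_u
x_l = torch.randn(8, 200)                       # small labeled batch X_l
y_l = torch.randint(0, 16, (8,))                # its labels Y_l

q_teacher = teacher_ensemble(m_rl, m_sl, x_u)   # E(M_RL(X_u), M_SL(X_u))
log_q_student = F.log_softmax(student(x_u), dim=1)

# L_S = lambda_kl * KL(student || teacher ensemble) + lambda_ce * CE on labels
loss_s = (lambda_kl * F.kl_div(log_q_student, q_teacher, reduction="batchmean")
          + lambda_ce * F.cross_entropy(student(x_l), y_l))
loss_s.backward()
```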
In another embodiment of the invention, a 3D CutMix-Transformer-based hyperspectral image classification system is provided that can be used to implement the above 3D CutMix-Transformer-based hyperspectral image classification method; specifically, the system comprises a data module, a building module, an enhancement module and a classification module.
The data module divides the hyperspectral data into a labeled training data set Train and a labeled verification data set Test;
the building module is used for building a CNN convolutional neural network, inputting a labeled training data set Train divided by the data module into the CNN convolutional neural network, and building a 3D CutMix pre-training model based on the CNN;
an enhancement module, constructing a hyperspectral image classification region-level teacher model M_RL and a sample-level teacher model M_SL based on the 3D CutMix pre-training model of the building module;
a classification module, integrating the hyperspectral image classification region-level teacher model M_RL and sample-level teacher model M_SL constructed by the enhancement module with the labeled training data set Train divided by the data module to jointly train the student model M_S, and inputting the labeled verification data set Test divided by the data module into the trained student model M_S, thereby realizing hyperspectral image classification based on a Transformer teacher-student model.
In summary, the hyperspectral image classification method and system based on the 3D CutMix-Transformer pre-train the 3D CutMix through a CNN; train a region-level teacher model and a sample-level teacher model with the enhanced data obtained by applying 3D CutMix to the training data set; and train a student model with the two teacher models and a small labeled data set. Enhancing the hyperspectral data with 3D CutMix alleviates the problem of insufficient training samples and improves the accuracy and robustness of the model; meanwhile, the global feature extraction capability and parallel computing capability of the Transformer make its feature extraction more comprehensive than that of a CNN, which can only extract local information, and allow the computing power of the GPU to be fully utilized.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (10)

1. A hyperspectral image classification method based on a 3D CutMix-Transformer, characterized by comprising the following steps:
s1, dividing the hyperspectral data into a labeled training data set Train and a labeled verification data set Test;
s2, building a CNN convolutional neural network, inputting the labeled training data set Train divided in the step S1 into the CNN convolutional neural network, and building a 3D CutMix pre-training model based on the CNN;
s3, constructing a hyperspectral image classification region level teacher model M based on the 3D CutMix pre-training model in the step S2RLAnd a sample level teacher model MSL
S4, integrating the hyperspectral image classification region level teacher model M constructed in the step S3RLAnd a sample level teacher model MSLAnd step S1 dividing labeled training data set Train to jointly Train student model MSInputting the labeled verification data set Test divided in the step S1 into the trained student model MSIn the method, hyperspectral image classification based on a Transformer teacher student model is realized.
2. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 1, wherein in step S1, the labeled training data set Train accounts for 80% of the hyperspectral data and the labeled verification data set Test accounts for 20% of the hyperspectral data.
3. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 1, wherein step S2 specifically comprises:
s201, building a CNN convolutional neural network;
s202, inputting the labeled training data set Train divided in the step S1 into a CNN convolutional neural network to obtain a classification result of each pixel element;
s203, analyzing the contribution rate of the classification result of each pixel element obtained in the step S202 and the band data contained in the classification result to obtain a band with the highest contribution rate, and selecting the band which does not exceed 5% of the number of the bands from the candidates;
and S204, designing the 3D CutMix according to the plurality of wave bands with the highest contribution rate of each pixel element classification result obtained in the step S203 to obtain a 3D Mask of hyperspectral data, and finally obtaining a pre-trained 3D Mask pre-training model.
4. The 3D CutMix-Transformer-based hyperspectral image classification method as claimed in claim 3, wherein in step S201, the input layer of the CNN convolutional neural network takes standard data in mat format; the raw data of format (L × W) × H are decentralized and standardized, and the output result F × H serves as the input of the first convolutional layer, F = L × W, L being the number of pixels along the long side of the hyperspectral data, W the number of pixels along the wide side, H the number of bands, and F the total number of pixels; a 0-1 constraint is performed on the weights of the CNN input layer: if the input weight of X_i^(j) is greater than 0.5 it is 1, otherwise it is 0;
the convolutional layers use convolution kernels/filters of size 4 × 4 sliding one pixel at a time, one feature map uses the same convolution kernel, and the activation function is the ReLU function.
5. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 3, wherein in step S202, each hyperspectral image datum is represented as X_i^(j), the j-th pixel element, j = 1, 2, …, F, i = 1, 2, …, H, with corresponding label L^(j), and the multi-class cross-entropy loss function adopted is:

L_ce = -∑_{i=0}^{C-1} y_i · log(p_i)

wherein p = [p_0, …, p_{C-1}] is a probability distribution, each element p_i representing the probability that the sample belongs to class i; y = [y_0, …, y_{C-1}] is the one-hot representation of the sample label, y_i = 1 when the sample belongs to class i and y_i = 0 otherwise; and C is the number of sample classes.
6. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 1, wherein step S3 specifically comprises:
s301, randomly selecting two pixel elements X in hyperspectral data(m)And X(n)Performing 3D CutMix operation to obtain X _ Cut(m)And X _ Cut(n)
S302, repeating the step S301 for N times to obtain original hyperspectral data
Figure FDA0003486455750000022
Enhanced data obtained by 3D CutMix operations
Figure FDA0003486455750000023
S303, the enhancement data obtained in the step S302
Figure FDA0003486455750000024
Input region level teacher model MRLObtaining a pseudo label L1 through a one-dimensional Transformer(j)
S304, subjecting the original hyperspectral data
Figure FDA0003486455750000031
Input sample level teacher model MSLObtaining a pseudo label L2 through a one-dimensional Transformer(j)
S305, respectively using the pseudo label L1 of the step S303(j)With dummy L2 of step S304(j)Self-supervision loss with real label, and calculation of pseudo label L1(j)And pseudo label L2(j)Cross-pseudo-supervised loss between, region level teacher model MRLIs an objective function LRLAnd a sample level teacher model MSLIs an objective function LSL
7. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 6, wherein in step S305, the objective function L_RL of the region-level teacher model M_RL and the objective function L_SL of the sample-level teacher model M_SL are respectively:

argmin(L_RL) = argmin{L_ce(L1, L) + L_ce(L1, L2)}
argmin(L_SL) = argmin{L_ce(L2, L) + L_ce(L1, L2)}

wherein L_RL and L_SL respectively denote the objective functions of the region-level teacher model M_RL and the sample-level teacher model M_SL, L_ce denotes the cross-entropy loss function, L_ce(L1, L) and L_ce(L2, L) denote the self-supervision losses, and L_ce(L1, L2) denotes the cross pseudo-supervision loss between the pseudo labels produced by M_RL and M_SL.
8. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 1, wherein step S4 specifically comprises:
S401, performing an integration operation on the region-level teacher model M_RL and the sample-level teacher model M_SL, averaging the outputs of the two complementary models to obtain M_mean;
S402, using the integration of the region-level teacher model M_RL and the sample-level teacher model M_SL as a guide to supervise the training of the student model M_S on unlabeled target data.
9. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 1, wherein in step S401, the integration of the region-level teacher model M_RL and the sample-level teacher model M_SL is expressed as:

M_mean(x_u) = E(M_RL(x_u), M_SL(x_u))

wherein E(·) denotes averaging the class-probability outputs q of the region-level teacher model M_RL and the sample-level teacher model M_SL with weight μ, and x_u denotes unlabeled samples.
10. A hyperspectral image classification system based on a 3D CutMix-Transformer, characterized by comprising:
the data module divides the hyperspectral data into a labeled training data set Train and a labeled verification data set Test;
the building module is used for building a CNN convolutional neural network, inputting a labeled training data set Train divided by the data module into the CNN convolutional neural network, and building a 3D CutMix pre-training model based on the CNN;
an enhancement module, constructing a hyperspectral image classification region-level teacher model M_RL and a sample-level teacher model M_SL based on the 3D CutMix pre-training model of the building module;
a classification module, integrating the hyperspectral image classification region-level teacher model M_RL and sample-level teacher model M_SL constructed by the enhancement module with the labeled training data set Train divided by the data module to jointly train the student model M_S, and inputting the labeled verification data set Test divided by the data module into the trained student model M_S, thereby realizing hyperspectral image classification based on a Transformer teacher-student model.
CN202210082474.XA 2022-01-24 2022-01-24 Hyperspectral image classification method and system based on 3D CutMix-transform Pending CN114494777A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210082474.XA CN114494777A (en) 2022-01-24 2022-01-24 Hyperspectral image classification method and system based on 3D CutMix-transform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210082474.XA CN114494777A (en) 2022-01-24 2022-01-24 Hyperspectral image classification method and system based on 3D CutMix-transform

Publications (1)

Publication Number Publication Date
CN114494777A true CN114494777A (en) 2022-05-13

Family

ID=81473908

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210082474.XA Pending CN114494777A (en) 2022-01-24 2022-01-24 Hyperspectral image classification method and system based on 3D CutMix-transform

Country Status (1)

Country Link
CN (1) CN114494777A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115019215A (en) * 2022-08-09 2022-09-06 之江实验室 Hyperspectral image-based soybean disease and pest identification method and device
CN115019215B (en) * 2022-08-09 2022-12-09 之江实验室 Hyperspectral image-based soybean disease and pest identification method and device
CN116681997A (en) * 2023-06-13 2023-09-01 北京数美时代科技有限公司 Classification method, system, medium and equipment for bad scene images
CN116681997B (en) * 2023-06-13 2024-05-17 北京数美时代科技有限公司 Classification method, system, medium and equipment for bad scene images


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination