CN114494777A - Hyperspectral image classification method and system based on 3D CutMix-Transformer - Google Patents

Hyperspectral image classification method and system based on 3D CutMix-Transformer

Info

Publication number
CN114494777A
CN114494777A
Authority
CN
China
Prior art keywords
data
cutmix
model
hyperspectral
teacher model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210082474.XA
Other languages
Chinese (zh)
Inventor
冯志玺
高雅晨
杨淑媛
陈帅
胡浩
彭同庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202210082474.XA
Publication of CN114494777A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention discloses a hyperspectral image classification method and system based on a 3D CutMix-Transformer. Hyperspectral data are divided into a labeled training data set and a labeled verification data set; a 3D CutMix model is pre-trained with a CNN; a region-level teacher model and a sample-level teacher model are trained with the enhanced data obtained by applying 3D CutMix to the training data set; and a student model is trained jointly by the two teacher models and a small labeled data set. In the invention, the CNN pre-trains the 3D CutMix, the 3D CutMix then augments the original labeled data set, and the self-supervision losses of the two teacher models together with their mutual cross pseudo-supervision loss are optimized, so that the co-trained student model is more robust. The generalization capability and accuracy of small-sample models in existing hyperspectral image classification technology are thereby enhanced, and the method can be used for hyperspectral image classification.

Description

Hyperspectral image classification method and system based on 3D CutMix-Transformer
Technical Field
The invention belongs to the technical field of hyperspectral remote sensing image classification, and particularly relates to a hyperspectral image classification method and system based on a 3D CutMix-Transformer.
Background
With the development of artificial intelligence technology, intelligent hyperspectral image classification has deeply influenced many aspects of modern life and is increasingly widely applied in fields such as precision agriculture, the military, the ocean, and disaster detection. Traditional remote sensing image analysis exploits the spatial information of the image, whereas the core of hyperspectral image analysis is spectral analysis. Hyperspectral remote sensing data form a spectral image cube, as shown in FIG. 1; their main characteristic is that spatial-dimension and spectral-dimension information are combined into a whole, adding one-dimensional spectral information compared with a single band, so that the surface spatial image and the ground-object spectral information corresponding to each pixel are acquired simultaneously. Hyperspectral image classification therefore enjoys a higher degree of information richness than traditional remote sensing image classification, which means that more features can be learned from hyperspectral data for intelligent image classification so as to improve classification accuracy.
At present, hyperspectral image classification suffers from few labeled samples and poor model robustness. Under a small number of samples, the accuracy and robustness of data-driven deep learning methods depend heavily on the data set: existing methods such as the CNN (Convolutional Neural Network) are inefficient and fragile when training samples are scarce, and perform poorly across data sets (obtained by different sensors). These problems have not yet been solved effectively.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the above-mentioned deficiencies in the prior art. The invention provides a hyperspectral image classification method and system based on a 3D CutMix-Transformer: 3D CutMix is used to enhance the hyperspectral image data and thereby amplify the data set, and a one-dimensional Transformer model is used for feature learning of the hyperspectral image, so that the classification accuracy is improved and the problems of few labeled samples and poor model robustness in existing hyperspectral image classification technology are solved.
The invention adopts the following technical scheme:
a hyperspectral image classification method based on 3D CutMix-transform comprises the following steps:
s1, dividing the hyperspectral data into a labeled training data set Train and a labeled verification data set Test;
s2, building a CNN convolutional neural network, inputting the labeled training data set Train divided in the step S1 into the CNN convolutional neural network, and building a 3D CutMix pre-training model based on the CNN;
s3, constructing a hyperspectral image classification region class teacher model M based on the 3D CutMix pre-training model of the step S2RLAnd a sample level teacher model MSL
S4, integrating the hyperspectral image classification region level teacher model M constructed in the step S3RLAnd a sample level teacher model MSLAnd step S1 dividing labeled training data set Train to jointly Train student model MSInputting the labeled verification data set Test divided in the step S1 into the trained student model MSIn the method, hyperspectral image classification based on a Transformer teacher student model is realized.
Specifically, in step S1, the labeled training data set Train accounts for 80% of the hyperspectral data, and the labeled verification data set Test accounts for 20% of the hyperspectral data.
Specifically, step S2 specifically includes:
s201, building a CNN (convolutional neural network);
s202, inputting the labeled training data set Train divided in the step S1 into a CNN convolutional neural network to obtain a classification result of each pixel element;
s203, analyzing the contribution rate of the classification result of each pixel element obtained in the step S202 and the band data contained in the classification result to obtain a band with the highest contribution rate, and selecting the band which does not exceed 5% of the number of the bands from the candidates;
and S204, designing the 3D CutMix according to the plurality of wave bands with the highest contribution rate of each pixel element classification result obtained in the step S203 to obtain a 3D Mask of hyperspectral data, and finally obtaining a pre-trained 3D Mask pre-training model.
Further, in step S201, the input layer of the CNN convolutional neural network takes standard data in mat format; the raw data of format (L × W) × H are decentralized and standardized, and the output result F × H serves as the input of the first convolutional layer, where F = L × W, L is the number of pixels along the long side of the hyperspectral data, W is the number of pixels along the wide side, H is the number of bands, and F is the total number of pixels; a 0-1 constraint is applied to the weights of the CNN input layer: if the input weight of X_i^(j) is greater than 0.5 it is set to 1, otherwise it is set to 0;
the convolutional layers use convolution kernels/filters of size 4 × 4 sliding one pixel at a time, one feature map uses the same convolution kernel, and the activation function is the ReLU function.
Further, in step S202, each hyperspectral image datum is represented as X_i^(j), the j-th pixel element, j = 1, 2, …, F, i = 1, 2, …, H, with corresponding label L^(j). The multi-class cross-entropy loss function adopted is:

L_ce = -∑_{i=0}^{C-1} y_i · log(p_i)

wherein p = [p_0, …, p_{C-1}] is a probability distribution, each element p_i representing the probability that the sample belongs to class i; y = [y_0, …, y_{C-1}] is the one-hot representation of the sample label, y_i = 1 when the sample belongs to class i and y_i = 0 otherwise; and C is the number of sample classes.
Specifically, step S3 specifically includes:
s301, randomly selecting two pixel elements X in hyperspectral data(m)And X(n)Performing 3D CutMix operation to obtain X _ Cut(m)And X _ Cut(n)
S302, repeating the step S301 for N times to obtain original hyperspectral data
Figure BDA0003486455760000032
Enhanced data obtained by 3D CutMix operation
Figure BDA0003486455760000033
j=1,2,…,F,i=1,2,...,H;
S303, inputting the enhanced data X_Cut_i^(j) obtained in step S302 into the region-level teacher model M_RL to obtain a pseudo label L1^(j) through a one-dimensional Transformer;
S304, inputting the original hyperspectral data X_i^(j) into the sample-level teacher model M_SL to obtain a pseudo label L2^(j) through a one-dimensional Transformer;
S305, computing the self-supervision losses of the pseudo label L1^(j) of step S303 and the pseudo label L2^(j) of step S304 against the real label respectively, computing the cross pseudo-supervision loss between the pseudo labels L1^(j) and L2^(j), and obtaining the objective function L_RL of the region-level teacher model M_RL and the objective function L_SL of the sample-level teacher model M_SL.
Further, in step S305, the objective function L_RL of the region-level teacher model M_RL and the objective function L_SL of the sample-level teacher model M_SL are respectively:

argmin(L_RL) = argmin{L_ce(L1, L) + L_ce(L1, L2)}
argmin(L_SL) = argmin{L_ce(L2, L) + L_ce(L1, L2)}

wherein L_RL and L_SL respectively denote the objective functions of the region-level teacher model M_RL and the sample-level teacher model M_SL, L_ce denotes the cross-entropy loss function, L_ce(L1, L) and L_ce(L2, L) denote the self-supervision losses against the real label L, and L_ce(L1, L2) denotes the cross pseudo-supervision loss between the pseudo labels produced by M_RL and M_SL.
Specifically, step S4 specifically includes:
s401, enabling the region level teacher model MRLAnd a sample level teacher model MSLPerforming an integration operation to average the output of the two complementary models to Mmean
S402, region level teacher model MRLAnd a sample level teacher model MSLIs integrated as a guide to supervise the student model MSTraining on unlabeled target data.
Specifically, in step S401, the integration of the region-level teacher model M_RL and the sample-level teacher model M_SL is expressed as:

M_mean(x_u) = E(M_RL(x_u), M_SL(x_u))

wherein E(·) denotes averaging the class-probability outputs q of the region-level teacher model M_RL and the sample-level teacher model M_SL with weight μ, and x_u denotes unlabeled samples.
The invention also provides a hyperspectral image classification system based on the 3D CutMix-Transformer, comprising:
the data module divides the hyperspectral data into a labeled training data set Train and a labeled verification data set Test;
the building module is used for building a CNN convolutional neural network, inputting a labeled training data set Train divided by the data module into the CNN convolutional neural network, and building a 3D CutMix pre-training model based on the CNN;
an enhancement module, constructing a hyperspectral image classification region-level teacher model M_RL and a sample-level teacher model M_SL based on the 3D CutMix pre-training model of the building module;
a classification module, integrating the hyperspectral image classification region-level teacher model M_RL and sample-level teacher model M_SL constructed by the enhancement module with the labeled training data set Train divided by the data module to jointly train a student model M_S, and inputting the labeled verification data set Test divided by the data module into the trained student model M_S, thereby realizing hyperspectral image classification based on a Transformer teacher-student model.
Compared with the prior art, the invention has at least the following beneficial effects:
the invention relates to a hyperspectral image classification method based on 3D CutMix-transform, which is characterized in that 3D CutMix is pre-trained through CNN; training a region-level teacher model and a sample-level teacher model by using the enhanced data obtained by performing 3D CutMix on the training data set; training a student model by using two teacher models and a small amount of labeled data sets; the 3D CutMix is used for enhancing hyperspectral data, the problem of insufficient training samples is solved, and the accuracy and robustness of the model are improved and enhanced; meanwhile, the global feature extraction capability and the parallel computing capability of the Transformer enable the feature extraction capability of the Transformer to be more comprehensive than that of the CNN which can only extract local information, and the computing capability of the GPU can be fully utilized.
Further, the data set is divided so that 80% of the data are used for training and the other 20% for verification; the 20% do not participate in training and are used to evaluate the performance of the model, so that the hyper-parameters can be validated and correctly selected through continuous training-verification iterations.
Further, a CNN-based 3D CutMix pre-training model is constructed. Pre-training on the hyperspectral data exploits the ease of construction and high accuracy of the CNN to obtain the contribution rate of each band within each pixel element; the bands in the top 5% of contribution rates then yield the 3D Mask of the hyperspectral data. Limiting the selection to 5% ensures that salient features are removed while distinguishability between pixel elements is preserved, which improves the robustness of the subsequent classification model while the hyperspectral data are enhanced.
Further, a classification model module is built around the Transformer network structure; in the Transformer, parallel computation is performed using the attention mechanism, which speeds up computation.
Further, the classification network uses a cross-entropy loss function. With a mean-square-error loss function the gradient is proportional to the derivative of the activation function; for the sigmoid activation function with input z and output σ(z), σ′(z) is small (almost 0) when σ(z) is large, so parameter updating is slow. For the cross-entropy loss function, by contrast, the gradient is independent of the activation-function derivative: when the gap between the predicted result σ(z) and the actual result y is larger, the gradient is larger and the parameters update faster; when the gap is smaller, the gradient is smaller and the parameters remain essentially stable. These are the two benefits brought by the cross-entropy loss function.
Further, the hyperspectral data are enhanced through the 3D Mask, and the enhanced data are used to train two one-dimensional Transformer teacher models. Teacher models trained on the enhanced data are more robust, and the pseudo labels produced by the two teacher models are correspondingly more credible.
Further, the objective function L_RL of the region-level teacher model M_RL and the objective function L_SL of the sample-level teacher model M_SL are set so that the two objective functions comprise the self-supervision losses of the pseudo labels obtained by each teacher model and the cross pseudo-supervision loss between them; the two teacher models thus supervise each other during training and continuously approach the real labels, giving them strong guiding capability.
Further, the outputs of the region-level teacher model M_RL and the sample-level teacher model M_SL are averaged to obtain M_mean, which serves as the guidance for training the student model M_S on unlabeled target data. The averaging integrates the output results of the two teacher models well, so the guidance of the student model is highly reliable, which improves the training accuracy of the student model on unlabeled target data.
Further, the outputs of the region-level teacher model M_RL and the sample-level teacher model M_SL are integrated while ensuring high accuracy, so that the student model M_S is strongly guided by the two teacher models and in turn performs well on unlabeled data sets.
In summary, the 3D CutMix technology is integrated into the teacher-student model for hyperspectral image classification by the deep learning method, and accurate classification of hyperspectral images is realized.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a schematic diagram of a hyperspectral data structure;
FIG. 2 is a schematic diagram of a CNN model training;
FIG. 3 is a training schematic of the region-level teacher model M_RL and the sample-level teacher model M_SL;
fig. 4 is a schematic diagram of the 3D CutMix operation.
FIG. 5 is a structural diagram of a Transformer model.
FIG. 6 is a training schematic of the student model M_S.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be understood that the terms "comprises" and/or "comprising" indicate the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, including such combinations; e.g., A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
It should be understood that although the terms first, second, third, etc. may be used to describe preset ranges, etc. in embodiments of the present invention, these preset ranges should not be limited to these terms. These terms are only used to distinguish preset ranges from each other. For example, the first preset range may also be referred to as a second preset range, and similarly, the second preset range may also be referred to as the first preset range, without departing from the scope of the embodiments of the present invention.
The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to a determination" or "when detected (a stated condition or event)" or "in response to a detection (a stated condition or event)", depending on the context.
Various structural schematics according to the disclosed embodiments of the invention are shown in the drawings. The figures are not drawn to scale, wherein certain details are exaggerated and possibly omitted for clarity of presentation. The shapes of various regions, layers and their relative sizes and positional relationships shown in the drawings are merely exemplary, and deviations may occur in practice due to manufacturing tolerances or technical limitations, and a person skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions, according to actual needs.
The invention provides a hyperspectral image classification method based on a 3D CutMix-Transformer: hyperspectral data are divided into a labeled training data set and a labeled verification data set; a 3D CutMix model is pre-trained with a CNN (Convolutional Neural Network); a region-level teacher model and a sample-level teacher model are trained with the enhanced data obtained by applying 3D CutMix to the training data set; and a student model is trained jointly by the two teacher models and a small labeled data set. The CNN pre-trains the 3D CutMix, the 3D CutMix then augments the original labeled data set, and the self-supervision losses of the two teacher models together with their mutual cross pseudo-supervision loss are optimized, so that the co-trained student model is more robust; the generalization capability and accuracy of small-sample models in existing hyperspectral image classification technology are thereby enhanced, and the method can be used for hyperspectral image classification.
The hyperspectral image classification method based on the 3D CutMix-Transformer of the invention comprises the following steps:
s1, dividing the hyperspectral data into a labeled training data set Train and a labeled verification data set Test;
converting (L multiplied by W) multiplied by H hyperspectral data into F multiplied by H, wherein F is L multiplied by W, L is the pixel number of the long edge of the hyperspectral data, W is the pixel number of the wide edge of the hyperspectral data, H is the number of wave bands of the hyperspectral data, and F is the total pixel number of the hyperspectral data, considering the classification of converting multidimensional data into one-dimensional data, the training efficiency of the model can be greatly improved, 80% of the multidimensional data is divided into a labeled training data set Train, and the rest 20% of the data set is divided into a labeled verification data set Test.
For example, if the input hyperspectral dataset is Indian Pines dataset, L is 145, W is 145, H is 200, and F is L × W is 21025. At this time, the hyperspectral data in the format of (145 × 145) × 200 is converted into one-dimensional data 21025 × 200, 80% of the one-dimensional data is divided into labeled training data sets Train (16820 × 200), and the remaining 20% of the one-dimensional data is divided into labeled verification data sets Test (4205 × 200).
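The flattening and split above can be sketched as follows (a minimal illustration; the file names and the use of scikit-learn's train_test_split are assumptions for demonstration, not part of the invention):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical Indian Pines-style cube of shape (L, W, H) = (145, 145, 200)
cube = np.load("indian_pines.npy")        # assumed file; any (L, W, H) array works
labels = np.load("indian_pines_gt.npy")   # assumed per-pixel labels of shape (L, W)

L_dim, W_dim, H = cube.shape
F = L_dim * W_dim                         # total number of pixels, 21025 here

# Flatten (L x W) x H into one-dimensional spectra: one row of H bands per pixel
X = cube.reshape(F, H).astype(np.float32)
y = labels.reshape(F)

# Decentralize and standardize each band (zero mean, unit variance)
X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

# 80% labeled training set Train, 20% labeled verification set Test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
print(X_train.shape, X_test.shape)        # (16820, 200) (4205, 200)
```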
S2, constructing a 3D CutMix pre-training model based on the CNN convolutional neural network;
s201, building a CNN as shown in FIG. 2;
the INPUT layer (INPUT) network INPUTs the standard data using mat format, and performs decentralized and standardized processing on the original data (L multiplied by W) multiplied by H, and outputs the result F multiplied by H as the INPUT of the first convolutional layer.
Meanwhile, a 0-1 constraint is applied to the weights of the CNN input layer: if the input weight of X_i^(j) is greater than 0.5 it is set to 1, otherwise it is set to 0.
The convolutional layers (Convolutions) all use convolution kernels/filters (kernel/filter) of size 4 × 4, each convolution kernel slides one pixel at a time (stride = 1), and one feature map uses the same convolution kernel. In addition, the activation function is the ReLU function.
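A minimal PyTorch sketch of such a pre-training CNN is shown below. The layer widths, the hard threshold on the per-band input weights, and the module names are illustrative assumptions; the patent fixes only the 4 × 4 kernels, stride 1, ReLU activation, and the 0-1 constraint on the input-layer weights:

```python
import torch
import torch.nn as nn

class BandSelectCNN(nn.Module):
    """Sketch: 0-1-constrained input weights over the H bands, then
    4x4 convolutions (stride 1) with ReLU and a linear classifier."""
    def __init__(self, num_bands=200, num_classes=16):
        super().__init__()
        self.band_weight = nn.Parameter(torch.rand(num_bands))  # one weight per band
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=4, stride=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, num_classes)

    def forward(self, x):
        # x: (batch, n_pixels, num_bands), a block of spectra in F x H layout
        mask = (self.band_weight > 0.5).float()   # 0-1 constraint on input weights
        # NOTE: the hard threshold blocks gradients to band_weight; a real
        # implementation would need a straight-through estimator here.
        x = (x * mask).unsqueeze(1)               # (batch, 1, n_pixels, num_bands)
        return self.fc(self.conv(x).flatten(1))

model = BandSelectCNN()
logits = model(torch.randn(2, 64, 200))           # two 64-pixel blocks
print(logits.shape)                               # torch.Size([2, 16])
```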
S202, inputting the labeled training data set Train divided in the step S1 into CNN to obtain a classification result of each pixel element;
each hyperspectral image data is represented as
Figure BDA0003486455760000092
The jth pixel element, j 1, 2, …, F, i 1, 2, and H, has the corresponding label L(j)Meanwhile, the multi-classification cross entropy loss function is adopted as follows:
Figure BDA0003486455760000101
wherein p ═ p0,...,pC-1]For a probability distribution, each element piRepresenting the probability of the sample belonging to the ith class; y ═ y0,...,yC-1]One-hot representation for the sample label, y when the sample belongs to the category ii1, otherwise yiC is the number of sample labels, 0.
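As a concrete check of the formula, a small numeric example (the values are arbitrary, for illustration only):

```python
import numpy as np

p = np.array([0.7, 0.2, 0.1])  # predicted probability distribution, C = 3
y = np.array([1, 0, 0])        # one-hot label: the sample belongs to class 0

L_ce = -np.sum(y * np.log(p))  # only the true class contributes
print(L_ce)                    # 0.35667 = -log(0.7)
```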
S203, analyzing the contribution rate of each band contained in each pixel element to the classification result obtained in step S202 to obtain the bands with the highest contribution rates;
the bands with the highest contribution rate to the classification result of each pixel element are selected as follows: a 0-1 constraint is applied to the weights of the CNN input layer, and after training is completed the bands whose weight is 1 are the candidates. From these candidates, the bands with the highest contribution rates are selected for each pixel element, on the principle that their number does not exceed 5% of the number of bands.
And S204, designing the 3D CutMix according to the plurality of wave bands with the highest contribution rate of each pixel element classification result obtained in the step S203 to obtain a 3D Mask of the hyperspectral data, as shown in FIG. 4.
Taking the Indian Pines dataset as an example, the 0-1 Mask matrix M_i^(j) is first initialized, j = 1, 2, …, 21025, i = 1, 2, …, 200. Then, according to step S203, for the j-th pixel element X^(j) of the hyperspectral data, the k bands with the highest contribution rate to the classification result are obtained from the 200 bands, k ≤ 200 × 5% = 10, and the k corresponding positions of M_i^(j) are set to 0, finally yielding the pre-trained 3D Mask pre-training model.
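The mask construction can be sketched as follows; how the per-pixel, per-band contribution rates are read out of the trained CNN is an implementation choice, so contrib is filled with random values here purely for illustration:

```python
import numpy as np

def build_3d_mask(contrib, k=10):
    """Build the 0-1 Mask matrix M of shape (F, H): for each pixel element j,
    set the k bands with the highest contribution rate to 0 (k <= 5% of H)."""
    F, H = contrib.shape
    assert k <= int(0.05 * H), "k must not exceed 5% of the number of bands"
    mask = np.ones((F, H), dtype=np.float32)
    top_k = np.argsort(contrib, axis=1)[:, -k:]   # indices of the k strongest bands
    mask[np.arange(F)[:, None], top_k] = 0.0      # zero those positions per pixel
    return mask

# Illustrative use with Indian Pines sizes (F = 21025 pixels, H = 200 bands)
mask = build_3d_mask(np.random.rand(21025, 200), k=10)
print(mask.shape, mask.sum(axis=1).min())         # (21025, 200) 190.0
```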
S3, constructing the 3D CutMix-Transformer-based hyperspectral image classification region-level teacher model M_RL and sample-level teacher model M_SL;
referring to FIG. 3, constructing the 3D CutMix-Transformer-based hyperspectral image classification region-level teacher model M_RL and sample-level teacher model M_SL specifically comprises the following steps:
s301, as shown in FIG. 4, two pixel elements X are randomly selected from the hyperspectral data(m)And X(n)Performing 3D CutMix operation to obtain X _ Cut(m)And X _ Cut(n)
The 3D CutMix is described as follows:
X_Cut^(m) = M^(m) ⊙ X^(m) + (1 − M^(m)) ⊙ X^(n)
X_Cut^(n) = M^(m) ⊙ X^(n) + (1 − M^(m)) ⊙ X^(m)
wherein M^(m) is the 3D Mask matrix, ⊙ denotes element-wise matrix multiplication, and X_Cut^(m) and X_Cut^(n) are the enhanced data obtained from the original hyperspectral data X^(m) and X^(n) through the 3D CutMix operation.
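A minimal sketch of this pairwise mixing, reusing the mask convention above (function and variable names are hypothetical):

```python
import numpy as np

def cutmix_3d(x_m, x_n, m_mask):
    """3D CutMix of two pixel spectra x_m, x_n (each of shape (H,)): the
    bands zeroed by the 0-1 mask row m_mask are filled in from the partner."""
    x_cut_m = m_mask * x_m + (1.0 - m_mask) * x_n
    x_cut_n = m_mask * x_n + (1.0 - m_mask) * x_m
    return x_cut_m, x_cut_n

H = 200
x_m, x_n = np.random.rand(H), np.random.rand(H)
m_mask = np.ones(H)
m_mask[[5, 17, 42]] = 0                       # toy mask: three bands are swapped
x_cut_m, x_cut_n = cutmix_3d(x_m, x_n, m_mask)
assert np.allclose(x_cut_m[5], x_n[5])        # the swapped band comes from x_n
```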
S302, repeating step S301 N times to obtain, from the original hyperspectral data X_i^(j), the enhanced data X_Cut_i^(j) produced by the 3D CutMix operation, j = 1, 2, …, F, i = 1, 2, …, H;
S303, inputting the enhanced data X_Cut_i^(j) obtained in step S302 into the region-level teacher model M_RL to obtain a pseudo label L1^(j) through a one-dimensional Transformer;
S304, inputting the original hyperspectral data X_i^(j) into the sample-level teacher model M_SL to obtain a pseudo label L2^(j) through a one-dimensional Transformer;
S305, computing the self-supervision losses of the pseudo labels L1^(j) and L2^(j) against the real label respectively, while computing the cross pseudo-supervision loss between the pseudo labels L1^(j) and L2^(j).
The adopted Transformer structure is shown in FIG. 5. The objective functions L_RL and L_SL of the region-level teacher model M_RL and the sample-level teacher model M_SL are respectively:

argmin(L_RL) = argmin{L_ce(L1, L) + L_ce(L1, L2)}
argmin(L_SL) = argmin{L_ce(L2, L) + L_ce(L1, L2)}

wherein L_RL and L_SL respectively denote the objective functions of the region-level teacher model M_RL and the sample-level teacher model M_SL, L_ce denotes the cross-entropy loss function, L_ce(L1, L) and L_ce(L2, L) denote the self-supervision losses against the real label L, and L_ce(L1, L2) denotes the cross pseudo-supervision loss between the pseudo label L1^(j) produced by M_RL and the pseudo label L2^(j) produced by M_SL.
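The two-teacher step can be sketched as below. The tiny one-dimensional Transformer (each band treated as a token) and all hyper-parameters are assumptions; the patent fixes only that each teacher is a one-dimensional Transformer, that the pseudo labels L1 and L2 are compared against the real label L, and that they cross-supervise each other. Applying the cross term as CE(logits1, L2) and CE(logits2, L1) is one common reading of L_ce(L1, L2):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OneDTransformer(nn.Module):
    """Minimal one-dimensional Transformer classifier over a spectrum of H bands."""
    def __init__(self, num_bands=200, d_model=64, num_classes=16):
        super().__init__()
        self.embed = nn.Linear(1, d_model)                   # each band -> one token
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.fc = nn.Linear(d_model, num_classes)

    def forward(self, x):                                    # x: (batch, H)
        tokens = self.embed(x.unsqueeze(-1))                 # (batch, H, d_model)
        return self.fc(self.encoder(tokens).mean(dim=1))     # (batch, num_classes)

m_rl, m_sl = OneDTransformer(), OneDTransformer()

x = torch.randn(8, 200)                         # original spectra
x_cut = torch.randn(8, 200)                     # 3D CutMix-enhanced spectra
labels = torch.randint(0, 16, (8,))             # real labels L

logits1 = m_rl(x_cut)                           # region-level teacher on enhanced data
logits2 = m_sl(x)                               # sample-level teacher on original data
pseudo1, pseudo2 = logits1.argmax(1), logits2.argmax(1)   # pseudo labels L1, L2

# Self-supervision against the real label plus cross pseudo-supervision
loss_rl = F.cross_entropy(logits1, labels) + F.cross_entropy(logits1, pseudo2)
loss_sl = F.cross_entropy(logits2, labels) + F.cross_entropy(logits2, pseudo1)
(loss_rl + loss_sl).backward()
```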
S4, integrating the hyperspectral image classification region-level teacher model M_RL and sample-level teacher model M_SL constructed in step S3 with the labeled training data set Train divided in step S1 to jointly train the student model M_S, and inputting the labeled verification data set Test divided in step S1 into the trained student model M_S, realizing hyperspectral image classification based on a Transformer teacher-student model.
S401, performing an integration operation on the region-level teacher model M_RL and the sample-level teacher model M_SL, averaging the outputs of the two complementary models to obtain M_mean;
the integration (knowledge distillation) of the region-level teacher model M_RL and the sample-level teacher model M_SL is expressed as:

M_mean(x_u) = E(M_RL(x_u), M_SL(x_u))

where E(·) denotes averaging the class-probability outputs q of the two teachers with weight μ, and x_u denotes a large number of unlabeled samples.
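Continuing the sketch above, E(·) can be read as a μ-weighted mean of the two teachers' softmax outputs (the exact weighting is not spelled out in the text, so μ = 0.5 here is an assumption):

```python
import torch
import torch.nn.functional as F

def teacher_ensemble(m_rl, m_sl, x_u, mu=0.5):
    """M_mean: mu-weighted average of the two teachers' class probabilities q
    on unlabeled samples x_u; the teachers only guide, so no gradients."""
    with torch.no_grad():
        q_rl = F.softmax(m_rl(x_u), dim=1)
        q_sl = F.softmax(m_sl(x_u), dim=1)
    return mu * q_rl + (1.0 - mu) * q_sl

x_u = torch.randn(8, 200)                      # unlabeled spectra
q_mean = teacher_ensemble(m_rl, m_sl, x_u)     # (8, num_classes), rows sum to 1
```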
S402, using the integration of the region-level teacher model M_RL and the sample-level teacher model M_SL as a stronger guide to supervise the training of the student model M_S on unlabeled target data; in addition, the student model M_S is also supervised by the labels of a small number of authentic labeled samples.
The student model M_S acquires knowledge by minimizing the KL divergence between its output and the integrated output of M_RL and M_SL; at the same time, the student model M_S also receives supervision from the small number of labels Y_l.
The objective function L_S of the student model M_S is:

L_S = λ_kl · L_KL(M_S(X_u), E(M_RL(X_u), M_SL(X_u))) + λ_ce · L_ce(M_S(X_l), Y_l)

wherein L_KL denotes the KL-divergence loss, λ_kl and λ_ce are respectively the KL-divergence weight and the cross-entropy loss weight, X_u denotes the large amount of unlabeled data, and X_l and Y_l denote the small amount of labeled data and its labels; the specific flow is shown in FIG. 6.
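Continuing the sketches above (reusing OneDTransformer and teacher_ensemble), the student objective can be written as follows; placing the student distribution inside the KL term follows the textual description, and the loss weights are assumed values:

```python
import torch
import torch.nn.functional as F

student = OneDTransformer()                     # same assumed architecture
lambda_kl, lambda_ce = 1.0, 1.0                 # assumed loss weights

x_u = torch.randn(64, 200)                      # large unlabeled batch X_u
x_l = torch.randn(8, 200)                       # small labeled batch X_l
y_l = torch.randint(0, 16, (8,))                # its labels Y_l

q_teacher = teacher_ensemble(m_rl, m_sl, x_u)   # E(M_RL(X_u), M_SL(X_u))
log_q_student = F.log_softmax(student(x_u), dim=1)

# L_S = lambda_kl * KL(student || teacher ensemble) + lambda_ce * CE on labels
loss_s = (lambda_kl * F.kl_div(log_q_student, q_teacher, reduction="batchmean")
          + lambda_ce * F.cross_entropy(student(x_l), y_l))
loss_s.backward()
```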
In another embodiment of the invention, a 3D CutMix-Transformer-based hyperspectral image classification system is provided that can be used to implement the above 3D CutMix-Transformer-based hyperspectral image classification method; specifically, the system comprises a data module, a building module, an enhancement module and a classification module.
The data module divides the hyperspectral data into a labeled training data set Train and a labeled verification data set Test;
the building module is used for building a CNN convolutional neural network, inputting a labeled training data set Train divided by the data module into the CNN convolutional neural network, and building a 3D CutMix pre-training model based on the CNN;
an enhancement module, constructing a hyperspectral image classification region-level teacher model M_RL and a sample-level teacher model M_SL based on the 3D CutMix pre-training model of the building module;
a classification module, integrating the hyperspectral image classification region-level teacher model M_RL and sample-level teacher model M_SL constructed by the enhancement module with the labeled training data set Train divided by the data module to jointly train the student model M_S, and inputting the labeled verification data set Test divided by the data module into the trained student model M_S, thereby realizing hyperspectral image classification based on a Transformer teacher-student model.
In summary, the hyperspectral image classification method and system based on the 3D CutMix-Transformer pre-train the 3D CutMix through a CNN; train a region-level teacher model and a sample-level teacher model with the enhanced data obtained by applying 3D CutMix to the training data set; and train a student model with the two teacher models and a small labeled data set. Enhancing the hyperspectral data with 3D CutMix alleviates the problem of insufficient training samples and improves the accuracy and robustness of the model; meanwhile, the global feature extraction capability and parallel computing capability of the Transformer make its feature extraction more comprehensive than that of a CNN, which can only extract local information, and allow the computing power of the GPU to be fully utilized.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (10)

1. A hyperspectral image classification method based on a 3D CutMix-Transformer, characterized by comprising the following steps:
s1, dividing the hyperspectral data into a labeled training data set Train and a labeled verification data set Test;
s2, building a CNN convolutional neural network, inputting the labeled training data set Train divided in the step S1 into the CNN convolutional neural network, and building a 3D CutMix pre-training model based on the CNN;
s3, constructing a hyperspectral image classification region level teacher model M based on the 3D CutMix pre-training model in the step S2RLAnd a sample level teacher model MSL
S4, integrating the hyperspectral image classification region level teacher model M constructed in the step S3RLAnd a sample level teacher model MSLAnd step S1 dividing labeled training data set Train to jointly Train student model MSInputting the labeled verification data set Test divided in the step S1 into the trained student model MSIn the method, hyperspectral image classification based on a Transformer teacher student model is realized.
2. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 1, wherein in step S1, the labeled training data set Train accounts for 80% of the hyperspectral data and the labeled verification data set Test accounts for 20% of the hyperspectral data.
3. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 1, wherein step S2 specifically comprises:
s201, building a CNN convolutional neural network;
s202, inputting the labeled training data set Train divided in the step S1 into a CNN convolutional neural network to obtain a classification result of each pixel element;
s203, analyzing the contribution rate of the classification result of each pixel element obtained in the step S202 and the band data contained in the classification result to obtain a band with the highest contribution rate, and selecting the band which does not exceed 5% of the number of the bands from the candidates;
and S204, designing the 3D CutMix according to the plurality of wave bands with the highest contribution rate of each pixel element classification result obtained in the step S203 to obtain a 3D Mask of hyperspectral data, and finally obtaining a pre-trained 3D Mask pre-training model.
4. The 3D CutMix-Transformer-based hyperspectral image classification method as claimed in claim 3, wherein in step S201, the input layer of the CNN convolutional neural network takes standard data in mat format; the raw data of format (L × W) × H are decentralized and standardized, and the output result F × H serves as the input of the first convolutional layer, F = L × W, L being the number of pixels along the long side of the hyperspectral data, W the number of pixels along the wide side, H the number of bands, and F the total number of pixels; a 0-1 constraint is performed on the weights of the CNN input layer: if the input weight of X_i^(j) is greater than 0.5 it is 1, otherwise it is 0;
the convolutional layers use convolution kernels/filters of size 4 × 4 sliding one pixel at a time, one feature map uses the same convolution kernel, and the activation function is the ReLU function.
5. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 3, wherein in step S202, each hyperspectral image datum is represented as X_i^(j), the j-th pixel element, j = 1, 2, …, F, i = 1, 2, …, H, with corresponding label L^(j), and the multi-class cross-entropy loss function adopted is:

L_ce = -∑_{i=0}^{C-1} y_i · log(p_i)

wherein p = [p_0, …, p_{C-1}] is a probability distribution, each element p_i representing the probability that the sample belongs to class i; y = [y_0, …, y_{C-1}] is the one-hot representation of the sample label, y_i = 1 when the sample belongs to class i and y_i = 0 otherwise; and C is the number of sample classes.
6. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 1, wherein step S3 specifically comprises:
s301, randomly selecting two pixel elements X in hyperspectral data(m)And X(n)Performing 3D CutMix operation to obtain X _ Cut(m)And X _ Cut(n)
S302, repeating the step S301 for N times to obtain original hyperspectral data
Figure FDA0003486455750000022
Enhanced data obtained by 3D CutMix operations
Figure FDA0003486455750000023
S303, the enhancement data obtained in the step S302
Figure FDA0003486455750000024
Input region level teacher model MRLObtaining a pseudo label L1 through a one-dimensional Transformer(j)
S304, subjecting the original hyperspectral data
Figure FDA0003486455750000031
Input sample level teacher model MSLObtaining a pseudo label L2 through a one-dimensional Transformer(j)
S305, respectively using the pseudo label L1 of the step S303(j)With dummy L2 of step S304(j)Self-supervision loss with real label, and calculation of pseudo label L1(j)And pseudo label L2(j)Cross-pseudo-supervised loss between, region level teacher model MRLIs an objective function LRLAnd a sample level teacher model MSLIs an objective function LSL
7. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 6, wherein in step S305, the objective function L_RL of the region-level teacher model M_RL and the objective function L_SL of the sample-level teacher model M_SL are respectively:

argmin(L_RL) = argmin{L_ce(L1, L) + L_ce(L1, L2)}
argmin(L_SL) = argmin{L_ce(L2, L) + L_ce(L1, L2)}

wherein L_RL and L_SL respectively denote the objective functions of the region-level teacher model M_RL and the sample-level teacher model M_SL, L_ce denotes the cross-entropy loss function, L_ce(L1, L) and L_ce(L2, L) denote the self-supervision losses, and L_ce(L1, L2) denotes the cross pseudo-supervision loss between the pseudo labels produced by M_RL and M_SL.
8. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 1, wherein step S4 specifically comprises:
S401, performing an integration operation on the region-level teacher model M_RL and the sample-level teacher model M_SL, averaging the outputs of the two complementary models to obtain M_mean;
S402, using the integration of the region-level teacher model M_RL and the sample-level teacher model M_SL as a guide to supervise the training of the student model M_S on unlabeled target data.
9. The 3D CutMix-Transformer-based hyperspectral image classification method according to claim 1, wherein in step S401, the integration of the region-level teacher model M_RL and the sample-level teacher model M_SL is expressed as:

M_mean(x_u) = E(M_RL(x_u), M_SL(x_u))

wherein E(·) denotes averaging the class-probability outputs q of the region-level teacher model M_RL and the sample-level teacher model M_SL with weight μ, and x_u denotes unlabeled samples.
10. A hyperspectral image classification system based on a 3D CutMix-Transformer, characterized by comprising:
the data module divides the hyperspectral data into a labeled training data set Train and a labeled verification data set Test;
the building module is used for building a CNN convolutional neural network, inputting a labeled training data set Train divided by the data module into the CNN convolutional neural network, and building a 3D CutMix pre-training model based on the CNN;
an enhancement module, constructing a hyperspectral image classification region-level teacher model M_RL and a sample-level teacher model M_SL based on the 3D CutMix pre-training model of the building module;
a classification module, integrating the hyperspectral image classification region-level teacher model M_RL and sample-level teacher model M_SL constructed by the enhancement module with the labeled training data set Train divided by the data module to jointly train the student model M_S, and inputting the labeled verification data set Test divided by the data module into the trained student model M_S, thereby realizing hyperspectral image classification based on a Transformer teacher-student model.
CN202210082474.XA 2022-01-24 2022-01-24 Hyperspectral image classification method and system based on 3D CutMix-transform Pending CN114494777A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210082474.XA CN114494777A (en) 2022-01-24 2022-01-24 Hyperspectral image classification method and system based on 3D CutMix-transform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210082474.XA CN114494777A (en) 2022-01-24 2022-01-24 Hyperspectral image classification method and system based on 3D CutMix-transform

Publications (1)

Publication Number Publication Date
CN114494777A true CN114494777A (en) 2022-05-13

Family

ID=81473908

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210082474.XA Pending CN114494777A (en) 2022-01-24 2022-01-24 Hyperspectral image classification method and system based on 3D CutMix-transform

Country Status (1)

Country Link
CN (1) CN114494777A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115019215A (en) * 2022-08-09 2022-09-06 之江实验室 Hyperspectral image-based soybean disease and pest identification method and device
CN115019215B (en) * 2022-08-09 2022-12-09 之江实验室 Hyperspectral image-based soybean disease and pest identification method and device
CN116681997A (en) * 2023-06-13 2023-09-01 北京数美时代科技有限公司 Classification method, system, medium and equipment for bad scene images
CN116681997B (en) * 2023-06-13 2024-05-17 北京数美时代科技有限公司 Classification method, system, medium and equipment for bad scene images


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination