CN116524265A - Hyperspectral image classification method based on multi-scale mixed convolution network
- Publication number
- CN116524265A (application CN202310492375.3A)
- Authority
- CN
- China
- Prior art keywords
- scale
- layer
- dimensional convolution
- hyperspectral image
- convolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/10—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
Abstract
The invention belongs to the technical field of image classification processing, and in particular relates to a hyperspectral image classification method based on a multi-scale mixed convolution network, which specifically comprises the following steps: step 1, hyperspectral image: a public hyperspectral image dataset is used; step 2, image preprocessing: data dimension reduction is performed on the hyperspectral image of step 1, and sample blocks are taken from the reduced image to obtain hyperspectral sample blocks. The invention extracts the spatial-spectral joint features of the hyperspectral image with several multi-scale 3D convolution modules, so that the spectral and spatial features at different scales are fully fused and the classification performance on hyperspectral images is improved.
Description
Technical Field
The invention relates to the technical field of image classification processing, in particular to a hyperspectral image classification method based on a multi-scale mixed convolution network.
Background
Hyperspectral remote sensing uses many electromagnetic bands to acquire data about a measured object and is at the leading edge of current remote-sensing technology. A hyperspectral image combines surface image information with spectral information, and has the advantages of a large amount of spectral information, nanoscale spectral resolution, and integrated imaging and spectroscopy. Classifying hyperspectral images, acquired by imaging spectrometers carried on aircraft and satellites, with deep-learning techniques has become an emerging research field.
Hyperspectral image classification is based on spectral information and spatial information, and the combined use of the two is now widespread in the field. Chinese patent publication CN113837314A, "Hyperspectral image classification method based on a mixed convolutional neural network", uses a single 3D convolution model to extract the spectral and spatial features of the preprocessed hyperspectral image simultaneously, a single 2D convolution model to further extract spatial features, and a single 1D convolution model to process the output information. That method does not consider the relations between features at different scales and different levels, does not exploit the relations between distant pixel blocks, and its classification performance still needs improvement.
Disclosure of Invention
(I) Technical problems to be solved
Aiming at the defects of the prior art, the invention provides a hyperspectral image classification method based on a multi-scale mixed convolution network, which addresses the problem of limited classification accuracy.
(II) technical scheme
The invention adopts the following technical scheme to achieve the above purpose: a hyperspectral image classification method based on a multi-scale mixed convolution network, which specifically comprises the following steps:
Step 1, hyperspectral image: using a public hyperspectral image dataset;
step 2, image preprocessing: performing data dimension reduction on the hyperspectral image of step 1, and taking sample blocks from the reduced image to obtain hyperspectral sample blocks;
step 3, constructing a multi-scale mixed convolution network model: the whole network model comprises two multi-scale three-dimensional convolution blocks, two multi-scale two-dimensional convolution blocks, a mixed attention block, a three-dimensional convolution layer, a two-dimensional convolution layer, a one-dimensional convolution layer, two full connection blocks and a classifier. The hyperspectral sample block is input into the two directly connected multi-scale three-dimensional convolution blocks to extract spatial-spectral features containing multi-scale information; the spatial-spectral features extracted by the two multi-scale three-dimensional convolution blocks are reshaped and then respectively input into the two multi-scale two-dimensional convolution blocks to further extract multi-scale, multi-level spatial features; the spatial-spectral features extracted by the two multi-scale three-dimensional convolution blocks are also input into the three-dimensional convolution layer, and the convolution result is reshaped and then input into the two-dimensional convolution layer to further extract spectral features; the multi-scale, multi-level spatial features and the spectral features are fused and then input into the mixed attention block, so that the network attends only to useful spatial and spectral information; the attended spatial and spectral information is reshaped and then input into the one-dimensional convolution layer to obtain features in vector form; finally, these are input into the two full connection blocks, and the classification result is output through the classifier.
Step 4, selecting a loss function and an evaluation index: the loss between the classification result and the label is calculated, and training of the model parameters is considered complete when the number of training iterations reaches a set threshold or the loss value falls within a set range; meanwhile, an evaluation index is selected to measure the accuracy of the algorithm and evaluate the performance of the system;
step 5, saving the trained model: the set of model parameters with the best performance during training is selected and saved (frozen); afterwards, whenever hyperspectral image classification is required, a hyperspectral image can be input directly into the network to obtain the final classified image.
Further, the public datasets adopted in step 1 are the Indian Pines dataset (IN), the University of Pavia dataset (UP), and the Salinas dataset (SA).
Further, the data dimension reduction in step 2 uses principal component analysis (PCA); the dimension-reduction process is as follows:
The original hyperspectral image I1 of dimension W×H×C1 is converted, through eigendecomposition of its covariance matrix, into a new hyperspectral image I2 of dimension W×H×C2, where W is the image width, H is the image height, C1 is the number of original image channels, and C2 is the number of bands after conversion. The dimension-reduction operation reduces the redundancy of the data features and also reduces the number of computed parameters.
Further, the sample block-taking operation in step 2 is as follows:
The new hyperspectral image I2 is cut into three-dimensional sample blocks of size w×w×C2, which are input into the network model, where w is the window size.
Further, the multi-scale three-dimensional convolution block in step 3 consists of three branches: small-scale, medium-scale, and large-scale. The small-scale branch consists, in order, of a three-dimensional convolution layer, a batch normalization layer, and an activation function layer; the medium-scale and large-scale branches each consist, in order, of a three-dimensional convolution layer, an activation function layer, a three-dimensional convolution layer, a batch normalization layer, and an activation function layer; the tensors produced by the three branches are spliced together along the first dimension. The multi-scale two-dimensional convolution block likewise consists of small-scale, medium-scale, and large-scale branches; each branch consists, in order, of a two-dimensional convolution layer, a batch normalization layer, and an activation function layer, and the tensors produced by the three branches are spliced together along the first dimension. The mixed attention block comprises a spectral attention block and a spatial attention block connected in series. In the spectral attention block, the input first passes through two branches, global average pooling and global max pooling; each pooling result is fed in turn into a one-dimensional convolution layer and activation function layer 1; the resulting tensors are spliced along the first dimension and then fed in turn into linear layer 1, activation function layer 2, linear layer 2, activation function layer 1, a one-dimensional convolution layer, and activation function layer 1; finally, the result is multiplied element-wise with the input features. In the spatial attention block, the input first passes through two branches, average pooling and max pooling; each pooling result is fed in turn into a two-dimensional convolution layer and a batch normalization layer; the resulting tensors are spliced along the first dimension and then fed in turn into a two-dimensional convolution layer and an activation function layer. The full connection block consists, in order, of a linear layer, an activation function layer, and a Dropout layer.
Further, the loss function in step 4 is the cross-entropy loss function; the choice of loss function affects model quality, since it should truly reflect the difference between the predicted value and the actual value and correctly feed back the quality of the model. The evaluation indexes are overall accuracy, average accuracy, and the consistency (Kappa) coefficient, which can effectively evaluate the quality of the classification and measure the effect of the classification network.
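As an illustrative sketch only (not code from the patent), the cross-entropy loss named above can be written in a few lines of NumPy; the toy logits and labels are assumptions:

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy loss for integer class labels, computed from raw
    logits with a numerically stable log-softmax."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# hypothetical network outputs for two samples, three classes
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 3.0, 0.3]])
labels = np.array([0, 1])     # both predictions agree with the labels
loss = cross_entropy(logits, labels)
```

A lower loss for correct predictions than for wrong ones is what drives the parameter training described in step 4.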
(III) beneficial effects
Compared with the prior art, the hyperspectral image classification method based on the multi-scale mixed convolution network has the following beneficial effects:
1. according to the invention, the space-spectrum combined features of the hyperspectral image are extracted by adopting a plurality of multi-scale 3D convolution modules, so that the spectrum and space dimension features under different scales are fully fused, and the classification performance of the hyperspectral image is improved.
2. According to the method, a plurality of multi-scale 2D convolution modules are adopted, the space dimension characteristics of different layers are further extracted, the calculated amount is reduced, and meanwhile, the classification precision of hyperspectral images is further improved.
3. The invention adds a mixed attention module to establish connections between distant pixel blocks, which further improves the classification accuracy of hyperspectral images.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram of a network architecture of the present invention;
FIG. 3 is a schematic diagram of the specific composition of a multi-scale 3D convolution module according to the present disclosure;
FIG. 4 is a schematic diagram of the specific composition of a multi-scale 2D convolution module according to the present disclosure;
FIG. 5 is a schematic diagram showing the specific components of the hybrid attention module of the present invention;
FIG. 6 is a schematic diagram showing the specific composition of a spectral attention block in the hybrid attention module of the present invention;
FIG. 7 is a schematic diagram showing the specific composition of a spatial attention block in the hybrid attention module of the present invention;
FIG. 8 is a schematic diagram showing the specific composition of the full connection block of the present invention;
FIG. 9 is a comparison chart of relevant metrics of the present invention on the three datasets.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Examples
As shown in fig. 1-9, a hyperspectral image classification method based on a multi-scale mixed convolution network according to an embodiment of the present invention specifically includes the following steps:
step 1, hyperspectral image: the disclosed hyperspectral image dataset is employed.
Step 2, image preprocessing: data dimension reduction is performed on the hyperspectral image of step 1, and sample blocks are taken from the reduced image to obtain hyperspectral sample blocks; specifically, the original hyperspectral image I1 is reduced in dimension by principal component analysis, and sample blocks are then taken from the new reduced hyperspectral image I2 to obtain three-dimensional image blocks.
Because hyperspectral images contain a large amount of data and numerous bands, a dimension-reduction operation on the data is necessary. Principal component analysis (PCA) is a statistical method that converts a group of possibly correlated variables into a group of linearly uncorrelated variables through an orthogonal transformation; the converted variables are called principal components.
The sample block-taking is implemented as follows: in the spatial dimension, the new hyperspectral image I2 is cut into three-dimensional image blocks of size w×w×B, which are input into the network model, where w is the window size; each sample block takes the label of its center pixel.
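The block-taking step above can be sketched in NumPy as follows; this is an illustrative sketch, not the patent's code, and the toy cube size and window w=5 are assumptions (borders are zero-padded so every pixel gets a block):

```python
import numpy as np

def extract_blocks(image, w):
    """Cut an H x W x C hyperspectral cube into one w x w x C sample block
    per pixel; each block is centered on (and labeled by) its center pixel."""
    H, W_, C = image.shape
    m = w // 2
    # zero-pad the spatial borders so edge pixels also get full windows
    padded = np.pad(image, ((m, m), (m, m), (0, 0)), mode="constant")
    blocks = np.empty((H * W_, w, w, C), dtype=image.dtype)
    k = 0
    for r in range(H):
        for c in range(W_):
            blocks[k] = padded[r:r + w, c:c + w, :]
            k += 1
    return blocks

cube = np.random.rand(8, 8, 5)      # toy stand-in for the reduced image I2
blocks = extract_blocks(cube, w=5)  # 64 blocks of size 5 x 5 x 5
```

The center of block k is exactly the pixel whose label the block inherits.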
Step 3, constructing a multi-scale mixed convolution network model: the whole network model comprises two multi-scale three-dimensional convolution blocks, two multi-scale two-dimensional convolution blocks, a mixed attention block, a three-dimensional convolution layer, a two-dimensional convolution layer, a one-dimensional convolution layer, two full connection blocks and a classifier. The hyperspectral sample block is input into the two directly connected multi-scale three-dimensional convolution blocks to extract spatial-spectral features containing multi-scale information; the spatial-spectral features extracted by the two multi-scale three-dimensional convolution blocks are reshaped and then respectively input into the two multi-scale two-dimensional convolution blocks to further extract multi-scale, multi-level spatial features; the spatial-spectral features extracted by the two multi-scale three-dimensional convolution blocks are also input into the three-dimensional convolution layer, and the convolution result is reshaped and then input into the two-dimensional convolution layer to further extract spectral features; the multi-scale, multi-level spatial features and the spectral features are fused and then input into the mixed attention block, so that the network attends only to useful spatial and spectral information; the attended spatial and spectral information is reshaped and then input into the one-dimensional convolution layer to obtain features in vector form; finally, these are input into the two full connection blocks, and the classification result is output through the classifier. The multi-scale three-dimensional convolution block consists of three branches: small-scale, medium-scale, and large-scale. The small-scale branch consists, in order, of a three-dimensional convolution layer, a batch normalization layer, and an activation function layer; the medium-scale and large-scale branches each consist, in order, of a three-dimensional convolution layer, an activation function layer, a three-dimensional convolution layer, a batch normalization layer, and an activation function layer; the tensors produced by the three branches are spliced together along the first dimension. The multi-scale two-dimensional convolution block likewise consists of small-scale, medium-scale, and large-scale branches; each branch consists, in order, of a two-dimensional convolution layer, a batch normalization layer, and an activation function layer, and the tensors produced by the three branches are spliced together along the first dimension. The mixed attention block comprises a spectral attention block and a spatial attention block connected in series. In the spectral attention block, the input first passes through two branches, global average pooling and global max pooling; each pooling result is fed in turn into a one-dimensional convolution layer and activation function layer 1; the resulting tensors are spliced along the first dimension and then fed in turn into linear layer 1, activation function layer 2, linear layer 2, activation function layer 1, a one-dimensional convolution layer, and activation function layer 1; finally, the result is multiplied element-wise with the input features. In the spatial attention block, the input first passes through two branches, average pooling and max pooling; each pooling result is fed in turn into a two-dimensional convolution layer and a batch normalization layer; the resulting tensors are spliced along the first dimension and then fed in turn into a two-dimensional convolution layer and an activation function layer. The full connection block consists, in order, of a linear layer, an activation function layer, and a Dropout layer.
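To illustrate the core idea of the spectral attention block — squeeze each channel by global average and max pooling, pass both through a small gating network, and reweight the input channels — here is a minimal NumPy sketch. It only loosely follows the layer ordering described above: the gate here is a simple linear-ReLU-linear pair with sigmoid output, and all weights and shapes are assumptions, not the patent's parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spectral_attention(feat, W1, W2):
    """Channel attention over a (C, H, W) feature map: global average and
    max pooling squeeze each channel to a scalar, a shared two-layer gate
    scores the channels, and the input is rescaled channel-wise."""
    avg = feat.mean(axis=(1, 2))                     # (C,) average pooling
    mx = feat.max(axis=(1, 2))                       # (C,) max pooling
    gate = lambda v: W2 @ np.maximum(W1 @ v, 0.0)    # linear -> ReLU -> linear
    weights = sigmoid(gate(avg) + gate(mx))          # (C,) in (0, 1)
    return feat * weights[:, None, None]             # element-wise reweighting

rng = np.random.default_rng(0)
feat = rng.standard_normal((16, 7, 7))    # toy fused feature map, 16 channels
W1 = rng.standard_normal((4, 16)) * 0.1   # hypothetical reduction to 4 units
W2 = rng.standard_normal((16, 4)) * 0.1
out = spectral_attention(feat, W1, W2)
```

Because the attention weights lie in (0, 1), the block can only attenuate channels it finds uninformative, which matches the stated goal of attending only to useful spectral information.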
The batch normalization layer uses a normalization step to pull the distribution of the input values of each layer's neurons back toward a standard normal distribution with mean 0 and variance 1, so that the activation inputs fall in the region where the nonlinear function is sensitive to its input; the network outputs then do not grow too large, relatively large gradients are obtained, the vanishing-gradient problem is avoided, and the larger gradients also speed up learning convergence. The Dropout layer stops some neurons from working with a set probability during forward propagation; training then proceeds, updating the weight parameters of the neurons that remain active. After the parameters are updated, a new set of neurons is again stopped according to the set probability and training continues: a neuron that was active in an earlier iteration keeps its updated parameters while it is inactive, and its parameters continue to be updated whenever it is active again. This process is repeated until training ends, and it prevents the network from overfitting during learning.
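The two mechanisms just described can be sketched in NumPy; this is a generic training-time forward pass (the batch size, feature width, and dropout probability are assumptions, and the running statistics used at inference time are omitted):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature of an (N, C) batch to zero mean and unit
    variance, then apply the learnable scale (gamma) and shift (beta)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

def dropout(x, p, rng):
    """Inverted dropout: zero each activation with probability p during
    training and scale the survivors by 1/(1-p) to keep expectations."""
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

rng = np.random.default_rng(1)
x = rng.standard_normal((32, 8)) * 3.0 + 5.0     # toy pre-activation batch
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))
d = dropout(x, p=0.5, rng=rng)
```

After batch normalization each column of `y` has (near) zero mean and unit variance, which is exactly the "pull back to the standard normal distribution" behavior described above.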
Step 4, selecting a loss function and an evaluation index: the loss between the classification result image and the label is calculated, and training of the model parameters is considered complete when the number of training iterations reaches a set threshold or the loss value falls within a set range; the model parameters are then saved. Meanwhile, an evaluation index is selected to measure the accuracy of the algorithm and evaluate the performance of the system. The choice of loss function affects model quality, since it should truly reflect the difference between the predicted value and the true value and correctly feed back the quality of the model; the evaluation indexes are overall accuracy, average accuracy, and the consistency (Kappa) coefficient, which can effectively evaluate the quality of the classification and measure the effect of the classification network.
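The three evaluation indexes named above can be computed from a confusion matrix; the following NumPy sketch is illustrative only (the toy labels are assumptions), and "consistency" is taken here to mean Cohen's Kappa coefficient, as is standard in hyperspectral classification work:

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes):
    """Overall accuracy (OA), average per-class accuracy (AA), and the
    Kappa coefficient, all derived from a confusion matrix."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    total = cm.sum()
    oa = np.trace(cm) / total                       # fraction correct overall
    aa = np.mean(np.diag(cm) / cm.sum(axis=1))      # mean of per-class recalls
    pe = (cm.sum(axis=0) @ cm.sum(axis=1)) / total**2   # chance agreement
    kappa = (oa - pe) / (1 - pe)                    # agreement beyond chance
    return oa, aa, kappa

y_true = np.array([0, 0, 1, 1, 2, 2])   # hypothetical ground-truth labels
y_pred = np.array([0, 0, 1, 0, 2, 2])   # one sample of class 1 misclassified
oa, aa, kappa = classification_metrics(y_true, y_pred, 3)
```

Kappa discounts agreement that would occur by chance, so it is a stricter measure than OA when class sizes are imbalanced.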
Step 5, saving a training model: and (3) selecting a group of model parameters with the best effect in the training process for solidification, and then when hyperspectral image classification operation is needed, directly inputting hyperspectral images into a network to obtain final classified images.
Further, step 1 selects the Indian Pines dataset (IN), the University of Pavia dataset (UP), and the Salinas dataset (SA). The Indian Pines dataset (IN) is a hyperspectral image obtained by an airborne visible/infrared imaging spectrometer over northwest Indiana; the spatial size of the image is 145×145, the number of bands is 220, and the spectral and spatial resolutions are 10 nm and 20 m. After background pixels are removed, the number of pixels generally used for experiments is 10249, with 16 ground-truth classes; of the 220 bands, 20 are unusable, and only the remaining 200 bands are used in the experiments. The University of Pavia dataset (UP) was obtained by the AVIRIS sensor in Florida in 1996, with a spatial size of 512×614 and a spatial resolution of 18 m, and is divided into 9 categories; of its 115 bands, 12 noise bands are removed, leaving 103 usable bands. The Salinas dataset (SA) is a hyperspectral image obtained in the United States by the AVIRIS sensor; the spatial size of the image is 512×217 and the spatial resolution is 1.7 m, with 16 ground-object categories and 224 bands, of which 20 water-absorption bands are removed, leaving 204 bands for the hyperspectral image classification experiments.
Further, in the step 2, taking the Indian Pines data set as an example, the original hyperspectral image I1 is first reduced to 145×145×30: the covariance matrix of the original hyperspectral image is computed, its eigenvalues λ1 ≥ λ2 ≥ … ≥ λ200 are calculated, a threshold θ is set, and the first P principal components whose eigenvalues exceed θ are selected; the unit eigenvectors corresponding to these P eigenvalues are combined into a matrix, the transpose of this matrix is taken, and the original hyperspectral image is transformed by it to obtain the dimension-reduced hyperspectral image. The dimension-reduced image I2 is then cut into blocks, giving three-dimensional image blocks of size 25×25×30. The band mean and covariance matrix are computed as follows:

M = (1/Q) Σ_{i=1}^{Q} X_i,    Cov = (1/Q) Σ_{i=1}^{Q} (X_i − M)(X_i − M)^T

wherein X_i represents the i-th pixel of the original hyperspectral image, Q represents the number of pixels, X_j represents the j-th band of the original hyperspectral image, and B represents the number of bands;
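As a minimal illustration of the PCA dimension reduction and block-taking described in step 2, the following numpy sketch follows the mean/covariance formulas above; the function names and the reflective border padding are illustrative assumptions, not part of the patent.

```python
import numpy as np

def pca_reduce(image, n_components=30):
    """Reduce the band dimension of an (H, W, B) hyperspectral cube with PCA."""
    h, w, b = image.shape
    pixels = image.reshape(-1, b).astype(np.float64)   # Q x B matrix of pixels
    mean = pixels.mean(axis=0)                          # per-band mean M
    cov = np.cov(pixels - mean, rowvar=False)           # B x B covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)              # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]    # top-P principal components
    proj = eigvecs[:, order]                            # B x P projection matrix
    reduced = (pixels - mean) @ proj                    # transform every pixel
    return reduced.reshape(h, w, n_components)

def extract_patches(image, window=25):
    """Cut the reduced cube into window x window x C blocks around each pixel."""
    pad = window // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    h, w, _ = image.shape
    for i in range(h):
        for j in range(w):
            yield padded[i:i + window, j:j + window, :]
```

With the Indian Pines settings this would turn a 145×145×200 cube into 145×145×30 and yield 25×25×30 blocks.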
Further, in the step 3, the structure of the multi-scale mixed convolution network model is shown in fig. 2; the whole network model comprises two multi-scale three-dimensional convolution blocks, two multi-scale two-dimensional convolution blocks, one mixed attention block, one three-dimensional convolution layer, one two-dimensional convolution layer, one one-dimensional convolution layer, two fully connected blocks and one classifier. The structure of the multi-scale three-dimensional convolution block is shown in fig. 3, wherein all activation function layers use the Mish function. In the small-scale branch of the first multi-scale three-dimensional convolution block, the convolution kernel size in the convolution layer is 1×1 and the number of convolution kernels is 16; in the medium-scale branch of the first multi-scale three-dimensional convolution block, the kernel size in the first three-dimensional convolution layer is 1×1 with 8 kernels, and the kernel size in the second three-dimensional convolution layer is 3×3 with 16 kernels; in the large-scale branch of the first multi-scale three-dimensional convolution block, the kernel size in the first three-dimensional convolution layer is 1×1 with 8 kernels, and the kernel size in the second three-dimensional convolution layer is 5×5 with 32 kernels. In the small-scale branch of the second multi-scale three-dimensional convolution block, the kernel size in the convolution layer is 1×1 with 32 kernels; in the medium-scale branch, the kernel size in the first three-dimensional convolution layer is 1×1 with 16 kernels, and the kernel size in the second three-dimensional convolution layer is 3×3 with 32 kernels; in the large-scale branch, the kernel size in the first three-dimensional convolution layer is 1×1 with 16 kernels, and the kernel size in the second three-dimensional convolution layer is 5×5 with 64 kernels. The structure of the multi-scale two-dimensional convolution block is shown in fig. 4, wherein all activation function layers use the Mish function. In the first multi-scale two-dimensional convolution block, the small-scale branch uses 1×1 kernels (16 kernels), the medium-scale branch uses 3×3 kernels (16 kernels), and the large-scale branch uses 5×5 kernels (16 kernels); in the second multi-scale two-dimensional convolution block, the small-scale branch uses 1×1 kernels (32 kernels), the medium-scale branch uses 3×3 kernels (32 kernels), and the large-scale branch uses 5×5 kernels (32 kernels). The kernel size in the three-dimensional convolution layer connected to the second multi-scale three-dimensional convolution block is 1×1 with 64 kernels, and the kernel size in the two-dimensional convolution layer connected to that three-dimensional convolution layer is 3×3 with 64 kernels. The structure of the mixed attention block is shown in fig. 5. The structure of the spectral attention block is shown in fig. 6: the kernel size in the one-dimensional convolution layers of the two branches is B (B being the number of channels after dimension reduction), activation function layer 1 uses the Sigmoid function, activation function layer 2 uses the Mish function, and the kernel size in the subsequent one-dimensional convolution layer is 2B. The structure of the spatial attention block is shown in fig. 7: the kernel size of the two-dimensional convolution layers of the two branches is 1×1, the kernel size in the subsequent two-dimensional convolution layer is 3×3, and the activation function layer uses the Sigmoid function. The structure of the fully connected block is shown in fig. 8, wherein the activation function layer uses the Mish function and the Dropout coefficient is set to 0.2. The Mish activation function is a non-monotonic smooth activation function that achieves better accuracy and generalization; the Sigmoid activation function is less affected by noisy data; LogSoftmax is selected as the classifier, which speeds up computation and improves numerical stability. The Sigmoid function, the Mish function and the LogSoftmax function are defined as follows:
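The three-branch multi-scale three-dimensional convolution block described above can be sketched in PyTorch roughly as follows; since the text gives kernel sizes such as 1×1 and 3×3 without the third dimension, the cubic kernels, padding values and default channel counts here are assumptions matching the first block.

```python
import torch
import torch.nn as nn

class MultiScale3DBlock(nn.Module):
    """Sketch of a multi-scale 3D convolution block: three branches
    (small / medium / large scale) whose outputs are concatenated along
    the channel dimension, all using the Mish activation."""
    def __init__(self, in_ch, small_ch=16, mid_ch=8):
        super().__init__()
        # small-scale branch: 1x1x1 conv -> batch norm -> Mish
        self.small = nn.Sequential(
            nn.Conv3d(in_ch, small_ch, kernel_size=1),
            nn.BatchNorm3d(small_ch), nn.Mish())
        # medium-scale branch: 1x1x1 conv -> Mish -> 3x3x3 conv -> BN -> Mish
        self.medium = nn.Sequential(
            nn.Conv3d(in_ch, mid_ch, kernel_size=1), nn.Mish(),
            nn.Conv3d(mid_ch, 2 * mid_ch, kernel_size=3, padding=1),
            nn.BatchNorm3d(2 * mid_ch), nn.Mish())
        # large-scale branch: 1x1x1 conv -> Mish -> 5x5x5 conv -> BN -> Mish
        self.large = nn.Sequential(
            nn.Conv3d(in_ch, mid_ch, kernel_size=1), nn.Mish(),
            nn.Conv3d(mid_ch, 4 * mid_ch, kernel_size=5, padding=2),
            nn.BatchNorm3d(4 * mid_ch), nn.Mish())

    def forward(self, x):
        # concatenate the three branch outputs along the channel dimension
        return torch.cat([self.small(x), self.medium(x), self.large(x)], dim=1)
```

With the defaults above, a single-channel input yields 16 + 16 + 32 = 64 output channels, matching the first block's branch widths.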
Sigmoid(x) = 1 / (1 + e^(−x))

Mish(x) = x · tanh(ln(1 + e^x))

LogSoftmax(x_i) = ln( e^(x_i) / Σ_j e^(x_j) )

wherein x represents the input feature information, x_i represents the predicted label value, and x_j represents the true label value.
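The three functions defined above can be written directly in numpy; this is an illustrative implementation of the stated formulas (with a standard max-shift added to LogSoftmax for numerical stability), not the patent's code.

```python
import numpy as np

def sigmoid(x):
    # Sigmoid(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))

def mish(x):
    # Mish(x) = x * tanh(ln(1 + e^x)): smooth and non-monotonic
    return x * np.tanh(np.log1p(np.exp(x)))

def log_softmax(x):
    # LogSoftmax over the last axis, shifted by the max for stability
    shifted = x - np.max(x, axis=-1, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))
```

Exponentiating the LogSoftmax output recovers a probability vector that sums to 1.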
Further, in the step 4 a loss function is calculated from the network output and the labels; the cross-entropy loss function is selected, defined as follows:

C = −(1/n) Σ_x [ y ln a + (1 − y) ln(1 − a) ]

wherein C represents the cost, x represents a sample, y represents the actual value, a represents the output value, and n represents the total number of samples.
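The cross-entropy cost above can be computed as follows; the clipping constant is an illustrative guard against log(0), not part of the formula.

```python
import numpy as np

def cross_entropy(y, a):
    """C = -(1/n) * sum(y*ln(a) + (1-y)*ln(1-a)) over n samples."""
    eps = 1e-12                       # avoid log(0) for saturated outputs
    a = np.clip(a, eps, 1.0 - eps)
    return -np.mean(y * np.log(a) + (1.0 - y) * np.log(1.0 - a))
```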
The overall accuracy (OA) measures the overall classification accuracy of the model, the average accuracy (AA) indicates the classification accuracy of the model on each individual class, and the consistency (Kappa) coefficient measures the agreement between the predicted values and the true values. They are calculated as follows:

OA = (TP + TN) / (TP + TN + FP + FN)

AA = (1/C) Σ_{i=1}^{C} (T_i / a_i)

Kappa = (p_o − p_e) / (1 − p_e),  with p_o = OA and p_e = ( Σ_{i=1}^{C} a_i · b_i ) / n²

where TP are the positive samples correctly classified by the model, FN the positive samples incorrectly classified, FP the negative samples incorrectly classified, and TN the negative samples correctly classified; C is the total number of classes, T_i is the number of correctly classified samples of class i, a_i is the number of true samples of class i, b_i is the number of samples predicted as class i, and n is the total number of samples.
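A compact way to compute all three evaluation indexes is via a confusion matrix, as sketched below; the function name is illustrative.

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes):
    """Overall accuracy, average accuracy and Kappa from label arrays."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                                  # rows: true, cols: predicted
    n = cm.sum()
    oa = np.trace(cm) / n                              # overall accuracy p_o
    per_class = np.diag(cm) / np.maximum(cm.sum(axis=1), 1)
    aa = per_class.mean()                              # average accuracy
    pe = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n**2  # chance agreement
    kappa = (oa - pe) / (1 - pe)                       # consistency coefficient
    return oa, aa, kappa
```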
The number of training epochs is set to 200, and the number of image blocks input to the network at a time (the batch size) is 64; the upper limit of the batch size is mainly determined by the performance of the computer's graphics processor, and in general a larger batch size makes the network training more stable. The learning rate of the training process is set to 0.005, which ensures fast fitting of the network without causing overfitting. The Adam optimizer is selected as the network optimizer: it is simple to implement, computationally efficient and memory-light, its parameter updates are unaffected by gradient scaling transformations, and the updates are stable. The loss function threshold is set to about 0.005; when the loss value falls below 0.005, training of the whole network can be considered essentially complete.
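A training loop with the hyper-parameters given above might look like the following PyTorch sketch; the `train` helper and the NLLLoss choice (pairing with a LogSoftmax output layer) are assumptions consistent with, but not dictated by, the text.

```python
import torch
import torch.nn as nn

def train(model, loader, epochs=200, lr=0.005, loss_threshold=0.005):
    """Adam optimizer, lr=0.005, up to 200 epochs, early stop below 0.005."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.NLLLoss()                 # pairs with a LogSoftmax output
    for _ in range(epochs):
        total = 0.0
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
            total += loss.item() * x.size(0)
        mean_loss = total / len(loader.dataset)
        if mean_loss < loss_threshold:       # training considered complete
            break
    return mean_loss
```

The batch size of 64 would be set on the DataLoader rather than in the loop itself.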
The implementations of the convolution, activation function and concatenation operations are algorithms well known to those skilled in the art; their specific flows and methods can be found in corresponding textbooks or technical literature.
The hyperspectral image classification method based on the multi-scale mixed convolution network can classify hyperspectral images; it exploits the multi-scale, multi-level spatial-spectral feature information of the hyperspectral image and establishes connections between long-distance pixels, thereby improving the classification accuracy of hyperspectral images. The feasibility and superiority of the method are further verified by computing the related indexes of the images obtained by existing methods.
On the Indian Pines data set (IN), the Pavia University data set (UP) and the Salinas data set (SA), a comparison of the related indexes between the prior art and the proposed method of the present invention is shown in fig. 9; in each data set, 10% of the sample data is used as the training set, 10% as the validation set, and the rest as the test set. On all three data sets, the proposed method achieves higher values of the three indexes (overall accuracy, average accuracy and consistency coefficient) than the existing methods, which further indicates that the proposed method has a better classification effect.
Finally, it should be noted that the foregoing description is only a preferred embodiment of the present invention and the present invention is not limited thereto; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or substitute equivalents for some of their technical features. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.
Claims (6)
1. A hyperspectral image classification method based on a multi-scale mixed convolution network, characterized in that the method specifically comprises the following steps:
step 1, hyperspectral image: using the disclosed hyperspectral image dataset;
step 2, image preprocessing: performing data dimension reduction on the hyperspectral image in the step 1, and performing sample block taking on the image after dimension reduction to obtain a hyperspectral sample block;
step 3, constructing a multi-scale mixed convolution network model: the whole network model comprises two multi-scale three-dimensional convolution blocks, two multi-scale two-dimensional convolution blocks, a mixed attention block, a three-dimensional convolution layer, a two-dimensional convolution layer, a one-dimensional convolution layer, two fully connected blocks and a classifier; a hyperspectral sample block is input into the two directly connected multi-scale three-dimensional convolution blocks to extract spatial-spectral features containing multi-scale information; the spatial-spectral features extracted by the two multi-scale three-dimensional convolution blocks are reconstructed and then respectively input into the two multi-scale two-dimensional convolution blocks to further extract multi-scale, multi-level spatial features; the spatial-spectral features extracted by the two multi-scale three-dimensional convolution blocks are also input into the three-dimensional convolution layer, and the convolution result is reconstructed and then input into the two-dimensional convolution layer to further extract spectral features; the multi-scale, multi-level spatial features and the spectral features are fused and then input into the mixed attention block, so that the network attends only to the useful spatial and spectral information; the attended spatial and spectral information is reconstructed and then input into the one-dimensional convolution layer to obtain features in vector form; finally, these are input into the two fully connected blocks, and the classification result is output through the classifier;
step 4, selecting a loss function and an evaluation index: a loss function is calculated from the classification result and the labels, and model parameter training is considered complete when the number of training iterations reaches a set threshold or the value of the loss function reaches a set range; meanwhile, evaluation indexes are selected to measure the accuracy of the algorithm and evaluate the performance of the system;
step 5, saving a training model: the group of model parameters with the best effect in the training process is selected and solidified; thereafter, when a hyperspectral image classification operation is needed, the hyperspectral image can be input directly into the network to obtain the final classified image.
2. The hyperspectral image classification method based on the multi-scale mixed convolution network according to claim 1, characterized in that the public data sets adopted in the step 1 are: an Indian Pines data set (IN), a Pavia University data set (UP), and a Salinas data set (SA).
3. The hyperspectral image classification method based on the multi-scale mixed convolution network according to claim 1, characterized in that the data dimension reduction method in the step 2 uses principal component analysis (PCA, Principal Component Analysis), and the dimension reduction process is as follows:
the original hyperspectral image I1 of dimension W×H×C1 is subjected to covariance-matrix eigendecomposition and converted into a new hyperspectral image I2 of dimension W×H×C2, where W is the image width, H is the image height, C1 is the number of original image channels, and C2 is the number of bands after conversion; the dimension reduction operation reduces the redundancy of the data features and also reduces the number of computation parameters.
4. A hyperspectral image classification method based on a multi-scale mixed convolution network as claimed in claim 3 wherein: the block taking operation process of the sample in the step 2 is as follows:
the new hyperspectral image I2 is cut into three-dimensional sample blocks of size w×w×C2, which are input into the network model, where w is the window size.
5. The hyperspectral image classification method based on the multi-scale mixed convolution network according to claim 1, characterized in that: the multi-scale three-dimensional convolution block in the step 3 consists of three branches, small-scale, medium-scale and large-scale; the small-scale branch consists, in order, of a three-dimensional convolution layer, a batch normalization layer and an activation function layer; the medium-scale and large-scale branches each consist, in order, of a three-dimensional convolution layer, an activation function layer, a three-dimensional convolution layer, a batch normalization layer and an activation function layer; the tensors obtained from the three branches are concatenated along the first dimension; the multi-scale two-dimensional convolution block likewise consists of small-scale, medium-scale and large-scale branches, each consisting, in order, of a two-dimensional convolution layer, a batch normalization layer and an activation function layer, and the tensors obtained from the three branches are concatenated along the first dimension; the mixed attention block comprises a spectral attention block and a spatial attention block connected in series; in the spectral attention block, the input first passes through two branches performing global average pooling and global max pooling, each pooling result is input, in order, into a one-dimensional convolution layer and activation function layer 1, the resulting tensors are concatenated along the first dimension and then input, in order, into linear layer 1, activation function layer 2, linear layer 2, activation function layer 1, a one-dimensional convolution layer and activation function layer 1, and finally the tensor is multiplied element-wise with the input features; in the spatial attention block, the input first passes through two branches performing average pooling and max pooling, each pooling result is input, in order, into a two-dimensional convolution layer and a batch normalization layer, the resulting tensors are concatenated along the first dimension and then input, in order, into a two-dimensional convolution layer and an activation function layer; the fully connected block consists, in order, of a linear layer, an activation function layer and a Dropout layer.
6. The hyperspectral image classification method based on the multi-scale mixed convolution network according to claim 1, characterized in that: the loss function in the step 4 is the cross-entropy loss function; the choice of loss function affects the quality of the model, as it should truly reflect the difference between the predicted and actual values and correctly feed back the quality of the model; the evaluation indexes are the overall accuracy, the average accuracy and the consistency coefficient, which effectively evaluate the classification quality and measure the effect of the classification network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310492375.3A CN116524265A (en) | 2023-05-04 | 2023-05-04 | Hyperspectral image classification method based on multi-scale mixed convolution network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116524265A true CN116524265A (en) | 2023-08-01 |
Family
ID=87397179
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |