CN113673629A

CN113673629A - Open set domain adaptive remote sensing image small sample classification method based on multi-graph convolution network

Info

Publication number: CN113673629A
Application number: CN202111054659.1A
Authority: CN
Inventors: 汪西莉; 陈杰虎; 洪灵; 马君亮
Original assignee: Shaanxi Normal University
Current assignee: Shaanxi Normal University
Priority date: 2021-09-09
Filing date: 2021-09-09
Publication date: 2021-11-19

Abstract

An open set domain adaptive remote sensing image small sample classification method based on a multi-graph convolution network comprises the following steps: S100: reading M samples from each of the task data set and the auxiliary data set; S200: extracting the features of all samples with the feature extraction network Conv4; S300: computing an adjacency matrix A from the extracted features; S400: updating the sample features using the adjacency matrix A and a multi-graph convolution operation; S500: predicting the labels of the unlabeled samples of the task data set from the updated features; S600: training the network by back propagation according to the predicted sample labels; S700: classifying the unlabeled samples of the task data set with the trained network to obtain the classification result. The method reduces the inter-domain difference of the common classes, which improves classification accuracy. It also models the similarity of samples in the feature space with a graph and strengthens the aggregation of same-class samples through graph convolution, which improves classification accuracy further.

Description

Open set domain adaptive remote sensing image small sample classification method based on multi-graph convolution network

Technical Field

The disclosure belongs to the technical field of remote sensing image processing, and particularly relates to an open set domain adaptive remote sensing image small sample classification method based on a multi-graph convolution network.

Background

Remote sensing scene classification is an important means of remote sensing scene understanding; it aims to classify remote sensing images into different categories according to their content. It is widely applied in city planning, land prediction, environmental protection and similar areas. Deep learning has achieved great success in remote sensing image classification. Deep learning generally needs a large number of labeled samples for training, but in many situations enough labeled samples are hard to obtain. Small sample learning is therefore applied to remote sensing scene classification to handle the case in which each class has only a few labeled samples.

Metric learning is one of the mainstream approaches to small sample classification. Its main idea is to learn, through training, a similarity metric that quantifies the similarity between any pair of samples. A metric network with a certain generalization ability is learned on an auxiliary data set, such that sample pairs of the same class obtain a higher similarity and sample pairs of different classes obtain a lower similarity. Then, when each class has only a few labeled samples, the class of an unlabeled sample in the task data set can be predicted from its similarity to the labeled samples. Metric learning was first introduced in natural image processing to solve the small sample classification of natural images. In recent years, small sample classification methods have been introduced into remote sensing image processing to solve the small sample classification of remote sensing images.

Definition of the small sample classification problem: given a task data set $D_T$ that is specific to a task T and contains only a small amount of supervisory information, and an auxiliary data set $D_A$ that is not related to task T, where $D_A$ contains a large number of labeled samples, $D_T$ contains only a few labeled samples, and $D_A$ and $D_T$ usually come from different sources, small sample learning uses the little supervisory information in $D_T$ together with $D_A$ to construct a function that maps the input to the target. Here "not related to task T" means that the intersection of the class set $Y_T$ of $D_T$ and the class set $Y_A$ of $D_A$ is empty. In reality, however, the labels of the task data and the auxiliary data may partly coincide, i.e., $Y_A \cap Y_T \neq \emptyset$ and $Y_A \cap Y_T \neq Y_T$. In this case the task is called open set small sample classification, while the case $Y_A \cap Y_T = \emptyset$ is called closed set small sample classification. The classes shared by the two data sets are called common classes, and the remaining classes are called private classes. Because the image feature distributions of different data sets differ, a model trained on the auxiliary data set often cannot reach high accuracy on the common-class data of the task data set, even though the labels are the same. Domain adaptation addresses this: its principle is to quantify the feature difference between the data sets as a domain loss function and to reduce the feature distribution difference by minimizing that loss during model training. If the difference between the feature distributions of the auxiliary data and the task data can be reduced by a domain adaptation method, the classification accuracy of small samples can be improved.

In addition, existing small sample classification methods rarely consider the relationships among data. If the similarity among samples is modeled with a graph, learning features on the graph can further improve the classification accuracy of small samples.
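The open set and closed set definitions above can be illustrated with plain set operations. The following is a minimal sketch, not part of the disclosed method; the function name `split_label_sets` and the example class names are hypothetical.

```python
def split_label_sets(aux_labels, task_labels):
    """Return (setting, common_classes, task_private_classes)."""
    y_a, y_t = set(aux_labels), set(task_labels)
    common = y_a & y_t
    if not common:
        setting = "closed-set"   # Y_A ∩ Y_T is empty
    elif common != y_t:
        setting = "open-set"     # non-empty overlap, but not all of Y_T
    else:
        setting = "unnamed"      # Y_A ∩ Y_T == Y_T: not named in the text
    return setting, common, y_t - common


# Hypothetical class names, for illustration only.
setting, common, private = split_label_sets(
    ["river", "forest", "airport"], ["river", "forest", "lake"])
```

Here the two shared classes are the common classes and the remaining task class is a private class.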

The prototype network is a small sample classification method based on metric learning; it consists mainly of a feature extraction network and a metric prediction part. Its main idea is to map images into a feature space with the feature extraction network, compute the mean of the labeled samples of each class as the prototype center of that class, train the network so that the features of same-class images lie closest to their own prototype center and far from the prototype centers of other classes, and then predict unlabeled samples from their distances to the prototype centers in the feature space. The classification accuracy of the prototype network is positively correlated with the aggregation degree of the sample features of each class. Domain adaptation can improve the aggregation degree of the common classes in the task data set, and the graph convolution operation can improve the aggregation degree of both the common-class and the private-class data.

Disclosure of Invention

In order to solve the above problem, the present disclosure provides an open set domain adaptive remote sensing image small sample classification method based on a Multi-Graph Convolutional Network (MGCN), which includes the following steps:

S100: reading M samples from each of a task data set and an auxiliary data set, wherein the M samples from the task data set cover N classes, each class has K labeled samples and the rest are unlabeled samples, and the samples from the auxiliary data set are all labeled samples;

S200: extracting the features of all samples with the feature extraction network Conv4;

S300: according to the extracted features, modeling the similarity relation between the images with the K-nearest-neighbor and radial basis function method, and computing an adjacency matrix A;

S400: updating the sample features using the adjacency matrix A and a multi-graph convolution operation;

S500: obtaining the prototype center of each class from the updated features, calculating the distance from each unlabeled sample to the prototype center of each class, and predicting the labels of the unlabeled samples of the task data set;

S600: according to the predicted sample labels, selecting the sample features of the classes shared by the task data set and the auxiliary data set, computing a variance-weighted domain loss function, and training the network by back propagation;

S700: classifying the unlabeled samples of the task data set with the trained network to obtain the classification result.

The beneficial effects of this scheme are:

(1) for remote sensing image classification, a domain adaptation network and a graph convolution network are introduced into the small sample classification task, and an open set domain adaptive small sample classification method based on a multi-graph convolution network is provided; this is the first method designed specifically for open set small sample classification;

(2) the method assigns different weights to the domain loss functions of different classes according to the feature aggregation degree of each class and focuses on the classes that are hard to classify, which effectively improves the efficiency of domain adaptation and the classification accuracy;

(3) the similarity relation among image features is modeled with a graph, and the multi-graph convolution operation strengthens the aggregation degree of the features of each class, which further improves the classification accuracy.

Drawings

FIG. 1 is a flowchart of a method for classifying small samples of an open set domain adaptive remote sensing image based on a multi-graph convolution network according to an embodiment of the present disclosure;

FIG. 2 is a schematic structural diagram of Conv4 provided in an embodiment of the present disclosure;

FIG. 3 is an open-set domain adapted small sample classification model based on a multi-graph convolutional neural network in one embodiment of the present disclosure;

FIG. 4 is a flowchart of an open set domain adaptive small sample classification training process based on a multi-graph convolution network according to an embodiment of the present disclosure;

FIG. 5 is a comparison graph of the classification accuracy of the methods when each class in the task data set has 1 labeled sample, according to an embodiment of the disclosure;

FIG. 6 is a comparison graph of the classification accuracy of the methods when each class in the task data set has 5 labeled samples, according to an embodiment of the disclosure;

FIG. 7 is a graph of how the classification accuracy and $\alpha_c$ vary with the class in one embodiment of the disclosure;

FIG. 8 is a graph of classification accuracy versus P with 1 labeled sample per class in one embodiment of the present disclosure;

FIG. 9 is a graph of classification accuracy versus P with 5 labeled samples per class in one embodiment of the present disclosure.

Detailed Description

In one embodiment, as shown in fig. 1, the present disclosure provides an open set domain adaptive remote sensing image small sample classification method based on a multi-graph convolution network, which includes the following steps:

S100: reading M samples from each of a task data set and an auxiliary data set, wherein the M samples from the task data set cover N classes, each class has K labeled samples and the rest are unlabeled samples, and the samples from the auxiliary data set are all labeled samples; the task data set and the auxiliary data set are both remote sensing image data sets;

In this embodiment, to address the possible overlap between the classes of the auxiliary data and the task data in remote sensing image small sample classification, the method provides an open set domain adaptive small sample image classification method based on a multi-graph convolution network. The method models the similarity relation of images in the feature space with a graph, improves the aggregation degree of same-class samples through graph convolution, and finally classifies the unlabeled samples according to the distance between labeled and unlabeled samples in the feature space. In addition, to reduce the feature distribution difference between same-class data of the auxiliary data set and the task data set, a domain adaptation method based on intra-class variance weighting is provided. The multi-graph convolution model and the variance-weighted domain adaptation method can be extended to other open set small sample classification tasks.
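Step S100 can be sketched as drawing one N-way, K-shot batch from the task data set. The sketch below is only an illustration under stated assumptions: the function name `sample_task_episode`, the `(sample_id, label)` item format, the fixed seed and the toy data are all hypothetical.

```python
import random


def sample_task_episode(items, n_way, k_shot, m_total, seed=0):
    """Draw M samples from N classes: K labeled per class, rest unlabeled.

    items: list of (sample_id, label) pairs (hypothetical format).
    """
    rng = random.Random(seed)
    classes = rng.sample(sorted({label for _, label in items}), n_way)
    by_class = {c: [s for s in items if s[1] == c] for c in classes}
    labeled = []
    for c in classes:
        labeled += rng.sample(by_class[c], k_shot)
    # Remaining samples of the chosen classes are candidates for the unlabeled part.
    rest = [s for c in classes for s in by_class[c] if s not in labeled]
    unlabeled = rng.sample(rest, m_total - n_way * k_shot)
    return labeled, unlabeled


# Toy data: 120 samples spread over 6 classes.
items = [(i, i % 6) for i in range(120)]
labeled, unlabeled = sample_task_episode(items, n_way=5, k_shot=1, m_total=30)
```

For the auxiliary data set the same M samples would simply be drawn with all labels kept.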

In another embodiment, the step S200 further includes the following steps:

S201: the input image $X_i$ is resized to an RGB image of 80 × 80 pixels;

S202: a feature map of size 5 × 5 × 64 is obtained through Conv4;

S203: the feature map is flattened into a 1600-dimensional feature vector, which serves as the initial feature $h_i$ of the input image $X_i$.

For this embodiment, the feature extraction part employs a four-layer convolutional neural network, Conv4, in which each convolutional layer is followed by a normalization layer, a ReLU activation layer and a max pooling layer. The structure of Conv4 is shown in FIG. 2.
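The sizes in steps S201 to S203 can be checked with a short shape calculation. This is a sketch under assumptions that the text does not state: each convolution is taken to preserve the spatial size (for example 3 × 3 with padding 1), each max pooling is taken to be 2 × 2 with stride 2, and every layer is taken to have 64 output channels.

```python
def conv4_output_shape(height=80, width=80, channels=64, blocks=4):
    """Feature-map shape after `blocks` conv blocks ending in 2x2 max pooling.

    Assumption: the convolutions keep the spatial size, so only the
    pooling layers halve height and width.
    """
    for _ in range(blocks):
        height, width = height // 2, width // 2
    return height, width, channels


h, w, c = conv4_output_shape()
feature_dim = h * w * c  # length of the flattened feature vector
```

Under these assumptions an 80 × 80 input is halved four times to 5 × 5, and 5 × 5 × 64 gives the 1600-dimensional vector of step S203.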

In another embodiment, the step S300 further includes the following step: according to the initial features $h_i$ of the images, the adjacency matrix is calculated using the radial basis function.

For this embodiment, a graph convolutional neural network is a deep learning network defined on a graph. First, the definition of a graph is given: a graph is a non-linear data structure composed of nodes and edges, commonly used to describe one-to-many data relationships in non-Euclidean space. A graph may be represented as $G = (V, E, A)$, where $V$ is the set of nodes, $E$ is the set of edges, and $A \in \mathbb{R}^{M \times M}$ is the adjacency matrix. $v_i \in V$ denotes a node of the graph, $i \in \{1, \dots, M\}$, and $M$ is the number of nodes. $e_{ij} = (v_i, v_j) \in E$ denotes an edge of the graph: if $e_{ij} \in E$ then $A_{ij} > 0$, otherwise $A_{ij} = 0$.

The graph convolution operation GCN,

$H^{(l+1)} = \sigma(S H^{(l)} W^{(l)})$ (1)

is used in the semi-supervised classification of graph nodes. In the formula, $S = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$ is the normalized augmented adjacency matrix, $\tilde{A} = A + I$ is the adjacency matrix of the undirected graph with self-loops added, and $\tilde{D}$ is the degree matrix, with $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$. $l$ denotes the layer of the graph convolution network, $W^{(l)}$ is a trainable weight matrix, $\sigma(\cdot)$ is an activation function, usually $\mathrm{ReLU}(\cdot) = \max(0, \cdot)$, and $H^{(l)} \in \mathbb{R}^{M \times D}$ is the matrix formed by the feature vectors of the M nodes at layer $l$ of the graph convolution network, where D is the feature dimension. Given A and $H^{(l)}$ as input, the graph convolution operation lets each node on the graph aggregate the features of its neighbors and update its own features. Spectral graph analysis shows that left-multiplying the node feature matrix H by S in the graph convolution is equivalent to low-pass filtering H in the frequency domain. Low-pass filtering smooths the node features on the graph, so that nodes with the same label have more similar features; graph convolution can therefore effectively improve classification accuracy in semi-supervised classification of graph nodes.
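Formula (1) can be sketched in NumPy as follows. The normalization used here, $S = \tilde{D}^{-1/2}(A + I)\tilde{D}^{-1/2}$, is the one commonly used for graph convolution networks and is an assumption about what the text calls the normalized augmented adjacency matrix; the function names and the toy graph are hypothetical.

```python
import numpy as np


def normalize_adjacency(A):
    """S = D^{-1/2} (A + I) D^{-1/2}, where D is the degree matrix of A + I."""
    A_tilde = A + np.eye(A.shape[0])               # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(S, H, W):
    """One graph convolution step: ReLU(S H W)."""
    return np.maximum(S @ H @ W, 0.0)


# Toy path graph 0 - 1 - 2 with random features and weights.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
S = normalize_adjacency(A)
H = np.random.default_rng(0).normal(size=(3, 4))
W = np.random.default_rng(1).normal(size=(4, 2))
H_next = gcn_layer(S, H, W)
```

Each row of `S @ H` is a weighted average of a node's own features and those of its neighbors, which is the smoothing effect described above.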

Based on this analysis, the method models the similarity relation between images with the "K nearest neighbor + radial basis function" method and then smooths the image features through graph convolution, which strengthens the aggregation of same-class node features and improves the classification accuracy for small image samples.

The M samples read from the data sets are used to construct the graph. Each node $v_i$, $i \in \{1, \dots, M\}$, of the graph represents an image. The number of labeled nodes is N × K, where N is the number of classes in the task data and K is the number of labeled nodes per class. The initial features of the nodes are extracted by the feature extraction network Conv4, and the adjacency matrix is then calculated with a radial basis function (RBF) over the K nearest neighbors $N(i)$ of each node $v_i$. In the experiments, γ is 0.2. In most cases this assigns a large weight between nodes of the same class and a small or zero weight between nodes of different classes. During graph convolution, the features of same-class nodes therefore quickly become similar while the features of nodes of different classes influence each other little, which increases the discriminability of the node features and improves classification accuracy.
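The "K nearest neighbor + radial basis function" construction might look like the sketch below. The exact kernel is not reproduced in the text, so the Gaussian form $\exp(-\lVert h_i - h_j \rVert^2 / (2\gamma^2))$, the exclusion of self-neighbors and the symmetrization step are all assumptions; `knn_rbf_adjacency` is a hypothetical name.

```python
import numpy as np


def knn_rbf_adjacency(H, k, gamma=0.2):
    """Sparse similarity graph: RBF weights on k-nearest-neighbor edges only.

    Assumed kernel: A[i, j] = exp(-||h_i - h_j||^2 / (2 * gamma^2)).
    """
    sq = ((H[:, None, :] - H[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq, np.inf)                  # a node is not its own neighbor
    nn = np.argsort(sq, axis=1)[:, :k]            # k nearest neighbors per node
    rows = np.arange(H.shape[0])[:, None]
    A = np.zeros_like(sq)
    A[rows, nn] = np.exp(-sq[rows, nn] / (2.0 * gamma ** 2))
    return np.maximum(A, A.T)                     # assumed: undirected graph


# Two tight pairs of points far apart from each other.
H = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
A = knn_rbf_adjacency(H, k=1)
```

With a small γ the weights decay quickly with distance, so in practice the features would need a scale on which neighbor distances are comparable to γ.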

In another embodiment, the step S400 further includes the following step: several powers of the adjacency matrix are linearly combined to obtain the adjacency matrix used by the final graph convolution.

In this embodiment, the node features are updated through the graph convolution operation, which strengthens the aggregation of same-class node features and improves the classification accuracy for small image samples. To make the graph convolution more effective, the method linearly combines several powers of the adjacency matrix to obtain the adjacency matrix B used by the final graph convolution, where P denotes the highest power of the adjacency matrix, $\alpha_j$ is a weight coefficient to be learned, and $S^j$ is the j-th power of S. After the graph is generated, the convolution is carried out on it by replacing S in formula (1) with B, and the final graph convolution expression is:

$H = \sigma(B H^{(0)} W)$ (3)

Experiments show that the model reaches the highest accuracy in most cases when P = 2, so P = 2 is suggested for small sample classification of remote sensing images.
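The multi-graph convolution can be sketched as a weighted sum of powers of S followed by formula (3). In this sketch the sum is assumed to run over powers 1 to P, and the weights $\alpha_j$ are passed in as fixed numbers, whereas in the method they are learned; any constraint on the weights is left out. The function names are hypothetical.

```python
import numpy as np


def multi_graph_adjacency(S, alphas):
    """B = sum over j = 1..P of alphas[j-1] * S^j, with P = len(alphas)."""
    B = np.zeros_like(S)
    S_pow = np.eye(S.shape[0])
    for a in alphas:
        S_pow = S_pow @ S        # next power of S
        B += a * S_pow
    return B


def multi_graph_conv(B, H0, W):
    """H = ReLU(B H0 W), following formula (3)."""
    return np.maximum(B @ H0 @ W, 0.0)


# P = 2 on a toy two-node graph; alpha values are illustrative only.
S = np.array([[0.0, 1.0], [1.0, 0.0]])
B = multi_graph_adjacency(S, alphas=[0.6, 0.4])
H = multi_graph_conv(B, np.eye(2), np.eye(2))
```

Higher powers of S connect nodes that are several hops apart, so B lets a single convolution mix information from a wider neighborhood.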

In another embodiment, the step S500 further includes the following steps:

S501: calculating the prototype center $c_k$ of each class from the feature vectors of that class's labeled samples;

S502: calculating the Euclidean distance in the feature space between an unlabeled sample and the prototype center of the k-th class;

S503: calculating the probability that the sample belongs to each class using the classification probability function softmax, where $h_i$ is the feature of the sample and $p(y_i = k \mid h_i)$ is the probability that the sample belongs to class k.

In this embodiment, after the graph convolution operation, the output features of the graph neural network are sent to the metric prediction part, which outputs the class probabilities of the nodes to be predicted. First, the prototype center of each class is computed as the mean of the feature vectors of that class's labeled nodes:

$c_k = \frac{1}{|S(k)|} \sum_{(h_i, y_i) \in S(k)} h_i$

In the above formula, $h_i$ and $y_i$ denote the feature vector and the label of node $v_i$, N is the number of classes, and $S(k)$ denotes the set of labeled images of the k-th class among the M input images. Once the prototype centers are computed, the Euclidean distances in the feature space between the unlabeled data and each prototype center are calculated and then substituted into softmax to obtain the class probabilities.

In another embodiment, as shown in FIG. 3, the model structure on which the method is based consists essentially of three parts: (1) a feature extraction network for extracting the features of the images; (2) a multi-graph convolution network for modeling the similarity relation between image features and updating the node features through convolution on the graph; (3) a metric prediction module for predicting the unlabeled samples according to the metric distance between labeled and unlabeled samples in the feature space.
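Steps S501 to S503 can be sketched as follows. The softmax is taken over negative Euclidean distances, as in prototype networks; whether the method uses the plain or the squared distance is not stated, so the plain distance used here is an assumption. Function names and toy data are hypothetical.

```python
import numpy as np


def prototypes(H_labeled, y_labeled, n_classes):
    """Prototype center of each class: mean of its labeled feature vectors."""
    return np.stack([H_labeled[y_labeled == k].mean(axis=0)
                     for k in range(n_classes)])


def class_probabilities(H_query, centers):
    """Softmax over negative Euclidean distances to the prototype centers."""
    d = np.linalg.norm(H_query[:, None, :] - centers[None, :, :], axis=-1)
    logits = -d
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


H_l = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]])
y_l = np.array([0, 0, 1, 1])
centers = prototypes(H_l, y_l, n_classes=2)
probs = class_probabilities(np.array([[1.0, 1.0], [9.0, 11.0]]), centers)
pred = probs.argmax(axis=1)
```

The predicted label of an unlabeled sample is the class whose prototype center is nearest, and the softmax value serves as its confidence.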

In another embodiment, the step S600 further includes the steps of:

S601: selecting the sample features in the task data set whose prediction probability is greater than a threshold, taking the prediction result as the pseudo label of each such sample, and calculating, for each class, the ratio $\alpha_c$ of the intra-class variance to the squared nearest neighbor distance;

S602: using $\alpha_c$ and the features of the task data set and the auxiliary data set to calculate the domain loss function, calculating the classification loss function, and training the whole model by minimizing the domain loss function and the classification loss function;

For this embodiment, to reduce the feature distribution difference between the auxiliary data set and the task data set during model training, domain adaptation is introduced into the training process, and the feature distribution difference is reduced by reducing the domain loss. To this end, a domain adaptation method based on variance weighting is provided.

The method addresses open set domain adaptive small sample classification, i.e., small sample classification in which the classes of the auxiliary data set and the task data set overlap. Although the auxiliary data contains data of the same classes as the common classes of the task data set, the feature distributions of different data sets differ, so a model trained on the auxiliary data set cannot achieve high classification accuracy on the task data set. The method therefore introduces domain adaptation to reduce the difference between the features of the auxiliary data and the task data. The inter-domain difference is calculated using equation (6):

In the above equation, the term on the left of "+" measures the difference between the overall feature distributions of the two data sets, called the marginal distribution difference. The term on the right of "+" measures the feature distribution difference between corresponding classes of the two data sets, called the conditional distribution difference. In (6), μ is a constant; $x_A$ and $x_T$ denote samples in the auxiliary data set and the task data set, respectively; the superscript c indicates that a sample belongs to class c; $f(\cdot)$ is the feature extraction network; $\mathcal{H}$ is the reproducing kernel Hilbert space; and C is the number of classes.
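The two-term structure described for equation (6) can be sketched with a linear-kernel discrepancy, i.e., the squared distance between feature means. This is only an illustration: the kernel choice and the way the constant μ balances the two terms, $(1 - \mu)$ on the marginal term and $\mu$ on the conditional term, are assumptions and not taken from the text.

```python
import numpy as np


def mean_gap(F_a, F_t):
    """Squared distance between the feature means of two sample sets."""
    return float(((F_a.mean(axis=0) - F_t.mean(axis=0)) ** 2).sum())


def domain_discrepancy(F_a, y_a, F_t, y_t, common_classes, mu=0.5):
    """Marginal term + conditional term (assumed weighting by mu)."""
    marginal = mean_gap(F_a, F_t)                          # whole data sets
    conditional = sum(mean_gap(F_a[y_a == c], F_t[y_t == c])
                      for c in common_classes)             # class by class
    return (1.0 - mu) * marginal + mu * conditional


# Toy features: the task domain is the auxiliary domain shifted by 1.
F_a = np.array([[0.0, 0.0], [2.0, 0.0]]); y_a = np.array([0, 1])
F_t = np.array([[1.0, 0.0], [3.0, 0.0]]); y_t = np.array([0, 1])
gap = domain_discrepancy(F_a, y_a, F_t, y_t, common_classes=[0, 1])
```

For the task data set, `y_t` would be the pseudo labels of the confidently predicted samples, since true labels are not available.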

In the prototype network, domain adaptation can improve the aggregation of same-class samples in the feature space. The method proposes to measure the aggregation of class-c samples in the feature space by the ratio $\alpha_c$ of the intra-class variance $S_c$ of the samples to the square of the nearest neighbor distance. Here $D_c$ is the matrix formed by stacking the feature vectors of the class-c data, $\mathbf{1}$ is the all-ones vector, and $n_c$ is the number of samples in class c. The larger $\alpha_c$ is, the more dispersed the class-c samples are in the feature space, the closer they lie to other classes, and the lower the aggregation; conversely, the smaller $\alpha_c$ is, the higher the aggregation. Intuitively, the larger $\alpha_c$ is, the harder the class is to classify; experiments confirm that the classification accuracy of the different classes is, overall, negatively correlated with $\alpha_c$. Therefore, to make domain adaptation more effective and focus it on the classes that are hard to distinguish, the domain loss of a class with smaller $\alpha_c$ should be given a smaller weight during model training, and the domain loss of a class with larger $\alpha_c$ a larger weight. The final domain loss function is modified accordingly, where C is the number of common classes. Minimizing $L_D$ during model training strengthens the aggregation of same-class data and improves classification accuracy.
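The variance-based weighting can be sketched as below. The exact definitions are not reproduced in the text, so three assumptions are made: the intra-class variance is the mean squared distance of a class's samples to their mean, the nearest neighbor distance is the distance from a class mean to the closest other class mean, and the per-class weights are the $\alpha_c$ values normalized to sum to 1. All names are hypothetical.

```python
import numpy as np


def aggregation_ratios(F, y, classes):
    """alpha_c = intra-class variance / squared distance to nearest class mean."""
    means = np.stack([F[y == c].mean(axis=0) for c in classes])
    alphas = []
    for i, c in enumerate(classes):
        var = ((F[y == c] - means[i]) ** 2).sum(axis=1).mean()
        others = np.delete(means, i, axis=0)
        d2 = ((others - means[i]) ** 2).sum(axis=1).min()
        alphas.append(var / d2)
    return np.array(alphas)


def class_weights(alphas):
    """Larger alpha_c (harder class) gets a larger share of the domain loss."""
    return alphas / alphas.sum()


# Class 0 is spread out, class 1 is tight.
F = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 0.2]])
y = np.array([0, 0, 1, 1])
alphas = aggregation_ratios(F, y, classes=[0, 1])
weights = class_weights(alphas)
```

The weighted domain loss would then multiply each class's conditional discrepancy by its weight before summing.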

In another embodiment, the training of the model may be divided into two phases: the first phase is the pre-training of the multi-graph convolutional network and the second phase is the domain adaptation phase.

For this embodiment, the flow of the entire model training is shown in FIG. 4. The loss function used in the first stage is the classification loss given by formula (9), and the loss function used in the second stage is given by formula (11).

$L = L_A + L_T + L_D$ (11)

In another embodiment, the first stage uses the auxiliary data set to pre-train the multi-graph convolutional network MGCN, which yields the initial parameters of the MGCN.

In another embodiment, the second stage uses the pre-trained multi-graph convolutional network MGCN to perform a coarse classification of the task data set and selects the $N_T$ samples that are predicted as a common class with a confidence greater than a threshold; these are used for domain adaptation together with the common-class data of the auxiliary data set.
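The selection of confident task samples in the second stage can be sketched as a simple filter on the predicted class probabilities. The threshold value 0.9, the function name and the toy probabilities are hypothetical; the text only states that a confidence threshold is used.

```python
import numpy as np


def select_confident(probs, common_classes, threshold=0.9):
    """Indices and pseudo labels of samples confidently predicted as a common class."""
    pred = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    keep = (conf > threshold) & np.isin(pred, list(common_classes))
    idx = np.flatnonzero(keep)
    return idx, pred[idx]


# Classes 0 and 1 are common, class 2 is private to the task data set.
probs = np.array([[0.95, 0.03, 0.02],
                  [0.50, 0.30, 0.20],
                  [0.01, 0.01, 0.98],
                  [0.02, 0.96, 0.02]])
idx, pseudo_labels = select_confident(probs, common_classes={0, 1})
```

Samples predicted as a private class, or predicted with low confidence, are left out of the domain loss.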

In another embodiment, to verify the effectiveness of the method for open set domain adaptive small sample classification, experiments were performed on three public data sets and the collected Qinghai-Tibet plateau data set with the present method and the existing methods ProtoNet and TPN. The similarities and differences among the three methods are shown in Table 1:

TABLE 1

Remote sensing data sets:

The data sets used in the experiments comprise three public data sets, AID, OPTIMAL-31 and RSI-CB256, and a Qinghai-Tibet plateau data set collected from Landsat satellites. The AID data set is an aerial image data set with 30 scene categories and approximately 10000 images in total, each 600 × 600 pixels in size. The OPTIMAL-31 data set contains 31 classes of images collected from Google Maps; each class consists of 60 images of 256 × 256 pixels, for a total of 1860 images. RSI-CB256 is a large remote sensing image data set comprising 6 major categories (agricultural land, construction land, transportation and facilities, water conservancy facilities, forest land and other land) and 35 minor categories, with 24000 images of 256 × 256 pixels each; in the experiments, the 35 minor categories are used as image labels. The Qinghai-Tibet plateau data set consists of remote sensing scene images of the Qinghai-Tibet plateau taken by the Landsat5 and Landsat8 satellites in 1990 and 2020, respectively. Each image is 256 × 256 pixels, and the data set comprises six categories (bare land, forest, grassland, lake, river and snow mountain) with about 500 images.

Parameter settings:

The experiments used the PyTorch deep learning framework, accelerated with an NVIDIA GeForce RTX 3090 GPU. All remote sensing images were resized to 80 × 80 pixels, and all methods used Conv4 to extract image features. The weight matrix $W \in \mathbb{R}^{1600 \times 1000}$ of the multi-graph convolution network is obtained through training. Training uses the Adam optimizer with a step size (learning rate) of $10^{-4}$, and the maximum number of epochs, max_epoch, is 20.

In order to evaluate the effectiveness of the method, experiments are carried out on three public data sets by using the method and the existing methods ProtoNet and TPN, and the classification accuracy of the methods on different data sets before and after domain adaptation is compared.

To construct an open set domain adaptive classification task, the classes that two data sets have in common are selected as common-class data, several classes are then selected from the remaining classes as private-class data, and the rest of the data is used as the verification set. The AID and OPTIMAL-31 data sets share 9 classes; 11 classes are selected in order from the remaining classes of each data set as private classes and combined with the 9 common classes to form the auxiliary data set and the task data set, respectively. The selected classes are shown in Table 2:
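The construction of such a task can be sketched as follows. The sketch assumes that common classes can be found by matching class names exactly and that "selected in order" means taking the first remaining classes; the function name and the toy class lists are hypothetical and do not reproduce the actual selections of Tables 2 to 4.

```python
def build_open_set_split(classes_a, classes_b, n_private):
    """Class lists for the auxiliary and task data sets, plus the common classes."""
    common = sorted(set(classes_a) & set(classes_b))
    # Private classes: the first n_private classes not shared with the other set.
    private_a = [c for c in classes_a if c not in common][:n_private]
    private_b = [c for c in classes_b if c not in common][:n_private]
    return common + private_a, common + private_b, common


aux_classes, task_classes, common = build_open_set_split(
    ["river", "forest", "airport", "beach", "bridge"],
    ["forest", "river", "lake", "desert"],
    n_private=2)
```

With 9 common classes and `n_private=11`, each side would receive 20 classes, as in the AID and OPTIMAL-31 combination described above.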

TABLE 2

Similarly, for the two combinations of AID and RSI-CB256 and OPTIMAL-31 and RSI-CB256, the data set categories are selected as shown in tables 3 and 4 respectively:

TABLE 3

TABLE 4

Table 5 and Table 6 show the classification accuracy (%) of the present method and the comparison methods on different data sets when the number of labeled samples per class is 1 and 5, respectively. The subscript "noDA" means that, after training on the auxiliary data set, the method is tested directly on the task data without domain adaptation. The subscript "DA" means that, after pre-training on the auxiliary data set, the model is further trained with a domain adaptation method. The subscript "α" means that the method uses the domain adaptation method proposed here. "A", "O" and "R" in the tables are abbreviations of the three data sets AID, OPTIMAL-31 and RSI-CB256; the left side of "->" denotes the auxiliary data and the right side the task data.

TABLE 5

TABLE 6

As can be seen from Tables 5 and 6, the classification accuracy of all three methods improves after training with domain adaptation. Compared with ProtoNet, MGCN adds a multi-graph convolution network on top of ProtoNet while keeping the feature extraction network identical; this shows that both domain adaptation on the common classes and graph convolution over the graph built from all classes improve classification accuracy. In addition, comparing MGCN_DA and MGCN_α shows that, relative to the existing domain adaptation method, the improved domain adaptation method proposed here further raises the classification accuracy by about 1%.

Table 7 shows the average $\alpha_c$ output after each of the five methods has been trained on the three data sets. The data in the table show that both domain adaptation and graph convolution reduce $\alpha_c$. The value of $\alpha_c$ reflects the aggregation degree of the sample features: the smaller $\alpha_c$ is, the higher the aggregation degree of same-class samples in the feature space and the more accurate the classification by the metric function. Domain adaptation and the graph convolution network therefore improve the classification accuracy of the prototype network because they improve the aggregation degree of same-class samples in the feature space.

TABLE 7

Tables 8 and 9 show the training time and the testing time of the different methods. The two tables show that the training time increases to some extent after multi-graph convolution and domain adaptation are added, but at test time the method needs less than 2 ms on average to classify one image, which meets the requirements of most applications.

TABLE 8

TABLE 9

In another embodiment, the collected Tibet plateau data set is tested by the MGCN, ProtoNet and TPN methods. The Qinghai-Tibet plateau dataset has 6 classes of bare land, forest, river, grassland, lake, snow mountain. In the experiment, the auxiliary data set selects 20 categories of data of bare land, forest, river, airport, baseball field, beach, bridge, center, church, community, dense land, desert, industrial, meadow, static, storage ranks and viaducut in AID. The same category for both datasets is 3: bare land, forest, river.

TABLE 10

Table 10 shows the common-class classification accuracy, the private-class classification accuracy and the total classification accuracy of the compared methods on the Qinghai-Tibet Plateau data set when each class has 1 or 5 labeled samples. As can be seen from Table 10, domain adaptation significantly improves the common-class classification accuracy of every method, and the total classification accuracy also improves. Comparing ProtoNet_noDA with MGCN_noDA shows that adding the graph network improves the accuracy by about 10% when each class has 1 labeled sample and by about 5% when each class has 5 labeled samples. The classification accuracy can thus be further improved by using multi-graph convolution. Among the compared methods, the present method obtains the highest classification accuracy on the collected Qinghai-Tibet Plateau data set.

In another embodiment, in order to explore the influence of domain adaptation and the graph convolution operation on the prediction accuracy for common-class data, the common-class data given in Tables 2 to 4 are used for training and testing, and the obtained classification results are shown in FIGS. 5 and 6.

As can be seen from FIGS. 5 and 6, the accuracy of ProtoNet, TPN and the MGCN proposed by the present method improves significantly after the domain adaptation method is used. This shows that, in small sample classification, domain adaptation on data of the same categories significantly improves the classification accuracy of metric-based methods, and that the accuracy improves further with the improved domain adaptation method. At the same time, MGCN_noDA differs from ProtoNet_noDA only by the added graph convolution operation, the other parts being the same; when each class has only 1 labeled sample the accuracy improves by 8% to 15%, and when each class has 5 labeled samples it improves by 2% to 5%. Therefore, modeling the similarity relation between nodes with a graph network and then updating the node features through convolution on the graph improves the classification accuracy of small sample learning.

In another embodiment, to explore the relationship between the per-class classification accuracy of a prototype network and α_c, the common-class data in OPTIMAL-31 and RSICB256 are selected. A prototype network is trained on one data set and then tested on the other. The classes are arranged in ascending order of classification accuracy, and the curve of classification accuracy across classes and the curve of α_c across classes are plotted, as shown in FIG. 7.

As can be seen from FIG. 7, α_c is, on the whole, negatively correlated with the classification accuracy of the class. α_c reflects the degree of aggregation of the samples of a class in the feature space. The larger α_c is, the larger the variance of the class's samples in the feature space and the closer the class is to other classes, so the lower the aggregation degree of its features and the lower the corresponding classification accuracy; the smaller α_c is, the higher the aggregation degree and the higher the corresponding classification accuracy.
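The behaviour of α_c described above can be illustrated with a minimal sketch. It assumes one plausible reading of the ratio: the intra-class variance is the mean squared Euclidean distance of a class's samples to the class mean (prototype), and the denominator is the Euclidean distance from that prototype to the nearest prototype of another class. The exact formula of the method is not reproduced in this passage, and the function names `class_prototypes` and `alpha_per_class` are hypothetical.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def class_prototypes(features, labels):
    """Return the mean feature vector (prototype) of each class."""
    groups = {}
    for x, y in zip(features, labels):
        groups.setdefault(y, []).append(x)
    return {c: [sum(col) / len(xs) for col in zip(*xs)]
            for c, xs in groups.items()}


def alpha_per_class(features, labels):
    """Assumed form: alpha_c = intra-class variance / distance to nearest other prototype.

    Requires at least two classes whose prototypes do not coincide.
    """
    protos = class_prototypes(features, labels)
    alphas = {}
    for c, proto in protos.items():
        members = [x for x, y in zip(features, labels) if y == c]
        # Mean squared distance of the class samples to their own prototype.
        variance = sum(dist(x, proto) ** 2 for x in members) / len(members)
        # Distance from this prototype to the closest prototype of another class.
        nearest = min(dist(proto, q) for k, q in protos.items() if k != c)
        alphas[c] = variance / nearest
    return alphas


# Toy 2-D features: class "a" is spread out, class "b" is perfectly tight.
features = [(0.0, 0.0), (0.0, 2.0), (4.0, 1.0), (4.0, 1.0)]
labels = ["a", "a", "b", "b"]
print(alpha_per_class(features, labels))
```

Under this assumed definition, the spread-out class receives the larger α_c and the tight class receives zero, which matches the qualitative statement that a smaller α_c corresponds to a higher aggregation degree.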

In another embodiment, in order to explore the influence of the order of the adjacency matrix on the accuracy of the multi-graph convolution operation, the classification accuracy of the proposed MGCN method is tested on 3 different data sets, without domain adaptation, with the value of p in equation (3) set to 1, 2, 3, 4 and 5, respectively. The obtained results are shown in FIGS. 8 and 9.

As can be seen from FIGS. 8 and 9, whether each class has 1 or 5 labeled samples, p = 2 yields an improvement of 0.5% to 3% over p = 1. When p is greater than 2, the classification accuracy rises slightly on some data sets and falls slightly on others. Therefore, p is set to 2 in the experiments.
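The role of the order p can be sketched as follows. Equation (3) is not reproduced in this passage, so the sketch assumes only that the adjacency matrix used for graph convolution is a weighted sum of the powers A^1 to A^p of a row-normalized adjacency matrix, in line with the linear combination of adjacency matrices of different orders. The trainable weight matrix and the nonlinearity of a full graph convolution layer are omitted, and the names `row_normalize`, `multi_order_adjacency`, `graph_update` and the combination weights are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def row_normalize(a):
    """Scale each row to sum to 1 (assumes no all-zero row, e.g. self-loops present)."""
    return [[v / sum(row) for v in row] for row in a]


def multi_order_adjacency(adj, weights):
    """Return sum over k = 1..p of weights[k-1] * adj^k, where p = len(weights)."""
    n = len(adj)
    combined = [[0.0] * n for _ in range(n)]
    power = adj  # adj^1
    for k, w in enumerate(weights):
        if k > 0:
            power = matmul(power, adj)  # raise the order by one
        for i in range(n):
            for j in range(n):
                combined[i][j] += w * power[i][j]
    return combined


def graph_update(adj, features, weights):
    """One propagation step: each node feature becomes a weighted mix of its neighbours."""
    return matmul(multi_order_adjacency(adj, weights), features)


# Chain graph 0 - 1 - 2 with self-loops; nodes 0 and 2 are not directly connected.
adj = row_normalize([[1, 1, 0], [1, 1, 1], [0, 1, 1]])
combined = multi_order_adjacency(adj, [0.5, 0.5])  # p = 2, equal weights (assumed)
print(combined[0][2])  # non-zero: node 0 now receives information from node 2
```

With p = 1 the entry linking nodes 0 and 2 is zero, while with p = 2 it becomes positive, so second-order neighbours contribute to the feature update. This gives one intuition for why p = 2 can help over p = 1; it does not by itself explain why larger p brings no consistent further gain.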

Although the embodiments of the present invention have been described above with reference to the accompanying drawings, the present invention is not limited to the above-described embodiments and application fields, and the above-described embodiments are illustrative, instructive, and not restrictive. Those skilled in the art, having the benefit of this disclosure, may effect numerous modifications thereto without departing from the scope of the invention as defined by the appended claims.

Claims

1. An open set domain adaptive remote sensing image small sample classification method based on a multi-graph convolution network comprises the following steps:

2. The method according to claim 1, wherein the step S200 further comprises the following steps:

S201: uniformly resizing each input image x_i to an RGB image of 80 × 80 pixels;

3. The method of claim 1, wherein the step S400 further comprises the step of: linearly combining several adjacency matrices of different orders to obtain the adjacency matrix used by the final graph convolution.

4. The method according to claim 1, wherein the step S500 further comprises the steps of:

5. The method of claim 1, wherein the step S600 further comprises the steps of:

S601: selecting the sample features in the task data set whose prediction probability is greater than a certain threshold, taking the prediction result as the pseudo label of the sample, and calculating, for each class, the ratio α_c of the intra-class variance to the nearest-neighbor inter-class distance;

S602: using α_c and the features of the task data set and the auxiliary data set to calculate a domain loss function and a classification loss function, and training the whole model by minimizing the loss functions.

6. The method according to claim 1, wherein the model on which the method is based comprises the following three parts: (1) a feature extraction network for extracting features of the image; (2) a multi-graph convolution network for modeling the similarity relation between image features and updating the sample features through convolution operations on the graph; and (3) a metric prediction module for predicting the unlabeled samples according to the metric distance between the labeled samples and the unlabeled samples in the feature space.

7. The method of claim 6, wherein a variance weighting based domain adaptation method is introduced during the training of the model.

8. The method of claim 7, wherein the training of the model is divided into two phases: the first phase is the pre-training of the multi-graph convolutional network and the second phase is the domain adaptation phase.

9. The method of claim 8, wherein the first stage pre-trains the multi-graph convolutional network MGCN with an auxiliary data set, resulting in initial parameters for the multi-graph convolutional network MGCN.

10. The method of claim 8, wherein the second stage uses a pre-trained multi-graph convolutional network (MGCN) to coarsely classify the task data set, and selects data predicted as common class with confidence greater than a certain threshold for domain adaptation with common class data in the auxiliary data set.