Disclosure of Invention
To address the above problems, the invention provides an unsupervised hyperspectral image classification method based on graph convolution network embedded representation.
To achieve this aim, the invention provides an unsupervised hyperspectral image classification method based on graph convolution network embedded representation, which comprises the following steps:
S10, sequentially performing EMP (extended morphological profile) and spectral feature extraction on the hyperspectral image to be classified to obtain a spatial-spectral joint feature;
S20, performing superpixel segmentation on the spatial-spectral joint feature to obtain the superpixels of the hyperspectral image to be classified;
S30, solving the elastic net representation of the superpixels, and constructing a graph model of the superpixels by taking the superpixels associated with the nonzero components of the elastic net representation coefficients as the neighbors of the current point;
S40, carrying out graph convolution network embedded representation learning based on the graph model, and obtaining low-dimensional features through a layer-wise vertex aggregation operation;
and S50, according to the low-dimensional feature representation, realizing unsupervised classification of the hyperspectral image by using the K-means algorithm.
In one embodiment, sequentially performing EMP and spectral feature extraction on the hyperspectral image to be classified to obtain a spatial-spectral joint feature includes:
wherein V represents the spatial-spectral joint feature matrix, X represents the spectral feature matrix, EMP represents the EMP feature matrix, m is the number of principal components, n is the number of circular structural elements with different radii, d is the number of spectral bands, and N is the number of samples.
In one embodiment, performing superpixel segmentation on the spatial-spectral joint feature to obtain the superpixels of the hyperspectral image to be classified includes:
D_ic = (1 - λ) × D_spectral + λ × D_spatial,
D_spectral = tan(SAD(x_i, x_c)),
wherein D_spectral represents the inter-spectral distance measured with the tan function, SAD(x_i, x_c) represents the spectral angle distance, D_spatial represents the normalized spatial Euclidean distance, r is the diagonal length of the search neighborhood, used for constraining the search range to a local neighborhood around each cluster center, D_ic represents the weighted sum of the spectral and spatial distances, λ represents a parameter balancing the effect of the spectral and spatial distances, x_c = (x_c1, x_c2, …, x_cn) represents a cluster center with corresponding spatial coordinates (cx, cy), and x_i = (x_i1, x_i2, …, x_il) represents a pixel located in the local neighborhood of the cluster center x_c, with corresponding spatial coordinates (ix, iy). Preferably, the parameter λ is set to less than 0.5.
In one embodiment, solving the elastic net representation of the superpixels, taking the superpixels associated with the nonzero components of the representation coefficients as the neighbors of the current point, and constructing the graph model of the superpixels includes:
S31, for each superpixel, selecting all other superpixels to construct a dictionary, and obtaining the elastic net representation of all superpixels in the data set by solving the following constrained optimization problem:
s.t. sx_i = SD_i · sc_i + e_i,
wherein SD_i is the dictionary formed by all the other superpixels, sc_i is the representation coefficient of the superpixel sx_i over the dictionary SD_i, SC = [sc_1, sc_2, …, sc_N] is the coefficient matrix, E is the representation error matrix, λ and γ are regularization parameters, and e_i is an error vector;
S32, constructing the elastic net graph model of the superpixels according to the elastic net sparse representation coefficients of each sample point: the coefficient matrix SC is taken as the adjacency matrix of the graph model, and edge connections are established between the superpixels to obtain the graph model of the superpixels.
In one embodiment, carrying out graph convolution network embedded representation learning based on the graph model and obtaining low-dimensional features through a layer-wise vertex aggregation operation comprises the following steps:
S41, sampling the vertices of the graph model layer by layer: at each layer of the graph model, GraphSAGE randomly samples a fixed number of neighborhood vertices in a random-walk manner, and a vertex that is not sampled is approximated by its historical representation;
S42, GraphSAGE updates the information of a vertex by aggregating neighborhood vertex information: each aggregation step first aggregates the previous-layer features of the neighbor vertices, then combines them with the vertex's own previous-layer features to obtain the embedded features of the current layer; the aggregation is repeated K times to obtain the final embedded features of the vertex, wherein the vertex features of the initial layer are the input sample features;
S43, defining the graph convolution:
h_v^k = σ(W^k · CONCAT(h_v^{k-1}, AGG_k({h_u^{k-1}, ∀u ∈ N(v)}))),
where σ is a nonlinear activation function, W^k is the weight matrix to be learned at layer k, used for information propagation between different layers of the model, h_v^{k-1} represents the embedded feature of vertex v obtained at the previous layer, the final low-dimensional feature obtained at layer K is represented as z_v = h_v^K, N(v) represents the set of neighborhood vertices of vertex v, and CONCAT(·) represents the concatenation of two matrices;
S44, designing the loss function:
J(z_u) = -log(σ(z_u^T z_v)) - Q · E_{v_n ~ P_n}[log(σ(-z_u^T z_{v_n}))],
wherein z_u represents the final embedded feature of any vertex u in the graph model, the superscript T represents transposition, v represents a vertex that co-occurs with vertex u on a fixed-length random walk, P_n is the negative-sampling distribution, Q represents the number of negative samples, and z_v represents the final low-dimensional feature obtained at layer K.
In one embodiment, realizing unsupervised classification of the hyperspectral image according to the low-dimensional feature representation by using the K-means algorithm comprises the following steps:
S51, clustering the low-dimensional features of the superpixels by using the K-means algorithm to obtain the label matrix of the superpixels;
and S52, restoring the superpixels to the original pixels, and optimally matching the clustering result with the real categories through the Hungarian algorithm to realize unsupervised classification of the hyperspectral image.
According to the unsupervised hyperspectral image classification method based on graph convolution network embedded representation, the spatial features and the spectral features are combined to form the spatial-spectral joint feature, and superpixel segmentation is performed. Elastic net decomposition is then carried out on each superpixel and a superpixel graph model is constructed, reducing the subsequent computational complexity. The graph convolution network, a deep-learning embedding method, is used to learn a superior embedded representation, on which unsupervised classification learning is performed, achieving accurate classification of the hyperspectral image.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Referring to fig. 1, fig. 1 is a flowchart of an embodiment of the unsupervised hyperspectral image classification method based on graph convolution network embedded representation. The method introduces the superpixel idea: elastic net decomposition is performed on each superpixel and a graph model of the superpixels is constructed. It further introduces the graph convolution network idea: the graph model is processed by the graph convolution network, which learns well the characteristics of graph vertices and their neighborhoods and thus learns a superior embedded representation; unsupervised classification is then performed based on this embedded representation, yielding a better classification result. The method specifically includes the following steps:
S10, sequentially performing EMP and spectral feature extraction on the hyperspectral image to be classified to obtain a spatial-spectral joint feature.
S20, performing superpixel segmentation on the spatial-spectral joint feature to obtain the superpixels of the hyperspectral image to be classified.
S30, solving the elastic net representation of the superpixels, and constructing a graph model of the superpixels by taking the superpixels associated with the nonzero components of the representation coefficients as the neighbors of the current point.
S40, carrying out graph convolution network embedded representation learning based on the graph model, and obtaining low-dimensional features through a layer-wise vertex aggregation operation.
S50, according to the low-dimensional feature representation, realizing unsupervised classification of the hyperspectral image by using the K-means algorithm.
According to the unsupervised hyperspectral image classification method based on graph convolution network embedded representation, the spatial features and the spectral features are combined to form the spatial-spectral joint feature, and superpixel segmentation is performed. Elastic net decomposition is then carried out on each superpixel and a superpixel graph model is constructed, reducing the subsequent computational complexity. The graph convolution network, a deep-learning embedding method, is used to learn a superior embedded representation, on which unsupervised classification learning is performed, achieving accurate classification of the hyperspectral image.
In one embodiment, sequentially performing EMP and spectral feature extraction on the hyperspectral image to be classified to obtain a spatial-spectral joint feature includes:
wherein V represents the spatial-spectral joint feature matrix, X represents the spectral feature matrix, EMP represents the EMP feature matrix, m is the number of principal components, n is the number of circular structural elements with different radii, d is the number of spectral bands, and N is the number of samples.
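The spatial-spectral joint feature can be sketched as a simple stacking of the spectral matrix and the EMP matrix. The shapes below, and the EMP width m(2n + 1) (each principal component plus n openings and n closings), are illustrative assumptions, not fixed by the text:

```python
import numpy as np

# Assumed sizes for illustration only: d spectral bands, N samples,
# m principal components, n circular structuring-element radii.
d, N = 200, 1000
m, n = 4, 3

X = np.random.rand(d, N)                  # spectral feature matrix
# Assumed EMP layout: per principal component, the component itself plus
# n openings and n closings, i.e. m * (2n + 1) feature rows.
EMP = np.random.rand(m * (2 * n + 1), N)

# Spatial-spectral joint feature: stack the two feature blocks per sample.
V = np.vstack([X, EMP])
assert V.shape == (d + m * (2 * n + 1), N)
```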
In one embodiment, performing superpixel segmentation on the spatial-spectral joint feature to obtain the superpixels of the hyperspectral image to be classified includes:
D_ic = (1 - λ) × D_spectral + λ × D_spatial,
D_spectral = tan(SAD(x_i, x_c)),
wherein D_spectral represents the inter-spectral distance measured with the tan function, SAD(x_i, x_c) represents the spectral angle distance, D_spatial represents the normalized spatial Euclidean distance, r is the diagonal length of the search neighborhood, used for constraining the search range to a local neighborhood around each cluster center, D_ic represents the weighted sum of the spectral and spatial distances, λ represents a parameter balancing the effect of the spectral and spatial distances, x_c = (x_c1, x_c2, …, x_cn) represents a cluster center with corresponding spatial coordinates (cx, cy), and x_i = (x_i1, x_i2, …, x_il) represents a pixel located in the local neighborhood of the cluster center x_c, with corresponding spatial coordinates (ix, iy). Specifically, the parameter λ is set to less than 0.5.
In this embodiment, the similarity calculation standard in superpixel segmentation is as follows:
D_ic = (1 - λ) × D_spectral + λ × D_spatial,
D_spectral = tan(SAD(x_i, x_c)),
wherein D_spectral represents the inter-spectral distance measured with the tan function, and SAD(x_i, x_c) represents the spectral angle distance (SAD), whose value range is [0, π]. D_spatial represents the normalized spatial Euclidean distance, and r is the diagonal length of the search neighborhood, constraining the search range to a local neighborhood around each cluster center. D_ic represents the weighted sum of the spectral and spatial distances, with the parameter λ balancing their effect. x_c = (x_c1, x_c2, …, x_cn) represents a cluster center with corresponding spatial coordinates (cx, cy), and x_i = (x_i1, x_i2, …, x_il) represents a pixel located in the local neighborhood of the cluster center x_c, with corresponding spatial coordinates (ix, iy). This embodiment suggests setting the parameter λ to less than 0.5.
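As a minimal sketch of the similarity criterion above (function names are illustrative, not from the patent):

```python
import numpy as np

def sad(xi, xc):
    """Spectral angle distance between two spectra, in [0, pi]."""
    cos_t = np.dot(xi, xc) / (np.linalg.norm(xi) * np.linalg.norm(xc) + 1e-12)
    return np.arccos(np.clip(cos_t, -1.0, 1.0))

def d_ic(xi, xc, pi_xy, pc_xy, r, lam=0.3):
    """Weighted sum of the tan-measured spectral distance and the
    normalized spatial Euclidean distance: (1-lam)*D_spec + lam*D_spat."""
    d_spec = np.tan(sad(xi, xc))
    d_spat = np.linalg.norm(np.asarray(pi_xy, float) - np.asarray(pc_xy, float)) / r
    return (1 - lam) * d_spec + lam * d_spat
```

With λ < 0.5, as the embodiment suggests, the spectral term dominates the combined distance.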
Further, cluster centers are initialized randomly, and each pixel is assigned to the cluster center with the shortest distance, according to the distance between the cluster center and the pixels in its local neighborhood. The feature of each superpixel is represented by the mean of the spatial-spectral joint features of all pixels in the local neighborhood, and the label of each superpixel is determined by the majority label among all pixels in the local neighborhood. The specific process may comprise: firstly, randomly selecting a pixel as a cluster center; assigning the pixels with the shortest distances to that center according to the computed distance between the center and its neighborhood pixels; then computing the mean of the points of each class to obtain a new cluster center; and determining, through repeated iteration, the final centers and the pixels belonging to them, which form the superpixels.
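The iteration described above can be sketched as a plain K-means-style loop. For brevity this sketch uses Euclidean distance on the joint features instead of the weighted D_ic and omits the local search window, so it is an illustration of the assign-update cycle only:

```python
import numpy as np

def superpixel_cluster(V, k, n_iter=10, seed=0):
    """Assign each pixel (a column of V) to its nearest center, then update
    each center as the mean joint feature of its members; repeat for a
    fixed iteration budget."""
    rng = np.random.default_rng(seed)
    centers = V[:, rng.choice(V.shape[1], k, replace=False)].astype(float).copy()
    for _ in range(n_iter):
        # (N, k) distance table, brute force for clarity
        d = np.linalg.norm(V[:, :, None] - centers[:, None, :], axis=0)
        labels = d.argmin(axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[:, c] = V[:, labels == c].mean(axis=1)
    return labels, centers
```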
In one embodiment, solving the elastic net representation of the superpixels, taking the superpixels associated with the nonzero components of the representation coefficients as the neighbors of the current point, and constructing the graph model of the superpixels includes:
S31, for each superpixel, selecting all other superpixels to construct a dictionary, and obtaining the elastic net representation of all superpixels in the data set by solving the following constrained optimization problem:
s.t. sx_i = SD_i · sc_i + e_i,
wherein SD_i is the dictionary formed by all the other superpixels, sc_i is the representation coefficient of the superpixel sx_i over the dictionary SD_i, SC = [sc_1, sc_2, …, sc_N] is the coefficient matrix, E is the representation error matrix, λ and γ are regularization parameters, and e_i is an error vector;
S32, constructing the elastic net graph model of the superpixels according to the elastic net sparse representation coefficients of each sample point: the coefficient matrix SC is taken as the adjacency matrix of the graph model, and edge connections are established between the superpixels to obtain the graph model of the superpixels.
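Steps S31–S32 can be sketched with scikit-learn's ElasticNet solver. The `alpha`/`l1_ratio` settings below stand in for the λ and γ regularization of the patent and are illustrative assumptions, as is the symmetrization at the end:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def elastic_net_graph(SX, alpha=1e-3, l1_ratio=0.5):
    """Represent each superpixel sx_i (a column of SX) over the dictionary
    of all other superpixels; nonzero coefficients mark its neighbors and
    |SC| is used as the adjacency matrix of the superpixel graph."""
    n = SX.shape[1]
    SC = np.zeros((n, n))
    for i in range(n):
        idx = [j for j in range(n) if j != i]
        reg = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=5000)
        reg.fit(SX[:, idx], SX[:, i])
        SC[idx, i] = reg.coef_
    A = np.abs(SC)
    return np.maximum(A, A.T)  # symmetrize for an undirected graph
```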
In one embodiment, carrying out graph convolution network embedded representation learning based on the graph model and obtaining low-dimensional features through a layer-wise vertex aggregation operation comprises the following steps:
S41, sampling the vertices of the graph model hierarchically (K layers): GraphSAGE (Graph SAmple and aggreGatE) randomly samples a fixed number of neighborhood vertices at each layer of the graph model, and a vertex that is not sampled is approximated by its historical representation;
S42, GraphSAGE updates the information of a vertex by aggregating neighborhood vertex information: each aggregation step first aggregates the previous-layer features of the neighbor vertices, then combines them with the vertex's own previous-layer features to obtain the embedded features of the current layer; the aggregation is repeated K times to obtain the final embedded features of the vertex, wherein the vertex features of the initial layer are the input sample features. The invention considers two aggregation modes: mean aggregation and pooling aggregation;
S43, defining the graph convolution:
h_v^k = σ(W^k · CONCAT(h_v^{k-1}, AGG_k({h_u^{k-1}, ∀u ∈ N(v)}))),
where σ is a nonlinear activation function, W^k is the weight matrix to be learned at layer k, used for information propagation between different layers of the model, h_v^{k-1} represents the embedded feature of vertex v obtained at the previous layer, the final low-dimensional feature obtained at layer K is represented as z_v = h_v^K, N(v) represents the set of neighborhood vertices of vertex v, and CONCAT(·) represents the concatenation of two matrices;
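The convolution of S43 with a mean aggregator can be sketched as follows. W plays the role of the learned weight W^k; the L2 normalization of each embedding follows common GraphSAGE practice and is an assumption here:

```python
import numpy as np

def graphsage_layer(H, neighbors, W, sigma=np.tanh):
    """h_v^k = sigma(W^k . CONCAT(h_v^{k-1}, mean({h_u^{k-1}, u in N(v)}))).
    H holds the previous-layer embeddings (one row per vertex); neighbors[v]
    lists the sampled neighborhood indices of vertex v."""
    n, f = H.shape
    out = np.zeros((n, W.shape[1]))
    for v in range(n):
        nv = neighbors[v]
        agg = H[nv].mean(axis=0) if len(nv) else np.zeros(f)  # mean aggregation
        out[v] = sigma(np.concatenate([H[v], agg]) @ W)       # CONCAT then W^k
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.maximum(norms, 1e-12)
```

Stacking K such layers, with fresh W per layer, yields the final embedding z_v.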
S44, designing the loss function, and obtaining the low-dimensional features according to the graph convolution and the loss function, wherein the loss function is specifically:
J(z_u) = -log(σ(z_u^T z_v)) - Q · E_{v_n ~ P_n}[log(σ(-z_u^T z_{v_n}))],
wherein z_u represents the final embedded feature of any vertex u in the graph model, the superscript T represents transposition, v represents a vertex that co-occurs with vertex u on a fixed-length random walk, P_n is the negative-sampling distribution, Q represents the number of negative samples, and z_v represents the final low-dimensional feature obtained at layer K.
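A sketch of the S44 loss for a single positive pair and Q negative samples (numpy; names illustrative):

```python
import numpy as np

def sage_loss(z_u, z_v, z_negs):
    """J = -log(sig(z_u^T z_v)) - Q * mean(log(sig(-z_u^T z_n)))
    over the Q negative samples z_n drawn from the noise distribution P_n."""
    sig = lambda t: 1.0 / (1.0 + np.exp(-t))
    Q = len(z_negs)
    pos = -np.log(sig(np.dot(z_u, z_v)) + 1e-12)
    neg = -Q * np.mean([np.log(sig(-np.dot(z_u, z_n)) + 1e-12) for z_n in z_negs])
    return pos + neg
```

The loss falls when u agrees with the vertex v it co-occurs with on a walk and disagrees with the negatives, which is what pushes co-occurring vertices together in the embedding space.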
In one example, the vertex features of the initial layer are the input sample features, and two aggregation modes are tried: mean aggregation and pooling aggregation.
Mean aggregation takes the mean of the previous-layer embeddings of the neighborhood vertices, defined as follows:
h_{N(v)}^k = mean({h_u^{k-1}, ∀u ∈ N(v)}),
wherein N(v) represents the neighborhood of vertex v, and h_u^{k-1} represents the aggregated information of any vertex u adjacent to vertex v.
In pooling aggregation, the vectors of all adjacent vertices share weights: each passes through a nonlinear fully connected layer, and neighborhood information is then aggregated by max pooling, yielding more effective embedded features. The specific formula is:
h_{N(v)}^k = max({δ(W_pool · h_u^{k-1} + b), ∀u ∈ N(v)}),
where max is the element-wise maximum operator and δ is a nonlinear activation function. In principle, each neighborhood vertex can independently produce a vector by passing through a multilayer perceptron of any depth; the multilayer perceptron can be regarded as a set of functions W_pool that compute the embedded feature of each neighborhood vertex.
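The pooling aggregator can be sketched as follows, where W_pool and b stand for the shared fully connected layer's parameters (a single-layer perceptron here, for brevity):

```python
import numpy as np

def pool_aggregate(H_neigh, W_pool, b, delta=np.tanh):
    """AGG = elementwise max over {delta(W_pool h_u + b), u in N(v)}:
    each neighbor embedding (a row of H_neigh) passes through the shared
    FC layer, then the results are max-pooled component by component."""
    return delta(H_neigh @ W_pool + b).max(axis=0)
```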
In one embodiment, realizing unsupervised classification of the hyperspectral image according to the low-dimensional feature representation by using the K-means algorithm comprises the following steps:
S51, clustering the low-dimensional features of the superpixels by using the K-means algorithm to obtain the label matrix of the superpixels;
and S52, restoring the superpixels to the original pixels, and optimally matching the clustering result with the real categories through the Hungarian algorithm to realize unsupervised classification of the hyperspectral image.
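Steps S51–S52 can be sketched with scikit-learn's KMeans and SciPy's Hungarian solver (`linear_sum_assignment`); the label matching maximizes agreement with the reference labels:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans

def cluster_and_match(Z, y_true, n_classes, seed=0):
    """Cluster embeddings Z (one row per superpixel) with K-means, then
    permute the arbitrary cluster ids to the reference classes via the
    Hungarian algorithm on the co-occurrence counts."""
    y_pred = KMeans(n_clusters=n_classes, n_init=10, random_state=seed).fit_predict(Z)
    counts = np.zeros((n_classes, n_classes), dtype=int)
    for p, t in zip(y_pred, y_true):
        counts[p, t] += 1                       # cluster-vs-class counts
    rows, cols = linear_sum_assignment(-counts)  # negate to maximize matches
    mapping = dict(zip(rows, cols))
    return np.array([mapping[p] for p in y_pred])
```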
The unsupervised hyperspectral image classification method based on graph convolution network embedded representation uses the graph convolution network to establish an unsupervised classification model for hyperspectral images. In feature representation, the model combines the spatial features and the spectral features into a spatial-spectral joint feature. Superpixel segmentation is performed, so that elastic net decomposition is carried out on each superpixel and a superpixel graph model is constructed, reducing the subsequent computational complexity. The graph convolution network, a deep-learning embedding method, learns a superior embedded representation, on which unsupervised classification learning is performed, achieving accurate classification of hyperspectral images.
In one embodiment, to verify the effect of the unsupervised hyperspectral image classification method based on graph convolution network embedded representation, a simulation experiment is performed. The test images Indian Pines-13 (IP-13) and S-Pavia University Scene (S-PUS) have sizes 145 × 145 and 306 × 340, respectively. The superpixel-segmentation parameter λ is set to 0.3; the number of sampling layers K of the graph convolution network GraphSAGE is set to 2; and the numbers of neighborhood samples are set to S_1 = 25 and S_2 = 10, i.e., 25 first-order neighbors and 10 second-order neighbors are sampled. For each vertex, 50 random walks of step length 5 are made; negative sampling follows word2vec, with 20 negative samples per node. The batch size is set to 512, the number of epochs to 5, and the weight decay to 0.0001. The method is implemented on the TensorFlow platform with the Adam optimizer, and the dimensionality of the output embedded representation is C + 1, where C is the number of categories of the corresponding data set.
The evaluation of the experiment used both qualitative and quantitative analytical methods.
A comparison of the hyperspectral image classification results of the method and the other algorithms shows that, for different hyperspectral image data sets, the classification effect of the method is clearly superior to that of the other algorithms.
For quantitative comparative analysis, OA, AA and the Kappa coefficient were used for evaluation, where OA is the overall accuracy of all sample classifications and AA is the average accuracy over classes. They are calculated as follows:
OA = (1/N) Σ_{i=1}^{c} m_ii,
AA = (1/c) Σ_{i=1}^{c} p_i, with p_i = m_ii / N_i,
Kappa = (N Σ_i m_ii − Σ_i a_i b_i) / (N² − Σ_i a_i b_i),
where c is the number of sample classes, m_ii represents the number of samples of class i classified into class i, N is the total number of samples, p_i indicates the classification accuracy of class i, N_i indicates the total number of class-i samples, and a_i and b_i are the i-th row and column sums of the confusion matrix.
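A minimal sketch of the three measures from a confusion matrix (standard definitions; the arrangement is an assumption insofar as the patent's own equations are not reproduced here):

```python
import numpy as np

def oa_aa_kappa(y_true, y_pred, c):
    """OA = trace(M)/N, AA = mean_i(m_ii / N_i), and the kappa coefficient,
    all computed from the confusion matrix M with rows = true class i
    and columns = predicted class j."""
    M = np.zeros((c, c))
    for t, p in zip(y_true, y_pred):
        M[t, p] += 1
    N = M.sum()
    oa = np.trace(M) / N
    aa = np.mean(np.diag(M) / np.maximum(M.sum(axis=1), 1))
    pe = (M.sum(axis=1) @ M.sum(axis=0)) / N**2  # chance agreement
    kappa = (oa - pe) / (1 - pe) if pe < 1 else 1.0
    return oa, aa, kappa
```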
After classification with the model, the method proves highly effective for hyperspectral images: the embedding learned by the graph convolution network yields superior low-dimensional features, thereby improving the classification accuracy.
For the quantitative comparison, classification is performed on the two hyperspectral image data sets, the classification result of each data set is compared with the ground truth, and the corresponding OA, AA and Kappa values are calculated. FIGS. 2 and 3 show the OA, AA and Kappa values of the algorithm of the invention and the other algorithms on the data sets Indian Pines-13 (IP-13) and S-Pavia University Scene (S-PUS), respectively.
In summary, compared with conventional graph-construction methods, superpixel segmentation is introduced: superpixels replace the many pixels of a local neighborhood, redundant information is removed, block-wise graph construction is avoided, the number of vertices is greatly reduced, and the construction complexity is lowered. In addition, the graph model is processed with the graph convolution network, providing a deep embedding method on the graph model that learns well the characteristics of graph vertices and their neighborhoods and obtains a better embedded representation, further improving the accuracy of subsequent clustering. In terms of both classification accuracy and visual effect, the algorithm of this embodiment has clear advantages.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
It should be noted that the terms "first \ second \ third" referred to in the embodiments of the present application merely distinguish similar objects, and do not represent a specific ordering for the objects, and it should be understood that "first \ second \ third" may exchange a specific order or sequence when allowed. It should be understood that "first \ second \ third" distinct objects may be interchanged under appropriate circumstances such that the embodiments of the application described herein may be implemented in an order other than those illustrated or described herein.
The terms "comprising" and "having" and any variations thereof in the embodiments of the present application are intended to cover non-exclusive inclusions. For example, a process, method, apparatus, product, or device that comprises a list of steps or modules is not limited to the listed steps or modules but may alternatively include other steps or modules not listed or inherent to such process, method, product, or device.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.