CN111191704B - Ground-based cloud classification method based on task graph convolutional network

Ground-based cloud classification method based on task graph convolutional network

Info

Publication number
CN111191704B
CN111191704B CN201911347193.7A CN201911347193A CN111191704B CN 111191704 B CN111191704 B CN 111191704B CN 201911347193 A CN201911347193 A CN 201911347193A CN 111191704 B CN111191704 B CN 111191704B
Authority
CN
China
Prior art keywords
graph
network
cloud image
input
ground-based
Prior art date
Legal status
Active
Application number
CN201911347193.7A
Other languages
Chinese (zh)
Other versions
CN111191704A (en)
Inventor
Liu Shuang (刘爽)
Li Mei (李梅)
Zhang Zhong (张重)
Current Assignee
Tianjin Normal University
Original Assignee
Tianjin Normal University
Priority date
Filing date
Publication date
Application filed by Tianjin Normal University
Priority to CN201911347193.7A
Publication of CN111191704A
Application granted
Publication of CN111191704B
Legal status: Active
Anticipated expiration


Classifications

    • G06F18/2411: Classification techniques relating to the classification model based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/253: Fusion techniques of extracted features
    • G06N3/045: Neural network architectures; combinations of networks
    • G06N3/08: Neural network learning methods
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation


Abstract

The embodiment of the invention discloses a ground-based cloud classification method based on a task graph convolutional network, comprising the following steps: preprocessing an input ground-based cloud image to obtain a preprocessed ground-based cloud image, inputting the preprocessed image into a task graph convolutional network training model, and training to obtain the task graph convolutional network; extracting, based on the task graph convolutional network, the convolutional-neural-network-based features, the graph-convolution-based features, and the fused feature representation of each input ground-based cloud image; training a support vector machine classifier on the fused feature representations to obtain a ground-based cloud classification model; and obtaining the fused feature representation of a test input ground-based cloud image and inputting it into the ground-based cloud classification model to obtain the classification result. The method makes full use of the complementary information between the convolutional-neural-network-based features and the graph-convolutional-network-based features, effectively mines the correlation between the two, extracts fused features with higher discriminability, and thereby improves the accuracy of ground-based cloud classification.

Description

Ground-based cloud classification method based on task graph convolutional network
Technical Field
The invention belongs to the technical fields of pattern recognition, meteorology, and artificial intelligence, and in particular relates to a ground-based cloud classification method based on a task graph convolutional network.
Background
Clouds are visible aggregates floating in the air, formed from small water droplets that condense from atmospheric water vapor when it meets cold air, or from small ice crystals that form by desublimation; more than 60% of the Earth's surface is covered by clouds. Clouds play an important role in the water cycle, the surface radiation balance, and climate modeling. Understanding clouds is therefore of great importance.
Cloud height, cloud cover, and cloud type are the three major aspects of cloud observation and have received widespread academic attention in recent years. However, cloud classification remains a difficult problem because of the ever-changing appearance of clouds. Devices for acquiring cloud observations have been developed, including satellite-based and ground-based devices. Satellite-based devices can collect cloud information over a wide range, but their spatial resolution is limited and insufficient to describe cloud types in local regions. In contrast, ground-based devices such as the all-sky imager and the total sky imager can acquire ground-based cloud images with high resolution, which provides reliable data for monitoring and understanding the local sky.
Thanks to the large number of available ground-based cloud images, many researchers first proposed implementing cloud classification with manually designed texture, color, and structure features. In recent years, deep learning has achieved remarkable results in many areas, and researchers have accordingly begun to classify ground-based clouds automatically using convolutional neural networks (CNNs). Shi et al. applied average pooling or maximum pooling to each convolutional activation map, then extracted features from the convolutional activations and classified the ground-based clouds. Ye et al. extracted features from multiple convolutional layers of a CNN, selected representative local descriptors, and then encoded the selected descriptors with Fisher vectors to form the feature representation of the ground-based cloud. Zhang et al. proposed a salient dual-activation aggregation algorithm that extracts salient vector features from shallow convolutional layers and the corresponding weights from higher convolutional layers. Li et al. proposed a dual-supervision loss function that combines the knowledge of different networks and improves ground-based cloud classification accuracy by giving larger weights to hard-to-classify samples.
However, these existing methods ignore the intrinsic data structure of cloud images and therefore cannot learn their feature representations sufficiently. They feed the ground-based cloud images and their labels directly into a deep model without considering the correlation between the images, so the intrinsic data structure of the cloud images cannot be learned. Clouds are natural textures with large intra-class differences and small inter-class differences. It is therefore necessary to model the correlation between ground-based cloud images, so that images from the same class are strongly correlated and images from different classes are weakly correlated; the latent structural information of the cloud images can then be mined, and discriminative ground-based cloud features can finally be learned.
In recent years, researchers have proposed learning the correlations of irregular data structures with graph convolutional networks (GCNs), which have been successfully applied in fields such as action recognition, text classification, and image recognition. In general, graph convolutional networks are constructed following one of two principles: spectral-based or spatial-based. Spectral-based graph convolutions implement the convolution via the graph Fourier transform, whereas spatial-based graph convolutional networks hand-design graph convolutions that act on graph nodes and their neighbors. From the spatial perspective, a graph convolutional network can therefore be used to integrate the correlation information of ground-based cloud images into a deep learning network.
Disclosure of Invention
The invention aims to solve the difficulty of ground-based cloud classification, and provides a ground-based cloud classification method based on a task graph convolutional network.
The method comprises the following steps:
Step S1: acquire an input ground-based cloud image and preprocess it to obtain a preprocessed ground-based cloud image, which serves as the input of the task graph convolutional network;
Step S2: input the preprocessed ground-based cloud image into the task graph convolutional network training model and train to obtain the task graph convolutional network;
Step S3: based on the task graph convolutional network, extract the convolutional-neural-network-based features, the graph-convolution-based features, and the fused feature representation of each input ground-based cloud image;
Step S4: train a support vector machine classifier on the fused feature representations of the input ground-based cloud images to obtain the ground-based cloud classification model;
Step S5: obtain the fused feature representation of a test input ground-based cloud image and input it into the ground-based cloud classification model to obtain the classification result corresponding to the test input ground-based cloud image.
Optionally, the preprocessing of the input ground-based cloud image in step S1 comprises the following steps:
Step S11: normalize the input ground-based cloud image to obtain a normalized image;
Step S12: horizontally flip the normalized image to obtain a horizontally flipped image;
Step S13: randomly crop the horizontally flipped image;
Step S14: subtract the corresponding preset RGB pixel mean from each RGB pixel value of the randomly cropped ground-based cloud image to obtain the preprocessed ground-based cloud image.
Optionally, step S2 comprises the following steps:
Step S21: construct the task graph convolutional network, which comprises a graph feature matrix and adjacency matrix construction module, a graph representation learning module, a feature fusion layer, and a classification module;
Step S22: initialize the parameters of the graph feature matrix and adjacency matrix construction module, the graph representation learning module, and the classification module of the task graph convolutional network to obtain the task graph convolutional network training model;
Step S23: input the preprocessed ground-based cloud images in batches into the sub-network I and sub-network II of the graph feature matrix and adjacency matrix construction module of the task graph convolutional network training model for training, to obtain the task graph convolutional network.
Optionally, step S21 comprises the following steps:
Step 211: construct a sub-network I and a sub-network II in the graph feature matrix and adjacency matrix construction module, input the preprocessed ground-based cloud image into both sub-networks, and learn the depth features of the preprocessed ground-based cloud image. The depth features obtained by the sub-network I are the convolutional-neural-network-based features of the preprocessed ground-based cloud image; they serve as one input of the feature fusion layer and are also used to construct the graph feature matrix X, which is one input of the graph representation learning module. The depth features obtained by the sub-network II are used to construct the adjacency matrix A, which is the other input of the graph representation learning module;
Step 212: based on the graph feature matrix X and the adjacency matrix A, build the graph and the graph convolution layers of the graph representation learning module, and learn the graph-convolution-based features of the preprocessed ground-based cloud image with the graph representation learning module;
Step 213: input the obtained convolutional-neural-network-based features and graph-convolution-based features of the preprocessed ground-based cloud image into the feature fusion layer to obtain the fused features of the preprocessed ground-based cloud image;
Step 214: construct a classification module comprising two fully connected layers and a loss function.
Optionally, the sub-network I is a residual network comprising five convolutional layers, with a max pooling layer after the first convolutional layer and an average pooling layer after the last convolutional layer; the sub-network II is also a residual network, which adds two fully connected layers on top of the sub-network I structure, with a leaky rectified linear unit after the first fully connected layer.
Optionally, the graph G = (V, E) constructed in the graph representation learning module is an undirected fully connected graph, where V is a node set composed of N nodes and E is the set of connecting edges between the nodes; the graph representation learning module contains Z stacked graph convolution layers.
Optionally, the feature fusion layer fuses the convolutional-neural-network-based features and the graph-convolution-based features of the preprocessed ground-based cloud image by serial fusion.
Optionally, in step S23, the task graph convolutional network is further optimized using the stochastic gradient descent method.
Optionally, step S3 comprises the following steps:
Step S31: input the input ground-based cloud images in batches into the trained task graph convolutional network;
Step S32: extract the output of the feature fusion layer of the task graph convolutional network as the fused feature representation of the input ground-based cloud image.
Optionally, step S4 is specifically:
inputting the fused feature representation of each input ground-based cloud image obtained in step S3, together with the label corresponding to that image, into a support vector machine classifier, and training to obtain the ground-based cloud classification model.
The beneficial effects of the invention are as follows: the method integrates the graph convolution algorithm into a deep learning network through the task graph convolutional network and learns the correlation between ground-based cloud images according to their similarity; it can learn the depth features of ground-based clouds according to the classification task, thereby effectively mining the label information of the ground-based cloud images during feature learning; and by fusing the convolutional-neural-network-based features with the graph-convolutional-network-based features, the complementary information between the two can be fully mined, improving the accuracy of ground-based cloud classification.
This work was supported by National Natural Science Foundation of China project No. 61711530240, Natural Science Foundation of Tianjin key projects No. 19JCZDJC31500 and No. 17JCZDJC30600, the Tianjin Normal University Young Research Top-Notch Talent Cultivation Program No. 135202RC1703, the open project fund of the National Laboratory of Pattern Recognition No. 201800002, and the Tianjin higher education institution innovation team fund project.
Drawings
Fig. 1 is a flowchart of a ground-based cloud classification method based on a task graph convolutional network according to an embodiment of the present invention.
Detailed Description
The objects, technical solutions, and advantages of the present invention will become more apparent from the following detailed description taken with reference to the accompanying drawings. It should be understood that the description is only illustrative and is not intended to limit the scope of the invention. In addition, descriptions of well-known structures and techniques are omitted below so as not to obscure the present invention unnecessarily.
Fig. 1 is a flowchart of a ground-based cloud classification method based on a task graph convolutional network according to an embodiment of the present invention. As shown in Fig. 1, the method comprises:
Step S1: acquire an input ground-based cloud image and preprocess it to obtain a preprocessed ground-based cloud image, which serves as the input of the task graph convolutional network.
The preprocessing of the input ground-based cloud image further comprises the following steps:
Step S11: normalize the input ground-based cloud image to obtain a normalized image.
In an embodiment of the present invention, the original size of the input ground-based cloud image is 1024×1024, the two values being the height and width of the image; the normalized ground-based cloud image size is 252×252, the two values being the height and width of the normalized image.
Step S12: horizontally flip the normalized image to obtain a horizontally flipped image.
Horizontal flipping means flipping left-right about the vertical centerline of the image.
Step S13: randomly crop the horizontally flipped image.
Random cropping means cropping with a randomly placed window that does not exceed the image bounds.
In an embodiment of the present invention, if the size of the horizontally flipped image is 252×252, random window cropping is performed within the image: the boundaries of the window stay inside the image, offset from the image boundaries by at most 28 pixels, and the resulting ground-based cloud image has size 224×224, the two values being the height and width of the cropped image.
Step S14: subtract the corresponding preset RGB pixel mean from each RGB pixel value of the randomly cropped ground-based cloud image to obtain the preprocessed ground-based cloud image.
In an embodiment of the present invention, the preset RGB pixel mean may be set to the mean of all ground-based cloud images in the training set over the RGB channels, where each image is normalized to 224×224 before the mean is computed.
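For illustration, a minimal PyTorch-style sketch of steps S11 to S14 follows; the pipeline shape follows the embodiment above, while the `rgb_mean` values are placeholders rather than statistics from the patent's training set.

```python
import torchvision.transforms as T

# Hypothetical per-channel RGB mean of the training set (placeholder values);
# the patent computes it over all 224x224 training images.
rgb_mean = [0.47, 0.52, 0.58]

# Steps S11-S14: resize to 252x252, random horizontal flip, random 224x224
# crop (RandomCrop(224) on a 252x252 image keeps the window inside the image,
# with offsets of at most 28 pixels), then per-channel mean subtraction.
preprocess = T.Compose([
    T.Resize((252, 252)),                              # S11: size normalization
    T.RandomHorizontalFlip(),                          # S12: horizontal flip
    T.RandomCrop(224),                                 # S13: random cropping
    T.ToTensor(),                                      # HWC [0,255] -> CHW [0,1]
    T.Normalize(mean=rgb_mean, std=[1.0, 1.0, 1.0]),   # S14: subtract mean only
])
```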
Step S2: input the preprocessed ground-based cloud image into the task graph convolutional network training model and train to obtain the task graph convolutional network.
Further, step S2 comprises the following steps:
Step S21: construct the task graph convolutional network, which comprises a graph feature matrix and adjacency matrix construction module, a graph representation learning module, a feature fusion layer, and a classification module.
Further, step S21 comprises the following steps:
Step 211: construct two sub-networks, sub-network I and sub-network II, in the graph feature matrix and adjacency matrix construction module, input the preprocessed ground-based cloud image into both, and learn the depth features of the preprocessed ground-based cloud image. The depth features obtained by the sub-network I are the convolutional-neural-network-based features of the preprocessed ground-based cloud image; they serve as one input of the feature fusion layer and are also used to construct the graph feature matrix X, which is one input of the graph representation learning module. The depth features obtained by the sub-network II are used to construct the adjacency matrix A, which is the other input of the graph representation learning module.
The sub-network I is a residual network comprising five convolutional layers. The convolution kernel of the first convolutional layer has size $c_1 \times c_1$, stride $s_1$, and $n_1$ groups of kernels. The second to fifth layers consist of unequal numbers of residual blocks; each residual block is composed of $K$ convolutional layers, and the $k$-th convolutional layer of each residual block has kernel size $c_k \times c_k$, stride $s_k$, and $n_k$ groups of kernels, i.e., it produces $n_k$ convolutional activation maps. A max pooling layer with kernel size $c_{max} \times c_{max}$ and stride $s_{max}$ follows the first convolutional layer, and an average pooling layer with kernel size $c_{avg} \times c_{avg}$ and stride $s_{avg}$ follows the last convolutional layer.
The sub-network II is also a residual network; it adds two fully connected layers on top of the sub-network I structure, with $M_1$ and $M_2$ neurons respectively. A leaky rectified linear unit follows the first fully connected layer.
In one embodiment of the present invention, the convolution kernel of the first convolutional layer in the sub-network I has size 7×7 with stride 2 and 64 groups of kernels. The second to fifth layers are composed of 3, 4, 6, and 3 residual blocks respectively; each residual block consists of 3 convolutional layers, where the first and third convolutional layers have 1×1 kernels, the second has 3×3 kernels, and all three have stride 1. The numbers of kernel groups of the first to third layers of the residual blocks in the second stage are 64, 64, and 256 respectively; in each subsequent stage, the numbers of kernel groups of the first to third layers are twice those of the previous stage. The max pooling layer has a 3×3 kernel with stride 2, and the average pooling layer has a 7×7 kernel with stride 7. The two fully connected layers that the sub-network II adds on top of the sub-network I structure have 256 and 7 neurons respectively.
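The layout above (an initial 7×7 convolution followed by stages of 3, 4, 6, and 3 three-layer residual blocks) matches a ResNet-50 backbone, so the two sub-networks could be sketched as below; treating the trunk as torchvision's ResNet-50 and sharing it between the sub-networks in exactly this way are assumptions of the sketch, not details fixed by the patent.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class SubNetworks(nn.Module):
    """Sub-network I: ResNet-50 trunk ending in global average pooling (2048-d).
    Sub-network II: the same trunk (shared weights) plus two fully connected
    layers of 256 and 7 neurons, with a Leaky ReLU after the first."""
    def __init__(self, num_classes=7):
        super().__init__()
        backbone = resnet50()
        self.trunk = nn.Sequential(*list(backbone.children())[:-1])  # shared part
        self.fc1 = nn.Linear(2048, 256)
        self.lrelu = nn.LeakyReLU(0.2)        # leakage coefficient lambda = 0.2
        self.fc2 = nn.Linear(256, num_classes)

    def forward(self, img):
        x = self.trunk(img).flatten(1)        # sub-network I output: (B, 2048)
        x_tilde = self.lrelu(self.fc1(x))     # sub-network II, first FC: (B, 256)
        logits = self.fc2(x_tilde)            # sub-network II, second FC: (B, 7)
        return x, x_tilde, logits
```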
In an embodiment of the present invention, the leaky rectified linear unit may be expressed as:

$$h(a) = \begin{cases} a, & a \geq 0, \\ \lambda a, & a < 0, \end{cases}$$

where $h(a)$ is the output value of the leaky rectified linear unit, $a$ is its input value, and $\lambda$ is the leakage coefficient.

In one embodiment of the present invention, $\lambda$ is set to 0.2.
In one embodiment of the present invention, the output of the sub-network I is a 2048-dimensional vector $x \in \mathbb{R}^{2048}$, which serves as one input of the feature fusion layer and is also used to construct one input of the graph representation learning module: the graph feature matrix $X$. The output of the first fully connected layer of the sub-network II is a 256-dimensional vector $\tilde{x} \in \mathbb{R}^{256}$, which is used to construct the other input of the graph representation learning module: the adjacency matrix $A$.
In an embodiment of the present invention, the sub-network I and the sub-network II receive the same preprocessed ground-based cloud image as input, and the parameters of the network-structure parts shared by the two sub-networks are tied. The output of the second fully connected layer of the sub-network II is connected to a loss function that acts on a softmax function, where the softmax function can be expressed as:

$$\sigma(z_m) = \frac{\exp(z_m)}{\sum_{\tau=1}^{T} \exp(z_\tau)},$$

where $T$ is the number of cloud classes, $z_m$ is the output value of the neuron at the $m$-th position of the second fully connected layer, and $z_\tau$ is the output value of the neuron at the $\tau$-th position of the second fully connected layer.

The loss function is the cross-entropy function, which can be expressed as:

$$L_{cnn} = -\sum_{m=1}^{T} q_m \log \sigma(z_m),$$

where $q_m$ is the probability of the true label: $q_m = 1$ when $m$ is the true label, and $q_m = 0$ otherwise.
Step 212: based on the graph feature matrix X and the adjacency matrix A, build the graph and the graph convolution layers of the graph representation learning module, and learn the graph-convolution-based features of the preprocessed ground-based cloud image with the graph representation learning module.
in an embodiment of the present invention, the constructed graph g= (V, E) is an undirected full-connection graph, where V is a node set composed of N nodes and E is a set of connection edges between the nodes. In graph G, each node v i E V represents a feature vector of a ground cloud image
Figure BDA0002333706000000092
Namely, the depth characteristics based on the convolutional neural network, which are learned by the sub-network I; the graph feature matrix can then be expressed as: />
Figure BDA0002333706000000093
Wherein each row of the graph feature matrix represents a node, and 2048 represents the number of feature channels of the graph feature matrix. The adjacency matrix is used for reflecting the strength of the correlation between nodes, and is composed of +.>
Figure BDA0002333706000000094
Construction, expressed as:
Figure BDA0002333706000000095
wherein the dimension of the adjacency matrix is N x N,
Figure BDA0002333706000000096
representation of feature vector x i And performing dimension reduction to obtain the feature vector.
In one embodiment of the invention, σ (·) is the maximum function of flexibility for
Figure BDA0002333706000000097
The activation value of which can be expressed as:
Figure BDA0002333706000000098
wherein, the liquid crystal display device comprises a liquid crystal display device,
Figure BDA0002333706000000099
is->
Figure BDA00023337060000000910
The value of row i and column j.
In one embodiment of the invention, each element in adjacency matrix A is square-root to obtain a normalized adjacency matrix
Figure BDA00023337060000000911
In one embodiment of the present invention, N has a value of 48.
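Under the reconstruction above, the adjacency matrix could be computed as in the following sketch; the row-wise direction of the softmax and the helper name `build_adjacency` are assumptions.

```python
import torch
import torch.nn.functional as F

def build_adjacency(x_tilde: torch.Tensor) -> torch.Tensor:
    """x_tilde: (N, 256) dimension-reduced features from sub-network II.
    Returns the normalized adjacency matrix A~ of shape (N, N)."""
    s = x_tilde @ x_tilde.t()      # pairwise similarities S, shape (N, N)
    a = F.softmax(s, dim=1)        # softmax over each row (assumed direction)
    return a.sqrt()                # element-wise square-root normalization

x_tilde = torch.randn(48, 256)     # N = 48 images per graph
adj = build_adjacency(x_tilde)     # (48, 48)
```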
In one embodiment of the present invention, the graph representation learning module contains $Z$ graph convolution layers, where the number of output feature channels of the $l$-th layer is $d_l$ and the parameters of the $l$-th layer are $W_l \in \mathbb{R}^{d_{l-1} \times d_l}$. The $l$-th graph convolution operation may be expressed as:

$$X_l = f(X_{l-1}, A),$$

where $f(\cdot)$ can be expressed as:

$$f(X_{l-1}, A) = h\left(\tilde{A} X_{l-1} W_l\right),$$

where $h(\cdot)$ is the leaky rectified linear unit and $X_{l-1}$ is the input of the $l$-th graph convolution operation.
In one embodiment of the invention, the graph representation learning module has 3 graph convolution layers, whose numbers of output feature channels are 1024, 1024, and 512 respectively.
In one embodiment of the invention, the output of the third graph convolution layer, i.e., the output of the graph representation learning module, $X_3 \in \mathbb{R}^{N \times 512}$, gives the graph-convolution-based features of the preprocessed ground-based cloud images and serves as the other input of the feature fusion layer. Specifically, the graph-convolution-based feature of each input ground-based cloud image is a 512-dimensional vector.
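A minimal sketch of the graph representation learning module under the propagation rule reconstructed above, with output channels 1024, 1024, and 512; the class name and the uniform initialization range are assumptions.

```python
import torch
import torch.nn as nn

class GraphRepresentation(nn.Module):
    """Three graph convolution layers X_l = LeakyReLU(A~ X_{l-1} W_l)."""
    def __init__(self, dims=(2048, 1024, 1024, 512), lam=0.2):
        super().__init__()
        # Weights only, uniformly initialized, as described in step S22;
        # the (-0.1, 0.1) range is a placeholder.
        self.weights = nn.ParameterList([
            nn.Parameter(torch.empty(dims[l], dims[l + 1]).uniform_(-0.1, 0.1))
            for l in range(len(dims) - 1)
        ])
        self.act = nn.LeakyReLU(lam)

    def forward(self, x, adj):
        # x: (N, 2048) graph feature matrix; adj: (N, N) normalized adjacency
        for w in self.weights:
            x = self.act(adj @ x @ w)
        return x                      # (N, 512) graph-convolution-based features
```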
Step 213: input the obtained convolutional-neural-network-based features and graph-convolution-based features of the preprocessed ground-based cloud image into the feature fusion layer to obtain the fused features of the preprocessed ground-based cloud image.
In an embodiment of the present invention, the feature fusion layer fuses the convolutional-neural-network-based features and the graph-convolution-based features of each preprocessed ground-based cloud image by serial fusion, obtaining the fused feature of each preprocessed ground-based cloud image: a 2560-dimensional vector.
Step 214: construct a classification module comprising two fully connected layers and a loss function.
In an embodiment of the present invention, the two fully connected layers of the classification module have 256 and 7 neurons respectively, and the output of the second fully connected layer is connected to the cross-entropy loss function $L_{gcn}$, which acts on a softmax function.
In one embodiment of the present invention, the overall loss function may be expressed as:

$$L = L_{cnn} + L_{gcn}.$$
step S22, initializing parameters of a graph feature matrix and an adjacent matrix construction module, a graph representation learning module and a classification module in the task graph convolution network to obtain a task graph convolution network training model;
in an embodiment of the present invention, parameters of the graph feature matrix and adjacency matrix construction module and classification module include weights and offsets, the weight initialization obeys standard n-ethernet distribution, and the offsets are all initialized to zero; the graph shows that the parameters of the learning module only contain weights, and the initialization obeys uniform distribution.
Step S23: input the preprocessed ground-based cloud images in batches into the sub-network I and sub-network II of the graph feature matrix and adjacency matrix construction module of the task graph convolutional network training model for training, to obtain the task graph convolutional network.
In one embodiment of the invention, the task graph convolutional network may also be optimized using stochastic gradient descent (SGD).
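A hypothetical training step combining the sketches above, with the total loss $L = L_{cnn} + L_{gcn}$ from step 214 and SGD as the optimizer; the learning rate, momentum, and the absence of an activation between the classification module's fully connected layers are assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

subnets = SubNetworks()
gcn = GraphRepresentation()
classifier = nn.Sequential(nn.Linear(2560, 256), nn.Linear(256, 7))
params = (list(subnets.parameters()) + list(gcn.parameters())
          + list(classifier.parameters()))
optimizer = torch.optim.SGD(params, lr=0.01, momentum=0.9)  # placeholder values
criterion = nn.CrossEntropyLoss()  # combines softmax and negative log-likelihood

def train_step(images, labels):
    # images: (48, 3, 224, 224); one batch forms one graph of N = 48 nodes
    x, x_tilde, logits_cnn = subnets(images)
    adj = build_adjacency(x_tilde)
    x_gcn = gcn(x, adj)
    fused = torch.cat([x, x_gcn], dim=1)              # feature fusion layer
    logits_gcn = classifier(fused)
    loss = criterion(logits_cnn, labels) + criterion(logits_gcn, labels)  # L = L_cnn + L_gcn
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```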
Step S3: based on the task graph convolutional network, extract the convolutional-neural-network-based features, the graph-convolution-based features, and the fused feature representation of each input ground-based cloud image.
Further, step S3 comprises the following steps:
Step S31: input the input ground-based cloud images in batches into the trained task graph convolutional network, i.e., into the sub-network I and sub-network II of its graph feature matrix and adjacency matrix construction module.
Step S32: extract the output of the feature fusion layer of the task graph convolutional network as the fused feature representation of the input ground-based cloud image.
In one embodiment of the invention, the fused feature representation of each input ground-based cloud image is a 2560-dimensional vector.
Step S4: train a support vector machine classifier on the fused feature representations of the input ground-based cloud images to obtain the ground-based cloud classification model.
Step S4 is specifically: input the fused feature representation of each input ground-based cloud image obtained in step S3, together with the label corresponding to that image, into a support vector machine classifier, and train to obtain the ground-based cloud classification model.
In an embodiment of the present invention, the support vector machine classifier uses a radial basis function kernel.
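Steps S4 and S5 could be realized with scikit-learn as sketched below; the placeholder arrays and default hyperparameters are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

# train_feats: (num_images, 2560) fused feature representations from step S3
# train_labels: (num_images,) integer cloud-class labels
train_feats = np.random.randn(100, 2560)    # placeholder data
train_labels = np.random.randint(0, 7, 100)

svm = SVC(kernel='rbf')                     # radial basis function kernel
svm.fit(train_feats, train_labels)

# Step S5: classify a test image from its fused feature representation
test_feat = np.random.randn(1, 2560)
pred = svm.predict(test_feat)
```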
Step S5: obtain the fused feature representation of the test input ground-based cloud image and input it into the ground-based cloud classification model to obtain the classification result corresponding to the test input ground-based cloud image.
The fused feature representation of the test input ground-based cloud image can be obtained according to the steps above.
In an application example of the invention, the ground-based cloud image database used was captured in China at different times and in different seasons with a fisheye-lens camera. Classification based on the fused feature representations extracted from the feature fusion layer achieves a ground-based cloud image classification accuracy of 89.48%, which demonstrates the effectiveness of the method.
In summary, the method can learn depth features according to the classification task and learn structural features according to the correlation between ground-based cloud images; it can learn the convolutional-neural-network-based features and the graph-convolutional-network-based features simultaneously in the same network, makes full use of their complementary information, effectively mines the correlation between the two, extracts fused features with higher discriminability, and improves the accuracy of ground-based cloud classification.
It is to be understood that the above-described embodiments of the present invention are merely illustrative of the principles of the present invention and in no way limit the invention. Accordingly, any modification, equivalent replacement, or improvement made without departing from the spirit and scope of the present invention shall be included in the scope of the present invention. Furthermore, the appended claims are intended to cover all such changes and modifications that fall within the scope and boundary of the appended claims, or the equivalents of such scope and boundary.

Claims (8)

1. A ground-based cloud classification method based on a task graph convolutional network, characterized in that the method comprises the following steps:
Step S1: acquire an input ground-based cloud image and preprocess it to obtain a preprocessed ground-based cloud image, which serves as the input of the task graph convolutional network;
Step S2: input the preprocessed ground-based cloud image into the task graph convolutional network training model and train to obtain the task graph convolutional network;
Step S3: based on the task graph convolutional network, extract the convolutional-neural-network-based features, the graph-convolution-based features, and the fused feature representation of each input ground-based cloud image;
Step S4: train a support vector machine classifier on the fused feature representations of the input ground-based cloud images to obtain the ground-based cloud classification model;
Step S5: obtain the fused feature representation of the test input ground-based cloud image and input it into the ground-based cloud classification model to obtain the classification result corresponding to the test input ground-based cloud image;
wherein step S2 comprises the following steps:
Step S21: construct the task graph convolutional network, which comprises a graph feature matrix and adjacency matrix construction module, a graph representation learning module, a feature fusion layer, and a classification module;
Step S22: initialize the parameters of the graph feature matrix and adjacency matrix construction module, the graph representation learning module, and the classification module of the task graph convolutional network to obtain the task graph convolutional network training model;
Step S23: input the preprocessed ground-based cloud images in batches into the sub-network I and sub-network II of the graph feature matrix and adjacency matrix construction module of the task graph convolutional network training model for training, to obtain the task graph convolutional network;
and step S3 comprises the following steps:
Step S31: input the input ground-based cloud images in batches into the trained task graph convolutional network;
Step S32: extract the output of the feature fusion layer of the task graph convolutional network as the fused feature representation of the input ground-based cloud image.
2. The method according to claim 1, characterized in that the preprocessing of the input ground-based cloud image in step S1 comprises the following steps:
Step S11: normalize the input ground-based cloud image to obtain a normalized image;
Step S12: horizontally flip the normalized image to obtain a horizontally flipped image;
Step S13: randomly crop the horizontally flipped image;
Step S14: subtract the corresponding preset RGB pixel mean from each RGB pixel value of the randomly cropped ground-based cloud image to obtain the preprocessed ground-based cloud image.
3. The method according to claim 1, characterized in that step S21 comprises the following steps:
Step 211: construct a sub-network I and a sub-network II in the graph feature matrix and adjacency matrix construction module, input the preprocessed ground-based cloud image into both sub-networks, and learn the depth features of the preprocessed ground-based cloud image, wherein the depth features obtained by the sub-network I are the convolutional-neural-network-based features of the preprocessed ground-based cloud image, serve as one input of the feature fusion layer, and are also used to construct the graph feature matrix X, which is one input of the graph representation learning module; and the depth features obtained by the sub-network II are used to construct the adjacency matrix A, which is the other input of the graph representation learning module;
Step 212: based on the graph feature matrix X and the adjacency matrix A, build the graph and the graph convolution layers of the graph representation learning module, and learn the graph-convolution-based features of the preprocessed ground-based cloud image with the graph representation learning module;
Step 213: input the obtained convolutional-neural-network-based features and graph-convolution-based features of the preprocessed ground-based cloud image into the feature fusion layer to obtain the fused features of the preprocessed ground-based cloud image;
Step 214: construct a classification module comprising two fully connected layers and a loss function.
4. The method according to claim 3, characterized in that the sub-network I is a residual network comprising five convolutional layers, wherein a max pooling layer follows the first convolutional layer and an average pooling layer follows the last convolutional layer; and the sub-network II is also a residual network, which adds two fully connected layers on top of the sub-network I structure, with a leaky rectified linear unit after the first fully connected layer.
5. The method according to claim 3, characterized in that the graph G = (V, E) constructed in the graph representation learning module is an undirected fully connected graph, where V is a node set composed of N nodes and E is the set of connecting edges between the nodes; and the graph representation learning module contains Z stacked graph convolution layers.
6. The method according to claim 3, characterized in that the feature fusion layer fuses the convolutional-neural-network-based features and the graph-convolution-based features of the preprocessed ground-based cloud image by serial fusion.
7. The method according to claim 1, characterized in that in step S23, the task graph convolutional network is further optimized using the stochastic gradient descent method.
8. The method according to claim 1, characterized in that step S4 is specifically: inputting the fused feature representation of each input ground-based cloud image obtained in step S3, together with the label corresponding to that image, into a support vector machine classifier, and training to obtain the ground-based cloud classification model.
CN201911347193.7A (filed 2019-12-24) Ground-based cloud classification method based on task graph convolutional network, granted as CN111191704B (Active)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911347193.7A CN111191704B (en) 2019-12-24 Ground-based cloud classification method based on task graph convolutional network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911347193.7A CN111191704B (en) 2019-12-24 Ground-based cloud classification method based on task graph convolutional network

Publications (2)

Publication Number Publication Date
CN111191704A 2020-05-22
CN111191704B 2023-05-02

Family

ID=70707524

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911347193.7A Active CN111191704B (en) 2019-12-24 Ground-based cloud classification method based on task graph convolutional network

Country Status (1)

Country Link
CN (1) CN111191704B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112152806B (en) * 2020-09-25 2023-07-18 青岛大学 Cloud-assisted image identification method, device and equipment supporting privacy protection
CN112396123A (en) * 2020-11-30 2021-02-23 上海交通大学 Image recognition method, system, terminal and medium based on convolutional neural network
CN113963081B (en) * 2021-10-11 2024-05-17 华东师范大学 Image chart intelligent synthesis method based on graph convolution network


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109426858B * 2017-08-29 2021-04-06 BOE Technology Group Co., Ltd. Neural network, training method, image processing method, and image processing apparatus

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107292256A (en) * 2017-06-14 2017-10-24 Xidian University Deep convolutional wavelet neural network expression recognition method based on auxiliary task
CN108108720A (en) * 2018-01-08 2018-06-01 Tianjin Normal University Ground-based cloud image classification method based on deep multi-modal fusion
CN108629368A (en) * 2018-03-28 2018-10-09 Tianjin Normal University Multi-modal ground-based cloud classification method based on joint deep fusion
CN108416397A (en) * 2018-03-30 2018-08-17 South China University of Technology Image sentiment classification method based on ResNet-GCN networks
CN110222611A (en) * 2019-05-27 2019-09-10 Institute of Automation, Chinese Academy of Sciences Human skeleton action recognition method, system and device based on graph convolutional network
CN110378381A (en) * 2019-06-17 2019-10-25 Huawei Technologies Co., Ltd. Object detection method, device and computer storage medium
CN110516723A (en) * 2019-08-15 2019-11-29 Tianjin Normal University Multi-modal ground-based cloud image recognition method based on deep tensor fusion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zhou Mingfei et al., "Convolutional neural network classification models for high-resolution satellite imagery," Journal of Image and Graphics, 2017, vol. 22, no. 7, pp. 996-1007. *
Wang Xujiao et al., "Deep learning point cloud classification model based on graph convolutional network," Laser & Optoelectronics Progress, 2019, vol. 56, no. 21, pp. 56-60. *

Also Published As

Publication number Publication date
CN111191704A (en) 2020-05-22

Similar Documents

Publication Publication Date Title
CN108985238B (en) Impervious surface extraction method and system combining deep learning and semantic probability
Zhang et al. Remote sensing image spatiotemporal fusion using a generative adversarial network
CN108596101B (en) Remote sensing image multi-target detection method based on convolutional neural network
CN108038445B (en) SAR automatic target identification method based on multi-view deep learning framework
CN111191704B (en) Ground-based cloud classification method based on task graph convolutional network
Pacifici et al. An innovative neural-net method to detect temporal changes in high-resolution optical satellite imagery
Marcu et al. SafeUAV: Learning to estimate depth and safe landing areas for UAVs from synthetic data
CN108229589B (en) Foundation cloud picture classification method based on transfer learning
Jiang et al. Deep neural networks-based vehicle detection in satellite images
CN110516723B (en) Multi-modal foundation cloud picture identification method based on depth tensor fusion
CN107918776B (en) Land planning method and system based on machine vision and electronic equipment
CN111242227B (en) Multi-mode foundation cloud identification method based on heterogeneous depth features
Chini et al. Comparing statistical and neural network methods applied to very high resolution satellite images showing changes in man-made structures at rocky flats
CN106373088A (en) Quick mosaic method for aviation images with high tilt rate and low overlapping rate
CN111461006B (en) Optical remote sensing image tower position detection method based on deep migration learning
Viswanathan et al. Vision-based robot localization across seasons and in remote locations
Iwashita et al. Tu-net and tdeeplab: Deep learning-based terrain classification robust to illumination changes, combining visible and thermal imagery
CN111695460A (en) Pedestrian re-identification method based on local graph convolution network
CN114332919A (en) Pedestrian detection method and device based on multi-spatial relationship perception and terminal equipment
CN115908924A (en) Multi-classifier-based small sample hyperspectral image semantic segmentation method and system
CN116363526A (en) MROCNet model construction and multi-source remote sensing image change detection method and system
Das et al. Transfer learning with res2net for remote sensing scene classification
Zhang et al. Multi-path fusion network for high-resolution height estimation from a single orthophoto
Andersson et al. Combining street-level and aerial images for dengue incidence rate estimation
CN108509826A (en) A kind of roads recognition method and its system of remote sensing image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant