CN111191704A - Foundation cloud classification method based on task graph convolutional network - Google Patents

Foundation cloud classification method based on task graph convolutional network Download PDF

Info

Publication number
CN111191704A
CN111191704A
Authority
CN
China
Prior art keywords
graph
network
foundation cloud
cloud image
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911347193.7A
Other languages
Chinese (zh)
Other versions
CN111191704B (en)
Inventor
刘爽
李梅
张重
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Normal University
Original Assignee
Tianjin Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Normal University filed Critical Tianjin Normal University
Priority to CN201911347193.7A priority Critical patent/CN111191704B/en
Publication of CN111191704A publication Critical patent/CN111191704A/en
Application granted granted Critical
Publication of CN111191704B publication Critical patent/CN111191704B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention discloses a foundation cloud classification method based on a task graph convolution network, which comprises the following steps: preprocessing an input foundation cloud image to obtain a preprocessed foundation cloud image, inputting the preprocessed foundation cloud image into a task graph convolution network training model, and training to obtain a task graph convolution network; extracting, based on the task graph convolution network, the convolutional-neural-network-based features, the graph-convolution-based features, and the fusion feature representation of each input foundation cloud image; training a support vector machine classifier on the fusion feature representations to obtain a foundation cloud classification model; and acquiring the fusion feature representation of a test input foundation cloud image and inputting it into the foundation cloud classification model to obtain the classification result. The method makes full use of the complementary information between the convolutional-neural-network-based features and the graph-convolution-network-based features, effectively mines the correlation between them, and extracts fusion features with higher discriminability, thereby improving the accuracy of foundation cloud classification.

Description

Foundation cloud classification method based on task graph convolutional network
Technical Field
The invention belongs to the technical field of pattern recognition, meteorological science and artificial intelligence, and particularly relates to a foundation cloud classification method based on a task graph convolutional network.
Background
A cloud is a visible aggregate suspended in the air, formed from small water droplets condensed from atmospheric water vapor upon cooling or from small ice crystals formed by deposition; more than 60 percent of the Earth's surface is covered by cloud. Clouds play an important role in the water cycle, the surface radiation balance, and climate modeling. Understanding clouds is therefore of great significance.
Cloud height, cloud cover, and cloud type are three main aspects of cloud observation and have attracted widespread academic attention in recent years. However, because of the changeable appearance of clouds, cloud classification remains a difficult problem. Devices have been developed for collecting cloud observations, including satellite-based devices and ground-based devices. Satellite-based devices can acquire cloud information over a wide range, but their spatial resolution is limited and insufficient to describe cloud features in local regions. In contrast, ground-based devices such as whole-sky imagers and total-sky imagers can acquire ground-based cloud images with high resolution, providing reliable data for monitoring and understanding the local sky.
Thanks to the availability of large numbers of ground-based cloud images, many researchers have proposed performing cloud classification with manually designed texture, color, and structure features. In recent years, deep learning has achieved significant results in many fields. Inspired by this, researchers have also begun to use convolutional neural networks (CNNs) to classify ground-based clouds automatically. Shi et al. applied average or max pooling to each convolutional activation map and then extracted convolutional-activation-based features to classify ground-based clouds. Ye et al. extracted features from multiple convolutional layers of a CNN, selected representative local descriptors, encoded the selected descriptors with Fisher vectors, and used the result as the feature representation of the ground-based cloud image. Zhang et al. proposed a salient dual-activation aggregation algorithm that extracts salient vector features from a shallow convolutional layer and corresponding weights from a deep convolutional layer. Li et al. proposed a dual-supervision loss function that combines the knowledge of different networks and improves the accuracy of ground-based cloud classification by assigning larger weights to samples that are difficult to classify.
However, existing methods ignore the intrinsic data structure of ground-based cloud images and thus cannot adequately learn their representations. They input the ground-based cloud images and the corresponding labels directly into a deep model without considering the correlations among the images, so the intrinsic data structure of the cloud images cannot be learned. Clouds are natural textures with large intra-class differences and small inter-class differences. It is therefore necessary to establish correlations among ground-based cloud images, such that images from the same class have larger correlations and images from different classes have smaller correlations, so as to mine the latent structural information of the cloud images and ultimately learn discriminative ground-based cloud features.
In recent years, researchers have proposed learning the correlations of irregular data structures with graph convolutional networks (GCNs) and have successfully applied them to fields such as action recognition, text classification, and image recognition. Generally, graph convolutional networks are constructed following either spectral or spatial principles. Spectral graph convolution implements the convolution via the graph Fourier transform, while spatial graph convolutional networks design graph convolutions that act directly on graph nodes and their neighborhoods. Therefore, from the spatial perspective, the correlation information of ground-based cloud images can be merged into a deep learning network by means of a graph convolutional network.
Disclosure of Invention
The invention aims to solve the difficulty of foundation cloud classification, and provides a foundation cloud classification method based on a task graph convolution network.
The method comprises the following steps:
step S1, acquiring an input foundation cloud image, preprocessing the input foundation cloud image to obtain a preprocessed foundation cloud image, and using the preprocessed foundation cloud image as the input of a task graph convolution network;
step S2, inputting the preprocessed foundation cloud image into a task graph convolution network training model, and training to obtain a task graph convolution network;
step S3, extracting, based on the task graph convolution network, the convolutional-neural-network-based features, the graph-convolution-based features, and the fusion feature representation of each input foundation cloud image;
step S4, training a support vector machine classifier according to the fusion feature representation of the input foundation cloud image to obtain a foundation cloud classification model;
and step S5, acquiring the fusion feature representation of the test input foundation cloud image, and inputting the fusion feature representation into the foundation cloud classification model to obtain a classification result corresponding to the test input foundation cloud image.
Optionally, the step of preprocessing the input ground-based cloud image in step S1 includes the steps of:
step S11, normalizing the size of the input foundation cloud image to obtain a normalized image;
step S12, horizontally flipping the normalized image to obtain a horizontally flipped image;
step S13, randomly cropping the horizontally flipped image;
and step S14, subtracting the corresponding preset RGB pixel mean from each RGB pixel value of the randomly cropped foundation cloud image to obtain the preprocessed foundation cloud image.
Optionally, the step S2 includes the following steps:
step S21, constructing a task graph convolution network, wherein the task graph convolution network comprises a graph feature matrix and adjacency matrix construction module, a graph representation learning module, a feature fusion layer and a classification module;
step S22, initializing parameters of a graph feature matrix and adjacency matrix construction module, a graph representation learning module and a classification module in the task graph convolutional network to obtain a task graph convolutional network training model;
and step S23, inputting the preprocessed foundation cloud images in batches into sub-network I and sub-network II of the graph feature matrix and adjacency matrix construction module of the task graph convolution network training model for training, to obtain the task graph convolution network.
Optionally, the step S21 includes the following steps:
step 211, constructing sub-network I and sub-network II in the graph feature matrix and adjacency matrix construction module, inputting the preprocessed foundation cloud image into sub-network I and sub-network II, and learning depth features of the preprocessed foundation cloud image, wherein the depth features learned by sub-network I are the convolutional-neural-network-based features of the preprocessed foundation cloud image, serve as one input of the feature fusion layer, and are also used to construct the graph feature matrix X, which is one input of the graph representation learning module; the depth features learned by sub-network II are used to construct the adjacency matrix A, which is the other input of the graph representation learning module;
step 212, constructing the graph and the graph convolution layers in the graph representation learning module based on the graph feature matrix X and the adjacency matrix A, and learning the graph-convolution-based features of the preprocessed foundation cloud image with the graph representation learning module;
step 213, inputting the obtained feature of the preprocessed foundation cloud image based on the convolutional neural network and the feature based on the graph convolution into the feature fusion layer to obtain a fusion feature of the preprocessed foundation cloud image;
step 214, constructing a classification module, wherein the classification module comprises two fully connected layers and a loss function.
Optionally, sub-network I is a residual network comprising five convolutional layers, in which a max pooling layer follows the first convolutional layer and an average pooling layer follows the last convolutional layer; sub-network II is also a residual network, with two fully connected layers added on top of the structure of sub-network I and a leaky rectified linear unit after the first fully connected layer.
Optionally, the graph G = (V, E) constructed in the graph representation learning module is an undirected fully-connected graph, where V is a node set composed of N nodes and E is the set of connecting edges between nodes; and the graph representation learning module contains Z graph convolution layers.
Optionally, the feature fusion layer fuses the features of the preprocessed foundation cloud image based on the convolutional neural network and the features of the preprocessed foundation cloud image based on the graph convolution in a series fusion mode.
Optionally, in step S23, the task graph convolution network is further optimized by using a stochastic gradient descent method.
Optionally, the step S3 includes the following steps:
step S31, inputting the input foundation cloud images in batches into the task graph convolution network obtained through training;
and step S32, extracting the output of the feature fusion layer in the task graph convolution network as the fusion feature representation of the input foundation cloud image.
Optionally, the step S4 specifically includes:
inputting the fusion feature representation of each input foundation cloud image obtained in step S3, together with the label corresponding to that input foundation cloud image, into a support vector machine classifier, and training to obtain the foundation cloud classification model.
The invention has the following beneficial effects: the method integrates the graph convolution algorithm into a deep learning network through the task graph convolution network and learns the correlations between foundation cloud images from their similarities; it can learn the depth features of foundation clouds according to the classification task, effectively mining the label information of the foundation cloud images during feature learning; and by fusing the neural-network-based features with the graph-convolution-network-based features, the complementary information between them can be fully mined, improving the accuracy of foundation cloud classification.
It should be noted that this work was supported by the National Natural Science Foundation of China under Grant No. 61711530240, the Natural Science Foundation of Tianjin under Key Grants No. 19JCZDJC31500 and No. 17JCZDJC30600, the Young Researcher Training Program of Tianjin Normal University under Grant No. 135202RC1703, the Open Projects Program of the National Laboratory of Pattern Recognition under Grant No. 201800002, and the Tianjin Higher Education Creative Team Funds Program.
Drawings
Fig. 1 is a flowchart of a method for classifying a foundation cloud based on a task graph convolutional network according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
Fig. 1 is a flowchart of a method for classifying foundation clouds based on a task graph convolutional network according to an embodiment of the present invention, and as shown in fig. 1, the method for classifying foundation clouds based on a task graph convolutional network includes:
step S1, acquiring an input foundation cloud image, preprocessing the input foundation cloud image to obtain a preprocessed foundation cloud image, and using the preprocessed foundation cloud image as the input of a task graph convolution network;
wherein the step of preprocessing the input ground-based cloud image further comprises the steps of:
step S11, normalizing the input foundation cloud image to obtain a normalized image;
in an embodiment of the present invention, the original size of the input ground-based cloud image is 1024 × 1024, where two 1024 represent the height and width of the input ground-based cloud image respectively; the normalized foundation cloud image size is 252 × 252, where two 252 represent the height and width of the normalized foundation cloud image, respectively.
Step S12, horizontally flipping the normalized image to obtain a horizontally flipped image;
wherein horizontal flipping refers to flipping the image left and right about its vertical center line.
Step S13, randomly cropping the horizontally flipped image;
wherein random cropping refers to cropping a random window within a range not exceeding the image size.
In an embodiment of the present invention, the size of the horizontally flipped image is 252 × 252; random window cropping is performed within the image, with the upper and left boundaries of the window lying inside the image at a distance of no more than 28 pixels from the corresponding image boundaries, so that the resulting foundation cloud image has size 224 × 224, where the two values of 224 are the height and width of the cropped foundation cloud image.
Step S14, subtracting the corresponding preset RGB pixel mean from each RGB pixel value of the randomly cropped foundation cloud image to obtain the preprocessed foundation cloud image.
In an embodiment of the invention, the preset RGB pixel mean may be set to the mean of all foundation cloud images in the foundation cloud image training set over the RGB channels, where the size of each foundation cloud image is normalized to 224 × 224 before the RGB pixel mean is computed.
Step S2, inputting the preprocessed foundation cloud image into a task graph convolution network training model, and training to obtain a task graph convolution network;
further, the step S2 includes the following steps:
step S21, constructing a task graph convolution network, wherein the task graph convolution network comprises a graph feature matrix and adjacency matrix construction module, a graph representation learning module, a feature fusion layer and a classification module;
further, the step S21 includes the following steps:
step 211, constructing two sub-networks, namely sub-network I and sub-network II, in the graph feature matrix and adjacency matrix construction module, inputting the preprocessed foundation cloud image into sub-network I and sub-network II, and learning depth features of the preprocessed foundation cloud image, wherein the depth features learned by sub-network I are the convolutional-neural-network-based features of the preprocessed foundation cloud image, serve as one input of the feature fusion layer, and are also used to construct the graph feature matrix X, which is one input of the graph representation learning module; the depth features learned by sub-network II are used to construct the adjacency matrix A, which is the other input of the graph representation learning module;
wherein sub-network I is a residual network comprising five convolutional layers, where the first convolutional layer has kernel size $c_1 \times c_1$, stride $s_1$, and $n_1$ convolution kernels. The second to fifth layers consist of different numbers of residual blocks, each composed of K convolutional layers; the k-th convolutional layer of each residual block has kernel size $c_k \times c_k$, stride $s_k$, and $n_k$ convolution kernels, i.e., $n_k$ convolutional activation maps. A max pooling layer with kernel size $c_{max} \times c_{max}$ and stride $s_{max}$ follows the first convolutional layer, and an average pooling layer with kernel size $c_{avg} \times c_{avg}$ and stride $s_{avg}$ follows the last convolutional layer.
Sub-network II is also a residual network, with two fully connected layers of $M_1$ and $M_2$ neurons added on top of the structure of sub-network I, and a leaky rectified linear unit after the first fully connected layer.
In one embodiment of the present invention, the first convolutional layer of sub-network I has kernel size 7 × 7, stride 2, and 64 convolution kernels; the second to fifth layers consist of 3, 4, 6, and 3 residual blocks respectively, each residual block comprising 3 convolutional layers; within each residual block, the first and third convolutional layers have kernel size 1 × 1 and the second has kernel size 3 × 3, all with stride 1. The numbers of convolution kernels of the first to third convolutional layers of the residual blocks in the second layer are 64, 64, and 256 respectively; thereafter, the numbers of kernels of the first to third convolutional layers of the residual blocks in each subsequent layer are twice those of the previous layer. The max pooling layer has kernel size 3 × 3 and stride 2; the average pooling layer has kernel size 7 × 7 and stride 7. The two fully connected layers added to the structure of sub-network I in sub-network II have 256 and 7 neurons respectively.
In an embodiment of the present invention, the leaky rectified linear unit may be expressed as:

$$h(a) = \begin{cases} a, & a \ge 0 \\ \lambda a, & a < 0 \end{cases}$$

where h(a) is the output value after the leaky rectified linear unit is applied, a is its input value, and λ is the leakage coefficient.
In one embodiment of the present invention, λ is set to 0.2.
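As a concrete illustration, the two sub-networks could be sketched as follows, assuming the standard torchvision ResNet-50 (whose five stages and 3/4/6/3 residual blocks match the layer configuration above); the class name, the single shared trunk standing in for the parameter sharing, and all variable names are illustrative assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class SubNetworks(nn.Module):
    """Sub-networks I and II of the graph feature matrix and adjacency
    matrix construction module (illustrative sketch)."""

    def __init__(self, num_classes=7):
        super().__init__()
        backbone = resnet50()
        # Five convolutional stages plus max/avg pooling; a single trunk
        # stands in for the parameter sharing between the two sub-networks.
        self.shared = nn.Sequential(*list(backbone.children())[:-1])
        # Extra fully connected layers of sub-network II (2048 -> 256 -> 7),
        # with a leaky rectified linear unit (lambda = 0.2) after the first.
        self.fc1 = nn.Linear(2048, 256)
        self.lrelu = nn.LeakyReLU(0.2)
        self.fc2 = nn.Linear(256, num_classes)

    def forward(self, img):
        x = self.shared(img).flatten(1)  # sub-network I output: batch x 2048
        y = self.lrelu(self.fc1(x))      # sub-network II FC1 output: batch x 256
        logits = self.fc2(y)             # FC2 output, fed to the softmax loss
        return x, y, logits
```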
In one embodiment of the present invention, the output of sub-network I is a 2048-dimensional vector $x \in \mathbb{R}^{2048}$, which serves as one input of the feature fusion layer and is used to construct one input of the graph representation learning module: the graph feature matrix X. The output of the first fully connected layer of sub-network II is a 256-dimensional vector $y \in \mathbb{R}^{256}$, which is used to construct the other input of the graph representation learning module: the adjacency matrix A.
In one embodiment of the invention, sub-network I receives as input the same preprocessed foundation cloud image as sub-network II, and the parameters of the parts of the network structure common to sub-network I and sub-network II are shared. The output of the second fully connected layer of sub-network II is connected to a loss function that acts on a softmax function, where the softmax function can be expressed as:

$$p_m = \frac{\exp(z_m)}{\sum_{\tau=1}^{T} \exp(z_\tau)}$$

where T is the number of cloud types, $z_m$ is the output value of the neuron at the m-th position of the second fully connected layer, and $z_\tau$ is the output value of the neuron at the τ-th position.
The loss function is the cross-entropy function, which can be expressed as:

$$L_{cnn} = -\sum_{m=1}^{T} q_m \log p_m$$

where $q_m$ is the ground-truth probability: $q_m = 1$ when m is the true label, and $q_m = 0$ otherwise.
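For instance, the softmax and cross-entropy above can be checked numerically, here assuming T = 7 cloud types and an arbitrary true label:

```python
import torch
import torch.nn.functional as F

T = 7                                  # number of cloud types
z = torch.randn(1, T)                  # outputs z_m of the second FC layer
p = torch.softmax(z, dim=1)            # p_m = exp(z_m) / sum_tau exp(z_tau)
label = torch.tensor([3])              # q_m is one-hot at the true label
l_cnn = -torch.log(p[0, label])        # -sum_m q_m log(p_m)
assert torch.allclose(l_cnn, F.cross_entropy(z, label))
```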
Step 212, constructing the graph and the graph convolution layers in the graph representation learning module based on the graph feature matrix X and the adjacency matrix A, and learning the graph-convolution-based features of the preprocessed foundation cloud image with the graph representation learning module;
in an embodiment of the present invention, the constructed graph G = (V, E) is an undirected fully-connected graph, where V is a node set composed of N nodes and E is the set of connecting edges between nodes. In graph G, each node $v_i \in V$ represents the feature vector $x_i \in \mathbb{R}^{2048}$ of one foundation cloud image, namely the convolutional-neural-network-based depth feature learned by sub-network I. The graph feature matrix can then be expressed as:

$$X = [x_1, x_2, \ldots, x_N]^{\mathrm{T}} \in \mathbb{R}^{N \times 2048}$$

where each row of the graph feature matrix represents one node and 2048 is the number of feature channels of the graph feature matrix. The adjacency matrix is used to reflect the strength of the correlation between nodes and is constructed from $Y = [y_1, y_2, \ldots, y_N]^{\mathrm{T}} \in \mathbb{R}^{N \times 256}$, expressed as:

$$A = \sigma(YY^{\mathrm{T}})$$

where the dimension of the adjacency matrix is N × N and $y_i$ denotes the feature vector obtained by reducing the dimensionality of the feature vector $x_i$.
In one embodiment of the invention, σ(·) is the softmax function; writing $\hat{A} = YY^{\mathrm{T}}$, the activation value of each element of A can be expressed as:

$$A_{ij} = \frac{\exp(\hat{A}_{ij})}{\sum_{j'=1}^{N} \exp(\hat{A}_{ij'})}$$

where $\hat{A}_{ij}$ is the element in row i and column j of $\hat{A}$.
In one embodiment of the present invention, the square root of each element of the adjacency matrix A is taken to obtain the normalized adjacency matrix $\tilde{A}$.
In one embodiment of the present invention, N is 48.
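A minimal sketch of this construction, assuming a batch of N = 48 images and random stand-in features in place of the actual sub-network outputs:

```python
import torch

N = 48
X = torch.randn(N, 2048)          # graph feature matrix X (one row per node)
Y = torch.randn(N, 256)           # dimension-reduced features y_i from sub-network II

A_hat = Y @ Y.T                   # pairwise correlation scores, N x N
A = torch.softmax(A_hat, dim=1)   # sigma(.): row-wise softmax
A_norm = A.sqrt()                 # element-wise square root normalization
```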
In an embodiment of the present invention, the graph representation learning module has Z graph convolution layers, where the number of output feature channels of the l-th layer is $d_l$ and the parameters of the l-th layer are $W_l \in \mathbb{R}^{d_{l-1} \times d_l}$. The graph convolution operation of the l-th layer can be expressed as:

$$X_l = f(X_{l-1}, A)$$

where f(·) can be expressed as:

$$f(X_{l-1}, A) = h(\tilde{A} X_{l-1} W_l)$$

where h(·) is the leaky rectified linear unit and $X_{l-1}$ is the input of the l-th graph convolution operation.
In one embodiment of the present invention, the graph representation learning module has 3 graph convolution layers, whose numbers of output feature channels are 1024, 1024, and 512 respectively.
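Under the reconstruction above, the graph representation learning module could be sketched as follows; the class name and the uniform initialization range are assumptions:

```python
import torch
import torch.nn as nn

class GraphRepresentationLearning(nn.Module):
    """Z = 3 graph convolution layers, X_l = h(A_norm @ X_{l-1} @ W_l)."""

    def __init__(self, channels=(2048, 1024, 1024, 512), lam=0.2):
        super().__init__()
        # Weights only, initialized from a uniform distribution (range assumed).
        self.weights = nn.ParameterList([
            nn.Parameter(torch.empty(c_in, c_out).uniform_(-0.1, 0.1))
            for c_in, c_out in zip(channels[:-1], channels[1:])
        ])
        self.lrelu = nn.LeakyReLU(lam)

    def forward(self, X, A_norm):
        for W in self.weights:
            X = self.lrelu(A_norm @ X @ W)  # graph convolution of layer l
        return X                            # N x 512 graph-convolution features
```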
In one embodiment of the present invention, the output of the third graph convolution layer of the graph representation learning module, i.e., the output of the module, $X_3 \in \mathbb{R}^{N \times 512}$, gives the graph-convolution-based features of the preprocessed foundation cloud images and serves as the other input of the feature fusion layer. Specifically, each input foundation cloud image has a 512-dimensional graph-convolution-based feature vector.
Step 213, inputting the obtained feature of the preprocessed foundation cloud image based on the convolutional neural network and the feature based on the graph convolution into the feature fusion layer to obtain a fusion feature of the preprocessed foundation cloud image;
in an embodiment of the present invention, the feature fusion layer fuses the convolutional-neural-network-based features and the graph-convolution-based features of each preprocessed foundation cloud image by concatenation (series fusion), obtaining the fusion feature of each preprocessed foundation cloud image: a 2560-dimensional vector.
Step 214, constructing a classification module, wherein the classification module comprises two fully connected layers and a loss function.
In one embodiment of the present invention, the two fully connected layers of the classification module have 256 and 7 neurons respectively, and the output of the second fully connected layer is connected to the cross-entropy loss function $L_{gcn}$, which acts on a softmax function.
In an embodiment of the present invention, the loss function of the classification module can be expressed as:
$$L = L_{cnn} + L_{gcn}$$
step S22, initializing parameters of a graph feature matrix and adjacency matrix construction module, a graph representation learning module and a classification module in the task graph convolutional network to obtain a task graph convolutional network training model;
in one embodiment of the invention, the parameters of the graph feature matrix and adjacency matrix construction module and of the classification module comprise weights and biases; the weights are initialized from a standard normal distribution, and the biases are all initialized to zero. The parameters of the graph representation learning module contain only weights, which are initialized from a uniform distribution.
And step S23, inputting the preprocessed foundation cloud images in batches into sub-network I and sub-network II of the graph feature matrix and adjacency matrix construction module of the task graph convolution network training model for training, to obtain the task graph convolution network.
In an embodiment of the present invention, the task graph convolution network may be further optimized by using a Stochastic Gradient Descent (SGD) method.
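Putting the pieces together, one training step might look like the following sketch, reusing the illustrative SubNetworks and GraphRepresentationLearning classes above; the classifier head sizes and the loss L = L_cnn + L_gcn follow the embodiment, while the learning rate and momentum are assumed values:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

subnets = SubNetworks()
gcn = GraphRepresentationLearning()
classifier = nn.Sequential(nn.Linear(2560, 256), nn.Linear(256, 7))
params = [*subnets.parameters(), *gcn.parameters(), *classifier.parameters()]
optimizer = torch.optim.SGD(params, lr=0.01, momentum=0.9)  # assumed values

def train_step(images, labels):                # images: 48 x 3 x 224 x 224 batch
    X, Y, logits_cnn = subnets(images)         # CNN features, reduced features, FC2
    A_norm = torch.softmax(Y @ Y.T, dim=1).sqrt()
    F_gcn = gcn(X, A_norm)                     # graph-convolution features, 48 x 512
    fused = torch.cat([X, F_gcn], dim=1)       # feature fusion layer: 48 x 2560
    loss = F.cross_entropy(logits_cnn, labels) \
         + F.cross_entropy(classifier(fused), labels)  # L = L_cnn + L_gcn
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return fused.detach()                      # fusion feature representations (step S3)
```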
Step S3, extracting, based on the task graph convolution network, the convolutional-neural-network-based features, the graph-convolution-based features, and the fusion feature representation of each input foundation cloud image;
further, the step S3 includes the following steps:
step S31, inputting the input ground-based cloud images in batch into the trained task graph convolution network, that is, into sub-network I and sub-network II of its graph feature matrix and adjacency matrix construction module.
And step S32, extracting the output of the feature fusion layer in the task graph convolution network as the fusion feature representation of the input foundation cloud image.
In one embodiment of the invention, the fused feature representation of each input ground-based cloud image is a 2560-dimensional vector.
Step S4, training a support vector machine classifier according to the fusion feature representation of the input foundation cloud image to obtain a foundation cloud classification model;
the step S4 specifically includes:
inputting the fusion feature representation of each input foundation cloud image obtained in step S3, together with the label corresponding to that input foundation cloud image, into a support vector machine classifier, and training to obtain the foundation cloud classification model.
In an embodiment of the invention, the support vector machine classifier uses a radial basis kernel function.
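A sketch of this step with scikit-learn, using an RBF-kernel SVC on the 2560-dimensional fusion feature representations (the feature arrays here are random placeholders):

```python
import numpy as np
from sklearn.svm import SVC

train_feats = np.random.randn(480, 2560)   # fusion features from step S3 (placeholders)
train_labels = np.random.randint(0, 7, 480)

svm = SVC(kernel="rbf")                     # radial basis kernel function
svm.fit(train_feats, train_labels)          # step S4: train the classification model

test_feats = np.random.randn(10, 2560)      # fusion features of test images
pred = svm.predict(test_feats)              # step S5: classification results
```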
And step S5, acquiring the fusion feature representation of the test input foundation cloud image, and inputting the fusion feature representation into the foundation cloud classification model to obtain a classification result corresponding to the test input foundation cloud image.
The fusion feature representation of the test input foundation cloud image can be obtained according to the steps above.
In an application example of the invention, the foundation cloud database used was captured in China at different times and in different seasons with a fisheye-lens camera. Classifying based on the fusion feature representations extracted from the feature fusion layer achieves a foundation cloud image classification accuracy of 89.48%, which demonstrates the effectiveness of the method of the present invention.
In conclusion, the method can learn depth features according to the classification task and structural features according to the correlations between foundation cloud images; it can learn the convolutional-neural-network-based features and the graph-convolution-network-based features simultaneously in the same network, makes full use of their complementary information, effectively mines the correlation between them, extracts fusion features with higher discriminability, and improves the accuracy of foundation cloud classification.
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

Claims (10)

1. A foundation cloud classification method based on a task graph convolutional network is characterized by comprising the following steps:
step S1, acquiring an input foundation cloud image, preprocessing the input foundation cloud image to obtain a preprocessed foundation cloud image, and using the preprocessed foundation cloud image as the input of a task graph convolution network;
step S2, inputting the preprocessed foundation cloud image into a task graph convolution network training model, and training to obtain a task graph convolution network;
step S3, extracting, based on the task graph convolution network, the convolutional-neural-network-based features, the graph-convolution-based features, and the fusion feature representation of each input foundation cloud image;
step S4, training a support vector machine classifier according to the fusion feature representation of the input foundation cloud image to obtain a foundation cloud classification model;
and step S5, acquiring the fusion feature representation of the test input foundation cloud image, and inputting the fusion feature representation into the foundation cloud classification model to obtain a classification result corresponding to the test input foundation cloud image.
2. The method of claim 1, wherein the step of preprocessing the input ground-based cloud image in step S1 comprises the steps of:
step S11, normalizing the size of the input foundation cloud image to obtain a normalized image;
step S12, horizontally flipping the normalized image to obtain a horizontally flipped image;
step S13, randomly cropping the horizontally flipped image;
and step S14, subtracting the corresponding preset RGB pixel mean from each RGB pixel value of the randomly cropped foundation cloud image to obtain the preprocessed foundation cloud image.
3. The method according to claim 1 or 2, wherein the step S2 comprises the steps of:
step S21, constructing a task graph convolution network, wherein the task graph convolution network comprises a graph feature matrix and adjacency matrix construction module, a graph representation learning module, a feature fusion layer and a classification module;
step S22, initializing parameters of a graph feature matrix and adjacency matrix construction module, a graph representation learning module and a classification module in the task graph convolutional network to obtain a task graph convolutional network training model;
and step S23, inputting the preprocessed foundation cloud images into a graph characteristic matrix of the task graph convolution network training model and a sub-network I and a sub-network II of an adjacent matrix construction module in batches for training to obtain the task graph convolution network.
4. The method according to claim 3, wherein the step S21 comprises the steps of:
step 211, constructing sub-network I and sub-network II in the graph feature matrix and adjacency matrix construction module, inputting the preprocessed foundation cloud image into sub-network I and sub-network II, and learning depth features of the preprocessed foundation cloud image, wherein the depth features learned by sub-network I are the convolutional-neural-network-based features of the preprocessed foundation cloud image, serve as one input of the feature fusion layer, and are also used to construct the graph feature matrix X, which is one input of the graph representation learning module; the depth features learned by sub-network II are used to construct the adjacency matrix A, which is the other input of the graph representation learning module;
step 212, constructing the graph and the graph convolution layers in the graph representation learning module based on the graph feature matrix X and the adjacency matrix A, and learning the graph-convolution-based features of the preprocessed foundation cloud image with the graph representation learning module;
step 213, inputting the obtained feature of the preprocessed foundation cloud image based on the convolutional neural network and the feature based on the graph convolution into the feature fusion layer to obtain a fusion feature of the preprocessed foundation cloud image;
step 214, constructing a classification module, wherein the classification module comprises two fully connected layers and a loss function.
5. The method of claim 4, wherein sub-network I is a residual network comprising five convolutional layers, in which a max pooling layer follows the first convolutional layer and an average pooling layer follows the last convolutional layer; sub-network II is also a residual network, with two fully connected layers added on top of the structure of sub-network I and a leaky rectified linear unit after the first fully connected layer.
6. The method of claim 4, wherein the graph G = (V, E) constructed in the graph representation learning module is an undirected fully-connected graph, where V is a node set composed of N nodes and E is the set of connecting edges between nodes; and the graph representation learning module contains Z graph convolution layers.
7. The method of claim 4, wherein the feature fusion layer fuses convolution neural network-based features and graph convolution-based features of the preprocessed foundation cloud image in a serial fusion manner.
8. The method according to claim 3, wherein in step S23, the task graph convolution network is further optimized by using a stochastic gradient descent method.
9. The method according to claim 1, wherein the step S3 comprises the steps of:
step S31, inputting the input foundation cloud images in batches into the task graph convolution network obtained through training;
and step S32, extracting the output of the feature fusion layer in the task graph convolution network as the fusion feature representation of the input foundation cloud image.
10. The method according to claim 1, wherein the step S4 specifically includes:
inputting the fusion feature representation of each input foundation cloud image obtained in step S3, together with the label corresponding to that input foundation cloud image, into a support vector machine classifier, and training to obtain the foundation cloud classification model.
CN201911347193.7A 2019-12-24 2019-12-24 Foundation cloud classification method based on task graph convolutional network Active CN111191704B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911347193.7A CN111191704B (en) 2019-12-24 2019-12-24 Foundation cloud classification method based on task graph convolutional network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911347193.7A CN111191704B (en) 2019-12-24 2019-12-24 Foundation cloud classification method based on task graph convolutional network

Publications (2)

Publication Number Publication Date
CN111191704A true CN111191704A (en) 2020-05-22
CN111191704B CN111191704B (en) 2023-05-02

Family

ID=70707524

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911347193.7A Active CN111191704B (en) 2019-12-24 2019-12-24 Foundation cloud classification method based on task graph convolutional network

Country Status (1)

Country Link
CN (1) CN111191704B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112152806A (en) * 2020-09-25 2020-12-29 青岛大学 Cloud-assisted image identification method, device and equipment supporting privacy protection
CN112396123A (en) * 2020-11-30 2021-02-23 上海交通大学 Image recognition method, system, terminal and medium based on convolutional neural network
CN113963081B (en) * 2021-10-11 2024-05-17 华东师范大学 Image chart intelligent synthesis method based on graph convolution network

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107292256A (en) * 2017-06-14 2017-10-24 西安电子科技大学 Depth convolved wavelets neutral net expression recognition method based on secondary task
CN108108720A (en) * 2018-01-08 2018-06-01 天津师范大学 A kind of ground cloud image classification method based on depth multi-modal fusion
CN108416397A (en) * 2018-03-30 2018-08-17 华南理工大学 A kind of Image emotional semantic classification method based on ResNet-GCN networks
CN108629368A (en) * 2018-03-28 2018-10-09 天津师范大学 A kind of multi-modal ground cloud classification method based on combined depth fusion
US20190220746A1 (en) * 2017-08-29 2019-07-18 Boe Technology Group Co., Ltd. Image processing method, image processing device, and training method of neural network
CN110222611A (en) * 2019-05-27 2019-09-10 中国科学院自动化研究所 Human skeleton Activity recognition method, system, device based on figure convolutional network
CN110378381A (en) * 2019-06-17 2019-10-25 华为技术有限公司 Object detecting method, device and computer storage medium
CN110516723A (en) * 2019-08-15 2019-11-29 天津师范大学 A kind of multi-modal ground cloud atlas recognition methods based on the fusion of depth tensor

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107292256A (en) * 2017-06-14 2017-10-24 西安电子科技大学 Depth convolved wavelets neutral net expression recognition method based on secondary task
US20190220746A1 (en) * 2017-08-29 2019-07-18 Boe Technology Group Co., Ltd. Image processing method, image processing device, and training method of neural network
CN108108720A (en) * 2018-01-08 2018-06-01 天津师范大学 A kind of ground cloud image classification method based on depth multi-modal fusion
CN108629368A (en) * 2018-03-28 2018-10-09 天津师范大学 A kind of multi-modal ground cloud classification method based on combined depth fusion
CN108416397A (en) * 2018-03-30 2018-08-17 华南理工大学 A kind of Image emotional semantic classification method based on ResNet-GCN networks
CN110222611A (en) * 2019-05-27 2019-09-10 中国科学院自动化研究所 Human skeleton Activity recognition method, system, device based on figure convolutional network
CN110378381A (en) * 2019-06-17 2019-10-25 华为技术有限公司 Object detecting method, device and computer storage medium
CN110516723A (en) * 2019-08-15 2019-11-29 天津师范大学 A kind of multi-modal ground cloud atlas recognition methods based on the fusion of depth tensor

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
周明非 et al.: "Convolutional Neural Network Classification Model for High-Resolution Satellite Images" (高分辨卫星图像卷积神经网络分类模型)
王旭娇 et al.: "Deep Learning Point Cloud Classification Model Based on Graph Convolutional Network" (基于图卷积网络的深度学习点云分类模型)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112152806A (en) * 2020-09-25 2020-12-29 青岛大学 Cloud-assisted image identification method, device and equipment supporting privacy protection
CN112152806B (en) * 2020-09-25 2023-07-18 青岛大学 Cloud-assisted image identification method, device and equipment supporting privacy protection
CN112396123A (en) * 2020-11-30 2021-02-23 上海交通大学 Image recognition method, system, terminal and medium based on convolutional neural network
CN113963081B (en) * 2021-10-11 2024-05-17 华东师范大学 Image chart intelligent synthesis method based on graph convolution network

Also Published As

Publication number Publication date
CN111191704B (en) 2023-05-02

Similar Documents

Publication Publication Date Title
CN108985238B (en) Impervious surface extraction method and system combining deep learning and semantic probability
Zhang et al. Remote sensing image spatiotemporal fusion using a generative adversarial network
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
Zhang et al. Scene classification via a gradient boosting random convolutional network framework
CN108229589B (en) Foundation cloud picture classification method based on transfer learning
CN110516723B (en) Multi-modal foundation cloud picture identification method based on depth tensor fusion
CN105787501B (en) Power transmission line corridor region automatically selects the vegetation classification method of feature
CN111242227B (en) Multi-mode foundation cloud identification method based on heterogeneous depth features
CN104392228A (en) Unmanned aerial vehicle image target class detection method based on conditional random field model
CN108629368B (en) Multi-modal foundation cloud classification method based on joint depth fusion
Napoli et al. Simplified firefly algorithm for 2d image key-points search
Chini et al. Comparing statistical and neural network methods applied to very high resolution satellite images showing changes in man-made structures at rocky flats
CN111461006B (en) Optical remote sensing image tower position detection method based on deep migration learning
CN107832797A (en) Classification of Multispectral Images method based on depth integration residual error net
CN109508756B (en) Foundation cloud classification method based on multi-cue multi-mode fusion depth network
CN115527123B (en) Land cover remote sensing monitoring method based on multisource feature fusion
Bragilevsky et al. Deep learning for Amazon satellite image analysis
CN114943893A (en) Feature enhancement network for land coverage classification
CN115908924A (en) Multi-classifier-based small sample hyperspectral image semantic segmentation method and system
CN111191704A (en) Foundation cloud classification method based on task graph convolutional network
Li et al. An aerial image segmentation approach based on enhanced multi-scale convolutional neural network
CN116052016A (en) Fine segmentation detection method for remote sensing image cloud and cloud shadow based on deep learning
CN111695460A (en) Pedestrian re-identification method based on local graph convolution network
CN111784679A (en) Retaining wall crack identification method based on CNN and SVM
CN114266947A (en) Classification method and device based on fusion of laser point cloud and visible light image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant