CN111291818A - Non-uniform class sample equalization method for cloud mask - Google Patents


Info

Publication number
CN111291818A
Authority
CN
China
Prior art keywords
cloud
samples
mask
training
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010099382.3A
Other languages
Chinese (zh)
Other versions
CN111291818B (en)
Inventor
吴炜
高星宇
范菁
沈瑛
夏列钢
葛炜炜
Current Assignee
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN202010099382.3A
Publication of CN111291818A
Application granted
Publication of CN111291818B
Active legal status
Anticipated expiration legal status

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

A cloud-mask-oriented non-uniform class sample equalization method comprises the following steps: step 1: data preprocessing and sample acquisition; step 2: sample grouping; step 3: cloud mask model training; step 4: classifier masking and evaluation; step 5: iterative training; step 6: cloud masking; step 7: mask data post-processing. The sample equalization method effectively solves the sample imbalance caused by the uneven distribution of cloud types in remote sensing images, and achieves effective identification and segmentation of the various cloud types in the image, thereby improving cloud mask accuracy. By inputting samples selectively, the influence of missed or falsely detected samples is highlighted, the minority-class samples are reinforced, the features extracted by the deep learning model are effectively adjusted, and the problem of missed or false detection of minority cloud classes is alleviated.

Description

Non-uniform class sample equalization method for cloud mask
Technical Field
The invention relates to a non-uniform class sample equalization method, and in particular to a non-uniform class sample equalization method used in the deep-learning-based cloud mask training process.
Background
With the development of remote sensing technology, remote sensing images are widely applied in fields such as crop mapping and forest monitoring. However, clouds present during imaging cause data loss or distortion in the corresponding areas of the acquired image, which degrades the accuracy and usefulness of information extraction results. This problem is particularly prominent in cloudy and rainy areas such as southern China, where making use of partially cloud-covered images can improve data availability and strengthen the guarantee capability. For such partially degraded images, cloud masking, i.e., marking the cloud-affected pixels on the image pixel by pixel, is the foundation of many applications.
Cloud mask methods based on supervised learning can learn cloud characteristics from labeled samples and adjust the classifier parameters, and thus have the potential to achieve high cloud mask accuracy; feature design and classifier selection are the key factors determining classification accuracy. Common features (e.g., spectral, shape and texture features), machine learning methods (e.g., support vector machines and shallow neural networks) and their combinations have all been used for cloud masking (Landsat image cloud detection combining automatic cloud-cover assessment and weighted support vector machines [J]. Acta Geodaetica et Cartographica Sinica, 2014, 43(8): 848-854.). However, these features are designed by hand, and because different kinds of cloud have different characteristics, it is difficult to describe and distinguish all clouds accurately with one or a few features, which limits the applicability of such methods. The samples must therefore be reselected and the classifier model readjusted each time a cloud mask is generated, resulting in high time and financial cost and reducing the practicality of the algorithms.
In recent years, deep learning has been widely applied with great success in the field of computer vision, because it couples feature extraction and classifier optimization together for collaborative optimization, providing a new technical idea and scheme for cloud masking. One exemplary approach first aggregates similar pixels into superpixels with a simple linear iterative clustering algorithm and then uses the superpixels as input for cloud detection. At the same time, the convolution operation exploits not only the spectral characteristics of a single pixel but also the spatial characteristics formed by the pixels in the convolution window, and cloud masking directly on convolutional features helps improve masking accuracy (Xie F, Shi M, Shi Z, et al. Multilevel cloud detection in remote sensing images based on deep learning [J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2017, 10(8): 3631-). Such methods are simple and convenient to operate and have broad application prospects. From the above discussion it can be seen that cloud mask methods based on deep learning have already achieved high-accuracy masking.
However, the cloud is a composite category: according to cloud-base height, combined with information such as precipitation, temperature and back radiation, clouds can be divided into 10 genera in 3 families (low, middle and high clouds). Low clouds lie below 4 km and include types such as stratus, nimbostratus and cumulonimbus; middle clouds lie at 4-6 km and include altocumulus and altostratus; high clouds lie above 6 km and include cirrus, cirrocumulus and cirrostratus. FIG. 1 gives a schematic of the various clouds, from which it can be seen that the different types of cloud exhibit different, even very different, morphological and color characteristics, and that even the same type of cloud varies in its characteristics as its thickness, height and other properties vary. Meanwhile, the numbers of the various clouds also differ greatly and change with time and region, i.e., the categories are imbalanced (Macroscopic characteristics of various clouds over China and surrounding areas based on CloudSat data [J]. Acta Meteorologica Sinica, 2011, 69(5): 883-).
To identify the different types of cloud accurately, the deep learning model must learn the characteristics of each type. However, (1) the proportions of the different cloud types differ greatly and change with latitude, season and other factors, so samples with balanced proportions are difficult to obtain; and (2) the separability of a cloud is determined not only by the cloud itself but also by how distinguishable it is from the underlying surface (such as city, forest, water body or desert), with which it forms complex combinations. Because of these factors, the deep learning model cannot learn the characteristics of minority cloud classes well, causing clouds of the corresponding types to be missed.
In computer vision and related fields, one way to handle class imbalance is to select samples class by class so that the numbers of samples in the various classes stay relatively balanced. However, determining a cloud's type requires auxiliary information such as height, and relying on visual judgment alone is a challenging task even for professionals, so class-by-class sample selection is difficult to implement. To solve this problem, the invention proposes a sample equalization method based on grouped training: the current classification results are used to judge the classifier's masking accuracy on the various cloud types, and missed and falsely detected samples are selectively fed back, strengthening the classifier's sensitivity to such samples.
Disclosure of Invention
Aiming at the sample imbalance caused by non-uniform classes, the invention provides a sample equalization method that ensures the classifier can accurately identify the different types of cloud.
The principle of the method is to group the samples, train a relatively stable deep learning model on the first group, and then perform cloud masking and result evaluation on the next group; samples that the current classifier already identifies accurately are ignored, while falsely detected and missed samples are fed into the classifier for iterative training. This lowers the proportion of majority classes and raises the proportion of minority classes in the training pool, achieving sample equalization.
According to the principle, on the basis of obtaining an image to be masked, the cloud mask-oriented non-uniform type sample equalization method disclosed by the invention comprises the following implementation steps:
step 1: data preprocessing and sample acquisition;
the vegetation on the standard false color image is red, the distinguishability of various ground objects and clouds is large, and the standard false color mode is used for wave band synthesis.
Because the original images adopt different quantization bits, the same gray value has different meanings, and the gray value is compressed to an interval [1,255] by a percentage truncation mode, so that the images obtained by different sensors have similar colors.
Since the image is distributed in the form of scenes, the size of the image is much larger than the processing capacity of the computer, the image is sliced, and image blocks with the size h x w are output, and the number of the image blocks is M.
On the basis, a polygon of an area where the cloud is located on the image block is marked, and a remote sensing image cloud mask sample set G is manufactured, wherein the remote sensing image cloud mask sample set G comprises N image blocks.
Step 2: grouping samples;
sample G was divided into K groups of samples:
G={G1,G2,...,GK} (1)
wherein the sample subset G1 is used to pre-train the classifier and obtain a relatively robust deep learning model; the sample subsets {G2, ..., GK-1} are used for group-by-group testing and iterative sample selection; and the sample subset GK is used to measure accuracy.
Depending on requirements, the sample subsets may be allowed to intersect or be kept disjoint.
Step 3: training a cloud mask model;
After the training hyper-parameters are set, the sample subset G1 is used to train the deep learning model; the model trained on G1 is denoted C1.
Step 4: classifier masking and evaluation;
use of C1For G2The samples in (1) are predicted to obtain a mask result, and the mask result is then compared with G2The tag data in (1) are compared, thereby evaluating the classification result.
The cloud mask results can be classified into three types of correct T, error F and omission M. Wherein, the correct representation sample is marked as cloud, and the classification result is also cloud; the error indicates that the sample is non-cloud and the classifier is extracted as cloud; missing indicates that the sample is marked as a cloud, but the classifier is not extracted. On the basis, three evaluation indexes, namely average detection rate (AR), average accuracy rate (AP) and Overall Accuracy (OA), are selected to evaluate the mask accuracy:
AR = #T / (#T + #M) (2)
AP = #T / (#T + #F) (3)
OA = #T / (#T + #F + #M) (4)
where # denotes the number of pixels. Note that because the background pixels formed by clean ground objects far outnumber the cloud pixels, an overall accuracy computed over all pixels hardly reflects the real detection effect; these background pixels are therefore ignored when computing OA, which makes OA identical to the intersection-over-union (IoU).
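Following the definitions of T, F and M above (background pixels ignored, so OA coincides with IoU), the three indexes can be computed directly from pixel counts. This is an illustrative sketch; the function name is an assumption, not part of the invention:

```python
def mask_metrics(n_t, n_f, n_m):
    """Compute (AR, AP, OA) from pixel counts.

    n_t: correctly detected cloud pixels (T)
    n_f: non-cloud pixels wrongly extracted as cloud (F)
    n_m: cloud pixels the classifier missed (M)
    Background pixels are ignored, so OA here equals IoU.
    """
    ar = n_t / (n_t + n_m) if n_t + n_m else 0.0  # detection rate (recall)
    ap = n_t / (n_t + n_f) if n_t + n_f else 0.0  # accuracy rate (precision)
    oa = n_t / (n_t + n_f + n_m) if n_t + n_f + n_m else 0.0  # overall accuracy / IoU
    return ar, ap, oa
```

For example, 80 correctly detected pixels with 20 false and 20 missed pixels give AR = AP = 0.8 and OA = 80/120.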
Meanwhile, cloud masking and accuracy evaluation are performed on the sample group GK, and the accuracy is recorded as OA(GK, C1).
Step 5: iterative training;
for a block of cloud, those with both AR and AP greater than 0.5 are taken as correct samples, and others are taken as erroneous samples. Neglecting correct samples, adding error and missing samples, returning to step 3, and training to obtain new classifier C2(ii) a Then according to step 4, the sample G is processed3Evaluation was carried out.
The above process is repeated until all sample sets are exhausted or the classifier has stabilized, where classifier stabilization refers to the classification accuracy OA (G) of the ith timeK,Ci) Classification accuracy OA (G) with order i +1K,Ci+1) Is less than or equal to the tolerance e:
|OA(GK,Ci)-OA(GK,Ci+1)|≤e (5)
where i represents the number of iterations of the classifier.
The stable classifier obtained in the above steps is denoted CK-1.
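Steps 3-5 can be summarized as the following grouped-training loop. This is a schematic sketch: the `train` and `evaluate` callables, which must return a classifier and an (OA, hard-samples) pair respectively, are assumptions standing in for the Mask R-CNN training and evaluation of the invention.

```python
def grouped_training(groups, train, evaluate, tol=1e-3):
    """Grouped-training loop: pre-train on groups[0], consume groups[1:-1]
    one by one, and use the held-out groups[-1] for the stopping check.

    train(samples) -> classifier
    evaluate(clf, samples) -> (oa, hard_samples), where hard_samples are the
    missed and falsely detected samples of that group.
    """
    pool = list(groups[0])
    clf = train(pool)                       # relatively stable initial model
    prev_oa, _ = evaluate(clf, groups[-1])
    for g in groups[1:-1]:
        _, hard = evaluate(clf, g)          # correctly masked samples are ignored
        pool.extend(hard)                   # feed back only missed/false detections
        clf = train(pool)
        oa, _ = evaluate(clf, groups[-1])
        if abs(oa - prev_oa) <= tol:        # |OA_i - OA_{i+1}| <= e: stabilized
            break
        prev_oa = oa
    return clf
```

The tolerance `tol` plays the role of e in equation (5); the embodiment below uses an OA change of 0.1% as the stability indicator.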
Step 6: cloud masking;
using classifier model CK-1Test set G to be processedKAnd carrying out cloud masking and evaluating the cloud masking precision.
Step 7: mask data post-processing;
The mask data are post-processed: clouds forming connected regions smaller than T pixels are removed, and clean connected regions smaller than T pixels inside large clouds are filled.
Preferably, in step 1, standard false-color synthesis is performed with the near-infrared band mapped to red, the red band to green and the green band to blue, obtaining a standard false-color image; the image is reduced directly from 16 bits to 8 bits by percentage truncation: the image histogram is computed first, the gray value a at the 1% cumulative frequency and the gray value b at the 99% cumulative frequency are extracted, and [a, b] is linearly stretched to [1, 255];
the remote sensing images are cut into image blocks of 512 × 512 pixels to generate the data set required for training, 16000 image blocks being obtained in this way;
to make the cloud mask labels of the data set, ArcMap is used to annotate the clouds on the image blocks and produce mask labels with clear cloud boundaries, which are then converted to raster format.
Preferably, the sample grouping of step 2 is specifically: the first group of samples serves as a pre-training data set for training a relatively stable deep learning model; the second through second-to-last groups serve as accuracy-test samples, which are evaluated and then added for incremental training; the last group does not participate in training and is used solely to measure accuracy;
after image slicing according to step 1, 16000 image blocks are obtained in total: 6000 samples are used for pre-training, 6000 samples are divided into 6 groups for testing and incremental training, and 4000 samples are reserved for testing; the samples are thus divided into 8 groups whose sizes are (6000, 1000, 1000, 1000, 1000, 1000, 1000, 4000).
Preferably, the cloud mask model training of step 3 is specifically: for weight initialization, a ResNet101 model pre-trained on the COCO data set is used to initialize the Mask R-CNN network weights; for the hyper-parameters, the number of iterations is set to 160, batch_size to 2, pool_size to 7, mask_pool_size to 14, the initial learning rate to 0.001, and the learning-rate update coefficient to 0.9;
the first group of data from the step-2 grouping is input into the Mask R-CNN model and trained with these hyper-parameters, yielding a stable cloud mask model C1.
The invention has the following advantages: (1) the sample equalization method effectively solves the sample imbalance caused by the uneven distribution of cloud types in remote sensing images, achieving effective identification and segmentation of the various cloud types and improving cloud mask accuracy; (2) inputting samples selectively highlights the influence of missed and falsely detected samples, reinforces the minority-class samples, effectively adjusts the features extracted by the deep learning model, and alleviates missed and false detection of minority cloud classes.
Drawings
Fig. 1 is a cloud classification schematic.
FIG. 2 is a schematic flow chart of the cloud mask-oriented non-uniform class sample equalization method of the present invention.
FIG. 3 is a sample illustration of the invention. FIG. 3(a) is an original image; fig. 3(b) is artificial cloud mask data.
FIG. 4 is an exemplary diagram of the prediction results of the original Mask R-CNN model in the embodiment of the present invention: fig. 4(a) is an original image and fig. 4(b) is cloud mask data; FIG. 4(c) is the original mask result; fig. 4(d) shows mask result evaluation.
FIG. 5 shows results at different iterations of the iterative training in an embodiment of the invention: FIG. 5(a) is an original remote sensing image; FIG. 5(b) is the mask ground truth; FIGS. 5(c)-(i) are iterative masking results as samples from the successive input groups are added.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described below with reference to the accompanying drawings and embodiments. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention. Therefore, all other embodiments that can be obtained by a person skilled in the art without making any inventive step based on the embodiments of the present invention belong to the protection scope of the present invention.
It is assumed that an image set to be masked has been obtained in which parts of the images are occluded by cloud, causing data loss; the clouds in these regions need to be masked. In this embodiment, 24 scenes of Landsat 5 TM images, 20 scenes of Landsat 8 OLI images and 17 scenes of Sentinel-2 MultiSpectral Instrument images are selected. Because the invention performs segmentation mainly on the basis of the images' visual characteristics, the image types are not distinguished, and all images are processed as the same data.
Fig. 2 is a flowchart of the present invention, and a specific implementation method of the cloud mask-oriented non-uniform class sample equalization method of the present invention is as follows:
step 1: data preprocessing and sample acquisition;
because most images have near infrared, red and green wave bands, the near infrared wave band is corresponding to red, the red wave band is corresponding to green, and the green wave band is corresponding to blue to carry out standard false color synthesis to obtain a standard false color image. The vegetation on the image is red, and various ground objects are greatly different from clouds, so that the vegetation is high in separability.
Because the Sentinel-2 MSI and Landsat 8 OLI images are quantized with 16 bits and the same gray value therefore carries different meanings, and because the cloud mask is generated mainly from the images' visual characteristics, no complicated operations such as radiometric correction are performed; the images are reduced directly from 16 bits to 8 bits by percentage truncation. Concretely, the image histogram is computed first, the gray value a at the 1% cumulative frequency and the gray value b at the 99% cumulative frequency are extracted, and [a, b] is linearly stretched to [1, 255].
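The percentage truncation can be sketched as follows (a minimal NumPy sketch under the stated 1%/99% thresholds; the function name and the handling of a flat band are assumptions):

```python
import numpy as np

def percent_stretch(band, low=1.0, high=99.0):
    """Linearly stretch [a, b] (the 1%/99% percentile gray values) to [1, 255].

    Compresses a 16-bit band to 8 bits so that images from different
    sensors (e.g. Sentinel-2 MSI, Landsat 8 OLI) get similar colors.
    """
    a, b = np.percentile(band, [low, high])
    if b <= a:  # degenerate (flat) band: map everything to 1
        return np.ones_like(band, dtype=np.uint8)
    out = (band.astype(np.float64) - a) / (b - a) * 254.0 + 1.0
    return np.clip(np.round(out), 1, 255).astype(np.uint8)
```

Values below a clip to 1 and values above b clip to 255, which is the "truncation" part of the operation.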
Because remote sensing images are generally distributed as whole scenes, a single scene is too large in length and width to be fed directly into a deep learning model. The images are therefore cut into image blocks of 512 × 512 pixels to generate the data set required for training; 16000 image blocks are obtained in this way.
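The slicing step can be sketched as follows (an illustrative sketch; dropping incomplete edge tiles is an assumption, since the patent does not specify how scene-edge remainders are handled):

```python
import numpy as np

def slice_scene(image, tile=512):
    """Cut a scene-sized image (H x W x C) into non-overlapping tile x tile blocks.

    Edge remainders that do not fill a full tile are dropped in this sketch;
    padding them to full size would be an equally valid choice.
    """
    h, w = image.shape[:2]
    blocks = []
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            blocks.append(image[r:r + tile, c:c + tile])
    return blocks
```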
To make the cloud mask labels of the data set, ArcMap is used to annotate the clouds on the image blocks and produce mask labels with clear cloud boundaries, which are then converted to raster format. Sample results are shown in FIG. 3: FIG. 3(a) is an original image, and FIG. 3(b) is an example of the corresponding mask.
Step 2: grouping samples;
the invention provides a packet training strategy to solve the problem of unbalanced samples. The grouping principle is as follows: the first group of samples is used as a pre-training data set and is used for training a relatively stable deep learning model; taking the second group to the last but one group of samples as precision test samples, carrying out precision evaluation, and adding the samples for incremental training; the last group of samples does not participate in training and is used for detecting the precision independently.
After the image slicing is performed according to the step 1, a total of 16000 image blocks is obtained. The invention firstly uses 10000 samples for training, 6000 samples are divided into 6 groups for image test, and 4000 samples for test. The samples are divided into 8 groups, and the number of the samples in each group is (6000, 1000, 1000, 1000, 1000, 1000, 4000).
The samples may be grouped with or without repetition, and the number or proportion of samples in each group may likewise be adjusted as required.
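Under the group sizes given in this embodiment, a non-repeating grouping can be sketched as follows (the function name and the interpretation of the size tuple, i.e. one pre-training group, six iteration groups and one held-out test group, are assumptions):

```python
def split_groups(samples, sizes=(6000,) + (1000,) * 6 + (4000,)):
    """Split the sample list into consecutive groups with the given sizes.

    Default sizes follow the embodiment: 6000 pre-training samples,
    six groups of 1000 for iterative training, 4000 held out for testing.
    """
    if sum(sizes) != len(samples):
        raise ValueError("group sizes must cover all samples exactly")
    groups, start = [], 0
    for s in sizes:
        groups.append(samples[start:start + s])
        start += s
    return groups
```

A repeating grouping would simply draw each group from the full pool instead of slicing consecutively.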
Step 3: training a cloud mask model;
the Mask R-CNN and other example segmentation methods have good effects in various applications, so the implementation process of the patent is described by taking the Mask R-CNN as an example, and certainly, more advanced classifiers can be selected for cloud masking according to requirements.
In initializing weights, Mask R-CNN network weights are initialized using a pre-trained model of ResNet101 on the COCO dataset. In the super-parameter setting aspect, the iteration number is set to 160, batch _ size is set to 2, pool _ size is set to 7, mask _ pool _ size is set to 14, the initial learning rate is set to 0.001, and the learning rate update coefficient is set to 0.9.
The first group of data from the grouping is input into the Mask R-CNN model and trained with the above hyper-parameters, yielding a stable cloud mask model C1.
Step 4: grouped training and evaluation;
model C trained using step 31For sample subset G2And carrying out cloud mask to obtain a cloud mask result.
Then, for the packet GKPerforming cloud mask to obtain mask result, and calculating its overall accuracy OA (G)K,C1)。
Step 5: iterative training;
the results of the mask are compared with the labels and can be divided into three types, namely correct T, error F and missing M,
on the basis, evaluating AR and AP of each cloud object, adding the cloud objects smaller than 0.5 into the training set, and returning to the step 3 to perform retraining by using the new training set; and then carrying out step 4 to carry out cloud mask on the sample.
The above process is repeated until all the sample subsets have been used or until the test set G has been completedKThe classification precision of the method is stable.
The present embodiment uses the OA change of less than 0.1% before and after the two times as the indicator of the classifier stability.
FIG. 4 is an example of the masking results. FIG. 4(a) is an original input image; FIG. 4(b) is a masking result of the method.
FIG. 5 shows the iterative masking results, where FIG. 5(a) is the original image containing two different cloud types, thin cloud and thick cloud; FIG. 5(b) is the cloud mask ground truth; FIGS. 5(c)-(i) show the masking effect as the sample subsets are fed in sequence. Because thin-cloud samples are few, the early masking result of FIG. 5(c) contains many missed detections; as samples are added, the thin clouds are gradually detected and the classification accuracy improves.
Step 6: cloud masking;
using classifier model CK-1Test set G to be processedKAnd carrying out cloud masking to obtain a pixel-level cloud masking result.
Step 7: mask result post-processing;
and on the basis of obtaining the training model, carrying out cloud masking on the remote sensing image sample for testing to obtain a cloud masking result. Removing small communication clouds with less than 10 pixels generated in the cloud mask result; or a clean area of less than 10 pixels in the bulk communication area.
The embodiments described in this specification are merely illustrative of implementations of the inventive concept and the scope of the present invention should not be considered limited to the specific forms set forth in the embodiments but rather by the equivalents thereof as may occur to those skilled in the art upon consideration of the present inventive concept.

Claims (4)

1. A cloud-mask-oriented non-uniform class sample equalization method, comprising the following implementation steps:
step 1: data preprocessing and sample acquisition;
vegetation appears red on a standard false-color image and the various ground objects are easily distinguished from clouds, so band synthesis uses the standard false-color scheme;
because the original images use different quantization bit depths, the same gray value carries different meanings; the gray values are compressed to the interval [1, 255] by percentage truncation so that images obtained by different sensors have similar colors;
because the images are distributed as whole scenes whose size far exceeds the processing capacity of the computer, the images are sliced and image blocks of size h × w are output, the number of image blocks being M;
on this basis, polygons of the areas where clouds are located are marked on the image blocks, producing a remote sensing image cloud mask sample set G containing N image blocks;
step 2: grouping samples;
sample G was divided into K groups of samples:
G={G1,G2,...,GK} (1)
wherein the sample subset G1 is used to pre-train the classifier and obtain a relatively robust deep learning model; the sample subsets {G2, ..., GK-1} are used for group-by-group testing and iterative sample selection; and the sample subset GK is used to measure accuracy;
the sample subsets may be allowed to intersect or be kept disjoint as required;
and step 3: training a cloud mask model;
after the training hyper-parameters are set, the sample subset G1 is used to train the deep learning model; the model trained on G1 is denoted C1;
Step 4: classifier masking and evaluation;
use of C1For G2The samples in (1) are predicted to obtain a mask result, and the mask result is then compared with G2The tag data in (1) are compared, thereby evaluating the classification result;
cloud mask results can be divided into three types of correct T, error F and omission M; wherein, the correct representation sample is marked as cloud, and the classification result is also cloud; the error indicates that the sample is non-cloud and the classifier is extracted as cloud; omission indicates that the sample is marked as a cloud, but the classifier is not extracted; on the basis, three evaluation indexes, namely average detection rate (AR), average accuracy rate (AP) and Overall Accuracy (OA), are selected to evaluate the mask accuracy:
AR = #T / (#T + #M) (2)
AP = #T / (#T + #F) (3)
OA = #T / (#T + #F + #M) (4)
wherein # denotes the number of pixels; note that because the background pixels formed by clean ground objects far outnumber the cloud pixels, an overall accuracy computed over all pixels hardly reflects the real detection effect; these background pixels are therefore ignored when computing OA, which makes OA identical to the intersection-over-union (IoU);
meanwhile, cloud masking and accuracy evaluation are performed on the sample group GK, and the accuracy is recorded as OA(GK, C1);
Step 5: iterative training;
for a cloud, taking the samples with both AR and AP larger than 0.5 as correct samples, and taking the other samples as error samples; neglecting correct samples, adding error and missing samples, returning to step 3, and training to obtain new classifier C2(ii) a Then according to step 4, the sample G is processed3Carrying out evaluation;
the above process is repeated until all sample groups are exhausted or the classifier has stabilized, where classifier stabilization means that the difference between the i-th classification accuracy OA(GK, Ci) and the (i+1)-th classification accuracy OA(GK, Ci+1) is smaller than or equal to the tolerance e:
|OA(GK,Ci)-OA(GK,Ci+1)|≤e (5)
wherein i represents the number of iterations of the classifier;
the stable classifier obtained in the above steps is denoted CK-1;
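The incremental training of steps 3 to 5 can be sketched as a loop over the sample groups; here `train`, `select_hard` and `score` are placeholders we introduce for model training, hard-sample selection (error and omission samples) and OA(GK, ·) evaluation:

```python
def run_incremental_training(groups, train, select_hard, score, tol=0.01):
    """Incremental training over sample groups G1..GK.

    `train(pool)` fits a classifier on the current sample pool,
    `select_hard(clf, Gi)` returns the error and omission samples of
    group Gi (correct samples are ignored), and `score(clf, GK)`
    returns OA(GK, Ci). Training stops when all groups are used or
    |OA(GK, Ci) - OA(GK, Ci+1)| <= tol, as in Eq. (5).
    """
    GK = groups[-1]                  # last group: held out for testing
    pool = list(groups[0])           # G1: pre-training set
    clf = train(pool)                # C1
    prev_oa = score(clf, GK)         # OA(GK, C1)
    for Gi in groups[1:-1]:          # G2 .. G_{K-1}
        pool += select_hard(clf, Gi)
        clf = train(pool)            # C2, C3, ...
        cur_oa = score(clf, GK)
        if abs(cur_oa - prev_oa) <= tol:
            break                    # classifier has stabilized
        prev_oa = cur_oa
    return clf
```

A toy run with stub functions shows the stopping behavior: once two successive OA values on GK differ by at most `tol`, no further groups are consumed.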
Step 6: cloud mask;
the classifier model CK-1 is used to perform cloud masking on the test set to be processed, and the accuracy of the cloud mask is evaluated;
Step 7: post-processing of the mask data;
the mask data are post-processed: cloud connected regions smaller than T pixels are removed, and clean connected regions smaller than T pixels inside large clouds are filled.
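The post-processing amounts to two passes of connected-component filtering. A self-contained sketch using 4-connectivity (function names are ours; a real pipeline might use `scipy.ndimage.label` instead of the hand-rolled BFS):

```python
from collections import deque

import numpy as np

def _regions(mask):
    """Yield lists of (row, col) coordinates of 4-connected True regions."""
    seen = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    for r in range(h):
        for c in range(w):
            if mask[r, c] and not seen[r, c]:
                region, q = [], deque([(r, c)])
                seen[r, c] = True
                while q:
                    y, x = q.popleft()
                    region.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                yield region

def postprocess_mask(mask, t):
    """Remove cloud blobs smaller than t pixels, then fill clean
    (non-cloud) connected regions smaller than t pixels."""
    mask = np.asarray(mask, dtype=bool).copy()
    for region in _regions(mask):        # drop small cloud regions
        if len(region) < t:
            for y, x in region:
                mask[y, x] = False
    for region in _regions(~mask):       # fill small clean regions
        if len(region) < t:
            for y, x in region:
                mask[y, x] = True
    return mask
```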
2. The cloud mask-oriented non-uniform class sample equalization method as recited in claim 1, wherein: in step 1, standard false color synthesis is performed with the near-infrared band assigned to red, the red band to green and the green band to blue, obtaining a standard false color image; the image is reduced from 16 bits to 8 bits directly by percentage truncation: the image histogram is first computed, the gray value a at the cumulative 1% point and the gray value b at the cumulative 99% point are extracted, and [a, b] is then linearly stretched to [1, 255];
the remote sensing image is cut into image blocks of 512 × 512 pixels to generate the data set required for training; 16000 image blocks are obtained in total in this way;
to make cloud mask labels for the data set, ArcMap is used to annotate the clouds on the image blocks, producing mask labels with clear cloud boundaries, which are then converted into raster format.
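The false color synthesis and percentage truncation of claim 2 can be sketched as follows (function names are ours; the percentile cut points stand in for the 1% and 99% histogram points):

```python
import numpy as np

def percent_stretch_8bit(band, lo_pct=1.0, hi_pct=99.0):
    """Reduce a 16-bit band to 8 bits by percentage truncation: find
    the gray values a (1% point) and b (99% point) of the histogram
    and linearly stretch [a, b] to [1, 255]."""
    band = np.asarray(band, dtype=np.float64)
    a, b = np.percentile(band, [lo_pct, hi_pct])
    scaled = (band - a) / max(b - a, 1e-9) * 254.0 + 1.0
    return np.clip(np.round(scaled), 1, 255).astype(np.uint8)

def false_color_composite(nir, red, green):
    """Standard false color synthesis: NIR -> R, red -> G, green -> B."""
    return np.dstack([percent_stretch_8bit(b) for b in (nir, red, green)])
```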
3. The cloud mask-oriented non-uniform class sample equalization method as recited in claim 1, wherein the sample grouping of step 2 is specifically: the first group of samples serves as a pre-training data set for training a relatively stable deep learning model; the second to the penultimate groups serve as accuracy-test samples, which are evaluated for accuracy and then added for incremental training; the last group does not participate in training and is used independently to test accuracy;
image slicing is carried out according to step 1, obtaining 16000 image blocks in total; 6000 samples are first used for pre-training, 6000 samples are divided into 6 groups for incremental testing and training, and 4000 samples are used for testing; the samples are thus divided into 8 groups with sizes (6000, 1000, 1000, 1000, 1000, 1000, 1000, 4000).
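The grouping in claim 3 can be sketched directly (function name and the fixed shuffle seed are ours, for illustration):

```python
import random

# Group sizes from claim 3: one pre-training group, six incremental
# groups, one held-out test group (6000 + 6*1000 + 4000 = 16000).
GROUP_SIZES = (6000,) + (1000,) * 6 + (4000,)

def split_into_groups(samples, sizes=GROUP_SIZES, seed=0):
    """Shuffle the image blocks and split them into the 8 groups."""
    samples = list(samples)
    assert len(samples) == sum(sizes)
    random.Random(seed).shuffle(samples)
    groups, start = [], 0
    for n in sizes:
        groups.append(samples[start:start + n])
        start += n
    return groups
```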
4. The cloud mask-oriented non-uniform class sample equalization method as recited in claim 1, wherein the specific method for training the cloud mask model in step 3 is: for weight initialization, a ResNet101 model pre-trained on the COCO data set is used to initialize the Mask R-CNN network weights; for the hyper-parameter settings, the number of iterations is set to 160, batch_size to 2, pool_size to 7, mask_pool_size to 14, the initial learning rate to 0.001, and the learning rate update coefficient to 0.9;
the first group of data from the grouping of step 2 is input into the Mask R-CNN model and trained with these hyper-parameters, yielding a stable cloud mask model C1.
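The hyper-parameters of claim 4 collected in one place; the key names are ours (loosely modelled on common Mask R-CNN configurations), and the multiplicative learning-rate schedule is an assumption, since the claim does not specify when the update coefficient is applied:

```python
# Hyper-parameter settings from claim 4 (key names are illustrative).
MASK_RCNN_CONFIG = {
    "backbone": "resnet101",        # ResNet101 backbone
    "init_weights": "coco",         # initialize from a COCO pre-trained model
    "iterations": 160,
    "batch_size": 2,
    "pool_size": 7,
    "mask_pool_size": 14,
    "learning_rate": 0.001,         # initial learning rate
    "lr_update_coefficient": 0.9,
}

def learning_rate_at(step, cfg=MASK_RCNN_CONFIG):
    """Learning rate after `step` updates, assuming the update
    coefficient is applied multiplicatively at each update."""
    return cfg["learning_rate"] * cfg["lr_update_coefficient"] ** step
```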
CN202010099382.3A 2020-02-18 2020-02-18 Non-uniform class sample equalization method for cloud mask Active CN111291818B (en)

Publications (2)

Publication Number Publication Date
CN111291818A true CN111291818A (en) 2020-06-16
CN111291818B CN111291818B (en) 2022-03-18

Family

ID=71024534


Country Status (1)

Country Link
CN (1) CN111291818B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111898503A (en) * 2020-07-20 2020-11-06 中国农业科学院农业资源与农业区划研究所 Crop identification method and system based on cloud coverage remote sensing image and deep learning
CN113159146A (en) * 2021-04-08 2021-07-23 浙江天行健智能科技有限公司 Sample generation method, target detection model training method, target detection method and device
US11429070B2 (en) 2020-03-13 2022-08-30 Guangdong University Of Technology Inhomogeneous sample equalization method and system for product assembly process

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101980202A (en) * 2010-11-04 2011-02-23 西安电子科技大学 Semi-supervised classification method of unbalance data
CN102521656A (en) * 2011-12-29 2012-06-27 北京工商大学 Integrated transfer learning method for classification of unbalance samples
CN104951809A (en) * 2015-07-14 2015-09-30 西安电子科技大学 Unbalanced data classification method based on unbalanced classification indexes and integrated learning
CN106228130A (en) * 2016-07-19 2016-12-14 武汉大学 Remote sensing image cloud detection method of optic based on fuzzy autoencoder network
CN107610114A (en) * 2017-09-15 2018-01-19 武汉大学 Optical satellite remote sensing image cloud snow mist detection method based on SVMs
CN108460421A (en) * 2018-03-13 2018-08-28 中南大学 The sorting technique of unbalanced data
CN109934154A (en) * 2019-03-08 2019-06-25 北京科技大学 A kind of remote sensing image variation detection method and detection device
CN110378872A (en) * 2019-06-10 2019-10-25 河海大学 A kind of multi-source adaptive equalization transfer learning method towards crack image detection



Also Published As

Publication number Publication date
CN111291818B (en) 2022-03-18

Similar Documents

Publication Publication Date Title
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN109800736B (en) Road extraction method based on remote sensing image and deep learning
CN107016405B (en) A kind of pest image classification method based on classification prediction convolutional neural networks
CN111291818B (en) Non-uniform class sample equalization method for cloud mask
CN111274865A (en) Remote sensing image cloud detection method and device based on full convolution neural network
CN111310756B (en) Damaged corn particle detection and classification method based on deep learning
CN111611874B (en) Face mask wearing detection method based on ResNet and Canny
CN108257151B (en) PCANet image change detection method based on significance analysis
CN108960404B (en) Image-based crowd counting method and device
CN111639587B (en) Hyperspectral image classification method based on multi-scale spectrum space convolution neural network
CN109740483A (en) A kind of rice growing season detection method based on deep-neural-network
CN106897681A (en) A kind of remote sensing images comparative analysis method and system
CN104182985A (en) Remote sensing image change detection method
CN108171119B (en) SAR image change detection method based on residual error network
CN112200121A (en) Hyperspectral unknown target detection method based on EVM and deep learning
CN113312993B (en) Remote sensing data land cover classification method based on PSPNet
CN109886146B (en) Flood information remote sensing intelligent acquisition method and device based on machine vision detection
CN113887472A (en) Remote sensing image cloud detection method based on cascade color and texture feature attention
CN112258525A (en) Image abundance statistics and population recognition algorithm based on bird high frame frequency sequence
CN110458019B (en) Water surface target detection method for eliminating reflection interference under scarce cognitive sample condition
CN114266932A (en) Self-learning-based semi-supervised labeling method for remote sensing multispectral data
CN104732246B (en) A kind of semi-supervised coorinated training hyperspectral image classification method
CN115830514B (en) Whole river reach surface flow velocity calculation method and system suitable for curved river channel
CN111882573A (en) Cultivated land plot extraction method and system based on high-resolution image data
CN116563205A (en) Wheat spike counting detection method based on small target detection and improved YOLOv5

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant