CN117612031A

CN117612031A - Remote sensing identification method for abandoned land based on semantic segmentation

Info

Publication number: CN117612031A
Application number: CN202410086382.8A
Authority: CN
Inventors: 赵凌园; 杨博; 周旷; 罗梓菲; 张焰
Original assignee: Huantian Smart Technology Co ltd
Current assignee: Huantian Smart Technology Co ltd
Priority date: 2024-01-22
Filing date: 2024-01-22
Publication date: 2024-02-27

Abstract

The invention discloses a remote sensing identification method for abandoned land based on semantic segmentation, comprising the following steps: A1, acquiring an original remote sensing image of a region, and obtaining a permanent basic farmland image according to the permanent basic farmland delineation of the region; A2, acquiring abandoned-land map spots, keeping the scale and coordinate information of the abandoned-land map spots consistent with those of the permanent basic farmland image, cutting out the areas covered by the abandoned-land map spots on the permanent basic farmland image as abandoned-land samples, and taking the uncut areas as non-abandoned-land samples; A3, dividing the abandoned-land samples and the non-abandoned-land samples into a training set, a verification set and a test set; A4, constructing a semantic segmentation model, inputting the training set and the verification set into the semantic segmentation model together, and training and fine-tuning the parameters of the semantic segmentation model; and A5, obtaining the weights of the trained semantic segmentation model, loading them into the model, and running the model on the test set to obtain a binarized image of the test set. A model built in this way has significantly improved recognition capability and recognition accuracy for abandoned land.

Description

Remote sensing identification method for abandoned land based on semantic segmentation

Technical Field

The invention belongs to the technical fields of computer vision and remote sensing image segmentation, and particularly relates to a remote sensing identification method for abandoned land based on semantic segmentation.

Background

Since the start of the new century, grain demand has been rising steadily. If abandoned land can be used effectively, it provides a valuable resource for addressing food security. There is therefore a need for ways to identify and make use of such abandoned land.

Abandoned-land identification can accurately detect idle and uncultivated land by means such as satellite remote sensing. With this technology, the state of land use can be understood and targeted measures can be formulated so that the potential of the land is exploited as fully as possible. Current abandoned-land identification relies on the following technologies:

The first is remote sensing data analysis combined with time-series analysis. The patent application with publication number CN103914678A combines remote sensing data analysis with time-series analysis: it fuses feature information from multi-sensor, multi-resolution, multi-spectral and multi-temporal remote sensing data to build a combined solution based on texture features and vegetation indexes. The patent application with publication number CN116129284A uses long time-series remote sensing images, constructs index relationships among various types of vegetation, and establishes a sample library of cultivated land and abandoned land. Solutions based on remote sensing image analysis and time-series analysis typically need multispectral data together with imagery from multiple points in time. The approach is effective because multispectral data provides rich spectral information for analysing abandoned areas at the current stage; analysing these spectral features helps determine how the land is used and therefore helps identify abandoned land. Images collected in different periods, especially images of the same area at different time points, can greatly improve the generalization of the multispectral analysis. The model can then learn the general characteristics of abandoned land, so its results can be applied to abandoned-land detection tasks at different points in time. This transfer-learning approach maintains accuracy across time periods, so changes in abandoned land can be monitored effectively. However, the approach has drawbacks. First, acquiring multispectral data and images across multiple time points places high demands on image acquisition equipment and requires advanced remote sensing techniques and equipment. Second, collecting time-series image data over many years takes considerable time and resources, including image acquisition, processing and collation at different points in time. This time-consuming and labor-intensive process can hinder implementation of the solution.

The second is machine learning. The patent application with publication number CN112749628A builds several models to classify vegetation versus non-vegetation and, on that basis, cloud versus non-cloud features, from which the abandoned-land classification result is obtained. The patent application with publication number CN116343111A combines contrastive learning with semantic segmentation to identify abandoned land: contrastive learning divides regional images into several categories to distinguish the differences between them, and semantic segmentation then extracts the specific abandoned-land regions in the image. Machine-learning-based feature extraction methods generally adopt a semantic segmentation scheme for abandoned-land detection. Semantic segmentation assigns the pixels of different parts of an image to different semantic categories, so ground objects can be segmented accurately. However, semantic segmentation alone may not learn the characteristics of abandoned land adequately, because abandoned land can resemble other ground features, which leads to misclassification. To overcome this problem, existing methods adopt a feature analysis strategy and introduce feature engineering into the machine learning process to guide the semantic segmentation model and reduce misjudgment. The goal of this strategy is to let the model better understand the distinctive characteristics of abandoned land and thereby improve detection accuracy. The multi-model strategy improves accuracy well, but it complicates later service deployment.

Disclosure of Invention

The invention aims to provide a remote sensing identification method for abandoned land based on semantic segmentation, so as to solve the following technical problems described in the background:

existing schemes place high demands on the spectral characteristics of the remote sensing image, identify abandoned land poorly because of heavy interference in the image, and are cumbersome to deploy, which hinders updating and deployment of the model.

In order to solve the above technical problems, the invention adopts the following technical solution:

a remote sensing identification method for abandoned land based on semantic segmentation, comprising the following steps:

A1, acquiring an original remote sensing image of a region, and obtaining a permanent basic farmland image according to the permanent basic farmland delineation of the region;

A2, acquiring abandoned-land map spots, keeping the scale and coordinate information of the abandoned-land map spots consistent with those of the permanent basic farmland image, cutting out the areas covered by the abandoned-land map spots on the permanent basic farmland image as abandoned-land samples, and taking the uncut areas as non-abandoned-land samples;

A3, dividing the abandoned-land samples and the non-abandoned-land samples into a training set, a verification set and a test set;

A4, constructing a semantic segmentation model, and inputting the training set and the verification set into the semantic segmentation model together; introducing re-parameterized convolution into the semantic segmentation model, and merging the multiple re-parameterized branches into one branch in the inference stage;

the data $I \in \mathbb{R}^{C \times H \times W}$ is input into the feature extraction network built for the semantic segmentation model to obtain feature sets F_i of different scales;

where I represents an input sample picture, R represents the real number domain, C represents the number of channels of the input picture, and H and W represent the height and width of the input sample picture;

the formula for obtaining the training-stage feature F_i is:

the formula for obtaining the inference-stage feature F_i is:

where Conv(I) represents the feature map obtained by applying a convolution operation to the input feature map I; BN(I) represents the feature map obtained by applying batch normalization to the input feature map I; f(Conv(I)) represents applying the activation function to the feature map obtained by the convolution operation;

training and fine-tuning semantic segmentation model parameters;

A5, obtaining the weights of the trained semantic segmentation model, loading them into the model, and running the model on the test set to obtain a binarized image of the test set.

Further, remote sensing images corresponding to the abandoned map spots are segmented by using remote sensing image processing software.

Further, the training set accounts for 60%-80% of the total data, the verification set accounts for 10%-20% of the total data, and the test set accounts for 10%-20% of the total data.

Further, the semantic segmentation model adopts an FCNHead with coordinate convolution; the coordinate convolution is introduced into a fully connected network to learn the difference between local features and global features;

first, the feature scales are unified; the conversion formula for F_i is as follows:

where F_m represents the unified-scale feature map of F_i, and DeConv() represents a deconvolution operation;

the unified-scale feature F_m is input into the neural network layer CoordConv to obtain the feature F_c with coordinate information:

$F_c = \mathrm{Cat}(F_m, C_x, C_y)$

where F_c represents the output feature map, C_x represents the feature map of x-coordinate information corresponding to the input feature map F_m, C_y represents the feature map of y-coordinate information corresponding to the input feature map F_m, Cat represents the splicing operation, and the three feature maps F_m, C_x and C_y are spliced along the channel dimension;

F_c is mapped by convolution to obtain the abandoned-land map spots M; the calculation formula of M is as follows:

where Sigmoid represents the Sigmoid activation function.
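The coordinate splicing described above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `coord_channels` and `add_coords` are hypothetical, and scaling the coordinates to [-1, 1] follows common CoordConv practice, since the text does not say how C_x and C_y are scaled.

```python
def coord_channels(height, width):
    """Build C_x and C_y as H x W nested lists (assumed range: [-1, 1])."""
    def norm(index, size):
        # A single row or column has no extent, so place it at the centre.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    c_x = [[norm(col, width) for col in range(width)] for _ in range(height)]
    c_y = [[norm(row, height) for _ in range(width)] for row in range(height)]
    return c_x, c_y


def add_coords(f_m):
    """Return Cat(F_m, C_x, C_y): f_m is a list of channels, each H x W."""
    height, width = len(f_m[0]), len(f_m[0][0])
    c_x, c_y = coord_channels(height, width)
    # Splicing along the channel dimension appends two extra channels.
    return f_m + [c_x, c_y]


# Two feature channels of size 2 x 3 become four channels after splicing.
f_m = [[[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]],
       [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]
f_c = add_coords(f_m)
```

A convolution applied to `f_c` then sees the position of every pixel in addition to its features, which is the property the CoordConv layer relies on.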

Further, in the Conv inference process, the BN layer parameters are fused with the Conv parameters; the specific formula is as follows:

where ReLU represents the ReLU activation function, and F represents the final output feature map.

Further, inputting the multilayer features extracted by the FCNHead into a fully-connected network with coordinate convolution to obtain a segmentation result of the abandoned map spots.

Further, the trained semantic segmentation model is converted into a TensorRT format.

Compared with the prior art, the invention has the following beneficial effects:

By acquiring the original remote sensing image of the region and obtaining the permanent basic farmland image according to the permanent basic farmland delineation of the region, the invention reduces the interference of background information when training the abandoned-land recognition model. Because permanent basic farmland is clearly delimited by the government as farmland that may not be converted to non-agricultural use or used for other non-agricultural production activities, it has a clear association with abandoned land. This lets the model focus more on the features of abandoned land that relate to permanent basic farmland.

By keeping the scale and coordinate information of the abandoned-land map spots consistent with those of the permanent basic farmland image, and cutting out the areas covered by the map spots on the permanent basic farmland image as abandoned-land samples, the invention obtains the abandoned-land map spots accurately. This improves the recognition capability and recognition accuracy of the model for abandoned land.

Compared with the prior art, the invention only needs common optical images, can reduce partial noise interference by preprocessing the images, and can improve the identification precision of the abandoned land by utilizing a semantic segmentation model, thereby effectively reducing the deployment cost.

In the inference stage, the invention merges the multiple re-parameterized branches into one branch, which reduces the computational complexity of the model and increases inference speed. In the training stage, multiple branches help the model learn features better; in the inference stage, the branches are fused into one to reduce computation and memory usage, which makes the model lighter and eases later updating and deployment.

Detailed Description

The following description of the technical solutions in the embodiments of the present invention will be clear and complete, and it is obvious that the described embodiments are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Embodiment one.

Specifically, the original remote sensing image of the region is a satellite image. After it is acquired, its coordinates are checked against the permanent basic farmland map-spot file, the coordinate information of the image is corrected, and the image size is adjusted to match the map-spot file. Finally, all farmland map spots are screened out using the permanent basic farmland map-spot file, and the resulting image is used as the training data sample, so as to improve the accuracy and stability of subsequent processing.

It should be noted that permanent basic farmland is farmland, within a range clearly delimited by the national, regional or local government, that may not be converted to non-agricultural use or used for other non-agricultural production activities, in order to maintain farmland production and to ensure food security and ecological balance. This delimitation reflects the importance of sustainable agricultural development and of protecting agricultural resources and the ecological environment. Because abandoned land originates from permanent basic farmland, the permanent basic farmland map obtained from the regional delineation reduces the interference of the remaining background information when training the abandoned-land recognition model, that is, it acts as noise reduction, so that the model focuses more on the features of abandoned land that relate to permanent basic farmland.

Specifically, the abandoned-land map spots can be obtained by segmenting the remote sensing image. The map spots are registered with the permanent basic farmland image so that both have the same scale and coordinate information, which ensures that the correspondence between map spots and image remains correct during subsequent cutting and analysis. The map spots and the image are cut according to the positions and boundaries of the map spots to obtain the abandoned-land map spots and the corresponding image areas; the cut map spots and image should have the same scale and coordinate information. The cut image areas are divided into abandoned-land samples and non-abandoned-land samples according to map-spot coverage: areas covered by a map spot are treated as abandoned-land samples, and uncovered areas as non-abandoned-land samples. Abandoned-land samples may be labeled 1 and non-abandoned-land samples 0.
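The labeling rule above can be illustrated with a small sketch. It assumes, purely for brevity, that each registered map spot is an axis-aligned pixel box `(row0, col0, row1, col1)` with exclusive upper bounds; real map spots are polygons, and the helper name `label_mask` is hypothetical.

```python
def label_mask(height, width, spots):
    """Return an H x W mask: 1 where a map spot covers the pixel, else 0."""
    mask = [[0] * width for _ in range(height)]
    for row0, col0, row1, col1 in spots:
        # Clip each box to the image so out-of-range spots are ignored safely.
        for row in range(max(row0, 0), min(row1, height)):
            for col in range(max(col0, 0), min(col1, width)):
                mask[row][col] = 1
    return mask


# One map spot covering rows 1-2 and columns 1-3 of a 4 x 5 farmland tile.
mask = label_mask(4, 5, [(1, 1, 3, 4)])
```

The mask shares the pixel grid of the farmland image, which is why the map spots and the image must first be registered to the same scale and coordinates.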

the data $I \in \mathbb{R}^{C \times H \times W}$ is input into the feature extraction network built for the semantic segmentation model to obtain feature sets F_i of different scales;

where I represents an input sample picture, R represents the real number domain, C represents the number of channels of the input picture, and H and W represent the height and width of the input sample picture. The input feature map I can be understood as a three-dimensional tensor in which each element corresponds to a pixel of the feature map, and the number of channels C represents the different feature information contained in the feature map. In deep learning, the input feature map I is the input to the neural network; it passes through a series of operations such as convolution, pooling and activation to produce the output of the network.

The formula for obtaining the training-stage feature F_i is:

The formula for obtaining the inference-stage feature F_i is:

The parameters of the semantic segmentation model are then trained and fine-tuned;

Specifically, the semantic segmentation model is a deep learning model that assigns each pixel of an image to a class, achieving pixel-level recognition and segmentation of the different objects in the image. In this scheme, a semantic segmentation model is constructed, and the training set and the verification set are input into the model together for training. The training set is used to train the model parameters, and the verification set is used to evaluate the performance of the model on unseen data. Fine-tuning the semantic segmentation model parameters means continuing to train an already trained model so that it adapts to a new task or dataset. In this scheme, a trained semantic segmentation model can serve as the base model and be fine-tuned on the abandoned-land map-spot dataset to improve its ability to recognize abandoned-land map spots. The training set and the verification set are input together so that the model can be validated during training, its performance monitored, and the hyperparameters adjusted in time to improve generalization. This helps prevent the model from overfitting the training data and gives a clearer picture of its performance on the verification set.

Introducing re-parameterized convolution into the semantic segmentation model enlarges the receptive field of the convolution operation, so the model can better capture global information in the input data. The re-parameterized convolution expands the receptive field by introducing intervals between convolution kernels, which improves the model's understanding of the input data; for semantic segmentation in particular, semantic information at different scales in the image is captured better.

Merging the multiple re-parameterized branches into one branch in the inference stage reduces the computational complexity of the model and increases inference speed. In the training stage, multiple branches help the model learn features better; in the inference stage, the branches are fused into one to reduce computation and memory usage, which makes the model lighter and better suited to deployment in practical applications.
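Why the branches can be merged without changing the output can be shown with a minimal sketch. It assumes each training branch is a convolution followed by batch normalization and uses 1x1 convolutions evaluated at a single pixel to stay short; the branch layout and all function names are illustrative assumptions, not the formulas of this method, which are not reproduced in the text. Batch normalization with fixed statistics is an affine map, so it folds into the convolution weights, and parallel convolutions on the same input add up to one convolution.

```python
import math


def conv1x1(x, weight, bias):
    """1x1 convolution at one pixel: x has one value per input channel."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def conv_bn(x, weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Training-style branch: convolution followed by batch normalization."""
    y = conv1x1(x, weight, bias)
    return [(yi - m) / math.sqrt(v + eps) * g + b
            for yi, g, b, m, v in zip(y, gamma, beta, mean, var)]


def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold the BN statistics into the convolution weight and bias."""
    fused_w, fused_b = [], []
    for o, row in enumerate(weight):
        scale = gamma[o] / math.sqrt(var[o] + eps)
        fused_w.append([w * scale for w in row])
        fused_b.append((bias[o] - mean[o]) * scale + beta[o])
    return fused_w, fused_b


def merge_branches(folded):
    """Sum folded (weight, bias) pairs into one equivalent convolution."""
    merged_w = [row[:] for row in folded[0][0]]
    merged_b = folded[0][1][:]
    for weight, bias in folded[1:]:
        for o, row in enumerate(weight):
            merged_b[o] += bias[o]
            for i, w in enumerate(row):
                merged_w[o][i] += w
    return merged_w, merged_b


# Two hypothetical branches: (weight, bias, gamma, beta, mean, var).
branch_a = ([[1.0, 2.0], [0.5, -1.0]], [0.1, 0.2], [1.5, 0.8], [0.0, 0.3], [0.2, -0.1], [1.0, 4.0])
branch_b = ([[-0.5, 0.0], [2.0, 1.0]], [0.0, -0.4], [1.0, 1.2], [0.5, 0.0], [0.0, 0.6], [0.25, 2.0])
x = [1.0, -2.0]

multi = [a + b for a, b in zip(conv_bn(x, *branch_a), conv_bn(x, *branch_b))]
single = conv1x1(x, *merge_branches([fold_bn(*branch_a), fold_bn(*branch_b)]))
```

Here `multi` and `single` agree up to floating-point rounding, while the merged form needs only one convolution at inference time.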

Preferably, the divided datasets are preprocessed into the input format required by the model, and data augmentation such as rotation, scaling and cropping is applied to the data.

Further preferably, the preprocessed data is input into the FCNHead with re-parameterized convolution, and multi-scale features are obtained through multiple convolution layers.

The multi-scale features are input into the deconvolution to unify the feature scales, which facilitates subsequent segmentation; finally, the unified-scale features are passed to a CoordFCN, which can learn coordinate information, to obtain the abandoned-land map spots predicted by the model.

Because abandoned land occurs within permanent basic farmland and occupies only a small area, after the data is cut the map spots containing abandoned land are far fewer than those without. To balance the classes the model learns, the loss weights are set in proportion, so that the loss weight of the map spots without abandoned land is suppressed.
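One possible way to set such proportional loss weights is sketched below. The inverse-frequency rule and the weighted binary cross-entropy are assumptions chosen for illustration; the text states only that the weights are set in proportion, and the function names are hypothetical.

```python
import math


def inverse_frequency_weights(labels):
    """Return (w_neg, w_pos) so that both classes carry equal total weight."""
    total = len(labels)
    positives = sum(labels)
    negatives = total - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    return total / (2.0 * negatives), total / (2.0 * positives)


def weighted_bce(probs, labels, w_neg, w_pos, eps=1e-7):
    """Mean binary cross-entropy with a separate weight per class."""
    loss = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        loss -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p)
    return loss / len(labels)


# One abandoned sample among four: the majority class is down-weighted.
labels = [1, 0, 0, 0]
w_neg, w_pos = inverse_frequency_weights(labels)
loss = weighted_bce([0.5, 0.5, 0.5, 0.5], labels, w_neg, w_pos)
```

With these weights the single abandoned sample contributes as much to the loss as the three non-abandoned samples together.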

Specifically, the trained model weights are loaded with the corresponding deep learning framework, such as TensorFlow or PyTorch, and the test-set images are input into the weighted model for prediction. The test set should contain images similar to those of the training set so that the model can make accurate predictions on new data. The prediction is a pixel-level classification in which each pixel is assigned to one of two categories, abandoned land or other ground features. By setting a threshold, a binary image is obtained in which abandoned land is marked 1 and other ground features are marked 0.
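The thresholding step can be sketched as follows. The example assumes the model outputs one raw score (logit) per pixel, which a Sigmoid turns into a probability; the threshold of 0.5 and the function names are illustrative assumptions, since the text does not give a threshold value.

```python
import math


def sigmoid(z):
    """Numerically stable Sigmoid for a single score."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binarize(logits, threshold=0.5):
    """Mark a pixel 1 (abandoned land) if its probability reaches the threshold."""
    return [[1 if sigmoid(z) >= threshold else 0 for z in row]
            for row in logits]


# A 2 x 2 score map from a hypothetical model output.
binary_image = binarize([[2.0, -2.0], [0.0, -0.1]])
```

Raising the threshold trades missed abandoned pixels for fewer false alarms, so in practice it would be tuned on the verification set.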

In a preferred embodiment, remote sensing image processing software is used to segment the remote sensing image corresponding to the abandoned-land map spots to obtain the abandoned-land map spots.

In a preferred embodiment, the training set is 60%-80% of the total data, the verification set is 10%-20% of the total data, and the test set is 10%-20% of the total data.
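A split within these ranges can be sketched as follows. The default 70/15/15 fractions, the fixed random seed and the function name `split_samples` are illustrative choices; only the percentage ranges come from the text.

```python
import random


def split_samples(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle samples and split them into training, verification and test sets."""
    test_frac = 1.0 - train_frac - val_frac
    tol = 1e-9  # tolerate floating-point rounding at the range limits
    if not 0.6 - tol <= train_frac <= 0.8 + tol:
        raise ValueError("training fraction must be within 60%-80%")
    if not 0.1 - tol <= val_frac <= 0.2 + tol:
        raise ValueError("verification fraction must be within 10%-20%")
    if not 0.1 - tol <= test_frac <= 0.2 + tol:
        raise ValueError("test fraction must be within 10%-20%")

    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


train_set, val_set, test_set = split_samples(range(100))
```

Shuffling before slicing keeps abandoned and non-abandoned samples mixed across the three sets; a stratified split would be a natural refinement when abandoned samples are rare.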

In a preferred embodiment, the semantic segmentation model employs FCNHead with a coordinate convolution, which is introduced into a fully connected network, learning the differences between local and global features;

where F_m represents the unified-scale feature map of F_i, and DeConv() represents a deconvolution operation;

The semantic segmentation model adopts the segmentation head module of a CoordFCN. The input multi-scale features are expanded to the same size by deconvolution, and the stacked features are compressed by convolution; at the same time, CoordConv is introduced into the fully connected network to learn the relation between local features and global features, which makes it easier to distinguish crops from weeds and shrubs.

wherein Sigmoid represents a Sigmoid activation function.

In a preferred embodiment, in the Conv inference process, the BN layer parameters are fused with the Conv parameters; the specific formula is as follows:

In a preferred embodiment, the multi-layer features extracted by the FCNHead are input into a fully connected network with coordinate convolution to obtain the segmentation result of the abandoned-land map spots. The specific flow is as follows:

The FCNHead is used to extract the multi-layer features of the image.

The multi-layer features are spliced with the corresponding coordinate information to obtain a feature map with position information.

The feature map with position information is input into the fully connected network with coordinate convolution for processing.

The output is the segmentation result of the abandoned-land map spots.

The flow can fully utilize the position information and combine the expression capability of the fully connected network, thereby improving the accuracy and the robustness of the segmentation result.

In a preferred embodiment, the trained semantic segmentation model is converted to the TensorRT format, thereby enabling end-to-end accelerated inference on a GPU server.

Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims

1. A remote sensing identification method for abandoned land based on semantic segmentation, characterized by comprising the following steps:

the formula for obtaining the training-stage feature F_i is:

the formula for obtaining the inference-stage feature F_i is:

training and fine-tuning semantic segmentation model parameters;

2. The remote sensing identification method for abandoned land based on semantic segmentation as defined in claim 1, wherein: the remote sensing image corresponding to the abandoned-land map spots is segmented using remote sensing image processing software.

3. The remote sensing identification method for abandoned land based on semantic segmentation as defined in claim 1, wherein: the training set accounts for 60%-80% of the total data, the verification set accounts for 10%-20% of the total data, and the test set accounts for 10%-20% of the total data.

4. The remote sensing identification method for abandoned land based on semantic segmentation as defined in claim 1, wherein: the semantic segmentation model adopts an FCNHead with coordinate convolution, the coordinate convolution is introduced into a fully connected network, and the difference between local features and global features is learned;

where F_m represents the unified-scale feature map of F_i, and DeConv() represents a deconvolution operation;

where F_c represents the output feature map, C_x represents the feature map of x-coordinate information corresponding to the input feature map F_m, C_y represents the feature map of y-coordinate information corresponding to the input feature map F_m, Cat represents the splicing operation, and the three feature maps F_m, C_x and C_y are spliced along the channel dimension;

where Sigmoid represents the Sigmoid activation function.

5. The remote sensing identification method for abandoned land based on semantic segmentation as defined in claim 4, wherein: in the Conv inference process, the BN layer parameters are fused with the Conv parameters, and the specific formula is as follows:

6. The remote sensing identification method for abandoned land based on semantic segmentation as defined in claim 4, wherein: the multi-layer features extracted by the FCNHead are input into a fully connected network with coordinate convolution to obtain the segmentation result of the abandoned-land map spots.

7. The remote sensing identification method for abandoned land based on semantic segmentation as defined in claim 1, wherein: the trained semantic segmentation model is converted into the TensorRT format.