CN113705326A - Urban construction land identification method based on full convolution neural network - Google Patents

Urban construction land identification method based on full convolution neural network

Info

Publication number
CN113705326A
CN113705326A (application CN202110750300.1A)
Authority
CN
China
Prior art keywords
network
classification
construction land
neural network
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110750300.1A
Other languages
Chinese (zh)
Other versions
CN113705326B (en)
Inventor
官冬杰
殷博灵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Jiaotong University
Original Assignee
Chongqing Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Jiaotong University filed Critical Chongqing Jiaotong University
Priority to CN202110750300.1A priority Critical patent/CN113705326B/en
Publication of CN113705326A publication Critical patent/CN113705326A/en
Application granted granted Critical
Publication of CN113705326B publication Critical patent/CN113705326B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention relates to an urban construction land identification method based on a full convolution neural network, and belongs to the field of urban informatization. The method comprises the following steps: S1: network optimization and verification; S2: urban construction land identification under the full convolution neural network. In view of the strong image representation capability, feature learning capability and good end-to-end performance of the full convolution neural network, the invention designs a full convolution neural network model by improving the FCN network structure and applies it to construction-land extraction from low-resolution grayscale remote sensing images.

Description

Urban construction land identification method based on full convolution neural network
Technical Field
The invention belongs to the field of urban informatization, and relates to an urban construction land identification method based on a full convolution neural network.
Background
Mastering the direction, area and speed of urban spatial expansion is a necessary reference for discovering the laws of urban development and constructing urban spatial patterns, and accurately judging urban land expansion is a basic premise for decision makers to formulate measures. Lacking a quantitative reference for urban space may lead to errors in government decisions and further aggravate a series of conflicts. At present, extracting urban land use from remote sensing satellite imagery is the most common way to grasp urban expansion change quickly and comprehensively. However, urban construction land is highly complex and heterogeneous; traditional classification methods focus on visual features, and in low-resolution satellite imagery construction land and non-construction land generally show similar reflectance spectra and texture, so images processed by traditional methods have low classification accuracy. Benefiting from the development of computer technology and the support of massive data, the invention proposes an urban construction land extraction model based on the full convolution neural network (FCN). Remote sensing satellite images of low spatial resolution are taken as the classification object, the extraction rate and accuracy of large-scale urban construction land spatial distribution are improved by improving the FCN structure, and an effective method is provided for acquiring data on urban space at different scales.
End-to-end deep learning algorithms for land use/cover classification mainly comprise two kinds: patch-based convolutional neural networks (CNNs) and the full convolution neural network. Patch-based CNN classification takes "picture labels" as the classification basis: the input image is divided into many equal-sided square patches, a CNN model predicts the label of each patch centre, and the class labels are then ordered to generate and output a two-dimensional classification map. For example, Huang et al. constructed an improved STDCNN to classify land use in the Hong Kong and Shenzhen regions and obtained a more practical land use classification map, and Martins et al. proposed a multi-scale, object-based CNN method for large-area land cover classification of high-resolution images. However, the dimensions of the classification result output by this approach may differ from those of the original image; artifacts arise between adjacent patches when the model runs and the classification efficiency is low; and the patches input to CNNs are often inconsistent with real objects, causing excessive expansion/contraction of edges and geometric distortion of the classification result. By comparison, the full convolution neural network performs semantic segmentation of the image at the pixel level, can keep a two-dimensional image output structure, requires fewer training samples and less computation time, and is more capable in satellite image classification. After Long et al. applied the FCN to the PASCAL VOC-2012 dataset and obtained extremely high classification accuracy, faster and more accurate remote sensing image classification through the full convolution neural network became possible. FCNs excel in complex classification: for example, Fariba designed an FCN based on SAR images to classify complex land cover ecosystems, Wurm used the transfer learning ability of FCNs to perform high-level semantic segmentation of slums in remote sensing images, and Persello used FCNs to delineate the fields of smallholder farms from satellite images. In addition, the application fields of FCNs keep widening, with important breakthroughs in the semantic segmentation of three-dimensional spatial data, land cover extraction from historical imagery (aerial black-and-white photographs) and other research directions.
The end-to-end character of FCNs and their excellent performance in image classification have made them widely used for detecting urban change at the city, regional and even global scale, but existing work mostly takes large cities as the analysis object, and research on extracting the urban land distribution of small and medium-sized cities is not yet deep enough. On the other hand, existing neural network classification research mainly focuses on classifying typical land use types from limited datasets, and accurately extracting complex urban construction land expressed at different scales remains a challenge: there are differences between large and small cities, the data accessibility and image characteristics of satellite remote sensing imagery are limited, and when urban features are extracted with FCN methods, the feature information presented in the network differs for cities of different grades. Regional sustainable development concerns not only the rational planning of large cities but also the rational development of small and medium-sized cities and coordinated development among cities. The shortcomings of FCNs can be improved in two ways: on the one hand, neural networks can be combined with traditional methods or other networks to optimize the classification and recognition results; on the other hand, network performance can be optimized by changing the network structure. Facing the difficulty of extracting large-scale urban construction land from low-resolution grayscale remote sensing images, the invention designs a full convolution neural network model; considering the availability of data and the consistency of the data space, low-resolution remote sensing images are adopted as source data, FCN8s is taken as the basic network framework, the front-end network and the deconvolution layer structure are improved, and parameters are debugged to optimize the feature extraction capability of the network, so as to obtain higher-precision basic data on urban construction land.
Disclosure of Invention
In view of the above, the present invention provides a method for identifying urban construction land based on a full convolution neural network.
In order to achieve the purpose, the invention provides the following technical scheme:
a city construction land identification method based on a full convolution neural network comprises the following steps;
s1: optimizing a network;
s2: and identifying urban construction land under the full convolutional neural network.
Optionally, the S1 specifically includes:
s11: collecting a sample;
dividing the study area into 400 sub-areas; comparing the ratio of image area to blank area in the 400 sub-areas and taking the areas in which the remote sensing image accounts for more than 90% as sampling areas, 228 sampling areas in total; extracting 2-8 training samples from each sampling area; selecting 1017 nighttime-light NPP-VIIRS remote sensing image feature samples in total, each sample being 224 × 224 pixels, and randomly splitting the collected samples into training samples and test samples at a ratio of 21:4;
s12: optimizing network parameters;
the ReLU function is used as the activation function, and initial weight values are assigned with the Xavier random initialization method, which accelerates the convergence of the deep network, avoids gradient vanishing and suppresses network oscillation; the initialization is

$$W \sim U\left[-\frac{\sqrt{6}}{\sqrt{n_i+n_{i+1}}},\ \frac{\sqrt{6}}{\sqrt{n_i+n_{i+1}}}\right]$$

where n_i denotes the number of neurons in layer i, W denotes the weight, and U denotes the uniform distribution, whose variance is 2/(n_i + n_{i+1});

Adam is selected as the network weight optimizer, which improves the training rate of stochastic gradient descent and avoids the gradient-vanishing problem:

$$m_t=\beta_1 m_{t-1}+(1-\beta_1)g_t,\qquad n_t=\beta_2 n_{t-1}+(1-\beta_2)g_t^{2}$$

$$\hat{m}_t=\frac{m_t}{1-\beta_1^{t}},\qquad \hat{n}_t=\frac{n_t}{1-\beta_2^{t}},\qquad \Delta\theta_t=-\gamma\,\frac{\hat{m}_t}{\sqrt{\hat{n}_t}+\tau}$$

where m_t is the estimate of the first moment of the gradient g_t, n_t is the estimate of the second moment, m̂_t is the correction of m_t, n̂_t is the correction of n_t, γ is the learning rate, and τ ensures that the denominator is not 0;

a regularization layer and dropout (random inactivation) are added in the forward propagation stage; through the combination of local responses and multiple neurons, the generalization error is reduced, the generalization capability of the network is enhanced, gradient vanishing is avoided, the convergence speed is improved and overfitting is suppressed;
s13: optimizing a network structure;
the network structure of FCN8s is modified to meet the segmentation requirements; in the deconvolution stage, skip connections are added, the output features of several layers in the deconvolution structure are fused with the output features of pooling layer 4, pooling layer 3 and pooling layer 2 respectively, and a crop layer is introduced and its offset corrected, which optimizes the city boundary detection capability and improves the classification accuracy; in the forward propagation stage, a regularization layer is introduced after the first two pooling layers, the learning-rate coefficients and bias coefficients of the first six convolutional layers are reduced to improve the learning capability of the shallow layers and optimize the extraction of spatial position information, and the Xavier initialization method is applied to assign initial values to all convolutional-layer weights of the front-end network;
s14: a network hyper-parameter debugging strategy;
in the hyper-parameter debugging strategy, parameters are tested in three respects: the initial learning rate, the weight decay coefficient and the dropout ratio; the learning rate controls the convergence speed and convergence trajectory of the network: the lower its value, the slower the convergence and the smoother the trajectory, and vice versa; the weight decay coefficient and the dropout ratio affect the generalization and fitting capability of the neural network; all three affect the classification accuracy of the network; the mean intersection-over-union is used as the image segmentation metric for judging whether the hyper-parameter debugging is reasonable, and the formula is:

$$MIoU=\frac{1}{n+1}\sum_{i=0}^{n}\frac{p_{ii}}{\sum_{j=0}^{n}p_{ij}+\sum_{j=0}^{n}p_{ji}-p_{ii}}$$

where p_ii is the number of correctly classified pixels, n+1 is the number of classification categories, and Σ_j p_ij and Σ_j p_ji are the total numbers of pixels whose true class is i and whose predicted class is i, respectively;

the learning rate is debugged from a minimum value of 1 × 10⁻⁸ up to a maximum value of 0.01, increasing by a factor of 10³ each time; in the debugging of the momentum term, values are assigned from 0.90 to 0.999 in increments of 0.05; and the dropout ratio is debugged at 50% and 25% respectively; under these three debugging strategies, the optimal combination obtained from the tests is used as the final hyper-parameter values of the FCN model; the network is trained with repeated iterations under a 'fixed' learning-rate policy; comprehensively analysing the rate of change, mean and maximum of the accuracy, a learning rate of 1 × 10⁻⁵, a momentum term of 0.95 and a dropout ratio of 50% are taken as the optimal combination of the hyper-parameters of the FCN model;
s15: network precision verification;
training the constructed IFCN model under the Caffe-GPU deep learning framework; 1017 NPP-VIIRS nighttime-light image samples are used to train the network, split into training and test samples at a ratio of 21:5 and input into the network for training; after 200,000 iterations training is complete and the mean intersection-over-union reaches 97.59%; the mean intersection-over-union rises rapidly before 5,000 iterations, showing that the gradient of the network decreases markedly during training without gradient vanishing or oscillation; the curve levels off after 5,000 iterations and the accuracy value stops changing after 180,000 iterations, showing that the network has finished fitting and reached its optimal performance; the loss value falls first quickly and then slowly, decreasing rapidly before 25,000 iterations and then slowly approaching 0, standing at 592 after 200,000 iterations; the network meets the network debugging requirements and the model application benchmark;
optionally, the S2 specifically includes:
s21: extracting results from urban construction land in the economic zone of Yangtze river;
inputting nighttime-light images of the Yangtze River economic zone into the trained IFCN to obtain a binary classification result: construction land and non-construction land; classifying the same areas of the verification region with a support vector machine classifier and an object-oriented threshold classification method; the boundary details of cities obtained by IFCN and threshold classification are more prominent, and both identify non-construction land inside cities better; compared with threshold classification, IFCN detects the boundary details of small-area cities better and segments roads and water bodies better; in terms of the spatial pattern of the classification results, IFCN has the better interpretation effect;

the construction-land extraction performance of the full convolution neural network on large-scale, low-resolution grayscale images is assessed through quantitative accuracy evaluation; when evaluating the pixel classification accuracy of the network, the Kappa coefficient and the F1-score are used to measure the accuracy of the three classifiers: IFCN classification, SVM classification and threshold classification;

in descending order, the Kappa coefficients and F1-scores of the three classifiers rank IFCN > threshold classification > SVM;
s22: verifying the performance of the model;
the improvements to the front-end structure and the deconvolution structure of the network enable IFCN to obtain better results in urban construction land identification and boundary extraction; to verify the performance of the improved network structure, the improved model is compared with common full convolution neural network models; FCN32s, FCN16s and FCN8s are selected as comparison objects, and the training samples are input into the three networks for training with the same parameters; after the same number of iterations, a city in the Yangtze River economic zone is taken as the detection object and input into the trained models; the semantic segmentation results show that FCN32s and FCN16s identify construction-land boundaries poorly, with coarse boundary details and severe misclassification compared with the other methods; FCN8s misclassifies relatively little, but the connectivity of its segmentation results is poor and its urban edge detection capability is weak, so common full convolution neural networks are not suitable for semantic segmentation of low-resolution grayscale images; compared with the three common FCNs, IFCN retains the characteristics of suburban areas, predicts the urban edge morphology better, and stands out in pixel-level identification of construction land and non-construction land.
The invention has the beneficial effects that: in view of the strong image representation capability, feature learning capability and good end-to-end performance of the full convolution neural network, the invention designs a full convolution neural network model by improving the FCN network structure and applies it to construction-land extraction from low-resolution grayscale remote sensing images.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a sample selection criterion;
FIG. 2 is a sample label;
FIG. 3 is a network architecture;
FIG. 4 shows the network debug results;
FIG. 5 shows the results of the three classifiers;
FIG. 6 is a network interpretation result;
fig. 7 is a deconvolution heatmap for each network structure.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are provided for the purpose of illustrating the invention only and not for limiting it; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and they do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings, and descriptions thereof, may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by terms such as "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not an indication or suggestion that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes, and are not to be construed as limiting the present invention, and the specific meaning of the terms may be understood by those skilled in the art according to specific situations.
1 network optimization and verification
The pre-trained full convolution neural network model is excellent in remote sensing satellite image classification. The classifier idea is applied to a deep learning algorithm, and a full convolution neural network model with high classification precision is designed, so that the rapid and accurate extraction of the earth surface information based on the low-resolution and large-scale remote sensing image is realized.
1.1 sample Collection
To ensure the sufficiency and coverage of training-sample selection for the large-scale remote sensing image dataset and to avoid over-fitting (or under-fitting) of the model, the invention extracts feature samples with a checkerboard sample-selection method based on ArcGIS software. The main process is as follows: first, the study area is divided into 400 sub-areas; second, the ratio of image area to blank area in the 400 sub-areas is compared, and the areas in which the remote sensing image accounts for more than 90% are taken as sampling areas, 228 in total; finally, 2-8 training samples are taken from each sampling area. In this study, 1017 feature samples of nighttime-light (NPP-VIIRS) remote sensing image data are selected, each sample measuring 224 × 224 pixels, and the collected samples are randomly split into training and test samples at a ratio of 21:4. The sample selection criteria are shown in FIG. 1.
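As an illustration of this sampling and splitting procedure, the following minimal Python sketch filters sub-areas by image coverage and performs the random 21:4 split. The function names, the synthetic coverage values and the patch file names are hypothetical, and the actual ArcGIS-based patch extraction is not shown.

```python
import random

def select_sampling_regions(coverage_by_region, min_coverage=0.9):
    """Keep only sub-areas whose remote-sensing image coverage exceeds the threshold."""
    return [rid for rid, cov in coverage_by_region.items() if cov > min_coverage]

def split_train_test(samples, train_parts=21, test_parts=4, seed=0):
    """Randomly split samples into training and test sets at the given ratio (21:4 here)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_parts / (train_parts + test_parts))
    return shuffled[:n_train], shuffled[n_train:]

# coverage_by_region maps a sub-area id (0..399) to a synthetic image-coverage ratio.
coverage_by_region = {i: random.Random(i).random() for i in range(400)}
sampling_regions = select_sampling_regions(coverage_by_region)
samples = [f"patch_{r}_{k}.tif" for r in sampling_regions for k in range(2)]
train, test = split_train_test(samples)
print(len(sampling_regions), len(train), len(test))
```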
Accurate calibration of label boundaries is the basis for ensuring the classification accuracy of the full convolution neural network model. This work uses the global 30 m land-cover dataset FROM-GLC (2017) of the National Earth System Science Data Center (a National Science and Technology Infrastructure platform of China) as the reference for producing sample labels, ensuring label accuracy, and collects the labels with the Labelme annotation tool so that they meet the application standard for full convolution neural network labels in the Caffe deep learning environment. To ensure consistency between the label dataset and the sample dataset, all samples and labels are resampled to 224 × 224 pixels. The sample labels are shown in FIG. 2.
1.2 network parameter optimization
The invention adopts the ReLU function as the activation function and assigns initial weight values with the Xavier random initialization method, which accelerates the convergence of the deep network, avoids gradient vanishing and suppresses network oscillation. The initialization is

$$W \sim U\left[-\frac{\sqrt{6}}{\sqrt{n_i+n_{i+1}}},\ \frac{\sqrt{6}}{\sqrt{n_i+n_{i+1}}}\right]$$

where n_i denotes the number of neurons in layer i, W denotes the weight, and U denotes the uniform distribution, whose variance is 2/(n_i + n_{i+1}).
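A minimal Python sketch of the Xavier uniform initialization and ReLU activation described above, assuming arbitrary layer sizes; it simply checks that the empirical weight variance matches 2/(n_i + n_{i+1}).

```python
import numpy as np

def xavier_uniform(n_in, n_out, rng=None):
    """Xavier/Glorot uniform initialization:
    W ~ U[-sqrt(6/(n_in+n_out)), +sqrt(6/(n_in+n_out))], so Var(W) = 2/(n_in+n_out)."""
    rng = rng or np.random.default_rng(0)
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

def relu(x):
    """ReLU activation used in the forward pass."""
    return np.maximum(0.0, x)

# Initialize one (hypothetical) layer and compare the empirical variance with 2/(n_in+n_out).
W = xavier_uniform(512, 256)
print(W.var(), 2.0 / (512 + 256))
print(relu(np.array([-1.0, 0.5])))
```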
The corrected nighttime-light satellite images still contain noise, and the VGGNet structure used as the basic network of the model architecture is deep, with numerous parameters, so the choice of network optimizer is critical in this context. The Adam (adaptive moment estimation) first-order optimization algorithm is well suited to high-noise network training problems, its hyper-parameters can be interpreted intuitively, it reduces the parameter-tuning effort, and empirical results also show that Adam outperforms other algorithms. Therefore, Adam is selected as the network weight optimizer, improving the training rate of stochastic gradient descent and avoiding the gradient-vanishing problem:
$$m_t=\beta_1 m_{t-1}+(1-\beta_1)g_t,\qquad n_t=\beta_2 n_{t-1}+(1-\beta_2)g_t^{2}$$

$$\hat{m}_t=\frac{m_t}{1-\beta_1^{t}},\qquad \hat{n}_t=\frac{n_t}{1-\beta_2^{t}},\qquad \Delta\theta_t=-\gamma\,\frac{\hat{m}_t}{\sqrt{\hat{n}_t}+\tau}$$

where m_t is the estimate of the first moment of the gradient g_t, n_t is the estimate of the second moment, m̂_t is the correction of m_t, n̂_t is the correction of n_t, γ is the learning rate, and τ ensures that the denominator is not 0.
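The Adam update can be sketched in Python as below. The toy objective and the moment-decay coefficients β₁ = 0.9 and β₂ = 0.999 are assumptions for illustration; γ and τ follow the notation above.

```python
import numpy as np

def adam_step(theta, grad, m, n, t, lr=1e-5, beta1=0.9, beta2=0.999, tau=1e-8):
    """One Adam update: moment estimates, bias correction, parameter step.
    lr corresponds to gamma and tau keeps the denominator away from 0."""
    m = beta1 * m + (1.0 - beta1) * grad           # first-moment estimate m_t
    n = beta2 * n + (1.0 - beta2) * grad ** 2      # second-moment estimate n_t
    m_hat = m / (1.0 - beta1 ** t)                 # bias-corrected m_t
    n_hat = n / (1.0 - beta2 ** t)                 # bias-corrected n_t
    theta = theta - lr * m_hat / (np.sqrt(n_hat) + tau)
    return theta, m, n

# Toy usage: minimize f(theta) = theta**2 for a single scalar parameter.
theta, m, n = 5.0, 0.0, 0.0
for t in range(1, 1001):
    theta, m, n = adam_step(theta, 2.0 * theta, m, n, t, lr=0.05)
print(round(theta, 4))
```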
In addition, a regularization layer and dropout (random inactivation) are added in the forward propagation stage; through the combination of local responses and multiple neurons, the generalization error is reduced, the generalization capability of the network is enhanced, gradient vanishing is avoided, convergence is accelerated and overfitting is suppressed.
1.3 network architecture optimization
Thanks to the maturity of the transfer learning technology, the invention improves the FCN8s network front end and the deconvolution process, improves the characteristic information extraction capability of the network, and constructs an improved full convolution neural network model (IFCN) suitable for extracting large-scale urban construction land.
Low-spatial-resolution remote sensing images are limited by the amount of information they carry: the heatmaps obtained by training conventional FCNs contain little shallow classification information, so local information is severely lost and the deep classification information is prone to errors, and the construction-land classification result obtained by deconvolving such an information-poor fusion map has blurred boundaries. The shallow feature information of an FCN contains more image detail and position information, while the deep feature information is more stable but coarse; for remote sensing images of different resolutions, feature maps at different levels extract the satellite image differently. Fusing the multi-level feature maps of the network can therefore optimize its extraction of image information.
Preserving local information is important in semantic segmentation, so the invention improves the network structure of FCN8s to meet the segmentation requirements. In the deconvolution stage, skip connections are added: the output features of several layers in the deconvolution structure are fused with the output features of pooling layer 4, pooling layer 3 and pooling layer 2 respectively, and a crop layer is introduced and its offset corrected, which optimizes the city boundary detection capability and improves the classification accuracy. In the forward propagation stage, a regularization layer is introduced after the first two pooling layers, the learning-rate coefficients and bias coefficients of the first six convolutional layers are reduced to improve the learning capability of the shallow layers and optimize the extraction of spatial position information, and the Xavier initialization method is applied to assign initial values to all convolutional-layer weights of the front-end network. The specific network architecture is shown in FIG. 3.
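The following PyTorch sketch illustrates this kind of decoder: three skip connections fusing upsampled deep predictions with the outputs of pooling layers 4, 3 and 2, plus a reduced learning rate on the shallow encoder layers. It is a structural illustration under assumed channel sizes and learning rates, not the patented Caffe implementation; the crop layer is approximated here by size-matched bilinear interpolation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IFCNSketch(nn.Module):
    """Structural sketch of an FCN8s-style decoder with three skip connections
    (pool4, pool3, pool2). Channel sizes follow a VGG-like encoder but are assumed."""

    def __init__(self, num_classes=2):
        super().__init__()
        in_ch = 1  # single-band (grayscale) night-light input
        self.stages = nn.ModuleList()
        for out_ch in (64, 128, 256, 512, 512):
            self.stages.append(nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.MaxPool2d(2, 2)))
            in_ch = out_ch
        # 1x1 "score" layers feeding the skip connections from pool2/pool3/pool4
        # and the deepest stage.
        self.score5 = nn.Conv2d(512, num_classes, 1)
        self.score4 = nn.Conv2d(512, num_classes, 1)
        self.score3 = nn.Conv2d(256, num_classes, 1)
        self.score2 = nn.Conv2d(128, num_classes, 1)

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)  # feats[i] is the output of pooling layer i+1
        pool2, pool3, pool4, pool5 = feats[1], feats[2], feats[3], feats[4]
        out = self.score5(pool5)
        for score, skip in ((self.score4, pool4), (self.score3, pool3), (self.score2, pool2)):
            # upsample the deep prediction to the skip feature's size and fuse them
            out = F.interpolate(out, size=skip.shape[2:], mode="bilinear", align_corners=False)
            out = out + score(skip)
        # final upsampling back to the 224 x 224 input size
        return F.interpolate(out, scale_factor=4, mode="bilinear", align_corners=False)

model = IFCNSketch()
# Lower learning rate on the first three encoder stages (six conv layers); values assumed.
shallow = list(model.stages[:3].parameters())
deep = [p for p in model.parameters() if not any(p is q for q in shallow)]
optimizer = torch.optim.Adam([{"params": shallow, "lr": 1e-6},
                              {"params": deep, "lr": 1e-5}])
print(model(torch.zeros(1, 1, 224, 224)).shape)  # torch.Size([1, 2, 224, 224])
```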
1.4 network hyper-parameter debug policy
In the hyper-parameter debugging strategy, parameters are tested mainly in three respects: the initial learning rate, the weight decay coefficient and the dropout ratio. The learning rate controls the convergence speed and convergence trajectory of the network: the lower its value, the slower the convergence and the smoother the trajectory, and vice versa; the weight decay coefficient and the dropout ratio affect the generalization and fitting capability of the neural network. All three affect the classification accuracy of the network. The mean intersection-over-union is used as the image segmentation metric for judging whether the hyper-parameter debugging is reasonable, and the formula is:
$$MIoU=\frac{1}{n+1}\sum_{i=0}^{n}\frac{p_{ii}}{\sum_{j=0}^{n}p_{ij}+\sum_{j=0}^{n}p_{ji}-p_{ii}}$$

where p_ii is the number of correctly classified pixels, n+1 is the number of classification categories, and Σ_j p_ij and Σ_j p_ji are the total numbers of pixels whose true class is i and whose predicted class is i, respectively.
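A minimal Python sketch of this mean intersection-over-union, computed from a pixel-level confusion matrix; the two-class toy arrays are for illustration only.

```python
import numpy as np

def mean_iou(y_true, y_pred, num_classes=2):
    """Mean intersection-over-union: IoU_i = p_ii / (sum_j p_ij + sum_j p_ji - p_ii),
    averaged over the n+1 classes."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(np.ravel(y_true), np.ravel(y_pred)):
        cm[t, p] += 1
    intersection = np.diag(cm)                               # p_ii
    union = cm.sum(axis=1) + cm.sum(axis=0) - intersection   # row sum + column sum - p_ii
    return float(np.mean(intersection / np.maximum(union, 1)))

# Toy check on a 4-pixel map with classes {0: non-construction, 1: construction}.
print(mean_iou(np.array([0, 0, 1, 1]), np.array([0, 1, 1, 1])))
```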
The learning rate is debugged from a minimum value of 1 × 10⁻⁸ up to a maximum value of 0.01, increasing by a factor of 10³ each time; the momentum term is assigned values from 0.90 to 0.999 in increments of 0.05; and the dropout ratio is debugged at 50% and 25% respectively. Under these three debugging strategies, the optimal combination obtained from the tests is used as the final hyper-parameter values of the FCN model. The network is trained with repeated iterations under a 'fixed' learning-rate policy. The trend of the mean intersection-over-union with the number of iterations is shown in FIG. 4. Comprehensively analysing the rate of change, mean and maximum of the accuracy, a learning rate of 1 × 10⁻⁵, a momentum term of 0.95 and a dropout ratio of 50% are taken as the optimal combination of the hyper-parameters of the FCN model.
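The debugging strategy amounts to a small grid search scored by mean intersection-over-union, sketched below in Python. The grid values are inferred from the description (learning rates 10⁻⁸, 10⁻⁵, 10⁻²; momentum 0.90, 0.95, 0.999; dropout 25% and 50%), and the evaluation function is a stand-in, since actually training the FCN is outside the scope of the sketch.

```python
import itertools
import random

# Grids inferred from the description above (assumed values).
LEARNING_RATES = [1e-8, 1e-5, 1e-2]     # stepped by a factor of 10**3
MOMENTUMS = [0.90, 0.95, 0.999]
DROPOUT_RATIOS = [0.25, 0.50]

def grid_search(train_and_evaluate):
    """Try every hyper-parameter combination and keep the one with the highest
    mean intersection-over-union on the test samples."""
    best, best_miou = None, -1.0
    for lr, momentum, dropout in itertools.product(LEARNING_RATES, MOMENTUMS, DROPOUT_RATIOS):
        miou = train_and_evaluate(lr, momentum, dropout)
        if miou > best_miou:
            best, best_miou = (lr, momentum, dropout), miou
    return best, best_miou

def dummy_evaluate(lr, momentum, dropout):
    """Stand-in for training the FCN under a 'fixed' learning-rate policy and
    returning the test-set mean intersection-over-union (hypothetical)."""
    return random.random()

print(grid_search(dummy_evaluate))
```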
1.5 network accuracy verification
The constructed IFCN model is trained under the Caffe-GPU deep learning framework; 1017 NPP-VIIRS nighttime-light image samples are used to train the network, split into training and test samples at a ratio of 21:5 and input into the network for training. After 200,000 iterations training is complete and the mean intersection-over-union reaches 97.59%. The mean intersection-over-union rises rapidly before 5,000 iterations, showing that the gradient of the network decreases markedly during training without gradient vanishing or oscillation; the curve levels off after 5,000 iterations, and the accuracy value stops changing after 180,000 iterations, showing that the network has finished fitting and reached its optimal performance. The loss value falls first quickly and then slowly, decreasing rapidly before 25,000 iterations and then slowly approaching 0, standing at 592 after 200,000 iterations. The network meets the network debugging requirements and the model application benchmark.
2 urban construction land identification under full convolution neural network
2.1 extraction of urban construction land in economic zone of Yangtze river
Nighttime-light images of the Yangtze River economic zone are input into the trained IFCN to obtain a binary classification result: construction land and non-construction land. The same areas of the verification region are classified with a support vector machine (linear kernel) and an object-oriented threshold classification method. As shown in FIG. 5, the city boundaries obtained by the support vector machine are more obvious, but the classification result contains errors: the city boundaries include some non-construction land, and non-construction land inside cities, such as reservoirs and parks, cannot be identified. The boundary details of cities obtained by IFCN and threshold classification are more prominent, and both identify non-construction land inside cities better. Compared with threshold classification, IFCN detects the boundary details of small-area cities better and segments roads and water bodies better. In terms of the spatial pattern of the classification results, IFCN has the better interpretation effect.

The construction-land extraction performance of the full convolution neural network on large-scale, low-resolution grayscale images is assessed through quantitative accuracy evaluation. Because there is only a single classification target, evaluating the pixel classification accuracy of the network must consider not only the number of correctly classified pixels but also their proportion. To present a more intuitive and scientific accuracy result, this work measures the accuracy of the three classifiers with both the Kappa coefficient and the F1-score.

The results of the accuracy evaluation are shown in Table 1. In descending order, the Kappa coefficients and F1-scores of the three classifiers rank IFCN > threshold classification > SVM; in terms of urban construction land extraction accuracy, the full convolution neural network improves the Kappa coefficient by 6% and the F1-score by 4% compared with the two traditional classifiers. In nighttime-light imagery the grayscale values of non-construction and construction land are similar, whereas the FCN infers and segments the categories better by taking the geometry of the context into account. In addition, the recall of the FCN reaches 95.37%, far higher than the 90.15% of the SVM and the 87.18% of threshold classification, because the deep fully convolutional network carries both shallow and deep information, and the optimized deconvolution structure fuses shallow spatial position information with deep robust information, making the pixel-level classification decisions more accurate.
TABLE 1 Classification accuracy results (the table is provided as an image in the original publication)
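For reference, the Kappa coefficient, F1-score and recall used in this evaluation can be computed from a binary pixel-level confusion matrix as in the following Python sketch; the toy masks are for illustration only.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Kappa coefficient, F1-score and recall for a binary construction /
    non-construction map, computed from the pixel-level confusion matrix."""
    y_true, y_pred = np.asarray(y_true).ravel(), np.asarray(y_pred).ravel()
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    total = tp + tn + fp + fn
    po = (tp + tn) / total                                                # observed agreement
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2     # chance agreement
    kappa = (po - pe) / (1 - pe)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return kappa, f1, recall

# Toy example: compare a predicted construction-land mask with a reference mask.
truth = np.array([1, 1, 1, 0, 0, 0, 1, 0])
pred = np.array([1, 1, 0, 0, 0, 1, 1, 0])
print(binary_metrics(truth, pred))
```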
2.2 model Performance verification
The improvements to the front-end structure and the deconvolution structure of the network enable IFCN to obtain better results in urban construction land identification and boundary extraction. To verify the performance of the improved network structure, the improved model is compared with common full convolution neural network models. FCN32s, FCN16s and FCN8s are selected as comparison objects, and the training samples are input into the three networks for training with the same parameters. After the same number of iterations, a city in the Yangtze River economic zone is taken as the detection object and input into the trained models; the output results are shown in FIG. 6. The semantic segmentation results show that FCN32s and FCN16s identify construction-land boundaries poorly, with coarse boundary details and severe misclassification compared with the other methods. FCN8s misclassifies relatively little, but the connectivity of its segmentation results is poor and its urban edge detection capability is weak, so common full convolution neural networks are not suitable for semantic segmentation of low-resolution grayscale images. Compared with the three common FCNs, IFCN retains the characteristics of suburban areas, predicts the urban edge morphology better, and stands out in pixel-level identification of construction land and non-construction land.
The optimization of the network structure effectively improves the semantic segmentation capability of IFCN on low-resolution grayscale images. FIG. 7 shows the up-sampling process of the deconvolution layers of the four FCNs; the feature information output by up-sampling differs as the network structure changes. Among the four networks, IFCN up-samples by a factor of 4 and its output heatmap contains the richest feature information; FCN8s, which adopts two skip connections, has fewer detailed features than IFCN; and the up-sampled heatmaps of FCN16s and FCN32s appear spatially blurred. In the traditional FCNs a large number of heatmaps contain invalid information, while the number of heatmaps with valid information output by the improved IFCN front end increases greatly, showing that the IFCN designed by the invention extracts sample feature information better against the same sample background. The reasons for these differences are as follows. First, pool2 contains a great deal of spatial position information, and adding pool2 to the deconvolution structure lets the network recover more spatial detail, which markedly improves the segmentation result. Second, the improved FCN has more skip connections (three, whereas FCN8s and FCN16s have only two and one respectively, and FCN32s generates a dense pixel-wise label map without any skip connection), and the skip connections fuse deep semantic information with shallow appearance information to enhance segmentation performance. In addition, thanks to the smaller convolution kernels and more reasonable crop offsets in the deconvolution layers and the parameter optimization of the shallow convolutional layers in the front-end network, the network's context information is denser, and combined with the multi-stream learning process this improves the ability of the deconvolution output heatmaps to capture detail.
3 summary of the invention
In view of the strong image representation capability, feature learning capability and good end-to-end performance of the full convolution neural network, the invention designs a full convolution neural network model by improving the FCN network structure and applies it to construction-land extraction from low-resolution grayscale remote sensing images. Taking the Yangtze River economic zone of China as the experimental object, 1017 nighttime-light image samples are extracted and input into the network, the accuracy of the IFCN is examined and its construction-land extraction effect is analysed. The results show that, compared with traditional classifiers, the improved FCN makes more accurate pixel-level classification decisions and recognizes and segments the detailed features of roads, water bodies and small-city boundaries better; in the accuracy evaluation, the Kappa coefficient of urban construction land extraction by IFCN is 6% higher and the F1-score 4% higher than those of the two traditional classifiers. Overall, IFCN has a good semantic segmentation effect on urban construction land.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.

Claims (3)

1. A city construction land identification method based on a full convolution neural network is characterized in that: the method comprises the following steps;
s1: optimizing a network;
s2: and identifying the multi-scale urban construction land under the full convolutional neural network.
2. The method for identifying urban construction land based on full convolution neural network as claimed in claim 1, wherein: the S1 specifically includes:
s11: collecting a sample;
dividing the study area into 400 sub-areas; comparing the ratio of image area to blank area in the 400 sub-areas and taking the areas in which the remote sensing image accounts for more than 90% as sampling areas, 228 sampling areas in total; extracting 2-8 training samples from each sampling area; selecting 1017 nighttime-light NPP-VIIRS remote sensing image feature samples in total, each sample being 224 × 224 pixels, and randomly splitting the collected samples into training samples and test samples at a ratio of 21:4;
s12: optimizing network parameters;
the ReLU function is used as the activation function, and initial weight values are assigned with the Xavier random initialization method, which accelerates the convergence of the deep network, avoids gradient vanishing and suppresses network oscillation; the initialization is

$$W \sim U\left[-\frac{\sqrt{6}}{\sqrt{n_i+n_{i+1}}},\ \frac{\sqrt{6}}{\sqrt{n_i+n_{i+1}}}\right]$$

where n_i denotes the number of neurons in layer i, W denotes the weight, and U denotes the uniform distribution, whose variance is 2/(n_i + n_{i+1});

Adam is selected as the network weight optimizer, which improves the training rate of stochastic gradient descent and avoids the gradient-vanishing problem:

$$m_t=\beta_1 m_{t-1}+(1-\beta_1)g_t,\qquad n_t=\beta_2 n_{t-1}+(1-\beta_2)g_t^{2}$$

$$\hat{m}_t=\frac{m_t}{1-\beta_1^{t}},\qquad \hat{n}_t=\frac{n_t}{1-\beta_2^{t}},\qquad \Delta\theta_t=-\gamma\,\frac{\hat{m}_t}{\sqrt{\hat{n}_t}+\tau}$$

where m_t is the estimate of the first moment of the gradient g_t, n_t is the estimate of the second moment, m̂_t is the correction of m_t, n̂_t is the correction of n_t, γ is the learning rate, and τ ensures that the denominator is not 0;

a regularization layer and dropout (random inactivation) are added in the forward propagation stage; through the combination of local responses and multiple neurons, the generalization error is reduced, the generalization capability of the network is enhanced, gradient vanishing is avoided, the convergence speed is improved and overfitting is suppressed;
s13: optimizing a network structure;
the network structure of FCN8s is modified to meet the segmentation requirements; in the deconvolution stage, skip connections are added, the output features of several layers in the deconvolution structure are fused with the output features of pooling layer 4, pooling layer 3 and pooling layer 2 respectively, and a crop layer is introduced and its offset corrected, which optimizes the city boundary detection capability and improves the classification accuracy; in the forward propagation stage, a regularization layer is introduced after the first two pooling layers, the learning-rate coefficients and bias coefficients of the first six convolutional layers are reduced to improve the learning capability of the shallow layers and optimize the extraction of spatial position information, and the Xavier initialization method is applied to assign initial values to all convolutional-layer weights of the front-end network;
s14: a network hyper-parameter debugging strategy;
in the hyper-parameter debugging strategy, parameters are tested in three respects: the initial learning rate, the weight decay coefficient and the dropout ratio; the learning rate controls the convergence speed and convergence trajectory of the network: the lower its value, the slower the convergence and the smoother the trajectory, and vice versa; the weight decay coefficient and the dropout ratio affect the generalization and fitting capability of the neural network; all three affect the classification accuracy of the network; the mean intersection-over-union is used as the image segmentation metric for judging whether the hyper-parameter debugging is reasonable, and the formula is:

$$MIoU=\frac{1}{n+1}\sum_{i=0}^{n}\frac{p_{ii}}{\sum_{j=0}^{n}p_{ij}+\sum_{j=0}^{n}p_{ji}-p_{ii}}$$

where p_ii is the number of correctly classified pixels, n+1 is the number of classification categories, and Σ_j p_ij and Σ_j p_ji are the total numbers of pixels whose true class is i and whose predicted class is i, respectively;

the learning rate is debugged from a minimum value of 1 × 10⁻⁸ up to a maximum value of 0.01, increasing by a factor of 10³ each time; in the debugging of the momentum term, values are assigned from 0.90 to 0.999 in increments of 0.05; and the dropout ratio is debugged at 50% and 25% respectively; under these three debugging strategies, the optimal combination obtained from the tests is used as the final hyper-parameter values of the FCN model; the network is trained with repeated iterations under a 'fixed' learning-rate policy; comprehensively analysing the rate of change, mean and maximum of the accuracy, a learning rate of 1 × 10⁻⁵, a momentum term of 0.95 and a dropout ratio of 50% are taken as the optimal combination of the hyper-parameters of the FCN model;
s15: network precision verification;
training the constructed IFCN model under the Caffe-GPU deep learning framework; 1017 NPP-VIIRS nighttime-light image samples are used to train the network, split into training and test samples at a ratio of 21:5 and input into the network for training; after 200,000 iterations training is complete and the mean intersection-over-union reaches 97.59%; the mean intersection-over-union rises rapidly before 5,000 iterations, showing that the gradient of the network decreases markedly during training without gradient vanishing or oscillation; the curve levels off after 5,000 iterations and the accuracy value stops changing after 180,000 iterations, showing that the network has finished fitting and reached its optimal performance; the loss value falls first quickly and then slowly, decreasing rapidly before 25,000 iterations and then slowly approaching 0, standing at 592 after 200,000 iterations; the network meets the network debugging requirements and the model application benchmark.
3. The method for identifying urban construction land based on full convolution neural network as claimed in claim 1, wherein: the S2 specifically includes:
s21: extracting results from urban construction land in the economic zone of Yangtze river;
inputting nighttime-light images of the Yangtze River economic zone into the trained IFCN to obtain a binary classification result: construction land and non-construction land; classifying the same areas of the verification region with a support vector machine classifier and an object-oriented threshold classification method; the boundary details of cities obtained by IFCN and threshold classification are more prominent, and both identify non-construction land inside cities better; compared with threshold classification, IFCN detects the boundary details of small-area cities better and segments roads and water bodies better; in terms of the spatial pattern of the classification results, IFCN has the better interpretation effect;

the construction-land extraction performance of the full convolution neural network on large-scale, low-resolution grayscale images is assessed through quantitative accuracy evaluation; when evaluating the pixel classification accuracy of the network, the Kappa coefficient and the F1-score are used to measure the accuracy of the three classifiers: IFCN classification, SVM classification and threshold classification;

in descending order, the Kappa coefficients and F1-scores of the three classifiers rank IFCN > threshold classification > SVM;
s22: verifying the performance of the model;
the improvements to the front-end structure and the deconvolution structure of the network enable IFCN to obtain better results in urban construction land identification and boundary extraction; to verify the performance of the improved network structure, the improved model is compared with common full convolution neural network models; FCN32s, FCN16s and FCN8s are selected as comparison objects, and the training samples are input into the three networks for training with the same parameters; after the same number of iterations, a city in the Yangtze River economic zone is taken as the detection object and input into the trained models; the semantic segmentation results show that FCN32s and FCN16s identify construction-land boundaries poorly, with coarse boundary details and severe misclassification compared with the other methods; FCN8s misclassifies relatively little, but the connectivity of its segmentation results is poor and its urban edge detection capability is weak, so common full convolution neural networks are not suitable for semantic segmentation of low-resolution grayscale images; compared with the three common FCNs, IFCN retains the characteristics of suburban areas, predicts the urban edge morphology better, and stands out in pixel-level identification of construction land and non-construction land.
CN202110750300.1A 2021-07-02 2021-07-02 Urban construction land identification method based on full convolution neural network Active CN113705326B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110750300.1A CN113705326B (en) 2021-07-02 2021-07-02 Urban construction land identification method based on full convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110750300.1A CN113705326B (en) 2021-07-02 2021-07-02 Urban construction land identification method based on full convolution neural network

Publications (2)

Publication Number Publication Date
CN113705326A true CN113705326A (en) 2021-11-26
CN113705326B CN113705326B (en) 2023-12-15

Family

ID=78648320

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110750300.1A Active CN113705326B (en) 2021-07-02 2021-07-02 Urban construction land identification method based on full convolution neural network

Country Status (1)

Country Link
CN (1) CN113705326B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018035805A1 (en) * 2016-08-25 2018-03-01 Intel Corporation Coupled multi-task fully convolutional networks using multi-scale contextual information and hierarchical hyper-features for semantic image segmentation
CN108537192A (en) * 2018-04-17 2018-09-14 福州大学 A kind of remote sensing image ground mulching sorting technique based on full convolutional network
CN108921196A (en) * 2018-06-01 2018-11-30 南京邮电大学 A kind of semantic segmentation method for improving full convolutional neural networks
CN110866494A (en) * 2019-11-14 2020-03-06 三亚中科遥感研究所 Optical remote sensing image-based town group extraction method and system
CN111914611A (en) * 2020-05-09 2020-11-10 中国科学院空天信息创新研究院 Urban green space high-resolution remote sensing monitoring method and system

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
BOLING YIN 等: "How to accurately extract large-scale urban land? Establishment of an improved fully convolutional neural network model", 《FRONTIERS OF EARTH SCIENCE》, vol. 16, no. 4, pages 1061 - 1076 *
RUDIGER SCHMITZ 等: "Multi-scale fully convolutional neural networks for histopathology image segmentation: from nuclear aberrations to the global tissue architecture", 《ARXIV》, pages 1 - 49 *
TIANXIANG PAN 等: "Fully Convolutional Neural Networks with Full-Scale-Features for Semantic Segmentation", 《PROCEEDINGS OF THE THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE》, pages 4240 - 4246 *
井然 等: "多尺度SLIC-GMRF与FCNSVM联合的高分影像建筑物提取", 《遥感学报》, vol. 24, no. 1, pages 11 - 26 *
殷博灵 等: "基于高分一号卫星遥感数据提取城市建设用地方法研究", 《地球科学前沿》, vol. 9, no. 5, pages 334 - 340 *
殷博灵: "长江经济带城市蔓延多维度、多尺度判别及演化趋势模拟", 《中国优秀硕士学位论文全文数据库 经济与管理科学辑》, no. 3, pages 147 - 213 *
邓国徽 等: "基于改进的全卷积神经网络高分遥感数据语义分割研究", 《第四届高分辨率对地观测学术年会》, pages 1 - 13 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116091940A (en) * 2023-01-05 2023-05-09 中国农业科学院农业资源与农业区划研究所 Crop classification and identification method based on high-resolution satellite remote sensing image
CN116091940B (en) * 2023-01-05 2023-07-25 中国农业科学院农业资源与农业区划研究所 Crop classification and identification method based on high-resolution satellite remote sensing image

Also Published As

Publication number Publication date
CN113705326B (en) 2023-12-15

Similar Documents

Publication Publication Date Title
US10984532B2 (en) Joint deep learning for land cover and land use classification
Zhang et al. Joint Deep Learning for land cover and land use classification
EP3614308B1 (en) Joint deep learning for land cover and land use classification
Zhang et al. Scale Sequence Joint Deep Learning (SS-JDL) for land use and land cover classification
Rienow et al. Supporting SLEUTH–Enhancing a cellular automaton with support vector machines for urban growth modeling
CN108052966B (en) Remote sensing image scene automatic extraction and classification method based on convolutional neural network
Lu et al. Building type classification using spatial and landscape attributes derived from LiDAR remote sensing data
CN107067405B (en) Remote sensing image segmentation method based on scale optimization
CN103048329B (en) A kind of road surface crack detection method based on active contour model
CN110991532B (en) Scene graph generation method based on relational visual attention mechanism
CN106408030A (en) SAR image classification method based on middle lamella semantic attribute and convolution neural network
Li et al. Improving LiDAR classification accuracy by contextual label smoothing in post-processing
Peeters et al. Automated recognition of urban objects for morphological urban analysis
Huang et al. Automatic building change image quality assessment in high resolution remote sensing based on deep learning
Chatterjee et al. On building classification from remote sensor imagery using deep neural networks and the relation between classification and reconstruction accuracy using border localization as proxy
Huang et al. Automatic extraction of urban impervious surfaces based on deep learning and multi-source remote sensing data
CN113505670B (en) Remote sensing image weak supervision building extraction method based on multi-scale CAM and super-pixels
Chen et al. Agricultural remote sensing image cultivated land extraction technology based on deep learning
Song et al. Extraction and reconstruction of curved surface buildings by contour clustering using airborne LiDAR data
Durduran Automatic classification of high resolution land cover using a new data weighting procedure: The combination of k-means clustering algorithm and central tendency measures (KMC–CTM)
Chen et al. Urban vegetation segmentation using terrestrial LiDAR point clouds based on point non-local means network
Robb et al. A semi-automated method for mapping glacial geomorphology tested at Breiðamerkurjökull, Iceland
Al-Huda et al. Object scale selection of hierarchical image segmentation using reliable regions
CN115019163A (en) City factor identification method based on multi-source big data
CN102609721B (en) Remote sensing image clustering method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant