CN117351370A - Automatic out-of-range building pattern spot extraction method based on GaoJing (SuperView) satellite images - Google Patents

Automatic out-of-range building pattern spot extraction method based on GaoJing (SuperView) satellite images

Info

Publication number
CN117351370A
Authority
CN
China
Prior art keywords
network
image
satellite
model
pattern spots
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311339535.7A
Other languages
Chinese (zh)
Inventor
关涛
安文占
柴祥君
滕龙妹
刘琦
钟和曦
李玲
高青山
陶从辉
邹瑜
赵梦琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Provincial Institute Of Land And Space Planning And Research
Original Assignee
Zhejiang Provincial Institute Of Land And Space Planning And Research
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Provincial Institute Of Land And Space Planning And Research filed Critical Zhejiang Provincial Institute Of Land And Space Planning And Research
Priority to CN202311339535.7A priority Critical patent/CN117351370A/en
Publication of CN117351370A publication Critical patent/CN117351370A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/176Urban or other man-made structures
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Remote Sensing (AREA)
  • Astronomy & Astrophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the technical field of remote sensing applications and discloses an automatic extraction method for out-of-range building pattern spots based on GaoJing (SuperView) satellite images, which comprises the following steps: preprocessing the remote sensing image; preparing a sample data set; constructing a convolutional neural network; training the network model with the samples; and performing model inference prediction. According to the invention, building land is extracted automatically by an improved U-Net model. The resnet50 is used as the encoder part of the U-Net model, and its strong feature extraction capability better captures feature information at different scales. A CBAM mixed attention mechanism is introduced so that the importance of different channels and spatial positions in the feature maps can be adjusted adaptively, letting the model better focus on the key regions of building targets. In addition, an FPN feature pyramid structure is added to the model; by fusing multi-scale feature information at different levels, the model better acquires the context information of building targets, enhancing the robustness and accuracy of building land extraction.

Description

Automatic out-of-range building pattern spot extraction method based on GaoJing (SuperView) satellite images
Technical Field
The invention relates to the technical field of remote sensing applications, in particular to an automatic extraction method for out-of-range building pattern spots based on GaoJing (SuperView) satellite images.
Background
With the rapid development of urbanization, various types of land are being converted into construction land, and this construction land needs to be monitored dynamically in order to provide rapid, parcel-specific post-implementation change information for national land space use control and to optimize and improve the monitoring and evaluation system. A new generation of sensors can acquire images with higher resolution; the spatial resolution of GaoJing (SuperView) satellite imagery reaches 0.5 m, the feature information and positioning information of ground objects are greatly enhanced, and the spatial distribution, boundary information and construction status of areas built beyond the approved range can be clearly displayed. However, the number of pixels in a single image grows geometrically, the structure of construction land is complex, its spectral characteristics are diverse, and a large amount of interference information and similar ground-object information exists, so it is difficult to extract the boundary information of different kinds of construction land from high-resolution images.
Many scholars have conducted extensive research on the extraction of construction land information, which can be summarized into three types according to the extraction principle: 1. Methods based on the geometric characteristics of the construction land itself, which identify its external contour from edge and corner information and then extract it. 2. Methods using auxiliary information, which combine position and elevation information acquired from lidar data, synthetic aperture radar, digital surface models and the like. 3. Segmentation-based methods, which segment the image into units and extract information according to the characteristics within each unit. However, these approaches have the following disadvantages: 1. Traditional methods rely on latent characteristics of ground objects such as texture, shape and shadow; their ability to express construction land information is weak, they cannot exploit high-level semantic features such as spatial context, and both missed extraction and false extraction occur. 2. Auxiliary information can enhance the segmentation effect to a certain extent, but its acquisition cost is high and construction land extraction is easily limited by the accuracy of the auxiliary information. 3. In high-spatial-resolution images the spatial relations and intra-class differences increase, the segmentation scale is difficult to determine, and under-segmentation and over-segmentation occur, so the extracted construction land is fragmented or falsely connected; moreover, segmentation parameters must be set according to the ground objects of a specific region, which introduces subjective factors and gives poor generalization capability.
Disclosure of Invention
The invention aims to provide an automatic extraction method for out-of-range building pattern spots based on GaoJing (SuperView) satellite images, which uses an improved U-Net model to extract building information, overcomes the defects of the prior art and improves the accuracy of building land extraction.
The invention is realized by the following technical scheme.
The invention discloses an automatic extraction method for out-of-range building pattern spots based on GaoJing (SuperView) satellite images, which comprises the following steps:
step S1: preprocessing a remote sensing image;
step S2: sample data set preparation;
step S3: constructing a convolutional neural network, comprising:
s3.1: based on an improvement of the U-Net network structure, resnet50 is used as the encoding part of the U-Net network;
s3.2: introducing a CBAM mixed attention mechanism in the improved U-Net network encoder part;
s3.3: introducing an FPN feature pyramid structure into the improved U-Net network decoder part;
step S4: training a network model using samples, comprising:
s4.1: performing data augmentation processing on the manually delineated data set;
s4.2: dividing all the augmented data into a training set, a verification set and a test set;
s4.3: setting an operation environment;
s4.4: setting network parameters;
step S5: model inference prediction, comprising:
s5.1: setting the optimal weight of the model, and inputting the test set into a network for reasoning and predicting;
s5.2: and (3) extracting the out-of-range building pattern spots by adopting an expansion cutting and splicing method.
In a further technical scheme, the remote sensing image preprocessing of step S1 includes image radiation correction, orthorectification, image fusion and dodging.
In a further technical scheme, in the sample data set preparation of step S2, building targets are sample-labeled with ArcGIS on the GaoJing-1 (SuperView-1) satellite image, and the original image and the label are each cut into 512×512-pixel tiles to generate a remote sensing sample data set in tiff format.
According to a further technical scheme, the step S3.1 in the step S3 specifically includes:
and replacing the feature layers corresponding to the U-Net network encoder with the first layer output feature map Efeat1 of the network stage0 of the resnet50, the output feature map Efeat2 of the stage1, the output feature map Efeat3 of the stage2, the output feature map Efeat4 of the stage3 and the output feature map Efeat5 of the stage 4.
According to a further technical scheme, the step S3.2 in the step S3 specifically includes:
adding a CBAM mixed attention mechanism after the feature map Efeat1 and after the feature map Efeat5, respectively, so that the network focuses on key areas.
According to a further technical scheme, the step S3.3 in the step S3 specifically includes:
first carrying out information fusion from top to bottom on the output feature maps Dfeat1, Dfeat2, Dfeat3 and Dfeat4 of each layer of the encoder to obtain the feature map Dfeat5, then upsampling Dfeat5 by a factor of 8 and fusing it with Dfeat1 to obtain Dfeat6, and finally upsampling Dfeat6 by a factor of 2 and outputting the result.
According to a further technical scheme, the data augmentation processing in step S4.1 mainly comprises random rotation, horizontal flipping, vertical flipping and diagonal mirroring operations.
According to a further technical scheme, the division ratio of the data set in step S4.2 is 8:1:1.
According to a further technical scheme, the running environment in step S4.3 uses a Windows operating system, the PyTorch deep learning framework and the VSCode language compiler.
According to a further technical scheme, the network parameter setting in step S4.4 specifically selects the Adam optimizer, selects a combination of cross entropy loss and dice loss as the loss function of the network, sets the initial learning rate to 0.0007, and selects the poly learning strategy and the ReLU activation function.
According to a further technical scheme, the step S5.1 in the step S5 specifically includes:
first setting the optimal weight of the model, inputting the test set into the network for inference prediction, and calculating the precision (Precision), recall (Recall), F1 value (F1-score) and single-class intersection-over-union (IoU) of the test set to evaluate the actual learning capability of the network.
According to a further technical scheme, the step S5.2 in the step S5 specifically includes:
in the application stage of out-of-range building pattern spot extraction, selecting remote sensing images within a 500 m buffer of the project red line; because the network training size is 512×512, the image to be predicted must be cut at a fixed size and then spliced, and the expansion cutting and splicing method is adopted to extract the out-of-range building pattern spots so that no obvious splicing marks appear in the inference prediction results.
The invention has the beneficial effects that:
the invention aims at the monitoring work of the construction project out-of-range building pattern spots in the control of the application of the homeland space, so as to solve the problem of difficult and slow discovery in the monitoring of the construction project out-of-range building pattern spots. The ground surface condition of the out-of-range building map spots is complex and changeable, the texture features of the recognition targets on the high-resolution remote sensing images are very complex, and high requirements are put on the detail extraction capability and the edge detection capability of the deep learning model. The invention uses the improved U-NET deep learning model to accurately identify the position and divide the boundary of the building target, and has the following benefits:
using the resnet50 as the encoder portion of the U-Net model, feature information of different scales is better captured with its powerful feature extraction capabilities.
The introduction of CBAM mixed-attention mechanisms allows the model to adaptively adjust the importance of different channels and spatial locations in the feature map to better focus on critical areas of the architectural goals.
Adding the FPN feature pyramid structure to the model and fusing multi-scale feature information at different levels enables the model to better acquire the context information of building targets, enhancing the robustness and accuracy of building land extraction.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are used in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are only some embodiments of the invention, and that other drawings can be obtained from these drawings without inventive effort for a person skilled in the art.
The invention will be further described with reference to the drawings and examples.
FIG. 1 is a flow chart of the automatic extraction method for out-of-range building pattern spots based on GaoJing (SuperView) satellite images;
FIG. 2 is a schematic diagram of a network in which a resnet50 is introduced in the process of constructing a deep learning model;
FIG. 3 is a schematic diagram of a network incorporating a CBAM mixed attention mechanism in the process of constructing a deep learning model;
FIG. 4 is a schematic diagram of an improved U-Net network in accordance with the present invention;
FIG. 5 is a schematic diagram comparing building pattern spot extraction results on GaoJing-1 (SuperView-1) imagery according to the invention;
FIG. 6 is a schematic diagram of building pattern spot extraction within a 500 m buffer of an approved project red line;
fig. 7 is a schematic diagram showing the comparison of expansion cutting and splicing effects.
Description of the embodiments
Fig. 1 is a flowchart of the method provided by the invention. The method for automatically extracting out-of-range building pattern spots based on GaoJing (SuperView) satellite images comprises the following specific working steps:
step S1: preprocessing a remote sensing image;
step 1.1: radiation correction; in the process of remote sensing imaging, a certain absolute error exists in the sensor, in addition, sky light formed by atmospheric scattering and cloud layer reflection can enter the remote sensing detector together with the radiation energy of a ground object target, so that remote sensing radiation amount distortion is caused, image contrast is reduced, the quality of a remote sensing image is reduced, and errors generated by the sensor, atmospheric influence and the like are eliminated through radiation correction.
Step 1.2: Orthorectification; terrain relief, the geometric characteristics of the camera and errors introduced by the camera and sensor cause obvious geometric distortion and elevation-induced parallax (relief displacement), and construction land information extraction places high requirements on positioning accuracy and contour accuracy, so the image is orthorectified to eliminate these influences.
Step 1.3: Image fusion; to obtain a multichannel high-resolution image, the panchromatic sensor image and the multispectral sensor image are fused, preserving the color and spectral information of the multispectral data while enhancing the detail and resolution of the image.
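The patent does not name a specific fusion algorithm. As an illustration only, the sketch below applies a simplified Brovey transform, a common pan-sharpening method assumed here rather than taken from the patent; the function name and inputs are hypothetical.

```python
import numpy as np

def brovey_pansharpen(ms, pan, eps=1e-6):
    """Simplified Brovey-transform pan-sharpening (illustrative only).

    ms  : (H, W, B) multispectral array, already resampled to the panchromatic grid.
    pan : (H, W) panchromatic band.
    Each band is rescaled by pan / sum(ms bands), injecting the spatial detail of
    the panchromatic image while keeping the spectral ratios of the MS data.
    """
    intensity = ms.sum(axis=2) + eps          # per-pixel sum of the MS bands
    ratio = pan / intensity                   # detail-injection coefficient
    return ms * ratio[..., None]              # rescale every band by the ratio
```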
Step 1.4: Dodging (light and color balancing); because imaging time and various environmental factors introduce differences in color cast, brightness and the like between the images of each flight strip, the fused image is processed for light and color balancing so that the whole image has consistent precision, uniform tone, clear texture, moderate contrast and natural color transitions.
Step S2: Sample data set preparation; building targets on the GaoJing-1 (SuperView-1) image are manually delineated with ArcGIS software, the target feature attribute is assigned 255 and all other features 0, the result is converted into a raster image in tiff format, and the original image and the label are then each cut into 512×512-pixel tiles to generate the remote sensing building sample data set.
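A minimal sketch of the 512×512 tile cutting described above, assuming the preprocessed image and its 0/255 building label have already been read into NumPy arrays (the ArcGIS labelling and raster reading are outside this sketch; the function name is hypothetical):

```python
import numpy as np

def cut_tiles(image, label, tile=512):
    """Cut an image/label pair into non-overlapping tile x tile patches.

    image : (H, W, C) preprocessed satellite image.
    label : (H, W) raster with 255 on building pixels and 0 elsewhere.
    Edge remainders smaller than `tile` are simply discarded in this sketch.
    """
    h, w = label.shape
    pairs = []
    for r in range(0, h - tile + 1, tile):
        for c in range(0, w - tile + 1, tile):
            pairs.append((image[r:r + tile, c:c + tile],
                          label[r:r + tile, c:c + tile]))
    return pairs
```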
Step S3: constructing a convolutional neural network;
step 3.1: using the resnet50 as an encoding part of the U-Net network, replacing the feature layers corresponding to the U-Net network encoder with the first layer output feature map Dfeat1 of the resnet50 network stage0, the output feature map Dfeat2 of the stage1, the output feature map Dfeat3 of the stage2, the output feature map Dfeat4 of the stage3 and the output feature map Dfeat5 of the stage 4. The core idea of the resnet50 network is to design a residual structure, the residual block of which consists of three convolution layers 1×1, 3×3 and 1×1, and the mathematical expression of the residual structure is as follows:
x i+1 = x i + F( x i ,W i )
wherein x is i To input features, x i+1 To output characteristics, x i And x i+1 The number of channels is the same, F (x i ,W i ) As residual part, W i Representing a convolution operation.
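A minimal PyTorch sketch of this encoder replacement, exposing the five stage outputs of torchvision's resnet50 as Efeat1–Efeat5; the class name and the choice to leave the weights uninitialised are assumptions for illustration, not taken from the patent:

```python
import torch.nn as nn
from torchvision.models import resnet50

class ResNet50Encoder(nn.Module):
    """Expose the five stage outputs of resnet50 as U-Net encoder features."""
    def __init__(self):
        super().__init__()
        net = resnet50(weights=None)  # pretrained weights could be loaded here
        self.stage0 = nn.Sequential(net.conv1, net.bn1, net.relu)  # 64 ch,   1/2
        self.stage1 = nn.Sequential(net.maxpool, net.layer1)       # 256 ch,  1/4
        self.stage2 = net.layer2                                    # 512 ch,  1/8
        self.stage3 = net.layer3                                    # 1024 ch, 1/16
        self.stage4 = net.layer4                                    # 2048 ch, 1/32

    def forward(self, x):
        efeat1 = self.stage0(x)
        efeat2 = self.stage1(efeat1)
        efeat3 = self.stage2(efeat2)
        efeat4 = self.stage3(efeat3)
        efeat5 = self.stage4(efeat4)
        return efeat1, efeat2, efeat3, efeat4, efeat5
```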
Step 3.2: The CBAM mixed attention mechanism is introduced into the improved U-Net network encoder and is added after the feature map Efeat1 and after the feature map Efeat5, respectively, so that the network focuses on key areas. Fig. 3 is a network schematic diagram of the CBAM mixed attention mechanism, whose mathematical expressions are as follows:
A_c(x_i) = σ(W_1(W_0(x_i^avg)) + W_1(W_0(x_i^max)))
A_s(x_i) = σ(W^{7×7}(x_i^avg ⊙ x_i^max))
x_{i+1} = A_s(A_c(x_i)) ⊗ A_c(x_i)
where x_i is the input feature, x_{i+1} is the output feature, W_0 and W_1 denote convolution operations using the ReLU activation function and the sigmoid activation function respectively, x_i^avg and x_i^max denote average pooling and max pooling, A_c(x_i) is the channel attention function, A_s(x_i) is the spatial attention function, σ is the sigmoid activation function, ⊙ denotes the concatenation (stitching) operation, and ⊗ denotes element-wise multiplication.
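A compact PyTorch sketch of a CBAM block following the formulas above (channel attention with a shared two-layer MLP over average- and max-pooled descriptors, then spatial attention with a 7×7 convolution); the channel reduction ratio of 16 is an assumption, not stated in the patent:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel attention followed by spatial attention, as in the formulas above."""
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        # Shared MLP (W0 then W1) applied to the pooled channel descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # 7x7 convolution over the concatenated average / max spatial maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size,
                                 padding=kernel_size // 2, bias=False)

    def forward(self, x):
        # Channel attention A_c
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention A_s on the channel-refined feature
        avg_map = torch.mean(x, dim=1, keepdim=True)
        max_map, _ = torch.max(x, dim=1, keepdim=True)
        attn = torch.sigmoid(self.spatial(torch.cat([avg_map, max_map], dim=1)))
        return x * attn
```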
Step 3.3: An FPN feature pyramid structure is introduced into the improved U-Net network decoder. Information fusion is first carried out from top to bottom on the output feature maps Dfeat1, Dfeat2, Dfeat3 and Dfeat4 of each layer of the encoder to obtain the feature map Dfeat5; Dfeat5 is then upsampled by a factor of 8 and fused with Dfeat1 to obtain Dfeat6; finally, Dfeat6 is upsampled by a factor of 2 and output. FIG. 4 is a schematic diagram of the improved U-Net network incorporating the FPN feature pyramid structure.
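The description leaves the exact fusion operations open. The sketch below is one plausible reading, assuming Dfeat1–Dfeat4 sit at 1/2, 1/4, 1/8 and 1/16 of the input resolution: the four maps are projected to a common width, fused at the deepest scale to give Dfeat5, upsampled 8×, concatenated with Dfeat1 to give Dfeat6, and upsampled 2× for the output. Channel widths and the segmentation head are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPNFusion(nn.Module):
    """One plausible reading of the decoder-side FPN fusion described above."""
    def __init__(self, in_channels, mid=256, out_channels=1):
        super().__init__()
        # 1x1 lateral convolutions bring Dfeat1..Dfeat4 to a common width.
        self.lateral = nn.ModuleList(nn.Conv2d(c, mid, 1) for c in in_channels)
        self.head = nn.Conv2d(2 * mid, out_channels, 3, padding=1)

    def forward(self, feats):                       # feats = [Dfeat1 .. Dfeat4]
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        target = laterals[-1].shape[-2:]            # deepest (1/16) resolution
        dfeat5 = sum(F.interpolate(x, size=target, mode="bilinear",
                                   align_corners=False) for x in laterals)
        up = F.interpolate(dfeat5, scale_factor=8, mode="bilinear",
                           align_corners=False)     # back to the 1/2 resolution
        dfeat6 = torch.cat([up, laterals[0]], dim=1)
        return F.interpolate(self.head(dfeat6), scale_factor=2,
                             mode="bilinear", align_corners=False)
```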
Step S4: training a network model using the samples;
step 4.1: and (3) carrying out data augmentation processing on the manually sketched data set, and simultaneously carrying out 0-360-degree random rotation, horizontal overturning, vertical overturning and diagonal mirroring operation on the original image and the label, so that the richness of the data is increased.
Step 4.2: All augmented data are divided in an 8:1:1 ratio into a training set, a validation set and a test set.
Step 4.3: Setting the running environment; a Windows operating system, the PyTorch deep learning framework and the VSCode language compiler are used, with 64 GB of memory and an NVIDIA GeForce RTX 3060 graphics card.
Step 4.4: Setting the network parameters; the Adam optimizer is selected because it combines the advantages of the adaptive-learning-rate gradient descent algorithm and the momentum gradient descent algorithm, can adapt to sparse gradients and can alleviate gradient oscillation. The combination of cross entropy loss and dice loss is selected as the mixed loss function of the network; this mixed function attends to the difference between the output and the ground truth while also accounting for class imbalance in the samples. The initial learning rate is set to 0.0007, the poly learning strategy and the ReLU activation function are chosen, the batch size is set to 2, and the number of epochs is set to 300. The mathematical expressions of the Adam optimizer and the mixed loss function are shown below.
Adam optimizer mathematical expression:
m_t = β_1 · m_{t-1} + (1 − β_1) · g_t
v_t = β_2 · v_{t-1} + (1 − β_2) · g_t^2
\hat{m}_t = m_t / (1 − β_1^t),  \hat{v}_t = v_t / (1 − β_2^t)
θ_t = θ_{t-1} − α · \hat{m}_t / (√(\hat{v}_t) + ε)
where β_1 is the exponential decay rate for the moving average of the gradients, β_2 is the exponential decay rate for the moving average of the squared gradients, g_t is the parameter gradient, m_t is the first-moment estimate at time t, v_t is the second-moment estimate at time t, \hat{m}_t and \hat{v}_t are the bias-corrected moving averages of the corresponding moments, θ_t are the updated parameters, α is the learning rate, and ε is a constant close to 0.
Mathematical expression of the mixing loss function:
L_cross = −(y·log(p) + (1 − y)·log(1 − p))
L_dice = 1 − 2|A ∩ B| / (|A| + |B|)
L_loss = L_cross + L_dice
where y is the label value input to the network, p is the predicted probability value, A is the prediction map, and B is the label map.
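A PyTorch sketch of the mixed loss above for the single-class (building / background) case, using the binary cross-entropy form of L_cross and the dice term written from soft predictions; the smoothing constant is an assumption. The Adam optimizer with the 0.0007 initial learning rate can then be attached in the usual way (torch.optim.Adam), with the poly schedule implemented, for example, as a LambdaLR multiplier (1 − iter/max_iter)^power, the power being unspecified in the patent.

```python
import torch
import torch.nn as nn

class MixedLoss(nn.Module):
    """Binary cross-entropy (L_cross) plus dice loss (L_dice)."""
    def __init__(self, eps=1e-6):
        super().__init__()
        self.bce = nn.BCEWithLogitsLoss()
        self.eps = eps

    def forward(self, logits, target):
        # logits, target: (N, 1, H, W); target holds 0/1 building labels.
        p = torch.sigmoid(logits)
        inter = (p * target).sum()
        dice = 1 - (2 * inter + self.eps) / (p.sum() + target.sum() + self.eps)
        return self.bce(logits, target.float()) + dice
```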
Step S5: Model inference prediction.
Step 5.1: When the loss function curve becomes stable, i.e. converges, the last weight of the convergence stage is selected as the prediction weight, inference prediction is carried out on the test set, and the precision (Precision), recall (Recall), F1 value (F1-score) and intersection-over-union (IoU) of the test set are calculated to quantitatively test the actual learning capability of the network. To verify the performance of the proposed algorithm, the classical U-Net and a U-Net with a resnet50 backbone network (hereinafter R50_U-Net) are selected for comparative analysis. Fig. 5 shows the GaoJing-1 building pattern spot extraction results of the three algorithms on the test set. From left to right, each row shows the original image, the label image, the classical U-Net extraction result, the R50_U-Net extraction result and the improved U-Net extraction result; the segmentation results of the method of the invention are closer to the label image, missed and false extractions are obviously reduced, and the extracted building pattern spots have clear outlines and a better segmentation effect.
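A sketch of the four evaluation metrics computed from a binary prediction mask and its label (the function name and the small stabilising constant are assumptions):

```python
import numpy as np

def binary_metrics(pred, label, eps=1e-12):
    """Precision, recall, F1 and single-class IoU for 0/1 building masks."""
    pred, label = pred.astype(bool), label.astype(bool)
    tp = np.logical_and(pred, label).sum()
    fp = np.logical_and(pred, ~label).sum()
    fn = np.logical_and(~pred, label).sum()
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    iou = tp / (tp + fp + fn + eps)
    return precision, recall, f1, iou
```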
Step 5.2: To demonstrate the effect of the method in monitoring out-of-range building pattern spots of construction projects under national land space use control, the red lines of several approved projects are selected and a 500 m buffer zone is set for image clipping. Because obvious splicing marks in the inference results would reduce the accuracy of building extraction, the expansion cutting and splicing method is used to perform inference prediction on the imagery of the 500 m buffer around the approved project red lines (shown in Fig. 6), and the conventional cutting and splicing method is used for comparison (shown in Fig. 7).
The specific working steps of expansion cutting and splicing prediction are as follows (a minimal code sketch follows these steps):
Step 5.2.1: Obtain the width and height of the image to be predicted, set the sliding step to 256, divide the width and height by 256 respectively, and pad any remainder smaller than 256 up to 256 in width and height.
Step 5.2.2: Set the sliding window to 512 and crop the image to be predicted by sliding the window with the sliding step.
Step 5.2.3: Perform inference prediction on the sliding-cropped images to obtain the prediction result map of each image.
Step 5.2.4: Take the 256×256 image at the central position of each prediction result map and splice them in order.
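A minimal sketch of steps 5.2.1–5.2.4, assuming the trained model is wrapped in a `predict_fn` callable that maps a 512×512 patch to a 512×512 mask; the reflection padding and exact border handling are assumptions:

```python
import numpy as np

def expand_cut_predict(image, predict_fn, window=512, stride=256):
    """Sliding-window inference that keeps only each window's central crop.

    image      : (H, W, C) preprocessed image to be predicted.
    predict_fn : callable mapping a (window, window, C) patch to a
                 (window, window) mask; stands in for the trained network.
    The image is padded so that every pixel falls in the central 256x256 of
    some 512x512 window, which removes visible splicing seams.
    """
    h, w = image.shape[:2]
    pad_h = (stride - h % stride) % stride          # pad height to a multiple
    pad_w = (stride - w % stride) % stride          # pad width to a multiple
    half = stride // 2
    padded = np.pad(image, ((half, half + pad_h), (half, half + pad_w), (0, 0)),
                    mode="reflect")
    out = np.zeros(padded.shape[:2], dtype=np.uint8)
    margin = (window - stride) // 2                 # 128-pixel border to drop
    for r in range(0, padded.shape[0] - window + 1, stride):
        for c in range(0, padded.shape[1] - window + 1, stride):
            mask = predict_fn(padded[r:r + window, c:c + window])
            out[r + margin:r + margin + stride, c + margin:c + margin + stride] = \
                mask[margin:margin + stride, margin:margin + stride]
    return out[half:half + h, half:half + w]        # crop back to original size
```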
The above embodiments are only for illustrating the technical concept and features of the present invention, and are intended to enable those skilled in the art to understand the present invention and implement it without limiting the scope of the present invention. All equivalent changes or modifications made in accordance with the spirit of the present invention should be construed to be included in the scope of the present invention.

Claims (9)

1. An automatic extraction method for out-of-range building pattern spots based on GaoJing (SuperView) satellite images, characterized by comprising the following steps:
step S1: preprocessing a remote sensing image;
step S2: sample data set preparation;
step S3: constructing a convolutional neural network, comprising:
s3.1: based on an improvement of the U-Net network structure, resnet50 is used as the encoding part of the U-Net network;
s3.2: introducing a CBAM mixed attention mechanism in the improved U-Net network encoder part;
s3.3: introducing an FPN feature pyramid structure into the improved U-Net network decoder part;
step S4: training a network model using samples, comprising:
s4.1: performing data augmentation processing on the manually delineated data set;
s4.2: dividing all the augmented data into a training set, a verification set and a test set;
s4.3: setting an operation environment;
s4.4: setting network parameters;
step S5: model inference prediction, comprising:
s5.1: setting the optimal weight of the model, and inputting the test set into a network for reasoning and predicting;
s5.2: and (3) extracting the out-of-range building pattern spots by adopting an expansion cutting and splicing method.
2. The method for automatically extracting out-of-range building pattern spots based on GaoJing (SuperView) satellite images according to claim 1, characterized in that: the remote sensing image preprocessing of step S1 includes image radiation correction, orthorectification, image fusion and dodging.
3. The method for automatically extracting out-of-range building pattern spots based on GaoJing (SuperView) satellite images according to claim 1, characterized in that: in the sample data set preparation of step S2, building targets are sample-labeled with ArcGIS on the GaoJing-1 (SuperView-1) satellite image, and the original image and the label are each cut into 512×512-pixel tiles to generate a remote sensing sample data set in tiff format.
4. The method for automatically extracting out-of-range building pattern spots based on GaoJing (SuperView) satellite images according to claim 1, characterized in that: the step S3.1 in the step S3 specifically includes:
and replacing the feature layers corresponding to the U-Net network encoder with the first layer output feature map Efeat1 of the network stage0 of the resnet50, the output feature map Efeat2 of the stage1, the output feature map Efeat3 of the stage2, the output feature map Efeat4 of the stage3 and the output feature map Efeat5 of the stage 4.
5. The method for automatically extracting out-of-range building pattern spots based on GaoJing (SuperView) satellite images according to claim 1, characterized in that: the step S3.2 in the step S3 specifically includes:
adding a CBAM mixed attention mechanism after the feature map Efeat1 and after the feature map Efeat5, respectively, so that the network focuses on key areas.
6. The method for automatically extracting out-of-range building pattern spots based on GaoJing (SuperView) satellite images according to claim 1, characterized in that: the step S3.3 in the step S3 specifically includes:
first carrying out information fusion from top to bottom on the output feature maps Dfeat1, Dfeat2, Dfeat3 and Dfeat4 of each layer of the encoder to obtain the feature map Dfeat5, then upsampling Dfeat5 by a factor of 8 and fusing it with Dfeat1 to obtain Dfeat6, and finally upsampling Dfeat6 by a factor of 2 and outputting the result.
7. The method for automatically extracting out-of-range building pattern spots based on GaoJing (SuperView) satellite images according to claim 1, characterized in that:
the data augmentation processing in S4.1 mainly comprises random rotation, horizontal flipping, vertical flipping and diagonal mirroring operations;
the division ratio of the data set in S4.2 is 8:1:1;
the running environment in S4.3 uses a Windows operating system, the PyTorch deep learning framework and the VSCode language compiler;
the network parameter setting in S4.4 specifically selects the Adam optimizer, selects a combination of cross entropy loss and dice loss as the loss function of the network, sets the initial learning rate to 0.0007, and selects the poly learning strategy and the ReLU activation function.
8. The method for automatically extracting out-of-range building pattern spots based on GaoJing (SuperView) satellite images according to claim 1, characterized in that: the step S5.1 in the step S5 specifically includes:
first setting the optimal weight of the model, inputting the test set into the network for inference prediction, and calculating the precision (Precision), recall (Recall), F1 value (F1-score) and single-class intersection-over-union (IoU) of the test set to evaluate the actual learning capability of the network.
9. The method for automatically extracting out-of-range building pattern spots based on GaoJing (SuperView) satellite images according to claim 1, characterized in that: the step S5.2 in the step S5 specifically includes:
in the application stage of out-of-range building pattern spot extraction, selecting remote sensing images within a 500 m buffer of the project red line; because the network training size is 512×512, the image to be predicted must be cut at a fixed size and then spliced, and the expansion cutting and splicing method is adopted to extract the out-of-range building pattern spots so that no obvious splicing marks appear in the inference prediction results.
CN202311339535.7A 2023-10-17 2023-10-17 Automatic out-of-range building pattern spot extraction method based on GaoJing (SuperView) satellite images Pending CN117351370A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311339535.7A CN117351370A (en) 2023-10-17 2023-10-17 Automatic out-of-range building pattern spot extraction method based on GaoJing (SuperView) satellite images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311339535.7A CN117351370A (en) 2023-10-17 2023-10-17 Automatic out-of-range building pattern spot extraction method based on GaoJing (SuperView) satellite images

Publications (1)

Publication Number Publication Date
CN117351370A true CN117351370A (en) 2024-01-05

Family

ID=89355417

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311339535.7A Pending CN117351370A (en) 2023-10-17 2023-10-17 Automatic out-of-range building pattern spot extraction method based on high Jing Weixing image

Country Status (1)

Country Link
CN (1) CN117351370A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110705457A (en) * 2019-09-29 2020-01-17 核工业北京地质研究院 Remote sensing image building change detection method
CN113420619A (en) * 2021-06-07 2021-09-21 核工业北京地质研究院 Remote sensing image building extraction method
CN113505842A (en) * 2021-07-21 2021-10-15 中国科学院空天信息创新研究院 Automatic extraction method suitable for large-scale regional remote sensing image urban building
CN115331104A (en) * 2022-08-17 2022-11-11 中国农业大学 Crop planting information extraction method based on convolutional neural network
CN116452991A (en) * 2023-04-21 2023-07-18 四川创数智慧科技股份有限公司 Attention enhancement and multiscale feature fusion artificial disturbance ground remote sensing extraction method


Similar Documents

Publication Publication Date Title
CN111986099B (en) Tillage monitoring method and system based on convolutional neural network with residual error correction fused
CN107229918A (en) A kind of SAR image object detection method based on full convolutional neural networks
CN114241326B (en) Progressive intelligent production method and system for ground feature elements of remote sensing images
CN113486869B (en) Method, device and medium for lithology identification based on unsupervised feature extraction
CN114419430A (en) Cultivated land plot extraction method and device based on SE-U-Net +model
CN112950780A (en) Intelligent network map generation method and system based on remote sensing image
CN114170527B (en) Remote sensing target detection method using rotating frame representation
CN112906662A (en) Method, device and equipment for detecting change of remote sensing image and storage medium
CN116935238B (en) Forest disturbance monitoring method, system, equipment and medium based on deep learning
CN114463637A (en) Winter wheat remote sensing identification analysis method and system based on deep learning
CN111079807A (en) Ground object classification method and device
CN115965862A (en) SAR ship target detection method based on mask network fusion image characteristics
CN117252988A (en) Image data processing method, device and computer readable storage medium
CN117314811A (en) SAR-optical image fusion method based on hybrid model
CN116563728A (en) Optical remote sensing image cloud and fog removing method and system based on generation countermeasure network
CN118379358A (en) RGB-D camera depth module calibration method based on antagonistic neural network
CN118096922A (en) Method for generating map based on style migration and remote sensing image
CN112949657B (en) Forest land distribution extraction method and device based on remote sensing image texture features
CN117422677A (en) Method, device and system for detecting image defects of power line for airborne terminal
CN116012709B (en) High-resolution remote sensing image building extraction method and system
CN117351370A (en) Automatic out-of-range building pattern spot extraction method based on GaoJing (SuperView) satellite images
KR102576427B1 (en) Real-time Rainfall Prediction Device using Cloud Images, and Rainfall Prediction Method using the same, and a computer-readable storage medium
CN112597934B (en) Application method of image target detection in baling press area calculation
CN115830439A (en) High-resolution remote sensing image building extraction method based on learnable corner features
CN115019044A (en) Individual plant segmentation method and device, terminal device and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination