CN115564926B

CN115564926B - Three-dimensional patch model construction method based on image building structure learning

Info

Publication number: CN115564926B
Application number: CN202211552856.0A
Authority: CN
Inventors: 郑先伟; 龚健雅; 王嘉宁; 熊汉江; 岳林蔚
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2022-12-06
Filing date: 2022-12-06
Publication date: 2023-03-10
Anticipated expiration: 2042-12-06
Also published as: CN115564926A

Abstract

The invention discloses a three-dimensional patch model construction method based on image building structure learning, comprising the following steps: acquiring an aerial image of a building and processing it to obtain an aerial ortho-image and a digital surface model; constructing a building roof line segment detection network model to extract building roof vector line segments from the aerial ortho-image; removing abnormal noise from the digital surface model using a joint bilateral upsampling method constrained by a gradient term; converting the digital surface model into corresponding three-dimensional point cloud data and fitting roof planes to the point cloud so as to delineate initial patches reflecting the roof scene distribution of the building; decomposing the building polygon into rectangular areas; and, in each rectangular area, using the roof vector structure lines as a geometric constraint, constructing an overall objective function in combination with the initial roof patches, and then solving the objective function for a local optimum to complete the construction of the three-dimensional patch model of the building roof. The invention provides a foundation for the subsequent fine reconstruction of building models.

Description

Three-dimensional patch model construction method based on image building structure learning

Technical Field

The invention belongs to the technical field of image processing, and particularly relates to a three-dimensional patch model construction method based on image building structure learning.

Background

With the rapid advance of smart city construction and urbanization in China, economic and social development requires spatial information to move from two dimensions to three, and there is an urgent need for high-precision, fine-grained modeling methods that represent real geographic space more faithfully. Buildings are a main category of ground objects in urban scenes, and large-scale three-dimensional real-scene building models are the data basis for urban land resource management and various spatial analysis applications. Although mature technologies and workflows exist for three-dimensional geometric reconstruction of large-scale urban buildings, research on automatic semantic recognition and individual (per-building) modeling remains insufficient; workflows are fragmented, efficiency is low, and large-scale application is difficult, so that real-scene three-dimensional building model construction is hard to integrate into application-level entity data production systems.

In recent years, the rapid development of sensor and earth observation technology has made high-resolution aerial images widely available, and three-dimensional reconstruction from aerial images has become an important way to build large-scale three-dimensional models of urban buildings, owing to its low cost, simple workflow and adaptability to various terrain and ground-object environments. However, when extracting building information from aerial images of urban scenes with many kinds of ground objects and a large amount of redundant information, the prior art usually extracts only the outer contour of a building based on spectral, texture and geometric features. This oversimplifies the structural information of the building, loses the geometric structure and spatial topology inside the roof, and makes subsequent fine reconstruction of the building model very difficult. In the three-dimensional reconstruction stage, the digital surface model generated by dense matching of multi-view images suffers from abrupt elevation changes, blurred edges and internal holes; lacking salient geometric features, it can only be used to reconstruct low-quality LOD1 models, which greatly restricts the usability of the reconstructed model.

Disclosure of Invention

The invention aims to address the defects of the prior art by providing a three-dimensional patch model construction method based on image building structure learning; the method preserves the geometric structure and spatial topological information inside the roof and provides a foundation for the subsequent fine reconstruction of building models.

In order to solve the technical problems, the invention adopts the following technical scheme:

a three-dimensional surface patch model construction method based on image building structure learning comprises the following steps:

step 1, acquiring aerial image data of a building, and preprocessing the aerial image data to obtain an aerial ortho-image and a digital surface model;

step 2, constructing a building roof line segment detection network model and inputting the aerial ortho-image into it, wherein the model extracts roof key node information according to the geometric and semantic information of the building to generate candidate line segments, fuses multi-scale features of the aerial ortho-image, aligns the candidate line segments with a feature layer, filters out candidate line segments with incorrect topology, and outputs the topologically complete building roof vector line segments in the aerial ortho-image;

step 3, removing abnormal noise from the digital surface model by a joint bilateral upsampling method constrained by a gradient term, thereby improving the data quality of the digital surface model;

step 4, converting the digital surface model processed in step 3 into corresponding three-dimensional point cloud data, fitting roof planes to the point cloud data so as to extract all roof planes, classifying the roof planes by attribute according to whether they are inclined, and finally delineating the initial patches of the building roof scene distribution according to the different attributes;

step 5, decomposing the irregular building polygon into a plurality of rectangular areas based on grids; in each rectangular area, taking the topologically complete building roof vector line segments output in step 2 as a geometric constraint, constructing an overall objective function of the initial patches in combination with the initial roof patches of step 4, and finally solving the overall objective function for a local optimum in each rectangular area, thereby completing the construction of the three-dimensional patch model of the building roof.

Further, the building roof line segment detection network model comprises a backbone network, a line segment detection branch, a semantic segmentation branch and an LOI Pooling module;

the backbone network is used for extracting features from the aerial ortho-image and inputting the extracted feature maps into the line segment detection branch and the semantic segmentation branch;

the line segment detection branch is used for extracting roof key node information according to the geometric and semantic information of the building and generating candidate line segments;

the semantic segmentation branch is used for fusing the multi-scale features of the aerial ortho-image;

and the LOI Pooling module aligns the candidate line segments with the deeply down-sampled feature layer, filters out candidate line segments with incorrect topology, and outputs the topologically complete building roof vector line segments in the aerial ortho-image.

Further, the line segment detection branch comprises an encoder, a node prediction module and a mixed sampling module, and candidate line segments are generated in the line segment detection branch as follows:

the shared feature map output by the backbone network and down-sampled by the encoder is used as input, and the node prediction module extracts the position coordinates of the roof key nodes; the shared feature map is divided into a plurality of grid cells, the ground-truth key nodes are converted into Gaussian probability distributions at the corresponding coordinates, and the node distribution within the grid cells is converted into a node probability map E and a position offset map O, which are expressed as:

[formula defining E and O not reproduced in this text]

wherein V denotes the set of corner indices; the condition in the formula is that there exists an index i in V whose node lies within grid cell b; the formula uses the center coordinates of grid cell b, the coordinates of the i-th node, and W_b, the width of the grid cell;

finally, the K key nodes with the highest predicted probability are selected, and candidate line segments are generated from them by the mixed sampling module.
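The grid encoding of key nodes described above can be illustrated with a minimal sketch. It is not the claimed network: it assumes square grid cells of width `cell_w`, marks a cell as occupied when a ground-truth node falls inside it (the Gaussian smoothing of the ground truth is omitted), and stores the offset from the cell center to the node normalized by the cell width. The function names `encode_nodes` and `top_k_nodes` and the sign convention of the offset are assumptions made for illustration only.

```python
def encode_nodes(nodes, height, width, cell_w):
    """Build an occupancy map E and an offset map O on a grid of square cells.

    nodes  : list of (x, y) node coordinates in feature-map pixels
    cell_w : width W_b of one grid cell (cells are assumed square)
    """
    rows, cols = height // cell_w, width // cell_w
    E = [[0.0] * cols for _ in range(rows)]
    O = [[(0.0, 0.0)] * cols for _ in range(rows)]
    for x, y in nodes:
        c, r = int(x // cell_w), int(y // cell_w)
        if not (0 <= r < rows and 0 <= c < cols):
            continue  # node lies outside the feature map
        cx, cy = (c + 0.5) * cell_w, (r + 0.5) * cell_w  # cell center
        E[r][c] = 1.0
        # Assumed convention: offset = (node - center) / W_b, in [-0.5, 0.5)
        O[r][c] = ((x - cx) / cell_w, (y - cy) / cell_w)
    return E, O


def top_k_nodes(E, O, cell_w, k):
    """Decode the k cells with the highest probability back to coordinates."""
    cells = [(E[r][c], r, c) for r in range(len(E)) for c in range(len(E[0]))]
    cells.sort(key=lambda t: t[0], reverse=True)
    decoded = []
    for _, r, c in cells[:k]:
        ox, oy = O[r][c]
        decoded.append(((c + 0.5 + ox) * cell_w, (r + 0.5 + oy) * cell_w))
    return decoded
```

In this sketch each cell acts like a candidate anchor: the map E answers whether a node is present, and O recovers where inside the cell it lies, so decoding the top K cells returns sub-cell node coordinates.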

Further, the specific method for removing the abnormal noise in the digital surface model in step 3 is as follows:

according to the height information and color information in the digital surface model and the spatial correlation between them, a depth guide image is introduced into the color-domain weight calculation, and the original abnormal data noise is removed by joint bilateral upsampling combined with the gradient values in the digital surface model, thereby improving the data quality of the digital surface model.

Further, the upsampling method is specifically as follows:

[formula of the improved joint bilateral upsampling not reproduced in this text]

in addition, compared with computing differences of similar gray values in the color domain, the gradient values in the digital surface model better reflect the edge characteristics of the image; the original Gaussian color kernel is therefore modified to compute a gradient-term constraint between adjacent pixels:

[formulas of the modified kernels not reproduced in this text]

wherein p denotes the coordinates of the kernel center and q denotes the coordinates of a pixel within the neighboring window D. The formulas involve the following quantities: the output depth at coordinate p; W_p, the normalization coefficient at coordinate p; the spatial kernel and its standard deviation; the input DSM gray value at coordinate q; the color kernel and its standard deviation; the input RGB values of the remote sensing ortho-image at coordinates p and q; and the input depth values at coordinates p and q, where the depth image is obtained by linear interpolation of the digital surface model; α is a normalized adaptive weight;

[formulas of the gradient terms G_x and G_y not reproduced in this text]

wherein G_x and G_y denote the degree of gradient change between p and q in the image in the horizontal and vertical directions, respectively; x and y denote the horizontal and vertical directions; p_x and p_y denote the x and y coordinates of p; q_x and q_y denote the x and y coordinates of q; and σ_r denotes the standard deviation of the color kernel;

by adopting the improved joint bilateral upsampling method, the original abnormal data noise in the digital surface model is removed, thereby improving the data quality of the digital surface model.
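The idea of weighting DSM neighbors by spatial distance and by a range kernel that also looks at a depth guide can be sketched as follows. This is only an illustration under stated assumptions: it is a plain joint bilateral filter at a single resolution (no upsampling step), the range kernel blends the squared gray difference and the squared depth-guide difference with a fixed weight `alpha`, and the gradient-term kernel of the method is not reproduced. The function name `joint_bilateral_dsm` and all parameter defaults are hypothetical.

```python
import math


def joint_bilateral_dsm(dsm, gray, depth_guide, radius=1,
                        sigma_s=1.0, sigma_r=10.0, alpha=0.5):
    """Smooth `dsm` with weights taken from position, color and a depth guide.

    dsm, gray, depth_guide : 2D lists of equal size
    alpha                  : assumed fixed blend between color and depth terms
    """
    h, w = len(dsm), len(dsm[0])
    out = [[0.0] * w for _ in range(h)]
    for py in range(h):
        for px in range(w):
            num = den = 0.0
            for qy in range(max(0, py - radius), min(h, py + radius + 1)):
                for qx in range(max(0, px - radius), min(w, px + radius + 1)):
                    d2 = (px - qx) ** 2 + (py - qy) ** 2
                    f = math.exp(-d2 / (2 * sigma_s ** 2))        # spatial kernel
                    dc = gray[py][px] - gray[qy][qx]              # color difference
                    dg = depth_guide[py][px] - depth_guide[qy][qx]  # guide difference
                    r2 = alpha * dc ** 2 + (1 - alpha) * dg ** 2  # assumed blend
                    g = math.exp(-r2 / (2 * sigma_r ** 2))        # range kernel
                    num += f * g * dsm[qy][qx]
                    den += f * g
            out[py][px] = num / den  # den >= 1 because q == p has weight 1
    return out
```

With a small `sigma_r`, neighbors across a height or color edge receive almost no weight, so roof edges stay sharp while values inside a homogeneous region are averaged.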

Further, the method for delineating the initial patches of the building roof scene distribution in step 4 comprises the following steps:

fitting the plane model parameter equation with the highest confidence from the point cloud data using the RANSAC algorithm;

continuing to fit the remaining outliers iteratively with the RANSAC algorithm until all roof planes are extracted, and calculating the corresponding normal directions while the point cloud data is fitted into a plurality of roof planes;

calculating the angle between the normal direction of each roof plane and the z axis of the object-space coordinate system, thereby classifying the roof plane models as horizontal or inclined;

and finally, re-projecting the point sets corresponding to the roof planes into the digital surface model according to their attributes, thereby delineating the initial patches that reflect the roof scene distribution.
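Two of the steps above, lifting the digital surface model to a point cloud and labeling a fitted plane by the angle between its normal and the z axis, can be sketched as follows. The function names, the ground sampling distance `cell_size` and the tilt threshold `max_tilt_deg` are assumptions for illustration; the source does not state a threshold value.

```python
import math


def dsm_to_points(dsm, cell_size=1.0):
    """Lift every DSM pixel to a 3D point (x, y, z) using its height value."""
    return [(c * cell_size, r * cell_size, z)
            for r, row in enumerate(dsm) for c, z in enumerate(row)]


def classify_plane(normal, max_tilt_deg=5.0):
    """Label a roof plane from the angle between its normal and the z axis.

    Returns (label, angle_in_degrees); the 5 degree default is hypothetical.
    """
    nx, ny, nz = normal
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    cos_a = abs(nz) / norm  # abs(): the sign of a fitted normal is arbitrary
    angle = math.degrees(math.acos(min(1.0, cos_a)))
    label = "horizontal" if angle <= max_tilt_deg else "inclined"
    return label, angle
```

For example, a normal of (0, 0, 1) is labeled horizontal, while a normal of (1, 0, 1) makes a 45 degree angle with the z axis and is labeled inclined.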

Further, in step 5, the method for decomposing the irregular building polygon into a plurality of rectangular areas comprises:

firstly, rotating the target building polygon so that its principal direction is parallel to the X axis of the aerial image, and applying the same transformation to the corresponding areas of the aerial ortho-image and the digital surface model; then, calculating the gradients of the digital surface model in the horizontal and vertical directions from adjacent-pixel interpolation, and selecting, in each direction, a line segment whose gradient value is greater than a threshold and whose mean digital surface model height is a peak within the non-maximum suppression window as a candidate rectangle boundary; meanwhile, further filtering each candidate boundary using the color information of the corresponding aerial ortho-image: creating buffer areas on both sides of each boundary line segment and retaining the candidate boundary when the gray-scale color difference is greater than a threshold, while uniformly down-sampling the original image by a factor of two several times to obtain grid layers of different resolutions;

finally, thresholds are set for the color difference and the height difference of the rectangles to decide whether to merge the adjacent rectangles on both sides, so that the complex and irregular building polygon is decomposed into a plurality of rectangular areas.

Further, whether to merge the adjacent rectangles on both sides is determined by the following criterion:

[merging criterion formula not reproduced in this text]

wherein the formula involves the average colors of the two rectangular areas on the aerial ortho-image and the average heights of the two rectangular areas on the digital surface model;

two adjacent rectangles are merged when the two rectangular areas simultaneously satisfy the following three conditions:

(1) the average color difference on the ortho-image is less than a color threshold;

(2) the average height difference in the digital surface model is less than a height threshold;

(3) to avoid merging across narrow streets lying between buildings, the height gradient in the overlapping buffer area covering the common edge is less than a threshold.
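The three merging conditions can be written as a single predicate. The sketch below assumes that the average colors are scalar gray values and that the height gradient over the shared edge has already been computed; the function name `should_merge` and the parameter names are hypothetical, and no threshold values are implied by the source.

```python
def should_merge(color_a, color_b, height_a, height_b, edge_gradient,
                 t_color, t_height, t_gradient):
    """Return True when two adjacent rectangles satisfy all three conditions.

    color_a, color_b   : average gray value of each rectangle on the ortho-image
    height_a, height_b : average DSM height of each rectangle
    edge_gradient      : height gradient in the buffer covering the common edge
    """
    color_ok = abs(color_a - color_b) < t_color        # condition (1)
    height_ok = abs(height_a - height_b) < t_height    # condition (2)
    gradient_ok = edge_gradient < t_gradient           # condition (3)
    return color_ok and height_ok and gradient_ok
```

Condition (3) is what keeps two similar roofs on opposite sides of a narrow street apart: their colors and heights may match, but the height drop over the street produces a large gradient in the shared buffer.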

Further, step 5 further comprises:

cutting each rectangular area into a plurality of polygon units using the roof vector structure lines extracted in step 2, whereupon the overall objective function of the initial patches is expressed as:

[overall objective function formula not reproduced in this text]

wherein the smoothing term represents the smooth transition relationship between two polygons and measures the planarity between polygon units; the distance term reflects the degree of fit of the initial patch to the digital surface model; a weight balances the distance term against the smoothing term; and the formula further uses the set of indices of adjacent patch pairs, the i-th patch, and the patches adjacent to it;

the final polygon units labeled with roof planes are obtained by minimizing the overall objective function with a graph cut algorithm, and are visualized as the three-dimensional patch model of the building.
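The structure of such an objective, a per-unit distance cost plus a weighted smoothness cost over adjacent pairs, can be illustrated on a toy problem. The sketch below makes two substitutions that should be read as assumptions: the smoothness term is a simple Potts penalty (1 when adjacent units carry different plane labels), and the minimum is found by exhaustive search rather than by a graph cut algorithm, which is only feasible for a handful of units. The names `total_energy`, `brute_force_labeling`, `dist_cost` and `lam` are hypothetical.

```python
from itertools import product


def total_energy(labels, dist_cost, adjacency, lam):
    """Distance term plus lam times a Potts smoothness term.

    labels    : plane label chosen for each polygon unit
    dist_cost : dist_cost[i][l] = cost of assigning plane l to unit i
    adjacency : list of (i, j) index pairs of adjacent units
    """
    data = sum(dist_cost[i][l] for i, l in enumerate(labels))
    smooth = sum(1 for i, j in adjacency if labels[i] != labels[j])
    return data + lam * smooth


def brute_force_labeling(dist_cost, adjacency, lam):
    """Exhaustively minimize the energy (stand-in for graph cut on toy inputs)."""
    n, k = len(dist_cost), len(dist_cost[0])
    return min(product(range(k), repeat=n),
               key=lambda lab: total_energy(lab, dist_cost, adjacency, lam))
```

Raising `lam` makes label changes between neighbors more expensive, so ambiguous units are pulled toward the plane of their neighbors; with a small `lam` each unit simply takes its best-fitting plane.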

Compared with the prior art, the invention has the beneficial effects that:

1) Current building information extraction focuses only on the outer contour of a building and ignores the geometric structure and topological information inside the roof. To address this, the invention first introduces the line segment detection task from computer vision and recasts conventional building extraction as a building roof structure line detection task, so that the detection result contains both the outer edge contour and the internal geometric structure of the building; this provides topological structure information for three-dimensional building reconstruction and a foundation for the subsequent fine reconstruction of building models;

2) Existing building extraction methods neglect geometric elements inside the roof, such as points, lines and surfaces, and are easily disturbed by multi-level feature noise and ground objects. To address this, a deep network for building roof structure line extraction that fuses geometric structure and semantic information is provided: the line segment detection branch performs mixed sampling from the key corner points of the building to generate candidate line segments, the building region information obtained by the semantic perception branch is fused, and candidate line segments with low confidence are filtered out by a line segment verification network and a heuristic algorithm, finally yielding roof structure line segments with complete topology and accurate positions;

3) Three-dimensional building models generated directly from a digital surface model currently suffer from low precision, poor realism and severe loss of structural features. To address this, the invention provides a large-scale building three-dimensional modeling method based on geometric constraints from the roof topology: taking the digital surface model generated by multi-view aerial image matching as input, it performs multi-plane fitting with the RANSAC algorithm to obtain initial polygonal roof patches, introduces the roof structure lines as geometric constraints to build a probabilistic graphical model, and uses a Markov random field optimization algorithm to snap the initial planes to the corresponding line segments, finally obtaining an LOD2 building roof patch model.

Drawings

FIG. 1 is a flow chart of a method for constructing a three-dimensional patch model of a building roof according to an embodiment of the invention;

FIG. 2 is a schematic structural diagram of a building rooftop segment detection network model according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a key node of a roof of an aerial image building according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a building roof initial patch segmentation process according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the following embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.

The present invention is further illustrated by the following examples, which are not to be construed as limiting the invention.

Referring to fig. 1, the embodiment of the invention discloses a building roof three-dimensional patch model construction method based on aerial images, which comprises the following steps:

firstly, feature points are extracted from the two-dimensional images and matched, the camera poses are calculated, and the exterior orientation elements of all images are solved; object-space coordinates are solved by multi-view dense matching, and point cloud data is generated by triangulation; the point cloud data is interpolated to generate a Delaunay triangulation and the corresponding Digital Surface Model (DSM); the remote sensing image data is then digitally differentially rectified using the digital surface model and mosaicked to obtain the aerial ortho-image. Aerial ortho-images and digital surface models can typically be generated automatically from aerial image data using professional photogrammetric software.

Step 2, constructing a building roof line segment detection network model and inputting the aerial ortho-image into it, wherein the model extracts roof key node information according to the geometric and semantic information of the building to generate candidate line segments, fuses multi-scale features of the aerial ortho-image, aligns the candidate line segments with the deeply down-sampled feature layer, filters out candidate line segments with incorrect topology, and outputs the topologically complete building roof vector line segments in the aerial ortho-image;

as shown in FIG. 2, this embodiment proposes a line segment detection task focused on reflecting the roof topology of buildings and, based on deep learning and convolutional neural network theory, constructs a building roof line segment detection network model for high-resolution aerial images. The model comprises a backbone network, a line segment detection branch, a semantic segmentation branch and an LOI Pooling module. The backbone network is a stacked hourglass network formed by connecting 5 basic hourglass structures in series; each basic hourglass first uses convolution layers and max pooling layers to extract features from and down-sample the aerial ortho-image, and then uses nearest-neighbor up-sampling and skip connections to restore the feature map to the input size. The feature maps extracted by the backbone network are input into the line segment detection branch and the semantic segmentation branch.

The line segment detection branch extracts roof key node information from the geometric and semantic information of the building and generates candidate line segments; it comprises an encoder, a node prediction module and a mixed sampling module. Specifically, in the line segment detection branch, the shared feature map output by the backbone network and down-sampled by the encoder-decoder is taken as input, and an anchor-free node prediction module extracts the position coordinates of the roof key nodes. As shown in FIG. 3, the shared feature map of size H × W is divided into H_b × W_b grid cells, and the ground-truth key points are converted into Gaussian probability distributions at the corresponding coordinates. Each grid cell plays the role of a candidate anchor box, making it easy to confirm whether a node exists in the cell and to predict the relative position of the node within it. The node distribution within the grid cells is represented by a node probability map E and a position offset map O, which can be expressed as:

[formula defining E and O not reproduced in this text]

wherein V denotes the set of corner indices; the condition in the formula is that there exists an index i in V whose node lies within grid cell b; the formula uses the center coordinates of grid cell b and the width of the grid cell;

the K key nodes with the highest predicted probability are then selected, and candidate line segments are generated from them by the mixed sampling module.

In the semantic segmentation branch, a dilated (atrous) convolution structure is applied to fuse the multi-scale features of the aerial image, so that features at all levels are effectively propagated and selectively used. Finally, the LOI Pooling module aligns the candidate line segments with the deeply down-sampled feature layer to filter out line segments with incorrect topology, thereby outputting the vector line segments of the building roof in the aerial image.
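Aligning a candidate line segment with a down-sampled feature layer is commonly done by sampling the feature map at evenly spaced points along the segment with bilinear interpolation. The sketch below shows that idea on a single-channel feature map; it is an assumption about how the alignment might look rather than the module of this embodiment. The names `bilinear` and `loi_pool`, the number of samples and the `scale` factor (feature-map size divided by image size) are hypothetical.

```python
def bilinear(feat, x, y):
    """Bilinearly interpolate a 2D list `feat` at continuous position (x, y)."""
    h, w = len(feat), len(feat[0])
    x = min(max(x, 0.0), w - 1.0)  # clamp to the feature-map extent
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = feat[y0][x0] * (1 - fx) + feat[y0][x1] * fx
    bottom = feat[y1][x0] * (1 - fx) + feat[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def loi_pool(feat, p_start, p_end, n_samples=8, scale=1.0):
    """Sample `n_samples` (>= 2) evenly spaced features along a line segment.

    p_start, p_end : segment end points in image coordinates (x, y)
    scale          : maps image coordinates to feature-map coordinates
    """
    (x0, y0), (x1, y1) = p_start, p_end
    samples = []
    for k in range(n_samples):
        t = k / (n_samples - 1)
        x = (x0 + t * (x1 - x0)) * scale
        y = (y0 + t * (y1 - y0)) * scale
        samples.append(bilinear(feat, x, y))
    return samples
```

The fixed-length vector returned for each candidate can then be passed to a small classifier that scores whether the segment is a real roof line, which is how segments with incorrect topology would be filtered in this kind of design.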

Owing to vegetation occlusion, densely distributed ground objects and the three-dimensional structure of buildings, images captured by aerial photogrammetry contain dead angles and blind areas, and the Digital Surface Model (DSM) generated by dense matching of aerial images contains a large amount of noise, typically showing abnormal elevation fluctuations, regional holes or jagged edges, which significantly reduces data quality. To take both the height characteristics and the spectral information of buildings into account, abnormal noise in the digital surface model data can be effectively removed by joint bilateral upsampling, improving the quality of the digital surface model. To address the problems that the black-and-white tones corresponding to gray values at DSM edges are similar and that colors change synchronously in regions of abrupt elevation change, this embodiment introduces a depth guide map into the color-domain weight calculation on the basis of the original joint bilateral method; the improved joint bilateral upsampling formula is as follows:

[improved joint bilateral upsampling formula not reproduced in this text]

wherein p denotes the coordinates of the kernel center and q denotes the coordinates of a pixel within the neighboring window D. The formula involves the spatial kernel and its standard deviation; the input DSM gray value at coordinate q; the color kernel and its standard deviation; the input RGB values of the remote sensing ortho-image at coordinates p and q; and the input depth values of the depth image at coordinates p and q;

[gradient-term formulas not reproduced in this text]

wherein the gradient terms describe the degree of gradient change between p and q in the horizontal and vertical directions; x and y denote the horizontal and vertical directions, respectively; p_x and p_y denote the x and y coordinates of p; q_x and q_y denote the x and y coordinates of q; and σ_r denotes the standard deviation of the color kernel;

Step 4, converting the digital surface model processed in the step 3 into corresponding three-dimensional point cloud data, fitting a roof plane according to the three-dimensional point cloud data so as to extract all roof planes, classifying attributes of all the roof planes according to whether the roof planes are inclined or not, and finally drawing initial surface elements of the roof scene distribution of the building according to different attributes;

Because building three-dimensional reconstruction lacks sufficient prior information, the topological relations of simple geometric primitives alone are neither stable nor accurate enough; if roof planes were constructed by relying blindly on line segment topology, many duplicate or even meaningless planes would inevitably be generated. To reduce unnecessary computation and improve the efficiency and accuracy of roof structure construction, this embodiment starts from the previously acquired digital surface model and constructs the initial roof patches from its distance characteristics. As shown in FIG. 4, the main idea is to raise each pixel of the digital surface model to the height it represents, thereby converting the model into corresponding three-dimensional point cloud data, and then to fit a plurality of planes in the point cloud with the RANSAC algorithm and record their parameter equations. First, plane model parameters are fitted iteratively from the point cloud data with RANSAC: in each iteration, 3 points are selected at random to determine a plane, and the number of points close to that plane is counted; after multiple iterations, the plane model parameters with the highest confidence (the largest number of points close to the plane) are obtained. RANSAC is then applied repeatedly to the remaining outliers until all planes have been extracted.
Meanwhile, the roof plane normals are used as a directional constraint: while the random sampling algorithm fits the point cloud data into a plurality of roof plane models P_i, the corresponding normal directions n_i are computed. When a roof plane is approximately horizontal, its normal is approximately perpendicular to the ground; in that case the plane model is corrected to be parallel to the ground, i.e. its normal is made perpendicular to the ground. By computing the angle between each plane normal n_i and the z-axis direction n_z of the object-space coordinate system, the fitted point cloud planes are divided into horizontal and inclined types; in FIG. 4, the dark point cloud represents horizontal planes and the light point cloud represents inclined planes. Finally, the point sets are re-projected into the digital surface model according to their attributes, thereby delineating the initial patches that reflect the roof scene distribution.
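The iterative RANSAC procedure described above (pick 3 random points, count the points close to the resulting plane, keep the best plane, then repeat on the remaining outliers) can be sketched with the standard library alone. The distance tolerance `dist_tol`, the iteration count `iters`, the minimum support `min_points` and the fixed random seed are assumptions chosen for illustration, not values from this embodiment.

```python
import random


def plane_from_points(p1, p2, p3):
    """Return (a, b, c, d) with unit normal for the plane through three points,
    or None when the points are collinear."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    a, b, c = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    n = (a * a + b * b + c * c) ** 0.5
    if n < 1e-12:
        return None
    a, b, c = a / n, b / n, c / n
    return a, b, c, -(a * p1[0] + b * p1[1] + c * p1[2])


def ransac_plane(points, dist_tol=0.05, iters=200, rng=None):
    """Fit the single plane supported by the largest number of inliers."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    best_plane, best_inliers = None, []
    for _ in range(iters):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c * p[2] + d) <= dist_tol]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


def extract_planes(points, min_points=10, **kwargs):
    """Repeat RANSAC on the remaining outliers until no plane has enough support."""
    planes, remaining = [], list(points)
    while len(remaining) >= min_points:
        plane, inliers = ransac_plane(remaining, **kwargs)
        if plane is None or len(inliers) < min_points:
            break
        planes.append((plane, inliers))
        inlier_set = set(inliers)
        remaining = [p for p in remaining if p not in inlier_set]
    return planes
```

Each returned plane carries its unit normal (a, b, c), which is the quantity compared against the z axis when the planes are divided into horizontal and inclined types.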

Step 5, decomposing the irregular building polygon into a plurality of rectangular areas based on the grids; in each building rectangular area, building roof vector line segments with complete topology output in the step 2 are used as geometric constraints, the initial patches in the step 4 are combined to construct an overall objective function of the initial patches, and finally local optimal solution is carried out on the overall objective function in each rectangular area, so that the construction of a building roof three-dimensional patch model is completed;

When three-dimensional building models are reconstructed at large scale, the area covered by the aerial image contains building targets of various irregular shapes. To ensure the robustness and accuracy of the building three-dimensional model construction algorithm, the building must first be partitioned into rectangular regions, and this embodiment designs a grid-based region decomposition method. Then, the gradients of the DSM in the two orthogonal directions, horizontal and vertical, are calculated from adjacent-pixel interpolation, and in each direction a line segment whose gradient value is greater than a threshold and whose mean DSM height is a peak within the non-maximum suppression window is selected as a candidate rectangle boundary. Each candidate boundary is further filtered using the color information of the corresponding aerial ortho-image: buffer areas are created on both sides of each boundary line segment, and the candidate boundary is retained when the gray-scale color difference is greater than the color threshold. Meanwhile, the original image is uniformly down-sampled by a factor of two twice to obtain grid layers of different resolutions. Finally, since rectangle extraction performed independently in the buffers may generate duplicate boundaries, thresholds are set on the color difference and the height difference of the rectangles to decide whether to merge the adjacent rectangles on both sides; the specific criterion is as follows:

(1) the average color difference on the orthoimage is less than a color threshold;

(2) the average height difference in the digital surface model is less than a height threshold;

(3) a third quantity is less than its threshold.

Merging adjacent rectangles according to these criteria achieves the rectangular partitioning.
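The merge test can be sketched as below. It is an illustration under stated assumptions only: the function names are hypothetical, the "average difference" is taken as the absolute difference of the two region means, and the criteria are combined with a logical AND. Only the two criteria whose meaning is stated in the text (color and height) are modeled, and the threshold values are left to the caller because the source does not give them.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    if not values:
        raise ValueError("region must contain at least one sample")
    return sum(values) / len(values)


def should_merge(gray_a, gray_b, height_a, height_b,
                 color_threshold, height_threshold):
    """Decide whether two adjacent rectangles should be merged.

    gray_a / gray_b: orthoimage gray values sampled in each rectangle.
    height_a / height_b: DSM heights sampled in each rectangle.
    Assumption: both differences must fall below their thresholds.
    """
    color_diff = abs(mean(gray_a) - mean(gray_b))
    height_diff = abs(mean(height_a) - mean(height_b))
    return color_diff < color_threshold and height_diff < height_threshold
```

Under these assumptions, two rectangles with similar gray values but a five-metre height step are kept apart, which is the behavior needed to preserve boundaries between roof sections at different levels.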

With this region partitioning method, complex and irregular building polygons can be decomposed into combinations of independent basic rectangles. This not only allows each individual building model to be reconstructed, but also makes joint optimization of the probabilistic graphical model within each building rectangular area more convenient.

Within each rectangular area, the roof vector structure lines obtained in step 2 are used to cut out a number of polygonal cells, and a dual graph is constructed. The dual graph is an undirected graph: each vertex v_i represents a distinct polygonal cell c_i, and each edge indicates that the polygonal cells corresponding to its two end vertices are adjacent on the roof. The formulation also uses the initial patches fitted in the digital surface model. Within each polygonal cell, every discrete pixel corresponds to a spatial three-dimensional object point in the digital surface model. A distance term reflects the sum of the distances from the pixels in a polygon to the corresponding fitted initial patch, and a smoothing term reflects the pairwise smoothness relationship between two adjacent polygonal cells c_i and c_j. The overall objective function of the initial patches combines the distance term and the smoothing term, with a weight balancing the two; a set of index pairs identifies the adjacent patches, each pair consisting of the i-th patch and a patch adjacent to it.

This optimization problem is solved by minimizing the objective function with the graph-cut algorithm, which yields the final polygonal cells labeled with roof planes; these are visualized as the building three-dimensional patch model.
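The labeling problem above can be illustrated with the following sketch. Several points are assumptions rather than the patented method: the smoothing term is modeled as a Potts penalty (1 for each pair of adjacent cells with different plane labels), planes are given as coefficients (a, b, c, d) of ax + by + cz + d = 0, and all names are hypothetical. The minimizer shown is iterated conditional modes (ICM), a simple greedy substitute used only to keep the example self-contained; the text itself uses graph cut.

```python
def point_plane_distance(point, plane):
    """Distance from point (x, y, z) to plane (a, b, c, d): ax + by + cz + d = 0."""
    a, b, c, d = plane
    x, y, z = point
    return abs(a * x + b * y + c * z + d) / (a * a + b * b + c * c) ** 0.5


def data_cost(points, plane):
    """Distance term: summed distance of a cell's 3D points to one plane."""
    return sum(point_plane_distance(p, plane) for p in points)


def energy(labels, cells, planes, edges, lam):
    """Overall objective: distance term plus lam times an assumed Potts term."""
    data = sum(data_cost(pts, planes[labels[i]]) for i, pts in enumerate(cells))
    smooth = sum(1 for i, j in edges if labels[i] != labels[j])
    return data + lam * smooth


def minimize_icm(cells, planes, edges, lam, max_sweeps=20):
    """Greedy ICM stand-in for graph cut: relabel one cell at a time."""
    # Start from the plane that best fits each cell on its own.
    labels = [min(range(len(planes)), key=lambda k: data_cost(pts, planes[k]))
              for pts in cells]
    current = energy(labels, cells, planes, edges, lam)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(cells)):
            for k in range(len(planes)):
                trial = labels[:i] + [k] + labels[i + 1:]
                trial_energy = energy(trial, cells, planes, edges, lam)
                if trial_energy < current:  # accept only strict improvements
                    labels, current, changed = trial, trial_energy, True
        if not changed:
            break
    return labels
```

ICM only reaches a local minimum and depends on its starting labels, so it is not a replacement for graph cut in practice; it is shown here because it makes the interplay between the distance term and the weighted smoothing term easy to follow on small inputs.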

While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.

Claims

1. A three-dimensional surface patch model construction method based on image building structure learning is characterized by comprising the following steps:

step 1, acquiring aerial image data of a building, and preprocessing the aerial image data to obtain an aerial orthographic image and a digital surface model;

2. The method for constructing a three-dimensional patch model based on image building structure learning of claim 1, wherein the building roof segment detection network model comprises a backbone network, a segment detection branch, a semantic segmentation branch and an LOI Pooling module;

3. The method for constructing a three-dimensional patch model based on image building structure learning of claim 2, wherein the line segment detection branch comprises an encoder, a node prediction module and a mixed sampling module, and the method for generating the line segment candidates in the line segment detection branch comprises the following steps:

a shared feature map, output by the backbone network and downsampled by the encoder, is taken as input, and the node prediction module extracts the position coordinates of the key roof nodes; the shared feature map is divided into a plurality of grid cells, and the ground-truth values of the key nodes are converted into Gaussian probability distributions at the corresponding coordinates; the node probability distribution in the grid cells is described by a node probability distribution map E and a position offset map O, in which the nodes are indexed by i and W_b denotes the width of a grid cell;

finally, the K key nodes with the highest prediction probability are selected, and the mixed sampling module generates candidate line segments from these key nodes.

4. The method for constructing a three-dimensional patch model based on image building structure learning according to claim 1, wherein the specific method for removing abnormal noise in the digital surface model in step 3 is as follows:

according to the height information and the color information in the digital surface model and the spatial correlation between the two, a depth guide map is introduced to calculate the color-domain weight, and the original abnormal data noise is removed by joint bilateral upsampling combined with the gradient values of the digital surface model, thereby improving the data quality of the digital surface model.

5. The image building structure learning-based three-dimensional patch model construction method according to claim 4, wherein the upsampling method specifically comprises:

wherein p denotes the coordinates of the kernel center; q denotes the coordinates of a pixel within the neighboring window D; the spatial kernel and the color kernel each have their own standard deviation; the inputs are the DSM gray value at coordinate q, the RGB values of the remote sensing orthoimage at coordinates p and q, and the depth values at coordinates p and q; the gradient terms give the degree of gradient change of the depth image at p and q in the horizontal and vertical directions; x and y denote the horizontal and vertical directions, respectively; p_x and p_y denote the x and y coordinates of p; q_x and q_y denote the x and y coordinates of q; and σ_r denotes the standard deviation of the color kernel;

6. The method for constructing the three-dimensional surface patch model based on the image building structure learning of claim 1, wherein the method for drawing the initial surface element of the building roof scene distribution in the step 4 comprises the following steps:

fitting a plane model parameter equation with the highest confidence coefficient from the point cloud data by adopting a RANSAC algorithm;

7. The method for constructing a three-dimensional patch model based on image building structure learning of claim 1, wherein in step 5, the method for decomposing the irregular building polygon into a plurality of rectangular areas comprises:

8. The method for constructing a three-dimensional patch model based on image building structure learning of claim 7, wherein the specific criteria for whether to merge adjacent rectangles on two sides are as follows:

(1) the average color difference on the orthoimage is less than a color threshold;

(2) the average height difference in the digital surface model is less than a height threshold;

(3) a third quantity is less than its threshold.

9. The method for constructing a three-dimensional patch model based on image building structure learning according to claim 1, wherein the step 5 further comprises:

constructing an overall objective function of the initial patches that combines a distance term and a smoothing term, wherein the smoothing term represents the smooth transition relationship between two polygonal cells; a weight balances the distance term against the smoothing term; and a set of index pairs identifies adjacent patches, each pair consisting of the i-th patch and a patch adjacent to it;

the final polygon elements marked with the roof plane are obtained by minimizing the overall objective function using a graph cut algorithm and visualized as a building three-dimensional patch model.