CN115564926B - Three-dimensional patch model construction method based on image building structure learning - Google Patents

Three-dimensional patch model construction method based on image building structure learning

Info

Publication number
CN115564926B
Authority
CN
China
Prior art keywords
building
roof
image
model
digital surface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211552856.0A
Other languages
Chinese (zh)
Other versions
CN115564926A (en)
Inventor
郑先伟
龚健雅
王嘉宁
熊汉江
岳林蔚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN202211552856.0A
Publication of CN115564926A
Application granted
Publication of CN115564926B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20 Finite element generation, e.g. wire-frame surface description, tesselation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05 Geographic models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G06T2200/08 Indexing scheme for image data processing or generation, in general involving all processing steps from image acquisition to 3D model generation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A30/00 Adapting or protecting infrastructure or their operation
    • Y02A30/60 Planning or developing urban green infrastructure

Abstract

The invention discloses a three-dimensional patch model construction method based on image building structure learning, comprising the following steps: acquiring aerial images of a building and processing them to obtain an aerial ortho-image and a digital surface model; constructing a building roof line segment detection network model to extract building roof vector line segments from the aerial ortho-image; removing abnormal noise from the digital surface model with a joint bilateral sampling method constrained by a gradient term; converting the digital surface model into corresponding three-dimensional point cloud data and fitting roof planes to the point cloud so as to delineate the initial patches that describe the layout of the building roof; decomposing the building polygon into rectangular areas; and, in each rectangular area, using the roof vector structure lines as geometric constraints, constructing an overall objective function together with the initial roof patches, and solving that function for a local optimum to complete the three-dimensional patch model of the building roof. The invention provides a foundation for the subsequent fine reconstruction of building models.

Description

Three-dimensional patch model construction method based on image building structure learning
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a three-dimensional patch model construction method based on image building structure learning.
Background
With the rapid progress of smart city construction and urbanization in China, economic and social development requires spatial information to move from two dimensions to three, and there is an urgent need for high-precision, fine-grained modeling methods that represent real geographic space more faithfully. Buildings are the main ground-feature category in urban scenes, and large-scale realistic three-dimensional building models are the data basis for urban land resource management and various spatial analysis applications. However, although mature technologies and workflows exist for the three-dimensional geometric reconstruction of large-scale urban buildings, research on automatic semantic recognition and individual-building modeling remains insufficient. Existing approaches suffer from fragmented workflows, low efficiency and difficulty scaling to large areas, so realistic three-dimensional building model construction is hard to integrate into application-level entity data production systems.
In recent years, the rapid development of sensor and earth observation technology has made high-resolution aerial images widely available, and three-dimensional reconstruction from aerial images has become an important way to build large-scale urban building models thanks to its low cost, simple workflow and adaptability to varied terrain and ground-object environments. However, when extracting building information from aerial images of urban scenes that contain many kinds of ground objects and a large amount of redundant information, the prior art usually extracts only the external contour of a building based on spectral, textural and geometric features. This oversimplifies the structural information of the building and loses the geometric structure and spatial topology inside the roof, which makes the subsequent fine reconstruction of the building model very difficult. In the three-dimensional reconstruction stage, the digital surface model generated by dense matching of multi-view images suffers from abrupt elevation changes, blurred edges and internal holes. Because it lacks salient geometric features, it can only be used to reconstruct low-quality LOD1-level models, which greatly restricts the usability of the reconstructed models.
Disclosure of Invention
To address the deficiencies of the prior art, the invention provides a three-dimensional patch model construction method based on image building structure learning. The method preserves the geometric structure and spatial topology inside the roof and provides a foundation for the subsequent fine reconstruction of building models.
In order to solve the technical problems, the invention adopts the following technical scheme:
a three-dimensional patch model construction method based on image building structure learning comprises the following steps:
step 1, acquiring aerial image data of a building, and preprocessing the aerial image data to obtain an aerial ortho-image and a digital surface model;
step 2, constructing a building roof line segment detection network model and inputting the aerial ortho-image into it; the model extracts roof key node information according to the geometric and semantic information of the building to generate candidate line segments, fuses multi-scale features of the aerial ortho-image, aligns the candidate line segments with a feature layer, filters out candidate line segments with incorrect topology, and outputs the topologically complete building roof vector line segments in the aerial ortho-image;
step 3, removing abnormal noise from the digital surface model with a joint bilateral sampling method constrained by a gradient term, thereby improving the data quality of the digital surface model;
step 4, converting the digital surface model processed in step 3 into corresponding three-dimensional point cloud data, fitting roof planes to the point cloud so as to extract all roof planes, classifying the roof planes by attribute according to whether they are inclined, and finally delineating the initial patches that describe the layout of the building roof according to those attributes;
step 5, decomposing the irregular building polygon into a plurality of rectangular areas on a grid basis; in each rectangular area, taking the topologically complete building roof vector line segments output in step 2 as geometric constraints, constructing an overall objective function of the initial patches together with the initial roof patches from step 4, and finally solving the overall objective function for a local optimum in each rectangular area, thereby completing the three-dimensional patch model of the building roof.
Further, the building roof line segment detection network model comprises a backbone network, a line segment detection branch, a semantic segmentation branch and an LOI Pooling module;
the backbone network extracts features from the aerial ortho-image and feeds the extracted feature maps into the line segment detection branch and the semantic segmentation branch;
the line segment detection branch extracts roof key node information according to the geometric and semantic information of the building and generates candidate line segments;
the semantic segmentation branch fuses the multi-scale features of the aerial ortho-image;
and the LOI Pooling module aligns the candidate line segments with the deeply down-sampled feature layer, filters out candidate line segments with incorrect topology, and outputs the topologically complete building roof vector line segments in the aerial ortho-image.
Further, the line segment detection branch comprises an encoder, a node prediction module and a hybrid sampling module, and candidate line segments are generated in the line segment detection branch as follows:
the shared feature map obtained by encoder down-sampling is used as input, the shared feature map being output by the backbone network; the node prediction module extracts the position coordinates of the roof key nodes; the shared feature map is divided into a plurality of grid cells, the ground-truth key nodes are converted into Gaussian probability distributions at the corresponding coordinates, and the node distribution within the grid cells is converted into a node probability distribution map E and a position offset map O, which are expressed as:
[Figure SMS_1: equation image defining E and O, not reproduced in the source text]
where [Figure SMS_2] indicates that the corner index set V contains a node i located in grid cell b; the center coordinate of grid cell b is denoted [Figure SMS_3]; [Figure SMS_4] denotes the coordinates of the i-th node; and W_b denotes the width of grid cell [Figure SMS_5];
finally, the K key nodes with the highest predicted probability are selected, and the hybrid sampling module generates candidate line segments from them.
Further, the specific method for removing abnormal noise from the digital surface model in step 3 is as follows:
based on the height information and color information of the digital surface model and the spatial correlation between them, a depth guide image is introduced into the color-domain weight calculation, and the original abnormal noise is removed by joint bilateral sampling combined with the gradient values of the digital surface model, thereby improving the data quality of the digital surface model.
Further, the upsampling method is specifically:
[Figure SMS_6: equation image of the improved joint bilateral sampling formula, not reproduced in the source text]
where p denotes the coordinates of the kernel center; q denotes the coordinates of a pixel within the neighborhood window D; [Figure SMS_11] denotes the output depth at coordinate p; W_p denotes the normalization coefficient at coordinate p; [Figure SMS_12] denotes the spatial kernel, in which [Figure SMS_14] is the standard deviation of the spatial kernel; [Figure SMS_16] denotes the input DSM gray value at coordinate q; [Figure SMS_18] denotes the color kernel, in which [Figure SMS_19] is the standard deviation of the color kernel; [Figure SMS_20] denotes the input RGB value of the remote sensing ortho-image at coordinate p; [Figure SMS_10] denotes the input RGB value at coordinate q; [Figure SMS_13] denotes the input depth value at coordinate p; [Figure SMS_15] denotes the input depth value at coordinate q; the depth image [Figure SMS_17] is obtained by linear interpolation of the digital surface model; and α is a normalized adaptive weight;
in addition, compared with gray-value differences computed in the color domain, the gradient values of the digital surface model better reflect image edges, so the original Gaussian color kernel is modified to compute a gradient-term constraint between adjacent pixels:
[Figure SMS_21, Figure SMS_22, Figure SMS_23: equation images, not reproduced in the source text]
where G_x and G_y denote the degree of gradient change of the image [Figure SMS_24] at p and q in the horizontal and vertical directions, respectively; x and y denote the horizontal and vertical directions; p_x and p_y denote the x and y coordinates of p; q_x and q_y denote the x and y coordinates of q; and σ_r denotes the standard deviation of the color kernel;
the improved joint bilateral sampling method removes the original abnormal noise from the digital surface model, thereby improving its data quality.
Further, the method for delineating the initial patches of the building roof layout in step 4 comprises:
fitting the plane model parameters with the highest confidence from the point cloud data using the RANSAC algorithm;
iteratively applying the RANSAC algorithm to the remaining outliers until all roof planes are extracted, and computing the normal direction of each fitted plane while the point cloud is being fitted into a plurality of roof planes;
computing the angle between the normal of each roof plane and the z-axis of the object coordinate system, and thereby classifying the roof planes as horizontal or inclined;
and finally, re-projecting the point sets corresponding to the roof planes into the digital surface model according to their attributes, thereby marking the initial patches that reflect the layout of the roof.
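The iterative RANSAC fitting and the horizontal/inclined classification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the distance threshold, the iteration count, the minimum inlier count and the 5-degree tilt tolerance are all hypothetical values chosen for the example.

```python
import math
import random


def plane_from_points(a, b, c):
    """Return (unit normal n, offset d) of the plane n.x + d = 0, or None if collinear."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0]]
    length = math.sqrt(sum(x * x for x in n))
    if length < 1e-12:
        return None
    n = [x / length for x in n]
    return n, -sum(n[i] * a[i] for i in range(3))


def ransac_plane(points, threshold=0.05, iterations=200, rng=None):
    """Keep the 3-point plane hypothesis supported by the most near-plane points."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    best_inliers, best_model = [], None
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue
        n, d = model
        inliers = [p for p in points
                   if abs(sum(n[i] * p[i] for i in range(3)) + d) < threshold]
        if len(inliers) > len(best_inliers):
            best_inliers, best_model = inliers, model
    return best_model, best_inliers


def extract_planes(points, min_inliers=10, **kwargs):
    """Fit a plane, remove its inliers, and repeat on the remaining outliers."""
    planes, remaining = [], list(points)
    while len(remaining) >= max(3, min_inliers):
        model, inliers = ransac_plane(remaining, **kwargs)
        if model is None or len(inliers) < min_inliers:
            break
        planes.append((model, inliers))
        inlier_set = set(inliers)  # points must be hashable tuples
        remaining = [p for p in remaining if p not in inlier_set]
    return planes


def roof_attribute(normal, max_tilt_deg=5.0):
    """Label a plane by the angle between its normal and the z-axis (assumed tolerance)."""
    tilt = math.degrees(math.acos(min(1.0, abs(normal[2]))))
    return "horizontal" if tilt <= max_tilt_deg else "inclined"
```

Points are passed as `(x, y, z)` tuples; each returned entry pairs the plane parameters with its inlier set, which is what would be re-projected into the digital surface model to mark a patch.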
Further, in step 5, the method for decomposing the irregular building polygon into a plurality of rectangular areas comprises:
first, rotating the target building polygon so that its principal direction is parallel to the X axis of the aerial image, and applying the same transformation to the corresponding areas of the aerial ortho-image and the digital surface model; then, computing the gradients of the digital surface model in the horizontal and vertical directions from the differences between adjacent pixels, and, in each direction, selecting as candidate rectangle boundaries the line segments whose gradient value exceeds a threshold and whose average digital surface model height is a peak within a non-maximum suppression window; each candidate boundary is further filtered using the color information of the corresponding aerial ortho-image: buffer areas are created on both sides of each boundary line segment, and the candidate boundary is retained when the gray-level color difference between them exceeds a threshold; in addition, the original image is repeatedly down-sampled uniformly by a factor of two to obtain grid layers at different resolutions;
finally, thresholds are set on the color difference and height difference of the rectangles to decide whether to merge the adjacent rectangles on both sides, so that the complex, irregular building polygon is decomposed into a plurality of rectangular areas.
Further, whether to merge the adjacent rectangles on both sides is decided by the following criterion:
[Figure SMS_25: equation image, not reproduced in the source text]
where [Figure SMS_26] and [Figure SMS_27] denote the average colors of the two rectangular areas on the aerial ortho-image, and [Figure SMS_28] and [Figure SMS_29] denote the average heights of the two rectangular areas on the digital surface model;
two adjacent rectangles are merged when the two rectangular areas simultaneously satisfy the following three conditions:
(1) the average color difference on the ortho-image [Figure SMS_30] is less than the color threshold [Figure SMS_31];
(2) the average height difference on the digital surface model [Figure SMS_32] is less than the height threshold [Figure SMS_33];
(3) to avoid merging across a narrow street in between, the height gradient [Figure SMS_34] in the overlapping buffer area covering the common edge is less than the threshold [Figure SMS_35].
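The three-condition merge test can be written as a single predicate. The sketch below assumes scalar gray-level means for color and uses placeholder threshold values; the function name, parameter names and thresholds are hypothetical, since the source gives the criterion only as an equation image.

```python
def should_merge(color_a, color_b, height_a, height_b, edge_gradient,
                 t_color=20.0, t_height=1.0, t_gradient=0.5):
    """Return True only when all three merge conditions hold.

    color_a, color_b   : mean gray values of the two rectangles on the ortho-image
    height_a, height_b : mean DSM heights of the two rectangles
    edge_gradient      : height gradient in the buffer overlapping the shared edge
    The thresholds are placeholder values, not taken from the source.
    """
    return (abs(color_a - color_b) < t_color
            and abs(height_a - height_b) < t_height
            and edge_gradient < t_gradient)
```

Requiring all three conditions together means that two roofs of similar color and height are still kept apart when a sharp height drop, such as a narrow street, runs along their common edge.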
Further, step 5 further comprises:
cutting each rectangular area into a plurality of polygon units using the roof vector structure lines extracted in step 2; the overall objective function of the initial patches [Figure SMS_36] is then expressed as:
[Figure SMS_37: equation image, not reproduced in the source text]
where [Figure SMS_38] is the smoothing term, which represents the smooth transition relationship between two polygons and measures planar consistency between polygon units; the distance term [Figure SMS_39] reflects how well the initial patch fits the digital surface model; [Figure SMS_40] balances the weights of the distance term and the smoothing term; [Figure SMS_41] denotes the index set of adjacent patch pairs; [Figure SMS_42] denotes the i-th patch; and [Figure SMS_43] is a patch adjacent to [Figure SMS_44];
the overall objective function is minimized with a graph cut algorithm to obtain the final polygon units labeled with roof planes, which are visualized as the three-dimensional patch model of the building.
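To make the structure of such an objective concrete, the sketch below evaluates an energy made of a per-polygon distance (data) cost plus a weighted smoothness penalty over adjacent pairs. Two things are assumptions: the smoothness term is taken to be a simple Potts penalty (cost 1 when neighbors carry different plane labels), and the minimizer is iterated conditional modes, a greedy stand-in used here to stay dependency-free, not the graph cut algorithm the text names. All function and parameter names are hypothetical.

```python
def total_energy(labels, data_cost, neighbors, lam):
    """E = sum_i data_cost[i][labels[i]] + lam * (number of adjacent pairs that differ)."""
    data = sum(data_cost[i][l] for i, l in enumerate(labels))
    smooth = sum(1 for i, j in neighbors if labels[i] != labels[j])
    return data + lam * smooth


def icm(data_cost, neighbors, lam, max_sweeps=20):
    """Greedy coordinate descent: relabel one polygon at a time while E decreases.

    data_cost[i][l] : misfit between polygon i and plane l (e.g. DSM residual)
    neighbors       : list of (i, j) index pairs of adjacent polygons
    """
    n, k = len(data_cost), len(data_cost[0])
    # Start from the label with the lowest data cost for each polygon.
    labels = [min(range(k), key=lambda l: data_cost[i][l]) for i in range(n)]
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            best = min(range(k), key=lambda l: total_energy(
                labels[:i] + [l] + labels[i + 1:], data_cost, neighbors, lam))
            if best != labels[i]:
                labels[i], changed = best, True
        if not changed:
            break
    return labels
```

With a positive weight, a polygon whose best-fitting plane differs only marginally from that of its neighbors is pulled onto the neighbors' plane, which is the smoothing behavior the objective is meant to encode. A graph cut solver would typically reach a lower energy than this greedy loop.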
Compared with the prior art, the invention has the following beneficial effects:
1) Current building information extraction tasks focus only on the external contour of a building and ignore the geometric structure and topology inside the roof. To address this, the invention introduces the line segment detection task from computer vision and recasts conventional building extraction as building roof structure line detection, so that the detection result contains both the external edge contour and the internal geometric structure of the building. This supplies topological structure information for three-dimensional building reconstruction and provides a foundation for the subsequent fine reconstruction of building models;
2) Existing building extraction methods neglect geometric elements inside the roof such as points, lines and faces, and are easily disturbed by multi-level feature noise and other ground objects. To address this, the invention provides a deep network for building roof structure line extraction that fuses geometric structure and semantic information. The network uses a line segment detection branch to perform hybrid sampling from the key corner points of a building and generate candidate line segments, fuses the building region information obtained by a semantic perception branch, filters out low-confidence candidate line segments with a line segment verification network and a heuristic algorithm, and finally obtains roof structure line segments that are topologically complete and accurately positioned;
3) Three-dimensional building models generated directly from a digital surface model currently suffer from low accuracy, poor realism and severe loss of structural features. To address this, the invention provides a large-scale three-dimensional building modeling method based on geometric constraints from the roof topology. The method takes the digital surface model generated by multi-view aerial image matching as input, performs multi-plane fitting with the RANSAC algorithm to obtain initial polygonal roof patches, introduces the roof structure lines as geometric constraints to build a probabilistic graphical model, and uses a Markov random field optimization algorithm to push the initial planes onto the corresponding line segments, finally obtaining an LOD2-level building roof patch model.
Drawings
FIG. 1 is a flow chart of a method for constructing a three-dimensional patch model of a building roof according to an embodiment of the invention;
FIG. 2 is a schematic structural diagram of a building rooftop segment detection network model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a key node of a roof of an aerial image building according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a building roof initial patch segmentation process according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the following embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
The present invention is further illustrated by the following examples, which are not to be construed as limiting the invention.
Referring to fig. 1, the embodiment of the invention discloses a building roof three-dimensional patch model construction method based on aerial images, which comprises the following steps:
step 1, acquiring aerial image data of a building, and preprocessing the aerial image data to obtain an aerial ortho-image and a digital surface model;
First, feature points are extracted from the two-dimensional images and matched, the camera poses are computed, and the exterior orientation elements of all images are solved. Object-space coordinates are then obtained by multi-view dense matching, and point cloud data are generated by triangulation. The point cloud is interpolated to build a Delaunay triangulation and the corresponding Digital Surface Model (DSM). The digital surface model is then used for digital differential rectification of the remote sensing image data, and the rectified images are mosaicked to obtain the aerial ortho-image. In practice, the aerial ortho-image and digital surface model can usually be generated automatically from the aerial image data with professional photogrammetric software.
Step 2, constructing a building roof line segment detection network model and inputting the aerial ortho-image into it; the model extracts roof key node information according to the geometric and semantic information of the building to generate candidate line segments, fuses multi-scale features of the aerial ortho-image, aligns the candidate line segments with the deeply down-sampled feature layer, filters out candidate line segments with incorrect topology, and outputs the topologically complete building roof vector line segments in the aerial ortho-image;
As shown in FIG. 2, this embodiment formulates a line segment detection task that focuses on capturing the roof topology of buildings, and constructs a building roof line segment detection network model for high-resolution aerial images based on deep learning and convolutional neural network theory. The model comprises a backbone network, a line segment detection branch, a semantic segmentation branch and an LOI Pooling module. The backbone is a stacked hourglass network formed by connecting 5 basic hourglass structures in series; each basic hourglass first applies convolution and max-pooling layers to extract features from and down-sample the aerial ortho-image, and then uses nearest-neighbor up-sampling and skip connections to restore the feature map to the input size. The feature maps extracted by the backbone are fed into the line segment detection branch and the semantic segmentation branch.
The line segment detection branch extracts roof key node information according to the geometric and semantic information of the building and generates candidate line segments; it comprises an encoder, a node prediction module and a hybrid sampling module. Specifically, the branch takes as input the shared feature map obtained by encoder down-sampling, the shared feature map being output by the backbone network, and uses an anchor-free node prediction module to extract the position coordinates of the roof key nodes. As shown in FIG. 3, the shared feature map of size H × W is divided into H_b × W_b grid cells, and the ground-truth key points are converted into Gaussian probability distributions at the corresponding coordinates. Each grid cell plays the role of a candidate anchor box, which makes it straightforward to determine whether a node exists in the cell and to predict its relative position within the cell. The node distribution within the grid cells is represented by a node probability distribution map E and a position offset map O, which can be expressed as:
[Figure SMS_45: equation image defining E and O, not reproduced in the source text]
where [Figure SMS_46] indicates that the corner index set V contains a node i located in grid cell b; the center coordinate of grid cell b is denoted [Figure SMS_47]; [Figure SMS_48] denotes the coordinates of the i-th node; and W_b denotes the width of grid cell [Figure SMS_49];
the K key nodes with the highest predicted probability are then selected, and the hybrid sampling module generates candidate line segments from them.
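The grid-cell encoding can be illustrated with a small sketch. Because the equations appear only as images, the form used here is an assumption borrowed from common junction-heatmap formulations: E(b) is 1 when a node falls inside cell b and 0 otherwise (a hard label instead of the Gaussian described in the text), and O(b) is the node's displacement from the cell center divided by the cell width. The function names are hypothetical.

```python
def encode_nodes(nodes, height, width, cell):
    """Encode node coordinates into an occupancy map E and an offset map O.

    nodes         : list of (x, y) node coordinates in feature-map pixels
    height, width : feature-map size
    cell          : grid-cell width W_b (cells assumed square)
    """
    rows, cols = height // cell, width // cell
    E = [[0.0] * cols for _ in range(rows)]
    O = [[(0.0, 0.0)] * cols for _ in range(rows)]
    for x, y in nodes:
        c, r = int(x // cell), int(y // cell)
        if not (0 <= r < rows and 0 <= c < cols):
            continue  # node lies outside the feature map
        cx, cy = (c + 0.5) * cell, (r + 0.5) * cell  # cell center
        E[r][c] = 1.0
        O[r][c] = ((x - cx) / cell, (y - cy) / cell)  # each component in [-0.5, 0.5)
    return E, O


def top_k_nodes(E, O, cell, k):
    """Decode the k highest-scoring cells back to node coordinates."""
    scored = [(E[r][c], r, c) for r in range(len(E)) for c in range(len(E[0]))]
    scored.sort(reverse=True)
    decoded = []
    for _, r, c in scored[:k]:
        ox, oy = O[r][c]
        decoded.append(((c + 0.5 + ox) * cell, (r + 0.5 + oy) * cell))
    return decoded
```

Predicting a per-cell offset lets the network recover sub-cell node positions even though the probability map is defined on a coarse grid.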
In the semantic segmentation branch, a dilated (atrous) convolution structure is applied to fuse the multi-scale features of the aerial image, so that features at all levels are propagated effectively and used selectively. Finally, the LOI Pooling module aligns the candidate line segments with the deeply down-sampled feature layer to filter out line segments with incorrect topology, thereby outputting the vector line segments of the building roof in the aerial image.
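The alignment of a candidate segment with a feature layer can be pictured as sampling the feature map at evenly spaced points along the segment. The sketch below assumes that reading of LOI Pooling (uniform sampling with bilinear interpolation on a single-channel map); the source does not spell out the module's internals, and the function names and the default of 8 sample points are hypothetical.

```python
def bilinear(feat, x, y):
    """Bilinearly interpolate a 2-D feature map (list of rows) at (x, y)."""
    h, w = len(feat), len(feat[0])
    x = min(max(x, 0.0), w - 1.0)  # clamp to the map
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = feat[y0][x0] * (1 - fx) + feat[y0][x1] * fx
    bottom = feat[y1][x0] * (1 - fx) + feat[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def loi_pool(feat, p_start, p_end, n_points=8):
    """Sample n_points (at least 2) evenly along the segment p_start -> p_end."""
    (xs, ys), (xe, ye) = p_start, p_end
    samples = []
    for k in range(n_points):
        t = k / (n_points - 1)
        samples.append(bilinear(feat, xs + t * (xe - xs), ys + t * (ye - ys)))
    return samples
```

The resulting fixed-length vector is the kind of input a verification head could score, so that candidates lying on weak or inconsistent features can be rejected regardless of segment length.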
Step 3, removing abnormal noise from the digital surface model with a joint bilateral sampling method constrained by a gradient term, thereby improving the data quality of the digital surface model;
Because of vegetation occlusion, densely distributed ground objects and the three-dimensional structure of buildings, images captured by aerial photogrammetry contain blind spots and occluded areas. A Digital Surface Model (DSM) generated by dense matching of aerial images therefore contains a large amount of noise, typically appearing as abnormal elevation fluctuations, regional holes or jagged edges, which significantly degrades data quality. To take both the height characteristics and the spectral information of buildings into account, a joint bilateral sampling method is used to remove abnormal noise from the digital surface model data and improve its quality. Because the gray values on either side of DSM edges map to similar black-and-white tones, while color changes together with elevation in regions of abrupt elevation change, this embodiment introduces a depth guide map into the color-domain weight calculation on the basis of the original joint bilateral method. The improved joint bilateral sampling formula is:
[Figure SMS_50: equation image of the improved joint bilateral sampling formula, not reproduced in the source text]
where p denotes the coordinates of the kernel center; q denotes the coordinates of a pixel within the neighborhood window D; [Figure SMS_53] denotes the output depth at coordinate p; W_p denotes the normalization coefficient at coordinate p; [Figure SMS_54] denotes the spatial kernel, in which [Figure SMS_57] is the standard deviation of the spatial kernel; [Figure SMS_58] denotes the input DSM gray value at coordinate q; [Figure SMS_59] denotes the color kernel, in which [Figure SMS_60] is the standard deviation of the color kernel; [Figure SMS_61] denotes the input RGB value of the remote sensing ortho-image at coordinate p; [Figure SMS_51] denotes the input RGB value at coordinate q; [Figure SMS_52] denotes the input depth value at coordinate p; [Figure SMS_55] denotes the input depth value at coordinate q; the depth image [Figure SMS_56] is obtained by linear interpolation of the digital surface model; and α is a normalized adaptive weight;
in addition, compared with the calculation of similar gray value difference in a color domain, the gradient value in the digital surface model can reflect the edge characteristics of the image better, and the original color kernel Gaussian function is transformed to calculate the gradient term constraint between adjacent pixels:
Figure SMS_62
Figure SMS_63
Figure SMS_64
where G_x and G_y denote the degree of gradient change of the image
Figure SMS_65
at p and q in the horizontal and vertical directions, respectively; x and y denote the horizontal and vertical directions, respectively; p_x and p_y denote the x and y coordinates of p; q_x and q_y denote the x and y coordinates of q; and σ_r denotes the standard deviation of the color kernel;
By applying the improved joint bilateral upsampling method, the abnormal noise originally present in the digital surface model is removed, improving the data quality of the digital surface model.
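The weighting idea of step 3 can be illustrated with a minimal NumPy sketch of a joint bilateral filter whose range weight blends an ortho-image color difference with a depth-guide difference. The exact formula of this embodiment is contained in the figures and is not reproduced here; the blending form, the function name `joint_bilateral_dsm`, and all default parameter values are assumptions made for illustration, and the gradient-term variant is not implemented.

```python
import numpy as np

def joint_bilateral_dsm(dsm, rgb, radius=2, sigma_s=2.0, sigma_r=0.1, alpha=0.5):
    """Smooth a DSM using weights guided by an ortho image and a depth guide.

    dsm: (H, W) float heights; rgb: (H, W, 3) floats in [0, 1].
    alpha blends the color difference and the depth-guide difference
    (an assumed form, not the formula from the patent figures).
    """
    h, w = dsm.shape
    # Depth guide scaled to [0, 1] so it is comparable with the color values.
    span = float(dsm.max() - dsm.min()) or 1.0
    guide = (dsm - dsm.min()) / span
    pad = radius
    dsm_p = np.pad(dsm, pad, mode="edge")
    guide_p = np.pad(guide, pad, mode="edge")
    rgb_p = np.pad(rgb, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    num = np.zeros((h, w))
    den = np.zeros((h, w))
    # Loop over window offsets q - p; each step is vectorised over all centers p.
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ys = slice(pad + dy, pad + dy + h)
            xs = slice(pad + dx, pad + dx + w)
            spatial = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))
            color_d2 = np.sum((rgb_p[ys, xs] - rgb) ** 2, axis=2)
            depth_d2 = (guide_p[ys, xs] - guide) ** 2
            range_d2 = alpha * color_d2 + (1.0 - alpha) * depth_d2
            weight = spatial * np.exp(-range_d2 / (2.0 * sigma_r ** 2))
            num += weight * dsm_p[ys, xs]
            den += weight
    # den is always positive because the center pixel has weight 1.
    return num / den
```

In this sketch a pixel only receives contributions from neighbors that are similar in both color and height, so noise inside a roof face is averaged out while the height step at a building edge is kept.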
Step 4, converting the digital surface model processed in the step 3 into corresponding three-dimensional point cloud data, fitting a roof plane according to the three-dimensional point cloud data so as to extract all roof planes, classifying attributes of all the roof planes according to whether the roof planes are inclined or not, and finally drawing initial surface elements of the roof scene distribution of the building according to different attributes;
Because sufficient prior information is lacking in three-dimensional building reconstruction, the topological relations of simple geometric primitives are neither stable nor accurate enough. If roof planes are constructed by relying blindly on line-segment topology alone, many duplicate planes or even meaningless erroneous planes are inevitably generated. To reduce unnecessary operations and improve the efficiency and accuracy of building roof structure construction, this embodiment constructs the initial roof patches from the previously acquired digital surface model according to its distance characteristics. As shown in FIG. 4, the main idea is to raise each pixel in the digital surface model to the height it represents, thereby converting the model into corresponding three-dimensional point cloud data, and then to fit a number of planes in the point cloud with the RANSAC algorithm while recording their plane parameter equations. Plane model parameters are first fitted iteratively from the point cloud data with RANSAC: in each iteration, 3 points are selected at random to determine a plane, and the number of points close to that plane is counted. After multiple iterations, the plane model parameters with the highest confidence (the largest number of points close to the plane) are obtained. RANSAC is then applied again, iteratively, to the remaining outliers until all planes have been extracted.
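The point cloud conversion and the repeated RANSAC extraction described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the implementation of the embodiment: the function names, the inlier distance threshold, the iteration count, and the `min_inliers` stopping rule are all hypothetical choices.

```python
import numpy as np

def dsm_to_points(dsm, cell=1.0):
    """Lift every DSM pixel to a 3D point (x, y, z) using its height value."""
    rows, cols = np.indices(dsm.shape)
    return np.column_stack([cols.ravel() * cell, rows.ravel() * cell, dsm.ravel()])

def fit_plane_ransac(points, threshold=0.1, iterations=200, rng=None):
    """Return (normal, d, inlier_mask) for the plane n.x + d = 0 with most inliers,
    or None if every random sample was degenerate."""
    rng = np.random.default_rng() if rng is None else rng
    best_mask, best_plane = None, None
    for _ in range(iterations):
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:  # the three points are (nearly) collinear
            continue
        normal = normal / norm
        d = -normal.dot(sample[0])
        mask = np.abs(points @ normal + d) < threshold
        if best_mask is None or mask.sum() > best_mask.sum():
            best_mask, best_plane = mask, (normal, d)
    if best_plane is None:
        return None
    return best_plane[0], best_plane[1], best_mask

def extract_planes(points, min_inliers=20, **kwargs):
    """Repeatedly fit a plane, then rerun RANSAC on the remaining outliers."""
    planes, remaining = [], points
    while len(remaining) >= max(3, min_inliers):
        result = fit_plane_ransac(remaining, **kwargs)
        if result is None or result[2].sum() < min_inliers:
            break
        normal, d, mask = result
        planes.append((normal, d, remaining[mask]))
        remaining = remaining[~mask]
    return planes
```

Each entry returned by `extract_planes` holds the unit normal, the offset, and the points assigned to that plane, which is the information needed for the subsequent normal-based classification.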
Meanwhile, the normals of the roof planes are used as a direction constraint. While the random sampling algorithm fits the point cloud data into a number of roof plane models P_i, the corresponding normal direction n_i of each is computed. When a roof plane is approximately horizontal, its normal is approximately perpendicular to the ground; in this case the plane model is corrected to be parallel to the ground, that is, its normal is made perpendicular to the ground. By computing the angle between the normal n_i of each plane and the z-axis direction n_z of the ground-object coordinate system, the fitted point cloud planes are divided into two types, horizontal and inclined. In FIG. 4, the dark point cloud represents horizontal planes and the light point cloud represents inclined planes. Finally, each point cloud set is re-projected into the digital surface model according to its attribute, thereby delineating the initial patches that reflect the distribution of the roof scene.
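The horizontal/inclined classification can be sketched as a small function that measures the angle between a fitted normal and the z axis. The function name and the 10-degree threshold are assumptions; the text does not state the threshold used in this embodiment.

```python
import numpy as np

def classify_roof_plane(normal, angle_threshold_deg=10.0):
    """Label a fitted plane 'horizontal' or 'inclined' from its normal.

    Near-horizontal planes have their normal snapped to the z axis.
    The 10-degree threshold is an assumed value, not taken from the text.
    """
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    if n[2] < 0:  # orient the normal upwards so the angle lies in [0, 90] degrees
        n = -n
    angle = np.degrees(np.arccos(np.clip(n[2], -1.0, 1.0)))
    if angle < angle_threshold_deg:
        return "horizontal", np.array([0.0, 0.0, 1.0]), angle
    return "inclined", n, angle
```

For example, a normal of `[0.02, 0.0, -1.0]` is flipped upwards, found to be about one degree from the z axis, and returned as a horizontal plane with its normal corrected to `[0, 0, 1]`.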
Step 5, decomposing the irregular building polygon into a plurality of rectangular areas based on the grids; in each building rectangular area, building roof vector line segments with complete topology output in the step 2 are used as geometric constraints, the initial patches in the step 4 are combined to construct an overall objective function of the initial patches, and finally local optimal solution is carried out on the overall objective function in each rectangular area, so that the construction of a building roof three-dimensional patch model is completed;
When three-dimensional building models are reconstructed at large scale, the area covered by the aerial images contains a variety of irregularly shaped building targets. To ensure the robustness and accuracy of the building three-dimensional model construction algorithm, the buildings must first be partitioned into rectangular regions, for which this embodiment designs a grid-based region decomposition method. The gradients of the DSM in the two orthogonal directions, horizontal and vertical, are computed by interpolation of adjacent pixels. In each direction, line segments whose gradient value is greater than a threshold and whose mean DSM height is a peak within the non-maximum suppression window are selected as candidate boundaries of rectangles. Each candidate boundary is then further filtered using the color information of the corresponding aerial ortho-image: buffer areas are created on both sides of each boundary line segment, and the candidate boundary is retained when the gray-scale color difference is greater than the color threshold
Figure SMS_66
. At the same time, the original image is repeatedly downsampled uniformly by a factor of two to obtain grid layers at different resolutions. Finally, because rectangle extraction performed independently in the buffers may produce duplicate boundaries, thresholds are set on the color difference and height difference of the rectangles to decide whether to merge the adjacent rectangles on both sides. The specific criteria are as follows:
Figure SMS_67
where
Figure SMS_68
and
Figure SMS_69
denote the average colors of the two rectangular areas on the aerial ortho-image, and
Figure SMS_70
and
Figure SMS_71
denote the average heights of the two rectangular areas on the digital surface model;
Two adjacent rectangles are merged when the two rectangular areas simultaneously satisfy the following three conditions:
(1) The average color difference on the ortho-image
Figure SMS_72
is less than the color threshold
Figure SMS_73
(2) The average height difference in the digital surface model
Figure SMS_74
is less than the height threshold
Figure SMS_75
(3) To avoid merging across a narrow street lying between the rectangles, the height gradient in the overlapping buffer area covering the common edge
Figure SMS_76
is less than the threshold
Figure SMS_77
This achieves the rectangular partitioning.
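The three merging conditions can be written as a single predicate. The sketch below is illustrative only: the function name, the use of a Euclidean distance between mean colors, and every threshold default are assumptions, since the actual difference measures and threshold values are given in the figures.

```python
import numpy as np

def should_merge(color_a, color_b, height_a, height_b, edge_gradient,
                 color_thresh=20.0, height_thresh=1.0, gradient_thresh=0.5):
    """Return True when two adjacent rectangles satisfy all three merge conditions.

    color_a, color_b: mean colors on the ortho image (scalars or RGB triples).
    height_a, height_b: mean DSM heights of the two rectangles.
    edge_gradient: height gradient in the buffer covering the shared edge.
    All threshold defaults are placeholders chosen for this sketch.
    """
    diff = np.atleast_1d(np.asarray(color_a, dtype=float)
                         - np.asarray(color_b, dtype=float))
    color_diff = np.linalg.norm(diff)
    height_diff = abs(height_a - height_b)
    return bool(color_diff < color_thresh
                and height_diff < height_thresh
                and edge_gradient < gradient_thresh)
```

The third argument group matters for the narrow-street case: two roofs of similar color and height on opposite sides of a street still fail the test because the height gradient across the shared buffer is large.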
With this region partitioning method, complex and irregular building polygons can be decomposed into combinations of independent basic rectangles. This not only allows individual building models to be reconstructed, but also makes joint optimization of a probabilistic graphical model within each building rectangular area more convenient.
The roof vector structure lines obtained in step 2 are used to cut each rectangular area into a number of polygon units, and a dual graph
Figure SMS_79
is constructed. The dual graph is an undirected graph in which each vertex v_i represents a distinct polygon unit c_i, and each edge
Figure SMS_80
indicates that the polygon units corresponding to its two end vertices are adjacent on the roof. The initial patches obtained by fitting in the digital surface model are denoted
Figure SMS_81
. Within each polygon unit, each discrete pixel point
Figure SMS_82
corresponds to a spatial three-dimensional object point
Figure SMS_83
in the digital surface model. The distance term
Figure SMS_84
reflects the sum of the distances from each pixel point in the polygon to the corresponding fitted initial patch, and the smoothing term
Figure SMS_85
reflects the pairwise smoothing relationship between two adjacent polygon units c_i and c_j. The overall objective function of the initial patches
Figure SMS_78
is expressed as:
Figure SMS_86
where the smoothing term
Figure SMS_87
represents the smooth transition relationship between two polygons and is used to measure the planar relationship between polygon units; the distance term
Figure SMS_88
reflects the degree of fit between the initial patch and the digital surface model;
Figure SMS_89
is the weight balancing the distance term and the smoothing term;
Figure SMS_90
denotes the set of index pairs of adjacent patches;
Figure SMS_91
denotes the i-th patch; and
Figure SMS_92
is a patch adjacent to
Figure SMS_93
.
This optimization problem is solved by minimizing the objective function with the graph-cut algorithm, yielding the final polygon units labeled with roof planes, which are visualized as the three-dimensional patch model of the building.
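The structure of such an objective, a distance term summed over polygon units plus a weighted smoothing term over adjacent pairs, can be illustrated on a toy dual graph. The sketch below makes several assumptions that are not taken from the text: the distance is measured vertically to a plane z = a*x + b*y + c, the smoothing term is a Potts penalty (1 when neighboring labels differ), and the minimum is found by exhaustive enumeration rather than by graph cut, which is only practical for a handful of units.

```python
import itertools
import numpy as np

def distance_cost(cell_points, plane):
    """Sum of vertical distances from a cell's 3D points to plane z = a*x + b*y + c."""
    a, b, c = plane
    x, y, z = cell_points[:, 0], cell_points[:, 1], cell_points[:, 2]
    return float(np.abs(z - (a * x + b * y + c)).sum())

def total_energy(labels, cells, planes, edges, lam):
    """Distance term over all cells plus lam times an assumed Potts smoothing term."""
    data = sum(distance_cost(cells[i], planes[l]) for i, l in enumerate(labels))
    smooth = sum(1.0 for i, j in edges if labels[i] != labels[j])
    return data + lam * smooth

def minimize_by_enumeration(cells, planes, edges, lam=1.0):
    """Exhaustively search all label assignments; feasible only for tiny graphs."""
    best = min(itertools.product(range(len(planes)), repeat=len(cells)),
               key=lambda labels: total_energy(labels, cells, planes, edges, lam))
    return list(best), total_energy(best, cells, planes, edges, lam)

# Toy dual graph: three polygon units, two candidate planes (flat and sloped).
cells = [
    np.array([[0, 0, 5.0], [1, 0, 5.0], [2, 0, 5.0], [3, 0, 5.0]]),
    np.array([[10, 0, 5.0], [11, 0, 5.5], [12, 0, 6.0], [13, 0, 6.5]]),
    np.array([[11, 1, 5.2]]),
]
planes = [(0.0, 0.0, 5.0), (0.5, 0.0, 0.0)]
edges = [(0, 1), (1, 2)]
labels, energy = minimize_by_enumeration(cells, planes, edges, lam=1.0)
```

In this toy case the third unit fits the flat plane slightly better on its own, but with the smoothing weight set to 1.0 it is assigned the sloped plane of its neighbor; setting the weight to 0 lets the distance term alone decide. This is the trade-off that the balancing weight in the objective controls.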
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.

Claims (9)

1. A three-dimensional surface patch model construction method based on image building structure learning is characterized by comprising the following steps:
step 1, acquiring aerial image data of a building, and preprocessing the aerial image data to obtain an aerial orthographic image and a digital surface model;
step 2, building a building roof segment detection network model, inputting the aviation ortho-image into the building roof segment detection network model, extracting roof key node information according to the geometric and semantic information of the building by the building roof segment detection network model to generate candidate segments, fusing multi-scale features of the aviation ortho-image, aligning the candidate segments with a feature layer, filtering out the candidate segments with wrong topology, and outputting a building roof vector segment with complete topology in the aviation ortho-image;
step 3, removing abnormal noise in the digital surface model by adopting a joint bilateral upsampling method based on gradient term constraint, and improving the data quality of the digital surface model;
step 4, converting the digital surface model processed in the step 3 into corresponding three-dimensional point cloud data, fitting a roof plane according to the three-dimensional point cloud data so as to extract all the roof planes, classifying attributes of all the roof planes according to whether the roof planes are inclined or not, and finally drawing initial surface elements of the roof scene distribution of the building according to different attributes;
step 5, decomposing the irregular building polygon into a plurality of rectangular areas based on the grids; and in each rectangular area, taking the building roof vector line segment with complete topology output in the step 2 as a geometric constraint, combining the roof initial patch in the step 4 to construct an overall objective function of the initial patch, and finally performing local optimal solution on the overall objective function in each rectangular area, thereby completing the construction of the building roof three-dimensional patch model.
2. The method for constructing a three-dimensional patch model based on image building structure learning of claim 1, wherein the building roof segment detection network model comprises a backbone network, a segment detection branch, a semantic segmentation branch and an LOI Pooling module;
the backbone network is used for extracting features of the aerial orthographic images and inputting images after the features are extracted into the line segment detection branch and the semantic segmentation branch;
the segment detection branch is used for extracting the key node information of the roof according to the geometric and semantic information of the building and generating a candidate segment;
the semantic segmentation branch is used for fusing the multi-scale features of the aerial ortho-image;
and the LOI Pooling module aligns the candidate line segments with the feature layer of the depth down-sampling, filters out the candidate line segments with wrong topology, and outputs the building roof vector line segments of the topological structure in the aerial ortho-image.
3. The method for constructing a three-dimensional patch model based on image building structure learning of claim 2, wherein the line segment detection branch comprises an encoder, a node prediction module and a mixed sampling module, and the method for generating the line segment candidates in the line segment detection branch comprises the following steps:
taking as input the shared feature map that is output by the backbone network and down-sampled by the encoder, extracting the position coordinates of the roof key nodes with the node prediction module, dividing the shared feature map into a plurality of grid cells, and converting the ground-truth values of the key nodes into a Gaussian probability distribution at the corresponding coordinates, wherein the node probability distribution in the grid cells is obtained from a node probability distribution map E and a position offset map O, and the corresponding probability distribution map and offset map are expressed as:
Figure QLYQS_1
wherein
Figure QLYQS_2
indicates that the i-th node in the index set V of corner points is located within grid cell b, grid cell b is denoted
Figure QLYQS_3
,
Figure QLYQS_4
denotes the coordinates of the i-th node, and W_b denotes the width of the grid cell
Figure QLYQS_5
;
finally, the K key nodes with the highest predicted probability are selected, and candidate line segments are generated from these key nodes by the mixed sampling module.
4. The method for constructing a three-dimensional patch model based on image building structure learning according to claim 1, wherein the specific method for removing abnormal noise in the digital surface model in step 3 is as follows:
according to the height information and the color information in the digital surface model and the spatial correlation between them, a depth guide map is introduced to calculate the color-domain weight, and joint bilateral upsampling combined with the gradient values in the digital surface model is used to remove the original abnormal data noise, thereby improving the data quality of the digital surface model.
5. The image building structure learning-based three-dimensional patch model construction method according to claim 4, wherein the upsampling method specifically comprises:
Figure QLYQS_6
;
wherein p denotes the coordinates of the kernel center, q denotes the coordinates of a pixel within the neighboring window D,
Figure QLYQS_8
denotes the output depth at coordinate p, W_p denotes the normalization coefficient at coordinate p,
Figure QLYQS_10
denotes the spatial kernel, in which
Figure QLYQS_11
denotes the standard deviation of the spatial kernel,
Figure QLYQS_14
denotes the input DSM gray value at coordinate q,
Figure QLYQS_15
denotes the color kernel, in which
Figure QLYQS_16
denotes the standard deviation of the color kernel,
Figure QLYQS_17
denotes the input remote-sensing ortho-image RGB value at coordinate p,
Figure QLYQS_7
denotes the input RGB value at coordinate q,
Figure QLYQS_9
denotes the input depth value at coordinate p,
Figure QLYQS_12
denotes the input depth value at coordinate q, the depth image
Figure QLYQS_13
is obtained by linear interpolation of the digital surface model, and α is a normalized adaptive weight;
in addition, compared with computing differences between similar gray values in the color domain, gradient values in the digital surface model better reflect the edge features of the image, and the original Gaussian function of the color kernel is therefore transformed to compute a gradient-term constraint between adjacent pixels:
Figure QLYQS_18
Figure QLYQS_19
Figure QLYQS_20
wherein G_x and G_y denote the degree of gradient change of the image
Figure QLYQS_21
at p and q in the horizontal and vertical directions, respectively; x and y denote the horizontal and vertical directions, respectively; p_x and p_y denote the x and y coordinates of p; q_x and q_y denote the x and y coordinates of q; and σ_r denotes the standard deviation of the color kernel;
by applying the improved joint bilateral upsampling method, the abnormal noise originally present in the digital surface model is removed, so that the data quality of the digital surface model is improved.
6. The method for constructing the three-dimensional surface patch model based on the image building structure learning of claim 1, wherein the method for drawing the initial surface element of the building roof scene distribution in the step 4 comprises the following steps:
fitting a plane model parameter equation with the highest confidence coefficient from the point cloud data by adopting a RANSAC algorithm;
continuously performing iterative fitting on the remaining external points by adopting a RANSAC algorithm until all roof planes are extracted, and calculating the corresponding normal directions of the point cloud data while fitting the point cloud data into a plurality of roof planes;
calculating an included angle between the normal direction of the plane of the roof and the normal line of the z axis of the ground object coordinate system according to the normal direction of the plane of the roof, thereby dividing the plane model of the roof into horizontal and inclined attributes;
and finally, re-projecting the point cloud sets corresponding to the roof planes into the digital surface model according to the attributes, so as to mark an initial surface element reflecting the distribution of the roof scene.
7. The method for constructing a three-dimensional patch model based on image building structure learning of claim 1, wherein in step 5, the method for decomposing the irregular building polygon into a plurality of rectangular areas comprises:
firstly, rotating the target building polygon so that its main direction is parallel to the X axis of the aerial image, and applying the same transformation to the corresponding areas of the aerial ortho-image and the digital surface model; then, calculating the gradients of the digital surface model in the horizontal and vertical directions by interpolation of adjacent pixels, and selecting, in each direction, line segments whose gradient value is greater than a threshold and whose average digital surface model height is a peak within the non-maximum suppression window as candidate boundaries of rectangles; meanwhile, further filtering each candidate boundary using the color information of the corresponding aerial ortho-image: creating buffer areas on both sides of each boundary line segment, retaining the candidate boundary when the gray-scale color difference is greater than a threshold, and repeatedly down-sampling the original image uniformly by a factor of two to obtain grid layers at different resolutions;
finally, thresholds are set for the color difference and the height difference of the rectangles to decide whether to merge the adjacent rectangles on both sides, so that the complex and irregular building polygon is decomposed into a plurality of rectangular areas.
8. The method for constructing a three-dimensional patch model based on image building structure learning of claim 7, wherein the specific criteria for whether to merge adjacent rectangles on two sides are as follows:
Figure QLYQS_22
wherein
Figure QLYQS_23
and
Figure QLYQS_24
denote the average colors of the two rectangular areas on the aerial ortho-image, and
Figure QLYQS_25
and
Figure QLYQS_26
denote the average heights of the two rectangular areas on the digital surface model;
two adjacent rectangles are merged when the two rectangular areas simultaneously satisfy the following three conditions:
(1) the average color difference on the ortho-image
Figure QLYQS_27
is less than the color threshold
Figure QLYQS_28
(2) the average height difference in the digital surface model
Figure QLYQS_29
is less than the height threshold
Figure QLYQS_30
(3) to avoid merging across a narrow street lying between the rectangles, the height gradient in the overlapping buffer area covering the common edge
Figure QLYQS_31
is less than the threshold
Figure QLYQS_32
9. The method for constructing a three-dimensional patch model based on image building structure learning according to claim 1, wherein the step 5 further comprises:
cutting each rectangular area into a plurality of polygon units by using the roof vector structure lines extracted in step 2, whereupon the overall objective function of the initial patches
Figure QLYQS_33
is expressed as:
Figure QLYQS_34
wherein the smoothing term
Figure QLYQS_35
represents the smooth transition relationship between two polygons and is used to measure the planar relationship between polygon units; the distance term
Figure QLYQS_36
reflects the degree of fit between the initial patch and the digital surface model;
Figure QLYQS_37
is the weight balancing the distance term and the smoothing term;
Figure QLYQS_38
denotes the set of index pairs of adjacent patches;
Figure QLYQS_39
denotes the i-th patch; and
Figure QLYQS_40
is a patch adjacent to
Figure QLYQS_41
;
the final polygon elements marked with the roof plane are obtained by minimizing the overall objective function using a graph cut algorithm and visualized as a building three-dimensional patch model.
CN202211552856.0A 2022-12-06 2022-12-06 Three-dimensional patch model construction method based on image building structure learning Active CN115564926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211552856.0A CN115564926B (en) 2022-12-06 2022-12-06 Three-dimensional patch model construction method based on image building structure learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211552856.0A CN115564926B (en) 2022-12-06 2022-12-06 Three-dimensional patch model construction method based on image building structure learning

Publications (2)

Publication Number Publication Date
CN115564926A CN115564926A (en) 2023-01-03
CN115564926B true CN115564926B (en) 2023-03-10

Family

ID=84770658

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211552856.0A Active CN115564926B (en) 2022-12-06 2022-12-06 Three-dimensional patch model construction method based on image building structure learning

Country Status (1)

Country Link
CN (1) CN115564926B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116071273B (en) * 2023-03-09 2023-06-27 中国科学院空天信息创新研究院 Method for processing color consistency of orthophoto based on extended update area
CN116310189B (en) * 2023-05-22 2023-09-01 浙江大华技术股份有限公司 Map model construction method and terminal
CN117237474B (en) * 2023-11-16 2024-02-09 长春理工大学 Depth guidance-based on-chip integrated multispectral imaging reconstruction method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107025685A (en) * 2017-04-11 2017-08-08 南京林业大学 Airborne building summit cloud modeling method under topology ambiguity
CN109993783A (en) * 2019-03-25 2019-07-09 北京航空航天大学 A kind of roof and side optimized reconstruction method towards complex three-dimensional building object point cloud
CN111815776A (en) * 2020-02-04 2020-10-23 山东水利技师学院 Three-dimensional building fine geometric reconstruction method integrating airborne and vehicle-mounted three-dimensional laser point clouds and streetscape images
CN111898543A (en) * 2020-07-31 2020-11-06 武汉大学 Building automatic extraction method integrating geometric perception and image understanding
CN112489212A (en) * 2020-12-07 2021-03-12 武汉大学 Intelligent three-dimensional mapping method for building based on multi-source remote sensing data
CN113313835A (en) * 2021-07-29 2021-08-27 深圳市数字城市工程研究中心 Building roof automatic modeling method based on airborne LiDAR point cloud

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8401222B2 (en) * 2009-05-22 2013-03-19 Pictometry International Corp. System and process for roof measurement using aerial imagery
US11094113B2 (en) * 2019-12-04 2021-08-17 Geomni, Inc. Systems and methods for modeling structures using point clouds derived from stereoscopic image pairs

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于机载LiDAR点云的约束BSP模型重建方法;李静等;《地理信息世界》;20161225(第06期);全文 *

Also Published As

Publication number Publication date
CN115564926A (en) 2023-01-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant