CN116402942A - Large-scale building three-dimensional reconstruction method integrating multi-scale image features - Google Patents
- Publication number
- CN116402942A (application CN202310194010.2A)
- Authority
- CN
- China
- Prior art keywords
- ray
- scale
- light
- color
- dimensional reconstruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/06—Ray-tracing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10032—Satellite or aerial image; Remote sensing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2210/00—Indexing scheme for image generation or computer graphics
- G06T2210/04—Architectural design, interior design
Abstract
The invention discloses a large-scale building three-dimensional reconstruction method that fuses multi-scale image features, comprising the following steps: selecting and preprocessing remote sensing data; constructing a basic network module and a residual network module, which are used to obtain a shadow scalar, an uncertainty prediction, a reflected-light color, and an ambient-light color; inputting the rays corresponding to the highest-level image into the basic network module, and inputting the rays corresponding to the other image scales layer by layer into the residual network modules, one residual module per image; fusing the results of all residual network modules into a unified output, and finally fusing this unified output with the output of the basic network module to obtain the final output, from which the depth of the corresponding point on the target object and the color of the corresponding pixel are calculated, with the network weights updated through a constructed loss function; and performing three-dimensional reconstruction of the remote sensing image to be reconstructed with the trained network.
Description
Technical Field
The invention belongs to the application of deep learning to three-dimensional reconstruction from high-resolution remote sensing images, relates to a large-scale building three-dimensional reconstruction method fusing multi-scale image features, and particularly relates to a method for automatically generating building three-dimensional models from multi-view remote sensing images.
Background
Three-dimensional reconstruction is a core technology in fields such as smart cities, autonomous driving, and virtual reality (document 1). Commonly used conventional three-dimensional reconstruction methods include oblique photogrammetry and close-range photogrammetry (documents 2-3); they are used for fine modeling of scenes, with cameras, unmanned aerial vehicles, and similar equipment as the main data sources. High-resolution remote sensing images offer wide coverage, abundant scene features, and multiple acquisition times: large remote sensing scenes permit rapid, large-scale reconstruction, while the rich scene features and multi-temporal image data help guarantee reconstruction quality, so such images have great research and application value and have attracted wide attention. However, owing to the particularity of the remote sensing camera model and the sparsity of target photography, conventional methods require considerable manual editing to determine control-point coordinates for image orientation when reconstructing from high-resolution images; the complex workflow makes it difficult to reconstruct remote sensing scenes quickly, cheaply, and at scale.
With the rapid development of artificial intelligence and related hardware, deep learning methods represented by MVSNet have achieved great success in three-dimensional reconstruction. Building on binocular stereo matching theory, MVSNet introduces a cost volume based on differentiable homography: it computes the confidence of depth predictions from the constructed cost volume and trains to maximize that confidence, achieving high-quality end-to-end generation of scene depth maps and removing the dependence of conventional remote sensing three-dimensional reconstruction on large numbers of manual control points (document 4). However, training requires depth maps produced by a three-dimensional data acquisition system as ground truth, and such depth maps are difficult and costly to acquire for large remote sensing scenes, which limits wide application.
In recent years, Neural Radiance Fields (NeRF) have fitted the radiance and density fields of three-dimensional scenes through the creative use of a multi-layer perceptron (MLP), constructing the loss function with the color of image pixels as ground truth. This removes the dependence of model training on depth maps, achieves high-quality three-dimensional scene representation in a concise manner, and has driven great progress in three-dimensional reconstruction (document 5). Work represented by Block-NeRF, BungeeNeRF, and Mega-NeRF achieves city-level, high-precision, large-scale reconstruction from UAV aerial images and street-view images, opening a new technical route for large-scale three-dimensional reconstruction from satellite remote sensing images (documents 6-8). However, the camera models these methods use differ greatly from satellite sensor models, so they are difficult to apply directly to the remote sensing reconstruction task; moreover, the scene complexity, view sparsity, and diverse external influences of remote sensing images make it hard to guarantee reconstruction accuracy. Sat-NeRF successfully applied a NeRF-based network to remote sensing three-dimensional reconstruction by introducing the satellite's Rational Polynomial Coefficients (RPC) camera model (document 9). That method, however, fails to capture the details of remote sensing scenes, leaving room to further improve reconstruction accuracy.
[Document 1] Li Mingyang, Chen Wei, Wang Shanshan, et al. Three-dimensional reconstruction methods for visual deep learning [J/OL]. Computer Science and Exploration: 1-26 [2023-02-11]. http://kns.cnki.net/kcms/detail/11.5602.TP.20221020.1347.002.html.
[Document 2] Sun Hongwei. Three-dimensional digital urban modeling based on oblique photogrammetry [J]. Modern Surveying and Mapping, 2014, 37(1): 18-21.
[Document 3] Miao Zhicheng, Yang Yongchong, Yu Qing, et al. Application of close-range photogrammetry to detailed modeling of a single building [J]. Remote Sensing Information, 2021, 36(5): 107-113.
[Document 4] Yao Y, Luo Z, Li S, et al. MVSNet: Depth inference for unstructured multi-view stereo [C]// Proceedings of the European Conference on Computer Vision (ECCV). 2018: 767-783.
[Document 5] Mildenhall B, Srinivasan P P, Tancik M, et al. NeRF: Representing scenes as neural radiance fields for view synthesis [J]. Communications of the ACM, 2021, 65(1): 99-106.
[Document 6] Tancik M, Casser V, Yan X, et al. Block-NeRF: Scalable large scene neural view synthesis [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 8248-8258.
[Document 7] Xiangli Y, Xu L, Pan X, et al. BungeeNeRF: Progressive neural radiance field for extreme multi-scale scene rendering [C]// Computer Vision - ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XXXII. Cham: Springer Nature Switzerland, 2022: 106-122.
[Document 8] Turki H, Ramanan D, Satyanarayanan M. Mega-NeRF: Scalable construction of large-scale NeRFs for virtual fly-throughs [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 12922-12931.
[Document 9] Marí R, Facciolo G, Ehret T. Sat-NeRF: Learning multi-view satellite photogrammetry with transient objects and shadow modeling using RPC cameras [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. 2022: 1311-1321.
Disclosure of Invention
To address the shortcomings of existing remote sensing three-dimensional reconstruction methods, the invention constructs a network using the satellite's RPC model and adds shadow detection and transient-object (e.g., vehicles, pedestrians) processing to handle the shadows, cluttered backgrounds, and complex content of remote sensing images, eliminating to some extent the influence of inconsistent imaging conditions on model accuracy. Finally, residual network modules are dynamically added step by step during training to capture the local details and global features of the remote sensing image, improving the accuracy of the result.
The technical scheme adopted by the invention is as follows: a large-scale building three-dimensional reconstruction method integrating multi-scale image features comprises the following steps:
step 1, selecting and preprocessing remote sensing data, and dividing the data set into training and test sets in a certain proportion;
acquiring the RGB data, RPC model parameters, sun-direction parameters, and time-phase data of the remote sensing images from the data set; through preprocessing, obtaining the rays emitted from the satellite sensor based on the RPC model, each ray being a vector representing the origin coordinates of the ray and its direction; finally, organizing the RGB data, sun-direction parameters, and ray vectors into dictionary-type data for use;
step 2, constructing a basic network module and a residual network module, the main body of both being a multi-layer perceptron (MLP) comprising an input layer, several hidden layers, and an output layer; each layer of the MLP has width h, the input of the input layer is the coordinates of any point on a ray together with its direction, and the first half of the MLP outputs the predicted pixel value and the voxel density;
after the voxel density σ is output, an additional hidden layer of width h is appended, taking the RGB output by the first half of the MLP as input; finally, through a further hidden layer of width h/2 and the output layer, a shadow scalar s, an uncertainty prediction β, a reflected-light color c_a, and an ambient-light color a are obtained;
step 3, upsampling the remote sensing images input from the training set with several convolution layers to obtain images at multiple scales, building an image pyramid from these multi-scale images, and constructing the images in the pyramid as rays; first, inputting the rays corresponding to the highest-level image into the basic network module to learn the relatively abstract, macro-scale global scene information and obtain an output result; inputting the rays corresponding to the other image scales layer by layer into the residual network modules, one module per image, to learn the more specific and richer scene details of the lower-level images; fusing the results of all residual network modules into a unified output, and finally fusing that unified output with the output of the basic network module to obtain the final output; calculating, from this final output, the depth of the corresponding point on the target object and the color of the corresponding pixel, and updating the network weights through the constructed loss function;
and step 4, inputting the remote sensing images in the test data set into the trained network in the step 3, and realizing three-dimensional reconstruction of the remote sensing images to be reconstructed.
Further, the processing procedure of the first half of the MLP in step 2 is as follows:
(RGB,σ)=F(x,d) (1)
where RGB represents the pixel values predicted by MLP, σ is the voxel density, x represents the three-dimensional coordinates of the point on the ray, and d is the direction of the ray.
Further, the specific implementation manner of calculating the depth of the corresponding point of the target object and the color of the corresponding pixel in the step 3 is as follows;
step 3.1, first, substitute σ_i into formula (2) to calculate the transmittance α_i and the opacity factor T_i of each spatial point:
α_i = 1 − exp(−σ_i·δ_i), T_i = ∏_{j=1}^{i−1} (1 − α_j) (2)
where σ_i is the voxel density of the target object obtained by inputting the coordinates of the i-th point and the ray direction to the first half of the MLP, and δ_i = t_{i+1} − t_i is the distance between two adjacent sampling points;
step 3.2, substitute α_i, T_i, and t_i into formula (3) to calculate the depth d(r) of the corresponding point of the target object:
d(r) = Σ_{i=1}^{N} T_i·α_i·t_i (3)
step 3.3, substitute c_a, a, and s into formula (4) to calculate the predicted color value c_i of the corresponding sampling point on the ray:
c(x_i, ω, t_j) = c_a(x_i)·(s(x_i, ω) + (1 − s(x_i, ω))·a(ω)) (4)
where c(x_i, ω, t_j), i.e. c_i, is the color value rendered for point i on each ray; x_i is the three-dimensional coordinate of point i on the ray; ω is the sun direction angle; c_a(x_i) is the reflected-light color c_a output by the MLP; a(ω) is the ambient-light color output by the MLP; t_j is the time-phase data of the image; and s(x_i, ω) is the shadow scalar, taking a value between 0 and 1;
step 3.4, substitute T_i, c_i, and α_i into formula (5) to obtain the color c(r) of the pixel corresponding to the ray:
c(r) = Σ_{i=1}^{N} T_i·α_i·c_i (5)
where c(r) is the color value rendered for ray r, d(r) is the depth of the corresponding point of the target object on the ray, N is the number of sampling points on the ray, and t_i, c_i, α_i, and T_i are respectively the distance to the camera, the predicted color value, the transmittance, and the opacity factor of the i-th sampling point on ray r.
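Formulas (2), (3), and (5) amount to a single per-ray compositing pass, which can be sketched with numpy as follows. The helper name `render_ray` and the sample layout (N+1 distance values for N samples) are illustrative assumptions.

```python
import numpy as np

def render_ray(sigma, t, c):
    """Composite N samples along one ray into depth d(r) and color c(r).

    sigma: (N,) voxel densities; t: (N+1,) sample distances; c: (N, 3) colors.
    Per the formulas above: alpha_i = 1 - exp(-sigma_i * delta_i),
    T_i = prod_{j<i}(1 - alpha_j), d(r) = sum T_i alpha_i t_i,
    c(r) = sum T_i alpha_i c_i.
    """
    delta = t[1:] - t[:-1]                # delta_i = t_{i+1} - t_i
    alpha = 1.0 - np.exp(-sigma * delta)  # transmittance alpha_i
    T = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # opacity factor T_i
    w = T * alpha                         # per-sample compositing weights
    d_r = np.sum(w * t[:-1])              # depth of the target point
    c_r = np.sum(w[:, None] * c, axis=0)  # rendered pixel color
    return d_r, c_r

# An effectively opaque first sample should dominate both depth and color.
sigma = np.array([1e9, 1.0, 1.0])
t = np.array([1.0, 2.0, 3.0, 4.0])
c = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
d_r, c_r = render_ray(sigma, t, c)
```

Here the first sample absorbs all the weight, so the rendered depth collapses to that sample's distance and the rendered color to its predicted color.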
Further, the loss function constructed in step 3 comprises a solar-ray-direction correction term, an MSE loss, and a depth supervision loss; the solar correction term is calculated as:
L_SC(R_SC) = Σ_{r∈R_SC} [ (1/N_SC)·Σ_{i=1}^{N_SC} (T_i − s_i)² + 1 − Σ_{i=1}^{N_SC} T_i·α_i·s_i ] (6)
where N_SC is the total number of sampling points on the ray, s_i is the shadow scalar of the i-th point on the ray, and R_SC is the set of secondary solar-correction rays.
Further, the MSE loss is calculated as:
L_RGB(R) = Σ_{r∈R} [ ‖c(r) − c_GT(r)‖² / (2·β′(r)²) + (log β′(r) + η)/2 ] (7)
where R is the set of rays input in this training iteration, c_GT(r) is the true color value of the pixel corresponding to ray r in the input image, and β′(r) = β(r) + β_min, where β_min and η are specified empirical values.
The β(r) associated with ray r is obtained by integrating the uncertainty predictions of the N sampling points of r:
β(r) = Σ_{i=1}^{N} T_i·α_i·β_i (8)
where β_i is the uncertainty prediction of the i-th point on ray r;
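A sketch of how formulas (7) and (8) might be implemented over a batch of rays. The batch layout and the default values chosen for β_min and η are illustrative assumptions, not values fixed by the invention.

```python
import numpy as np

def aggregate_beta(w, beta_i):
    """Formula (8): beta(r) = sum_i T_i*alpha_i*beta_i, with w[i] = T_i*alpha_i."""
    return np.sum(w * beta_i, axis=-1)

def rgb_loss(c_pred, c_gt, beta, beta_min=0.05, eta=3.0):
    """Uncertainty-weighted MSE over a batch of rays, after formula (7).

    c_pred, c_gt: (R, 3) rendered / ground-truth colors; beta: (R,) per-ray
    aggregated uncertainty. beta_min and eta are empirical values; the
    defaults here are assumptions for illustration only.
    """
    beta_p = beta + beta_min                       # beta'(r) = beta(r) + beta_min
    sq_err = np.sum((c_pred - c_gt) ** 2, axis=-1)
    return np.mean(sq_err / (2.0 * beta_p ** 2) + (np.log(beta_p) + eta) / 2.0)

# Perfect predictions with beta' = 1 and eta = 0 give zero loss.
loss = rgb_loss(np.zeros((2, 3)), np.zeros((2, 3)), np.zeros(2),
                beta_min=1.0, eta=0.0)
```

Larger β′(r) down-weights the squared error of unreliable (e.g. transient-object) pixels, while the log term keeps β from growing without bound.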
further, in order to learn depth information of a scene, a depth supervision loss is constructed, and a calculation formula is as follows:
wherein R is DS The method is characterized in that the method refers to light rays which are constructed by key points extracted from a remote sensing image through a SIFT algorithm, X (r 3) is a three-dimensional coordinate of a sampling point on the light rays, o (r 3) is a light ray origin, and omega (r 3) is a contribution weight of X (r 3) to depth supervision information, namely the ratio of the absolute value of a difference value between each light ray X (r 3) and o (r 3) selected in the training to the sum of the absolute values of all light ray difference values.
Further, the final loss function is calculated as:
L = L_RGB(R) + λ_SC·L_SC(R_SC) + λ_DS·L_DS(R_DS) (10)
where λ_SC and λ_DS are the weights of the corresponding loss terms.
Further, after network training is completed, the network weights are saved in a ckpt file; end-to-end three-dimensional reconstruction is then performed directly from an input remote sensing image using this ckpt file, an imaging tool reconstructs a three-dimensional model of the whole scene from the output c(r) and d(r), and a corresponding DSM or mp4-type visualization result is output.
Further, each layer of the MLP has a width of 1024.
Further, λ_SC and λ_DS take the values 0.1/3 and 1000/3, respectively.
Further, in step 1, an IEEE GRSS data set is selected, where the data set includes the following parts:
(1) WorldView-3 satellite images, full color and eight-band visible light and near infrared, ground sampling distances of 35 cm and 1.3 m respectively;
(2) Three-dimensional data provided by point clouds or digital surface models DSMs generated by an airborne laser radar with a resolution of 80 centimeters;
(3) Sensor RPC parameter, solar ray direction and shooting time phase information data.
The invention makes full use of the multi-level characteristics of high-resolution remote sensing images to achieve efficient, automatic three-dimensional reconstruction of large-scale buildings. Its advantages are as follows:
(1) The RPC model is introduced into the neural radiance field to construct the camera projection rays, so that remote sensing images can be applied to NeRF; meanwhile, the shadow occlusion and transient objects present in remote sensing images are handled accordingly, making the final result finer.
(2) Multi-level features of the image are extracted by convolution, making full use of the global and local information of the image and improving the final result.
(3) The whole network model is trained progressively, combining the basic network module and the residual network modules and finally fusing the outputs of the two to obtain the final result, improving reconstruction accuracy.
Drawings
FIG. 1 is a schematic diagram of a neural network model for large-scale three-dimensional reconstruction according to an embodiment of the present invention;
fig. 2 is an overall flow chart of an embodiment of the present invention.
Detailed Description
In order to facilitate an understanding and practice of the invention by those of ordinary skill in the art, the invention will be further described with reference to the drawings and specific examples, it being understood that the examples described herein are for illustration and description only and are not intended to be limiting of the invention.
Referring to fig. 2, the large-scale building three-dimensional reconstruction method fusing multi-scale image features provided by the invention is as follows: the invention constructs a NeRF network using the satellite's RPC model and adds a shadow detection module and a transient-object (e.g., vehicles, pedestrians) detection module to handle the shadows, cluttered backgrounds, and complex content of remote sensing images, eliminating to some extent the influence of inconsistent imaging conditions on model accuracy. Residual blocks are then dynamically added step by step during training to capture the local details and global features of the remote sensing image, improving the accuracy of the result; the overall model structure is shown in fig. 1. Finally, end-to-end three-dimensional reconstruction of large-scale buildings is achieved with the trained network.
Based on this model structure, the embodiment of the invention provides an end-to-end, progressively trained neural-radiance-field method for large-scale three-dimensional reconstruction from remote sensing images. The specific implementation steps are as follows:
step 1, remote sensing data selection and preprocessing
In this embodiment, the 2019 IEEE GRSS Data Fusion Contest dataset is selected to compare and test various building reconstruction methods. The dataset contains numerous complex urban scenes with different building densities, spatial extents, and surroundings, and can well verify the extraction accuracy and reliability of different building three-dimensional reconstruction methods. The dataset contains the following:
(1) WorldView-3 satellite images (supplied by Maxar), full-color and eight-band visible and near-infrared, ground sample distances of 35 cm and 1.3 meters, respectively.
(2) The three-dimensional data provided by the point cloud or Digital Surface Models (DSMs) generated by the airborne lidar is 80 cm in resolution.
(3) Metadata such as sensor RPC parameters, solar ray direction, shooting time phase information, etc.
This embodiment selects from the dataset 26 Maxar WorldView-3 images collected over Jacksonville, Florida between 2014 and 2016. From these data, a set of RGB image crops of different sizes, approximately 800 × 800 pixels with a minimum resolution of 0.3 m/pixel, is taken as input, each AOI covering an area of 256 × 256 m. The training and test sets are divided in the ratio 8:2, and the RPC camera model of the satellite images is used directly for ray casting; each RPC is defined by a projection function (projecting 3D points onto image pixels) and its inverse (the localization function). The minimum and maximum heights of the scene are denoted h_min and h_max. A ray passing through the scene and intersecting pixel p of the j-th image is modeled as a straight line between an initial and a final 3D point, x_start and x_end. Using the RPC localization function L_j of the j-th image, pixel p is localized at h_max and h_min to obtain the boundary points:
x_start = L_j(p, h_max)_ECEF; x_end = L_j(p, h_min)_ECEF
where the subindex ECEF indicates that the 3D points returned by the localization function L_j are converted into the Earth-centered, Earth-fixed coordinate system (or geocentric system) so as to operate in a Cartesian reference frame.
Given x_start and x_end, the origin o and direction vector d of the ray r(t) = o + t·d intersecting pixel p are determined. The height bounds [h_min, h_max] can be chosen in various ways, for example from a large-scale elevation model extracted from low-resolution data. For the j-th image, the maximum-height point x_start, nearest to the camera, is taken as the ray origin o, and the direction is d = (x_end − x_start)/‖x_end − x_start‖₂. The bounds of ray r(t) = o + t·d, i.e. [t_min, t_max], are set to t_min = 0 and t_max = ‖x_end − x_start‖₂. ECEF coordinates cannot be used directly in practice because their values are large; the invention therefore normalizes all ray points into the interval [−1, 1] using a procedure similar to the offset subtraction and scaling used inside the RPC functions. The set of 3D points obtained by bounding all pixels of the input images between h_min and h_max is used to compute the offset and scale in each spatial dimension. Finally, the constructed rays, the RGB information of the images, and the other metadata are organized into a dictionary for convenient use.
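The ray construction and normalization just described can be sketched as follows, with the RPC localization step abstracted away (the boundary points x_start and x_end are taken as given); function names are assumptions for illustration.

```python
import numpy as np

def build_rays(x_start, x_end):
    """Turn per-pixel boundary points into (origin, direction, t_min, t_max).

    x_start, x_end: (R, 3) ECEF points from the RPC localization function at
    h_max and h_min. As described above: o = x_start,
    d = (x_end - x_start) / ||x_end - x_start||, t in [0, ||x_end - x_start||].
    """
    o = x_start
    span = x_end - x_start
    t_max = np.linalg.norm(span, axis=-1)
    d = span / t_max[:, None]
    return o, d, np.zeros_like(t_max), t_max

def normalize_points(pts, offset, scale):
    """Map ECEF points into [-1, 1] with a subtract-offset-and-scale step,
    analogous to the normalization used inside RPC functions."""
    return (pts - offset) / scale

# A vertical ray of length 2 from the origin.
o, d, t_min, t_max = build_rays(np.zeros((1, 3)), np.array([[0.0, 0.0, 2.0]]))
```

The offset and scale passed to `normalize_points` would be computed from the full set of bounded 3D points, one value per spatial dimension.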
Step 2, constructing a basic network module and a residual network module
As shown in fig. 1, the basic network module and the residual network module are identical in structure, and the main body of each consists of MLPs. The key of NeRF is the construction and rendering of camera rays, which is computationally heavy; choosing an MLP both simplifies the network structure to reduce computation and allows gradient updates to complete faster.
Because the ground features in remote sensing images are complex and many objects, such as pedestrians and vehicles, are only transiently present, results learned from different time-phase data carry large errors; moreover, buildings in the images often have parts covered by shadow. Processing operations for shadows and transient objects therefore need to be added to the model to eliminate these effects.
Therefore, when constructing the basic network module and the residual network module, the two modules share the same structure (the block in fig. 1): the main body is a multi-layer perceptron (MLP), each layer of width h (1024 by default). The first half of the MLP (the part before σ is output in the block of fig. 1) fits the NeRF representation of the whole scene (i.e., a function in MLP form); its input is the coordinates of any point on a ray (since the ray's start point and direction are known, the coordinates of any point on it can be calculated) together with the ray direction, and the function can be expressed as
(RGB,σ)=F(x,d) (1)
Where RGB represents the pixel values predicted by the MLP, x represents the three-dimensional coordinates of the point on the ray, and d is the direction of the ray.
After the voxel density σ is output, an additional hidden layer of width h is appended, taking the RGB output of the first half of the MLP as input; a further hidden layer of width h/2 and an output layer then produce a shadow scalar s, an uncertainty prediction β, a reflected light color c_a, and an ambient light color a (related only to the solar ray angle). That is, by adding these extra layers, a function in MLP form from input to the corresponding outputs is obtained. Note that the MLP outputting β requires an additional input time feature t_j, as shown in the block of FIG. 1 of the specification; the output s requires an additional input solar direction angle ω; and the output a is represented by a single-layer MLP mapping ω to a.
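As a toy, not-to-scale sketch of this head wiring (all layer sizes, the shared extra inputs and the random weights are assumptions for illustration; the patent feeds t_j only to the β branch and ω only to the s and a branches):

```python
import math, random
random.seed(0)

H = 8  # hidden width h (1024 in the patent; tiny here)

def linear(in_dim, out_dim):
    # toy fully connected layer with fixed random weights, no bias
    w = [[random.uniform(-0.5, 0.5) for _ in range(in_dim)]
         for _ in range(out_dim)]
    return lambda v: [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

relu = lambda v: [max(0.0, x) for x in v]
sigmoid = lambda v: [1.0 / (1.0 + math.exp(-x)) for x in v]

# first half of the MLP: (x, d) -> (RGB, sigma), as in formula (1)
trunk = linear(6, H)
rgb_head, sigma_head = linear(H, 3), linear(H, 1)
# extra layers: RGB plus time feature t_j and sun angle omega
extra = linear(3 + 1 + 1, H)   # width-h hidden layer
half = linear(H, H // 2)       # width-h/2 hidden layer
s_head, beta_head, ca_head = linear(H // 2, 1), linear(H // 2, 1), linear(H // 2, 3)
a_head = linear(1, 3)          # ambient colour a depends on omega alone

x_d = [0.1, -0.2, 0.3, 0.0, 0.0, 1.0]        # point coordinates + direction
h = relu(trunk(x_d))
rgb, sigma = sigmoid(rgb_head(h)), relu(sigma_head(h))
h2 = relu(half(relu(extra(rgb + [0.5, 0.7]))))    # t_j = 0.5, omega = 0.7
s, beta, c_a = sigmoid(s_head(h2)), relu(beta_head(h2)), sigmoid(ca_head(h2))
a = sigmoid(a_head([0.7]))
```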
Step 3, progressively training the network
Analogous to extracting features from an input remote sensing image with multiple convolution layers, which yields images at several scales, this step up-samples the original input image several times to construct an image pyramid. The images in the pyramid are constructed as rays, and the ray directions of higher-level images are obtained by scaling the ray directions constructed from the original image.
First, the rays constructed from the highest-level image are input into the basic network to learn the abstract, macroscopic overall scene information. Following the basic network module structure, the MLP fits the spatial distribution of the scene and outputs, for the whole scene, the voxel density σ_base, color rendering value c_base, shadow scalar s_base, transmittance α_base, and uncertainty prediction β_base.
Then the rays constructed from each layer of images are input into residual network modules layer by layer, until the rays of the original image are input into a residual network module, learning the more specific and richer scene detail in the lower-level images. All residual network modules output voxel density, color rendering value, shadow scalar, transmittance and uncertainty prediction, and the results of all residual network modules are fused (the method uses averaging) to obtain a unified output: voxel density σ_res, color rendering value c_res, shadow scalar s_res, transmittance α_res, and uncertainty prediction β_res. Finally, the residual network outputs (voxel density σ_res, reflected light color c_a^res, shadow scalar s_res, ambient light color a_res, uncertainty prediction β_res) are fused with the basic network outputs for the whole scene (voxel density σ_base, reflected light color c_a^base, shadow scalar s_base, ambient light color a_base, uncertainty prediction β_base) to obtain the final outputs: voxel density σ_i, reflected light color c_a, shadow scalar s_i, ambient light color a_i, and uncertainty prediction β_i.
The output parameters are then substituted into the formulas below. σ_i is substituted into formula (4) to compute the transmittance α_i and opacity factor T_i of each spatial point; α_i, T_i and t_i are substituted into formula (3) to obtain the depth d(r) of the corresponding point of the target object. c_a, a_i and s_i are substituted into formula (5) to compute the predicted color c_i of the corresponding ray sampling point, and T_i, c_i and α_i are substituted into formula (2) to obtain the color c(r) of the pixel corresponding to the ray. The network weights are updated through the constructed loss function: β_i, T_i and α_i are substituted into formula (8) to compute β(r), the predicted likelihood that the ray's pixel belongs to a transient object; c(r), β(r) and the color value c_GT(r) of the corresponding pixel in the input image of the corresponding level are substituted into formula (7) to construct the loss term L_RGB; T_i, α_i and s_i are substituted into formula (6) to construct the loss term L_SC; d(r) is substituted into formula (9) to construct the loss term L_DS; and finally L_RGB, L_SC and L_DS are substituted into formula (10) to obtain the final loss function.
The basic formulas for rendering a ray are as follows:

c(r) = Σ_{i=1}^{N} T_i·α_i·c_i (2)

d(r) = Σ_{i=1}^{N} T_i·α_i·t_i (3)

where c(r) is the color value obtained by rendering the ray r, d(r) is the depth value of the corresponding point of the target object on the ray, N is the number of points sampled on ray r, and t_i, c_i, α_i and T_i are, respectively, the distance to the camera, the predicted color value, the transmittance and the opacity factor of the i-th sampled point on ray r, defined as follows:

α_i = 1 − exp(−σ_i·δ_i),  T_i = Π_{j=1}^{i−1} (1 − α_j) (4)

where σ_i is the target object voxel density obtained by inputting the coordinates of the i-th point and its ray direction into the first half of the MLP, and δ_i = t_{i+1} − t_{i−1} is the distance between adjacent sampling points.
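The compositing in formulas (2)–(4) can be sketched in a few lines; the boundary handling of δ_i at the first and last samples is an assumption, since the patent only defines δ_i = t_{i+1} − t_{i−1} for interior points:

```python
import math

def composite(sigmas, ts, colors):
    """Formulas (2)-(4): alpha_i, T_i from voxel density, then c(r), d(r)."""
    n = len(sigmas)
    # delta_i = t_{i+1} - t_{i-1}; clamped at the ends (assumption)
    deltas = [ts[min(i + 1, n - 1)] - ts[max(i - 1, 0)] for i in range(n)]
    alphas = [1.0 - math.exp(-s * d) for s, d in zip(sigmas, deltas)]
    Ts, acc = [], 1.0
    for a in alphas:                         # T_i = prod_{j<i} (1 - alpha_j)
        Ts.append(acc)
        acc *= 1.0 - a
    w = [T * a for T, a in zip(Ts, alphas)]  # per-point compositing weights
    c_r = [sum(wi * c[k] for wi, c in zip(w, colors)) for k in range(3)]  # (2)
    d_r = sum(wi * t for wi, t in zip(w, ts))                             # (3)
    return c_r, d_r, alphas, Ts

# density concentrated at the middle sample -> depth near t = 1
c_r, d_r, alphas, Ts = composite([0.0, 5.0, 0.0], [0.0, 1.0, 2.0],
                                 [[1.0, 0.0, 0.0]] * 3)
```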
The color value c_i rendered at point i on each ray is computed using the shadow-aware irradiance model proposed in S-NeRF, with the calculation formula:

c(x_i, ω, t_j) = c_a(x_i)·(s(x_i, ω) + (1 − s(x_i, ω))·a(ω)) (5)
where c(x_i, ω, t_j) substitutes for c_i in the rendering formula; x_i is the three-dimensional coordinate of point i on the ray; ω is the solar direction angle; c_a(x_i) is the reflected light color output by the MLP, i.e. c_a in FIG. 1; a(ω) is the ambient light color output by the MLP; t_j is the time-phase data of the image; and s(x_i, ω) is the shadow scalar, taking values between 0 and 1, which determines the shadow region via the albedo of the scene. Ideally, at 3D points directly illuminated by the sun, s ≈ 1 and the color is entirely explained by the reflectance c_a(x).
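A minimal sketch of formula (5); applying it per color channel is an assumption, since the patent writes the formula at the point level:

```python
def shaded_color(c_a, s, a):
    """Formula (5): colour = c_a * (s + (1 - s) * a), per channel."""
    return [ca * (s + (1.0 - s) * amb) for ca, amb in zip(c_a, a)]

# fully sunlit (s = 1): colour reduces to the reflected colour c_a
sunlit = shaded_color([0.8, 0.6, 0.4], 1.0, [0.2, 0.2, 0.3])
# fully shadowed (s = 0): c_a is attenuated by the ambient light a
shadowed = shaded_color([0.8, 0.6, 0.4], 0.0, [0.5, 0.5, 0.5])
```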
The loss function comprises a solar ray direction correction term, an MSE loss and a depth supervision loss, constructed as follows:
in practice the direction of the solar rays ω is closely related to the acquisition date (in particular the satellite passing at the same time of day). Thus, due to the mixture of phenomena, the finally captured ambient irradiance of a (ω) is not only related to ω but also to the conditions of a particular date, such as weather or seasonal variations. The model can not correct the distortion caused by the change of different sun ray directions from training data, and the invention solves the problem by adding a sun ray direction correction term in a loss function, wherein the calculation formula of the correction term is as follows:
wherein N is SC Refers to the total number of sampling points, s, on ray r1 i Scalar for the shadow of the ith point on ray R1, R SC Is a secondary sun correction ray that follows the direction of the sun ray omega, while the other primary rays follow the viewing direction of the camera. The learnable geometry used by the solar ray correction term is defined by the transmittance T i And transparency alpha i Encoding to further supervise the learning of shadow-aware shadows s (x, ω). The first part of the formula means that for R SC R1, s predicted at the ith point i Should be matched with T i Similarly; while the second part of the formula requires that the integral of s/r1 is as close to 1 as possible, since non-occluded and non-shadowed areas must beMainly explained by the reflectivity in the shadow-aware irradiance model. The above operation is an operation of adding a shadow process as in fig. 2.
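One ray's contribution to the solar correction term, as described above (the first part matches s_i to T_i, the second pushes the accumulated shadow integral towards 1), might be sketched as:

```python
def solar_correction(Ts, alphas, ss):
    """Sketch of one secondary solar ray's contribution to L_SC."""
    match = sum((T - s) ** 2 for T, s in zip(Ts, ss))             # s_i ~ T_i
    integral = sum(T * a * s for T, a, s in zip(Ts, alphas, ss))  # ~ 1
    return match + 1.0 - integral

# a perfectly consistent, fully lit single-sample ray gives zero loss
assert solar_correction([1.0], [1.0], [1.0]) == 0.0
```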
The MSE loss follows the task-uncertainty learning method of W-NeRF to improve the robustness of the model; in this embodiment the uncertainty prediction β weights the contribution of each ray to the MSE between the rendered and known colors, as follows:

L_RGB(R) = Σ_{r2∈R} [ ‖c(r2) − c_GT(r2)‖² / (2·β′(r2)²) + (log β′(r2) + η) / 2 ] (7)
Here, R is the set of rays input in this training pass, c_GT(r2) is the true color value of the pixel corresponding to ray r2 in the input image, and β′(r2) = β(r2) + β_min, where β_min and η are manually specified empirical values; β_min = 0.05 and η = 3 are taken to avoid negative values in the logarithm. The logarithm in L_RGB prevents β from diverging to infinity. In this way the model trades off between the uncertainty coefficient β and the color difference to obtain the final value.
The β(r2) associated with ray r2 is obtained by integrating the uncertainty predictions of the N samples of r2, as follows:

β(r2) = Σ_{i=1}^{N} T_i·α_i·β_i (8)

where β_i is the uncertainty prediction of the i-th point of ray r2. This operation corresponds to the transient-object processing step added in FIG. 2.
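Formulas (7)–(8) can be sketched per ray as follows; the exact form of L_RGB is a reconstruction consistent with the description of β′, β_min and η above:

```python
import math

def beta_ray(Ts, alphas, betas):
    """Formula (8): aggregate per-point uncertainties along the ray."""
    return sum(T * a * b for T, a, b in zip(Ts, alphas, betas))

def l_rgb_term(c_pred, c_gt, beta_r, beta_min=0.05, eta=3.0):
    """One ray's uncertainty-weighted colour term (sketch of formula (7))."""
    b = beta_r + beta_min                      # beta'(r) = beta(r) + beta_min
    sq = sum((p - g) ** 2 for p, g in zip(c_pred, c_gt))
    return sq / (2.0 * b * b) + (math.log(b) + eta) / 2.0

# a confident ray (small beta) is penalised harder for the same colour error
confident = l_rgb_term([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], 0.0)
uncertain = l_rgb_term([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], 1.0)
```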
Meanwhile, in order to learn the depth information of the scene, a depth supervision loss function is constructed, with the calculation formula:

L_DS(R_DS) = Σ_{r3∈R_DS} w(r3)·(d(r3) − ‖X(r3) − o(r3)‖₂)² (9)

where R_DS is the set of rays constructed from key points extracted from the remote sensing image by the SIFT algorithm, X(r3) is the three-dimensional coordinate of the sampling point on the ray, o(r3) is the ray origin, and w(r3) is the contribution weight of X(r3) to the depth supervision information, i.e. the ratio of ‖X(r3) − o(r3)‖ for each ray selected in this training pass to the sum of these values over all rays.
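The depth supervision term, with the weight w(r3) defined as above, might be sketched like this; the squared-error form is a reconstruction, as the patent text only gives the weight definition:

```python
import math

def l_ds(rays):
    """rays: list of (d_pred, X, o); weight each ray by ||X - o|| / batch sum."""
    norms = [math.dist(X, o) for _, X, o in rays]
    total = sum(norms) or 1.0
    return sum((n / total) * (d - n) ** 2
               for (d, _, _), n in zip(rays, norms))

# predictions that match the keypoint depths give zero loss
perfect = l_ds([(5.0, (3.0, 4.0, 0.0), (0.0, 0.0, 0.0)),
                (1.0, (1.0, 0.0, 0.0), (0.0, 0.0, 0.0))])
```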
Finally, the final loss function is constructed as a weighted sum of the loss terms:
L = L_RGB(R) + λ_SC·L_SC(R_SC) + λ_DS·L_DS(R_DS) (10)
where λ_SC and λ_DS are the weights of the corresponding loss terms, taken as 0.1/3 and 1000/3 respectively.
Step 4, realizing three-dimensional reconstruction of the building using the network trained in step 3.
After network training is finished, the network weights are saved in a ckpt file; with this file, end-to-end three-dimensional reconstruction can be performed directly from an input remote sensing image, and the final result is output.
In specific implementation, the invention may adopt computer software technology to realize an automatic operation flow; a device running the flow of the invention should also fall within the protection scope.
It should be understood that the foregoing description of the preferred embodiments is not intended to limit the scope of the invention; those skilled in the art may make substitutions or modifications without departing from the scope of the invention as defined by the appended claims.
Claims (10)
1. A large-scale building three-dimensional reconstruction method integrating multi-scale image features is characterized by comprising the following steps:
step 1, remote sensing data are selected and preprocessed, and a data set is divided into training and testing data sets according to a certain proportion;
acquiring RGB data, RPC model parameters, sun direction parameters and time-phase data of the remote sensing image from the data set; acquiring, through preprocessing, the rays emitted from the satellite sensor based on the RPC model, wherein each ray is a vector representing the origin coordinates from which the ray is emitted and its direction; and finally organizing the RGB data, the sun direction parameters and the ray vectors into dictionary-type data for use;
step 2, constructing a basic network module and a residual network module, wherein the main body of each is a multi-layer perceptron (MLP) comprising an input layer, several hidden layers and an output layer; each layer of the MLP has width h, the input of the input layer is the coordinates of any point on a ray and the ray direction, and the output of the first half of the MLP is the predicted pixel value and voxel density;
after the voxel density σ is output, a hidden layer of width h is additionally added, taking the RGB output of the first half of the MLP as input; a further hidden layer of width h/2 and an output layer then yield a shadow scalar s, an uncertainty prediction β, a reflected light color c_a and an ambient light color a;
step 3, up-sampling remote sensing images input in the training data set by using a plurality of convolution layers to obtain a plurality of images with different scales, constructing an image pyramid by using the multi-scale images, and constructing the images in the image pyramid as light rays; firstly, inputting light rays corresponding to the highest-level image to a basic network module to learn overall scene information of a relatively abstract macro scale so as to obtain an output result; inputting light rays corresponding to other scale images into the residual error network module layer by layer to learn more specific and rich scene detail information in the low-level images, inputting the light rays corresponding to each image into one residual error network module, fusing the results of all the residual error network modules to obtain a unified output, finally fusing the unified output of the residual error network module with the output result of the basic network module to obtain a final output result, calculating the depth of a corresponding point of a target object and the color of a corresponding pixel through the final output result, and updating the network weight through a constructed loss function;
and step 4, inputting the remote sensing images in the test data set into the trained network in the step 3, and realizing three-dimensional reconstruction of the remote sensing images to be reconstructed.
2. A method for three-dimensional reconstruction of a large-scale building incorporating multi-scale image features as defined in claim 1, wherein: the processing of the first half of the MLP in step 2 is shown below:
(RGB,σ)=F(x,d) (1)
where RGB represents the pixel values predicted by MLP, σ is the voxel density, x represents the three-dimensional coordinates of the point on the ray, and d is the direction of the ray.
3. A method for three-dimensional reconstruction of a large-scale building incorporating multi-scale image features as defined in claim 1, wherein: the specific implementation manner of calculating the depth of the corresponding point of the target object and the color of the corresponding pixel in the step 3 is as follows;
step 3.1, first, σ_i is substituted into formula (2) to compute the transmittance α_i and opacity factor T_i of each spatial point:

α_i = 1 − exp(−σ_i·δ_i),  T_i = Π_{j=1}^{i−1} (1 − α_j) (2)

where σ_i is the target object voxel density obtained by inputting the coordinates of the i-th point and its ray direction into the first half of the MLP, and δ_i = t_{i+1} − t_{i−1} is the distance between two adjacent sampling points;
step 3.2, α_i, T_i and t_i are substituted into formula (3) to obtain the depth d(r) of the corresponding point of the target object:

d(r) = Σ_{i=1}^{N} T_i·α_i·t_i (3)
step 3.3, c_a, a_i and s_i are substituted into formula (4) to compute the color predicted value c_i of the corresponding ray sampling point:
c(x_i, ω, t_j) = c_a(x_i)·(s(x_i, ω) + (1 − s(x_i, ω))·a(ω)) (4)
wherein c(x_i, ω, t_j) denotes the color value rendered at point i on each ray, i.e. c_i; x_i represents the three-dimensional coordinates of point i on the ray; ω is the sun direction angle; c_a(x_i) is the reflected light color c_a output by the MLP; a(ω) is the ambient light color output by the MLP; t_j is the time-phase data of the image; and s(x_i, ω) is the shadow scalar, taking a value between 0 and 1;
step 3.4, T_i, c_i and α_i are substituted into formula (5) to obtain the color c(r) of the pixel corresponding to the ray:

c(r) = Σ_{i=1}^{N} T_i·α_i·c_i (5)

where c(r) is the color value obtained by rendering the ray r, d(r) is the depth value of the corresponding point of the target object on the ray, N is the number of points sampled on the ray, and t_i, c_i, α_i and T_i are, respectively, the distance to the camera, the predicted color value, the transmittance and the opacity factor of the i-th sampled point on ray r.
4. A method for three-dimensional reconstruction of a large-scale building incorporating multi-scale image features as claimed in claim 3, wherein: the loss function constructed in step 2 comprises a solar ray direction correction term, an MSE loss and a depth supervision loss, wherein the calculation formula of the solar ray direction correction term is:

L_SC(R_SC) = Σ_{r1∈R_SC} [ Σ_{i=1}^{N_SC} (T_i − s_i)² + 1 − Σ_{i=1}^{N_SC} T_i·α_i·s_i ] (6)

where N_SC is the total number of sampling points on the ray, s_i is the shadow scalar of the i-th point on the ray, and R_SC is the set of secondary solar correction rays.
5. The method for three-dimensional reconstruction of a large-scale building fused with multi-scale image features according to claim 4, wherein: the calculation formula of the MSE loss is:

L_RGB(R) = Σ_{r2∈R} [ ‖c(r2) − c_GT(r2)‖² / (2·β′(r2)²) + (log β′(r2) + η) / 2 ] (7)

where R is the set of rays input in this training pass, c_GT(r2) is the true color value of the pixel corresponding to ray r2 in the input image, β′(r2) = β(r2) + β_min, and β_min and η are specified empirical values.
The β(r2) associated with ray r2 is obtained by integrating the uncertainty predictions of the N samples of r2, as follows:

β(r2) = Σ_{i=1}^{N} T_i·α_i·β_i (8)

where β_i is the uncertainty prediction of the i-th point of ray r2.
6. The method for three-dimensional reconstruction of a large-scale building fused with multi-scale image features according to claim 5, wherein: in order to learn the depth information of the scene, a depth supervision loss is constructed, with the calculation formula:

L_DS(R_DS) = Σ_{r3∈R_DS} w(r3)·(d(r3) − ‖X(r3) − o(r3)‖₂)² (9)

where R_DS is the set of rays constructed from key points extracted from the remote sensing image by the SIFT algorithm, X(r3) is the three-dimensional coordinate of the sampling point on the ray, o(r3) is the ray origin, and w(r3) is the contribution weight of X(r3) to the depth supervision information, i.e. the ratio of ‖X(r3) − o(r3)‖ for each ray selected in this training pass to the sum of these values over all rays.
7. The method for three-dimensional reconstruction of a large-scale building fused with multi-scale image features according to claim 6, wherein: the calculation formula of the final loss function is as follows:
L=L RGB (R)+λ SC L SC (R SC )+λ DS L DS (R DS ) (10)
wherein lambda is SC And lambda is DS Is the weight of the corresponding loss term.
8. A method for three-dimensional reconstruction of a large-scale building incorporating multi-scale image features as defined in claim 1, wherein: after network training is completed, the network weights are saved in a ckpt file; end-to-end three-dimensional reconstruction is performed directly from the input remote sensing image by means of the ckpt file, a three-dimensional model of the whole scene is reconstructed from the output c(r) and d(r) using an image tool, and a corresponding DSM or mp4-type visualization result is output.
9. The method for three-dimensional reconstruction of a large-scale building fused with multi-scale image features according to claim 7, wherein: λ_SC and λ_DS are taken as 0.1/3 and 1000/3 respectively.
10. A method for three-dimensional reconstruction of a large-scale building incorporating multi-scale image features as defined in claim 1, wherein: in step 1, an IEEE GRSS data set is selected, wherein the data set comprises the following parts:
(1) WorldView-3 satellite images, panchromatic and eight-band visible and near-infrared, with ground sampling distances of 35 cm and 1.3 m respectively;
(2) Three-dimensional data provided by point clouds or digital surface models DSMs generated by an airborne laser radar with a resolution of 80 centimeters;
(3) Sensor RPC parameter, solar ray direction and shooting time phase information data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310194010.2A CN116402942A (en) | 2023-03-02 | 2023-03-02 | Large-scale building three-dimensional reconstruction method integrating multi-scale image features |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116402942A true CN116402942A (en) | 2023-07-07 |
Family
ID=87016846
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310194010.2A Pending CN116402942A (en) | 2023-03-02 | 2023-03-02 | Large-scale building three-dimensional reconstruction method integrating multi-scale image features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116402942A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116580161A (en) * | 2023-07-13 | 2023-08-11 | 湖南省建筑设计院集团股份有限公司 | Building three-dimensional model construction method and system based on image and NeRF model |
CN116580161B (en) * | 2023-07-13 | 2023-09-22 | 湖南省建筑设计院集团股份有限公司 | Building three-dimensional model construction method and system based on image and NeRF model |
CN117765165A (en) * | 2023-12-06 | 2024-03-26 | 之江实验室 | Three-dimensional reconstruction method and device, storage medium and electronic equipment |
CN117765172A (en) * | 2023-12-12 | 2024-03-26 | 之江实验室 | Method and device for three-dimensional reconstruction of remote sensing image |
CN117765171A (en) * | 2023-12-12 | 2024-03-26 | 之江实验室 | Three-dimensional model reconstruction method and device, storage medium and electronic equipment |
CN117765172B (en) * | 2023-12-12 | 2024-05-28 | 之江实验室 | Method and device for three-dimensional reconstruction of remote sensing image |
CN117710583A (en) * | 2023-12-18 | 2024-03-15 | 中铁第四勘察设计院集团有限公司 | Space-to-ground image three-dimensional reconstruction method, system and equipment based on nerve radiation field |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN116402942A (en) | Large-scale building three-dimensional reconstruction method integrating multi-scale image features | |
Marí et al. | Sat-nerf: Learning multi-view satellite photogrammetry with transient objects and shadow modeling using rpc cameras | |
CN115082639B (en) | Image generation method, device, electronic equipment and storage medium | |
Tian et al. | Depth estimation using a self-supervised network based on cross-layer feature fusion and the quadtree constraint | |
Derksen et al. | Shadow neural radiance fields for multi-view satellite photogrammetry | |
CA3214444A1 (en) | Quotidian scene reconstruction engine | |
CN103559737A (en) | Object panorama modeling method | |
CN110910437B (en) | Depth prediction method for complex indoor scene | |
MX2013003853A (en) | Rapid 3d modeling. | |
CN114758337B (en) | Semantic instance reconstruction method, device, equipment and medium | |
US20230281913A1 (en) | Radiance Fields for Three-Dimensional Reconstruction and Novel View Synthesis in Large-Scale Environments | |
CN116580161B (en) | Building three-dimensional model construction method and system based on image and NeRF model | |
CN115937288A (en) | Three-dimensional scene model construction method for transformer substation | |
Condorelli et al. | A comparison between 3D reconstruction using nerf neural networks and mvs algorithms on cultural heritage images | |
CN116245757B (en) | Multi-scene universal remote sensing image cloud restoration method and system for multi-mode data | |
CN111683221B (en) | Real-time video monitoring method and system for natural resources embedded with vector red line data | |
CN116612468A (en) | Three-dimensional target detection method based on multi-mode fusion and depth attention mechanism | |
CN115147709A (en) | Underwater target three-dimensional reconstruction method based on deep learning | |
Li et al. | 3D virtual urban scene reconstruction from a single optical remote sensing image | |
CN117115359A (en) | Multi-view power grid three-dimensional space data reconstruction method based on depth map fusion | |
Zhang et al. | Fast satellite tensorial radiance field for multi-date satellite imagery of large size | |
CN116152442A (en) | Three-dimensional point cloud model generation method and device | |
CN116310228A (en) | Surface reconstruction and new view synthesis method for remote sensing scene | |
CN108171731B (en) | Minimum image set automatic optimization method considering topological geometry multi-element constraint | |
Motayyeb et al. | Enhancing contrast of images to improve geometric accuracy of a UAV photogrammetry project |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||