CN112509124A - Depth map obtaining method and system, unmanned aerial vehicle orthographic map generating method and medium - Google Patents

Depth map obtaining method and system, unmanned aerial vehicle orthographic map generating method and medium Download PDF

Info

Publication number
CN112509124A
CN112509124A
Authority
CN
China
Prior art keywords
matching
pictures
depth
ncc
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011462830.8A
Other languages
Chinese (zh)
Other versions
CN112509124B (en
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Shuzhilian Technology Co Ltd
Original Assignee
Chengdu Shuzhilian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Shuzhilian Technology Co Ltd filed Critical Chengdu Shuzhilian Technology Co Ltd
Priority to CN202011462830.8A priority Critical patent/CN112509124B/en
Publication of CN112509124A publication Critical patent/CN112509124A/en
Application granted granted Critical
Publication of CN112509124B publication Critical patent/CN112509124B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • G06T7/593Depth or shape recovery from multiple images from stereo images
    • G06T7/596Depth or shape recovery from multiple images from stereo images from three or more stereo images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention discloses a depth map obtaining method and system, an unmanned aerial vehicle orthographic map generating method and a medium, relating to the field of unmanned aerial vehicle remote sensing image processing and comprising the following steps: shooting the same scene at a plurality of positions with an unmanned aerial vehicle to obtain a plurality of pictures; obtaining, through feature point matching, a plurality of matching pairs among different pictures and whether the pictures overlap; performing motion recovery calculation based on the matching result to obtain the camera extrinsic parameters and sparse space point clouds of all the pictures; performing multi-view stereo geometric calculation on all the pictures to obtain the dense depth maps corresponding to all the pictures. The method matches pixels using block matching, and the block matching adds a spatial constraint to the depth measurement of pixel points through regularization processing. The spatial regularization constraint reduces the depth measurement error and makes local depths tend to be uniform, which suits the situation where the depth variation in a high-altitude unmanned aerial vehicle three-dimensional reconstruction scene is small; at the same time, the depths of some points where block matching fails can be reasonably recovered.

Description

Depth map obtaining method and system, unmanned aerial vehicle orthographic map generating method and medium
Technical Field
The invention relates to the field of unmanned aerial vehicle remote sensing image processing, in particular to a depth map obtaining method and system, an unmanned aerial vehicle orthographic map generating method and medium.
Background
At present, three-dimensional reconstruction technology is maturing day by day, and its fusion with deep learning is also developing vigorously. The core techniques in three-dimensional reconstruction are matching point pairs, SfM (structure from motion) and MVS (multi-view stereo). The main function of matching point pairs is to match pixels with the same name (homonymous pixels); the main methods are the feature point method and the optical flow method. SfM estimates camera parameters from the matching point pairs and optimizes them globally so that the overall error is small. MVS uses the estimated camera parameters to measure the depth of pixel points in the images and obtain dense or semi-dense point clouds. Here depth refers to the distance between the measuring tool and the surface of the measured target, where the measuring tool is the camera and the measured target is the object corresponding to each pixel point in the image.
At present, in three-dimensional reconstruction technology, there is no method specifically adapted to the high-altitude unmanned aerial vehicle scenario, in which the measurement distance is long and the depth variation during reconstruction is small (about 5% of the flight height), i.e. the height variation of the photographed objects is small. In conventional MVS methods, the farther the target, the less accurate the depth measurement. These methods must also adapt to both drastic and gentle depth changes, which increases the chance of measurement errors in unmanned aerial vehicle three-dimensional reconstruction. In addition, common block matching algorithms are prone to mismatching.
Disclosure of Invention
In order to solve the problems in the background art, the invention provides a depth map obtaining method and system and an unmanned aerial vehicle orthographic map generating method. The depth measurement error can be reduced through space regularization constraint, the local depth tends to be uniform, the method is more suitable for the condition that the depth change amplitude in a high-altitude unmanned aerial vehicle three-dimensional reconstruction scene is not large, and meanwhile, the depth of some block matching failure points can be reasonably recovered.
To achieve the above object, the present invention provides a depth map obtaining method, including:
shooting the same scene at a plurality of positions by using an unmanned aerial vehicle to obtain a plurality of pictures;
obtaining a plurality of matching pairs among different pictures and whether the pictures are overlapped or not through feature point matching;
based on the matching result, motion recovery calculation is carried out to obtain camera external parameters and sparse space point clouds of all the pictures;
performing multi-view stereo geometric calculation on all the pictures to obtain dense depth maps corresponding to all the pictures;
in the multi-view stereo geometric calculation process, matching among pixels is carried out by using block matching, and space constraint is added for pixel point depth measurement through regularization processing by the block matching.
The method spatially regularizes the depth to adapt to the unmanned aerial vehicle aerial shooting scene, which improves the accuracy of depth measurement and at the same time gives a reasonable depth to some points that otherwise cannot be measured. The method is used in MVS (multi-view stereo); after the matching and SfM steps, the camera parameters of each camera, the space coordinates of a plurality of space points, and the association between the space points and the pixel points in the images have been obtained.
Preferably, the method constructs a spatial regularization term during the block matching process, and performs the regularization process by using a linear function with respect to the depth distance.
Preferably, the method performs block matching by using a normalized cross-correlation matching method.
Preferably, the method uses a linear function in equation (1) to perform the regularization process:
ncc_norm = (k1·Δd + bias)·ncc  (1)

where Δd = |d − dm|, k1 is the slope of the linear function, bias is the bias of the linear function, ncc is the value obtained by the de-meaned NCC calculation, d is the depth measured with the current pixel as the matching point, and dm is the predicted depth.
Preferably, the method uses a linear function in equation (2) to perform the regularization process:
ncc_norm = (k1·l + bias)·ncc  (2)

where k1 is the slope of the linear function, bias is the bias of the linear function, ncc is the value obtained by the de-meaned NCC calculation, and l is the pixel distance between the matching point and the expected matching point.
The method can represent the distance of the depth by simply calculating the pixel distance, and avoids the need of calculating the depth for each matching.
Preferably, the height change of the object shot in the scene shot by the unmanned aerial vehicle is within a preset range. I.e. the variation in height of the ground-based object is relatively small with respect to the flying height.
The present invention also provides a depth map obtaining system, the system comprising:
the picture obtaining unit is used for shooting the same scene at a plurality of positions by using the unmanned aerial vehicle to obtain a plurality of pictures;
the characteristic point matching unit is used for obtaining a plurality of matching pairs among different pictures and whether the pictures are overlapped or not through characteristic point matching;
the motion recovery calculation unit is used for carrying out motion recovery calculation to obtain camera external parameters and sparse space point clouds of all the pictures based on the matching result;
the multi-view solid geometry calculation unit is used for performing multi-view solid geometry calculation on all the pictures to obtain dense depth maps corresponding to all the pictures;
in the multi-view stereo geometric calculation process, pixel matching is carried out by using block matching, and space constraint is added for pixel point depth measurement by regularization processing of the block matching.
The system uses a linear function in formula (1) to perform regularization processing:
ncc_norm = (k1·Δd + bias)·ncc  (1)

where Δd = |d − dm|, k1 is the slope of the linear function, bias is the bias of the linear function, ncc is the value obtained by the de-meaned NCC calculation, d is the depth measured with the current pixel as the matching point, and dm is the predicted depth.
The system uses a linear function in an equation (2) to perform regularization processing:
ncc_norm = (k1·l + bias)·ncc  (2)

where k1 is the slope of the linear function, bias is the bias of the linear function, ncc is the value obtained by the de-meaned NCC calculation, and l is the pixel distance between the matching point and the expected matching point.
The invention also provides an unmanned aerial vehicle orthograph generation method, which comprises the following steps:
shooting a scene from a plurality of angles by using an unmanned aerial vehicle to obtain a plurality of pictures;
preprocessing all pictures;
extracting feature points from the preprocessed pictures, matching any two pictures, determining a picture pair with an overlapping region, and obtaining a feature point matching result;
performing motion recovery calculation based on the feature point matching result to obtain camera parameters of each picture and coordinates of space points corresponding to the matching feature points;
based on the unmanned aerial vehicle camera parameters and the coordinates of the space points corresponding to the matched feature points, performing multi-view stereo geometric calculation to obtain dense depth maps of all the pictures; performing pixel matching by using epipolar line search and block matching in the multi-view solid geometry calculation process;
carrying out depth unified processing on the dense depth maps of all the pictures;
carrying out image fusion processing on the dense depth map subjected to the depth unification processing to generate an unmanned aerial vehicle orthographic map;
performing motion recovery calculation on all pictures to obtain camera parameters and sparse space point clouds of all the pictures;
performing multi-view stereo geometric calculation on all the pictures based on the camera parameters and the sparse space point clouds of all the pictures to obtain dense depth maps corresponding to all the pictures;
in the multi-view stereo geometric calculation process, a block matching method is used for matching pixels, and space constraint is added for pixel point depth measurement through regularization processing in the block matching.
The invention also provides a depth map obtaining device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the steps of the depth map obtaining method when executing the computer program.
The invention also provides a computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the depth map obtaining method.
One or more technical schemes provided by the invention at least have the following technical effects or advantages:
the traditional ncc without regularization often has the phenomena of matching error and nonuniform local depth, after the regularization processing in the invention, the matching accuracy in the repeated texture is improved, a reasonable depth is calculated in the area with changed brightness, so that the local depth is unified, and the finally generated depth map is obviously denser than the depth map without regularization. The depth measurement error can be reduced through space regularization constraint, the local depth tends to be uniform, the method is more suitable for the condition that the depth change amplitude in a high-altitude unmanned aerial vehicle three-dimensional reconstruction scene is not large, and meanwhile, the depth of some block matching failure points can be reasonably recovered.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention;
FIG. 1 is a schematic flow chart of a depth map acquisition method;
FIG. 2 is a geometric schematic of epipolar lines in an imaging model;
FIG. 3 is a geometric schematic of the epipolar line after the coordinate system is established;
fig. 4 is a schematic composition diagram of a depth map acquisition system.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflicting with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described and thus the scope of the present invention is not limited by the specific embodiments disclosed below.
Example one
The invention provides a depth map obtaining method, and FIG. 1 is a flow diagram of the depth map obtaining method, and the method includes:
shooting the same scene at a plurality of positions by using an unmanned aerial vehicle to obtain a plurality of pictures;
obtaining a plurality of matching pairs among different pictures and whether the pictures are overlapped or not through feature point matching;
based on the matching result, motion recovery calculation is carried out to obtain camera external parameters and sparse space point clouds of all the pictures;
performing multi-view stereo geometric calculation on all the pictures to obtain dense depth maps corresponding to all the pictures;
in the multi-view stereo geometric calculation process, matching among pixels is carried out by using block matching, and space constraint is added for pixel point depth measurement through regularization processing by the block matching.
The depth is spatially regularized by the method, so that the method is suitable for an unmanned aerial vehicle aerial shooting scene, the accuracy of depth measurement is improved, and meanwhile, a reasonable depth can be given to some points which cannot be measured. The method is used in MVS (multi-view stereoscopic vision), and after the steps of matching, SfM and the like, the camera parameters of each camera, the space coordinates of a plurality of space points and the association of the space points and pixel points in an image are obtained.
The camera parameters mainly include intrinsic and extrinsic parameters. The intrinsic parameters are usually represented by a 3x3 matrix, written K, and convert the camera coordinate system to the image coordinate system; pictures taken by the same camera share the same intrinsic parameters. The extrinsic parameters consist of a rotation matrix R and a translation vector t, which convert the world coordinate system to the camera coordinate system. Assume a certain spatial point P in the world coordinate system corresponds to pixel p (the pixel coordinate here is a homogeneous coordinate, i.e. it has 3 dimensions); then the conversion from the world coordinate system to the pixel coordinate can be written as:
z·(u, v, 1)^T = K·(R·(x, y, z)^T + t)

where the z on the left is the depth, a scalar, and is the quantity to be measured by MVS; (u, v) is the coordinate of pixel p, (u, v, 1) is its homogeneous representation, and (x, y, z) on the right is the coordinate of the spatial point P.
The steps of measuring the depth of a single pixel in the MVS are as follows:
1. Epipolar line search:
(1) Finding the epipolar line
Suppose pictures A and B both originate from the same camera, so they have the same camera intrinsic parameters, denoted K. Suppose the camera extrinsic parameters of picture A are R1, t1, and the camera extrinsic parameters of picture B are R2, t2.
For two cameras whose parameters have been obtained, the depth of a pixel point can be computed once the corresponding point of that pixel in the other camera is found. For a pixel with unknown depth, the corresponding space point must lie somewhere on the ray formed by the camera optical center and the pixel, and this ray is imaged as a straight line in the other camera. If the camera extrinsic parameters obtained by SfM are accurate, the point matching this pixel must also lie on that line, which is called the epipolar line.
The optical center of the pinhole camera refers to the pinhole, and the optical center of the digital camera is generally the optical center of the convex lens at the forefront of the lens.
If the depth of a pixel in picture A is known, the position of its corresponding point in picture B can be calculated by remapping with the camera parameters. Suppose a pixel point in picture A is p1, the point of picture A after depth normalization in the camera coordinate system is P1 with depth z1, the corresponding pixel point in picture B is p2, its depth-normalized point is P2, and its depth is z2. Then the following equations can be obtained:

P1 = K⁻¹·p1  (4)

p2 = K·P2  (5)

z2·P2 = R2·R1⁻¹·(z1·P1 − t1) + t2  (6)

Let:

R21 = R2·R1⁻¹,  t21 = t2 − R2·R1⁻¹·t1

The following can be obtained:

z2·P2 = z1·R21·P1 + t21  (7)
For z1, its theoretical range is (0, +∞), i.e. there is no upper limit on the epipolar line length. However, in a specific picture the depths of all pixel points fall within a range, and the space points obtained by SfM can be used to estimate the mean and standard deviation of the pixel depths of the whole picture. Assume the mean is e1 and the standard deviation is d1. Then it can be assumed that z1 ∈ [e1 − λ·d1, e1 + λ·d1] = [zmin, zmax], where λ is a parameter to be set, generally 2 or 3.
Then the line segment into which the epipolar line is imaged in picture B can be obtained; its endpoints are pmin and pmax respectively:

a1·pmin = K·(zmin·R21·K⁻¹·p1 + t21)  (8)

a2·pmax = K·(zmax·R21·K⁻¹·p1 + t21)  (9)

where a1 and a2 are the scale factors of the homogeneous pixel coordinates.
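A minimal sketch of this epipolar search range, assuming NumPy and the symbols of equations (4), (8) and (9); R21, t21 and the sparse depths from SfM are taken as given, and λ = 3 is simply the typical value mentioned above.

import numpy as np

def epipolar_segment(K, R21, t21, p1, sparse_depths, lam=3.0):
    # depth prior of the whole picture from the sparse SfM points
    e1, d1 = float(np.mean(sparse_depths)), float(np.std(sparse_depths))
    z_min, z_max = e1 - lam * d1, e1 + lam * d1
    P1 = np.linalg.inv(K) @ np.array([p1[0], p1[1], 1.0])   # equation (4)
    endpoints = []
    for z in (z_min, z_max):
        q = K @ (z * (R21 @ P1) + t21)      # equations (8)/(9), up to the scale factor
        endpoints.append(q[:2] / q[2])      # pmin, pmax in pixel coordinates of picture B
    return endpoints[0], endpoints[1]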
(2) Block matching
Matching the homonym points by a single pixel is not accurate enough, and block matching is often used, that is, the gray values of surrounding pixels are also considered. The simpler block matching is the de-averaged NCC:
NCC(A,B)_P = Σ_{i,j} (A(i,j) − E(A_P))·(B(i,j) − E(B_P)) / sqrt( Σ_{i,j} (A(i,j) − E(A_P))² · Σ_{i,j} (B(i,j) − E(B_P))² )
wherein NCC(A,B)_P is the NCC of the two pixel blocks obtained by remapping the point P into pictures A and B, A(i,j) is the gray value at location (i,j) in A, E(A_P) is the mean gray value of the pixel obtained by remapping P into A together with its neighbourhood, and A_P is the block formed by the pixel obtained by remapping P into A and its neighbourhood.
The range of NCC values is [ -1,1], with closer to 1 indicating more similarity. A threshold is set for NCC, and when NCC is greater than the threshold, two points are considered similar, being homonymous points. Typically this threshold is taken to be 0.85.
NCC here refers to the normalized cross-correlation matching method (normalized cross correlation), a matching method based on image gray-scale information that normalizes the degree of correlation between the objects to be matched. The normalized cross-correlation matching algorithm is a classical algorithm among image matching algorithms.
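The de-meaned NCC above and the 0.85 threshold can be sketched as follows (an illustration only; patches are assumed to be equally sized gray-scale arrays).

import numpy as np

def demeaned_ncc(patch_a, patch_b):
    a = patch_a.astype(np.float64) - patch_a.mean()
    b = patch_b.astype(np.float64) - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def is_homonymous(patch_a, patch_b, threshold=0.85):
    # two blocks are accepted as the same-name point when NCC exceeds the threshold
    return demeaned_ncc(patch_a, patch_b) > threshold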
(3) Spatial regularization
Because NCC is relatively simple, points that pass the NCC match may still be mismatched, or matching may fail altogether. Mismatching occurs because the epipolar search range is too large: in regions with repeated textures, wrong matches are very likely. Matching failure occurs because illumination affects imaging differently at different shooting angles; although de-meaned NCC weakens the influence of illumination compared with plain NCC, the effect is limited, so a large number of points cannot be matched and some depth maps may be too sparse.
Considering the application scenario of unmanned aerial vehicle aerial photography, the depth variation within a single image is not very large, so the depth change between adjacent points can be assumed to be very small, and the depth of neighbouring pixels can be used as the predicted depth of the pixel to be measured. Under this assumption, a spatial regularization term can be constructed for NCC; here the regularization may simply use a linear function of the depth distance:
ncc_norm = (k1·Δd + bias)·ncc  (1)

where Δd = |d − dm|, k1 is the slope of the linear function, bias is the bias of the linear function, and ncc is the value obtained by the de-meaned NCC calculation. Here d is the depth measured using the current pixel as the matching point, and dm is the predicted depth, which changes dynamically; it can be taken as the average of the depths of neighbouring pixels.
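A minimal sketch of equation (1), assuming the predicted depth is the mean of already-measured neighbouring depths as suggested above; the slope and bias values are placeholders, not values prescribed by the invention.

def regularized_ncc_by_depth(ncc, d, neighbour_depths, k1=-0.02, bias=1.4):
    # predicted depth d_m: average of the neighbouring pixels already measured
    d_m = sum(neighbour_depths) / len(neighbour_depths)
    delta_d = abs(d - d_m)              # Δd = |d - d_m|
    return (k1 * delta_d + bias) * ncc  # equation (1)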
The original formula needs to calculate the depth of the block matching point at each matching, and the calculation amount is very large, so that the relation between the pixel distance and the depth needs to be discussed next to avoid calculating the depth at each epipolar line search.
Referring to FIG. 2, FIG. 2 is a schematic view of epipolar lines in an imaging model;
Consider a new pixel point whose depth has not yet been measured. In FIG. 2, P is a space point, depth is the predicted depth, A and B are cameras A and B respectively, line segment AB is the displacement between the two cameras, E is a pixel point on the imaging plane, and F is the optimal pixel point on the epipolar line matched by NCC. The coordinates of A, B, P and E in FIG. 2 are therefore known; what needs to be discussed is the relationship between the length of EF and the projection of DP in the z direction (i.e. the depth). For this, triangle BDP must be solved. Take triangle BDP out separately and establish a coordinate system with P as the origin, PB as the y-axis and one pixel as the unit length.
FIG. 3 is a geometric schematic of the epipolar line after the coordinate system is established. Let the coordinates of B be (0, b) and the coordinates of E be (0, e). Let (m, n) be the unit vector of the epipolar direction, let the equation of the straight line AP be x = qy, and let the length of EF be l; then the coordinates of F are (lm, e + ln), where only l is an unknown quantity. The problem then translates into finding the length of DP.
a. The equation of the straight line BF is:

y = ((e + ln − b) / (lm))·x + b
b. Find the intersection point D of the straight lines AP and BF. Since the straight line AP is known, the length of DP can be represented by either one of the two coordinates of D; the vertical coordinate of D is:

y_D = b·l·m / (l·(m − nq) + q·(b − e))
q is the reciprocal of the slope of the straight line AP. If q were large, camera A would be very close to the ground, which contradicts the high-altitude unmanned aerial vehicle assumption; so in the case of high-altitude aerial photography q is small, and therefore (m − nq)·l is small. On the other hand, q cannot be arbitrarily small: the closer q is to 0, the smaller the distance between cameras A and B, until the cameras coincide, whereas triangulating the depth requires a certain baseline, otherwise the measured depth is unreliable. Furthermore, (b − e) equals the focal length of the camera, which is typically thousands of pixels, so q·(b − e) ≫ (m − nq)·l. The above expression can therefore be approximated as:

y_D ≈ b·l·m / (q·(b − e))
The analysis here concerns the depth z and the x direction of the image; the case of the y direction is similar, and combining the results of the two dimensions gives that z is proportional to l.
Therefore, the depth of a nearby pixel point can be used as the predicted depth of the pixel point to be measured, and the predicted depth is then used to obtain the predicted matching point. From the above derivation, the pixel distance Δdis between the matching point obtained in the NCC matching process and the predicted matching point is proportional to the difference Δdepth between the depth computed from that matching point and the predicted depth.
The pixel distance between the matching point and the expected matching point is exactly l, so equation (1) can be rewritten as:
ncc_norm = (k1·l + bias)·ncc  (2)
the distance of the depth can then be represented by simply calculating the pixel distance, avoiding the need to calculate the depth for each match.
The parameters k1 and bias are chosen according to the application scenario. For plains, the absolute values of bias and k1 can be slightly larger: for example, bias can take 1.4, which means the ncc of pixels whose depth is close to the predicted depth is amplified 1.4 times, and k1 can take −0.1, which means that when the matching point is 5 pixels away from the predicted matching point, the ncc is scaled to 0.9 times.
Other more complex regularization functions may also be selected to achieve better regularization.
Using formula (2), the optimal ncc_norm can be found quickly; the point that attains the optimal ncc_norm is taken as the matching point, and the depth of each pixel point is computed by triangulation. Depth unification and image fusion can then be performed.
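Putting the pieces together, a sketch of selecting the best match along the epipolar segment with equation (2); k1 = -0.1 and bias = 1.4 are the example values given above, and the NCC scores of the candidates are assumed to have been computed already.

import math

def best_epipolar_match(candidates, predicted_px, k1=-0.1, bias=1.4):
    # candidates: iterable of (u, v, ncc) for pixels sampled on the epipolar segment of picture B
    # predicted_px: matching point expected from the predicted depth of neighbouring pixels
    best_score, best_point = float("-inf"), None
    for u, v, ncc in candidates:
        l = math.hypot(u - predicted_px[0], v - predicted_px[1])  # pixel distance to prediction
        score = (k1 * l + bias) * ncc                             # equation (2)
        if score > best_score:
            best_score, best_point = score, (u, v)
    return best_point, best_score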
Example two
An embodiment of the present invention provides a depth map obtaining system, and fig. 4 is a schematic composition diagram of the depth map obtaining system, where the depth map obtaining system includes:
the picture obtaining unit is used for shooting the same scene at a plurality of positions by using the unmanned aerial vehicle to obtain a plurality of pictures;
the characteristic point matching unit is used for obtaining a plurality of matching pairs among different pictures and whether the pictures are overlapped or not through characteristic point matching;
the motion recovery calculation unit is used for carrying out motion recovery calculation to obtain camera external parameters and sparse space point clouds of all the pictures based on the matching result;
the multi-view solid geometry calculation unit is used for performing multi-view solid geometry calculation on all the pictures to obtain dense depth maps corresponding to all the pictures;
in the multi-view stereo geometric calculation process, pixel matching is carried out by using block matching, and space constraint is added for pixel point depth measurement by regularization processing of the block matching.
The system uses a linear function in formula (1) to perform regularization processing:
ncc_norm = (k1·Δd + bias)·ncc  (1)

where Δd = |d − dm|, k1 is the slope of the linear function, bias is the bias of the linear function, ncc is the value obtained by the de-meaned NCC calculation, d is the depth measured with the current pixel as the matching point, and dm is the predicted depth.
The system uses a linear function in an equation (2) to perform regularization processing:
ncc_norm = (k1·l + bias)·ncc  (2)

where k1 is the slope of the linear function, bias is the bias of the linear function, ncc is the value obtained by the de-meaned NCC calculation, and l is the pixel distance between the matching point and the expected matching point.
In this embodiment, the regularization processing method is the same as the regularization processing method in the depth map obtaining method.
EXAMPLE III
The third embodiment of the invention provides an unmanned aerial vehicle orthographic view generating method which is characterized by comprising the following steps:
shooting a scene from a plurality of angles by using an unmanned aerial vehicle to obtain a plurality of pictures;
preprocessing all pictures;
extracting feature points from the preprocessed pictures, matching any two pictures, determining a picture pair with an overlapping region, and obtaining a feature point matching result;
performing motion recovery calculation based on the feature point matching result to obtain camera parameters of each picture and coordinates of space points corresponding to the matching feature points;
based on the unmanned aerial vehicle camera parameters and the coordinates of the space points corresponding to the matched feature points, performing multi-view stereo geometric calculation to obtain dense depth maps of all the pictures; performing pixel matching by using epipolar line search and block matching in the multi-view solid geometry calculation process;
carrying out depth unified processing on the dense depth maps of all the pictures;
carrying out image fusion processing on the dense depth map subjected to the depth unification processing to generate an unmanned aerial vehicle orthographic map;
performing motion recovery calculation on all pictures to obtain camera parameters and sparse space point clouds of all the pictures;
performing multi-view stereo geometric calculation on all the pictures based on the camera parameters and the sparse space point clouds of all the pictures to obtain dense depth maps corresponding to all the pictures;
in the multi-view stereo geometric calculation process, a block matching method is used for matching pixels, and space constraint is added for pixel point depth measurement through regularization processing in the block matching.
In the process of generating the unmanned aerial vehicle orthographic map, the steps passed through are: preprocessing, point matching, SfM, MVS, depth unification and image fusion.
The preprocessing mainly comprises the steps of image distortion removal, image denoising and the like.
Point matching is divided into two kinds of methods: sparse point matching and dense point matching. The representative method of sparse point matching is feature point matching, which, depending on how the feature points are extracted, is divided into SIFT-based matching, SURF-based matching and so on. Dense point matching mainly refers to optical flow. The effect of sparse point matching is determined by the quality of the feature point matching; a good feature point extraction algorithm usually has good brightness, rotation and scale invariance. Dense point matching does not need to compute feature point descriptors, so the matching is fast, the sparsity of the result is controllable and the dependence on image texture is small; it works well on two images with small changes and is suitable for inter-frame matching of videos.
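For illustration, sparse feature point matching with an overlap test might look like the following; OpenCV's SIFT implementation and the ratio/threshold values are assumptions of this sketch, not requirements of the invention.

import cv2

def match_pictures(img_a, img_b, ratio=0.75, min_matches=30):
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(des_a, des_b, k=2)
    good = [m for m, n in knn if m.distance < ratio * n.distance]  # Lowe's ratio test
    overlaps = len(good) >= min_matches     # enough good matches -> the pictures overlap
    return good, overlaps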
Sfm (structure from motion) requires the use of matched pairs of points to estimate camera parameters and the coordinates of the spatial points corresponding to the matched points. Methods for SfM are divided into incremental SfM, global SfM and hybrid SfM.
The coordinates of a space point are determined from a pair of matching points (the imaging positions of the space point in the two cameras are the positions of the matching points in the respective images). From the coordinates of the space point and the gray values in the images, one point of the point cloud can be generated; the space points corresponding to all matching points form the sparse point cloud.
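A hedged sketch of recovering one space point from a matching pair, given the camera parameters estimated by SfM; the use of OpenCV's triangulatePoints is an implementation choice of this example, not specified by the patent.

import cv2
import numpy as np

def triangulate_pair(K, R1, t1, R2, t2, p1, p2):
    # 3x4 projection matrices of the two cameras
    P1 = K @ np.hstack([R1, t1.reshape(3, 1)])
    P2 = K @ np.hstack([R2, t2.reshape(3, 1)])
    X_h = cv2.triangulatePoints(P1, P2,
                                np.asarray(p1, dtype=np.float64).reshape(2, 1),
                                np.asarray(p2, dtype=np.float64).reshape(2, 1))
    return (X_h[:3] / X_h[3]).ravel()   # homogeneous -> 3D world coordinates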
MVS (Multi-view Stereo) requires reconstruction of a dense depth map for each picture using SfM estimated camera parameters and sparse spatial point clouds. This is also the stage to which the invention is primarily directed.
The step of generating the dense depth map is MVS, and in the process of MVS:
1. For a single picture A: find, one by one, the pictures that have an overlapping area with A. Assume that one such picture is B.
2. For a pair of pictures A and B, all pixels in A are traversed; when a pixel p is processed, the pixel in B matching p is searched for by epipolar line search + block matching.
Depth unification processes the dense depth map of each picture obtained by MVS so that the depths of points in the image overlapping areas are consistent.
Image fusion fuses the images using the RGB image of each picture under the guidance of the dense depth maps.
Example four
The fourth embodiment of the present invention provides a depth map obtaining apparatus, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the depth map obtaining method when executing the computer program.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory can be used for storing the computer program and/or the module, and the processor can realize various functions of the depth map obtaining device in the invention by operating or executing the data stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like. Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a smart memory card, a secure digital card, a flash memory card, at least one magnetic disk storage device, a flash memory device, or other volatile solid state storage device.
EXAMPLE five
An embodiment five of the present invention provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the steps of the depth map obtaining method are implemented.
The depth map obtaining means, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the methods of the embodiments of the present invention may also be implemented by a computer program stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the above method embodiments can be implemented. The computer program code may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory, a random access memory, an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in the jurisdiction.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A depth map obtaining method, characterized in that the method comprises:
shooting the same scene at a plurality of positions by using an unmanned aerial vehicle to obtain a plurality of pictures;
obtaining a plurality of matching pairs among different pictures and whether the pictures are overlapped or not through feature point matching;
based on the matching result, motion recovery calculation is carried out to obtain camera external parameters and sparse space point clouds of all the pictures;
performing multi-view stereo geometric calculation on all the pictures to obtain dense depth maps corresponding to all the pictures;
in the multi-view stereo geometric calculation process, matching among pixels is carried out by using block matching, and space constraint is added for pixel point depth measurement through regularization processing by the block matching.
2. The method according to claim 1, wherein a spatial regularization term is constructed during the block matching process, and a linear function with respect to depth distance is used for regularization.
3. The depth map obtaining method according to claim 2, wherein the method performs block matching using a normalized cross-correlation matching method.
4. The depth map obtaining method according to claim 3, wherein the method uses a linear function in equation (1) for regularization:
ncc_norm = (k1·Δd + bias)·ncc  (1)

where Δd = |d − dm|, k1 is the slope of the linear function, bias is the bias of the linear function, ncc is the value obtained by the de-meaned NCC calculation, d is the depth measured with the current pixel as the matching point, and dm is the predicted depth.
5. The depth map obtaining method according to claim 3, wherein the method uses a linear function in equation (2) for regularization:
ncc_norm = (k1·l + bias)·ncc  (2)

where k1 is the slope of the linear function, bias is the bias of the linear function, ncc is the value obtained by the de-meaned NCC calculation, and l is the pixel distance between the matching point and the expected matching point.
6. A depth map acquisition system, characterized in that the system comprises:
the picture obtaining unit is used for shooting the same scene at a plurality of positions by using the unmanned aerial vehicle to obtain a plurality of pictures;
the characteristic point matching unit is used for obtaining a plurality of matching pairs among different pictures and whether the pictures are overlapped or not through characteristic point matching;
the motion recovery calculation unit is used for carrying out motion recovery calculation to obtain camera external parameters and sparse space point clouds of all the pictures based on the matching result;
the multi-view solid geometry calculation unit is used for performing multi-view solid geometry calculation on all the pictures to obtain dense depth maps corresponding to all the pictures;
in the multi-view stereo geometric calculation process, pixel matching is carried out by using block matching, and space constraint is added for pixel point depth measurement by regularization processing of the block matching.
7. The depth map acquisition system of claim 6, wherein the system uses a linear function in equation (1) for regularization:
ncc_norm = (k1·Δd + bias)·ncc  (1)

where Δd = |d − dm|, k1 is the slope of the linear function, bias is the bias of the linear function, ncc is the value obtained by the de-meaned NCC calculation, d is the depth measured with the current pixel as the matching point, and dm is the predicted depth.
8. The depth map acquisition system of claim 6, wherein the system uses a linear function in equation (2) for regularization:
ncc_norm = (k1·l + bias)·ncc  (2)

where k1 is the slope of the linear function, bias is the bias of the linear function, ncc is the value obtained by the de-meaned NCC calculation, and l is the pixel distance between the matching point and the expected matching point.
9. An unmanned aerial vehicle orthographic mapping generation method is characterized by comprising the following steps:
shooting a scene from a plurality of angles by using an unmanned aerial vehicle to obtain a plurality of pictures;
preprocessing all pictures;
extracting feature points from the preprocessed pictures, matching any two pictures, determining a picture pair with an overlapping region, and obtaining a feature point matching result;
performing motion recovery calculation based on the feature point matching result to obtain camera parameters of each picture and coordinates of space points corresponding to the matching feature points;
based on the unmanned aerial vehicle camera parameters and the coordinates of the space points corresponding to the matched feature points, performing multi-view stereo geometric calculation to obtain dense depth maps of all the pictures; performing pixel matching by using epipolar line search and block matching in the multi-view solid geometry calculation process;
carrying out depth unified processing on the dense depth maps of all the pictures;
carrying out image fusion processing on the dense depth map subjected to the depth unification processing to generate an unmanned aerial vehicle orthographic map;
performing motion recovery calculation on all pictures to obtain camera parameters and sparse space point clouds of all the pictures;
performing multi-view stereo geometric calculation on all the pictures based on the camera parameters and the sparse space point clouds of all the pictures to obtain dense depth maps corresponding to all the pictures;
in the multi-view solid geometry calculation process, block matching adds space constraint for pixel point depth measurement through regularization processing.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when being executed by a processor, carries out the steps of the depth map obtaining method according to any one of claims 1 to 5.
CN202011462830.8A 2020-12-14 2020-12-14 Depth map obtaining method and system, unmanned aerial vehicle orthogram generating method and medium Active CN112509124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011462830.8A CN112509124B (en) 2020-12-14 2020-12-14 Depth map obtaining method and system, unmanned aerial vehicle orthogram generating method and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011462830.8A CN112509124B (en) 2020-12-14 2020-12-14 Depth map obtaining method and system, unmanned aerial vehicle orthogram generating method and medium

Publications (2)

Publication Number Publication Date
CN112509124A true CN112509124A (en) 2021-03-16
CN112509124B CN112509124B (en) 2023-09-22

Family

ID=74972557

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011462830.8A Active CN112509124B (en) 2020-12-14 2020-12-14 Depth map obtaining method and system, unmanned aerial vehicle orthogram generating method and medium

Country Status (1)

Country Link
CN (1) CN112509124B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723373A (en) * 2021-11-02 2021-11-30 深圳市勘察研究院有限公司 Unmanned aerial vehicle panoramic image-based illegal construction detection method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150254868A1 (en) * 2014-03-07 2015-09-10 Pelican Imaging Corporation System and methods for depth regularization and semiautomatic interactive matting using rgb-d images
CN109005398A (en) * 2018-07-27 2018-12-14 杭州电子科技大学 A kind of stereo image parallax matching process based on convolutional neural networks
CN110176032A (en) * 2019-04-28 2019-08-27 暗物智能科技(广州)有限公司 A kind of three-dimensional rebuilding method and device
CN110675317A (en) * 2019-09-10 2020-01-10 中国人民解放军国防科技大学 Super-resolution reconstruction method based on learning and adaptive trilateral filtering regularization
CN111357034A (en) * 2019-03-28 2020-06-30 深圳市大疆创新科技有限公司 Point cloud generation method, system and computer storage medium
CN111462329A (en) * 2020-03-24 2020-07-28 南京航空航天大学 Three-dimensional reconstruction method of unmanned aerial vehicle aerial image based on deep learning

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150254868A1 (en) * 2014-03-07 2015-09-10 Pelican Imaging Corporation System and methods for depth regularization and semiautomatic interactive matting using rgb-d images
CN109005398A (en) * 2018-07-27 2018-12-14 杭州电子科技大学 A kind of stereo image parallax matching process based on convolutional neural networks
CN111357034A (en) * 2019-03-28 2020-06-30 深圳市大疆创新科技有限公司 Point cloud generation method, system and computer storage medium
CN110176032A (en) * 2019-04-28 2019-08-27 暗物智能科技(广州)有限公司 A kind of three-dimensional rebuilding method and device
CN110675317A (en) * 2019-09-10 2020-01-10 中国人民解放军国防科技大学 Super-resolution reconstruction method based on learning and adaptive trilateral filtering regularization
CN111462329A (en) * 2020-03-24 2020-07-28 南京航空航天大学 Three-dimensional reconstruction method of unmanned aerial vehicle aerial image based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Yang: "Research on Multi-View Stereo Vision Methods Based on Block Matching and Deep Learning", China Master's Theses Full-text Database (Information Science and Technology), no. 04, pages 138-1064 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723373A (en) * 2021-11-02 2021-11-30 深圳市勘察研究院有限公司 Unmanned aerial vehicle panoramic image-based illegal construction detection method

Also Published As

Publication number Publication date
CN112509124B (en) 2023-09-22

Similar Documents

Publication Publication Date Title
Sun et al. Disp r-cnn: Stereo 3d object detection via shape prior guided instance disparity estimation
CN108961327B (en) Monocular depth estimation method and device, equipment and storage medium thereof
US11954813B2 (en) Three-dimensional scene constructing method, apparatus and system, and storage medium
Remondino et al. Dense image matching: Comparisons and analyses
CN111563921B (en) Underwater point cloud acquisition method based on binocular camera
CN110853075A (en) Visual tracking positioning method based on dense point cloud and synthetic view
CN111340922A (en) Positioning and mapping method and electronic equipment
Kuschk Large scale urban reconstruction from remote sensing imagery
US20160232705A1 (en) Method for 3D Scene Reconstruction with Cross-Constrained Line Matching
Yuan et al. 3D reconstruction of background and objects moving on ground plane viewed from a moving camera
CN110120012B (en) Video stitching method for synchronous key frame extraction based on binocular camera
CN112509124B (en) Depth map obtaining method and system, unmanned aerial vehicle orthogram generating method and medium
Rothermel et al. Fast and robust generation of semantic urban terrain models from UAV video streams
Tanner et al. DENSER cities: A system for dense efficient reconstructions of cities
KR102587298B1 (en) Real-time omnidirectional stereo matching method using multi-view fisheye lenses and system therefore
CN112146647B (en) Binocular vision positioning method and chip for ground texture
CN116189140A (en) Binocular vision-based vehicle three-dimensional target detection algorithm
CN112258635B (en) Three-dimensional reconstruction method and device based on improved binocular matching SAD algorithm
Fan et al. Collaborative three-dimensional completion of color and depth in a specified area with superpixels
Wong et al. 3D object model reconstruction from image sequence based on photometric consistency in volume space
CN110021041B (en) Unmanned scene incremental gridding structure reconstruction method based on binocular camera
US10430971B2 (en) Parallax calculating apparatus
Mustaniemi et al. Parallax correction via disparity estimation in a multi-aperture camera
Ye et al. Precise disparity estimation for narrow baseline stereo based on multiscale superpixels and phase correlation
Chang et al. Pixel based cost computation using weighted distance information for cross-scale stereo matching

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 610042 No. 270, floor 2, No. 8, Jinxiu street, Wuhou District, Chengdu, Sichuan

Applicant after: Chengdu shuzhilian Technology Co.,Ltd.

Address before: No.2, floor 4, building 1, Jule road crossing, Section 1, West 1st ring road, Wuhou District, Chengdu City, Sichuan Province 610041

Applicant before: CHENGDU SHUZHILIAN TECHNOLOGY Co.,Ltd.

CB02 Change of applicant information
GR01 Patent grant
GR01 Patent grant