CN110782524B - Indoor three-dimensional reconstruction method based on panoramic image

Indoor three-dimensional reconstruction method based on panoramic image

Info

Publication number
CN110782524B
CN110782524B
Authority
CN
China
Prior art keywords
image
coordinates
line segment
point
line segments
Prior art date
Legal status
Active
Application number
CN201911024676.3A
Other languages
Chinese (zh)
Other versions
CN110782524A (en)
Inventor
胡敏 (Hu Min)
李熠 (Li Yi)
黄宏程 (Huang Hongcheng)
Current Assignee
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN201911024676.3A
Publication of CN110782524A
Application granted
Publication of CN110782524B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00: Three-dimensional [3D] modelling, e.g. data description of 3D objects

Abstract

The invention relates to an indoor three-dimensional reconstruction method based on a panoramic image, and belongs to the technical field of three-dimensional reconstruction. The method comprises the following steps. S1, line segment detection: detect the contour line segments in the image. S2, line segment grouping: detect the vanishing points of the image, i.e., group the line segments into three sets corresponding to the three coordinate-axis directions of the Cartesian coordinate system of the scene. S3, intersection detection: construct an undirected graph, judge true and false intersections through constraint conditions, and determine the recurrence order of coordinates through a minimum spanning tree. S4, coordinate lifting: convert from the pixel coordinate system to the spherical coordinate system and then to the three-dimensional space coordinate system. By exploiting the wide field of view and easy acquisition of panoramic images, and by combining classical geometric methods with panoramic geometry and the Manhattan world constraint, the spatial geometric information of the scene can be acquired more accurately; meanwhile, because line segments are used as the computational primitives, the data volume is small and a three-dimensional line model of the scene can be obtained in a short time.

Description

Indoor three-dimensional reconstruction method based on panoramic image
Technical Field
The invention belongs to the technical field of three-dimensional reconstruction, and relates to an indoor three-dimensional reconstruction method based on a panoramic image.
Background
Three-dimensional reconstruction technology acquires the three-dimensional spatial information of an object or scene and establishes a corresponding three-dimensional model in a computer. Three-dimensional reconstruction is divided into manual modeling and automatic modeling. Manual modeling is performed with modeling software such as 3DS Max, Maya and Rhino; it achieves high precision but is time-consuming and labor-intensive. Automatic modeling combines software and hardware: the corresponding data are processed by a computer to automatically generate the three-dimensional model. Automatic modeling techniques are divided into active and passive measurement methods. An active measurement method transmits a controllable signal to the target object and calculates the depth of each point of the target scene or object from the transmitted and returned signals, so as to perform modeling and measurement; examples include laser ranging and structured light, with devices such as hand-held laser scanners and the Kinect depth camera. A passive measurement method acquires the three-dimensional structure of a target from images and is divided into single-view and multi-view reconstruction: single-view reconstruction computes the three-dimensional structure of the target by combining certain known information or constraint conditions, while multi-view reconstruction obtains a three-dimensional model of the object by constructing stereoscopic vision from images of the same object or scene taken from different viewpoints, similar to human binocular ranging.
Single-view reconstruction techniques include focal-length methods and image methods. The focal-length method is similar in principle to laser ranging: the focal position of the camera is adjusted to measure the distance between different positions in the scene and the camera, i.e., the depth of the measured point is calculated with the lens imaging formula.
Image methods are divided into data-driven methods and classical geometric methods. Data-driven methods mainly learn and extract features from an image by machine learning, semantically annotate pixels and pixel blocks, and reconstruct the scene by classifying pixel regions into floors, walls, ceilings and so on. For example, Yinda Zhang et al. understand a scene by machine learning: they recognize the objects in a panoramic image, determine the vanishing points using the Manhattan world model, acquire the spatial position of each object, and finally represent the objects by cuboids of different colors to realize scene understanding and reconstruction. Ruifeng Deng adopts a multi-scale CNN to extract plane normals and depth information from the image, and then realizes layout estimation of the indoor scene through a multi-channel FCN. Data-driven methods can segment the image well and thus reconstruct the scene, but after segmentation the reconstruction is carried out in the form of pixel blocks; the main purpose is to restore the rough outline of the scene, the result contains scene texture, and the precision is low.
Methods based on geometric constraints obtain the spatial information of a scene mainly by acquiring and analyzing line segments, planes and other information in the image: the image is analyzed, features are detected and extracted, and the detected line segments are classified, e.g., as concave/convex or occluding. The detected segments are then processed to obtain the geometry of the object or scene. For real scenes in which a single Manhattan model does not satisfy the multiple directions of the line segments, Julian Straub proposes a mixture-of-Manhattan framework, i.e., several Manhattan world models coexist in the same scene, with a rotation matrix representing the specific orientation of each model. For single-image reconstruction, Siddhant Ranade et al. estimate the vanishing points of line segments using linear constraints and the Manhattan world model, and lift the 2D line segments in the image to 3D, realizing reconstruction of the 3D line model of a building. After detecting the line segments of a panoramic image, Hao Yang and Hui Zhang perform super-pixel segmentation, divide the image into floor, wall and ceiling, and reconstruct the scene in the panoramic image. Jiu Xu et al. use object detection, 3D object pose estimation and plane normal estimation to estimate the position and orientation of the walls and objects in a panoramic image, enabling three-dimensional reconstruction from a single panoramic image.
Manhattan 3D line reconstruction makes good use of the geometric characteristics of man-made scenes and can accurately restore the geometric structure of a building, but it considers only the outline of the building and can reconstruct the three-dimensional model of only part of it. Indoors, because an ordinary image has a narrow viewing angle, an image of the whole scene cannot be acquired and the indoor scene can only be reconstructed partially; using a panoramic image to reconstruct an indoor scene therefore has a great advantage in information content. In addition, compared with super-pixel segmentation, calculating the depth information of the scene by back-projection with panoramic geometry achieves better reconstruction accuracy.
The method uses a single panoramic image to realize line-drawing reconstruction of an indoor scene: the contour lines of the indoor scene are detected, and the line segments are lifted from two-dimensional to three-dimensional coordinates, which requires conversion among pixel coordinates, spherical coordinates and three-dimensional space coordinates. When calculating the three-dimensional coordinates of the pixel points, the grouping of the line segments must first be confirmed; after the coordinates of the ground line segments are obtained, the three-dimensional coordinates of the line segments in the whole image are further solved through the intersections with the remaining segments.
Disclosure of Invention
In view of the above, the invention aims to provide an indoor three-dimensional reconstruction method based on a panoramic image, which restores a line-drawing three-dimensional space model of a scene from a panoramic image of the indoor scene and solves the coordinates through panoramic geometry, thereby ensuring the accuracy of the reconstructed spatial points. Meanwhile, the panoramic image can be captured with an ordinary panoramic camera, so the equipment cost is low. The reconstruction result is a three-dimensional model of the contour lines of the scene; during data processing only the detected contour lines in the image are handled, so compared with the dense point clouds of active vision or the feature-point matching of stereoscopic vision, the amount of computation is small and the running time is short.
In order to achieve the above purpose, the present invention provides the following technical solutions:
an indoor three-dimensional reconstruction method based on panoramic images comprises the following steps:
S1: line segment detection: detecting the contour line segments in the image;
S2: line segment grouping: detecting the vanishing points of the image, i.e., grouping the line segments into three sets corresponding to the three coordinate-axis directions of the Cartesian coordinate system of the scene;
S3: intersection detection: constructing an undirected graph, judging true and false intersections through constraint conditions, and determining the recurrence order of coordinates through a minimum spanning tree;
S4: coordinate lifting: realizing the conversion from the pixel coordinate system to the spherical coordinate system and then to the three-dimensional space coordinate system.
Optionally, the line segments in the panoramic image are obtained with the LSD algorithm, which locally analyzes the image to obtain sets of collinear pixels for subsequent data processing.
Optionally, when acquiring the line segments in the image, related information of each segment is acquired, including its start point, end point and angle; vanishing point directions are then acquired through the Hough transform, where the three vanishing directions are pairwise orthogonal and the three directions with the highest Hough voting scores are taken as the vanishing directions. When a line segment l in space is projected onto the spherical image, it appears as a great circle on the sphere; c is the great circle on which l lies and n is the unit normal vector of that great circle, while the vanishing direction corresponding to l is perpendicular to this plane normal vector. Each line segment corresponds to a unique great circle and has a unique great-circle normal vector n, and the direction of the segment is identified through n.
Optionally, S31: constructing the undirected graph
An undirected graph G = (V, E) is constructed to represent the line segments, and constraints are imposed to optimize the relations among the line segments in the image and the spatial structure of the scene;
S32: boundary line determination
The boundary line represents the edge of a plane, i.e., a line segment formed by the intersection of different planes, while a non-boundary line is a line segment within a plane; when the corresponding spatial information is derived, a non-boundary line lies in the plane and can only intersect line segments in the two directions that form the plane. The method of "Novel Single View Constraints for Manhattan 3D Line Reconstruction" is adopted to distinguish boundary lines from non-boundary lines: for each line segment Li, the set of segments intersecting it is determined as Li = {ln, lm, lk, ...}; if the set contains no segment in the normal direction of the plane on which Li lies, Li is a non-boundary line and Bi = 0, otherwise Bi = 1;
S33: plane segmentation
Coordinate derivation can be carried out only after each pixel point to be calculated is determined to be a ground point or a wall point, i.e., its plane attribute must be determined;
under the Manhattan world assumption, planes are classified into three types by their normal directions, namely planes in the x, y and z directions, where x, y and z denote the plane normals and also the coordinate-axis directions of the world coordinates; certain plane constraints hold at the same time;
closed-loop constraint: a plane is composed of two groups of parallel line segments; in the undirected graph, if a set of nodes forms a connected region containing a closed loop and the directions of the nodes in the region are the two directions that form a plane, the nodes are judged to belong to the same plane;
the line segments in the image are classified into three types, namely the segments grouped into the three vanishing directions, and they can likewise form planes in three directions; each line segment is therefore an element of two classes of planes, i.e., an x-direction segment can form both an xy plane and an xz plane; existing clustering algorithms separate all nodes completely, i.e., each node receives a unique class attribute, whereas in the image the connection between planes passes through another class of plane, so each plane must be clustered individually; the image regions are divided plane by plane;
S34: minimum spanning tree
In the graph G = (V, E), (i, j) ∈ E represents an intersection between line segments; a Boolean variable b_ij is created for each edge in the graph, with value 1 denoting a real intersection; the method of "Manhattan junction catalogue for spatial reasoning of indoor scenes" is adopted to classify the intersection categories, and the method of "Novel Single View Constraints for Manhattan 3D Line Reconstruction" is adopted to judge intersection authenticity; the two are combined in the optimization to determine the authenticity of the intersections in the image; the coordinates of the subsequently connected line segments are further derived from the real intersections.
Optionally, S4 specifically comprises:
S41: Panoramic geometry
Lifting from two-dimensional image coordinates to three-dimensional space coordinates realizes the conversion among the image coordinate system, the camera coordinate system and the world coordinate system; the conversion between a 2D image point p_i and the corresponding 3D point P_i is:
P_i = λ_i R⁻¹ K⁻¹ p_i
where λ_i is the distance from the target pixel point to the origin of coordinates, R is the rotation matrix between the camera and world coordinates, K is the camera intrinsic matrix, and d_i = R⁻¹ K⁻¹ p_i is the direction of the pixel point in the world coordinate system; assuming the camera parameters are known, the rotation matrix can be obtained by camera calibration; for depth information, geometry-based single-image reconstruction is usually computed through vanishing points, and after the segment directions are determined, the vanishing points are solved by least squares to obtain the depth information of the points and line segments in the image;
the panoramic image comprises a field of view of 360 ° in the horizontal direction and 180 ° in the vertical direction, the pixels of the image being w×h, where w=2h; in the panoramic image, pixels and angles are in one-to-one correspondence, and the correspondence is W/2 pi; the conversion formula of the image coordinates and the panoramic coordinates is as follows:
θ x =2πx/W
θ y =πy/H
After the direction attributes of the line segments and pixel blocks in the image are determined, the space of the scene can be reconstructed through the geometric characteristic of the panoramic image; the conditions are as follows:
(1) Objects and scenes in the image meet the Manhattan world constraint;
(2) The camera height of the shot is known;
In the panoramic image, a pixel coordinate p_i = (x, y) is converted to 3D space coordinates P_i = (X, Y, Z) by:
r = c_h |cot θ_y|
X = r cos θ_x
Z = r sin θ_x
where c_h is the height of the camera viewpoint above the ground and r is the horizontal distance of the target pixel point from the viewpoint in the world coordinate system (for ground points, Y = 0); once the camera height c_h is determined, the three-dimensional space coordinates of the corresponding pixel points in the image are calculated;
S42: scene reconstruction
The three-dimensional space coordinates of each line segment in the panoramic image are calculated using the panoramic geometry.
The invention has the following beneficial effects: by exploiting the wide field of view and easy acquisition of panoramic images, and by combining classical geometric methods with panoramic geometry and the Manhattan world constraint, the spatial geometric information of the scene can be acquired more accurately; meanwhile, because line segments are used as the computational primitives, the data volume is small and a three-dimensional line model of the scene can be obtained in a short time.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Drawings
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail below with reference to the accompanying drawings and preferred embodiments:
FIG. 1 is the overall framework diagram of the invention;
FIG. 2 shows the construction of the undirected graph;
FIG. 3 is a schematic diagram of false intersections;
FIG. 4 illustrates the panoramic geometry;
FIG. 5 is the reconstruction flow chart.
Detailed Description
Other advantages and effects of the present invention will become readily apparent to those skilled in the art from the following disclosure, which describes embodiments of the invention with reference to specific examples. The invention may also be practiced or carried out in other, different embodiments, and the details in this description may be modified or varied on the basis of different viewpoints and applications without departing from the spirit of the invention. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the invention schematically, and the following embodiments and the features in the embodiments may be combined with each other in the absence of conflict.
The drawings are for illustrative purposes only, are schematic rather than physical, and are not intended to limit the invention; to better illustrate the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the size of the actual product; it will be appreciated by those skilled in the art that certain well-known structures in the drawings, and their descriptions, may be omitted.
The same or similar reference numbers in the drawings of the embodiments correspond to the same or similar components. In the description of the invention, it should be understood that terms such as "upper", "lower", "left", "right", "front" and "rear" indicate orientations or positional relationships based on those shown in the drawings; they are used only for convenience of describing the invention and simplifying the description, and do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation. Such terms are therefore merely illustrative and should not be construed as limiting the invention, and their specific meaning can be understood by those of ordinary skill in the art according to the specific circumstances.
The algorithm comprises four modules:
1. Line segment detection: detect the contour line segments in the image;
2. Line segment grouping: detect the vanishing points of the image, i.e., group the line segments into three sets corresponding to the three coordinate-axis directions of the Cartesian coordinate system of the scene;
3. Intersection detection: construct an undirected graph, judge true and false intersections through constraint conditions, and determine the recurrence order of coordinates through a minimum spanning tree;
4. Coordinate lifting: convert from the pixel coordinate system to the spherical coordinate system and then to the three-dimensional space coordinate system.
An overall framework diagram of the algorithm is shown in FIG. 1.
Algorithm module function and algorithm flow:
1. line segment detection
The aim of the algorithm is to reconstruct a line-graph model of the indoor scene, i.e., a three-dimensional space model similar to a CAD drawing, so the object of image processing is the line segment information in the image. Line segment detection acquires the line segments in the panoramic image with the LSD algorithm, which locally analyzes the image to obtain sets of collinear pixels for subsequent data processing.
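To make this step concrete, the following is a minimal Python sketch of the detection stage, assuming an OpenCV build that ships the LSD detector (cv2.createLineSegmentDetector; it is absent from some OpenCV 4.x releases and restored in 4.5.1+). The function name detect_segments and the record layout are illustrative, not part of the patent.

```python
import cv2
import numpy as np

def detect_segments(gray):
    """Detect contour line segments; return start point, end point and angle."""
    lsd = cv2.createLineSegmentDetector()
    lines = lsd.detect(gray)[0]                 # shape (N, 1, 4): x1, y1, x2, y2
    segments = []
    for x1, y1, x2, y2 in lines.reshape(-1, 4):
        segments.append({
            "start": (float(x1), float(y1)),
            "end": (float(x2), float(y2)),
            "angle": float(np.arctan2(y2 - y1, x2 - x1)),  # image-plane angle
        })
    return segments

# usage: segments = detect_segments(cv2.imread("panorama.jpg", cv2.IMREAD_GRAYSCALE))
```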
2. Grouping line segments:
When the line segments in the image are acquired, related information of each segment is acquired, including its start point, end point and angle. Vanishing point directions are then obtained through the Hough transform: the three vanishing directions are pairwise orthogonal, and the three directions with the highest Hough voting scores are taken as the vanishing directions. The line segments in the unfolded panoramic image are curved, which is not conducive to acquiring their information. When a line segment l in space is projected onto the spherical image, it appears as a great circle on the sphere; c is the great circle on which l lies and n is the unit normal vector of that great circle, while the vanishing direction corresponding to l is perpendicular to this plane normal vector. Each line segment corresponds to a unique great circle and hence a unique great-circle normal vector n, and the direction of the segment is identified through n.
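As an illustration of this grouping, the following Python sketch computes the great-circle normal n of a segment from its two endpoints and assigns the segment to the vanishing direction d that best satisfies n · d = 0. The spherical convention (θ_x = 2πx/W as azimuth, θ_y = πy/H as vertical angle) follows the conversion formulas given later; the function names are illustrative, and the three vanishing directions vps are assumed to have been estimated already by the Hough vote described above.

```python
import numpy as np

def pixel_to_sphere(x, y, W, H):
    """Map an equirectangular pixel to a unit vector on the viewing sphere."""
    tx, ty = 2 * np.pi * x / W, np.pi * y / H
    return np.array([np.sin(ty) * np.cos(tx),
                     np.sin(ty) * np.sin(tx),
                     np.cos(ty)])

def great_circle_normal(seg, W, H):
    v1 = pixel_to_sphere(*seg["start"], W, H)
    v2 = pixel_to_sphere(*seg["end"], W, H)
    n = np.cross(v1, v2)                    # normal of the segment's great circle
    return n / np.linalg.norm(n)

def group_segment(seg, vps, W, H):
    """vps: three pairwise-orthogonal vanishing directions (unit vectors).
    A segment parallel to direction d satisfies n . d = 0."""
    n = great_circle_normal(seg, W, H)
    return int(np.argmin([abs(np.dot(n, d)) for d in vps]))
```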
3. Intersection judgment:
Camera imaging perspectively projects the color information of objects in the world coordinate system onto an image formed in the image coordinate system. Because perspective projection projects along straight lines, false intersections may be formed due to parallax. When the line segment coordinates are further derived later, false intersections would affect the reconstruction accuracy, so it is necessary to judge the authenticity of the intersections.
3.1 Construction of the undirected graph
To ensure the accuracy of the line-graph reconstruction, an undirected graph is built with the line segments as nodes and the intersections between them as edges, and optimization is performed on it. The undirected graph G = (V, E) represents the line segments, and constraints are imposed to optimize the relations among the segments in the image and the spatial structure of the scene. The construction is shown in FIG. 2 for the line segments l1 through l7 of an image: each node is a line segment l of the image, and the edges are the intersections between the segments.
Here v_i ∈ V is the node of the undirected graph corresponding to the i-th line segment in the image, and (i, j) ∈ E is an edge of the undirected graph, representing the connection relation between segments. Each v_i stores the parameters of the corresponding segment as a data node: d_i ∈ S = {x, y, z} denotes the direction of the i-th segment, and the Boolean variable B_i is 1 if the segment is a boundary segment and 0 otherwise. For an edge e_ij of the undirected graph, the Boolean variable b_ij describes the intersection between the i-th and j-th segments, with 1 denoting a true intersection and 0 a false one; the region variable E_ij = 0 indicates that the intersection is located in the lower half of the image; and C_ij ∈ {L, K, X} denotes the category of the intersection.
3.2 Boundary line judgment:
The boundary line represents the edge of a plane, i.e., a line segment formed by the intersection of different planes, while a non-boundary line is a segment within a plane; when the corresponding spatial information is derived, the spatial coordinates of the scene body are deduced mainly through the intersections of the boundary lines. A non-boundary line lies in a plane and can only intersect segments in the two directions that form the plane; for example, a segment in the xy plane can only intersect segments in the x or y direction and cannot intersect segments in the z direction. The method in [Novel Single View Constraints for Manhattan 3D Line Reconstruction] is used to distinguish boundary lines from non-boundary lines: for each segment Li, the set of segments intersecting it is determined as Li = {ln, lm, lk, ...}; if the set contains no segment in the normal direction of the plane on which Li lies, Li is a non-boundary line and Bi = 0, otherwise Bi = 1.
3.3 Plane segmentation
Coordinate derivation can be carried out only after each pixel point to be calculated is determined to be a ground point or a wall point, i.e., its plane attribute must be determined.
Under the Manhattan world assumption, planes are classified into three types by their normal directions, namely planes in the x, y and z directions, where x, y and z denote the plane normals and also the coordinate-axis directions of the world coordinates; certain plane constraints hold at the same time.
Closed-loop constraint: a plane should be composed of two groups of parallel line segments; for example, the set of segments composing an xy plane should contain only X = {l_x1, l_x2, l_x3, ...} and Y = {l_y1, l_y2, l_y3, ...}, and these two sets of directions should contain no z-direction segment. That is, in the undirected graph, if a set of nodes forms a connected region containing a closed loop and the directions of the nodes in the region are the two directions that form a plane, the nodes are judged to belong to the same plane.
The line segments in the image are classified into three types, namely the segments grouped into the three vanishing directions, and they can likewise form planes in three directions. Each line segment is thus an element of two classes of planes, i.e., an x-direction segment can form both an xy plane and an xz plane. Existing clustering algorithms separate all nodes completely, i.e., each node receives a unique class attribute, whereas in the image the connection between planes passes through another class of plane, so each plane must be clustered individually. The image regions, i.e., floor, walls, ceiling and so on, are divided plane by plane.
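The closed-loop constraint above can be checked directly on the undirected graph. The sketch below, assuming networkx and a per-node direction label, keeps only cycles whose nodes use exactly two of the three Manhattan directions; it illustrates the criterion, not the patent's exact clustering procedure.

```python
import networkx as nx

def find_planar_loops(G, direction):
    """direction: dict node -> 'x' | 'y' | 'z' (vanishing-direction label)."""
    planes = []
    for cycle in nx.cycle_basis(G):        # independent closed loops of the graph
        dirs = {direction[v] for v in cycle}
        if len(dirs) == 2:                 # only the two in-plane directions appear
            planes.append((cycle, dirs))
    return planes
```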
3.4 Minimum spanning tree:
In the line-graph reconstruction, the pixel points whose three-dimensional coordinates can be calculated directly through the panoramic geometry are those of the ground region, and the ground line segments are determined from the plane regions produced by the plane segmentation. The x and z coordinates of a ground segment are calculated through the panoramic geometry, and the y coordinate of a ground point is 0. The three-dimensional coordinates of the other segments are then obtained through their intersections with the ground segments; the coordinates of some segments can only be calculated through a chain of several intersections, and the errors of the three-dimensional space coordinates computed at the intersections accumulate as the number of derivation nodes grows, so it is necessary to judge the authenticity of the intersections.
In the graph G = (V, E), (i, j) ∈ E represents an intersection between line segments, and a Boolean variable b_ij is created for each edge in the graph, with value 1 denoting a real intersection. The method in [Manhattan junction catalogue for spatial reasoning of indoor scenes] is adopted to classify the intersection categories, and the method in [Novel Single View Constraints for Manhattan 3D Line Reconstruction] is adopted to judge intersection authenticity; the two are combined in the optimization to determine the authenticity of the intersections in the image. The coordinates of the subsequently connected line segments are further derived from the real intersections.
FIG. 3 gives an example. The rectangular image is the unfolded form of the panoramic image, with pixel coordinates (x, y); the spherical image is the projection of the panoramic image onto the sphere, where the coordinates of a pixel point are determined by θ_x and θ_y in the spherical coordinate system. The figure includes the demonstration of calculating the y coordinate, where c_h is the camera height and θ_y is the angle of the pixel point in the vertical direction in the spherical coordinate system, as well as the demonstration of calculating the horizontal coordinates x and z through r; once r is obtained, the two coordinate values are easily computed.
Intersections 1 and 2 are true intersections, while intersections 3 and 4 are false intersections caused by parallax. The false intersections in the image are formed by the parallax of line segments of different directions in the scene, such as intersections 3, 4 and 5 in the figure; owing to the projection relationship, the spatial distance between the segments is not reflected in the picture. When the coordinates of the segments are known, the authenticity of an intersection can be judged. For example: when intersections 2 and 6 are ground points with coordinates P2(x2, z2, 0) and P6(x6, z6, 0), and the xy coordinates of the vertical segments l1 and l5 are known, the coordinates of intersection 4 computed through intersection 2 by panoramic geometry are P4(x2 + Δx, z2, 0), while those computed through intersection 6 are P'4(x6, z6, Δy). The equality P4 = P'4 would require P2 = P6 and Δx = Δy = 0, which cannot be satisfied, so intersection 4 is false; in the same way, 3, 4 and 5 can all be judged to be false intersections.
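The consistency argument in this example can be stated as a small check: a candidate intersection is accepted as true only if its 3D coordinates, propagated along two different chains of known segments, coincide. This is a sketch of the idea; the tolerance value is an assumption.

```python
import numpy as np

def consistent_intersection(p_via_a, p_via_b, tol=1e-3):
    """p_via_a, p_via_b: 3D coordinates of the same image intersection,
    derived from two different known line segments (e.g. intersections 2 and 6)."""
    return np.linalg.norm(np.asarray(p_via_a) - np.asarray(p_via_b)) < tol
```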
On the other hand, the line segment group li (i = 2, 3, 4, 5, 6) forms a closed loop that appears as a plane in the image, where l2 and l6 are z-direction segments, l3 and l4 are x-direction segments (defining an xz plane), and l5 is a vertical-direction segment; these five segments cannot form a closed plane in space, so the intersections they form necessarily include false ones. Meanwhile, a plane has zero component along its normal direction, i.e., its constituent segments contain only the two in-plane directions; this is the constraint that holds within a planar closed loop.
Before processing the image, the only spatial scale information available is the height of the camera above the ground; the spatial information of the scene is deduced from this condition, but the information that can be deduced directly is the coordinate information of the room floor, while the spatial coordinates of the walls, objects and ceiling are deduced from their intersection lines with the floor. When coordinate information is deduced, the intersections are the media that connect different segments and planes, so correctly handling the intersections in the image greatly influences the accuracy of the reconstruction result.
Intersection classification: the category of an intersection is determined by the number of line segments connected at the same intersection.
The region variable indicates whether the intersection lies in the upper or lower half of the image and is used to screen the ground nodes; it is determined by the pixel coordinates of the intersection.
J denotes the number of line segments connected at the intersection, and P denotes the number of endpoints constituting the node:
[the category-assignment formula is given as an image in the original]
l, T the intersection point is an intersection point formed by two line segments, the L-shaped intersection point is an in-plane intersection point, the T-shaped intersection point is an in-plane intersection point or a false intersection point formed by shielding, and the Y-shaped intersection point is the representation of a convex/concave structure in space. After determining the authenticity of the intersection point, the number of line segments constituting the intersection point should be equal to the number of line segments included in the intersection point line segment under the manhattan world constraint, i.e., j=num (Li) =num (D Li ) The method comprises the steps of carrying out a first treatment on the surface of the Wherein D is Li E D represents the set of line segment directions in the set of Li line segments. Meanwhile, the types of the intersection points are different, and the intersection points are selected when the spatial deduction is performed by using different weights.
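The patent's category-assignment formula is available only as an image; the sketch below therefore encodes the usual Manhattan junction catalogue as an assumption consistent with the description: two incident segments give an L or T junction (T when the point is interior to one of them), and three give a Y junction.

```python
def classify_junction(j, p):
    """j: number of line segments connected at the intersection
    p: number of those segments that terminate (have an endpoint) at the point"""
    if j == 2:
        return "L" if p == 2 else "T"
    if j == 3:
        return "Y"
    return "X"   # higher-order crossings
```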
After each node and edge of the undirected graph G generated from the image has been judged, parameters such as the boundary lines and intersection categories are acquired, and a weighted undirected graph is generated from these parameters.
The weights are as follows:
[the weight formulas are given as images in the original]
ω_ij is the weight of the intersection category, and ω_Bi is the weight given according to whether the line segments connected at the intersection are boundary lines.
The minimum spanning tree is obtained from the weighted undirected graph, and the coordinate-derivation flow is determined by it, i.e., the order in which the coordinates of the remaining line segments are derived from the ground-segment coordinates is selected through the intersections.
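A sketch of this step with networkx, under the weighting described above (the numeric weight formulas are given only as images in the patent, so the weight values here are placeholders): the minimum spanning tree of the weighted intersection graph is computed and traversed breadth-first from a ground segment to fix the derivation order.

```python
import networkx as nx

def derivation_order(weighted_edges, ground_segment):
    """weighted_edges: iterable of (segment_i, segment_j, weight) intersections."""
    G = nx.Graph()
    G.add_weighted_edges_from(weighted_edges)
    mst = nx.minimum_spanning_tree(G)        # uses the 'weight' edge attribute
    # BFS from the ground segment gives the order in which 3D coordinates
    # are propagated from the floor to the remaining segments
    return list(nx.bfs_edges(mst, source=ground_segment))
```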
4. Coordinate lifting:
The purpose of coordinate lifting is to realize the inverse process of camera imaging: combining the characteristics of the panoramic image with the geometric characteristics of the indoor scene, the world coordinates of the pixel points are computed from their two-dimensional pixel coordinates in the image.
4.1 Panoramic geometry:
Lifting from two-dimensional image coordinates to three-dimensional space coordinates essentially realizes the conversion among image coordinates, camera coordinates and world coordinates. The conversion between a 2D image point p_i and the corresponding 3D point P_i is:
P_i = λ_i R⁻¹ K⁻¹ p_i
where λ_i is the distance from the target pixel point to the origin of coordinates, R is the rotation matrix between the camera and world coordinates, K is the camera intrinsic matrix, and d_i = R⁻¹ K⁻¹ p_i is the direction of the pixel point in the world coordinate system. Assuming the camera parameters are known, the rotation matrix can be acquired by camera calibration. For depth information, geometry-based single-image reconstruction usually computes through the vanishing points: after the segment directions are determined, the vanishing points are solved by least squares to obtain the depth information of the points and line segments in the image.
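A direct transcription of the formula above into code, as a sketch; K, R and λ_i are placeholders for calibrated values.

```python
import numpy as np

def back_project(p, K, R, lam):
    """Lift 2D pixel p = (x, y) to the 3D point P_i = lam * R^-1 K^-1 p_i."""
    p_h = np.array([p[0], p[1], 1.0])                  # homogeneous pixel coordinates
    d = np.linalg.inv(R) @ np.linalg.inv(K) @ p_h      # viewing direction d_i
    return lam * d
```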
The panoramic image covers a field of view of 360° in the horizontal direction and 180° in the vertical direction, and the image has W×H pixels, where W = 2H. Therefore, in the panoramic image, pixels and angles are in one-to-one correspondence with ratio W/(2π). The conversion between image coordinates and panoramic coordinates is:
θ_x = 2πx/W
θ_y = πy/H
After determining the direction attributes of the line segments and pixel blocks in the image, the space of the scene can be reconstructed from this geometric property of the panoramic image. The conditions are:
1. Objects and scenes in the image satisfy the Manhattan world constraint, e.g., man-made scenes, houses, streets and buildings;
2. The height of the shooting camera is known.
Under panoramic image, pixel coordinate p i = (x, y) conversion to 3D spatial coordinates P i The conversion formula of = (X, Y, Z) is: the panoramic geometry is shown in fig. 4.
r=c h |cotθ y |
Figure GDA0002284577320000111
Figure GDA0002284577320000112
c h The height of the view point from the ground is shot by the camera, and r is the horizontal distance from the view point of the target pixel point under the world coordinate system. When determining the height c of the camera h And then the three-dimensional space coordinates of the corresponding pixel points in the image can be accurately calculated.
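A sketch of the whole lifting chain for a ground pixel, using the formulas above. The r formula is the patent's; the split of r into X and Z by cos θ_x and sin θ_x is an assumed axis convention, since the patent's own X and Z formulas are given only as images.

```python
import numpy as np

def lift_ground_pixel(x, y, W, H, c_h):
    """Convert an equirectangular pixel (x, y) of a W x H panorama (W = 2H)
    to 3D world coordinates, assuming the pixel lies on the ground plane."""
    theta_x = 2 * np.pi * x / W                  # azimuth
    theta_y = np.pi * y / H                      # vertical angle
    r = c_h * abs(1.0 / np.tan(theta_y))         # r = c_h * |cot(theta_y)|
    X = r * np.cos(theta_x)                      # assumed axis convention
    Z = r * np.sin(theta_x)
    return np.array([X, 0.0, Z])                 # ground points have height 0

# e.g. lift_ground_pixel(1500, 900, 2048, 1024, c_h=1.6)
```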
4.2 scene reconstruction
Using the panoramic geometry, the three-dimensional space coordinates of each line segment in the panoramic image are calculated; the flow is shown in FIG. 5.
Finally, it is noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications and equivalent substitutions may be made to the technical solution without departing from its spirit and scope, all of which should be covered by the claims of the present invention.

Claims (3)

1. An indoor three-dimensional reconstruction method based on a panoramic image, characterized by comprising the following steps:
S1: line segment detection: detecting the contour line segments in the image;
S2: line segment grouping: detecting the vanishing points of the image, i.e., grouping the line segments into three sets corresponding to the three coordinate-axis directions of the Cartesian coordinate system of the scene;
S3: intersection detection: constructing an undirected graph, judging true and false intersections through constraint conditions, and determining the recurrence order of coordinates through a minimum spanning tree;
S4: coordinate lifting: realizing the conversion from the pixel coordinate system to the spherical coordinate system and then to the three-dimensional space coordinate system;
when acquiring the line segments in the image, related information of each segment is acquired, including its start point, end point and angle; vanishing point directions are then acquired through the Hough transform, where the three vanishing directions are pairwise orthogonal and the three directions with the highest Hough voting scores are taken as the vanishing directions; when a line segment l in space is projected onto the spherical image, it appears as a great circle on the sphere; c is the great circle on which l lies and n is the unit normal vector of that great circle, while the vanishing direction corresponding to l is perpendicular to this plane normal vector; each line segment corresponds to a unique great circle and has a unique great-circle normal vector n, and the direction of the segment is identified through n.
2. The panorama-based indoor three-dimensional reconstruction method according to claim 1, characterized in that S1 specifically comprises: acquiring the line segments in the panoramic image with the LSD algorithm, which locally analyzes the image to obtain sets of collinear pixels for subsequent data processing.
3. The panorama-based indoor three-dimensional reconstruction method according to claim 1, characterized in that S4 specifically comprises:
S41: panoramic geometry
Lifting from two-dimensional image coordinates to three-dimensional space coordinates realizes the conversion among the image coordinate system, the camera coordinate system and the world coordinate system; the conversion between a 2D image point p_i and the corresponding 3D point P_i is:
P_i = λ_i R⁻¹ K⁻¹ p_i
where λ_i is the distance from the target pixel point to the origin of coordinates, R is the rotation matrix between the camera and world coordinates, K is the camera intrinsic matrix, and d_i = R⁻¹ K⁻¹ p_i is the direction of the pixel point in the world coordinate system; assuming the camera parameters are known, the rotation matrix can be obtained by camera calibration; for depth information, geometry-based single-image reconstruction is usually computed through vanishing points, and after the segment directions are determined, the vanishing points are solved by least squares to obtain the depth information of the points and line segments in the image;
the panoramic image covers a field of view of 360° in the horizontal direction and 180° in the vertical direction, and the image has W×H pixels, where W = 2H; in the panoramic image, pixels and angles are in one-to-one correspondence with ratio W/(2π); the conversion between image coordinates and panoramic coordinates is:
θ_x = 2πx/W
θ_y = πy/H
after the direction attributes of the line segments and pixel blocks in the image are determined, the space of the scene can be reconstructed through the geometric characteristic of the panoramic image; the conditions are as follows:
(1) Objects and scenes in the image meet the Manhattan world constraint;
(2) The camera height of the shot is known;
in the panoramic image, a pixel coordinate p_i = (x, y) is converted to 3D space coordinates P_i = (X, Y, Z) by:
r = c_h |cot θ_y|
X = r cos θ_x
Z = r sin θ_x
where c_h is the height of the camera viewpoint above the ground and r is the horizontal distance of the target pixel point from the viewpoint in the world coordinate system; once the camera height c_h is determined, the three-dimensional space coordinates of the corresponding pixel points in the image are calculated;
S42: scene reconstruction
The three-dimensional space coordinates of each line segment in the panoramic image are calculated using the panoramic geometry.
CN201911024676.3A 2019-10-25 2019-10-25 Indoor three-dimensional reconstruction method based on panoramic image Active CN110782524B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911024676.3A CN110782524B (en) 2019-10-25 2019-10-25 Indoor three-dimensional reconstruction method based on panoramic image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911024676.3A CN110782524B (en) 2019-10-25 2019-10-25 Indoor three-dimensional reconstruction method based on panoramic image

Publications (2)

Publication Number Publication Date
CN110782524A CN110782524A (en) 2020-02-11
CN110782524B (en) 2023-05-23

Family

ID=69386740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911024676.3A Active CN110782524B (en) 2019-10-25 2019-10-25 Indoor three-dimensional reconstruction method based on panoramic image

Country Status (1)

Country Link
CN (1) CN110782524B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325662A (en) * 2020-02-21 2020-06-23 广州引力波信息科技有限公司 Method for generating 3D space house type model based on spherical projection panoramic image
CN111508058A (en) * 2020-02-24 2020-08-07 当家移动绿色互联网技术集团有限公司 Method and device for three-dimensional reconstruction of image, storage medium and electronic equipment
CN113409235B (en) * 2020-03-17 2023-08-22 杭州海康威视数字技术股份有限公司 Vanishing point estimation method and apparatus
CN111784826A (en) * 2020-07-14 2020-10-16 深圳移动互联研究院有限公司 Method and system for generating three-dimensional structure schematic diagram based on panoramic image
CN111986322B (en) * 2020-07-21 2023-12-22 西安理工大学 Point cloud indoor scene layout reconstruction method based on structural analysis
CN112634460B (en) * 2020-11-27 2023-10-24 浙江工商大学 Outdoor panorama generation method and device based on Haar-like features
CN112802120B (en) * 2021-01-13 2024-02-27 福州视驰科技有限公司 Camera external parameter calibration method based on non-uniform segmentation accumulation table and orthogonal blanking points
CN112861024B (en) * 2021-02-03 2023-08-01 北京百度网讯科技有限公司 Method and device for determining road network matrix, electronic equipment and storage medium
CN112802193B (en) * 2021-03-11 2023-02-28 重庆邮电大学 CT image three-dimensional reconstruction method based on MC-T algorithm
CN113486223B (en) * 2021-06-07 2022-09-09 海南太美航空股份有限公司 Air route display method and system and electronic equipment
US11625860B1 (en) * 2021-09-07 2023-04-11 Hong Kong Applied Science And Technology Research Institute Co., Ltd. Camera calibration method
CN113989376B (en) * 2021-12-23 2022-04-26 贝壳技术有限公司 Method and device for acquiring indoor depth information and readable storage medium
CN114663618B (en) * 2022-03-03 2022-11-29 北京城市网邻信息技术有限公司 Three-dimensional reconstruction and correction method, device, equipment and storage medium
CN115237159B (en) * 2022-09-21 2023-09-15 国网江苏省电力有限公司苏州供电分公司 Wire inspection method adopting unmanned aerial vehicle
CN116563474B (en) * 2023-07-05 2023-09-19 有方(合肥)医疗科技有限公司 Oral cavity panorama generating method and device
CN117173370B (en) * 2023-11-03 2024-03-01 北京飞渡科技股份有限公司 Method for maintaining object boundary in light weight process

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105115560A (en) * 2015-09-16 2015-12-02 北京理工大学 Non-contact measurement method for cabin capacity
CN106327532A (en) * 2016-08-31 2017-01-11 北京天睿空间科技股份有限公司 Three-dimensional registering method for single image
CN107292956A (en) * 2017-07-12 2017-10-24 杭州电子科技大学 A kind of scene reconstruction method assumed based on Manhattan
CN108280858A (en) * 2018-01-29 2018-07-13 重庆邮电大学 A kind of linear global camera motion method for parameter estimation in multiple view reconstruction
US10026218B1 (en) * 2017-11-01 2018-07-17 Pencil and Pixel, Inc. Modeling indoor scenes based on digital images

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09331440A (en) * 1996-06-13 1997-12-22 Gen Tec:Kk Three-dimension scene re-configuration system
JPH1166302A (en) * 1997-08-26 1999-03-09 Matsushita Electric Works Ltd Straight line detecting method
US8630446B2 (en) * 2011-03-31 2014-01-14 Mitsubishi Electronic Research Laboratories, Inc Method and system for determining projections in non-central catadioptric optical systems
CN102679959B (en) * 2012-05-03 2014-01-29 浙江工业大学 Omnibearing 3D (Three-Dimensional) modeling system based on initiative omnidirectional vision sensor
US9224205B2 (en) * 2012-06-14 2015-12-29 Qualcomm Incorporated Accelerated geometric shape detection and accurate pose tracking
US9269187B2 (en) * 2013-03-20 2016-02-23 Siemens Product Lifecycle Management Software Inc. Image-based 3D panorama
US9595134B2 (en) * 2013-05-11 2017-03-14 Mitsubishi Electric Research Laboratories, Inc. Method for reconstructing 3D scenes from 2D images
US9183635B2 (en) * 2013-05-20 2015-11-10 Mitsubishi Electric Research Laboratories, Inc. Method for reconstructing 3D lines from 2D lines in an image
US9900505B2 (en) * 2014-07-23 2018-02-20 Disney Enterprises, Inc. Panoramic video from unstructured camera arrays with globally consistent parallax removal
US9805138B2 (en) * 2015-02-06 2017-10-31 Xerox Corporation Efficient calculation of all-pair path-based distance measures
US20160232705A1 (en) * 2015-02-10 2016-08-11 Mitsubishi Electric Research Laboratories, Inc. Method for 3D Scene Reconstruction with Cross-Constrained Line Matching
CN108229424A (en) * 2018-01-26 2018-06-29 西安工程大学 A kind of augmented reality system object recognition algorithm based on Hough ballot
CN109443359B (en) * 2018-09-27 2020-08-14 北京空间机电研究所 Geographical positioning method of ground panoramic image

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105115560A (en) * 2015-09-16 2015-12-02 北京理工大学 Non-contact measurement method for cabin capacity
CN106327532A (en) * 2016-08-31 2017-01-11 北京天睿空间科技股份有限公司 Three-dimensional registering method for single image
CN107292956A (en) * 2017-07-12 2017-10-24 杭州电子科技大学 A kind of scene reconstruction method assumed based on Manhattan
US10026218B1 (en) * 2017-11-01 2018-07-17 Pencil and Pixel, Inc. Modeling indoor scenes based on digital images
CN108280858A (en) * 2018-01-29 2018-07-13 重庆邮电大学 A kind of linear global camera motion method for parameter estimation in multiple view reconstruction

Also Published As

Publication number Publication date
CN110782524A (en) 2020-02-11


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant