CN110782524B - Indoor three-dimensional reconstruction method based on panoramic image

Indoor three-dimensional reconstruction method based on panoramic image

Info

Publication number
CN110782524B
CN110782524B
Authority
CN
China
Prior art keywords
image
coordinates
line segment
point
line segments
Prior art date
Legal status
Active
Application number
CN201911024676.3A
Other languages
Chinese (zh)
Other versions
CN110782524A (en)
Inventor
胡敏 (Hu Min)
李熠 (Li Yi)
黄宏程 (Huang Hongcheng)
Current Assignee
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN201911024676.3A
Publication of CN110782524A
Application granted
Publication of CN110782524B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00: Three-dimensional [3D] modelling, e.g. data description of 3D objects

Abstract

The invention relates to an indoor three-dimensional reconstruction method based on a panoramic image, and belongs to the technical field of three-dimensional reconstruction. The method comprises the following steps. S1, line segment detection: detect the contour line segments in the image. S2, line segment grouping: detect the vanishing points of the image, i.e., group the line segments into three sets corresponding to the three coordinate-axis directions of the Cartesian coordinate system of the scene. S3, intersection detection: construct an undirected graph, judge true and false intersections through constraint conditions, and determine the recurrence order of coordinates through a minimum spanning tree. S4, coordinate lifting: convert from the pixel coordinate system to the spherical coordinate system and then to the three-dimensional space coordinate system. By exploiting the wide field of view and easy acquisition of panoramic images, and by combining classical geometric methods with panoramic geometry and the Manhattan world constraint, the spatial geometric information of the scene can be acquired more accurately; meanwhile, because line segments are used as the computational primitives, the data volume is small and a three-dimensional line model of the scene can be obtained in a short time.

Description

Indoor three-dimensional reconstruction method based on panoramic image
Technical Field
The invention belongs to the technical field of three-dimensional reconstruction, and relates to an indoor three-dimensional reconstruction method based on a panoramic image.
Background
Three-dimensional reconstruction technology acquires the three-dimensional spatial information of an object or scene and establishes a corresponding three-dimensional model in a computer. Three-dimensional reconstruction is divided into manual modeling and automatic modeling. Manual modeling is performed with modeling software such as 3DS Max, Maya and Rhino; it achieves high precision but is time-consuming and labor-intensive. Automatic modeling combines software and hardware: the corresponding data are processed by a computer to automatically generate the three-dimensional model. Automatic modeling techniques are divided into active and passive measurement methods. An active measurement method transmits a controllable signal to the target object and calculates the depth of each point of the target scene or object from the transmitted and returned signals, so as to perform modeling and measurement; examples include laser ranging and structured light, with devices such as hand-held laser scanners and the Kinect depth camera. A passive measurement method acquires the three-dimensional structure of a target from images and is divided into single-view and multi-view reconstruction: single-view reconstruction computes the three-dimensional structure of the target by combining certain known information or constraint conditions, while multi-view reconstruction obtains a three-dimensional model of the object by constructing stereoscopic vision from images of the same object or scene taken from different viewpoints, similar to human binocular ranging.
Single-view reconstruction techniques include focal-length methods and image methods. The focal-length method is similar in principle to laser ranging: the focal position of the camera is adjusted to measure the distance between different positions in the scene and the camera, i.e., the depth of the measured point is calculated with the lens imaging formula.
Image methods are divided into data-driven methods and classical geometric methods. Data-driven methods mainly learn and extract features from an image by machine learning, semantically annotate pixels and pixel blocks, and reconstruct the scene by classifying pixel regions into floors, walls, ceilings and so on. For example, Yinda Zhang et al. understand a scene by machine learning: they recognize the objects in a panoramic image, determine the vanishing points using the Manhattan world model, acquire the spatial position of each object, and finally represent the objects by cuboids of different colors to realize scene understanding and reconstruction. Ruifeng Deng adopts a multi-scale CNN to extract plane normals and depth information from the image, and then realizes layout estimation of the indoor scene through a multi-channel FCN. Data-driven methods can segment the image well and thus reconstruct the scene, but after segmentation the reconstruction is carried out in the form of pixel blocks; the main purpose is to restore the rough outline of the scene, the result contains scene texture, and the precision is low.
Methods based on geometric constraints obtain the spatial information of a scene mainly by acquiring and analyzing line segments, planes and other information in the image: the image is analyzed, features are detected and extracted, and the detected line segments are classified, e.g., as concave/convex or occluding. The detected segments are then processed to obtain the geometry of the object or scene. For real scenes in which a single Manhattan model does not satisfy the multiple directions of the line segments, Julian Straub proposes a mixture-of-Manhattan framework, i.e., several Manhattan world models coexist in the same scene, with a rotation matrix representing the specific orientation of each model. For single-image reconstruction, Siddhant Ranade et al. estimate the vanishing points of line segments using linear constraints and the Manhattan world model, and lift the 2D line segments in the image to 3D, realizing reconstruction of the 3D line model of a building. After detecting the line segments of a panoramic image, Hao Yang and Hui Zhang perform super-pixel segmentation, divide the image into floor, wall and ceiling, and reconstruct the scene in the panoramic image. Jiu Xu et al. use object detection, 3D object pose estimation and plane normal estimation to estimate the position and orientation of the walls and objects in a panoramic image, enabling three-dimensional reconstruction from a single panoramic image.
Manhattan 3D line reconstruction makes good use of the geometric characteristics of man-made scenes and can accurately restore the geometric structure of a building, but it considers only the outline of the building and can reconstruct the three-dimensional model of only part of it. Indoors, because an ordinary image has a narrow viewing angle, an image of the whole scene cannot be acquired and the indoor scene can only be reconstructed partially; using a panoramic image to reconstruct an indoor scene therefore has a great advantage in information content. In addition, compared with super-pixel segmentation, calculating the depth information of the scene by back-projection with panoramic geometry achieves better reconstruction accuracy.
The method uses a single panoramic image to realize line-drawing reconstruction of an indoor scene: the contour lines of the indoor scene are detected, and the line segments are lifted from two-dimensional to three-dimensional coordinates, which requires conversion among pixel coordinates, spherical coordinates and three-dimensional space coordinates. When calculating the three-dimensional coordinates of the pixel points, the grouping of the line segments must first be confirmed; after the coordinates of the ground line segments are obtained, the three-dimensional coordinates of the line segments in the whole image are further solved through the intersections with the remaining segments.
Disclosure of Invention
In view of the above, the invention aims to provide an indoor three-dimensional reconstruction method based on a panoramic image, which restores a line-drawing three-dimensional space model of a scene from a panoramic image of the indoor scene and solves the coordinates through panoramic geometry, thereby ensuring the accuracy of the reconstructed spatial points. Meanwhile, the panoramic image can be captured with an ordinary panoramic camera, so the equipment cost is low. The reconstruction result is a three-dimensional model of the contour lines of the scene; during data processing only the detected contour lines in the image are handled, so compared with the dense point clouds of active vision or the feature-point matching of stereoscopic vision, the amount of computation is small and the running time is short.
In order to achieve the above purpose, the present invention provides the following technical solutions:
an indoor three-dimensional reconstruction method based on panoramic images comprises the following steps:
S1: line segment detection: detecting the contour line segments in the image;
S2: line segment grouping: detecting the vanishing points of the image, i.e., grouping the line segments into three sets corresponding to the three coordinate-axis directions of the Cartesian coordinate system of the scene;
S3: intersection detection: constructing an undirected graph, judging true and false intersections through constraint conditions, and determining the recurrence order of coordinates through a minimum spanning tree;
S4: coordinate lifting: realizing the conversion from the pixel coordinate system to the spherical coordinate system and then to the three-dimensional space coordinate system.
Optionally, the line segments in the panoramic image are obtained with the LSD algorithm, which locally analyzes the image to obtain sets of collinear pixels for subsequent data processing.
Optionally, when acquiring the line segments in the image, related information of each segment is acquired, including its start point, end point and angle; vanishing point directions are then acquired through the Hough transform, where the three vanishing directions are pairwise orthogonal and the three directions with the highest Hough voting scores are taken as the vanishing directions. When a line segment l in space is projected onto the spherical image, it appears as a great circle on the sphere; c is the great circle on which l lies and n is the unit normal vector of that great circle, while the vanishing direction corresponding to l is perpendicular to this plane normal vector. Each line segment corresponds to a unique great circle and has a unique great-circle normal vector n, and the direction of the segment is identified through n.
Optionally, S31: constructing the undirected graph
An undirected graph G = (V, E) is constructed to represent the line segments, and constraints are imposed to optimize the relations among the line segments in the image and the spatial structure of the scene;
S32: boundary line determination
The boundary line represents the edge of a plane, i.e., a line segment formed by the intersection of different planes, while a non-boundary line is a line segment within a plane; when the corresponding spatial information is derived, a non-boundary line lies in the plane and can only intersect line segments in the two directions that form the plane. The method of "Novel Single View Constraints for Manhattan 3D Line Reconstruction" is adopted to distinguish boundary lines from non-boundary lines: for each line segment Li, the set of segments intersecting it is determined as Li = {ln, lm, lk, ...}; if the set contains no segment in the normal direction of the plane on which Li lies, Li is a non-boundary line and Bi = 0, otherwise Bi = 1;
S33: plane segmentation
Coordinate derivation can be carried out only after each pixel point to be calculated is determined to be a ground point or a wall point, i.e., its plane attribute must be determined;
under the Manhattan world assumption, planes are classified into three types by their normal directions, namely planes in the x, y and z directions, where x, y and z denote the plane normals and also the coordinate-axis directions of the world coordinates; certain plane constraints hold at the same time;
closed-loop constraint: a plane is composed of two groups of parallel line segments; in the undirected graph, if a set of nodes forms a connected region containing a closed loop and the directions of the nodes in the region are the two directions that form a plane, the nodes are judged to belong to the same plane;
the line segments in the image are classified into three types, namely the segments grouped into the three vanishing directions, and they can likewise form planes in three directions; each line segment is therefore an element of two classes of planes, i.e., an x-direction segment can form both an xy plane and an xz plane; existing clustering algorithms separate all nodes completely, i.e., each node receives a unique class attribute, whereas in the image the connection between planes passes through another class of plane, so each plane must be clustered individually; the image regions are divided plane by plane;
S34: minimum spanning tree
In the graph G = (V, E), (i, j) ∈ E represents an intersection between line segments; a Boolean variable b_ij is created for each edge in the graph, with value 1 denoting a real intersection; the method of "Manhattan junction catalogue for spatial reasoning of indoor scenes" is adopted to classify the intersection categories, and the method of "Novel Single View Constraints for Manhattan 3D Line Reconstruction" is adopted to judge intersection authenticity; the two are combined in the optimization to determine the authenticity of the intersections in the image; the coordinates of the subsequently connected line segments are further derived from the real intersections.
Optionally, S4 specifically comprises:
S41: Panoramic geometry
Lifting from two-dimensional image coordinates to three-dimensional space coordinates realizes the conversion among the image coordinate system, the camera coordinate system and the world coordinate system; the conversion between a 2D image point p_i and the corresponding 3D point P_i is:
P_i = λ_i R⁻¹ K⁻¹ p_i
where λ_i is the distance from the target pixel point to the origin of coordinates, R is the rotation matrix between the camera and world coordinates, K is the camera intrinsic matrix, and d_i = R⁻¹ K⁻¹ p_i is the direction of the pixel point in the world coordinate system; assuming the camera parameters are known, the rotation matrix can be obtained by camera calibration; for depth information, geometry-based single-image reconstruction is usually computed through vanishing points, and after the segment directions are determined, the vanishing points are solved by least squares to obtain the depth information of the points and line segments in the image;
the panoramic image comprises a field of view of 360 ° in the horizontal direction and 180 ° in the vertical direction, the pixels of the image being w×h, where w=2h; in the panoramic image, pixels and angles are in one-to-one correspondence, and the correspondence is W/2 pi; the conversion formula of the image coordinates and the panoramic coordinates is as follows:
θ x =2πx/W
θ y =πy/H
After the direction attributes of the line segments and pixel blocks in the image are determined, the space of the scene can be reconstructed through the geometric characteristic of the panoramic image; the conditions are as follows:
(1) Objects and scenes in the image meet the Manhattan world constraint;
(2) The camera height of the shot is known;
In the panoramic image, a pixel coordinate p_i = (x, y) is converted to 3D space coordinates P_i = (X, Y, Z) by:
r = c_h |cot θ_y|
X = r cos θ_x
Z = r sin θ_x
where c_h is the height of the camera viewpoint above the ground and r is the horizontal distance of the target pixel point from the viewpoint in the world coordinate system (for ground points, Y = 0); once the camera height c_h is determined, the three-dimensional space coordinates of the corresponding pixel points in the image are calculated;
S42: scene reconstruction
The three-dimensional space coordinates of each line segment in the panoramic image are calculated using the panoramic geometry.
The invention has the following beneficial effects: by exploiting the wide field of view and easy acquisition of panoramic images, and by combining classical geometric methods with panoramic geometry and the Manhattan world constraint, the spatial geometric information of the scene can be acquired more accurately; meanwhile, because line segments are used as the computational primitives, the data volume is small and a three-dimensional line model of the scene can be obtained in a short time.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Drawings
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail below with reference to the accompanying drawings and preferred embodiments:
FIG. 1 is the overall framework diagram of the invention;
FIG. 2 shows the construction of the undirected graph;
FIG. 3 is a schematic diagram of false intersections;
FIG. 4 illustrates the panoramic geometry;
FIG. 5 is the reconstruction flow chart.
Detailed Description
Other advantages and effects of the present invention will become readily apparent to those skilled in the art from the following disclosure, which describes embodiments of the invention with reference to specific examples. The invention may also be practiced or carried out in other, different embodiments, and the details in this description may be modified or varied on the basis of different viewpoints and applications without departing from the spirit of the invention. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the invention schematically, and the following embodiments and the features in the embodiments may be combined with each other in the absence of conflict.
The drawings are for illustrative purposes only, are schematic rather than physical, and are not intended to limit the invention; to better illustrate the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the size of the actual product; it will be appreciated by those skilled in the art that certain well-known structures in the drawings, and their descriptions, may be omitted.
The same or similar reference numbers in the drawings of the embodiments correspond to the same or similar components. In the description of the invention, it should be understood that terms such as "upper", "lower", "left", "right", "front" and "rear" indicate orientations or positional relationships based on those shown in the drawings; they are used only for convenience of describing the invention and simplifying the description, and do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation. Such terms are therefore merely illustrative and should not be construed as limiting the invention, and their specific meaning can be understood by those of ordinary skill in the art according to the specific circumstances.
The algorithm comprises four modules:
1. Line segment detection: detect the contour line segments in the image;
2. Line segment grouping: detect the vanishing points of the image, i.e., group the line segments into three sets corresponding to the three coordinate-axis directions of the Cartesian coordinate system of the scene;
3. Intersection detection: construct an undirected graph, judge true and false intersections through constraint conditions, and determine the recurrence order of coordinates through a minimum spanning tree;
4. Coordinate lifting: convert from the pixel coordinate system to the spherical coordinate system and then to the three-dimensional space coordinate system.
An overall framework diagram of the algorithm is shown in FIG. 1.
Algorithm module function and algorithm flow:
1. line segment detection
The aim of the algorithm is to reconstruct a line-graph model of the indoor scene, i.e., a three-dimensional space model similar to a CAD drawing, so the object of image processing is the line segment information in the image. Line segment detection acquires the line segments in the panoramic image with the LSD algorithm, which locally analyzes the image to obtain sets of collinear pixels for subsequent data processing.
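To make this step concrete, the following is a minimal Python sketch of the detection stage, assuming an OpenCV build that ships the LSD detector (cv2.createLineSegmentDetector; it is absent from some OpenCV 4.x releases and restored in 4.5.1+). The function name detect_segments and the record layout are illustrative, not part of the patent.

```python
import cv2
import numpy as np

def detect_segments(gray):
    """Detect contour line segments; return start point, end point and angle."""
    lsd = cv2.createLineSegmentDetector()
    lines = lsd.detect(gray)[0]                 # shape (N, 1, 4): x1, y1, x2, y2
    segments = []
    for x1, y1, x2, y2 in lines.reshape(-1, 4):
        segments.append({
            "start": (float(x1), float(y1)),
            "end": (float(x2), float(y2)),
            "angle": float(np.arctan2(y2 - y1, x2 - x1)),  # image-plane angle
        })
    return segments

# usage: segments = detect_segments(cv2.imread("panorama.jpg", cv2.IMREAD_GRAYSCALE))
```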
2. Grouping line segments:
When the line segments in the image are acquired, related information of each segment is acquired, including its start point, end point and angle. Vanishing point directions are then obtained through the Hough transform: the three vanishing directions are pairwise orthogonal, and the three directions with the highest Hough voting scores are taken as the vanishing directions. The line segments in the unfolded panoramic image are curved, which is not conducive to acquiring their information. When a line segment l in space is projected onto the spherical image, it appears as a great circle on the sphere; c is the great circle on which l lies and n is the unit normal vector of that great circle, while the vanishing direction corresponding to l is perpendicular to this plane normal vector. Each line segment corresponds to a unique great circle and hence a unique great-circle normal vector n, and the direction of the segment is identified through n.
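As an illustration of this grouping, the following Python sketch computes the great-circle normal n of a segment from its two endpoints and assigns the segment to the vanishing direction d that best satisfies n · d = 0. The spherical convention (θ_x = 2πx/W as azimuth, θ_y = πy/H as vertical angle) follows the conversion formulas given later; the function names are illustrative, and the three vanishing directions vps are assumed to have been estimated already by the Hough vote described above.

```python
import numpy as np

def pixel_to_sphere(x, y, W, H):
    """Map an equirectangular pixel to a unit vector on the viewing sphere."""
    tx, ty = 2 * np.pi * x / W, np.pi * y / H
    return np.array([np.sin(ty) * np.cos(tx),
                     np.sin(ty) * np.sin(tx),
                     np.cos(ty)])

def great_circle_normal(seg, W, H):
    v1 = pixel_to_sphere(*seg["start"], W, H)
    v2 = pixel_to_sphere(*seg["end"], W, H)
    n = np.cross(v1, v2)                    # normal of the segment's great circle
    return n / np.linalg.norm(n)

def group_segment(seg, vps, W, H):
    """vps: three pairwise-orthogonal vanishing directions (unit vectors).
    A segment parallel to direction d satisfies n . d = 0."""
    n = great_circle_normal(seg, W, H)
    return int(np.argmin([abs(np.dot(n, d)) for d in vps]))
```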
3. Intersection judgment:
Camera imaging perspectively projects the color information of objects in the world coordinate system onto an image formed in the image coordinate system. Because perspective projection projects along straight lines, false intersections may be formed due to parallax. When the line segment coordinates are further derived later, false intersections would affect the reconstruction accuracy, so it is necessary to judge the authenticity of the intersections.
3.1 Construction of the undirected graph
To ensure the accuracy of the line-graph reconstruction, an undirected graph is built with the line segments as nodes and the intersections between them as edges, and optimization is performed on it. The undirected graph G = (V, E) represents the line segments, and constraints are imposed to optimize the relations among the segments in the image and the spatial structure of the scene. The construction is shown in FIG. 2 for the line segments l1 through l7 of an image: each node is a line segment l of the image, and the edges are the intersections between the segments.
Here v_i ∈ V is the node of the undirected graph corresponding to the i-th line segment in the image, and (i, j) ∈ E is an edge of the undirected graph, representing the connection relation between segments. Each v_i stores the parameters of the corresponding segment as a data node: d_i ∈ S = {x, y, z} denotes the direction of the i-th segment, and the Boolean variable B_i is 1 if the segment is a boundary segment and 0 otherwise. For an edge e_ij of the undirected graph, the Boolean variable b_ij describes the intersection between the i-th and j-th segments, with 1 denoting a true intersection and 0 a false one; the region variable E_ij = 0 indicates that the intersection is located in the lower half of the image; and C_ij ∈ {L, K, X} denotes the category of the intersection.
3.2 Boundary line judgment:
The boundary line represents the edge of a plane, i.e., a line segment formed by the intersection of different planes, while a non-boundary line is a segment within a plane; when the corresponding spatial information is derived, the spatial coordinates of the scene body are deduced mainly through the intersections of the boundary lines. A non-boundary line lies in a plane and can only intersect segments in the two directions that form the plane; for example, a segment in the xy plane can only intersect segments in the x or y direction and cannot intersect segments in the z direction. The method in [Novel Single View Constraints for Manhattan 3D Line Reconstruction] is used to distinguish boundary lines from non-boundary lines: for each segment Li, the set of segments intersecting it is determined as Li = {ln, lm, lk, ...}; if the set contains no segment in the normal direction of the plane on which Li lies, Li is a non-boundary line and Bi = 0, otherwise Bi = 1.
3.3 Plane segmentation
Coordinate derivation can be carried out only after each pixel point to be calculated is determined to be a ground point or a wall point, i.e., its plane attribute must be determined.
Under the Manhattan world assumption, planes are classified into three types by their normal directions, namely planes in the x, y and z directions, where x, y and z denote the plane normals and also the coordinate-axis directions of the world coordinates; certain plane constraints hold at the same time.
Closed-loop constraint: a plane should be composed of two groups of parallel line segments; for example, the set of segments composing an xy plane should contain only X = {l_x1, l_x2, l_x3, ...} and Y = {l_y1, l_y2, l_y3, ...}, and these two sets of directions should contain no z-direction segment. That is, in the undirected graph, if a set of nodes forms a connected region containing a closed loop and the directions of the nodes in the region are the two directions that form a plane, the nodes are judged to belong to the same plane.
The line segments in the image are classified into three types, namely the segments grouped into the three vanishing directions, and they can likewise form planes in three directions. Each line segment is thus an element of two classes of planes, i.e., an x-direction segment can form both an xy plane and an xz plane. Existing clustering algorithms separate all nodes completely, i.e., each node receives a unique class attribute, whereas in the image the connection between planes passes through another class of plane, so each plane must be clustered individually. The image regions, i.e., floor, walls, ceiling and so on, are divided plane by plane.
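The closed-loop constraint above can be checked directly on the undirected graph. The sketch below, assuming networkx and a per-node direction label, keeps only cycles whose nodes use exactly two of the three Manhattan directions; it illustrates the criterion, not the patent's exact clustering procedure.

```python
import networkx as nx

def find_planar_loops(G, direction):
    """direction: dict node -> 'x' | 'y' | 'z' (vanishing-direction label)."""
    planes = []
    for cycle in nx.cycle_basis(G):        # independent closed loops of the graph
        dirs = {direction[v] for v in cycle}
        if len(dirs) == 2:                 # only the two in-plane directions appear
            planes.append((cycle, dirs))
    return planes
```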
3.4 Minimum spanning tree:
In the line-graph reconstruction, the pixel points whose three-dimensional coordinates can be calculated directly through the panoramic geometry are those of the ground region, and the ground line segments are determined from the plane regions produced by the plane segmentation. The x and z coordinates of a ground segment are calculated through the panoramic geometry, and the y coordinate of a ground point is 0. The three-dimensional coordinates of the other segments are then obtained through their intersections with the ground segments; the coordinates of some segments can only be calculated through a chain of several intersections, and the errors of the three-dimensional space coordinates computed at the intersections accumulate as the number of derivation nodes grows, so it is necessary to judge the authenticity of the intersections.
In the graph G = (V, E), (i, j) ∈ E represents an intersection between line segments, and a Boolean variable b_ij is created for each edge in the graph, with value 1 denoting a real intersection. The method in [Manhattan junction catalogue for spatial reasoning of indoor scenes] is adopted to classify the intersection categories, and the method in [Novel Single View Constraints for Manhattan 3D Line Reconstruction] is adopted to judge intersection authenticity; the two are combined in the optimization to determine the authenticity of the intersections in the image. The coordinates of the subsequently connected line segments are further derived from the real intersections.
FIG. 3 gives an example. The rectangular image is the unfolded form of the panoramic image, with pixel coordinates (x, y); the spherical image is the projection of the panoramic image onto the sphere, where the coordinates of a pixel point are determined by θ_x and θ_y in the spherical coordinate system. The figure includes the demonstration of calculating the y coordinate, where c_h is the camera height and θ_y is the angle of the pixel point in the vertical direction in the spherical coordinate system, as well as the demonstration of calculating the horizontal coordinates x and z through r; once r is obtained, the two coordinate values are easily computed.
Intersections 1 and 2 are true intersections, while intersections 3 and 4 are false intersections caused by parallax. The false intersections in the image are formed by the parallax of line segments of different directions in the scene, such as intersections 3, 4 and 5 in the figure; owing to the projection relationship, the spatial distance between the segments is not reflected in the picture. When the coordinates of the segments are known, the authenticity of an intersection can be judged. For example: when intersections 2 and 6 are ground points with coordinates P2(x2, z2, 0) and P6(x6, z6, 0), and the xy coordinates of the vertical segments l1 and l5 are known, the coordinates of intersection 4 computed through intersection 2 by panoramic geometry are P4(x2 + Δx, z2, 0), while those computed through intersection 6 are P'4(x6, z6, Δy). The equality P4 = P'4 would require P2 = P6 and Δx = Δy = 0, which cannot be satisfied, so intersection 4 is false; in the same way, 3, 4 and 5 can all be judged to be false intersections.
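The consistency argument in this example can be stated as a small check: a candidate intersection is accepted as true only if its 3D coordinates, propagated along two different chains of known segments, coincide. This is a sketch of the idea; the tolerance value is an assumption.

```python
import numpy as np

def consistent_intersection(p_via_a, p_via_b, tol=1e-3):
    """p_via_a, p_via_b: 3D coordinates of the same image intersection,
    derived from two different known line segments (e.g. intersections 2 and 6)."""
    return np.linalg.norm(np.asarray(p_via_a) - np.asarray(p_via_b)) < tol
```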
On the other hand, the line segment group li (i = 2, 3, 4, 5, 6) forms a closed loop that appears as a plane in the image, where l2 and l6 are z-direction segments, l3 and l4 are x-direction segments (defining an xz plane), and l5 is a vertical-direction segment; these five segments cannot form a closed plane in space, so the intersections they form necessarily include false ones. Meanwhile, a plane has zero component along its normal direction, i.e., its constituent segments contain only the two in-plane directions; this is the constraint that holds within a planar closed loop.
Before processing the image, the only spatial scale information available is the height of the camera above the ground; the spatial information of the scene is deduced from this condition, but the information that can be deduced directly is the coordinate information of the room floor, while the spatial coordinates of the walls, objects and ceiling are deduced from their intersection lines with the floor. When coordinate information is deduced, the intersections are the media that connect different segments and planes, so correctly handling the intersections in the image greatly influences the accuracy of the reconstruction result.
Intersection classification: the category of an intersection is determined by the number of line segments connected at the same intersection.
The region variable indicates whether the intersection lies in the upper or lower half of the image and is used to screen the ground nodes; it is determined by the pixel coordinates of the intersection.
J denotes the number of line segments connected at the intersection, and P denotes the number of endpoints constituting the node:
[the category-assignment formula is given as an image in the original]
l, T the intersection point is an intersection point formed by two line segments, the L-shaped intersection point is an in-plane intersection point, the T-shaped intersection point is an in-plane intersection point or a false intersection point formed by shielding, and the Y-shaped intersection point is the representation of a convex/concave structure in space. After determining the authenticity of the intersection point, the number of line segments constituting the intersection point should be equal to the number of line segments included in the intersection point line segment under the manhattan world constraint, i.e., j=num (Li) =num (D Li ) The method comprises the steps of carrying out a first treatment on the surface of the Wherein D is Li E D represents the set of line segment directions in the set of Li line segments. Meanwhile, the types of the intersection points are different, and the intersection points are selected when the spatial deduction is performed by using different weights.
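The patent's category-assignment formula is available only as an image; the sketch below therefore encodes the usual Manhattan junction catalogue as an assumption consistent with the description: two incident segments give an L or T junction (T when the point is interior to one of them), and three give a Y junction.

```python
def classify_junction(j, p):
    """j: number of line segments connected at the intersection
    p: number of those segments that terminate (have an endpoint) at the point"""
    if j == 2:
        return "L" if p == 2 else "T"
    if j == 3:
        return "Y"
    return "X"   # higher-order crossings
```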
After each node and edge of the undirected graph G generated from the image has been judged, parameters such as the boundary lines and intersection categories are acquired, and a weighted undirected graph is generated from these parameters.
The weights are as follows:
[the weight formulas are given as images in the original]
ω_ij is the weight of the intersection category, and ω_Bi is the weight given according to whether the line segments connected at the intersection are boundary lines.
The minimum spanning tree is obtained from the weighted undirected graph, and the coordinate-derivation flow is determined by it, i.e., the order in which the coordinates of the remaining line segments are derived from the ground-segment coordinates is selected through the intersections.
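A sketch of this step with networkx, under the weighting described above (the numeric weight formulas are given only as images in the patent, so the weight values here are placeholders): the minimum spanning tree of the weighted intersection graph is computed and traversed breadth-first from a ground segment to fix the derivation order.

```python
import networkx as nx

def derivation_order(weighted_edges, ground_segment):
    """weighted_edges: iterable of (segment_i, segment_j, weight) intersections."""
    G = nx.Graph()
    G.add_weighted_edges_from(weighted_edges)
    mst = nx.minimum_spanning_tree(G)        # uses the 'weight' edge attribute
    # BFS from the ground segment gives the order in which 3D coordinates
    # are propagated from the floor to the remaining segments
    return list(nx.bfs_edges(mst, source=ground_segment))
```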
4. Coordinate lifting:
The purpose of coordinate lifting is to realize the inverse process of camera imaging: combining the characteristics of the panoramic image with the geometric characteristics of the indoor scene, the world coordinates of the pixel points are computed from their two-dimensional pixel coordinates in the image.
4.1 Panoramic geometry:
Lifting from two-dimensional image coordinates to three-dimensional space coordinates essentially realizes the conversion among image coordinates, camera coordinates and world coordinates. The conversion between a 2D image point p_i and the corresponding 3D point P_i is:
P_i = λ_i R⁻¹ K⁻¹ p_i
where λ_i is the distance from the target pixel point to the origin of coordinates, R is the rotation matrix between the camera and world coordinates, K is the camera intrinsic matrix, and d_i = R⁻¹ K⁻¹ p_i is the direction of the pixel point in the world coordinate system. Assuming the camera parameters are known, the rotation matrix can be acquired by camera calibration. For depth information, geometry-based single-image reconstruction usually computes through the vanishing points: after the segment directions are determined, the vanishing points are solved by least squares to obtain the depth information of the points and line segments in the image.
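A direct transcription of the formula above into code, as a sketch; K, R and λ_i are placeholders for calibrated values.

```python
import numpy as np

def back_project(p, K, R, lam):
    """Lift 2D pixel p = (x, y) to the 3D point P_i = lam * R^-1 K^-1 p_i."""
    p_h = np.array([p[0], p[1], 1.0])                  # homogeneous pixel coordinates
    d = np.linalg.inv(R) @ np.linalg.inv(K) @ p_h      # viewing direction d_i
    return lam * d
```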
The panoramic image covers a field of view of 360° in the horizontal direction and 180° in the vertical direction, and the image has W×H pixels, where W = 2H. Therefore, in the panoramic image, pixels and angles are in one-to-one correspondence with ratio W/(2π). The conversion between image coordinates and panoramic coordinates is:
θ_x = 2πx/W
θ_y = πy/H
After determining the direction attributes of the line segments and pixel blocks in the image, the space of the scene can be reconstructed from this geometric property of the panoramic image. The conditions are:
1. Objects and scenes in the image satisfy the Manhattan world constraint, e.g., man-made scenes, houses, streets and buildings;
2. The height of the shooting camera is known.
Under panoramic image, pixel coordinate p i = (x, y) conversion to 3D spatial coordinates P i The conversion formula of = (X, Y, Z) is: the panoramic geometry is shown in fig. 4.
r=c h |cotθ y |
Figure GDA0002284577320000111
Figure GDA0002284577320000112
c h The height of the view point from the ground is shot by the camera, and r is the horizontal distance from the view point of the target pixel point under the world coordinate system. When determining the height c of the camera h And then the three-dimensional space coordinates of the corresponding pixel points in the image can be accurately calculated.
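A sketch of the whole lifting chain for a ground pixel, using the formulas above. The r formula is the patent's; the split of r into X and Z by cos θ_x and sin θ_x is an assumed axis convention, since the patent's own X and Z formulas are given only as images.

```python
import numpy as np

def lift_ground_pixel(x, y, W, H, c_h):
    """Convert an equirectangular pixel (x, y) of a W x H panorama (W = 2H)
    to 3D world coordinates, assuming the pixel lies on the ground plane."""
    theta_x = 2 * np.pi * x / W                  # azimuth
    theta_y = np.pi * y / H                      # vertical angle
    r = c_h * abs(1.0 / np.tan(theta_y))         # r = c_h * |cot(theta_y)|
    X = r * np.cos(theta_x)                      # assumed axis convention
    Z = r * np.sin(theta_x)
    return np.array([X, 0.0, Z])                 # ground points have height 0

# e.g. lift_ground_pixel(1500, 900, 2048, 1024, c_h=1.6)
```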
4.2 scene reconstruction
Using the panoramic geometry, the three-dimensional space coordinates of each line segment in the panoramic image are calculated; the flow is shown in FIG. 5.
Finally, it is noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications and equivalent substitutions may be made to the technical solution without departing from its spirit and scope, all of which should be covered by the claims of the present invention.

Claims (3)

1. An indoor three-dimensional reconstruction method based on a panoramic image, characterized by comprising the following steps:
S1: line segment detection: detecting the contour line segments in the image;
S2: line segment grouping: detecting the vanishing points of the image, i.e., grouping the line segments into three sets corresponding to the three coordinate-axis directions of the Cartesian coordinate system of the scene;
S3: intersection detection: constructing an undirected graph, judging true and false intersections through constraint conditions, and determining the recurrence order of coordinates through a minimum spanning tree;
S4: coordinate lifting: realizing the conversion from the pixel coordinate system to the spherical coordinate system and then to the three-dimensional space coordinate system;
when acquiring the line segments in the image, related information of each segment is acquired, including its start point, end point and angle; vanishing point directions are then acquired through the Hough transform, where the three vanishing directions are pairwise orthogonal and the three directions with the highest Hough voting scores are taken as the vanishing directions; when a line segment l in space is projected onto the spherical image, it appears as a great circle on the sphere; c is the great circle on which l lies and n is the unit normal vector of that great circle, while the vanishing direction corresponding to l is perpendicular to this plane normal vector; each line segment corresponds to a unique great circle and has a unique great-circle normal vector n, and the direction of the segment is identified through n.
2. The panorama-based indoor three-dimensional reconstruction method according to claim 1, characterized in that S1 specifically comprises: acquiring the line segments in the panoramic image with the LSD algorithm, which locally analyzes the image to obtain sets of collinear pixels for subsequent data processing.
3. The panorama-based indoor three-dimensional reconstruction method according to claim 1, characterized in that S4 specifically comprises:
S41: panoramic geometry
Lifting from two-dimensional image coordinates to three-dimensional space coordinates realizes the conversion among the image coordinate system, the camera coordinate system and the world coordinate system; the conversion between a 2D image point p_i and the corresponding 3D point P_i is:
P_i = λ_i R⁻¹ K⁻¹ p_i
where λ_i is the distance from the target pixel point to the origin of coordinates, R is the rotation matrix between the camera and world coordinates, K is the camera intrinsic matrix, and d_i = R⁻¹ K⁻¹ p_i is the direction of the pixel point in the world coordinate system; assuming the camera parameters are known, the rotation matrix can be obtained by camera calibration; for depth information, geometry-based single-image reconstruction is usually computed through vanishing points, and after the segment directions are determined, the vanishing points are solved by least squares to obtain the depth information of the points and line segments in the image;
the panoramic image covers a field of view of 360° in the horizontal direction and 180° in the vertical direction, and the image has W×H pixels, where W = 2H; in the panoramic image, pixels and angles are in one-to-one correspondence with ratio W/(2π); the conversion between image coordinates and panoramic coordinates is:
θ_x = 2πx/W
θ_y = πy/H
after the direction attributes of the line segments and pixel blocks in the image are determined, the space of the scene can be reconstructed through the geometric characteristic of the panoramic image; the conditions are as follows:
(1) Objects and scenes in the image meet the Manhattan world constraint;
(2) The camera height of the shot is known;
in the panoramic image, a pixel coordinate p_i = (x, y) is converted to 3D space coordinates P_i = (X, Y, Z) by:
r = c_h |cot θ_y|
X = r cos θ_x
Z = r sin θ_x
where c_h is the height of the camera viewpoint above the ground and r is the horizontal distance of the target pixel point from the viewpoint in the world coordinate system; once the camera height c_h is determined, the three-dimensional space coordinates of the corresponding pixel points in the image are calculated;
S42: scene reconstruction
The three-dimensional space coordinates of each line segment in the panoramic image are calculated using the panoramic geometry.
CN201911024676.3A 2019-10-25 2019-10-25 Indoor three-dimensional reconstruction method based on panoramic image Active CN110782524B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911024676.3A CN110782524B (en) 2019-10-25 2019-10-25 Indoor three-dimensional reconstruction method based on panoramic image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911024676.3A CN110782524B (en) 2019-10-25 2019-10-25 Indoor three-dimensional reconstruction method based on panoramic image

Publications (2)

Publication Number Publication Date
CN110782524A CN110782524A (en) 2020-02-11
CN110782524B (en) 2023-05-23

Family

ID=69386740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911024676.3A Active CN110782524B (en) 2019-10-25 2019-10-25 Indoor three-dimensional reconstruction method based on panoramic image

Country Status (1)

Country Link
CN (1) CN110782524B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325662A (en) * 2020-02-21 2020-06-23 广州引力波信息科技有限公司 Method for generating 3D space house type model based on spherical projection panoramic image
CN111508058A (en) * 2020-02-24 2020-08-07 当家移动绿色互联网技术集团有限公司 Method and device for three-dimensional reconstruction of image, storage medium and electronic equipment
CN113409235B (en) * 2020-03-17 2023-08-22 杭州海康威视数字技术股份有限公司 Vanishing point estimation method and apparatus
CN111784826A (en) * 2020-07-14 2020-10-16 深圳移动互联研究院有限公司 Method and system for generating three-dimensional structure schematic diagram based on panoramic image
CN111986322B (en) * 2020-07-21 2023-12-22 西安理工大学 Point cloud indoor scene layout reconstruction method based on structural analysis
CN112634460B (en) * 2020-11-27 2023-10-24 浙江工商大学 Outdoor panorama generation method and device based on Haar-like features
CN112802120B (en) * 2021-01-13 2024-02-27 福州视驰科技有限公司 Camera external parameter calibration method based on non-uniform segmentation accumulation table and orthogonal blanking points
CN112861024B (en) * 2021-02-03 2023-08-01 北京百度网讯科技有限公司 Method and device for determining road network matrix, electronic equipment and storage medium
CN112802193B (en) * 2021-03-11 2023-02-28 重庆邮电大学 CT image three-dimensional reconstruction method based on MC-T algorithm
CN113486223B (en) * 2021-06-07 2022-09-09 海南太美航空股份有限公司 Air route display method and system and electronic equipment
US11625860B1 (en) * 2021-09-07 2023-04-11 Hong Kong Applied Science And Technology Research Institute Co., Ltd. Camera calibration method
CN113989376B (en) * 2021-12-23 2022-04-26 贝壳技术有限公司 Method and device for acquiring indoor depth information and readable storage medium
CN114663618B (en) * 2022-03-03 2022-11-29 北京城市网邻信息技术有限公司 Three-dimensional reconstruction and correction method, device, equipment and storage medium
CN115237159B (en) * 2022-09-21 2023-09-15 国网江苏省电力有限公司苏州供电分公司 Wire inspection method adopting unmanned aerial vehicle
CN116563474B (en) * 2023-07-05 2023-09-19 有方(合肥)医疗科技有限公司 Oral cavity panorama generating method and device
CN117173370B (en) * 2023-11-03 2024-03-01 北京飞渡科技股份有限公司 Method for maintaining object boundary in light weight process

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105115560A (en) * 2015-09-16 2015-12-02 北京理工大学 Non-contact measurement method for cabin capacity
CN106327532A (en) * 2016-08-31 2017-01-11 北京天睿空间科技股份有限公司 Three-dimensional registering method for single image
CN107292956A (en) * 2017-07-12 2017-10-24 杭州电子科技大学 A kind of scene reconstruction method assumed based on Manhattan
CN108280858A (en) * 2018-01-29 2018-07-13 重庆邮电大学 A kind of linear global camera motion method for parameter estimation in multiple view reconstruction
US10026218B1 (en) * 2017-11-01 2018-07-17 Pencil and Pixel, Inc. Modeling indoor scenes based on digital images

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09331440A (en) * 1996-06-13 1997-12-22 Gen Tec:Kk Three-dimension scene re-configuration system
JPH1166302A (en) * 1997-08-26 1999-03-09 Matsushita Electric Works Ltd Straight line detecting method
US8630446B2 (en) * 2011-03-31 2014-01-14 Mitsubishi Electronic Research Laboratories, Inc Method and system for determining projections in non-central catadioptric optical systems
CN102679959B (en) * 2012-05-03 2014-01-29 浙江工业大学 Omnibearing 3D (Three-Dimensional) modeling system based on initiative omnidirectional vision sensor
US9224205B2 (en) * 2012-06-14 2015-12-29 Qualcomm Incorporated Accelerated geometric shape detection and accurate pose tracking
US9269187B2 (en) * 2013-03-20 2016-02-23 Siemens Product Lifecycle Management Software Inc. Image-based 3D panorama
US9595134B2 (en) * 2013-05-11 2017-03-14 Mitsubishi Electric Research Laboratories, Inc. Method for reconstructing 3D scenes from 2D images
US9183635B2 (en) * 2013-05-20 2015-11-10 Mitsubishi Electric Research Laboratories, Inc. Method for reconstructing 3D lines from 2D lines in an image
US9900505B2 (en) * 2014-07-23 2018-02-20 Disney Enterprises, Inc. Panoramic video from unstructured camera arrays with globally consistent parallax removal
US9805138B2 (en) * 2015-02-06 2017-10-31 Xerox Corporation Efficient calculation of all-pair path-based distance measures
US20160232705A1 (en) * 2015-02-10 2016-08-11 Mitsubishi Electric Research Laboratories, Inc. Method for 3D Scene Reconstruction with Cross-Constrained Line Matching
CN108229424A (en) * 2018-01-26 2018-06-29 西安工程大学 A kind of augmented reality system object recognition algorithm based on Hough ballot
CN109443359B (en) * 2018-09-27 2020-08-14 北京空间机电研究所 Geographical positioning method of ground panoramic image

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105115560A (en) * 2015-09-16 2015-12-02 北京理工大学 Non-contact measurement method for cabin capacity
CN106327532A (en) * 2016-08-31 2017-01-11 北京天睿空间科技股份有限公司 Three-dimensional registering method for single image
CN107292956A (en) * 2017-07-12 2017-10-24 杭州电子科技大学 A kind of scene reconstruction method assumed based on Manhattan
US10026218B1 (en) * 2017-11-01 2018-07-17 Pencil and Pixel, Inc. Modeling indoor scenes based on digital images
CN108280858A (en) * 2018-01-29 2018-07-13 重庆邮电大学 A kind of linear global camera motion method for parameter estimation in multiple view reconstruction

Also Published As

Publication number Publication date
CN110782524A (en) 2020-02-11


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant