CN113920270B - Layout reconstruction method and system based on multi-view panorama - Google Patents
Classifications
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06T17/10—Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes
- G06F18/22—Matching criteria, e.g. proximity measures
- G06N3/045—Combinations of networks
- G06T3/08
Abstract
The invention relates to the technical field of indoor scene layout, and in particular to a layout reconstruction method and system based on multi-view panoramas. The method first projects each panoramic image onto a cube to obtain perspective images of the six faces, and extracts and matches image features across these perspective images to obtain the position information and posture information of each panoramic image. A pre-trained deep neural network then predicts three-dimensional layout information of the indoor scene from each panoramic image. Finally, the position and posture information of the panoramic images is fused with the three-dimensional layout information of the indoor scene, completing the reconstruction of the indoor scene. The invention can reconstruct the layout of an entire indoor scene efficiently with only one panoramic sensor and no manual intervention.
Description
Technical Field
The invention relates to the technical field of indoor scene layout, in particular to a layout reconstruction method and a system based on multi-view panorama.
Background
Compared with reconstructing a full indoor three-dimensional model, indoor scene layout reconstruction yields a simpler and more compact result, and therefore has a wider range of applications in the VR/AR field.
The Chinese patent entitled "Indoor three-dimensional layout reconstruction method" (patent No. ZL 201910343315) discloses acquiring image-sequence data and inertial measurement data of an indoor scene while performing real-time three-dimensional reconstruction, obtaining a key-frame image sequence together with its real-time reconstructed positions and postures. Offline three-dimensional reconstruction is then performed with the key-frame image sequence and the corresponding positions and postures to obtain a dense three-dimensional point cloud of the indoor scene. Plane structures are extracted from the dense point cloud and are screened and classified into a roof plane, a ground plane, and candidate wall planes, while the floor height is obtained at the same time. A three-dimensional layout of the indoor scene is then constructed from the roof plane, ground plane, candidate wall planes, and floor height, and finally the three-dimensional layout reconstruction result is output. In short, after reconstructing the indoor scene from RGB color images, the method extracts plane structures from the dense three-dimensional point cloud and screens and classifies them to obtain the layout information of the indoor scene.
The Chinese patent application entitled "Indoor reconstruction method, apparatus, device, and medium" (application No. CN 201711163966) discloses acquiring a panoramic image, depth-of-field data, and the acquisition position of an indoor interior; according to the panoramic image, the depth-of-field data, and the acquisition position, the three-dimensional space of the house is reconstructed and a three-dimensional house model is generated. This method mainly uses a deep neural network to estimate indoor three-dimensional layout information from a single RGB image, and can therefore only reconstruct the layout of a small area at a time.
Disclosure of Invention
The invention mainly solves the technical problem of providing a layout reconstruction method based on multi-view panoramas that can reconstruct the layout of an entire indoor scene efficiently with only one panoramic sensor and no manual intervention; a layout reconstruction system based on multi-view panoramas is also provided.
In order to solve the technical problems, the invention adopts a technical scheme that: a layout reconstruction method based on multi-view panorama is provided, wherein the method comprises the following steps:
step S1, projecting the panoramic image onto a cube to obtain perspective images of six faces, and extracting image features and matching the features from the perspective images of the six faces to obtain position information and posture information of each panoramic image;
step S2, predicting each panoramic image by using a pre-trained deep neural network to obtain three-dimensional layout information of an indoor scene;
and step S3, fusing the position information and the posture information of the panoramic image with the three-dimensional layout information of the indoor scene, thereby completing the reconstruction of the indoor scene.
As an improvement of the present invention, in step S1, image features are extracted by taking regions with large gray-level variation as feature points.
As a further improvement of the present invention, in step S1, feature matching of feature points is performed by optical flow matching.
As a further improvement of the present invention, in step S1, the position information and the orientation information of each panoramic image are calculated based on the extracted image features and feature matching points.
As a further improvement of the present invention, in step S2, when the pre-trained deep neural network processes each panoramic image, the input is the panoramic image and the output is the house corner points and the house wall-line map.
As a further improvement of the present invention, in step S2, house corner points, house corner point depths, and two-dimensional house wall lines are extracted from the panoramic image, and then back-projected into a three-dimensional space, so as to predict three-dimensional layout information of an indoor scene.
As a further improvement of the present invention, in step S3, scale information of sparse points is acquired based on the pose information of the panoramic image, and thus the scale information of the three-dimensional layout information of the indoor scene in step S2 is corrected.
As a further improvement of the present invention, in step S3, the scale information of the sparse points is compared with the scale information of the three-dimensional layout information of the indoor scene to obtain a scale correction factor, so that the three-dimensional layout information of the indoor scene is scaled according to the scale correction factor.
As a further improvement of the present invention, in step S3, the three-dimensional layouts of the corrected indoor scenes generated from all the panoramic images are merged to obtain the layout of the indoor scene.
A multi-view panorama based layout reconstruction system, comprising:
the attitude acquisition module is used for projecting the panoramic image onto a cube, acquiring perspective images of six surfaces, and extracting image features and matching the features from the perspective images of the six surfaces so as to obtain position information and attitude information of each panoramic image;
the training layout module is used for predicting each panoramic image by using a pre-trained deep neural network to obtain three-dimensional layout information of an indoor scene;
and the fusion reconstruction module fuses the position information and the posture information of the panoramic image and the three-dimensional layout information of the indoor scene so as to complete the reconstruction of the indoor scene.
The beneficial effects of the invention are as follows: the panoramic image is first projected onto a cube to obtain perspective images of the six faces, and image features are extracted and matched across these perspective images to obtain the position information and posture information of each panoramic image; a pre-trained deep neural network then predicts three-dimensional layout information of the indoor scene from each panoramic image; finally, the position and posture information of the panoramic images is fused with the three-dimensional layout information of the indoor scene, completing the reconstruction of the indoor scene. The invention can reconstruct the layout of an entire indoor scene efficiently with only one panoramic sensor and no manual intervention.
Drawings
FIG. 1 is a block diagram of the steps of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
Referring to fig. 1, a layout reconstruction method based on multi-view panorama according to the present invention includes the following steps:
step S1, projecting the panoramic image onto a cube to obtain perspective images of six faces, and extracting and matching image features across the six perspective images to obtain position information and posture information of each panoramic image;
step S2, predicting each panoramic image by using the pre-trained deep neural network to obtain three-dimensional layout information of an indoor scene;
and step S3, fusing the position information and the posture information of the panoramic image with the three-dimensional layout information of the indoor scene, thereby completing the reconstruction of the indoor scene.
The invention can complete the layout reconstruction of the whole indoor scene only by one panoramic sensor without manual intervention, and can efficiently reconstruct the indoor scene.
In step S1, image features are extracted by taking regions with large gray-level variation as feature points, and the feature points are matched by optical flow; the position information and posture information of each panoramic image are then calculated from the extracted image features and the matched feature points.
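The cube projection used in step S1 can be sketched as follows. This is a minimal NumPy sketch, not the patent's exact mapping: the face layout, the 90-degree field of view per face, and nearest-neighbor sampling are all assumptions. Each pixel of one cube face is mapped to a viewing ray, which is then converted to equirectangular coordinates to sample the panorama:

```python
import numpy as np

# Map each pixel of one cube face to a unit viewing direction. The "front"
# face looks along +z; the other five faces are axis permutations.
def face_directions(size, face="front"):
    a = np.linspace(-1.0, 1.0, size)       # pixel grid on the face plane
    u, v = np.meshgrid(a, -a)              # v flipped so +y points up
    ones = np.ones_like(u)
    axes = {
        "front": (u, v, ones),   "back": (-u, v, -ones),
        "right": (ones, v, -u),  "left": (-ones, v, u),
        "up":    (u, ones, -v),  "down": (u, -ones, v),
    }
    x, y, z = axes[face]
    d = np.stack([x, y, z], axis=-1)
    return d / np.linalg.norm(d, axis=-1, keepdims=True)

# Sample the equirectangular panorama along the given directions
# (nearest-neighbor for brevity).
def sample_equirect(pano, dirs):
    h, w = pano.shape[:2]
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])        # [-pi, pi]
    lat = np.arcsin(np.clip(dirs[..., 1], -1.0, 1.0))   # [-pi/2, pi/2]
    col = ((lon / (2 * np.pi) + 0.5) * (w - 1)).round().astype(int)
    row = ((0.5 - lat / np.pi) * (h - 1)).round().astype(int)
    return pano[row, col]
```

Running `sample_equirect(pano, face_directions(n, f))` for the six face names yields the six perspective views on which features are extracted.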
Specifically, compared with a perspective image of the scene, a panoramic image has a 360-degree FOV, so pose estimation in an indoor scene is less susceptible to occlusion and to weak or repeated textures. The real-time pose estimation of the panoramic images proceeds as follows: each panoramic image is first projected onto a cube to obtain 6 perspective images; image features are then extracted and matched across the 6 perspective images, from which the position and posture of each panoramic image are calculated. To keep the computation efficient during feature extraction and matching, regions with large gray-level variation are used as feature points, and the feature points are matched by optical flow. A point $(x, y)$ on the panoramic image $I$ is therefore selected as a feature point when its brightness differs from the pixel brightness within a window of size $2m+1$ by more than a threshold $t$:

$$\left| I(x, y) - I(u, v) \right| > t, \quad (u, v) \in W_{2m+1}(x, y),$$

where $W_{2m+1}(x, y)$ denotes the $(2m+1) \times (2m+1)$ window centered at $(x, y)$.
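The gray-variation criterion can be sketched as follows. Interpreting "differs by more than a threshold within the window" as a maximum-absolute-difference test is an assumption; the patent does not spell out the exact scoring rule:

```python
import numpy as np

# A point (x, y) is kept as a feature if its brightness differs from some
# pixel in the surrounding (2m+1) x (2m+1) window by more than threshold t,
# i.e. the local gray-level variation is large. (One interpretation of the
# patent's criterion; the exact rule is an assumption.)
def select_feature_points(img, m=1, t=30):
    h, w = img.shape
    pts = []
    for y in range(m, h - m):
        for x in range(m, w - m):
            window = img[y - m:y + m + 1, x - m:x + m + 1].astype(int)
            if np.abs(window - int(img[y, x])).max() > t:
                pts.append((x, y))
    return pts
```

A production system would vectorize this loop or use a corner detector, but the sketch shows the selection rule itself.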
For optical flow matching, the image $I$ is assumed to be a function of time, $I(x, y, t)$, with brightness constancy $I(x + \mathrm{d}x,\, y + \mathrm{d}y,\, t + \mathrm{d}t) = I(x, y, t)$; the matching problem is thereby converted into an optimization problem over the displacement $(\mathrm{d}x, \mathrm{d}y)$, which completes the feature matching.
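The optimization can be illustrated with a single Lucas-Kanade-style step, an assumed concrete form since the patent does not fix one: linearizing brightness constancy gives $I_x\,\mathrm{d}x + I_y\,\mathrm{d}y = -I_t$, and stacking this constraint over a window around a feature point yields a small least-squares problem:

```python
import numpy as np

# One Lucas-Kanade step: solve Ix*dx + Iy*dy = -It in the least-squares
# sense over a (2m+1) x (2m+1) window around the feature point (x, y).
def lucas_kanade_step(img0, img1, x, y, m=2):
    win0 = img0[y - m:y + m + 1, x - m:x + m + 1].astype(float)
    win1 = img1[y - m:y + m + 1, x - m:x + m + 1].astype(float)
    iy, ix = np.gradient(win0)          # spatial gradients (rows, cols)
    it = win1 - win0                    # temporal difference
    A = np.stack([ix.ravel(), iy.ravel()], axis=1)
    b = -it.ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow                         # estimated (dx, dy)
```

In practice this step is iterated and run on an image pyramid to handle larger displacements.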
Finally, based on the matched feature points and multi-view stereo geometry, the position and posture $T_i$ of each panoramic image and the three-dimensional positions $P_j$ of the feature points are treated as unknowns and solved from the following equation, where $u_{ij}$ denotes the pixel coordinates of feature point $P_j$ in image $I_i$ and $\pi(\cdot)$ projects a three-dimensional point into the panoramic image:

$$\min_{T_i,\, P_j} \sum_{i,j} \left\| \pi(T_i, P_j) - u_{ij} \right\|^2.$$
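The residual being minimized can be sketched as follows, assuming an equirectangular projection model for $\pi$ (the patent does not specify the pose parameterization; rotation matrix $R$ and translation $t$ are used here for illustration):

```python
import numpy as np

# Project a 3-D point P into an equirectangular panorama with pose (R, t),
# then compare against the observed pixel u_obs. Summing these squared
# residuals over all poses and points gives the objective solved for the
# unknown poses and point positions (a sketch of the adjustment step).
def project_equirect(R, t, P, width, height):
    pc = R @ P + t                       # point in the camera frame
    d = pc / np.linalg.norm(pc)          # unit viewing direction
    lon = np.arctan2(d[0], d[2])
    lat = np.arcsin(d[1])
    return np.array([(lon / (2 * np.pi) + 0.5) * width,
                     (0.5 - lat / np.pi) * height])

def reprojection_error(R, t, P, u_obs, width=1024, height=512):
    r = project_equirect(R, t, P, width, height) - u_obs
    return float(r @ r)
```

A nonlinear least-squares solver (e.g. Levenberg-Marquardt) would minimize the sum of these residuals over all views and points.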
in step S2, when training each panoramic image with the pre-trained deep neural network, inputting the panoramic image and outputting the panoramic image as house corner points and house wall line map; during operation, house corner points, house corner point depth and two-dimensional house wall lines are extracted from the panoramic image and then are back projected into a three-dimensional space, so that three-dimensional layout information of indoor scenes is predicted.
Specifically, a deep neural network directly predicts the three-dimensional layout information of the indoor scene on each panoramic image, yielding the layout information corresponding to the current panoramic image. The core of panoramic layout generation is to extract the house corner points $C$ together with their depths and the two-dimensional house wall lines $W$ from the panoramic image, and then back-project them into three-dimensional space.
Therefore, a deep neural network needs to be constructed whose input is a panoramic image $I$ and whose output is the house corner points and the house wall-line map; the layout acquired from them is denoted $L$, where $f$ is the neural network and $C$ and $W$ are its predictions:

$$(C, W) = f(I).$$
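The back-projection of a predicted corner can be sketched as follows, assuming an equirectangular pixel-to-ray convention (the network's exact output encoding may differ):

```python
import numpy as np

# Back-project a predicted house corner from the panorama into 3-D: the
# pixel (col, row) of an equirectangular image of size (width, height)
# defines a viewing ray; scaling the unit ray by the predicted corner
# depth gives the 3-D corner position.
def backproject_corner(col, row, depth, width, height):
    lon = (col / width - 0.5) * 2.0 * np.pi
    lat = (0.5 - row / height) * np.pi
    ray = np.array([np.cos(lat) * np.sin(lon),
                    np.sin(lat),
                    np.cos(lat) * np.cos(lon)])
    return depth * ray
```

Applying this to every predicted corner, and analogously to samples along the wall lines, yields the per-view three-dimensional layout.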
in step S3, obtaining scale information of sparse points according to the pose information of the panoramic image, thereby correcting the scale information of the three-dimensional layout information of the indoor scene in step S2; that is, comparing the scale information of the sparse points with the scale information of the three-dimensional layout information of the indoor scene to obtain a scale correction factor, and scaling the three-dimensional layout information of the indoor scene according to the scale correction factor; and performing topological fusion on the three-dimensional layouts of the corrected indoor scenes generated by all the panoramic images to obtain the layouts of the indoor scenes.
Specifically, a single panoramic image and its layout information can only reconstruct part of the indoor scene, not the whole of it; to address this, multi-view layout reconstruction is performed in two steps:
First, the scale of the indoor layout information is corrected using the scale information of the sparse points acquired during real-time pose estimation, ensuring that the layouts generated from the multi-view panoramic images share the same scale. The sparse-point scale is compared directly with the corresponding layout scale, and the scale correction factor is computed by optimization. For a panoramic image $I_i$ and its corresponding layout $L_i$, the projected coordinates of the sparse points on the panorama and their depths $d_k^{s}$ are first acquired; at the same time, the layout $L_i$ is projected to obtain the layout depth $d_k^{l}$ at the same coordinates. To eliminate the scale inconsistency, the following equation is constructed to solve for the scale correction factor $s$:

$$s^{*} = \arg\min_{s} \sum_{k} \left( d_k^{s} - s\, d_k^{l} \right)^2.$$
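Under this least-squares formulation, the scale correction factor has a closed form. The sketch below omits outlier handling, which a practical implementation would add:

```python
import numpy as np

# Least-squares scale correction: given sparse-point depths d_s (from pose
# estimation, metrically consistent across views) and the layout depths d_l
# at the same pixels, the s minimizing sum_k (d_s[k] - s * d_l[k])^2 is
#   s = (d_s . d_l) / (d_l . d_l).
# The per-view layout is then scaled by s.
def scale_correction_factor(d_sparse, d_layout):
    d_s = np.asarray(d_sparse, dtype=float)
    d_l = np.asarray(d_layout, dtype=float)
    return float(d_s @ d_l / (d_l @ d_l))
```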
And secondly, after the scale correction is finished, fusing layout information generated by a plurality of images.
Because the indoor layout, apart from its height, can be simplified to a two-dimensional plane, a two-dimensional voxel grid with resolution r is first constructed; the layouts of the multiple panoramic frames are then projected into the voxel grid, all layout information is fused within it, and finally the voxel information is converted into topology information for output, yielding the layout reconstruction result of the entire indoor scene.
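The voxel fusion step can be sketched as follows; the grid extent and the view-agreement threshold are illustrative assumptions, and extracting the topology from the fused grid is left out:

```python
import numpy as np

# Fuse per-view layouts in a 2-D cell grid of resolution r: rasterize each
# view's layout points into the grid, let each view vote once per cell, and
# keep the cells that enough views agree on as the fused floor plan.
def fuse_layouts(layout_points, r=0.1, extent=10.0, min_views=2):
    n = int(round(2 * extent / r))
    votes = np.zeros((n, n), dtype=int)
    for pts in layout_points:                  # one (k, 2) array per view
        idx = ((np.asarray(pts) + extent) / r).astype(int)
        idx = idx[(idx >= 0).all(axis=1) & (idx < n).all(axis=1)]
        grid = np.zeros((n, n), dtype=bool)
        grid[idx[:, 0], idx[:, 1]] = True      # one vote per cell per view
        votes += grid
    return votes >= min_views                  # fused occupancy grid
```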
The invention also provides a layout reconstruction system based on multi-view panorama, comprising:
the attitude acquisition module is used for projecting the panoramic image onto a cube, acquiring perspective images of six surfaces, and extracting image characteristics and matching the characteristics from the perspective images of the six surfaces so as to obtain position information and attitude information of each panoramic image;
the training layout module is used for predicting each panoramic image by using a pre-trained deep neural network to obtain three-dimensional layout information of an indoor scene;
and the fusion reconstruction module fuses the position information and the posture information of the panoramic image and the three-dimensional layout information of the indoor scene so as to complete the reconstruction of the indoor scene.
Compared with other indoor layout reconstruction schemes, the indoor layout reconstruction method can complete the layout reconstruction of the whole indoor scene only by one panoramic sensor without manual intervention, and can efficiently reconstruct the indoor scene.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.
Claims (7)
1. A layout reconstruction method based on multi-view panorama, characterized by comprising the following steps:
step S1, projecting the panoramic image onto a cube to obtain perspective images of six faces, and extracting image features and matching the features from the perspective images of the six faces to obtain position information and posture information of each panoramic image;
step S2, predicting each panoramic image by using a pre-trained deep neural network to obtain three-dimensional layout information of an indoor scene;
step S3, fusing the position information and the posture information of the panoramic image with the three-dimensional layout information of the indoor scene, thereby completing the reconstruction of the indoor scene;
in step S1, image features are extracted by taking regions with large gray-level variation as feature points;
in step S1, feature matching of the feature points is performed by optical flow matching;
in step S1, calculating position information and orientation information of each panoramic image based on the extracted image features and feature matching points;
the feature points are selected as follows: a point $(x, y)$ on the panoramic image $I$ is selected when its brightness differs from the pixel brightness within a window of size $2m+1$ by more than a threshold $t$, with the expression $\left| I(x, y) - I(u, v) \right| > t$ for $(u, v)$ in the window, where $m$ is the half-width of the window.
2. The layout reconstruction method based on multi-view panorama of claim 1, wherein in step S2, when the pre-trained deep neural network is used to predict each panoramic image, the input is the panoramic image and the output is the house corner points and the house wall-line map.
3. The layout reconstruction method based on the multi-view panorama of claim 2, wherein in step S2, house corner points, house corner point depths, and two-dimensional house wall lines are extracted from the panoramic image and then back projected into a three-dimensional space, thereby predicting three-dimensional layout information of an indoor scene.
4. The method of claim 3, wherein in step S3, the scale information of sparse points is obtained according to the pose information of the panoramic image, so as to correct the scale information of the three-dimensional layout information of the indoor scene in step S2.
5. The method for reconstructing a layout based on multi-view panorama of claim 4, wherein in step S3, the scale information of the sparse points is compared with the scale information of the three-dimensional layout information of the indoor scene to obtain the scale correction factor, so as to scale the three-dimensional layout information of the indoor scene according to the scale correction factor.
6. The method for reconstructing a layout based on multi-view panorama of claim 5, wherein in step S3, the three-dimensional layout of the corrected indoor scene generated by all the panorama images is merged to obtain the layout of the indoor scene.
7. A multi-view panorama based layout reconstruction system employing the layout reconstruction method according to any one of claims 1 through 6, comprising:
the attitude acquisition module is used for projecting the panoramic image onto a cube, acquiring perspective images of six surfaces, and extracting image features and matching the features from the perspective images of the six surfaces so as to obtain position information and attitude information of each panoramic image;
the training layout module is used for predicting each panoramic image by using a pre-trained deep neural network to acquire three-dimensional layout information of an indoor scene;
and the fusion reconstruction module fuses the position information and the posture information of the panoramic image and the three-dimensional layout information of the indoor scene so as to complete the reconstruction of the indoor scene.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111527178.8A CN113920270B (en) | 2021-12-15 | 2021-12-15 | Layout reconstruction method and system based on multi-view panorama |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113920270A CN113920270A (en) | 2022-01-11 |
CN113920270B true CN113920270B (en) | 2022-08-19 |
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination
- GR01: Patent grant
- CB03: Change of inventor or designer information. Inventors after: Yan Qingsong, Zhao Kaiyong. Inventor before: Yan Qingsong.