CN109191526B - Three-dimensional environment reconstruction method and system based on RGBD camera and optical encoder - Google Patents


Info

Publication number
CN109191526B
CN109191526B (application CN201811052260.8A)
Authority
CN
China
Prior art keywords
pose
frame
optical encoder
current frame
key frame
Prior art date
Legal status
Active
Application number
CN201811052260.8A
Other languages
Chinese (zh)
Other versions
CN109191526A (en)
Inventor
王亚利
Current Assignee
Hangzhou Amy Ronotics Co ltd
Original Assignee
Hangzhou Amy Ronotics Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Amy Ronotics Co ltd filed Critical Hangzhou Amy Ronotics Co ltd
Priority to CN201811052260.8A
Publication of CN109191526A
Application granted
Publication of CN109191526B

Classifications

    • G: PHYSICS
        • G06: COMPUTING; CALCULATING OR COUNTING
            • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T7/00: Image analysis
                    • G06T7/70: Determining position or orientation of objects or cameras
                        • G06T7/73: using feature-based methods
                            • G06T7/74: involving reference images or patches
                • G06T17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
                • G06T2207/00: Indexing scheme for image analysis or image enhancement
                    • G06T2207/10: Image acquisition modality
                        • G06T2207/10024: Color image
                        • G06T2207/10028: Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The invention discloses a three-dimensional environment reconstruction method and system based on an RGBD (RGB + Depth) camera and an optical encoder. When a key frame is selected, if the previous frame cannot be used as the key frame, the optical-encoder poses of the previous frame and the current frame are used to calculate the inter-frame pose change between the two frames, obtaining a pose estimate of the current frame. In the reconstruction process, when the pose of the current frame relative to the nearest key frame estimated by the visual method has an error exceeding a first set threshold, the output pose of the optical encoder is taken as the initial pose of the camera, and the pose of the current frame relative to the nearest key frame and its error are estimated again. Estimating the camera pose with both vision and the optical encoder reduces the influence of ambient illumination changes on pose estimation; when key frame initialization fails, the optical encoder supplies the pose change estimate, which enhances the robustness of reconstruction in texture-poor environments, so the method can be widely used.

Description

Three-dimensional environment reconstruction method and system based on RGBD camera and optical encoder
Technical Field
The invention relates to the technical field of image processing, and in particular to a three-dimensional environment reconstruction method and system based on an RGBD camera and an optical encoder.
Background
In computer vision, three-dimensional reconstruction is the process of recovering three-dimensional information from single-view or multi-view images. Reconstruction from a single view is difficult because its information is incomplete. The usual approach is to calibrate the camera, i.e., to compute the relation between the camera's image coordinate system and the world coordinate system, and then to reconstruct three-dimensional information from multiple two-dimensional images.
At present, three-dimensional environment reconstruction methods based on an RGBD (RGB + Depth) camera take a color image and a depth image as input. The color image is susceptible to ambient light variations such as over-brightness, over-darkness and over-exposure, causing image degradation or even unusability. Furthermore, in a texture-poor environment, object edges and feature points cannot be extracted from the color image. These edges and feature points are essential for estimating the camera pose; if they cannot be extracted, the camera pose estimation error becomes too large, degrading the reconstruction.
Therefore, existing three-dimensional environment reconstruction methods based on an RGBD camera are easily affected by the environment and its illumination changes, have low robustness to such changes during reconstruction, and are limited in application.
Disclosure of Invention
Based on this, the invention aims to provide a three-dimensional environment reconstruction method and system based on an RGBD camera and an optical encoder in which, when the color image is degraded or cannot serve as a key frame because of changes in ambient illumination or texture, the camera pose estimate is compensated by the output of the optical encoder. This increases the robustness of the reconstruction process to the environment and its illumination changes, so the method and system can be widely used.
The invention provides a three-dimensional environment reconstruction method based on an RGBD camera and an optical encoder, which comprises the following steps:
when a key frame is selected, if the previous frame cannot be used as the key frame, calculating inter-frame pose change information between the current frame and the previous frame by using the optical-encoder pose of the previous frame and the optical-encoder pose of the current frame, to obtain a pose estimate of the current frame;
in the reconstruction process, when the pose of the current frame relative to the nearest key frame estimated by the visual method has an error exceeding a first set threshold, taking the output pose of the optical encoder as the initial pose of the camera and re-estimating the pose of the current frame relative to the nearest key frame and its error.
As an implementation, the key frame is selected according to the edges and feature points of the image.
As an implementation, when a key frame is selected, if the previous frame cannot be used as the key frame, calculating the pose change information between the current frame and the previous frame by using the optical-encoder pose of the previous frame and of the current frame to obtain a pose estimate of the current frame includes the following steps:
when a key frame is selected, if the previous frame cannot be used as the key frame, calculating the inter-frame pose change information between the current frame and the previous frame by using the optical-encoder pose information of the previous frame and of the current frame;
and obtaining the pose estimate of the current frame from the pose of the previous frame and the inter-frame pose change information.
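The two steps above reduce to composing rigid transforms. A minimal sketch in Python/NumPy follows; the 4x4 homogeneous-matrix representation and the function name are illustrative assumptions, not details from the patent:

```python
import numpy as np

def pose_from_encoder(T_cam_prev, T_enc_prev, T_enc_curr):
    """Estimate the current camera pose when the previous frame
    could not become a key frame.

    All poses are 4x4 homogeneous transforms. The inter-frame pose
    change is the encoder motion between the two frames,
        delta = inv(T_enc_prev) @ T_enc_curr,
    which is then composed onto the previous camera pose.
    """
    delta = np.linalg.inv(T_enc_prev) @ T_enc_curr
    return T_cam_prev @ delta
```

With identical encoder poses the delta is the identity, so the camera pose is simply carried forward.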
As an implementation manner, the three-dimensional environment reconstruction method based on the RGBD camera and the optical encoder further includes the following steps:
initializing a key frame;
if the initialization is successful, obtaining the pose estimation of the current frame according to the current frame and the nearest key frame;
if the initialization fails, judging whether the previous frame can be used as a key frame;
if the previous frame can be used as a key frame, saving the previous frame as a new key frame, initializing its pose to the identity matrix, and marking the initialization as successful.
As an implementation manner, after obtaining the pose estimation of the current frame according to the current frame and the nearest key frame, the method further includes the following steps:
calculating the pose and the error of the current frame relative to the nearest key frame;
if the error is smaller than a second set threshold value, saving the current frame as a previous frame;
and if the error is larger than a second set threshold value, marking initialization failure.
As an implementable manner, in the reconstruction process, when the pose of the current frame relative to the nearest key frame estimated by the visual method has an error exceeding a first set threshold, re-estimating the pose of the current frame relative to the nearest key frame and its error with the output pose of the optical encoder as the initial pose of the camera includes the following steps:
estimating the pose of the current frame relative to the nearest key frame and its error by a general visual method;
if the error is larger than the first set threshold, obtaining an inter-frame pose from the output pose of the optical encoder at the previous frame and at the current frame;
obtaining an initial estimate of the pose of the current frame from the inter-frame pose and the pose of the previous frame;
and obtaining the pose of the current frame from the pose of the latest key frame and the initial estimate of the pose of the current frame.
Correspondingly, the three-dimensional environment reconstruction system based on the RGBD camera and the optical encoder comprises a selecting module and a reconstruction module;
the selecting module is used for, when a key frame is selected and the previous frame cannot be used as the key frame, calculating the inter-frame pose change between the current frame and the previous frame by using the optical-encoder pose information of the previous frame and of the current frame, to obtain a pose estimate of the current frame;
and the reconstruction module is used for, in the reconstruction process, when the pose of the current frame relative to the nearest key frame estimated by the visual method has an error exceeding a first set threshold, re-estimating the pose of the current frame relative to the nearest key frame and its error with the output pose of the optical encoder as the initial pose of the camera.
As an implementation, the key frame is selected according to the edges and feature points of the image.
As an implementation, the selecting module includes a first calculating unit and a second calculating unit;
the first calculating unit is used for calculating the pose change information between the current frame and the previous frame by using the optical-encoder pose information of the previous frame and of the current frame if the previous frame cannot be used as the key frame when a key frame is selected;
and the second calculation unit is used for obtaining the pose estimation of the current frame according to the pose of the previous frame and the pose change information between frames.
The three-dimensional environment reconstruction system based on the RGBD camera and the optical encoder further comprises an initialization module;
the initialization module is used for initializing the key frame; if the initialization succeeds, obtaining a pose estimate of the current frame from the current frame and the nearest key frame; if the initialization fails, judging whether the previous frame can be used as a key frame, and if so, saving the previous frame as a new key frame, initializing its pose to the identity matrix, and marking the initialization as successful.
As an implementation manner, the selecting module further includes a third calculating unit;
the third calculation unit is used for calculating the pose and the error of the current frame relative to the nearest key frame; if the error is smaller than a second set threshold value, saving the current frame as a previous frame; and if the error is larger than a second set threshold value, marking initialization failure.
As an implementable manner, the reconstruction module includes a pose estimation unit, a comparison unit, a fourth calculation unit, and a fifth calculation unit;
the pose estimation unit is used for estimating and obtaining the pose of the current frame relative to the nearest key frame and the error thereof based on a general visual method;
the comparison unit is used for obtaining an inter-frame pose according to the output pose of the optical encoder of the previous frame and the output pose of the optical encoder of the current frame if the error is larger than a first set threshold;
the fourth calculation unit is used for obtaining an initial estimate of the pose of the current frame from the inter-frame pose and the pose of the previous frame;
and the fifth calculating unit is used for obtaining the pose of the current frame from the pose of the latest key frame and the initial estimate of the pose of the current frame.
Compared with the prior art, the technical scheme has the following advantages:
according to the three-dimensional environment reconstruction method and system based on the RGBD camera and the optical encoder, when a key frame is selected, if a previous frame cannot be used as the key frame, the pose of the optical encoder of the previous frame and the pose of the optical encoder of a current frame are used for calculating the pose change information between the current frame and the previous frame, and the pose estimation of the current frame is obtained; in the process of reconstruction, when the pose and the error of the current frame relative to the nearest key frame estimated based on the vision method exceed a first set threshold, the output pose of the optical encoder is used as the initial pose of the camera, and the pose and the error of the current frame relative to the nearest key frame are estimated again. The camera pose is estimated by using the vision and optical encoder, so that the influence of environmental illumination change on pose estimation can be reduced; when the initialization of the key frame fails, the pose change estimation is given by the optical encoder, the robustness of the reconstruction of the environment lacking textures can be enhanced, and the method can be widely used.
Drawings
Fig. 1 is a schematic diagram of positions of a previous key frame, a latest key frame, a previous frame, and a current frame on a time axis in a three-dimensional environment reconstruction method based on an RGBD camera and an optical encoder according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of a three-dimensional environment reconstruction method based on an RGBD camera and an optical encoder according to an embodiment of the present invention;
fig. 3 is a schematic diagram illustrating an initialization flow of a keyframe in a three-dimensional environment reconstruction method based on an RGBD camera and an optical encoder according to an embodiment of the present invention;
fig. 4 is a schematic reconstruction flow chart of a three-dimensional environment reconstruction method based on an RGBD camera and an optical encoder according to an embodiment of the present invention;
fig. 5 is a schematic flow chart of estimating a current pose in the three-dimensional environment reconstruction method based on the RGBD camera and the optical encoder according to the first embodiment of the present invention;
fig. 6 is a schematic structural diagram of a three-dimensional environment reconstruction system based on an RGBD camera and an optical encoder according to a second embodiment of the present invention.
Detailed Description
The above and further features and advantages of the present invention will be apparent from the following complete description of the invention taken in conjunction with the accompanying drawings, wherein the described embodiments are merely some, but not all, of the embodiments of the invention.
Each ordinary frame in the present invention contains the following information: the color image and the depth image output by the camera, the pose output by the optical encoder, and the pose of the camera. The sequential positions on the time axis of the previous key frame, the most recent key frame, the previous frame, and the current frame are shown in fig. 1. Several ordinary frames generally lie between two adjacent key frames, and after a current frame is processed its information is saved as the previous frame.
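A minimal sketch of the per-frame record described above; the class name, field names, and array shapes are illustrative assumptions:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Frame:
    """One ordinary frame: camera images, encoder pose, camera pose."""
    color: np.ndarray          # HxWx3 color image from the RGBD camera
    depth: np.ndarray          # HxW depth image from the RGBD camera
    encoder_pose: np.ndarray   # 4x4 pose output by the optical encoder
    camera_pose: np.ndarray    # 4x4 estimated camera pose
```

The "previous frame" slot of the pipeline would simply hold the last such record after each step.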
Referring to fig. 2, a three-dimensional environment reconstruction method based on an RGBD camera and an optical encoder provided in an embodiment of the present invention includes the following steps:
s100, when a key frame is selected, if the previous frame cannot be used as the key frame, calculating interframe pose change information between the current frame and the previous frame by using the pose of the optical encoder of the previous frame and the pose of the optical encoder of the current frame to obtain pose estimation of the current frame.
S200, in the process of reconstruction, when the pose and the error of the current frame relative to the nearest key frame estimated based on the vision method exceed a first set threshold, the output pose of the optical encoder is used as the initial pose of the camera, and the pose and the error of the current frame relative to the nearest key frame are estimated again.
In step S100, generally, the first step in reconstructing the three-dimensional environment is to initialize the key frame. Key frames are selected from the common frames in time order, and initialization selects one common frame as the first key frame. In order to estimate the poses of the following common frames, a key frame is required to have sufficiently rich edges or feature points.
In this embodiment, as shown in fig. 3, the key frame initialization is specifically implemented as follows:
s010, receiving a new common frame and storing the new common frame as a current frame;
s011, judging whether the current frame can be used as a key frame; i.e. to determine whether the current frame has sufficiently rich edges or feature points.
S012, if the current frame can be used as the key frame, then saving the current frame as the first key frame, marking the key frame initialization success, and saving the current frame as the previous frame; the reconstruction process can then be started, proceeding to step S021.
S013, if the current frame cannot be used as the key frame, receiving a newly arrived common frame as the current frame and again judging whether it can be used as the key frame; meanwhile, key frame initialization is marked as failed, and the flow proceeds to step S022.
S021, if the initialization is successful, obtaining pose estimation of the current frame according to the current frame and the nearest key frame;
s022, if the initialization fails, judging whether the previous frame can be used as a key frame;
s023, if the previous frame can be used as the key frame, the previous frame is saved as a new key frame, and the pose of the previous frame is initialized to the unit matrix, marking that the initialization is successful.
Specifically, step S100 may be implemented by:
firstly, when a key frame is selected, if the previous frame can not be used as the key frame, calculating the posture change information between the current frame and the previous frame by using the posture information of the optical encoder of the previous frame and the posture information of the optical encoder of the current frame; and then, obtaining the pose estimation of the current frame according to the pose of the previous frame and the pose change information between frames.
Next, calculating the pose and the error of the current frame relative to the nearest key frame; if the error is smaller than a second set threshold value, saving the current frame as a previous frame; and if the error is larger than a second set threshold value, marking initialization failure.
In the reconstruction process combining the camera and the optical encoder, the pose estimation and the error of the current frame relative to the nearest key frame are calculated, and when the error is smaller than a second set threshold value, the pose estimation is correct, so that the current frame is saved as a previous frame, and a new common frame is received and saved as the current frame; when the error is greater than the second set threshold, it indicates that the pose estimation is incorrect, so a new key frame needs to be generated.
Further, if the previous frame meets the condition of the key frame, the previous frame is saved as a new key frame, the pose of the previous frame is initialized to be a unit array of 4X4, and the key frame is marked to be initialized successfully; then, the pose of the current frame relative to the nearest key frame is estimated.
If the previous frame does not meet the condition of the key frame, the pose of the optical encoder of the previous frame and the pose of the optical encoder of the current frame are utilized to obtain the inter-frame pose change information between the current frame and the previous frame, and further obtain the pose estimation of the current frame; the mark key frame initialization is unsuccessful since the previous frame failed to become a key frame.
Therefore, when the initialization of the key frame fails, the light encoder is used for estimating the pose change, and the robustness of the reconstruction of the environment lacking the texture can be enhanced.
As an implementable manner, the reconstruction process combining the camera and the optical encoder is shown in fig. 4:
S41, receiving a new common frame and saving it as the current frame;
S42, after key frame initialization has succeeded, estimating the pose of the current frame relative to the nearest key frame and its error; if the error is greater than the set threshold, initialization is marked as failed and step S43 is executed; if the error is smaller than the set threshold, the pose estimate is accurate and step S47 is executed;
S43, after key frame initialization has failed, judging whether the previous frame can be used as the key frame;
S44, if the previous frame can be used as a key frame, saving it as a new key frame, initializing its pose estimate to the 4x4 identity matrix, and marking key frame initialization as successful;
S45, estimating the pose between the current frame and the nearest key frame, and saving the current frame as the previous frame;
S46, if the previous frame cannot be used as the key frame, obtaining the pose between the current frame and the previous frame from the optical-encoder pose information of the previous frame and of the current frame, thereby obtaining a pose estimate of the current frame, and marking key frame initialization as unsuccessful; then saving the current frame as the previous frame (S47) and receiving a new common frame as the current frame (S41);
S47, saving the current frame as the previous frame, then receiving a new common frame and saving it as the current frame (S41).
next, referring to fig. 5, step S200 may be implemented by:
s210, estimating the pose and the error of the current frame relative to the nearest key frame based on a general visual method;
s220, if the error is larger than a first set threshold, obtaining an inter-frame pose according to the output pose of the previous frame optical encoder and the output pose of the current frame optical encoder;
s230, obtaining initial estimation of the pose of the current frame according to the pose of the frame and the pose of the previous frame;
and S240, obtaining the pose of the current frame according to the initial estimation of the pose of the latest key frame and the current frame.
The pose of the current frame relative to the nearest key frame and its error are estimated by a general visual method; when the error is greater than the first set threshold, the pose error of the visual method is too large, so the pose output by the optical encoder is taken as an initial value, and the pose of the current frame relative to the nearest key frame and its error are estimated again by the visual method.
If the error is then smaller than the first set threshold, the new pose estimate is more accurate: it replaces the original pose estimate of the current frame, the new estimation error replaces the original error, and the result is output.
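Assuming a hypothetical helper `estimate_pose_visual(frame, key_frame, init_pose)` that returns a (pose, error) pair, steps S210-S240 together with the replacement rule above can be sketched as:

```python
import numpy as np

def estimate_with_encoder_fallback(frame, prev, key_frame, threshold,
                                   estimate_pose_visual):
    """S210-S240: retry visual pose estimation with an encoder-derived
    initial pose when the first visual estimate is too uncertain."""
    pose, err = estimate_pose_visual(frame, key_frame, init_pose=None)  # S210
    if err > threshold:                                                 # S220
        delta = np.linalg.inv(prev["encoder_pose"]) @ frame["encoder_pose"]
        init = prev["camera_pose"] @ delta                              # S230
        new_pose, new_err = estimate_pose_visual(frame, key_frame,
                                                 init_pose=init)        # S240
        if new_err < threshold:     # keep the re-estimate only if better
            pose, err = new_pose, new_err
    return pose, err
```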
According to the three-dimensional environment reconstruction method based on the RGBD camera and the optical encoder provided by this embodiment, the reconstruction process starts after key frame initialization succeeds. A pose estimate and its error are obtained from the RGBD camera output by a general visual method; when the error is too large, the output of the optical encoder is used as the initial pose value and the pose and its error are estimated again, the new pose estimate being taken as output once the error falls within the set error range. When a key frame is generated, if the previous frame image cannot be used as the key frame, the inter-frame pose change is solved from the optical-encoder outputs of the previous frame and the current frame, giving the pose estimate of the current frame.
In this embodiment, estimating the camera pose with both vision and the optical encoder reduces the influence of ambient illumination changes on pose estimation; when key frame initialization fails, the optical encoder supplies the pose change estimate, which enhances the robustness of reconstruction in texture-poor environments.
Based on the same inventive concept, a second embodiment of the present invention further provides a three-dimensional environment reconstruction system based on an RGBD camera and an optical encoder. Because its principle is the same as that of the method above, its implementation can refer to the flow of the method, and repeated parts are not described again.
Referring to fig. 6, a three-dimensional environment reconstruction system based on an RGBD camera and an optical encoder according to a second embodiment of the present invention includes a selecting module 100 and a reconstruction module 200.
The selecting module 100 is configured to, when a key frame is selected and the previous frame cannot be used as the key frame, calculate the inter-frame pose change between the current frame and the previous frame by using the optical-encoder pose information of the previous frame and of the current frame, obtaining a pose estimate of the current frame;
the reconstruction module 200 is configured to, in a reconstruction process, when the pose and the error of the current frame estimated based on the visual method with respect to the nearest key frame exceed a first set threshold, re-estimate the pose and the error of the current frame with respect to the nearest key frame by using the output pose of the optical encoder as the initial pose of the camera.
The three-dimensional environment reconstruction system based on the RGBD camera and the optical encoder provided in this embodiment selects the keyframe according to the edge and the feature point of the image.
The selecting module 100 includes a first calculating unit and a second calculating unit. The first calculating unit is used for calculating the pose change information between the current frame and the previous frame by using the optical-encoder pose information of the previous frame and of the current frame if the previous frame cannot be used as the key frame when a key frame is selected; the second calculating unit is used for obtaining the pose estimate of the current frame from the pose of the previous frame and the inter-frame pose change information.
The selection module also comprises a third calculation unit; the third calculating unit is used for calculating the pose and the error of the current frame relative to the nearest key frame; if the error is smaller than a second set threshold value, saving the current frame as a previous frame; and if the error is larger than a second set threshold value, marking initialization failure.
Further, the three-dimensional environment reconstruction system based on the RGBD camera and the optical encoder provided in this embodiment further includes an initialization module 010;
the initialization module 010 is configured to initialize the key frame; if the initialization is successful, obtaining the pose estimation of the current frame according to the current frame and the nearest key frame; if the initialization fails, whether the previous frame can be used as a key frame is judged, if the previous frame can be used as the key frame, the previous frame is stored as a new key frame, the position posture of the previous frame is initialized to a unit array, and the successful initialization is marked.
The reconstruction module 200 includes a pose estimation unit, a comparison unit, a fourth calculation unit, and a fifth calculation unit. The pose estimation unit estimates the pose of the current frame relative to the nearest key frame, and its error, by a general visual method; if the error is larger than the first set threshold, the comparison unit derives an inter-frame pose from the output pose of the previous frame's optical encoder and that of the current frame's optical encoder; the fourth calculation unit obtains an initial estimate of the current frame's pose from this inter-frame pose and the pose of the previous frame; and the fifth calculation unit obtains the pose of the current frame from the pose of the nearest key frame and the initial estimate of the current frame's pose.
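The fallback performed by these four units can be sketched as below. This is an illustrative sketch, not the patented implementation: the threshold value, the `visual_estimate` callback, and its `initial` parameter are assumptions introduced for the example:

```python
import numpy as np

FIRST_THRESHOLD = 0.05  # assumed value; the patent does not fix a number

def estimate_current_pose(T_prev, T_enc_prev, T_enc_cur, visual_estimate):
    # First try the general visual method against the nearest key frame.
    # visual_estimate(initial) returns (pose, error).
    pose, error = visual_estimate(initial=None)
    if error > FIRST_THRESHOLD:
        # Visual tracking degraded: seed the estimator with the pose
        # predicted from the optical encoder's inter-frame motion.
        T_init = T_prev @ (np.linalg.inv(T_enc_prev) @ T_enc_cur)
        pose, error = visual_estimate(initial=T_init)
    return pose, error
```

The encoder thus only intervenes when the visual error crosses the threshold; in well-textured, well-lit scenes the purely visual estimate is used unchanged.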
With the three-dimensional environment reconstruction method and system based on the RGBD camera and the optical encoder described above, when the color image degrades or cannot serve as a key frame because the ambient illumination or texture changes, the output of the optical encoder compensates the camera pose estimation. This increases the robustness of the reconstruction process to changes in the environment and its illumination, so the method and system can be widely applied.
Although the present invention has been described with reference to preferred embodiments, they are not intended to limit it; those skilled in the art may make variations and modifications using the methods and technical content disclosed above without departing from the spirit and scope of the present invention.

Claims (11)

1. A three-dimensional environment reconstruction method based on an RGBD camera and an optical encoder is characterized by comprising the following steps:
when a key frame is selected, if the previous frame cannot be used as the key frame, calculating the inter-frame pose change between the current frame and the previous frame by using the pose of the optical encoder at the previous frame and the pose of the optical encoder at the current frame, so as to obtain a pose estimate of the current frame;
in the reconstruction process, when the error of the pose of the current frame relative to the nearest key frame, estimated by a visual method, exceeds a first set threshold, taking the pose output by the optical encoder as the initial pose of the camera and re-estimating the pose of the current frame relative to the nearest key frame and its error, which comprises the following steps:
estimating the pose of the current frame relative to the nearest key frame, and its error, based on a general visual method;
if the error is larger than the first set threshold, obtaining an inter-frame pose according to the output pose of the optical encoder at the previous frame and the output pose of the optical encoder at the current frame;
obtaining an initial estimate of the pose of the current frame according to the inter-frame pose and the pose of the previous frame;
and obtaining the pose of the current frame according to the pose of the nearest key frame and the initial estimate of the pose of the current frame.
2. The RGBD camera and optical encoder based three-dimensional environment reconstruction method of claim 1, wherein the key frame is selected according to the edges and feature points of the image.
3. The RGBD camera and optical encoder based three-dimensional environment reconstruction method according to claim 1, characterized by comprising the following steps:
when a key frame is selected, if the previous frame cannot be used as the key frame, calculating the pose change information between the current frame and the previous frame by using the pose information of the optical encoder at the previous frame and the pose information of the optical encoder at the current frame;
and obtaining the pose estimation of the current frame according to the pose of the previous frame and the pose change information between frames.
4. The RGBD camera and optical encoder based three-dimensional environment reconstruction method according to claim 1, further comprising the steps of:
initializing a key frame;
if the initialization is successful, obtaining the pose estimation of the current frame according to the current frame and the nearest key frame;
if the initialization fails, judging whether the previous frame can be used as a key frame;
if the previous frame can be used as a key frame, saving the previous frame as a new key frame, initializing its pose to the identity matrix, and marking initialization as successful.
5. The RGBD camera and optical encoder based three-dimensional environment reconstruction method of claim 3, further comprising the following steps after the pose estimation of the current frame is obtained according to the current frame and the nearest key frame:
calculating the pose and the error of the current frame relative to the nearest key frame;
if the error is smaller than a second set threshold value, saving the current frame as a previous frame;
and if the error is larger than the second set threshold, marking initialization as failed.
6. A three-dimensional environment reconstruction system based on an RGBD camera and an optical encoder is characterized by comprising a selection module and a reconstruction module;
the selecting module is used for, when a key frame is selected and the previous frame cannot be used as the key frame, calculating the pose change between the current frame and the previous frame by using the pose information of the optical encoder at the previous frame and the pose of the optical encoder at the current frame, so as to obtain a pose estimate of the current frame;
and the reconstruction module is used for, in the reconstruction process, when the error of the pose of the current frame relative to the nearest key frame estimated by the visual method exceeds a first set threshold, re-estimating the pose of the current frame relative to the nearest key frame and its error by taking the output pose of the optical encoder as the initial pose of the camera.
7. The RGBD camera and optical encoder based three-dimensional environment reconstruction system of claim 6, wherein the key frames are selected according to the edges and feature points of the image.
8. The RGBD camera and optical encoder based three-dimensional environment reconstruction system of claim 6, wherein the selecting module comprises a first computing unit and a second computing unit;
the first calculating unit is used for, when a key frame is selected and the previous frame cannot be used as the key frame, calculating the pose change information between the current frame and the previous frame by using the pose information of the optical encoder at the previous frame and the pose information of the optical encoder at the current frame;
and the second calculation unit is used for obtaining the pose estimation of the current frame according to the pose of the previous frame and the pose change information between frames.
9. The RGBD camera and optical encoder based three-dimensional environment reconstruction system of claim 6, further comprising an initialization module;
the initialization module is used for initializing the key frame; if the initialization succeeds, obtaining the pose estimation of the current frame according to the current frame and the nearest key frame; if the initialization fails, judging whether the previous frame can be used as a key frame, and if so, saving the previous frame as a new key frame, initializing its pose to the identity matrix, and marking initialization as successful.
10. The RGBD camera and optical encoder based three-dimensional environment reconstruction system of claim 8, wherein the selecting module further comprises a third computing unit;
the third calculation unit is used for calculating the pose of the current frame relative to the nearest key frame and its error; if the error is smaller than a second set threshold, saving the current frame as the previous frame; and if the error is larger than the second set threshold, marking initialization as failed.
11. The RGBD camera and optical encoder based three-dimensional environment reconstruction system according to any one of claims 6 to 10, wherein the reconstruction module comprises a pose estimation unit, a comparison unit, a fourth calculation unit and a fifth calculation unit;
the pose estimation unit is used for estimating and obtaining the pose of the current frame relative to the nearest key frame and the error thereof based on a general visual method;
the comparison unit is used for obtaining an inter-frame pose according to the output pose of the optical encoder of the previous frame and the output pose of the optical encoder of the current frame if the error is larger than a first set threshold;
the fourth calculation unit is used for obtaining an initial estimation of the pose of the current frame according to the pose between frames and the pose of the previous frame;
and the fifth calculating unit is used for obtaining the pose of the current frame according to the pose of the nearest key frame and the initial estimate of the pose of the current frame.
CN201811052260.8A 2018-09-10 2018-09-10 Three-dimensional environment reconstruction method and system based on RGBD camera and optical encoder Active CN109191526B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811052260.8A CN109191526B (en) 2018-09-10 2018-09-10 Three-dimensional environment reconstruction method and system based on RGBD camera and optical encoder

Publications (2)

Publication Number Publication Date
CN109191526A CN109191526A (en) 2019-01-11
CN109191526B true CN109191526B (en) 2020-07-07

Family

ID=64915950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811052260.8A Active CN109191526B (en) 2018-09-10 2018-09-10 Three-dimensional environment reconstruction method and system based on RGBD camera and optical encoder

Country Status (1)

Country Link
CN (1) CN109191526B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103247075A (en) * 2013-05-13 2013-08-14 北京工业大学 Variational mechanism-based indoor scene three-dimensional reconstruction method
US9460513B1 (en) * 2015-06-17 2016-10-04 Mitsubishi Electric Research Laboratories, Inc. Method for reconstructing a 3D scene as a 3D model using images acquired by 3D sensors and omnidirectional cameras
CN106875437A (en) * 2016-12-27 2017-06-20 北京航空航天大学 A kind of extraction method of key frame towards RGBD three-dimensional reconstructions
CN108124489A (en) * 2017-12-27 2018-06-05 深圳前海达闼云端智能科技有限公司 Information processing method and device, cloud processing equipment and computer program product
CN108364344A (en) * 2018-02-08 2018-08-03 重庆邮电大学 A kind of monocular real-time three-dimensional method for reconstructing based on loopback test

Also Published As

Publication number Publication date
CN109191526A (en) 2019-01-11

Similar Documents

Publication Publication Date Title
EP3340170B1 (en) Multi-depth image fusion method and apparatus
CN102113015B (en) Use of inpainting techniques for image correction
US8588516B2 (en) Interpolation image generation apparatus, reconstructed image generation apparatus, method of generating interpolation image, and computer-readable recording medium storing program
JP6663652B2 (en) Stereo source image correction method and apparatus
EP2511875A1 (en) Apparatus and method for refining a value of a similarity measure
CN111127522B (en) Depth optical flow prediction method, device, equipment and medium based on monocular camera
EP2061005A2 (en) Device and method for estimating depth map, and method for generating intermediate image and method for encoding multi-view video using the same
WO2010083713A1 (en) Method and device for disparity computation
CN112243518A (en) Method and device for acquiring depth map and computer storage medium
CN111553940A (en) Depth image portrait edge optimization method and processing device
CN109191526B (en) Three-dimensional environment reconstruction method and system based on RGBD camera and optical encoder
US20170178351A1 (en) Method for determining missing values in a depth map, corresponding device, computer program product and non-transitory computer-readable carrier medium
CN109218706B (en) Method for generating stereoscopic vision image from single image
WO2002065397A2 (en) Method and apparatus for improving object boundaries extracted from stereoscopic images
CN112233164B (en) Method for identifying and correcting error points of disparity map
US20130163856A1 (en) Apparatus and method for enhancing stereoscopic image, recorded medium thereof
CN111223136B (en) Depth feature extraction method and device for sparse 2D point set
JP4634142B2 (en) Improved transform and encoding techniques
JP2011113177A (en) Method and program for structuring three-dimensional object model
CN112464727A (en) Self-adaptive face recognition method based on light field camera
KR20210055260A (en) Apparatus and method for estimating camera pose and depth using images continuously acquired at various viewpoints
CN109801324B (en) Inclined surface neighbor propagation stereo matching method insensitive to light intensity
US20230058934A1 (en) Method for camera control, image signal processor and device
CN114972517B (en) Self-supervision depth estimation method based on RAFT
US8345958B2 (en) Method and system for developing new-view image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant