CN110555908A - Three-dimensional reconstruction method based on indoor moving target background restoration - Google Patents

Three-dimensional reconstruction method based on indoor moving target background restoration

Info

Publication number
CN110555908A
CN110555908A
Authority
CN
China
Prior art keywords
target
rgb image
depth
image
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910799527.8A
Other languages
Chinese (zh)
Other versions
CN110555908B (en)
Inventor
吴宪祥
耿煜恒
张晋新
孙牧野
陈笑
赵博
周旭阳
程开
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Electronic Science and Technology
Original Assignee
Xian University of Electronic Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Electronic Science and Technology filed Critical Xian University of Electronic Science and Technology
Priority to CN201910799527.8A priority Critical patent/CN110555908B/en
Publication of CN110555908A publication Critical patent/CN110555908A/en
Application granted granted Critical
Publication of CN110555908B publication Critical patent/CN110555908B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20164Salient point detection; Corner detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a three-dimensional reconstruction method based on indoor moving target background restoration, which is used for solving the technical problem of low three-dimensional reconstruction accuracy in indoor dynamic scenes caused by the large number of noise points introduced by moving targets in the prior art. The method comprises the following specific steps: (1) acquiring an RGB image sequence and a depth image sequence of the indoor scene to be reconstructed; (2) acquiring the target areas of each RGB image; (3) performing feature matching between adjacent frames of the RGB image sequence; (4) calculating the rotation matrix and translation vector of the depth camera pose transformation between adjacent frames; (5) determining the moving target area in each RGB image; (6) repairing the static background occluded by the moving target within the moving target area of each RGB image; (7) obtaining the three-dimensional reconstruction result. The three-dimensional reconstruction accuracy in indoor dynamic environments is significantly higher than that of the prior art, and the method can be used for acquiring and analyzing three-dimensional information of indoor dynamic scenes.

Description

Three-dimensional reconstruction method based on indoor moving target background restoration
Technical Field
The invention belongs to the technical field of computer vision and image processing, relates to a three-dimensional reconstruction method, and particularly relates to a three-dimensional reconstruction method based on indoor moving target background restoration.
Background
Three-dimensional reconstruction uses a computer to model a real-world three-dimensional object and acquire its complete three-dimensional information, including structure, texture and dimensions. At present two three-dimensional reconstruction approaches are common: contact measurement and non-contact measurement. The contact measurement method is based on a force-trigger principle: the coordinates of sampling points on the object surface are obtained through direct contact between a probe and the object. The non-contact measurement method obtains and measures the three-dimensional shape of a target without touching the object. Three-dimensional reconstruction from an RGB image and a depth image is a typical non-contact measurement method, which reconstructs the three-dimensional shape of an object from the depth image and the feature information of the RGB image. Reconstruction accuracy is an important index for evaluating the reconstruction result.
The existing three-dimensional reconstruction methods based on RGB images and depth images mainly include the following types:
1. Static scene reconstruction: images of the object to be reconstructed are captured from different viewing angles, an effective imaging model is established through camera calibration to solve the intrinsic and extrinsic parameters of the camera, image feature points are extracted, the camera pose is solved with the ICP (Iterative Closest Point) algorithm, and finally the individual point cloud images are stitched together.
2. Dynamic scene reconstruction: a moving target in the scene is detected with an object detection technique, the point cloud corresponding to the moving target is removed in the point cloud stitching stage, and a point cloud map without the moving target is reconstructed. To improve the accuracy of three-dimensional reconstruction in dynamic scenes, the point cloud data of moving objects must be removed from the reconstructed scene. For example, a master's thesis entitled "Research on Key Technologies of Vision-based Semantic SLAM", published by the Information Engineering University of the Strategic Support Force on October 15, 2018, discloses a three-dimensional reconstruction method with moving object removal, which provides a SLAM mapping method based on a lookup table. The method first segments the image and estimates the eight-neighborhood motion direction of each region, and then uses these motion directions to optimize the map when the scene map is constructed. A method for removing the influence of dynamic targets based on the combination of a lookup table and optical flow is also provided: optical flow is used to detect dynamic targets in the scene, and a lookup table is used to build the scene map. The thesis further studies a deep-learning-based object detection method and realizes an improved SLAM method that removes the influence of dynamic targets using a Faster R-CNN network; the effect of visual SLAM is improved by detecting and eliminating dynamic targets in the scene. Experiments show that the method can effectively identify pedestrians in the scene and eliminate them when the map is constructed. However, this method leaves a large number of holes at the positions of the moving targets; the static background occluded by the moving targets is not repaired, and the accuracy of the three-dimensional reconstruction is still insufficient. How to remove moving targets in a dynamic scene, recover the point cloud of the static background occluded by them, and reconstruct a three-dimensional map of the dynamic scene is an important problem to be solved.
Disclosure of the Invention
The invention aims to overcome the defects in the prior art and provides a three-dimensional reconstruction method based on indoor moving target background restoration, so as to solve the technical problem that a dynamic indoor scene cannot be accurately reconstructed by the prior art.
In order to achieve this purpose, the technical solution adopted by the invention comprises the following steps:
(1) Acquire an RGB image sequence and a depth image sequence of the dynamic indoor scene to be reconstructed:
Continuously shoot the dynamic indoor scene to be reconstructed N times with a depth camera to obtain an RGB image sequence I = {I_1, I_2, ..., I_i, ..., I_N} and a depth image sequence D = {D_1, D_2, ..., D_i, ..., D_N}, where I_i denotes the ith RGB image, D_i denotes the depth image corresponding to I_i, and N > 50;
(2) Acquire the target areas of each RGB image I_i:
(2a) Using the Yolo deep neural network method for target feature extraction and recognition, input each RGB image I_i of the image sequence I into the target detection network Yolo one by one in shooting order to obtain the k detection targets C_i1, C_i2, ..., C_ij, ..., C_ik of I_i, where C_ij denotes the jth detection target of the ith RGB image I_i, C_ij = (x_ij, y_ij, w_ij, h_ij, s_ij, c_ij), (x_ij, y_ij) are the coordinates of the center of C_ij, w_ij and h_ij are the width and height of the pixel region of C_ij, s_ij denotes the target class, and c_ij denotes the confidence of s_ij;
(2b) Mark the rectangular pixel region centered at (x_ij, y_ij) with width w_ij and height h_ij to obtain the target region B_ij of C_ij;
(3) Perform feature matching between adjacent frames of the RGB image sequence I:
(3a) Extract m ORB feature points from the RGB image I_i with the FAST corner detection algorithm and compute the rotation-invariant BRIEF descriptor of each ORB feature point using the rotation-invariant BRIEF descriptor formula, where m > 300;
(3b) Compute the Hamming distance between each rotation-invariant BRIEF descriptor of I_i and each rotation-invariant BRIEF descriptor of I_(i+1), and match the feature points of I_i with those of I_(i+1) using a brute-force matching algorithm to obtain a set of v matching pairs {(p_i1, p_i1'), ..., (p_iu, p_iu'), ..., (p_iv, p_iv')}, where p_i = {p_i1, ..., p_iu, ..., p_iv} is the set of matched ORB feature points of I_i, p_i' = {p_i1', ..., p_iu', ..., p_iv'} is the set of matched ORB feature points of I_(i+1), and 0 ≤ v ≤ m;
(4) Calculate the rotation matrix and translation vector of the depth camera pose transformation of shooting I_(i+1) relative to shooting I_i:
(4a) Fuse p_i with the depth information of the corresponding depth image to obtain the three-dimensional point set pd_i = {pd_i1, ..., pd_iu, ..., pd_iv}, and fuse p_i' with the depth information of its corresponding depth image to obtain the three-dimensional point set pd_i' = {pd_i1', ..., pd_iu', ..., pd_iv'};
(4b) Calculate the centroid coordinates C_i and C_i' of pd_i and pd_i' and the de-centered coordinates q_iu and q_iu' of pd_iu and pd_iu', and calculate the rotation matrix R_i of the depth camera pose transformation of shooting I_(i+1) relative to shooting I_i using the SVD (singular value decomposition) algorithm:
(4c) Calculate the translation vector t_i of the depth camera pose transformation of shooting I_(i+1) relative to shooting I_i:
t_i = C_i − R_i·C_i'
(5) Determine the moving target area in each RGB image I_i:
(5a) Calculate the target matching degree between each target detection result C_ij in I_i and each target detection result C_(i+1)j' in I_(i+1):
f(C_ij, C_(i+1)j') = (x_ij − x_(i+1)j')^2 + (y_ij − y_(i+1)j')^2 + (w_ij − w_(i+1)j')^2 + (c_ij − c_(i+1)j')^2, with s_ij = s_(i+1)j'
When f(C_ij, C_(i+1)j') < 0.5, C_ij and C_(i+1)j' are determined to be the same target;
(5b) For the s pairs of feature matching points located in the target regions B_ij and B_(i+1)j' corresponding to the matched target detection results C_ij and C_(i+1)j', obtain the corresponding three-dimensional point set by the method of step (4a):
{(td_i1, td_i1'), ..., (td_iw, td_iw'), ..., (td_is, td_is')}, where 0 < s < m;
(5c) Calculate the dynamic variation A of the same target region between I_i and I_(i+1); when A > 0.1, mark the corresponding B_ij as a moving target region B_ij'; the calculation formula of A is as follows:
(6) Repair the static background occluded by the moving target in the moving target region B_ij' of each RGB image I_i:
(6a) With step length L = 5, select from the RGB image sequence I the 5 comparison frames I_(i+5), I_(i+10), I_(i+15), I_(i+20) and I_(i+25) after I_i when i < N − 25, or the 5 comparison frames I_(i−5), I_(i−10), I_(i−15), I_(i−20) and I_(i−25) before I_i when N − 25 ≤ i ≤ N;
(6b) Calculate the center coordinates (x_ij, y_ij)' of B_ij' in I_i and the corresponding projection coordinates (x_ij, y_ij)_l' in the 5 comparison frames, where l = 1, 2, 3, 4, 5;
(6c) Perform ORB feature point extraction and matching between B_ij' and the pixel region H_ijl in each of the 5 comparison frames that is centered at the projection coordinates (x_ij, y_ij)_l' and has the same width and height as B_ij'; update the pixel values of B_ij' with the pixel values of the region H_ijl' that has the fewest successfully matched feature point pairs to obtain the background-restored RGB image I_i' of I_i, and at the same time update the pixel values of the depth image region corresponding to B_ij' with those of the depth image region corresponding to H_ijl' to obtain the background-restored depth image D_i' of D_i;
(7) Obtain the three-dimensional reconstruction result:
Fuse each background-restored RGB image I_i' with each background-restored depth image D_i' to obtain N three-dimensional space point clouds Q_1, Q_2, ..., Q_i, ..., Q_N, and iteratively stitch Q_1, Q_2, ..., Q_i, ..., Q_N using the rotation matrices R_i and translation vectors t_i between adjacent frames of I to obtain the global three-dimensional point cloud Q with the moving targets eliminated:
Compared with the prior art, the invention has the following advantages:
First, the invention adopts a background restoration technique: after the moving target in the scene is removed, the static area occluded by it is repaired, so that the three-dimensional reconstruction system can reconstruct the static scene even under the interference of moving objects. This avoids the defect of the prior art that removing a moving target leaves a background hole, and improves the accuracy of the three-dimensional reconstruction result.
Second, the invention adopts the deep neural network Yolo and detects moving targets by combining it with the disparity information between two frames. The moving target detection is accurate and effective, several moving targets in a scene can be detected, and the defects of the prior art that the type of the moving target must be known in advance and that only a single moving target can be detected are avoided.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is a diagram illustrating the effect of background restoration on a moving target area according to the present invention;
FIG. 3 is a simulation comparison graph of the reconstruction results of the present invention and the prior art.
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples.
Referring to fig. 1, the present invention includes the steps of:
Step 1) Acquire an RGB image sequence and a depth image sequence of the dynamic indoor scene to be reconstructed:
Continuously shoot the indoor scene to be reconstructed N times with a depth camera to obtain an RGB image sequence I = {I_1, I_2, ..., I_i, ..., I_N} and a depth image sequence D = {D_1, D_2, ..., D_i, ..., D_N}, where I_i denotes the ith RGB image, D_i denotes the depth image corresponding to I_i, and N > 50; in this embodiment, N = 605;
Step 2) Obtain the target detection network Yolo:
Take the labeled images under the three categories of humans, animals and indoor articles in the PASCAL VOC public dataset as the input of a target detection network Yolo whose backbone network is Darknet-53, and iteratively train the Yolo network until it converges, obtaining a trained target detection network Yolo that can detect common targets in indoor scenes;
Step 3) Obtain the target areas of each RGB image I_i:
Step 3a) Using the Yolo deep neural network method for target feature extraction and recognition, input each RGB image I_i of the image sequence I into the target detection network Yolo one by one in shooting order to obtain the k detection targets C_i1, C_i2, ..., C_ij, ..., C_ik of I_i, where C_ij denotes the jth detection target of the ith RGB image I_i, C_ij = (x_ij, y_ij, w_ij, h_ij, s_ij, c_ij), (x_ij, y_ij) are the coordinates of the center of C_ij, w_ij and h_ij are the width and height of the image pixel region where C_ij is located, s_ij denotes the target class, and c_ij denotes the confidence of s_ij;
Step 3b) Mark the rectangular pixel region centered at (x_ij, y_ij) with width w_ij and height h_ij to obtain the target region B_ij of C_ij;
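For illustration, a minimal Python sketch of the detection data structure and the extraction of a target region B_ij is given below. The fields mirror C_ij = (x_ij, y_ij, w_ij, h_ij, s_ij, c_ij); `run_yolo` is a hypothetical wrapper around whatever Yolo implementation is used and is not part of the patented method.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    x: float   # center x of the bounding box, x_ij (pixels)
    y: float   # center y of the bounding box, y_ij (pixels)
    w: float   # box width w_ij (pixels)
    h: float   # box height h_ij (pixels)
    s: str     # target class label s_ij
    c: float   # confidence c_ij of the class label

def target_region(img, det):
    """Cut the rectangular pixel region B_ij centred at (x, y) with size (w, h)."""
    x0 = max(int(det.x - det.w / 2), 0)
    y0 = max(int(det.y - det.h / 2), 0)
    x1 = min(int(det.x + det.w / 2), img.shape[1])
    y1 = min(int(det.y + det.h / 2), img.shape[0])
    return img[y0:y1, x0:x1], (x0, y0, x1, y1)

# detections_per_frame[i] would hold the k detections C_i1 ... C_ik of frame I_i,
# e.g. detections_per_frame[i] = run_yolo(I[i])   # run_yolo: hypothetical detector call
```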
Step 4) Perform feature matching between adjacent frames of the RGB image sequence I:
Step 4a) Extract m ORB feature points from the RGB image I_i with the FAST corner detection algorithm and compute the rotation-invariant BRIEF descriptor of each ORB feature point using the rotation-invariant BRIEF descriptor formula, where m > 300;
Step 4b) Compute the Hamming distance between each rotation-invariant BRIEF descriptor of I_i and each rotation-invariant BRIEF descriptor of I_(i+1), and match the feature points of I_i with those of I_(i+1) using a brute-force matching algorithm to obtain a set of v matching pairs {(p_i1, p_i1'), ..., (p_iu, p_iu'), ..., (p_iv, p_iv')}, where p_i = {p_i1, ..., p_iu, ..., p_iv} is the set of matched ORB feature points of I_i, p_i' = {p_i1', ..., p_iu', ..., p_iv'} is the set of matched ORB feature points of I_(i+1), and 0 ≤ v ≤ m;
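Steps 4a)-4b) correspond closely to the ORB pipeline available in OpenCV. The following is a minimal sketch, assuming OpenCV is used and that 500 features per frame suffice (the text only requires m > 300); it is an illustration, not the patented implementation itself.

```python
import cv2

def _gray(img):
    """ORB expects a single-channel image; convert if a colour frame is passed in."""
    return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) if img.ndim == 3 else img

def match_orb(img_a, img_b, n_features=500):
    """Extract ORB features (FAST corners + rotated BRIEF descriptors) from two
    adjacent frames and brute-force match them with the Hamming distance."""
    orb = cv2.ORB_create(nfeatures=n_features)
    kp_a, des_a = orb.detectAndCompute(_gray(img_a), None)
    kp_b, des_b = orb.detectAndCompute(_gray(img_b), None)
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)  # keep mutual best matches
    matches = bf.match(des_a, des_b)
    p_i  = [kp_a[m.queryIdx].pt for m in matches]   # matched points in I_i
    p_ip = [kp_b[m.trainIdx].pt for m in matches]   # matched points in I_(i+1)
    return p_i, p_ip
```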
Step 5) Calculate the rotation matrix and translation vector of the depth camera pose transformation of shooting I_(i+1) relative to shooting I_i:
Step 5a) Fuse p_i with the depth information of the corresponding depth image to obtain the three-dimensional point set pd_i = {pd_i1, ..., pd_iu, ..., pd_iv}, and at the same time fuse p_i' with the depth information of its corresponding depth image to obtain the three-dimensional point set pd_i' = {pd_i1', ..., pd_iu', ..., pd_iv'};
Step 5b) Calculate the centroid coordinates C_i and C_i' of pd_i and pd_i', and the de-centered coordinates q_iu and q_iu' of pd_iu and pd_iu':
C_i = (pd_i1 + pd_i2 + ... + pd_iv) / v
C_i' = (pd_i1' + pd_i2' + ... + pd_iv') / v
q_iu = pd_iu − C_i
q_iu' = pd_iu' − C_i'
Step 5c) Calculate the rotation matrix R_i of the depth camera pose transformation of shooting I_(i+1) relative to shooting I_i using the SVD (singular value decomposition) algorithm:
Step 5d) Calculate the translation vector t_i of the depth camera pose transformation of shooting I_(i+1) relative to shooting I_i:
t_i = C_i − R_i·C_i'
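Steps 5a)-5d) amount to back-projecting the matched pixels with the camera intrinsics and solving the rigid transform by the centroid/SVD procedure. A minimal NumPy sketch is given below; the intrinsic parameters fx, fy, cx, cy are assumed to be known from calibration, and the reflection guard is a standard addition not spelled out in the text.

```python
import numpy as np

def backproject(points_2d, depth, fx, fy, cx, cy):
    """Step 5a): fuse matched 2D feature points with the depth image to get 3D points."""
    pts_3d = []
    for u, v in points_2d:
        z = float(depth[int(v), int(u)])           # depth value at the pixel
        pts_3d.append([(u - cx) * z / fx, (v - cy) * z / fy, z])
    return np.asarray(pts_3d)

def rigid_transform(pd_i, pd_ip):
    """Steps 5b)-5d): centroids, de-centred coordinates, SVD rotation and translation
    such that pd_i ≈ R_i · pd_i' + t_i."""
    C_i, C_ip = pd_i.mean(axis=0), pd_ip.mean(axis=0)   # centroid coordinates
    q, qp = pd_i - C_i, pd_ip - C_ip                    # de-centred coordinates
    U, _, Vt = np.linalg.svd(q.T @ qp)                  # SVD of the 3x3 correlation
    R = U @ Vt
    if np.linalg.det(R) < 0:                            # guard against a reflection
        U[:, -1] *= -1
        R = U @ Vt
    t = C_i - R @ C_ip                                  # t_i = C_i - R_i · C_i'
    return R, t
```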
Step 6) Determine the moving target area in each RGB image I_i:
Step 6a) Calculate the target matching degree between each target detection result C_ij in I_i and each target detection result C_(i+1)j' in I_(i+1). The target matching degree combines the class of the Yolo detection result with the disparity information between two adjacent frames: because a moving target cannot move "instantaneously" in three-dimensional space, the center coordinates and the bounding-box width and height of the same target do not change abruptly between two adjacent frames, so whether two detections of the same class in adjacent frames are the same target can be decided by checking their matching degree against a threshold:
f(C_ij, C_(i+1)j') = (x_ij − x_(i+1)j')^2 + (y_ij − y_(i+1)j')^2 + (w_ij − w_(i+1)j')^2 + (c_ij − c_(i+1)j')^2, with s_ij = s_(i+1)j'
When f(C_ij, C_(i+1)j') < 0.5, C_ij and C_(i+1)j' are determined to be the same target;
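As a small illustration of step 6a), the matching degree and the same-target decision can be written as follows; detections are plain (x, y, w, h, s, c) tuples, and 0.5 is the threshold stated above.

```python
def same_target(det_a, det_b, thresh=0.5):
    """Step 6a): decide whether two detections from adjacent frames are the same target."""
    xa, ya, wa, ha, sa, ca = det_a
    xb, yb, wb, hb, sb, cb = det_b
    if sa != sb:                       # class labels s_ij and s_(i+1)j' must agree
        return False
    f = (xa - xb) ** 2 + (ya - yb) ** 2 + (wa - wb) ** 2 + (ca - cb) ** 2
    return f < thresh                  # same target when f(C_ij, C_(i+1)j') < 0.5
```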
Step 6b) Fuse the s pairs of feature matching points located in the target regions B_ij and B_(i+1)j' corresponding to the matched target detection results C_ij and C_(i+1)j' with the corresponding depth image information to obtain the corresponding three-dimensional point set, recorded as:
{(td_i1, td_i1'), ..., (td_iw, td_iw'), ..., (td_is, td_is')}, where 0 < s < m;
Step 6c) Calculate the dynamic variation A of the same target region between I_i and I_(i+1); when A > 0.1, mark B_ij as a moving target region B_ij'; the calculation formula of A is as follows:
Step 7) Repair the static background occluded by the moving target in the moving target region B_ij' of each RGB image I_i:
Step 7a) With step length L = 5, select from the RGB image sequence I the 5 comparison frames I_(i+5), I_(i+10), I_(i+15), I_(i+20) and I_(i+25) after I_i when i < N − 25, or the 5 comparison frames I_(i−5), I_(i−10), I_(i−15), I_(i−20) and I_(i−25) before I_i when N − 25 ≤ i ≤ N;
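The frame-selection rule of step 7a) can be sketched as follows; indices follow the patent's 1-based frame numbering, and the boundary handling is read from the condition above.

```python
def comparison_frames(i, N, L=5, count=5):
    """Step 7a): indices of the 5 comparison frames for frame i (1-based)."""
    if i < N - count * L:
        return [i + L * l for l in range(1, count + 1)]   # I_(i+5), ..., I_(i+25)
    return [i - L * l for l in range(1, count + 1)]        # I_(i-5), ..., I_(i-25)
```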
Step 7b) Calculate the center coordinates (x_ij, y_ij)' of B_ij' in I_i and the corresponding projection coordinates (x_ij, y_ij)_l' in the 5 comparison frames, where l = 1, 2, 3, 4, 5. The projection coordinates are the center coordinates of the region in a comparison frame that shares the same static background as B_ij'. Because the object inside B_ij' is moving, that region of the comparison frame may no longer be occluded by the moving object, so the image at the corresponding position in the comparison frame can be used for background restoration;
Step 7b1) Perform camera calibration on the image sequence I with an SfM algorithm to obtain an initialized intrinsic matrix K of the depth camera, and optimize the initialized intrinsic matrix K by bundle adjustment to obtain the optimized intrinsic matrix K;
Step 7b2) Calculate the camera coordinates (x_ij', y_ij', z_ij') corresponding to (x_ij, y_ij)' in the camera coordinate system:
[x_ij', y_ij', z_ij']^T = z_ij'·K^(−1)·[x_ij, y_ij, 1]^T
Step 7b3) Input (x_ij', y_ij', z_ij') into F and iterate 5 times, the numbers of iterations being 5, 10, 15, 20 and 25 in turn, where the lth iteration gives the projected camera coordinates (x_ij', y_ij', z_ij')_l in the comparison frame I_(i+5l) or I_(i−5l):
Step 7b4) Calculate the coordinates (x_ij, y_ij)_l' of (x_ij', y_ij', z_ij')_l in the image coordinate system as the projection coordinates of (x_ij, y_ij)' in the comparison frame I_(i+5l) or I_(i−5l); whether I_(i+5l) or I_(i−5l) applies depends on the direction in which the comparison frames of the current frame were selected: I_(i+5l) if they were selected after the current frame, I_(i−5l) if before:
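A minimal sketch of steps 7b2)-7b4) follows. The text does not give the explicit form of F, so chaining the inter-frame rigid transforms (R, t) here is an assumption, and depending on the convention of (R_i, t_i) their inverses may be the correct factors.

```python
import numpy as np

def project_center(center, depth_val, K, poses):
    """Back-project the centre (x_ij, y_ij)' of B_ij' with the intrinsic matrix K,
    chain the inter-frame pose transforms up to the chosen comparison frame, and
    re-project to obtain the projection coordinates (x_ij, y_ij)_l'."""
    u, v = center
    p_cam = depth_val * np.linalg.inv(K) @ np.array([u, v, 1.0])  # camera coordinates
    for R, t in poses:
        # Depending on the convention of (R_i, t_i), the inverse transform
        # (R^T, -R^T t) may be required here instead.
        p_cam = R @ p_cam + t
    uvw = K @ p_cam                                               # re-project with K
    return uvw[0] / uvw[2], uvw[1] / uvw[2]                       # image coordinates
```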
Step 7c) Perform ORB feature point extraction and matching between B_ij' and the pixel region H_ijl in each of the 5 comparison frames that is centered at the projection coordinates (x_ij, y_ij)_l' and has the same width and height as B_ij'; update the pixel values of B_ij' with the pixel values of the region H_ijl' that has the fewest successfully matched feature point pairs to obtain the background-restored RGB image I_i' of I_i, and update the pixel values of the depth image region corresponding to B_ij' with those of the depth image region corresponding to H_ijl' to obtain the background-restored depth image D_i' of D_i;
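Step 7c) can be sketched as below, again assuming OpenCV: the candidate patch with the fewest ORB matches to the moving-target region is taken as H_ijl' and copied into both the RGB and the depth image.

```python
import cv2

def orb_match_count(patch_a, patch_b):
    """Number of successfully matched ORB feature point pairs between two patches."""
    to_gray = lambda im: cv2.cvtColor(im, cv2.COLOR_BGR2GRAY) if im.ndim == 3 else im
    orb = cv2.ORB_create()
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    _, da = orb.detectAndCompute(to_gray(patch_a), None)
    _, db = orb.detectAndCompute(to_gray(patch_b), None)
    if da is None or db is None:
        return 0
    return len(bf.match(da, db))

def repair_region(rgb_i, depth_i, box, patches):
    """Step 7c): overwrite the moving-target region with the comparison patch H_ijl'
    that has the fewest ORB matches with it, in both the RGB and the depth image.
    'patches' is a list of (rgb_patch, depth_patch) cut from the 5 comparison frames."""
    x0, y0, x1, y1 = box
    region = rgb_i[y0:y1, x0:x1]
    counts = [orb_match_count(region, p_rgb) for p_rgb, _ in patches]
    best = counts.index(min(counts))
    rgb_i[y0:y1, x0:x1] = patches[best][0]      # background-restored RGB image I_i'
    depth_i[y0:y1, x0:x1] = patches[best][1]    # background-restored depth image D_i'
    return rgb_i, depth_i
```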
Step 8) Obtain the three-dimensional reconstruction result:
Fuse each background-restored RGB image I_i' with each background-restored depth image D_i' to obtain N three-dimensional space point clouds Q_1, Q_2, ..., Q_i, ..., Q_N, and iteratively stitch Q_1, Q_2, ..., Q_i, ..., Q_N using the rotation matrices R_i and translation vectors t_i between adjacent frames of I to obtain the global three-dimensional point cloud Q with the moving targets eliminated:
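For completeness, a minimal NumPy sketch of step 8) is shown below: each restored RGB-D pair is back-projected into a coloured point cloud with the intrinsic matrix K, and the clouds are stitched by accumulating the inter-frame (R_i, t_i). The direction in which the poses are accumulated is an assumption about the convention used above.

```python
import numpy as np

def frame_to_cloud(rgb, depth, K):
    """Back-project every valid pixel of a restored frame into a coloured 3D point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    valid = z > 0
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts = np.stack([x[valid], y[valid], z[valid]], axis=1)
    cols = rgb[valid]
    return pts, cols

def stitch(clouds, transforms):
    """Step 8): map every cloud Q_i into the first frame's coordinate system by
    accumulating the inter-frame (R_i, t_i) and concatenate into the global cloud Q."""
    R_acc, t_acc = np.eye(3), np.zeros(3)
    all_pts, all_cols = [clouds[0][0]], [clouds[0][1]]
    for (pts, cols), (R_i, t_i) in zip(clouds[1:], transforms):
        R_acc, t_acc = R_acc @ R_i, R_acc @ t_i + t_acc   # compose the poses
        all_pts.append(pts @ R_acc.T + t_acc)
        all_cols.append(cols)
    return np.vstack(all_pts), np.vstack(all_cols)
```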
The technical effects of the invention are further explained below in combination with simulation experiments.
1. Experimental conditions and contents:
Experimental conditions: the experiment is carried out on a machine with Ubuntu 16.04, 32 GB of memory, an Intel E5-2620 dual-core processor and a GTX 1080Ti GPU. The input is the image sequence (605 frames, 768 × 574) of the "freiburg3_walking_xyz" dataset, which contains two moving human targets.
Experimental contents: in this experiment, the image sequence freiburg3_walking_xyz (605 frames, 768 × 574) is used as input, and three-dimensional point cloud reconstruction is performed on it with the method provided by the present invention and with the existing image-sequence-based three-dimensional point cloud reconstruction method; the results are shown in fig. 3.
2. Analysis of experimental results:
Referring to fig. 2, fig. 2(a) is an image containing a moving human target, and fig. 2(b) is an image of the same scene after background restoration. Referring to fig. 3, fig. 3(a) is the indoor three-dimensional point cloud model reconstructed by the existing three-dimensional reconstruction method, and fig. 3(b) is the indoor three-dimensional point cloud model reconstructed by the three-dimensional reconstruction method of the present invention. In fig. 3(a), because the dynamic object is not removed, a large number of three-dimensional points of the moving human body appear in the reconstruction result. Fig. 3(b) uses the background-restored images for reconstruction, removes the moving human body from the indoor scene, restores the static background occluded by it, and improves the accuracy of three-dimensional reconstruction in the indoor dynamic scene.

Claims (3)

1. A three-dimensional reconstruction method based on indoor moving target background restoration, characterized by comprising the following steps:
(1) Acquire an RGB image sequence and a depth image sequence of the dynamic indoor scene to be reconstructed:
Continuously shoot the dynamic indoor scene to be reconstructed N times with a depth camera to obtain an RGB image sequence I = {I_1, I_2, ..., I_i, ..., I_N} and a depth image sequence D = {D_1, D_2, ..., D_i, ..., D_N}, where I_i denotes the ith RGB image, D_i denotes the depth image corresponding to I_i, and N > 50;
(2) Acquire the target areas of each RGB image I_i:
(2a) Using the Yolo deep neural network method for target feature extraction and recognition, input each RGB image I_i of the image sequence I into the target detection network Yolo one by one in shooting order to obtain the k detection targets C_i1, C_i2, ..., C_ij, ..., C_ik of I_i, where C_ij denotes the jth detection target of the ith RGB image I_i, C_ij = (x_ij, y_ij, w_ij, h_ij, s_ij, c_ij), (x_ij, y_ij) are the coordinates of the center of C_ij, w_ij and h_ij are the width and height of the pixel region of C_ij, s_ij denotes the target class, and c_ij denotes the confidence of s_ij;
(2b) Mark the rectangular pixel region centered at (x_ij, y_ij) with width w_ij and height h_ij to obtain the target region B_ij of C_ij;
(3) Perform feature matching between adjacent frames of the RGB image sequence I:
(3a) Extract m ORB feature points from the RGB image I_i with the FAST corner detection algorithm and compute the rotation-invariant BRIEF descriptor of each ORB feature point using the rotation-invariant BRIEF descriptor formula, where m > 300;
(3b) Compute the Hamming distance between each rotation-invariant BRIEF descriptor of I_i and each rotation-invariant BRIEF descriptor of I_(i+1), and match the feature points of I_i with those of I_(i+1) using a brute-force matching algorithm to obtain a set of v matching pairs {(p_i1, p_i1'), ..., (p_iu, p_iu'), ..., (p_iv, p_iv')}, where p_i = {p_i1, ..., p_iu, ..., p_iv} is the set of matched ORB feature points of I_i, p_i' = {p_i1', ..., p_iu', ..., p_iv'} is the set of matched ORB feature points of I_(i+1), and 0 ≤ v ≤ m;
(4) Calculate the rotation matrix and translation vector of the depth camera pose transformation of shooting I_(i+1) relative to shooting I_i:
(4a) Fuse p_i with the depth information of the corresponding depth image to obtain the three-dimensional point set pd_i = {pd_i1, ..., pd_iu, ..., pd_iv}, and fuse p_i' with the depth information of its corresponding depth image to obtain the three-dimensional point set pd_i' = {pd_i1', ..., pd_iu', ..., pd_iv'};
(4b) Calculate the centroid coordinates C_i and C_i' of pd_i and pd_i' and the de-centered coordinates q_iu and q_iu' of pd_iu and pd_iu', and calculate the rotation matrix R_i of the depth camera pose transformation of shooting I_(i+1) relative to shooting I_i using the SVD (singular value decomposition) algorithm:
(4c) Calculate the translation vector t_i of the depth camera pose transformation of shooting I_(i+1) relative to shooting I_i:
t_i = C_i − R_i·C_i'
(5) Determine the moving target area in each RGB image I_i:
(5a) Calculate the target matching degree between each target detection result C_ij in I_i and each target detection result C_(i+1)j' in I_(i+1):
f(C_ij, C_(i+1)j') = (x_ij − x_(i+1)j')^2 + (y_ij − y_(i+1)j')^2 + (w_ij − w_(i+1)j')^2 + (c_ij − c_(i+1)j')^2, with s_ij = s_(i+1)j'
When f(C_ij, C_(i+1)j') < 0.5, C_ij and C_(i+1)j' are determined to be the same target;
(5b) Fuse the s pairs of feature matching points located in the target regions B_ij and B_(i+1)j' corresponding to the matched target detection results C_ij and C_(i+1)j' with the corresponding depth image information to obtain the corresponding three-dimensional point set:
{(td_i1, td_i1'), ..., (td_iw, td_iw'), ..., (td_is, td_is')}, where 0 < s < m;
(5c) Calculate the dynamic variation A of the same target region between I_i and I_(i+1); when A > 0.1, mark the corresponding B_ij as a moving target region B_ij'; the calculation formula of A is as follows:
(6) Repair the static background occluded by the moving target in the moving target region B_ij' of each RGB image I_i:
(6a) With step length L = 5, select from the RGB image sequence I the 5 comparison frames I_(i+5), I_(i+10), I_(i+15), I_(i+20) and I_(i+25) after I_i when i < N − 25, or the 5 comparison frames I_(i−5), I_(i−10), I_(i−15), I_(i−20) and I_(i−25) before I_i when N − 25 ≤ i ≤ N;
(6b) Calculate the center coordinates (x_ij, y_ij)' of B_ij' in I_i and the corresponding projection coordinates (x_ij, y_ij)_l' in the 5 comparison frames, where l = 1, 2, 3, 4, 5;
(6c) Perform ORB feature point extraction and matching between B_ij' and the pixel region H_ijl in each of the 5 comparison frames that is centered at the projection coordinates (x_ij, y_ij)_l' and has the same width and height as B_ij'; update the pixel values of B_ij' with the pixel values of the region H_ijl' that has the fewest successfully matched feature point pairs to obtain the background-restored RGB image I_i' of I_i, and update the pixel values of the depth image region corresponding to B_ij' with those of the depth image region corresponding to H_ijl' to obtain the background-restored depth image D_i' of D_i;
(7) Obtain the three-dimensional reconstruction result:
Fuse each background-restored RGB image I_i' with each background-restored depth image D_i' to obtain N three-dimensional space point clouds Q_1, Q_2, ..., Q_i, ..., Q_N, and iteratively stitch Q_1, Q_2, ..., Q_i, ..., Q_N using the rotation matrices R_i and translation vectors t_i between adjacent frames of I to obtain the global three-dimensional point cloud Q with the moving targets eliminated:
2. The three-dimensional reconstruction method based on indoor moving target background restoration according to claim 1, characterized in that the centroid coordinates C_i and C_i' of pd_i and pd_i' and the de-centered coordinates q_iu and q_iu' of pd_iu and pd_iu' in step (4b) are calculated respectively as:
C_i = (pd_i1 + pd_i2 + ... + pd_iv) / v
C_i' = (pd_i1' + pd_i2' + ... + pd_iv') / v
q_iu = pd_iu − C_i
q_iu' = pd_iu' − C_i'
3. The three-dimensional reconstruction method based on indoor moving target background restoration according to claim 1, characterized in that the projection coordinates (x_ij, y_ij)_l' in step (6b) are calculated by the following steps:
(6b1) Perform camera calibration on the image sequence I with an SfM algorithm to obtain an initialized intrinsic matrix of the depth camera, and optimize the initialized intrinsic matrix by bundle adjustment to obtain the optimized intrinsic matrix;
(6b2) Convert (x_ij, y_ij)' into the corresponding camera coordinates (x_ij', y_ij', z_ij') in the camera coordinate system using the intrinsic matrix;
(6b3) Input (x_ij', y_ij', z_ij') into F and iterate 5 times, the numbers of iterations being 5, 10, 15, 20 and 25 in turn, where the lth iteration gives the projected camera coordinates (x_ij', y_ij', z_ij')_l in the comparison frame I_(i+5l) or I_(i−5l):
(6b4) Convert (x_ij', y_ij', z_ij')_l into the image coordinates (x_ij, y_ij)_l' in the corresponding image coordinate system using the intrinsic matrix, as the projection coordinates of (x_ij, y_ij)' in the comparison frame I_(i+5l) or I_(i−5l).
CN201910799527.8A 2019-08-28 2019-08-28 Three-dimensional reconstruction method based on indoor moving target background restoration Active CN110555908B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910799527.8A CN110555908B (en) 2019-08-28 2019-08-28 Three-dimensional reconstruction method based on indoor moving target background restoration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910799527.8A CN110555908B (en) 2019-08-28 2019-08-28 Three-dimensional reconstruction method based on indoor moving target background restoration

Publications (2)

Publication Number Publication Date
CN110555908A true CN110555908A (en) 2019-12-10
CN110555908B CN110555908B (en) 2022-12-02

Family

ID=68738106

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910799527.8A Active CN110555908B (en) 2019-08-28 2019-08-28 Three-dimensional reconstruction method based on indoor moving target background restoration

Country Status (1)

Country Link
CN (1) CN110555908B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170019653A1 (en) * 2014-04-08 2017-01-19 Sun Yat-Sen University Non-feature extraction-based dense sfm three-dimensional reconstruction method
CN105825518A (en) * 2016-03-31 2016-08-03 西安电子科技大学 Sequence image rapid three-dimensional reconstruction method based on mobile platform shooting
CN107292965A (en) * 2017-08-03 2017-10-24 北京航空航天大学青岛研究院 A kind of mutual occlusion processing method based on depth image data stream
CN108460779A (en) * 2018-02-12 2018-08-28 浙江大学 A kind of mobile robot image vision localization method under dynamic environment

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111444811A (en) * 2020-03-23 2020-07-24 复旦大学 Method for detecting three-dimensional point cloud target
CN111444811B (en) * 2020-03-23 2023-04-28 复旦大学 Three-dimensional point cloud target detection method
TWI808321B (en) * 2020-05-06 2023-07-11 圓展科技股份有限公司 Object transparency changing method for image display and document camera
CN111709982B (en) * 2020-05-22 2022-08-26 浙江四点灵机器人股份有限公司 Three-dimensional reconstruction method for dynamic environment
CN111709982A (en) * 2020-05-22 2020-09-25 浙江四点灵机器人股份有限公司 Three-dimensional reconstruction method for dynamic environment
CN111627061A (en) * 2020-06-03 2020-09-04 贝壳技术有限公司 Pose detection method and device, electronic equipment and storage medium
CN112116534A (en) * 2020-08-07 2020-12-22 贵州电网有限责任公司 Ghost eliminating method based on position information
CN112509115B (en) * 2020-11-26 2021-09-07 中国人民解放军战略支援部队信息工程大学 Three-dimensional time-varying unconstrained reconstruction method and system for dynamic scene of sequence image
CN112509115A (en) * 2020-11-26 2021-03-16 中国人民解放军战略支援部队信息工程大学 Three-dimensional time-varying unconstrained reconstruction method and system for dynamic scene of sequence image
CN112419305A (en) * 2020-12-09 2021-02-26 深圳云天励飞技术股份有限公司 Face illumination quality detection method and device, electronic equipment and storage medium
CN112419305B (en) * 2020-12-09 2024-06-11 深圳云天励飞技术股份有限公司 Face illumination quality detection method and device, electronic equipment and storage medium
CN114913064A (en) * 2022-03-15 2022-08-16 天津理工大学 Large parallax image splicing method and device based on structure keeping and many-to-many matching
CN114913064B (en) * 2022-03-15 2024-07-02 天津理工大学 Large parallax image splicing method and device based on structure maintenance and many-to-many matching

Also Published As

Publication number Publication date
CN110555908B (en) 2022-12-02

Similar Documents

Publication Publication Date Title
CN110555908B (en) Three-dimensional reconstruction method based on indoor moving target background restoration
KR102647351B1 (en) Modeling method and modeling apparatus using 3d point cloud
US10977818B2 (en) Machine learning based model localization system
CN111063021B (en) Method and device for establishing three-dimensional reconstruction model of space moving target
Schindler et al. Line-based structure from motion for urban environments
CN113012293B (en) Stone carving model construction method, device, equipment and storage medium
US20110249865A1 (en) Apparatus, method and computer-readable medium providing marker-less motion capture of human
Tabb Shape from silhouette probability maps: reconstruction of thin objects in the presence of silhouette extraction and calibration error
CN111080776B (en) Human body action three-dimensional data acquisition and reproduction processing method and system
CN110751097A (en) Semi-supervised three-dimensional point cloud gesture key point detection method
Laycock et al. Aligning archive maps and extracting footprints for analysis of historic urban environments
CN112613123A (en) AR three-dimensional registration method and device for aircraft pipeline
CN114862973A (en) Space positioning method, device and equipment based on fixed point location and storage medium
CN117274515A (en) Visual SLAM method and system based on ORB and NeRF mapping
Rao et al. Extreme feature regions detection and accurate quality assessment for point-cloud 3D reconstruction
CN110458177B (en) Method for acquiring image depth information, image processing device and storage medium
Mo et al. Soft-aligned gradient-chaining network for height estimation from single aerial images
Dong et al. Learning stratified 3D reconstruction
CN112766313B (en) Crystal segmentation and positioning method, device, equipment and medium based on U-net structure
CN112950787B (en) Target object three-dimensional point cloud generation method based on image sequence
Wang et al. Stratification approach for 3-d euclidean reconstruction of nonrigid objects from uncalibrated image sequences
CN113470073A (en) Animal center tracking method based on deep learning
Seoud et al. Increasing the robustness of CNN-based human body segmentation in range images by modeling sensor-specific artifacts
Li et al. Research on MR virtual scene location method based on image recognition
Zhu et al. Toward the ghosting phenomenon in a stereo-based map with a collaborative RGB-D repair

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant