WO2021241994A1 - Method and apparatus for generating a 3D model through tracking of an RGB-D camera
- Publication number
- WO2021241994A1 (PCT/KR2021/006524)
- Authority
- WO
- WIPO (PCT)
Classifications (all under G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL)
- G06T17/20—Finite element generation, e.g. wire-frame surface description, tesselation
- G06T7/74—Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06T7/13—Edge detection
- G06T7/248—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments, involving reference images or patches
- G06T7/292—Multi-camera tracking
- G06T7/579—Depth or shape recovery from multiple images from motion
- G06T7/593—Depth or shape recovery from multiple images from stereo images
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/90—Determination of colour characteristics
- G06T2207/10016—Video; Image sequence
- G06T2207/10024—Color image
- G06T2207/10028—Range image; Depth image; 3D point clouds
- G06T2207/30244—Camera pose
Definitions
- The present invention relates to a method and apparatus for generating a 3D model through tracking of an RGB-D camera, and more particularly to a technique for performing camera tracking accurately and efficiently by estimating a camera pose and determining key frames using RGB-D data obtained through an RGB-D camera.
- Techniques frequently used for camera tracking include DVO SLAM (Direct Visual Odometry Simultaneous Localization and Mapping) and ORB-SLAM.
- The present invention provides a method and apparatus capable of tracking a camera accurately and efficiently by estimating a camera pose and determining key frames using RGB-D data acquired through an RGB-D camera.
- The present invention determines a camera pose by comparing points that correspond to the same physical point in different frames of the RGB-D data, and performs camera tracking efficiently by selecting key vertices from among the vertices representing the determined camera poses.
- A method for generating a 3D model includes: identifying an RGB image and a depth image obtained from an RGB-D camera; designating an initial camera pose of the RGB-D camera for each original frame of the depth image and the RGB image; determining a final camera pose for each original frame by updating the initial camera pose using the difference between the original frame transformed according to the initial camera pose and the other of two temporally adjacent original frames in the depth image and the RGB image; selecting key vertices capable of covering all vertices from among the original vertices corresponding to each original frame of the depth image and the RGB image; and generating a 3D model using the key frames corresponding to the key vertices among the original frames of the depth image and the RGB image and the final camera poses corresponding to those key frames.
- For two adjacent original frames, the initial camera pose may indicate the degree to which the position and angle of the RGB-D camera have changed from the viewpoint of the previous original frame to the viewpoint of the current original frame.
- In determining the final camera pose, the final camera pose may be determined by modifying the initial camera pose so as to minimize the difference between the 3D points included in the original frame transformed according to the initial camera pose and the 3D points included in the other of the two adjacent original frames of the depth image.
- In determining the final camera pose, the final camera pose may be determined by modifying the initial camera pose so as to minimize the difference between pixels included in the original frame transformed according to the initial camera pose and pixels included in the other of the two adjacent original frames of the RGB image.
- In selecting the key vertices, the minimum set of vertices that can be connected to all vertices through edges, which indicate the presence or absence of a visual correlation between original frames of the depth image and the RGB image, may be selected as key vertices from among the original vertices corresponding to each original frame of the depth image and the RGB image.
- The method may further include determining original frames forming a loop-closing pair in consideration of the similarity between the original frames of the depth image and the RGB image, and updating the final camera poses based on the relative camera poses of the original frames forming the loop-closing pair.
- A method for generating a 3D model includes: identifying an RGB image and a depth image obtained from an RGB-D camera; designating an initial camera pose of the RGB-D camera for each original frame of the depth image and the RGB image, where the initial camera pose indicates the degree to which the position and angle of the RGB-D camera have changed from the viewpoint of the previous original frame to the viewpoint of the current original frame;
- determining a first twist for the registration of two adjacent original frames of the depth image by comparing the 3D points included in the current original frame transformed according to the initial camera pose with the 3D points included in the previous original frame;
- determining a second twist for pixel intensity by comparing the pixel intensity difference between pixels included in the current original frame of the RGB image transformed according to the initial camera pose and pixels included in the previous original frame; determining a final camera pose for each original frame of the depth image and the RGB image based on the first twist and the second twist; selecting, from among the original vertices corresponding to each original frame, key vertices capable of covering all of the vertices; and generating a 3D model using the key frames corresponding to the key vertices and the final camera poses corresponding to those key frames.
- In determining the first twist, the first twist may be updated so that the difference between the 3D points of the current original frame of the depth image transformed according to the initial camera pose and the 3D points of the previous original frame of the depth image is minimized.
- In determining the second twist, the second twist may be updated so that the pixel intensity difference between pixels of the current original frame of the RGB image transformed according to the initial camera pose and pixels of the previous original frame of the RGB image is minimized.
- The method may further include determining bridging vertices, from among the original vertices, that allow the key vertices to be connected through the edges; the generating of the 3D model may then include generating the 3D model using the key frames corresponding to the key vertices and the bridging vertices among the original frames of the depth image and the RGB image, and the final camera poses corresponding to those key frames.
- The method may further include determining original frames forming a loop-closing pair in consideration of the similarity between the original frames of the depth image and the RGB image, and updating the final camera poses based on the relative camera poses of the original frames forming the loop-closing pair.
- An apparatus for generating a 3D model includes a processor, wherein the processor identifies an RGB image and a depth image obtained from an RGB-D camera, designates an initial camera pose of the RGB-D camera for each original frame of the depth image and the RGB image, determines a final camera pose for each original frame by updating the initial camera pose using the difference between the original frame transformed according to the initial camera pose and the other of two temporally adjacent original frames in the depth image and the RGB image, selects key vertices capable of covering all vertices from among the original vertices corresponding to each original frame, and generates a 3D model using the key frames corresponding to the key vertices among the original frames of the depth image and the RGB image and the final camera poses corresponding to those key frames.
- The processor may determine the final camera pose by modifying the initial camera pose so as to minimize the difference between the 3D points included in the original frame transformed according to the initial camera pose and the 3D points included in the other of the two adjacent original frames of the depth image.
- The processor may determine the final camera pose by modifying the initial camera pose so as to minimize the difference between pixels included in the original frame transformed according to the initial camera pose and pixels included in the other of the two adjacent original frames of the RGB image.
- The processor may select, as key vertices, the minimum number of vertices that can be connected to all vertices through the edges indicating the presence or absence of visual correlation between original frames of the depth image and the RGB image, from among the original vertices corresponding to each original frame of the depth image and the RGB image.
- An apparatus for generating a 3D model includes a processor, wherein the processor identifies an RGB image and a depth image obtained from an RGB-D camera, designates an initial camera pose of the RGB-D camera for each original frame of the depth image and the RGB image, determines a first twist for the registration of two adjacent original frames of the depth image by comparing the 3D points included in the current original frame transformed according to the initial camera pose with the 3D points included in the previous original frame, determines a second twist for pixel intensity by comparing the pixel intensity difference between pixels included in the current original frame of the RGB image transformed according to the initial camera pose and pixels included in the previous original frame, determines a final camera pose for each original frame of the depth image and the RGB image based on the first twist and the second twist, and selects, as key vertices, the minimum set of vertices that can be connected to all of the vertices through the edges indicating the presence or absence of visual association between original frames, from among the original vertices corresponding to each original frame.
- The initial camera pose may indicate the degree to which the position and angle of the RGB-D camera have changed from the viewpoint of the previous original frame to the viewpoint of the current original frame.
- The processor may determine, from among the original vertices, bridging vertices that allow the key vertices to be connected through the edges, and may generate the 3D model using the key frames corresponding to the key vertices and the bridging vertices among the original frames of the depth image and the RGB image, and the final camera poses corresponding to those key frames.
- The processor may determine original frames forming a loop-closing pair in consideration of the similarity between original frames of the depth image and the RGB image, and may update the final camera poses based on the relative camera poses of the original frames forming the loop-closing pair.
- According to the present invention, a camera pose is determined by comparing points that correspond to the same physical point in different frames of RGB-D data, and key vertices are selected from among the vertices representing the determined camera poses, so that the camera can be tracked efficiently.
- FIG. 1 is a diagram illustrating an example of an input/output target of an apparatus for generating a 3D model according to an embodiment of the present invention.
- FIG. 2 is a diagram illustrating a process of generating a 3D model through a 3D model generating method according to an embodiment of the present invention.
- FIG. 3 is a diagram illustrating a process of generating a 3D model according to movement of an RGB-D camera according to an embodiment of the present invention.
- FIGS. 4A and 4B are diagrams illustrating vertices indicating a determined camera pose according to an embodiment of the present invention.
- FIGS. 5A and 5B are diagrams illustrating a plurality of key vertices and bridging vertices connecting them according to an embodiment of the present invention.
- FIG. 6 is a flowchart illustrating a 3D model generation method according to an embodiment of the present invention.
- FIG. 1 is a diagram illustrating an example of an input/output target of a 3D model generating apparatus 101 according to an embodiment of the present invention.
- The present invention is directed to a 3D model generation method capable of accurately performing image-based visual odometry (hereinafter 'VO') of the RGB-D camera 102 and selecting appropriate key frames from a sequence of original frames, and to a 3D model generating apparatus 101 for performing the method.
- the present invention automatically determines a weight parameter for combining the texture of the original frame with the depth map in RGB-D data.
- the present invention proposes an adaptive VO for camera pose estimation.
- the camera pose means the degree to which the position and angle of the RGB-D camera 102 are changed.
- The camera pose may be expressed as a 4x4 homogeneous transformation matrix. Then, to select appropriate key frames from the sequence of original frames, the present invention selects key frames using integer programming based on the Set Covering Problem, and determines optimal key frames by generating a similarity matrix between original frames.
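The set-cover formulation above can be approximated greedily. The sketch below is illustrative only: the patent specifies integer programming, so the greedy algorithm, the `adjacency` input format, and the function name are our assumptions. It shows how key vertices covering all original vertices through visual-correlation edges might be chosen:

```python
def select_key_vertices(adjacency):
    """Greedy set-cover approximation: pick few vertices whose
    visual-correlation edges cover every original vertex.

    adjacency: dict mapping each vertex id to the set of vertices it is
    visually correlated with (including itself).
    """
    uncovered = set(adjacency)          # every original vertex must be covered
    key_vertices = []
    while uncovered:
        # choose the vertex whose edges cover the most still-uncovered vertices
        best = max(adjacency, key=lambda v: len(adjacency[v] & uncovered))
        key_vertices.append(best)
        uncovered -= adjacency[best]
    return key_vertices
```

The greedy heuristic gives a logarithmic-factor approximation of the optimal cover; an exact integer program, as named in the text, would guarantee the minimum set.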
- the 3D model generation method of the present invention consists of independently processed front-end and back-end processes.
- the front-end is a process of estimating a camera trajectory by comparing corresponding pixels between two consecutive original frames.
- the back-end process is a process for handling drift errors accumulated in the front-end process.
- the 3D model generation method consists of a tracking process corresponding to the front-end and an optimization process corresponding to the back-end process.
- the 3D model generating apparatus 101 estimates a camera pose for each original frame.
- the 3D model generating apparatus 101 identifies key frames and corrects the camera poses of the key frames.
- the 3D model generating apparatus 101 may generate a 3D model as a result of camera tracking from a key frame finally determined through pose graph optimization and a camera pose of the key frame.
- RGB-D data input from a moving or rotating RGB-D camera 102 to a 3D model generating apparatus 101 of the present invention includes an RGB image and a depth image.
- the depth image is an image in which each of a plurality of original frames is a depth map.
- the 3D model generating apparatus 101 may generate a 3D model from RGB-D data through camera tracking.
- the 3D model generating apparatus 101 includes a processor, and the processor performs the 3D model generating method of the present invention.
- FIG. 2 is a diagram illustrating a process of generating a 3D model through a 3D model generating method according to an embodiment of the present invention.
- the 3D model generating apparatus 101 identifies the RGB image and the depth image received from the RGB-D camera. In the tracking process 210 , the 3D model generating apparatus 101 determines an initial camera pose of the RGB-D camera for each viewpoint corresponding to each original frame of the depth image or the RGB image.
- the initial camera pose indicates the degree of change in the position and angle of the RGB-D camera from the viewpoint of the previous original frame to the viewpoint of the current original frame.
- the previous original frame and the current original frame are adjacent frames
- the original frame of the depth image and the original frame of the RGB image are composed of original frames corresponding to the same time.
- In step 211, the 3D model generating apparatus 101 determines a final camera pose for each original frame by using the difference between two temporally adjacent original frames in the depth image and the RGB image.
- The 3D model generating apparatus 101 estimates a first twist with respect to the registration of two temporally adjacent original frames in consideration of the iterative closest point (ICP) algorithm in the depth image.
- The first twist is an element of the Lie algebra se(3) of the Special Euclidean Group SE(3); it represents a rigid motion of the RGB-D camera and may be expressed as a 6x1 velocity vector.
- the first twist means a twist determined according to ICP, is determined corresponding to an initial camera pose, and is updated according to a difference between two adjacent original frames. Then, the updated first twist is used to determine the final camera pose.
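As a concrete illustration of how such a 6x1 twist is mapped to a 4x4 homogeneous transform, the following sketch implements the closed-form exponential map for SE(3) using the standard Rodrigues formulas; the function names and the (v, w) ordering of the twist are our conventions, not the patent's:

```python
import numpy as np

def hat(w):
    """Skew-symmetric (hat) matrix of a 3-vector."""
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0]])

def se3_exp(xi):
    """Map a 6x1 twist xi = (v, w) in se(3) to a 4x4 homogeneous
    transform in SE(3) via the closed-form matrix exponential."""
    v, w = np.asarray(xi[:3], float), np.asarray(xi[3:], float)
    theta = np.linalg.norm(w)
    T = np.eye(4)
    if theta < 1e-12:                       # pure translation
        T[:3, 3] = v
        return T
    W = hat(w / theta)                      # unit-axis skew matrix
    # Rodrigues rotation formula
    R = np.eye(3) + np.sin(theta) * W + (1 - np.cos(theta)) * (W @ W)
    # left Jacobian, which couples translation with rotation
    V = (np.eye(3) + (1 - np.cos(theta)) / theta * W
         + (theta - np.sin(theta)) / theta * (W @ W))
    T[:3, :3] = R
    T[:3, 3] = V @ v
    return T
```

For example, a twist with zero rotation yields a pure translation, and a twist with w = (0, 0, pi/2) yields a 90-degree rotation about the z axis.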
- the original frame of the depth image is composed of 3D points.
- the 3D model generating apparatus 101 calculates a point-to-plane ICP residual between 3D points included in two adjacent original frames.
- The 3D model generating apparatus 101 determines the point-to-plane ICP residual of Equation 1 by comparing the 3D points of the current original frame transformed according to the initial camera pose with the 3D points of the previous original frame.
- the 3D point of the current original frame to be compared and the 3D point of the previous original frame are points pointing to the same point in the generated 3D model or the real coordinate system.
- Equation 1 defines the point-to-plane ICP residual:

  r_k = n_k · (exp(ξ̂_i) C X'_k − X_k) ... (Equation 1)

- Here, C denotes the arbitrarily determined initial camera pose, and ξ_i is the first twist representing the rigid motion of the RGB-D camera, whose exponential map exp(ξ̂_i) belongs to the Special Euclidean Group SE(3).
- n_k denotes a normal vector, k indexes the 3D points, and X_k denotes the k-th 3D point of the previous original frame of the depth image.
- X'_k denotes the 3D point corresponding to X_k in the current original frame of the depth image.
- exp denotes the exponential map. That is, the more accurate the determined initial camera pose is, the closer the point-to-plane ICP residual of the transformed point is to zero.
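A vectorized sketch of the point-to-plane ICP residual described above (NumPy, with assumed array shapes; the correspondences X_k to X'_k and the normals are taken as given, and the function name is ours):

```python
import numpy as np

def point_to_plane_residuals(T, X_prev, X_curr, normals):
    """Point-to-plane ICP residuals r_k = n_k . (T @ X'_k - X_k).

    T       : 4x4 estimated camera pose applied to the current frame
    X_prev  : (N, 3) 3D points of the previous depth frame
    X_curr  : (N, 3) corresponding 3D points of the current depth frame
    normals : (N, 3) surface normals at the previous-frame points
    """
    # lift current points to homogeneous coordinates and transform them
    X_curr_h = np.hstack([X_curr, np.ones((len(X_curr), 1))])
    X_trans = (X_curr_h @ T.T)[:, :3]
    # project each point difference onto the surface normal
    return np.einsum('ij,ij->i', normals, X_trans - X_prev)
```

With an accurate pose the transformed current points land on the previous surface and the residuals approach zero, matching the text's observation.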
- The 3D model generating apparatus 101 may use the iteratively re-weighted least squares (IRLS) algorithm to determine the first twist with respect to the registration of two adjacent original frames of the depth image.
- Specifically, an ICP cost function including a weight function with a predetermined weight parameter is defined according to Equation 2 below, and the first twist that minimizes the cost is determined while the weight parameter is modified.
- In Equation 2, w(·) denotes a weight function for minimizing the influence of incorrectly matched 3D points, and includes the weight parameter:

  E_i = Σ_k w(r_k) r_k² ... (Equation 2)

- E_i denotes the ICP cost function; it is minimized by updating the first twist so that the weighted sum of the point-to-plane ICP residuals is minimized.
- The 3D model generating apparatus 101 updates the first twist so that the result of the ICP cost function is minimized, and thereby determines the first twist ξ_i.
- That is, the 3D model generating apparatus 101 estimates the initial camera pose of the current original frame of the depth image, and determines the first twist for the registration of the two original frames by comparing the 3D points of the current original frame transformed according to the initial camera pose with the 3D points of the previous original frame.
- Using the ICP cost function, the 3D model generating apparatus 101 updates the first twist to reduce the difference between the 3D points of the current original frame transformed according to the initial camera pose and the 3D points of the previous original frame, and thereby determines the first twist that minimizes this difference.
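The IRLS procedure can be illustrated on a generic robust least-squares problem. The Huber weight below is one common choice of weight function; the patent does not specify which weight function it uses, so this is a sketch of the reweighting idea rather than the patented implementation. The example fits a line in the presence of a gross outlier:

```python
import numpy as np

def huber_weights(r, delta=1.0):
    """Huber weight function: weight 1 inside delta, delta/|r| outside,
    which down-weights mismatched correspondences."""
    a = np.abs(r)
    return np.minimum(1.0, delta / np.maximum(a, 1e-12))

def irls(A, y, delta=1.0, iters=20):
    """Iteratively re-weighted least squares for y ~ A @ x.

    Each iteration recomputes weights from the current residuals and
    solves a weighted normal equation, suppressing bad matches in the
    same spirit as the weight functions in Equations 2 and 5.
    """
    x = np.linalg.lstsq(A, y, rcond=None)[0]      # ordinary LS start
    for _ in range(iters):
        r = y - A @ x
        w = huber_weights(r, delta)
        Aw = A * w[:, None]                       # row-weighted design
        x = np.linalg.solve(A.T @ Aw, Aw.T @ y)   # (A^T W A) x = A^T W y
    return x
```

In the patent's setting the unknown x would be the 6-vector twist update and A the stacked residual Jacobians; the reweighting loop is the same.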
- Next, the 3D model generating apparatus 101 determines the second twist for pixel intensity using the difference between pixels included in the current original frame of the RGB image transformed according to the initial camera pose and pixels included in the previous original frame of the RGB image.
- the original frame of the RGB image is composed of a plurality of pixels.
- the current original frame of the RGB image and the current original frame of the depth image are frames of the same viewpoint.
- the 3D model generating apparatus 101 determines the second twist for the DVO by comparing pixel intensities between corresponding pixels in two adjacent original frames through Equation 3 below.
- the second twist is a twist determined according to the DVO, and is a twist for pixel intensity.
- The second twist is likewise an element of the Lie algebra se(3) of the Special Euclidean Group SE(3); it represents a rigid motion of the RGB-D camera and can be expressed as a 6x1 velocity vector.
- the second twist is determined corresponding to the initial camera pose, and is updated by comparing the difference between two adjacent original frames. Then, the updated second twist is used to determine the final camera pose.
- In Equation 3, x_k denotes the k-th pixel included in the previous original frame of the two temporally adjacent original frames of the RGB image, and x'_k denotes the pixel corresponding to x_k in the current original frame of the RGB image transformed according to the initial camera pose. The pixel intensity residual r_k for the k-th pixel is

  r_k = I_2(x'_k) − I_1(x_k) ... (Equation 3)

  where I_1(x_k) denotes the pixel intensity of x_k and I_2(x'_k) denotes the pixel intensity of x'_k. x'_k is obtained according to Equation 4.
- Equation 4 transforms a pixel according to the initial camera pose:

  x' = π(C π⁻¹(x, Z(x))) ... (Equation 4)

- Here, C denotes the initial camera pose, π denotes the projection function and π⁻¹ its inverse, x denotes a pixel in the current original frame of the RGB image, x' denotes the corresponding pixel after transformation according to the initial camera pose, and Z(x) is a function giving the depth information of x.
- the depth information of x may be obtained from the current original frame of the depth image.
- the current original frame of the depth image and the current original frame of the RGB image mean original frames corresponding to the same viewpoint in RGB-D data.
- That is, the 3D model generating apparatus 101 back-projects the pixels of the current original frame into three dimensions, transforms them according to the initial camera pose, and then projects them back into two dimensions, so that the pixel intensities of the previous original frame and the current original frame can be compared.
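The back-project, transform, re-project chain of Equation 4 can be sketched as follows, assuming a pinhole camera model with intrinsics fx, fy, cx, cy (the intrinsics and the function name are our assumptions, not stated in the text):

```python
import numpy as np

def warp_pixel(x_pix, depth, C, fx, fy, cx, cy):
    """Warp a pixel into the other frame: x' = pi(C @ pi_inv(x, Z(x))).

    x_pix : (u, v) pixel coordinates in the current RGB frame
    depth : Z(x), depth of the pixel from the aligned depth frame
    C     : 4x4 camera pose (homogeneous transform)
    fx, fy, cx, cy : pinhole intrinsics (assumed known from calibration)
    """
    u, v = x_pix
    # pi^{-1}: back-project the pixel to a homogeneous 3D point
    X = np.array([(u - cx) * depth / fx, (v - cy) * depth / fy, depth, 1.0])
    # rigid transform by the camera pose
    Xt = C @ X
    # pi: project back to pixel coordinates
    return np.array([fx * Xt[0] / Xt[2] + cx, fy * Xt[1] / Xt[2] + cy])
```

With the identity pose the pixel maps to itself; a lateral camera translation t shifts the pixel by fx·t/Z, which is the parallax the DVO residual measures.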
- the 3D model generating apparatus 101 may determine the second twist with respect to the pixel intensity of two adjacent original frames of the RGB image by using the IRLS algorithm.
- Specifically, a pixel intensity cost function including a weight function with a predetermined weight parameter is defined according to Equation 5 below, and the second twist that minimizes the cost is determined while the weight parameter is modified.
- the weight function used in Equations 2 and 5 is a function including the same weight parameter. Accordingly, the 3D model generating apparatus 101 repeatedly performs the process of correcting the weight parameter and updating the first and second twists according to Equations 2 and 5 according to the IRLS algorithm.
- In Equation 5, w(·) denotes a weight function for minimizing the influence of incorrectly matched pixels; it is the same function as the one used in the ICP cost function of Equation 2:

  E_d = Σ_k w(r_k) r_k² ... (Equation 5)

- E_d denotes the pixel intensity cost function; it is minimized by updating the second twist so that the weighted sum of the pixel intensity residuals is minimal.
- The 3D model generating apparatus 101 updates the second twist so that the result of the pixel intensity cost function is minimized, and thereby determines the second twist ξ_d.
- That is, the 3D model generating apparatus 101 determines the second twist for pixel intensity by comparing the pixel intensity difference between pixels included in the current original frame of the RGB image transformed according to the initial camera pose and pixels included in the previous original frame of the RGB image.
- By updating the second twist to reduce this difference, the apparatus determines the second twist that minimizes the pixel intensity difference between the pixels of the transformed current original frame and the pixels of the previous original frame.
- the 3D model generating apparatus 101 may determine the final first and second twists by correcting the weight parameter according to the IRLS algorithm and updating the first and second twists according to Equations 2 and 5.
- In particular, the present invention proposes weighting the DVO and ICP terms according to their relative fitness at each IRLS iteration.
- By adaptively adjusting the weighting parameters, the distinct characteristic behaviors of both ICP and DVO are reflected. Accordingly, the VO process according to the present invention yields more accurate and robust results.
- the 3D model generating apparatus 101 down-samples the depth image and the RGB image to four pyramid levels for a precise VO process, and repeats the above-described process for each pyramid level.
- the above-described process refers to a process from setting an initial camera pose to determining a final camera pose, which will be described later. At this time, the final camera pose is used as the initial camera pose of the next pyramid level.
- the 3D model generating apparatus 101 may determine an optimal camera pose for each original frame by repeating the above-described process by downsampling to four pyramid levels.
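The coarse-to-fine pyramid scheme described above can be sketched as below. `refine_pose` is a hypothetical callback standing in for the single-level ICP+DVO optimization, and the 2x2 block-average downsampling is an assumption; the text specifies only that four pyramid levels are used and that each level's final pose seeds the next:

```python
import numpy as np

def downsample(img):
    """Halve resolution by 2x2 block averaging (one pyramid level)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def coarse_to_fine(rgb, depth, refine_pose, levels=4):
    """Run pose refinement from the coarsest pyramid level to the finest,
    seeding each level with the pose estimated at the previous one.

    refine_pose(rgb, depth, init_pose) -> 4x4 pose, a stand-in for the
    ICP+DVO optimization described in the text.
    """
    pyramid = [(rgb, depth)]
    for _ in range(levels - 1):
        r, d = pyramid[-1]
        pyramid.append((downsample(r), downsample(d)))
    pose = np.eye(4)                    # initial camera pose
    for r, d in reversed(pyramid):      # coarsest -> finest
        pose = refine_pose(r, d, pose)
    return pose
```

Starting at low resolution widens the convergence basin of the optimization; the finest level then recovers the precise pose.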
- the 3D model generating apparatus 101 may evaluate the quality of the determined first and second twists using Equations 6 and 7 .
- Equations 6 and 7 e d is the second twist ( d ) is the second evaluation parameter evaluated, and e i is the first twist ( i ) is the first evaluation parameter evaluated.
- X k means the k-th 3D point of the previous original frame of the depth image, and X' k means the k-th 3D point corresponding to X k in the current original frame of the depth image.
- exp is an exponential function.
- e_d represents the sum of the differences between the 3D points included in the current original frame of the depth image transformed according to the second twist (ξ_d) for pixel intensity according to DVO and the 3D points of the previous original frame.
- e_i represents the sum of the differences between the 3D points included in the current original frame of the depth image transformed according to the first twist (ξ_i) for registration according to ICP and the 3D points of the previous original frame.
- the 3D model generating apparatus 101 may determine a weight between the first and second evaluation parameters by using a gain parameter.
- the gain parameter is defined according to Equation (8).
- the second gain parameter for DVO (γ_d) denotes the ratio of the second evaluation parameter to the sum of the first and second evaluation parameters.
- v represents the number of meaningful Sobel responses in the DVO process in which the second twist is determined, and in the present invention, v is determined as 2% of the resolution of the original frame of the depth image.
- n represents the number of 3D points corresponding (matching) between two consecutive depth images.
- the first gain parameter for ICP (γ_i) is determined as 1 - γ_d.
- when n ≤ v, the RGB-D data does not contain enough texture for DVO to operate properly, so the final camera pose is determined using only the first twist according to the ICP.
- the final camera pose is determined by adaptively considering the estimation of the camera pose according to the ICP and the estimation of the camera pose according to the DVO. A process of determining the final camera pose using the first and second gain parameters will be described later.
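As a sketch of the gain computation around Equation 8; the function name and the exact fallback condition (n ≤ v) are assumptions drawn from the description above, not the patent's notation:

```python
def gain_parameters(e_i: float, e_d: float, n: int, v: int):
    """Compute ICP/DVO gain parameters as described for Equation 8.

    e_i, e_d : first/second evaluation parameters (ICP / DVO quality)
    n        : number of matched 3D points between consecutive depth frames
    v        : texture threshold (e.g. 2% of the frame resolution)
    """
    if n <= v:
        # Too little texture for DVO to work reliably: fall back to ICP only.
        return 1.0, 0.0
    gamma_d = e_d / (e_d + e_i)  # ratio of the DVO evaluation parameter to the sum
    gamma_i = 1.0 - gamma_d      # ICP gain is the complement
    return gamma_i, gamma_d
```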
- the 3D model generating device 101 can interpolate the first and second twists (ξ_i, ξ_d) using the first and second gain parameters (γ_i, γ_d) to determine the final camera pose. Specifically, the 3D model generating apparatus 101 determines the final camera pose from the first and second twists and the first and second gain parameters using Equation (9).
- the final camera pose is not determined separately for the depth image and the RGB image; a single final camera pose is determined for the viewpoint of each original frame.
- C_avg denotes the final camera pose.
- the final camera pose determined through the interpolation function is used as the initial camera pose for the next pyramid level.
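One plausible reading of the interpolation in Equation 9 is to exponentiate the gain-weighted sum of the two twists; the closed-form se(3) exponential below is the standard construction, and `blend_pose` is an illustrative name, not the patent's:

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def se3_exp(xi):
    """Closed-form exponential map from a 6-vector twist xi = (v, w)
    (translation part first) to a 4x4 homogeneous pose matrix."""
    v, w = xi[:3], xi[3:]
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-10:
        R = np.eye(3) + W      # first-order approximation near identity
        V = np.eye(3)
    else:
        R = (np.eye(3) + np.sin(theta) / theta * W
             + (1 - np.cos(theta)) / theta**2 * W @ W)
        V = (np.eye(3) + (1 - np.cos(theta)) / theta**2 * W
             + (theta - np.sin(theta)) / theta**3 * W @ W)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ v
    return T

def blend_pose(xi_icp, xi_dvo, gamma_i, gamma_d):
    """One reading of Equation 9: exponentiate the gain-weighted
    average of the ICP and DVO twists to obtain the final camera pose."""
    return se3_exp(gamma_i * np.asarray(xi_icp) + gamma_d * np.asarray(xi_dvo))
```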
- the process of estimating the camera pose in step 211 follows the algorithm below (RGBD ODOM()).
- in the algorithm, Eq. 7 refers to Equation 8 and Eq. 8 refers to Equation 9 of this description.
- the 3D model generating apparatus 101 generates a similarity matrix indicating a degree of similarity between original frames of a depth image or an RGB image. Specifically, the 3D model generating apparatus 101 may generate a similarity matrix by extracting and matching features from original frames (Feature Extraction and Matching, FEM).
- the 3D model generating apparatus 101 generates the similarity matrix by matching a plurality of wide-baseline original frames of the depth image or the RGB image received from the RGB-D camera.
- the 3D model generating apparatus 101 may use a bag of words (BoW) technique in generating the similarity matrix.
- the 3D model generating apparatus 101 may perform the process of generating the similarity matrix in a thread independent of the process of determining the final camera pose. That is, the 3D model generating apparatus 101 may perform steps 211 and 212 in parallel in order to reduce time consumption for camera tracking.
- the generated similarity matrix is used to select an optimal key frame.
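A minimal sketch of a BoW-style similarity matrix, assuming a precomputed visual vocabulary; the descriptor quantization and cosine scoring here are generic, not the specific FEM pipeline of the disclosure:

```python
import numpy as np

def bow_histogram(descriptors: np.ndarray, vocabulary: np.ndarray) -> np.ndarray:
    """Quantize each descriptor to its nearest visual word and return
    an L2-normalized word-frequency histogram for one frame."""
    # Pairwise squared distances between descriptors and vocabulary words.
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

def similarity_matrix(frames_descriptors, vocabulary):
    """Cosine similarity between the BoW histograms of all frame pairs."""
    H = np.stack([bow_histogram(d, vocabulary) for d in frames_descriptors])
    return H @ H.T
```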
- the 3D model generating apparatus 101 determines ( 221 ) a key frame among original frames of the depth image or the RGB image, and generates ( 222 ) a 3D model using the determined key frame.
- Final camera poses corresponding to respective original frames of the depth image or the RGB image may be expressed in a graph according to SLAM.
- the graph is composed of a set of original vertices corresponding to each final camera pose and a set of edges indicating the presence or absence of visual associations between the original vertices.
- the 3D model generating apparatus 101 determines the visual association by calculating the number of feature matches.
- the 3D model generating apparatus 101 determines the minimum number of original vertices that can cover all original vertices in order to select a key frame.
- the minimum original vertices that cover all original vertices are the minimum original vertices that can be connected to all original vertices using the edge of the graph among all original vertices.
- the 3D model generating apparatus 101 determines the minimum original vertices that can cover all the original vertices as a set of key vertices.
- Equation 10 expresses the set cover problem. U denotes the set of all original vertices, V_i denotes the i-th original vertex, and V* denotes the set of original vertices selected to cover all original vertices.
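The set cover of Equation 10 is NP-hard in general; a common greedy approximation, assuming each vertex "covers" itself and its edge neighbours, might look like the following (the adjacency-list representation is an assumption):

```python
def greedy_vertex_cover_set(adjacency: dict) -> set:
    """Greedy approximation of the set cover in Equation 10: each vertex
    'covers' itself and its edge-connected neighbours; repeatedly pick the
    vertex covering the most still-uncovered vertices."""
    universe = set(adjacency)
    subgroups = {v: {v} | set(nbrs) for v, nbrs in adjacency.items()}
    uncovered, chosen = set(universe), set()
    while uncovered:
        best = max(subgroups, key=lambda v: len(subgroups[v] & uncovered))
        chosen.add(best)
        uncovered -= subgroups[best]
    return chosen
```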
- the 3D model generating apparatus 101 determines, as the set of key vertices, a minimum set of original vertices that i) cover all original vertices and ii) can be connected to each other in the pose graph optimization (PGO) step.
- the 3D model generating apparatus 101 determines, as the set of key vertices, the set of original vertices that minimizes the cost according to Equation 11 among the sets of original vertices satisfying conditions i) and ii) above.
- in Equation 11, w_j is 1/|S_j|, where S_j denotes the j-th subgroup.
- a subgroup is a set including a specific original vertex and the original vertices connected to that specific original vertex by an edge.
- v_j is the j-th original vertex, and the specific original vertex of S_j.
- x_j is a decision variable, set to 1 when v_j is selected and to 0 when v_j is not selected.
- a_ji, which is a binary coefficient, is set to 1 when v_j is included in the i-th subgroup.
- constraint (1) of Equation 11 enforces the condition of Equation 10, which requires that all original vertices be covered. According to constraint (2) of Equation 11, original vertices that are not connected by edges to any other original vertex are not selected. Constraint (3) of Equation 11 can be applied optionally so that at least c original vertices are selected.
- the 3D model generating apparatus 101 may select optimal key vertices from among all original vertices by using Equation (11).
- the 3D model generating apparatus 101 may add bridging vertices to the graph so that the selected key vertices can be connected to each other.
- the 3D model generating apparatus 101 determines bridging vertices through which the selected original vertices are connected to each other by using a breadth first search (BFS).
- An original frame corresponding to the added bridging vertices may also be determined as a key frame.
- the 3D model generating apparatus 101 selects the original frames of the depth image or RGB image corresponding to the key vertices as key frames, and when bridging vertices are added, the original frames of the depth image or RGB image corresponding to the added bridging vertices are also selected as key frames.
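The BFS-based bridging step can be sketched as follows, assuming an adjacency-list graph; connecting consecutive key vertices pairwise is one simple strategy, not necessarily the exact traversal order used in the disclosure:

```python
from collections import deque

def bfs_path(adjacency: dict, start, goal):
    """Shortest edge path from start to goal via BFS; None if unreachable."""
    prev, queue, seen = {}, deque([start]), {start}
    while queue:
        u = queue.popleft()
        if u == goal:
            path = [u]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return path[::-1]
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                prev[w] = u
                queue.append(w)
    return None

def bridging_vertices(adjacency: dict, key_vertices: set) -> set:
    """Interior vertices of BFS paths connecting each key vertex to the
    next, so that the selected key vertices form one connected set."""
    keys = sorted(key_vertices)
    bridges = set()
    for a, b in zip(keys, keys[1:]):
        path = bfs_path(adjacency, a, b)
        if path:
            bridges |= set(path[1:-1]) - key_vertices
    return bridges
```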
- the 3D model generating apparatus 101 generates a 3D model using the determined key frames. Specifically, the 3D model generating apparatus 101 may generate the 3D model by performing PGO on the determined key frames using the Ceres solver.
- the 3D model generating apparatus 101 may update the final camera pose of the key frame by using the similarity matrix in order to optimize the final camera pose determined for each original frame of the depth image or the RGB image.
- the 3D model generating apparatus 101 may determine a loop-closing pair of original frames with the highest similarity among the original frames of the depth image or RGB image, using the similarity matrix generated by matching the original frames of the depth image or the RGB image.
- the 3D model generating apparatus 101 may update the final camera pose of the key frame by calculating a relative camera pose between final camera poses corresponding to the original frame pair forming the loop closing pair.
- the 3D model generating apparatus 101 may determine the relative camera pose by using Equation (12).
- C_j means the final camera pose of the second original frame among the two original frames constituting the j-th loop frame pair of the depth image or the RGB image.
- x_ji means the i-th pixel of the j-th original frame of the RGB image.
- X_j-1,i denotes the 3D point corresponding to the i-th pixel of the first original frame among the two original frames constituting the loop frame pair of the depth image.
- the 3D model generating apparatus 101 transforms the 3D points included in the second original frame among the two original frames constituting the loop frame pair in the depth image according to the final camera pose and projects them into two dimensions.
- the final camera pose in consideration of the relative camera pose may be determined by modifying the final camera pose so that the difference between the projected pixels and the pixels included in the first original frame among the two original frames constituting the loop frame pair in the RGB image is minimized.
- the 3D model generating apparatus 101 may perform the process described below. For example, the 3D model generating apparatus 101 calculates a relative camera pose between the final camera poses corresponding to the original frame pair forming the loop-closing pair by using RANSAC. In addition, the 3D model generating apparatus 101 may optimize the relative camera pose by performing a median absolute deviation (MAD) based outlier removal step and motion-only bundle adjustment. The 3D model generating apparatus 101 calculates the MAD as 1.4826 times the median absolute deviation of the reprojection residuals from the median residual, and then determines the inlier matches. At this time, a match is determined to be an inlier when the absolute deviation (r_i) of its reprojection error is smaller than the scale parameter (τ) times the MAD; for example, in the present invention, the scale parameter may be set to 2.5.
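The MAD-based inlier test described above can be sketched directly; the function name and the residual layout are assumptions:

```python
import numpy as np

def mad_inliers(residuals: np.ndarray, scale: float = 2.5) -> np.ndarray:
    """Boolean mask of inlier matches: a match is kept when the absolute
    deviation of its reprojection residual from the median is below
    scale * MAD, with MAD = 1.4826 * median absolute deviation."""
    r = np.asarray(residuals, dtype=float)
    med = np.median(r)
    abs_dev = np.abs(r - med)
    mad = 1.4826 * np.median(abs_dev)
    return abs_dev < scale * mad
```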
- FIG. 3 is a diagram illustrating a process of generating a 3D model according to movement of an RGB-D camera according to an embodiment of the present invention.
- the RGB-D camera may collect RGB-D data while changing an angle or a position.
- the original frames 311 , 312 , 313 , and 314 of the depth image or the RGB image are frames obtained by different camera poses.
- the original frames may include 3D points or pixels corresponding to the same point on the real coordinate system or on the 3D model.
- for example, the 3D point or pixel 302 of the original frame 311 of the depth image or RGB image corresponds to the 3D point or pixel 302 of the original frame 312 in which the camera pose has changed, since both point to the same point on the real coordinate system or on the 3D model.
- when the camera pose is accurately estimated, the 3D point or pixel converted according to the camera pose estimated in the current original frame 312 and the corresponding 3D point or pixel in the previous original frame 311 coincide with each other.
- 4A and 4B are diagrams illustrating vertices representing a determined final camera pose according to an embodiment of the present invention.
- FIG. 4A is a diagram illustrating a graph including original vertices 401 and an edge 402 corresponding to each original frame based on a correlation between original frames of a depth image or an RGB image.
- the original vertices may correspond to each original frame of the depth image or RGB image, or a final camera pose for each original frame.
- 4B is a diagram illustrating key vertices 403 capable of covering all original vertices according to the set cover problem selected from among all vertices. Only the key vertices 403 of FIG. 4B can be connected to all original vertices 401 through edges.
- the 3D model generating apparatus selects key vertices from among all original vertices according to the set cover problem, and determines an original frame corresponding to the selected key vertices as a key frame. Then, the 3D model generating apparatus generates a 3D model using key frames.
- 5A and 5B are diagrams illustrating a plurality of key vertices and bridging vertices connecting them according to an embodiment of the present invention.
- 5A is a diagram illustrating key vertices selected according to Equation (10) among all original vertices.
- key vertices may cover all original vertices according to Equation 10, but are not connected to each other.
- FIG. 5B is a diagram illustrating that bridging vertices are added using BFS.
- original vertices connected by edges based on key vertices may be determined as the bridging vertices.
- the 3D model generating apparatus may efficiently generate a 3D model by selecting the bridging vertex and original frames corresponding to the key vertex as the key frame.
- FIG. 6 is a flowchart illustrating a 3D model generation method according to an embodiment of the present invention.
- the 3D model generating apparatus identifies the RGB image and the depth image obtained from the RGB-D camera.
- the 3D model generating apparatus designates an initial camera pose of the RGB-D camera for each original frame of the depth image and the RGB image.
- the initial camera pose represents the degree of change in the position and angle of the RGB-D camera from the viewpoint of the previous original frame to the viewpoint of the current original frame.
- in step 603, the 3D model generating apparatus determines a first twist for the registration (ICP) of the two adjacent original frames of the depth image by comparing the 3D points included in the current original frame transformed according to the initial camera pose with the 3D points included in the previous original frame.
- the 3D model generating apparatus updates the first twist so that the difference between the 3D points of the original frame of the depth image transformed according to the initial camera pose and the 3D points of the previous original frame of the depth image is minimized, using the ICP cost function to which IRLS is applied.
- in step 604, the 3D model generating device determines a second twist for the DVO by comparing the pixel intensity difference between the pixels included in the current original frame transformed according to the initial camera pose and the pixels included in the previous original frame, among the two adjacent original frames of the RGB image.
- the 3D model generating device updates the second twist using the DVO cost function to which IRLS is applied, so that the pixel intensity difference between the pixels of the original frame of the RGB image transformed according to the initial camera pose and the pixels of the previous original frame of the RGB image is minimized.
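The IRLS re-weighting used in both cost functions can be illustrated on a 1-D toy problem; the Huber weight function and the tuning constant k = 1.345 are standard choices, and this is not the patent's actual ICP/DVO solver:

```python
import numpy as np

def huber_weights(residuals: np.ndarray, k: float = 1.345) -> np.ndarray:
    """Huber weight function: 1 inside the band, k/|r| outside, so large
    residuals are progressively down-weighted."""
    r = np.abs(np.asarray(residuals, dtype=float))
    w = np.ones_like(r)
    mask = r > k
    w[mask] = k / r[mask]
    return w

def irls_mean(values: np.ndarray, iters: int = 20, k: float = 1.345) -> float:
    """Robust 1-D location estimate by iteratively reweighted least squares;
    the same re-weight-then-solve loop underlies the ICP and DVO costs."""
    x = float(np.mean(values))
    for _ in range(iters):
        w = huber_weights(values - x, k)
        x = float(np.sum(w * values) / np.sum(w))
    return x
```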
- the 3D model generating apparatus determines a final camera pose for each original frame of the depth image and the RGB image based on the first twist and the second twist. In this case, the 3D model generating apparatus may repeat steps 603-605 while downsampling the image.
- the 3D model generating apparatus selects, as key vertices, the minimum vertices that can be connected to all vertices through the edges indicating the presence or absence of a visual association between the original frames of the depth image and the RGB image, among the original vertices corresponding to each original frame, by using the set cover problem.
- the 3D model generating apparatus determines bridging vertices that allow the key vertices to be connected through the edge among the original vertices.
- the 3D model generating apparatus generates a 3D model by using key frames corresponding to key vertices in the original frame of the depth image and the RGB image and a final camera pose corresponding to the key frame.
- the 3D model generating apparatus determines the original frames forming a loop-closing pair in consideration of the similarity between the original frames of the depth image and the RGB image, and may determine the final camera pose based on the relative camera pose by comparing the final camera poses of the original frames forming the loop-closing pair.
- the method according to the present invention is written as a program that can be executed on a computer and can be implemented in various recording media such as magnetic storage media, optical reading media, and digital storage media.
- Implementations of the various techniques described herein may be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or combinations thereof. Implementations may be embodied as a computer program product, i.e., a computer program tangibly embodied in an information carrier, e.g., a machine-readable storage device (computer-readable medium) or a propagated signal, for processing by, or to control the operation of, a data processing apparatus, e.g., a programmable processor, a computer, or multiple computers.
- a computer program, such as the computer program(s) described above, may be written in any form of programming language, including compiled or interpreted languages, and may be deployed in any form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program may be deployed to be processed on one computer, on multiple computers at one site, or distributed across multiple sites interconnected by a communication network.
- processors suitable for processing a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer.
- a processor will receive instructions and data from read only memory or random access memory or both.
- Elements of a computer may include at least one processor that executes instructions and one or more memory devices that store instructions and data.
- a computer may also include, or be operatively coupled to receive data from or transfer data to, one or more mass storage devices for storing data, for example magnetic disks, magneto-optical disks, or optical disks.
- Information carriers suitable for embodying computer program instructions and data include, for example, semiconductor memory devices; magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM (Compact Disc Read-Only Memory) and DVD (Digital Video Disc); magneto-optical media such as floptical disks; and ROM (Read-Only Memory), RAM (Random Access Memory), flash memory, EPROM (Erasable Programmable ROM), and EEPROM (Electrically Erasable Programmable ROM). Processors and memories may be supplemented by, or incorporated in, special-purpose logic circuitry.
- the computer-readable medium may be any available medium that can be accessed by a computer, and may include both computer storage media and transmission media.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
Description
Claims (15)
- A 3D model generation method comprising: identifying an RGB image and a depth image obtained from an RGB-D camera; designating an initial camera pose of the RGB-D camera for each original frame of the depth image and the RGB image; determining a final camera pose for each original frame by updating the initial camera pose using a difference between the original frame transformed according to the initial camera position and the remaining original frame, among two temporally adjacent original frames of the depth image and the RGB image; selecting key vertices capable of covering all vertices among original vertices corresponding to each original frame of the depth image and the RGB image; and generating a 3D model using key frames corresponding to the key vertices in the original frames of the depth image and the RGB image and a final camera pose corresponding to the key frames.
- The method of claim 1, wherein the initial camera pose represents the degree to which the position and angle of the RGB-D camera have changed from the viewpoint of the previous original frame to the viewpoint of the current original frame, among two adjacent original frames.
- The method of claim 1, wherein the determining of the final camera pose comprises determining the final camera pose by modifying the initial camera pose so as to minimize the difference between 3D points included in the original frame transformed according to the initial camera position and 3D points included in the remaining original frame, among two adjacent original frames of the depth image.
- The method of claim 1, wherein the determining of the final camera pose comprises determining the final camera pose by modifying the initial camera pose so as to minimize the difference between pixels included in the original frame transformed according to the initial camera position and pixels included in the remaining original frame, among two adjacent original frames of the RGB image.
- The method of claim 1, wherein the selecting of the key vertices comprises selecting, as the key vertices, the minimum vertices that can be connected to all the vertices through edges indicating the presence or absence of a visual association between the original frames of the depth image and the RGB image, among the original vertices corresponding to each original frame of the depth image and the RGB image.
- The method of claim 1, further comprising determining original frames forming a loop-closing pair in consideration of the similarity between the original frames of the depth image and the RGB image, and updating the final camera pose based on the relative camera pose of the original frames forming the loop-closing pair.
- A 3D model generation method comprising: identifying an RGB image and a depth image obtained from an RGB-D camera; designating an initial camera pose of the RGB-D camera for each original frame of the depth image and the RGB image, the initial camera pose representing the degree to which the position and angle of the RGB-D camera have changed from the viewpoint of the previous original frame to the viewpoint of the current original frame; determining a first twist for the registration of two adjacent original frames of the depth image by comparing 3D points included in the current original frame transformed according to the initial camera pose with 3D points included in the previous original frame; determining a second twist for pixel intensity by comparing the pixel intensity difference between pixels included in the current original frame transformed according to the initial camera pose and pixels included in the previous original frame, among two adjacent original frames of the RGB image; determining a final camera pose for each original frame of the depth image and the RGB image based on the first twist and the second twist; selecting, as key vertices, the minimum vertices that can be connected to all the vertices through edges indicating the presence or absence of a visual association between the original frames of the depth image and the RGB image, among the original vertices corresponding to each original frame of the depth image and the RGB image; and generating a 3D model using key frames corresponding to the key vertices in the original frames of the depth image and the RGB image and a final camera pose corresponding to the key frames.
- The method of claim 7, wherein the determining of the first twist comprises updating the first twist so that the difference between the 3D points of the original frame of the depth image transformed according to the initial camera pose and the 3D points of the previous original frame of the depth image is minimized.
- The method of claim 7, wherein the determining of the second twist comprises updating the second twist so that the pixel intensity difference between the pixels of the original frame of the RGB image transformed according to the initial camera pose and the pixels of the previous original frame of the RGB image is minimized.
- The method of claim 7, further comprising, when the key vertices are not connected to each other through the edges, determining bridging vertices among the original vertices through which the key vertices are connected through the edges, wherein the generating of the 3D model comprises generating the 3D model using key frames corresponding to the key vertices and the bridging vertices in the original frames of the depth image and the RGB image and a final camera pose corresponding to the key frames.
- The method of claim 7, further comprising determining original frames forming a loop-closing pair in consideration of the similarity between the original frames of the depth image and the RGB image, and updating the final camera pose based on the relative camera pose of the original frames forming the loop-closing pair.
- A computer-readable recording medium on which a program for executing the method of any one of claims 1 to 11 is recorded.
- A 3D model generating apparatus comprising a processor, wherein the processor: identifies an RGB image and a depth image obtained from an RGB-D camera; designates an initial camera pose of the RGB-D camera for each original frame of the depth image and the RGB image; determines a final camera pose for each original frame by updating the initial camera pose using a difference between the original frame transformed according to the initial camera position and the remaining original frame, among two temporally adjacent original frames of the depth image and the RGB image; selects key vertices capable of covering all vertices among original vertices corresponding to each original frame of the depth image and the RGB image; and generates a 3D model using key frames corresponding to the key vertices in the original frames of the depth image and the RGB image and a final camera pose corresponding to the key frames.
- The apparatus of claim 13, wherein the processor determines the final camera pose by modifying the initial camera pose so as to minimize the difference between 3D points included in the original frame transformed according to the initial camera position and 3D points included in the remaining original frame, among two adjacent original frames of the depth image.
- The apparatus of claim 13, wherein the processor determines the final camera pose by modifying the initial camera pose so as to minimize the difference between pixels included in the original frame transformed according to the initial camera position and pixels included in the remaining original frame, among two adjacent original frames of the RGB image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/928,568 US20240135578A1 (en) | 2020-05-29 | 2021-05-26 | Method and device for generating 3d model through rgb-d camera tracking |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2020-0065244 | 2020-05-29 | ||
KR1020200065244A KR102298098B1 (ko) | 2020-05-29 | 2020-05-29 | Rgb-d 카메라의 트래킹을 통한 3d 모델의 생성 방법 및 장치 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021241994A1 true WO2021241994A1 (ko) | 2021-12-02 |
Family
ID=77784829
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2021/006524 WO2021241994A1 (ko) | 2020-05-29 | 2021-05-26 | Rgb-d 카메라의 트래킹을 통한 3d 모델의 생성 방법 및 장치 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20240135578A1 (ko) |
KR (1) | KR102298098B1 (ko) |
WO (1) | WO2021241994A1 (ko) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20140130096A (ko) * | 2012-02-22 | 2014-11-07 | 아셀산 엘렉트로닉 사나이 베 티카렛 아노님 시르케티 | 트래커 시스템을 최적화하는 시스템 및 방법 |
KR20150006958A (ko) * | 2013-07-09 | 2015-01-20 | 삼성전자주식회사 | 카메라 포즈 추정 장치 및 방법 |
KR101793975B1 (ko) * | 2016-09-13 | 2017-11-07 | 서강대학교산학협력단 | 깊이 카메라에 의한 스트리밍 영상에 대한 카메라 트래킹 방법 및 장치 |
KR20180087947A (ko) * | 2017-01-26 | 2018-08-03 | 삼성전자주식회사 | 3차원의 포인트 클라우드를 이용한 모델링 방법 및 모델링 장치 |
KR101896131B1 (ko) * | 2011-01-31 | 2018-09-07 | 마이크로소프트 테크놀로지 라이센싱, 엘엘씨 | 깊이 맵을 이용하는 모바일 카메라 로컬라이제이션 |
-
2020
- 2020-05-29 KR KR1020200065244A patent/KR102298098B1/ko active IP Right Grant
-
2021
- 2021-05-26 US US17/928,568 patent/US20240135578A1/en active Pending
- 2021-05-26 WO PCT/KR2021/006524 patent/WO2021241994A1/ko active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101896131B1 (ko) * | 2011-01-31 | 2018-09-07 | 마이크로소프트 테크놀로지 라이센싱, 엘엘씨 | 깊이 맵을 이용하는 모바일 카메라 로컬라이제이션 |
KR20140130096A (ko) * | 2012-02-22 | 2014-11-07 | 아셀산 엘렉트로닉 사나이 베 티카렛 아노님 시르케티 | 트래커 시스템을 최적화하는 시스템 및 방법 |
KR20150006958A (ko) * | 2013-07-09 | 2015-01-20 | 삼성전자주식회사 | 카메라 포즈 추정 장치 및 방법 |
KR101793975B1 (ko) * | 2016-09-13 | 2017-11-07 | 서강대학교산학협력단 | 깊이 카메라에 의한 스트리밍 영상에 대한 카메라 트래킹 방법 및 장치 |
KR20180087947A (ko) * | 2017-01-26 | 2018-08-03 | 삼성전자주식회사 | 3차원의 포인트 클라우드를 이용한 모델링 방법 및 모델링 장치 |
Also Published As
Publication number | Publication date |
---|---|
US20240135578A1 (en) | 2024-04-25 |
KR102298098B1 (ko) | 2021-09-03 |
KR102298098B9 (ko) | 2021-10-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2015194864A1 (ko) | 이동 로봇의 맵을 업데이트하기 위한 장치 및 그 방법 | |
WO2015194867A1 (ko) | 다이렉트 트래킹을 이용하여 이동 로봇의 위치를 인식하기 위한 장치 및 그 방법 | |
WO2015194866A1 (ko) | 에지 기반 재조정을 이용하여 이동 로봇의 위치를 인식하기 위한 장치 및 그 방법 | |
WO2015194865A1 (ko) | 검색 기반 상관 매칭을 이용하여 이동 로봇의 위치를 인식하기 위한 장치 및 그 방법 | |
US7565029B2 (en) | Method for determining camera position from two-dimensional images that form a panorama | |
US10296798B2 (en) | System and method of selecting a keyframe for iterative closest point | |
CN112184824A (zh) | 一种相机外参标定方法、装置 | |
WO2021085757A1 (ko) | 예외적 움직임에 강인한 비디오 프레임 보간 방법 및 그 장치 | |
WO2022005157A1 (en) | Electronic device and controlling method of electronic device | |
WO2023008791A1 (ko) | 근접 영역 감지를 수행하여 이동체의 전방향에 위치한 적어도 하나의 물체에 대한 거리를 획득하는 방법 및 이를 이용한 이미지 처리 장치 | |
WO2023055033A1 (en) | Method and apparatus for enhancing texture details of images | |
WO2023048380A1 (ko) | 카메라뷰 뎁스맵을 활용하여 이동체의 전방향에 위치한 적어도 하나의 물체에 대한 거리를 획득하는 방법 및 이를 이용한 이미지 처리 장치 | |
WO2024155137A1 (ko) | 비주얼 로컬라이제이션을 수행하기 위한 방법 및 장치 | |
WO2019245320A1 (ko) | 이미지 센서와 복수의 지자기 센서를 융합하여 위치 보정하는 이동 로봇 장치 및 제어 방법 | |
WO2021241994A1 (ko) | Rgb-d 카메라의 트래킹을 통한 3d 모델의 생성 방법 및 장치 | |
WO2020251151A1 (ko) | 3차원 가상 공간 모델을 이용한 사용자 포즈 추정 방법 및 장치 | |
WO2022092451A1 (ko) | 딥러닝을 이용한 실내 위치 측위 방법 | |
WO2022250372A1 (ko) | Ai에 기반한 프레임 보간 방법 및 장치 | |
WO2022124865A1 (en) | Method, device, and computer program for detecting boundary of object in image | |
WO2022225375A1 (ko) | 병렬처리 파이프라인을 이용한 다중 dnn 기반 얼굴 인식 방법 및 장치 | |
WO2021182793A1 (ko) | 단일 체커보드를 이용하는 이종 센서 캘리브레이션 방법 및 장치 | |
WO2020231006A1 (ko) | 영상 처리 장치 및 그 동작방법 | |
WO2023017947A1 (ko) | 자율주행에서의 시각적 속성 추정을 위한 시각 정보 처리 방법 및 시스템 | |
WO2024085456A1 (en) | Method and electronic device for generating a panoramic image | |
WO2023017978A1 (en) | Adaptive sub-pixel spatial temporal interpolation for color filter array |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21811924 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21811924 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17.05.2023) |
|