WO2018005359A1 - Systems and methods for dynamic occlusion handling - Google Patents
- Publication number: WO2018005359A1 (international application PCT/US2017/039278)
- Authority: WIPO (PCT)
- Prior art keywords: depth, edge points, processing system, depth map, boundary
Classifications
- G—PHYSICS; G06—COMPUTING; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/405—Hidden part removal using Z-buffer
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06T15/205—Image-based rendering
- G06T15/40—Hidden part removal
- G06T19/006—Mixed reality
- G06T7/55—Depth or shape recovery from multiple images
- G06T2200/04—Indexing scheme for image data processing or generation involving 3D image data
- G06T2207/10024—Color image
- G06T2207/10028—Range image; Depth image; 3D point clouds
Definitions
- This disclosure relates to systems and methods for enhancing depth maps, and more particularly to dynamic occlusion handling with enhanced depth maps.
- Augmented Reality (AR) relates to technology that provides a composite view of a real-world environment together with a virtual-world environment (e.g., computer-generated input).
- Correct perception of depth is often needed to deliver a realistic and seamless AR experience.
- the user tends to interact frequently with both real and virtual objects.
- it is difficult to provide a seamless interaction experience with the appropriate occlusion handling between the real-world scene and the virtual-world scene.
- these RGB-D cameras typically have low-cost consumer depth sensors, which usually suffer from various types of noise, especially around object boundaries. Such limitations typically cause undesirable visual artifacts when these lightweight RGB-D cameras are used for AR applications, thereby prohibiting decent AR experiences.
- filtering is often used for image enhancement.
- some examples include a joint bilateral filtering process or a guided image filtering process.
- other examples include a domain transform process, an adaptive manifolds process, or an inpainting process.
- these processes are typically computationally expensive and often result in edge blurring, thereby causing interpolation artifacts around boundaries.
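As a point of reference for the filtering approaches criticized above, the following is an illustrative pure-NumPy sketch of a guided image filter applied to a depth map (the function names, window radius, and regularization value are my own assumptions, not part of this disclosure); it makes concrete why such whole-image filtering is costly and why it interpolates across boundaries:

```python
import numpy as np

def box_mean(x, r):
    """Mean filter of radius r via padded summed-area tables (pure NumPy)."""
    xp = np.pad(x.astype(float), r, mode="edge")
    c = np.cumsum(np.cumsum(xp, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))
    s = 2 * r + 1
    sums = c[s:, s:] - c[s:, :-s] - c[:-s, s:] + c[:-s, :-s]
    return sums / (s * s)

def guided_filter(guide, src, radius=4, eps=1e-3):
    """Edge-preserving smoothing of `src` (e.g. a depth map) steered by
    `guide` (e.g. a gray-level image): fit a local linear model
    q = a*guide + b in every window, then average the coefficients."""
    mean_I = box_mean(guide, radius)
    mean_p = box_mean(src, radius)
    cov_Ip = box_mean(guide * src, radius) - mean_I * mean_p
    var_I = box_mean(guide * guide, radius) - mean_I ** 2
    a = cov_Ip / (var_I + eps)   # strong guide edges -> a near 1 (edge kept)
    b = mean_p - a * mean_I      # flat regions -> a near 0, b = local mean
    return box_mean(a, radius) * guide + box_mean(b, radius)
```

Note that every pixel requires several full-image box filters, and the averaged coefficients blend depth values across object boundaries, which is exactly the interpolation artifact the edge-snapping approach below avoids.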
- a computing system includes a processing system with at least one processing unit.
- the processing system is configured to receive a depth map with a first boundary of an object.
- the processing system is configured to receive a color image that corresponds to the depth map.
- the color image includes a second boundary of the object.
- the processing system is configured to extract depth edge points of the first boundary from the depth map.
- the processing system is configured to identify target depth edge points on the depth map.
- the target depth edge points correspond to color edge points of the second boundary of the object in the color image.
- the processing system is configured to snap the depth edge points to the target depth edge points such that the depth map is enhanced with an object boundary for the object.
- a system for dynamic occlusion handling includes at least a depth sensor, a camera, and a processing system.
- the depth sensor is configured to provide a depth map.
- the depth map includes a first boundary of an object.
- the camera is configured to provide a color image.
- the color image includes a second boundary of the object.
- the processing system includes at least one processing unit.
- the processing system is configured to receive the depth map with the first boundary of an object.
- the processing system is configured to receive a color image that corresponds to the depth map.
- the color image includes a second boundary of the object.
- the processing system is configured to extract depth edge points of the first boundary from the depth map.
- the processing system is configured to identify target depth edge points on the depth map.
- the target depth edge points correspond to color edge points of the second boundary of the object in the color image.
- the processing system is configured to snap the depth edge points to the target depth edge points such that the depth map is enhanced with an object boundary for the object.
- a computer-implemented method includes receiving a depth map with a first boundary of an object.
- the method includes receiving a color image that corresponds to the depth map.
- the color image includes a second boundary of the object.
- the method includes extracting depth edge points of the first boundary from the depth map.
- the method includes identifying target depth edge points on the depth map.
- the target depth edge points correspond to color edge points of the second boundary of the object in the color image.
- the method includes snapping the depth edge points towards the target depth edge points such that the depth map is enhanced with an object boundary for the object.
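The claimed steps (receive a depth map and a corresponding color image, extract a depth edge point, identify a target from the color boundary, snap) can be illustrated with a toy one-dimensional sketch. All names and thresholds below are invented for illustration; the actual embodiments operate on 2D maps with many edge points per contour:

```python
import numpy as np

def snap_depth_edge_1d(depth, color, search=3):
    """Toy 1-D analogue of the method: locate the depth boundary, move it
    to the strongest color gradient within `search` pixels, and extend the
    existing depth values up to the snapped boundary."""
    d_grad = np.abs(np.diff(depth))
    edge = int(np.argmax(d_grad))                 # depth edge point
    c_grad = np.abs(np.diff(color))
    lo, hi = max(edge - search, 0), min(edge + search + 1, len(c_grad))
    target = lo + int(np.argmax(c_grad[lo:hi]))   # target from color boundary
    out = depth.copy()
    if target > edge:        # color boundary lies beyond the depth boundary
        out[edge + 1 : target + 1] = depth[edge]
    elif target < edge:      # depth bled past the color boundary
        out[target + 1 : edge + 1] = depth[edge + 1]
    return out, edge, target
```

After snapping, the depth discontinuity coincides with the color boundary, which is the sense in which the depth map is "enhanced with an object boundary for the object."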
- FIG. 1 is a diagram of a system according to an example embodiment of this disclosure.
- FIG. 2A is a rendering of a virtual object in video view without dynamic occlusion handling.
- FIG. 2B is a rendering of a virtual object in video view with dynamic occlusion handling according to an example embodiment of this disclosure.
- FIGS. 2C and 2D are renderings of the virtual object of FIG. 2A without dynamic occlusion handling.
- FIGS. 2E and 2F are renderings of the virtual object of FIG. 2B with dynamic occlusion handling according to an example embodiment of this disclosure.
- FIGS. 2G and 2H are visualizations of FIG. 2A in glasses view without dynamic occlusion handling.
- FIGS. 2I and 2J are visualizations of FIG. 2B in glasses view with dynamic occlusion handling according to an example embodiment of this disclosure.
- FIG. 3A is an example of a depth map according to an example embodiment of this disclosure.
- FIG. 3B is an example of a color image according to an example embodiment of this disclosure.
- FIG. 3C illustrates an example of the depth map of FIG. 3A overlaying the color image of FIG. 3B according to an example embodiment of this disclosure.
- FIGS. 3D and 3E are enlarged views of exemplary regions of FIG. 3C.
- FIG. 3F is a visualization of FIG. 3C together with a virtual object.
- FIGS. 3G and 3H are enlarged views of exemplary regions of FIG. 3F.
- FIG. 4 is a block diagram of a process of the system of FIG. 1 according to an example embodiment of this disclosure.
- FIG. 5 is a flow diagram of an example implementation of the depth edge point process according to an example embodiment of this disclosure.
- FIG. 6A is an example of a color image according to an example embodiment of this disclosure.
- FIG. 6B is an example of a depth map according to an example embodiment of this disclosure.
- FIG. 6C is an example of depth edge points, which overlay a gray-scale image, according to an example embodiment of this disclosure.
- FIG. 6D is an enlarged view of raw depth edge points within an exemplary region of FIG. 6C according to an example embodiment of this disclosure.
- FIG. 6E is an enlarged view of smoothed depth edge points of a region of FIG. 6C according to an example embodiment of this disclosure.
- FIG. 6F illustrates an example of 2D-normals that are generated based on the raw depth edge points of FIG. 6D according to an example embodiment of this disclosure.
- FIG. 6G illustrates an example of 2D-normals that are generated based on the smoothed depth edge points of FIG. 6E according to an example embodiment of this disclosure.
- FIG. 7 is a flow diagram of an example implementation of the candidate search process and the optimization process according to an example embodiment of this disclosure.
- FIG. 8A is an example of a color image according to an example embodiment of this disclosure.
- FIG. 8B is an enlarged view of a region of FIG. 8A along with a visualization of edge- snapping with image gradients from the RGB space according to an example embodiment of this disclosure.
- FIG. 8C is an enlarged view of a region of FIG. 8A along with a visualization of edge- snapping with image gradients from both the RGB space and the YCbCr space according to an example embodiment of this disclosure.
- FIG. 9A is an example of a color image according to an example embodiment of this disclosure.
- FIG. 9B is an example of the magnitude of image gradients from a red channel according to an example embodiment of this disclosure.
- FIG. 9C is an example of the magnitude of image gradients in a converted CR channel according to an example embodiment of this disclosure.
- FIG. 10A is an example of a color image according to an example embodiment of this disclosure.
- FIG. 10B illustrates an example of edge-snapping results for a region of FIG. 10A without a smoothness constraint according to an example embodiment of this disclosure.
- FIG. 10C illustrates an example of edge-snapping results for a region of FIG. 10A with a smoothness constraint according to an example embodiment of this disclosure.
- FIG. 11A is an example of a color image according to an example embodiment of this disclosure.
- FIG. 11B illustrates an example of edge-snapping results for an exemplary region of FIG. 11A without a smoothness constraint according to an example embodiment of this disclosure.
- FIG. 11C illustrates an example of edge-snapping results for an exemplary region of FIG. 11A with a smoothness constraint according to an example embodiment of this disclosure.
- FIG. 12 is a flow diagram of an example implementation of the depth map enhancement process according to an example embodiment of this disclosure.
- FIGS. 13A, 13B, 13C, 13D, and 13E illustrate aspects of the depth map enhancement process based on edge-snapping according to an example embodiment of this disclosure.
- FIG. 14 is a flow diagram of an example implementation of the glasses view rendering process according to an example embodiment of this disclosure.
- FIG. 15A illustrates an issue associated with changing between video view and glasses view.
- FIG. 15B illustrates an example of occlusion effects using interpolation.
- FIG. 15C illustrates an example of occlusion effects using the process of FIG. 14 according to an example embodiment of this disclosure.
- FIG. 16A is an example of an AR scene without dynamic occlusion handling.
- FIG. 16B is an example of an AR scene with dynamic occlusion handling using raw depth data according to an example embodiment of this disclosure.
- FIG. 16C is an example of an AR scene with dynamic occlusion handling using an enhanced depth map according to an example embodiment of this disclosure.
- FIG. 17A is an example of an AR scene without dynamic occlusion handling.
- FIG. 17B is an example of an AR scene with dynamic occlusion handling using raw depth data according to an example embodiment of this disclosure.
- FIG. 17C is an example of an AR scene with dynamic occlusion handling using an enhanced depth map according to an example embodiment of this disclosure.
- FIG. 18A is an example of an AR scene without dynamic occlusion handling.
- FIG. 18B is an example of an AR scene with dynamic occlusion handling using raw depth data according to an example embodiment of this disclosure.
- FIG. 18C is an example of an AR scene with dynamic occlusion handling using an enhanced depth map according to an example embodiment of this disclosure.
- FIG. 19A is an example of an AR scene without dynamic occlusion handling.
- FIG. 19B is an example of an AR scene with dynamic occlusion handling using raw depth data according to an example embodiment of this disclosure.
- FIG. 19C is an example of an AR scene with dynamic occlusion handling using an enhanced depth map according to an example embodiment of this disclosure.
- FIG. 20A is an example of an AR scene without dynamic occlusion handling according to an example embodiment of this disclosure.
- FIG. 20B is an example of an AR scene with dynamic occlusion handling using raw depth data according to an example embodiment of this disclosure.
- FIG. 20C is an example of an AR scene with dynamic occlusion handling using an enhanced depth map according to an example embodiment of this disclosure.
- FIGS. 21A, 21B, 21C, and 21D are color images with outlines of ground-truth boundaries according to an example embodiment of this disclosure.
- FIGS. 22A, 22B, 22C, and 22D are visualizations of raw depth maps overlaid over the corresponding color images of FIGS. 21A, 21B, 21C, and 21D, respectively, according to an example embodiment of this disclosure.
- FIGS. 23A, 23B, 23C, and 23D are visualizations of enhanced depth maps overlaid over the corresponding color images of FIGS. 21A, 21B, 21C, and 21D, respectively, according to an example embodiment of this disclosure.
- FIG. 1 illustrates a block diagram of a system 100 for dynamic occlusion handling in AR according to an example embodiment.
- the system 100 includes a head mounted display 110 and a dynamic occlusion handling system 120.
- the system 100 includes communication technology 118 that connects the head mounted display 110 to the dynamic occlusion handling system 120.
- the communication technology 118 is configured to provide at least data transfer between the head mounted display 110 and the dynamic occlusion handling system 120.
- the communication technology 118 includes wired technology, wireless technology, or a combination thereof.
- the communication technology 118 includes HDMI technology, WiFi technology, or any suitable communication link.
- the head mounted display 110 is an optical head mounted display, which is enabled to reflect projected images while allowing a user to see through it.
- the head mounted display 110 includes at least a depth sensor 114 and a video camera 116.
- the head mounted display 110 includes an RGB-D camera 112, which includes a depth sensor 114 and a video camera 116.
- the RGB-D camera 112 can be near-range.
- the depth sensor 114 is configured to provide depth data, as well as geometry information for dynamic occlusion handling.
- the depth sensor 114 is a structured-light sensor or Time-of-Flight sensor.
- a stereo sensor can be used to obtain dynamic depth information.
- the depth sensor 114 can have any suitable sensing range.
- the RGB-D camera 112 includes a depth sensor 114 with a sensing range of 0.2m to 1.2m, which is sufficient to cover an area of AR interactions involving the user's hands 204.
- the video camera 116 is configured to provide video or a recorded series of color images. In an example embodiment, the video camera 116 is configured to provide scene tracking (e.g., visual SLAM).
- the system 100 uses the video data from the video view 200 and adopts the video view 200 as the glasses view 212 to provide dynamic occlusion handling.
- the system 100 includes the dynamic occlusion handling system 120.
- the dynamic occlusion handling system 120 is any suitable computing system that includes a dynamic occlusion handling module 130 and that can implement the functions disclosed herein. As non-limiting examples, the computing system is a personal computer, a laptop, a tablet, or any suitable computer technology that is enabled to implement the functions of the dynamic occlusion handling module 130.
- the computing system includes at least input/output (I/O) devices 122, a communication system 124, computer readable media 126, other functional modules 128, and a processing system 132.
- the I/O devices can include any suitable device or combination of devices, such as a keyboard, a speaker, a microphone, a display, etc.
- the communication system 124 includes any suitable communication means that enables the components of the dynamic occlusion handling system 120 to communicate with each other and also enables the dynamic occlusion handling system 120 to communicate with the head mounted display 110 via the communication technology 118.
- the communication system 124 includes any suitable communication means that enables the dynamic occlusion handling system 120 to connect to the Internet, as well as with other computing systems and/or devices on a computer network or any suitable network.
- the computer readable media 126 is a computer or electronic storage system that is configured to store and provide access to various data to enable the functions disclosed herein.
- the computer readable media 126 can include electrical, electronic, magnetic, optical, semiconductor, electromagnetic, or any suitable memory technology.
- the computer readable media 126 is local, remote, or a combination thereof (e.g., partly local and partly remote).
- the other functional modules 128 can include hardware, software, or a combination thereof.
- the other functional modules 128 can include an operating system, logic circuitry, any hardware computing components, any software computing components, or any combination thereof.
- the processing system 132 includes at least one processing unit to perform and implement the dynamic occlusion handling in accordance with the dynamic occlusion handling module 130.
- the processing system 132 includes at least a central processing unit (CPU) and a graphics processing unit (GPU).
- the dynamic occlusion handling system 120 includes a dynamic occlusion handling module 130.
- the dynamic occlusion handling module 130 includes hardware, software, or a combination thereof.
- the dynamic occlusion handling module 130 is configured to provide the requisite data and support to the processing system 132 such that a process 400 (e.g. FIG. 4) is enabled to execute and provide enhanced depth data and dynamic occlusion handling, thereby providing a realistic AR experience.
- FIGS. 2A-2B illustrate non-limiting examples in which virtual objects 202 are rendered in video view 200, as the acquisition sensor space.
- FIG. 2A illustrates a virtual object 202 rendering without dynamic occlusion handling
- FIG. 2B illustrates a virtual object 202 rendering with dynamic occlusion handling.
- the virtual object 202 rendering includes a treasure chest as the virtual object 202.
- the remaining parts of this video view include the user's hand 204 in a real- world environment.
- the user's hand 204 is improperly occluded by the virtual object 202, as shown in the circled region 206 of FIG. 2A.
- the circled region 206 of FIG. 2A does not provide a realistic portrayal of the user's hand 204 interacting with the virtual object 202.
- the user's hand 204 is not occluded by the virtual object 202, as shown in the circled region 208 of FIG. 2B.
- the circled region 208 of FIG. 2B is able to provide a realistic portrayal of the user's hand 204 interacting with the virtual object 202.
- FIGS. 2C-2D and FIGS. 2E-2F relate to renderings of the virtual objects 202 of FIG. 2A and FIG. 2B, respectively. More specifically, FIGS. 2C-2D illustrate non-limiting examples of the rendering of the virtual objects 202 without dynamic occlusion handling. In this regard, FIG. 2C represents a left eye view of the rendering of the virtual object 202 and FIG. 2D represents a right eye view of the rendering of the virtual object 202. In contrast, FIGS. 2E-2F illustrate non-limiting examples of the rendering of the virtual objects 202 with dynamic occlusion handling. More specifically, FIG. 2E represents a left eye view of the rendering of the virtual object 202 and FIG. 2F represents a right eye view of the rendering of the virtual object 202.
- the virtual object 202 is modified, as highlighted in each circled region 210, such that the virtual object 202 does not occlude the user's hand 204.
- the interaction between the virtual object 202 and the user's hand 204 is presented in a proper and realistic manner, as shown in at least the circled region 208 of FIGS. 2B and 2I-2J.
- FIGS. 2G-2H and 2I-2J illustrate non-limiting examples of optical, see-through images of the virtual objects 202 in glasses view 212 via the optical head mounted display 110.
- FIGS. 2G-2H illustrate examples without dynamic occlusion handling.
- FIG. 2G represents a left eye view of the virtual object 202 in the glasses view 212
- FIG. 2H represents a right eye view of the virtual object 202 in the glasses view 212.
- FIGS. 2I-2J illustrate examples with dynamic occlusion handling.
- FIG. 2I represents a left eye view of the virtual object 202 in the glasses view 212
- FIG. 2J represents a right eye view of the virtual object 202 in the glasses view 212.
- the inclusion of the dynamic occlusion handling provides a more realistic and immersive experience, as the parts of the virtual objects 202 that should be occluded by the user's hands 204 are removed from view.
- FIGS. 3A-3E provide non-limiting examples of mismatches between boundaries of objects taken from depth maps compared to corresponding boundaries of objects taken from color images.
- FIG. 3A illustrates an example of a depth map 300
- FIG. 3B illustrates a corresponding example of a color image 302.
- FIG. 3C illustrates an example of the depth map 300 of FIG. 3A overlaying the color image 302 of FIG. 3B.
- FIG. 3D illustrates an enlarged view of the boxed region 304 of FIG. 3C.
- FIG. 3E illustrates an enlarged view of the boxed region 306 of FIG. 3C.
- the boundary of the user's hand 204 in the depth map 300 does not match the corresponding boundary of the user's hand 204 in the color image 302.
- FIGS. 3F-3H are example results based on dynamic occlusion handling with raw depth data from the depth map 300.
- FIG. 3F includes dynamic occlusion handling, particularly with regard to a rendering of a virtual object 202 (e.g., smartphone) in relation to a user's hand 204.
- FIG. 3G illustrates an enlarged view of the boxed region 304 of FIG. 3F.
- FIG. 3H illustrates an enlarged view of the boxed region 306 of FIG. 3F.
- FIGS. 3F-3H include visual artifacts due to the mismatches in the boundaries of at least the user's hand 204 between the raw depth map 300 and the color image 302.
- the system 100 includes a process 400, which is enabled to overcome this issue by improving the consistency of object boundaries, for instance, between depth data and RGB data.
- FIG. 4 is a block diagram of a process 400 of the system 100 according to an example embodiment.
- upon receiving depth data and video data from the RGB-D camera 112, the process 400 includes at least a video view process 410 and a glasses view rendering process 490.
- the process 400 is performed when the processing system 132 executes computer-readable data (e.g., computer-executable data), which is stored on non-transitory computer readable media via the dynamic occlusion handling module 130, the computer readable media 126, or a combination thereof.
- the computer executable data can include various instructions, data structures, applications, routines, programs, modules, procedures, other software components, or any combination thereof.
- the process 400 leverages instances in which boundaries in raw depth maps are normally reasonably close to their counterparts in the corresponding color images, where the image gradients are typically high.
- the process 400 includes snapping at least one depth edge point towards its desired target location.
- the process 400 includes discretizing the solution space by constraining the target position of each depth edge point to be on a local line segment and then finding an optimal solution for the entire set of depth edge points via discrete energy minimization.
- the process 400 includes a video view process 410 and a glasses view rendering process 490, as shown in FIG. 4.
- the video view process 410 includes a depth edge point process 420, a candidate search process 460, an optimization process 470, and a depth map enhancement process 480.
- the depth edge point process 420 includes depth edge point extraction 430, grouping and ordering 440, and 2D normal computations 450.
- the process 400 includes extracting depth edge points from a depth map and computing smooth 2D normal directions with respect to the extracted depth edge points.
- each 2D normal segment or line defines a solution space for a corresponding edge point, i.e., the candidate snapping targets for that depth edge point are constrained to lie along its normal segment.
- the process 400 includes an optimization process 470 based on the results of the candidate search 460 to locate and utilize optimal snapping targets.
- the optimization process 470 includes defining energy functions in a solution space that includes at least a data term and a smoothness term.
- the optimization process 470 includes performing energy minimization efficiently via dynamic programming to identify the optimal target position for each edge point.
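The dynamic-programming minimization described above can be sketched as a Viterbi-style pass over the ordered chain of edge points. The cost definitions below are assumptions for illustration (a data term per candidate and an absolute-difference smoothness term), not the disclosure's exact energy functions:

```python
import numpy as np

def snap_by_dp(data_cost, smooth_weight=1.0):
    """Exact discrete energy minimization over an ordered chain of edge
    points.  data_cost[i, k] is the cost of snapping edge point i to its
    k-th candidate along the normal (e.g. low where the color-image
    gradient is high); the smoothness term smooth_weight*|k_i - k_{i-1}|
    makes neighboring edge points move coherently."""
    n, m = data_cost.shape
    cost = data_cost[0].copy()
    back = np.zeros((n, m), dtype=int)
    labels = np.arange(m)
    for i in range(1, n):
        # pairwise cost between every previous label and every current label
        pair = smooth_weight * np.abs(labels[:, None] - labels[None, :])
        total = cost[:, None] + pair              # shape (m_prev, m_cur)
        back[i] = np.argmin(total, axis=0)        # best predecessor per label
        cost = total[back[i], labels] + data_cost[i]
    best = np.empty(n, dtype=int)
    best[-1] = int(np.argmin(cost))
    for i in range(n - 1, 0, -1):                 # backtrack optimal labels
        best[i - 1] = back[i, best[i]]
    return best
```

With a zero smoothness weight each point snaps to its individually cheapest candidate; increasing the weight forces the chain to a single coherent solution, which is the role the smoothness term plays in the optimization process 470.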
- the process 400 includes a depth map enhancement process 480, which is based on an output of edge-snapping. Upon enhancing the depth map, the process 400 switches from the video view process 410 to the glasses view rendering process 490.
- FIG. 5 is a flow diagram of a depth edge point process 420 according to an example embodiment.
- the depth edge point process 420 is configured to extract depth edge points from depth points (or pixels) with valid depth values.
- the depth edge point process is configured to perform a number of operations in preparation for the candidate search process 460 and optimization process 470. More specifically, an example implementation 500 of the depth edge point process 420 is discussed below.
- the processing system 132 is configured to extract depth edge points.
- the depth edge points are those points whose local neighborhood exhibits large depth discontinuity.
- the processing system 132 primarily or only considers depth points (or pixels) with valid depth values. For each of these pixels, a 3x3 local patch is examined. If any of the four-neighbor pixels either has an invalid depth value or has a valid depth value that differs from the center pixel beyond a certain value, then this center pixel is considered to be a depth edge point.
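The four-neighbor criterion above might be sketched as follows in NumPy. The depth-jump threshold is an assumption, and invalid pixels are modeled as zero depth:

```python
import numpy as np

def extract_depth_edge_points(depth, jump=0.05):
    """Mark valid pixels whose four-neighborhood shows a large depth
    discontinuity or borders an invalid (zero-depth) pixel."""
    valid = depth > 0
    edge = np.zeros_like(valid)
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):     # four neighbors
        n_depth = np.roll(depth, (dy, dx), axis=(0, 1))
        n_valid = np.roll(valid, (dy, dx), axis=(0, 1))
        edge |= valid & (~n_valid | (np.abs(depth - n_depth) > jump))
    # ignore the wrap-around introduced by np.roll at the image borders
    edge[0, :] = edge[-1, :] = edge[:, 0] = edge[:, -1] = False
    return edge
```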
- the raw depth map normally could contain some outliers as isolated points or a very small patch.
- the processing system 132 is configured to apply a morphological opening, i.e. erosion followed by dilation, to the depth map mask before extracting the depth edge points.
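The opening step can be sketched as follows on a binary depth-validity mask; the 3×3 square structuring element (`r=1`) is an illustrative assumption:

```python
def erode(mask, r=1):
    # a pixel survives only if its whole (2r+1)x(2r+1) neighborhood is set
    h, w = len(mask), len(mask[0])
    return [[1 if all(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                      for dy in range(-r, r + 1) for dx in range(-r, r + 1))
             else 0 for x in range(w)] for y in range(h)]

def dilate(mask, r=1):
    # a pixel is set if any pixel within radius r is set
    h, w = len(mask), len(mask[0])
    return [[1 if any(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                      for dy in range(-r, r + 1) for dx in range(-r, r + 1))
             else 0 for x in range(w)] for y in range(h)]

def opening(mask, r=1):
    # morphological opening = erosion followed by dilation; removes
    # isolated outlier points and very small patches from the depth mask
    return dilate(erode(mask, r), r)
```

An isolated pixel is erased by the erosion and never restored, while a patch at least as large as the structuring element survives the round trip.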
- the processing system 132 is configured to perform a depth first search on the image grid to group the extracted depth edge points. During the depth first search, two depth edge points are considered connected only when one is in the 3×3 neighborhood of the other and the depth difference between these two depth points (or pixels) is less than a certain threshold τmax.
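The grouping step can be sketched as follows; the threshold value passed as `tau_max` is an illustrative assumption:

```python
def group_edge_points(edge_points, depth, tau_max=30):
    """Group edge points via depth-first search: two points connect when one
    lies in the 3x3 neighborhood of the other and their depth difference is
    below tau_max."""
    remaining = set(edge_points)
    groups = []
    while remaining:
        seed = remaining.pop()
        stack, group = [seed], [seed]
        while stack:
            x, y = stack.pop()
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    q = (x + dx, y + dy)
                    if q in remaining and abs(depth[q[1]][q[0]] - depth[y][x]) < tau_max:
                        remaining.remove(q)
                        group.append(q)
                        stack.append(q)
        groups.append(group)
    return groups
```

Two clusters of edge points separated spatially (or by a depth jump larger than `tau_max`) come back as two distinct groups.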
- the processing system 132 is configured to order the depth edge points of each group so that they traverse from one end of the edge contour towards the other, as required by some of the other processes (e.g., the optimization process 470).
- the processing system 132 is configured to select one of the depth edge points as the starting point, wherein the selection can be performed at random or by any suitable selection method.
- the following operations in the remainder of this discussion of FIG. 5 are performed for each group of depth edge points separately.
- FIG. 6C shows an example containing multiple groups of depth edge points.
- the processing system 132 is configured to perform low pass filtering on the raw depth edge points to smooth the 2D positions of these depth edge points. More specifically, due to the zigzag pattern or unevenness of the raw depth edges, normals computed directly from these raw depth edge points may suffer from substantial artifacts. In contrast, with low pass filtering, the processing system 132 is configured to reduce noise and artifacts by utilizing these smoothed depth edge points at step 510.
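The low pass filtering can be sketched as a symmetric moving average over each group's ordered 2D positions; the window size is an illustrative assumption, and any other low-pass kernel would serve equally well:

```python
def smooth_edge_points(points, window=2):
    """Low-pass filter the 2D positions of an ordered group of edge points
    with a symmetric moving average to suppress the zigzag pattern of raw
    depth edges before computing normals."""
    n = len(points)
    out = []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)  # clamp at ends
        xs = [points[j][0] for j in range(lo, hi)]
        ys = [points[j][1] for j in range(lo, hi)]
        out.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return out
```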
- the processing system 132 is configured to compute the 2D normal of these depth edge points.
- the processing system 132 is configured to compute the 2D normal of each depth edge point using two neighboring points.
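The two-neighbor normal computation can be sketched as follows: the local tangent is estimated from the previous and next (smoothed) points and rotated by 90 degrees; the end-point clamping is an illustrative convention:

```python
import math

def edge_normals(points):
    """2D unit normal for each edge point, using its two neighbors:
    the normal is the local tangent rotated by 90 degrees."""
    normals = []
    n = len(points)
    for i in range(n):
        (x0, y0), (x1, y1) = points[max(i - 1, 0)], points[min(i + 1, n - 1)]
        tx, ty = x1 - x0, y1 - y0            # tangent from neighboring points
        length = math.hypot(tx, ty) or 1.0   # guard against coincident points
        normals.append((-ty / length, tx / length))
    return normals
```

For a horizontal run of edge points the normals all point straight up (or down, depending on traversal direction).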
- the processing system 132 utilizes the smoothed depth edge points only for the 2D normal computation, while relying on the raw depth edge points for all (or most) of the later processing.
- FIGS. 6A-6G illustrate certain aspects of the example implementation 500 of the depth edge point processing according to an example embodiment.
- FIG. 6A illustrates an example of a color image 302 from the RGB-D camera 112.
- FIG. 6B illustrates an example of a raw depth map 300 from the RGB-D camera 112.
- FIG. 6C illustrates examples of raw depth edge points 312, which overlay a gray-scale image 310.
- FIGS. 6D-6G illustrate enlarged views of portions of FIG. 6C that correspond to the boxed region 308 of FIG. 6A.
- FIG. 6D illustrates the raw depth edge points 312 associated with a boundary of a thumb of the user's hand 204, while FIG. 6E illustrates smoothed depth edge points 314.
- FIG. 6F illustrates 2D normals 316 that are generated based on the raw depth edge points 312.
- FIG. 6G illustrates 2D normals 316 that are generated based on the smoothed depth edge points. As shown, the 2D normals of the smoothed depth edge points in FIG. 6G carry less noise than those of the raw depth edge points in FIG. 6F.
- FIGS. 7, 8A-8C, 9A-9C, 10A-10C, and 11A-11C relate to the candidate search process 460 and the optimization process 470. More specifically, FIG. 7 is a flow diagram of an example implementation 700 of the candidate search process 460 and the optimization process 470 according to an example embodiment. Meanwhile, FIGS. 8A-8C, 9A-9C, 10A-10C, and 11A-11C illustrate various aspects of the candidate search process 460 and the optimization process 470.
- the processing system 132 searches for candidates for each depth edge point.
- the candidates c_{i,k}, k ∈ {1, ..., 2r_s + 1}, are sampled for each depth edge point along its 2D normal direction within a search range r_s.
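The candidate sampling can be sketched as follows, assuming candidates are placed at integer offsets along the (unit) normal; the indexing convention mirrors k ∈ {1, ..., 2r_s + 1} and is an assumption consistent with the surrounding text:

```python
def sample_candidates(point, normal, r_s=5):
    """Sample 2*r_s + 1 snapping candidates for an edge point along its 2D
    normal: c_{i,k} = p_i + (k - r_s - 1) * n_i for k = 1..2*r_s + 1,
    i.e. offsets -r_s..+r_s centered on the point itself."""
    x, y = point
    nx, ny = normal
    return [(x + (k - r_s - 1) * nx, y + (k - r_s - 1) * ny)
            for k in range(1, 2 * r_s + 2)]
```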
- the processing system 132 obtains the image gradients using a Sobel operator in multiple color spaces.
- the first part of the image gradients is computed directly in the RGB color space by the following equation:
- the processing system 132 combines these image gradients and defines the cost of snapping a point p_i towards a candidate c_{i,k} as follows: [Equation 3]
- encoding image gradients from multiple color spaces provides a number of advantages. For example, combining different color spaces generally provides more discriminating power for this edge-snapping framework. For instance, in some cases, the RGB color space alone might not be sufficient. In this regard, turning to FIGS. 9A-9C, as an example, the boundaries of the fingertips, as shown in the circled regions 328, do not have a strong image gradient in the RGB space. In this case, when involving only the RGB color space, there are some edge points associated with these fingertips that cannot be snapped to the desired locations.
- the processing system 132 is configured to achieve snapping results that are improved compared to those obtained using only the RGB space.
- the incorporation of the YCbCr color space is particularly suitable for differentiating skin color associated with a user from other colors (e.g., non-skin colors).
- other color spaces can be used. For instance, the hue channel of an HSV color space can be used.
- this example embodiment uses RGB and YCbCr spaces, other example embodiments include various combinations of various color spaces.
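A minimal sketch of multi-color-space gradient fusion follows. The BT.601 Cr conversion is standard, but fusing per-channel Sobel magnitudes by taking the maximum is an illustrative choice, not the patent's exact combination (Equation 3):

```python
def sobel_mag(img):
    # gradient magnitude of one scalar channel via the Sobel operator
    kx = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))  # Gy kernel is the transpose
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(kx[i][j] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

def cr_channel(rgb):
    # BT.601 Cr channel: emphasizes skin tones that RGB gradients may miss
    return [[128 + 0.5 * r - 0.418688 * g - 0.081312 * b
             for (r, g, b) in row] for row in rgb]

def combined_gradient(rgb):
    """Fuse gradients from the R, G, B channels and the Cr channel
    (max-fusion is an illustrative assumption)."""
    chans = [[[px[c] for px in row] for row in rgb] for c in range(3)]
    chans.append(cr_channel(rgb))
    mags = [sobel_mag(ch) for ch in chans]
    h, w = len(rgb), len(rgb[0])
    return [[max(m[y][x] for m in mags) for x in range(w)] for y in range(h)]
```

A vertical color edge yields a strong combined response at the boundary column and zero response inside the uniform regions.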
- the processing system 132 defines a smoothness term to penalize a large deviation between neighboring depth edge points (or depth edge pixels).
- the processing system 132 snaps neighboring depth edge points to locations that are relatively close to each other. For instance, in an example embodiment, for a pair of consecutive depth edge points p_i and p_j, the processing system 132 computes the cost of snapping p_i onto c_{i,k} and p_j onto c_{j,l} via the following equation: E_s(i, k, j, l) [Equation 4]
- the parameter dmax defines the maximal discrepancy allowed for two consecutive depth edge points.
- the processing system 132 determines or finds a candidate for each depth edge point to minimize the following energy function:
- this class of discrete optimization problem is solved in an efficient manner via dynamic programming, which identifies an optimal path in the solution space.
- H(i + 1, l) = E_d(i + 1, l) + min_k { H(i, k) + E_s(i, k, i + 1, l) } [Equation 6]
- the processing system 132 then traverses back to locate the best candidates for a previous point given the decision for the current point, which was recorded earlier during the update. In an example embodiment, the processing system 132 continues this procedure until the first point is reached where the optimal path is found. In this regard, the optimal path provides a target position to snap for each edge point.
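The forward-update-and-backtrack procedure can be sketched as follows; the truncated Euclidean smoothness term and the `d_max` value are illustrative assumptions standing in for Equations 4-6:

```python
def snap_targets(data_cost, positions, d_max=4.0):
    """Viterbi-style dynamic programming over candidate columns.
    data_cost[i][k]: cost of snapping edge point i to its candidate k.
    positions[i][k]: 2D position of candidate k for edge point i.
    Returns the chosen candidate index for every edge point."""
    n, m = len(data_cost), len(data_cost[0])

    def e_s(pa, pb):
        # penalize large deviation of neighboring targets, truncated at d_max
        d = ((pa[0] - pb[0]) ** 2 + (pa[1] - pb[1]) ** 2) ** 0.5
        return min(d, d_max)

    H = [data_cost[0][:]]   # accumulated minimal energy per candidate
    back = []               # recorded decisions for the backward pass
    for i in range(1, n):
        row, brow = [], []
        for l in range(m):
            best_k = min(range(m),
                         key=lambda k: H[-1][k] + e_s(positions[i - 1][k],
                                                      positions[i][l]))
            brow.append(best_k)
            row.append(data_cost[i][l] + H[-1][best_k]
                       + e_s(positions[i - 1][best_k], positions[i][l]))
        back.append(brow)
        H.append(row)
    # traverse back from the best final candidate to recover the optimal path
    k = min(range(m), key=lambda l: H[-1][l])
    path = [k]
    for brow in reversed(back):
        k = brow[k]
        path.append(k)
    path.reverse()
    return path
```

With data costs that strongly prefer one candidate per point, the recovered path simply follows those preferences while the smoothness term breaks ties.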
- FIGS. 8A-8C illustrate at least one benefit associated with using image gradients from multiple color spaces.
- FIG. 8A is a non-limiting example of a color image 302.
- each of FIGS. 8B and 8C illustrates an enlarged view of the boxed region 318 of FIG. 8A.
- FIGS. 8B and 8C include the raw depth edge points 320 and their target positions 324 after the optimization process 470.
- FIGS. 8B and 8C also include the paths 322, which show the movements of the raw depth edge points 320 to their corresponding target positions 324. More specifically, FIG. 8B shows the results obtained by only using the image gradient in the RGB space.
- In contrast, FIG. 8C shows the result obtained by combining image gradients from both the RGB space and the YCbCr space.
- the fusion of multiple color spaces improves the robustness of the edge-snapping framework compared to that of a single color space, as shown in FIG. 8B.
- FIGS. 9A-9C illustrate at least one other benefit associated with using image gradients from multiple color spaces.
- FIG. 9A illustrates a non-limiting example of a color image (or raw RGB data), as obtained from the RGB-D camera 112.
- FIG. 9B is a non-limiting example of the magnitude of image gradients from the red channel 326.
- the circled regions 328 highlight instances in which the image gradients of the object boundary 330 of the user's hand 204 are relatively low in the RGB space.
- FIG. 9C is a non-limiting example of the magnitude of image gradients in a converted Cr channel 332, where the object boundary 330 of the user's hand 204, particularly at the fingertips, is more visible.
- FIG. 10A illustrates a non-limiting example of a color image 302.
- FIGS. 10B and 10C are enlarged views of the boxed region 334 of FIG. 10A. More specifically, FIG. 10B illustrates the edge-snapping results without a smoothness constraint. In contrast, FIG. 10C illustrates the edge-snapping results with at least one smoothness constraint.
- FIGS. 10B and 10C include the raw depth edge points 320 and their target positions 324. In addition, FIGS. 10B and 10C also include the paths 322, which show the movements of the raw depth edge points 320 to their corresponding target positions 324.
- the results with smoothness constraints, as shown in FIG. 10C, provide greater edge-snapping accuracy compared to those without smoothness constraints, as shown in FIG. 10B.
- FIGS. 11A-11C illustrate a number of benefits of applying smoothness terms.
- FIG. 11A illustrates a non-limiting example of a color image 302.
- FIGS. 11B and 11C are enlarged views of the boxed region 336 of FIG. 11 A.
- FIG. 11B illustrates the edge-snapping results without a smoothness constraint.
- FIG. 11C illustrates the edge-snapping results with at least one smoothness constraint.
- FIGS. 11B and 11C include the raw depth edge points 320 and their target positions 324.
- FIGS. 11B and 11C also include the paths 322, which show the movements of the raw depth edge points 320 to their corresponding target positions 324.
- the results with smoothness constraints, as shown in FIG. 11C, provide better edge-snapping accuracy compared to those without smoothness constraints, as shown in FIG. 11B.
- FIGS. 10B and 11B illustrate examples in which some depth edge points were snapped to undesirable positions having high image gradients.
- the inclusion of the smoothness term within the process 400 can effectively prevent such artifacts from occurring, as shown in FIGS. 10C and 11C.
- FIGS. 12 and 13A-13E relate to the depth map enhancement process 480. Specifically, FIG. 12 is a flow diagram of an example implementation 1200 of the depth map enhancement process 480.
- FIGS. 13A-13E illustrate depth map enhancement based on edge-snapping. More specifically, each of FIGS. 13A-13E illustrates a depth map 300 overlaying a color image 302. Also, in each of FIGS. 13A-13E, the curve 320 represents a boundary of a thumb from the user's hand 204, as taken from the depth map 300. In this example, the shaded region 340, bounded by at least the curve 318, has valid depth measurements while the remaining regions have zero depths.
- FIGS. 13B-13E also illustrate depth edge points 320A and 320B (as taken from curve 320) and their corresponding target positions 342A and 342B. Also, the points 344A and 344B, illustrated as triangles in FIGS. 13C and 13E, represent the depth points (or pixels), which are used for retrieving reference depth values. That is, FIGS. 13A-13E illustrate examples of certain aspects of the example implementation 1200, as discussed below.
- the processing system 132 considers two consecutive depth edge points 320A and 320B as well as their target positions 342A and 342B, which form a quadrilateral as illustrated by the shaded region 340 in each of FIGS. 13B and 13D.
- the processing system 132 processes all of the depth points (or pixels) inside this quadrilateral (or the shaded region 340) for enhancement. In an example embodiment, this processing is performed for each pair of consecutive depth edge points 320A and 320B. Essentially, each depth point (or pixel) inside the quadrilateral (or the shaded region 340) has incorrect depth measurements due to sensor noise.
- the true depth of each of these points (or pixels) is recovered.
- the processing system 132 is configured to perform an approximation to estimate reasonable depth values for these depth points (or pixels) that are generally sufficient.
- the first type of error ("case 1") includes at least one missing value, where the object boundaries of the depth map 300 are generally inside the object, as shown within the boxed region 336 of FIG. 13A.
- Another type of error ("case 2") occurs when depth points (or pixels) belonging to an object further away are labeled with depth values from the occluding object, as shown within the boxed region 338 of FIG. 13A.
- for both cases, the processing system 132 implements the following methodology to modify the depth values.
- at step 1204, in an example embodiment, for each depth edge point (or pixel) of the pair of consecutive depth edge points 320A and 320B, the processing system 132 traverses one step back along the direction from the target to this pixel and retrieves the depth value as a reference depth value. Examples of these reference pixels are represented by the triangles 344A and 344B in FIGS. 13C and 13E, respectively.
- the processing system 132 then takes the average of the reference depth values from the pair and assigns it to all of the depth points (or pixels) inside the region.
- the reference values are taken from a region inside the finger. Therefore, the target region 340 will be filled in with some depth from the finger, resulting in filling effects for the missing values.
- the reference values will be zero and the target region will be replaced with zero depth resulting in this piece being removed.
- the processing system 132 achieves both effects, as desired. When considering speed, this approximation is sufficient for dynamic occlusion handling.
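The per-pair fill step can be sketched as follows: the average of the two reference depths is written to every pixel inside the quadrilateral formed by the edge-point pair and their snapping targets. The even-odd point-in-polygon rasterization is an illustrative choice, not the patent's exact method:

```python
def fill_quad_depth(depth, p_a, p_b, t_a, t_b, ref_a, ref_b):
    """Assign the average of two reference depths to every pixel inside
    the quadrilateral (p_a, p_b, t_b, t_a).  depth is a mutable 2D grid;
    points are (x, y) integer pixel coordinates."""
    quad = [p_a, p_b, t_b, t_a]
    avg = 0.5 * (ref_a + ref_b)

    def inside(x, y):
        # even-odd ray-crossing test; horizontal edges are skipped safely
        hit, n = False, len(quad)
        for i in range(n):
            (x0, y0), (x1, y1) = quad[i], quad[(i + 1) % n]
            if (y0 > y) != (y1 > y) and x < x0 + (y - y0) * (x1 - x0) / (y1 - y0):
                hit = not hit
        return hit

    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    for y in range(min(ys), max(ys) + 1):
        for x in range(min(xs), max(xs) + 1):
            if inside(x + 0.5, y + 0.5):  # sample at the pixel center
                depth[y][x] = avg
    return depth
```

When both references come from inside the object, this fills missing values; when both are zero, the mislabeled sliver is removed, matching the two cases described above.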
- the processing system 132 is configured to implement an extrapolation process to estimate the depth values.
- the depth map enhancement process 480 is highly parallel. Accordingly, with regard to the processing system 132, the CPU, the GPU, or a combination thereof can perform the depth map enhancement process 480.
- the edge-snapping moves the depth edge points 320A and 320B in directions towards their target positions 342A and 342B.
- the processing system 132 is configured to process all or substantially all of the depth points (or pixels) that fall within the regions of the edge-snapping.
- the process 400 includes a glasses view rendering process 490.
- FIG. 14 is a flow diagram of an example implementation 1400 of the glasses view rendering process 490 according to an example embodiment.
- the CPU, the GPU, or a combination thereof can perform this example implementation 1400.
- the GPU of the processing system 132 is configured to perform the glasses view rendering process 490.
- FIGS. 15A-15C illustrate examples of certain aspects of the example implementation 1400.
- the processing system 132 transforms the depth data from the video view 200 to the glasses view 212.
- the transformation is obtained via calibration using software technology for AR applications, such as ARToolKit or other similar software programs. Due to the differences between the video view 200 and the glasses view 212, empty regions (holes) might be created as illustrated in FIG. 15A.
- the curve 1500 represents the object surface. Also in FIG. 15A, point p1 and point p2 are on the surface and project to nearby points in the video view 200, and p1 is further than p2.
- the point (or pixel) nearby p2 follows the ray R, for which there is no direct depth measurement in this case.
- One way to obtain the depth is via interpolation between point p1 and point p2, ending up with the point p4.
- this interpolation might be problematic for occlusion handling.
- point p4 will occlude the virtual object 202.
- a safer way, which is also used for view synthesis, is to take the larger depth between point p1 and point p2 as the estimation, resulting in point p3 as shown in FIG. 15C.
- the processing system 132 performs a number of operations when transforming the scene depth from video view 200 to the glasses view 212 before depth testing in glasses view 212.
- the processing system 132 triangulates all or substantially all of the points (or pixels) on the image grid and renders the enhanced depth map as a triangular mesh to a depth texture.
- At step 1406, in an example embodiment, during this rendering, the processing system 132 identifies the triangles with an edge longer than a certain threshold. As one non-limiting example, the threshold is 20 mm. In this regard, the points (or pixels) within these triangles correspond to the case illustrated in FIG. 15A.
- the processing system 132 assigns these points (or pixels) with the maximum depth among the three end points of this triangle.
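The long-edge test and max-depth assignment can be sketched per triangle as follows; returning `None` to signal ordinary interpolation is an illustrative convention:

```python
def triangle_fill_depth(vertices, depths, threshold=20.0):
    """Decide the depth to write for a mesh triangle's interior pixels:
    if any edge exceeds the threshold (e.g. 20 mm), the triangle spans a
    depth discontinuity, so use the maximum depth of its three end points
    instead of interpolating (which would create false occluders).
    Returns None when normal interpolation is safe."""
    a, b, c = vertices

    def dist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q)) ** 0.5

    long_edge = any(dist(p, q) > threshold
                    for p, q in ((a, b), (b, c), (c, a)))
    return max(depths) if long_edge else None
```

In a shader implementation this decision would run per fragment; the max-depth choice mirrors the "take the larger depth" strategy of FIG. 15C.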
- the processing system 132 renders the depths for dynamic occlusion handling.
- the processing system 132 is configured to implement this process via appropriate software technology, such as OpenGL Shader or any other software program, and apply this process to both the left and right view of the glasses.
- the process 400 is configured to leverage the data provided by the RGB-D camera 112. More specifically, the dynamic occlusion handling system 120 includes an edge-snapping algorithm that snaps (or moves) an object boundary of the raw depth data towards the corresponding color image and then enhances the object boundary of the depth map based on the edge-snapping results.
- This edge-snapping is particularly beneficial because raw depth data may contain holes, low resolution, and significant noise around the boundaries, thereby introducing visual artifacts that are undesirable in various applications, including AR.
- the enhanced depth maps are then used for depth testing with the virtual objects 202 for dynamic occlusion handling. Further, there are several AR applications that can benefit from this dynamic occlusion handling. As non-limiting examples, this dynamic occlusion handling can be applied to at least the following two AR use cases.
- a first AR use case involves an automotive repair application, where a user uses an AR system for guidance.
- the automotive repair application includes an AR scene 600 with a 3D printed dashboard as an example.
- the AR scene 600 includes virtual objects 202, specifically a virtual touch screen and a windshield.
- the following discussion includes positioning a user's hand 204 in different locations of the AR scene 600.
- in some instances, the user's hand 204 should be occluded by the touch screen but not the windshield, while in others, the user's hand 204 should occlude both virtual objects 202.
- FIGS. 16A-16C, 17A-17C, and 18A-18C illustrate visual results of different occlusion handling strategies in an AR-assisted automotive repair scenario.
- FIGS. 16A-16C illustrate an instance in which the user's hand 204 should reside between two virtual objects 202 (e.g., the virtual touchscreen and the virtual windshield).
- FIG. 16A illustrates the visual results of virtual objects 202 in relation to the user's hand 204 without any occlusion handling.
- As shown in FIG. 16A, instead of residing between the two virtual objects 202, as desired, the user's hand 204 is improperly occluded by both virtual objects 202.
- FIG. 16B illustrates the visual results with occlusion handling using raw depth data.
- As shown in FIG. 16B, the AR scene 600 suffers from defects, such as various visual artifacts, as indicated by the arrows 602.
- FIG. 16C illustrates the visual results of virtual objects 202 with dynamic occlusion handling using an enhanced depth map, as disclosed herein.
- the AR scene 600 includes a boundary for the user's hand 204, which is better preserved and properly positioned in relation to the virtual objects 202, when dynamic occlusion handling is performed with an enhanced depth map.
- FIGS. 17A-17C illustrate an instance in which the user's hand 204 should occlude both virtual objects 202 (e.g., the virtual touchscreen and the virtual windshield).
- FIG. 17A illustrates the visual results of virtual objects 202 in relation to the user's hand 204, without any occlusion handling. As shown in FIG. 17A, instead of residing in front of the virtual objects 202, as desired, the user's hand 204 is improperly occluded by both virtual objects 202.
- FIG. 17B illustrates the visual results of the virtual objects 202 in relation to the user's hand 204 with occlusion handling using raw depth data.
- As shown in FIG. 17B, the AR scene 600 suffers from defects, such as various visual artifacts, as indicated by the arrows 602.
- FIG. 17C illustrates the visual results of the virtual objects 202 in relation to the user's hand 204 with dynamic occlusion handling using an enhanced depth map, as disclosed herein.
- the AR scene 600 includes a boundary for the user's hand 204, which is better preserved and properly positioned in relation to the virtual objects 202, when dynamic occlusion handling is performed with an enhanced depth map.
- FIGS. 18A-18C illustrate instances in which a user's hand 204 should occlude at least two virtual objects 202 (e.g., the virtual touchscreen and the virtual windshield).
- FIG. 18A illustrates the visual results of virtual objects 202 in relation to the user's hand 204, without any occlusion handling.
- As shown in FIG. 18A, instead of residing in front of the virtual objects 202, as desired, the finger of the user's hand 204 is improperly occluded by both virtual objects 202.
- FIG. 18B illustrates the visual results of the virtual objects 202 in relation to the user's hand 204 with occlusion handling using raw depth data.
- FIG. 18C illustrates the visual results of the virtual objects 202 in relation to the user's hand 204 with dynamic occlusion handling using an enhanced depth map, as disclosed herein.
- the AR scene 600 includes a boundary for the user's hand 204, which is better preserved and properly positioned in relation to the virtual objects 202, when dynamic occlusion handling is performed with an enhanced depth map.
- a second AR use case involves AR gaming.
- the real scene serves as the playground while the virtual treasure chest is a virtual object 202 hidden somewhere in the real scene. More specifically, in this example, the virtual treasure chest is hidden behind a closet door 606 and behind a box 604. Therefore, in this AR scene 600, to be able to find the hidden virtual treasure chest, the user should open the closet door 606 and remove the box 604.
- FIGS. 19A-19C and 20A-20C illustrate visual results of different occlusion handling strategies in an AR treasure hunting scenario.
- the virtual object 202 e.g., treasure chest
- FIGS. 19A and 20A illustrate visual effects without occlusion handling.
- the virtual object 202 occludes the box 604 and the closet door 606 and is therefore not positioned behind the box 604, as intended.
- As shown in FIGS. 19B and 20B, when applying occlusion handling on raw depth data, the virtual object 202 is correctly positioned behind the closet door 606, as intended, but incorrectly occludes the box 604.
- FIGS. 19C and 20C illustrate the visual effects with dynamic occlusion handling, as discussed herein, in which enhanced depth maps are used and contribute to the AR scene 600.
- the virtual object 202 is rendered behind both the box 604 and the closet door 606 in an appropriate manner and without any visual artifacts. That is, with dynamic occlusion handling, the user is provided with a proper and realistic AR experience.
- FIGS. 21A to 23D illustrate object boundaries in color images, raw depth maps, and enhanced depth maps.
- each of FIGS. 21A, 21B, 21C and 21D is a color image 302 with a ground-truth boundary 800 of the user's hand 204.
- each of FIGS. 21A, 21B, 21C, and 21D presents different hand gestures and/or background scene.
- in FIGS. 22A-22D and 23A-23D, the illustrations utilize, for instance, a standard JET color scheme, with the corresponding color images 302 overlaying their corresponding depth maps 300. More specifically, FIGS. 22A-22D include the raw depth maps 300 while FIGS. 23A-23D include the enhanced depth maps 900.
- the object boundaries 902 of the hands 204 in the enhanced depth maps 900 correspond more closely to the ground-truth boundary 800 than the object boundaries 312 of the hands 204 in the raw depth maps 300. That is, the enhanced depth maps 900 provide improved object boundaries 902, thereby achieving dynamic occlusion handling that results in an improved AR experience.
- FIGS. 21A-21D visualize the desired ground-truth boundary 800 of the hand 204 over the original color image 302.
- the object boundary in the depth maps should match this curve.
- the raw depth maps 300 suffer from various types of noises and missing values, resulting in mismatches with the ground-truth boundary 800.
- FIGS. 23A-23D represent the results of example embodiments after depth map enhancement. As shown by the results of FIGS. 23A-23D, the process 400 improves the consistency of boundaries of objects between image data (e.g., RGB data) and depth data.
- the system 100 provides dynamic occlusion handling, which enables accurate depth perception in AR applications. Dynamic occlusion handling therefore ensures a realistic and immersive AR experience.
- existing solutions typically suffer from various limitations, e.g., a static scene assumption or high computational complexity.
- this system 100 is configured to implement a process 400 that includes a depth map enhancement process 480 for dynamic occlusion handling in AR applications.
- this system 100 implements an edge-snapping approach, formulated as discrete optimization, that improves the consistency of object boundaries between RGB data and depth data.
- the system 100 solves the optimization problem efficiently via dynamic programming.
- the system 100 is configured to run at an interactive rate on a computing platform (e.g., a tablet platform).
- the system 100 provides a rendering strategy for glasses view 212 to avoid holes and artifacts due to interpolation that originate from differences between the video view 200 (data acquisition sensor) and the glasses view 212. Furthermore, experimental evaluations demonstrate that this edge-snapping approach largely enhances the raw sensor data and is particularly suitable compared to several related approaches in terms of both speed and quality. Also, unlike other approaches that focus on the entire image, this process 400 advantageously focuses on the edge regions. Moreover, the system 100 delivers visually pleasing dynamic occlusion effects during user interactions.
- As aforementioned, in an example embodiment, the system 100 is configured to perform edge-snapping between the depth maps and color images, primarily based on image gradients.
- when the characteristics of the sensor data from the depth sensor 114 provide raw depth edges that are close to the corresponding desired color edges, the system 100 is configured to model the color characteristics of individual objects for segmentation. Additionally or alternatively, the system 100 is configured to further enhance the above-mentioned energy function by taking into account other information besides image gradients, such as color distributions or other relevant data, to better accommodate complicated scenarios such as a cluttered scene. Additionally or alternatively, the system 100 can consider and include temporal information. Additionally or alternatively, the system 100 can include explicit tracking of moving objects to enhance the robustness of the edge-snapping framework.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Graphics (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Geometry (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Image Processing (AREA)
- Processing Or Creating Images (AREA)
- Image Analysis (AREA)
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2017288933A AU2017288933B2 (en) | 2016-06-27 | 2017-06-26 | Systems and methods for dynamic occlusion handling |
JP2018567839A JP6813600B2 (en) | 2016-06-27 | 2017-06-26 | Systems and methods for dynamic occlusion processing |
CN201780052439.0A CN109844819A (en) | 2016-06-27 | 2017-06-26 | System and method for dynamic barriers disposition |
EP17821009.2A EP3475923A4 (en) | 2016-06-27 | 2017-06-26 | Systems and methods for dynamic occlusion handling |
BR112018077095A BR112018077095A8 (en) | 2016-06-27 | 2017-06-26 | SYSTEMS AND METHODS FOR MANIPULATING DYNAMIC OCCLUSION |
KR1020197002344A KR102337931B1 (en) | 2016-06-27 | 2017-06-26 | Systems and methods for dynamic occlusion handling |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201662354891P | 2016-06-27 | 2016-06-27 | |
US62/354,891 | 2016-06-27 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018005359A1 true WO2018005359A1 (en) | 2018-01-04 |
Family
ID=60675555
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2017/039278 WO2018005359A1 (en) | 2016-06-27 | 2017-06-26 | Systems and methods for dynamic occlusion handling |
Country Status (8)
Country | Link |
---|---|
US (1) | US10706613B2 (en) |
EP (1) | EP3475923A4 (en) |
JP (1) | JP6813600B2 (en) |
KR (1) | KR102337931B1 (en) |
CN (1) | CN109844819A (en) |
AU (1) | AU2017288933B2 (en) |
BR (1) | BR112018077095A8 (en) |
WO (1) | WO2018005359A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220319129A1 (en) * | 2020-01-22 | 2022-10-06 | Facebook Technologies, Llc | 3D Reconstruction Of A Moving Object |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10645357B2 (en) * | 2018-03-01 | 2020-05-05 | Motorola Mobility Llc | Selectively applying color to an image |
US10643336B2 (en) | 2018-03-06 | 2020-05-05 | Sony Corporation | Image processing apparatus and method for object boundary stabilization in an image of a sequence of images |
CN109035375B (en) * | 2018-06-22 | 2023-11-10 | 广州久邦世纪科技有限公司 | OpenGL-based 3D glasses rendering method and system |
CN110072046B (en) * | 2018-08-24 | 2020-07-31 | 北京微播视界科技有限公司 | Image synthesis method and device |
US11227446B2 (en) * | 2019-09-27 | 2022-01-18 | Apple Inc. | Systems, methods, and graphical user interfaces for modeling, measuring, and drawing using augmented reality |
CN111275827B (en) * | 2020-02-25 | 2023-06-16 | 北京百度网讯科技有限公司 | Edge-based augmented reality three-dimensional tracking registration method and device and electronic equipment |
US11107280B1 (en) * | 2020-02-28 | 2021-08-31 | Facebook Technologies, Llc | Occlusion of virtual objects in augmented reality by physical objects |
JP2022123692A (en) * | 2021-02-12 | 2022-08-24 | ソニーグループ株式会社 | Image processing device, image processing method and image processing system |
US11941764B2 (en) | 2021-04-18 | 2024-03-26 | Apple Inc. | Systems, methods, and graphical user interfaces for adding effects in augmented reality environments |
US11741671B2 (en) | 2021-06-16 | 2023-08-29 | Samsung Electronics Co., Ltd. | Three-dimensional scene recreation using depth fusion |
US11887267B2 (en) | 2021-07-07 | 2024-01-30 | Meta Platforms Technologies, Llc | Generating and modifying representations of hands in an artificial reality environment |
WO2023070421A1 (en) * | 2021-10-28 | 2023-05-04 | Intel Corporation | Methods and apparatus to perform mask-based depth enhancement for multi-view systems |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013173749A1 (en) * | 2012-05-17 | 2013-11-21 | The Regents Of The University Of California | Sampling-based multi-lateral filter method for depth map enhancement and codec |
US20140016862A1 (en) * | 2012-07-16 | 2014-01-16 | Yuichi Taguchi | Method and Apparatus for Extracting Depth Edges from Images Acquired of Scenes by Cameras with Ring Flashes Forming Hue Circles |
US20140269935A1 (en) * | 2010-07-07 | 2014-09-18 | Spinella Ip Holdings, Inc. | System and method for transmission, processing, and rendering of stereoscopic and multi-view images |
US20140294237A1 (en) * | 2010-03-01 | 2014-10-02 | Primesense Ltd. | Combined color image and depth processing |
US20150139533A1 (en) * | 2013-11-15 | 2015-05-21 | Htc Corporation | Method, electronic device and medium for adjusting depth values |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7787678B2 (en) * | 2005-10-07 | 2010-08-31 | Siemens Corporation | Devices, systems, and methods for processing images |
JP4789745B2 (en) | 2006-08-11 | 2011-10-12 | キヤノン株式会社 | Image processing apparatus and method |
WO2011046607A2 (en) * | 2009-10-14 | 2011-04-21 | Thomson Licensing | Filtering and edge encoding |
CA2803028C (en) * | 2010-06-29 | 2019-03-12 | 3Shape A/S | 2d image arrangement |
US9122053B2 (en) * | 2010-10-15 | 2015-09-01 | Microsoft Technology Licensing, Llc | Realistic occlusion for a head mounted augmented reality display |
KR101972356B1 (en) * | 2010-12-21 | 2019-04-25 | 한국전자통신연구원 | An apparatus and a method for detecting upper body |
US10455219B2 (en) | 2012-11-30 | 2019-10-22 | Adobe Inc. | Stereo correspondence and depth sensors |
RU2013106513A (en) * | 2013-02-14 | 2014-08-20 | ЭлЭсАй Корпорейшн | METHOD AND DEVICE FOR IMPROVING THE IMAGE AND CONFIRMING BORDERS USING AT LEAST A SINGLE ADDITIONAL IMAGE |
US9514574B2 (en) | 2013-08-30 | 2016-12-06 | Qualcomm Incorporated | System and method for determining the extent of a plane in an augmented reality environment |
US9754377B2 (en) * | 2014-08-15 | 2017-09-05 | Illinois Institute Of Technology | Multi-resolution depth estimation using modified census transform for advanced driver assistance systems |
JP2016048467A (en) | 2014-08-27 | 2016-04-07 | Kddi株式会社 | Motion parallax reproduction method, device and program |
US9824412B2 (en) * | 2014-09-24 | 2017-11-21 | Intel Corporation | Position-only shading pipeline |
US20160140761A1 (en) | 2014-11-19 | 2016-05-19 | Microsoft Technology Licensing, Llc. | Using depth information for drawing in augmented reality scenes |
2017
- 2017-06-26 WO PCT/US2017/039278 patent/WO2018005359A1/en unknown
- 2017-06-26 EP EP17821009.2A patent/EP3475923A4/en active Pending
- 2017-06-26 JP JP2018567839A patent/JP6813600B2/en active Active
- 2017-06-26 AU AU2017288933A patent/AU2017288933B2/en active Active
- 2017-06-26 CN CN201780052439.0A patent/CN109844819A/en active Pending
- 2017-06-26 KR KR1020197002344A patent/KR102337931B1/en active IP Right Grant
- 2017-06-26 US US15/633,221 patent/US10706613B2/en active Active
- 2017-06-26 BR BR112018077095A patent/BR112018077095A8/en unknown
Non-Patent Citations (1)
Title |
---|
See also references of EP3475923A4 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220319129A1 (en) * | 2020-01-22 | 2022-10-06 | Facebook Technologies, Llc | 3D Reconstruction Of A Moving Object |
US11715272B2 (en) * | 2020-01-22 | 2023-08-01 | Meta Platforms Technologies, Llc | 3D reconstruction of a moving object |
Also Published As
Publication number | Publication date |
---|---|
AU2017288933B2 (en) | 2021-09-23 |
KR102337931B1 (en) | 2021-12-13 |
JP2019526847A (en) | 2019-09-19 |
EP3475923A1 (en) | 2019-05-01 |
JP6813600B2 (en) | 2021-01-13 |
BR112018077095A2 (en) | 2019-04-02 |
AU2017288933A1 (en) | 2019-02-14 |
US20170372510A1 (en) | 2017-12-28 |
CN109844819A (en) | 2019-06-04 |
EP3475923A4 (en) | 2020-01-01 |
BR112018077095A8 (en) | 2023-04-04 |
KR20190014105A (en) | 2019-02-11 |
US10706613B2 (en) | 2020-07-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2017288933B2 (en) | Systems and methods for dynamic occlusion handling | |
Du et al. | Edge snapping-based depth enhancement for dynamic occlusion handling in augmented reality | |
EP3144899B1 (en) | Apparatus and method for adjusting brightness of image | |
US10082879B2 (en) | Head mounted display device and control method | |
KR102298378B1 (en) | Information processing device, information processing method, and program | |
EP1969560B1 (en) | Edge-controlled morphological closing in segmentation of video sequences | |
US9767611B2 (en) | Information processing apparatus and method for estimating depth values using an approximate plane | |
TWI536318B (en) | Depth measurement quality enhancement | |
US9141873B2 (en) | Apparatus for measuring three-dimensional position, method thereof, and program | |
US9269194B2 (en) | Method and arrangement for 3-dimensional image model adaptation | |
CN107516335A (en) | The method for rendering graph and device of virtual reality | |
JP6491517B2 (en) | Image recognition AR device, posture estimation device, and posture tracking device | |
CN112233215B (en) | Contour rendering method, device, equipment and storage medium | |
JP5728080B2 (en) | Wrinkle detection device, wrinkle detection method and program | |
CN107808388B (en) | Image processing method and device containing moving object and electronic equipment | |
KR102511620B1 (en) | Apparatus and method for displaying augmented reality | |
JP6061334B2 (en) | AR system using optical see-through HMD | |
US11682234B2 (en) | Texture map generation using multi-viewpoint color images | |
CN114066814A (en) | Gesture 3D key point detection method of AR device and electronic device | |
Abate et al. | An image based approach to hand occlusions in mixed reality environments | |
JP2022524787A (en) | Methods, systems, and programs for object detection range estimation | |
CN108885778A (en) | Image processing equipment and image processing method | |
CN113064539B (en) | Special effect control method and device, electronic equipment and storage medium | |
JP6244885B2 (en) | Image processing apparatus, image processing method, and program | |
CN114066715A (en) | Image style migration method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17821009; Country of ref document: EP; Kind code of ref document: A1 |
| ENP | Entry into the national phase | Ref document number: 2018567839; Country of ref document: JP; Kind code of ref document: A |
| NENP | Non-entry into the national phase | Ref country code: DE |
| REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112018077095 |
| ENP | Entry into the national phase | Ref document number: 20197002344; Country of ref document: KR; Kind code of ref document: A |
| ENP | Entry into the national phase | Ref document number: 2017821009; Country of ref document: EP; Effective date: 20190128 |
| ENP | Entry into the national phase | Ref document number: 2017288933; Country of ref document: AU; Date of ref document: 20170626; Kind code of ref document: A |
| ENP | Entry into the national phase | Ref document number: 112018077095; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20181226 |