WO2024013142A1 - Image capture device with wavelength separation device - Google Patents

Image capture device with wavelength separation device

Info

Publication number
WO2024013142A1
WO2024013142A1 (application PCT/EP2023/069135)
Authority
WO
WIPO (PCT)
Prior art keywords
light
wavelength
scene
reflections
originated
Prior art date
Application number
PCT/EP2023/069135
Other languages
French (fr)
Inventor
Sergey OMELKOV
Peeter PIKSARV
Toomas BERGMANN
Heli VALTNA
Andreas VALDMANN
Original Assignee
Lightcode Photonics Oü
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lightcode Photonics Oü filed Critical Lightcode Photonics Oü
Publication of WO2024013142A1 publication Critical patent/WO2024013142A1/en

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/48Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
    • G01S7/481Constructional features, e.g. arrangements of optical elements
    • G01S7/4811Constructional features, e.g. arrangements of optical elements common to transmitter and receiver
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/86Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders

Definitions

  • Image capture devices are utilized for many different purposes. In addition to the common purpose of capturing images by users for preserving memories, image capture devices are also commonly used to capture images of scenes or environments. The images are processed to identify and learn information about the scene or environment. This information can be used as input to other systems for decision making by the system. For example, image capture devices may be utilized to capture images of a scene to identify objects contained within the environment. This information may be useful for systems to make decisions based upon the objects contained within the environment, for example, in self-driving car applications, service robotics applications, and/or other applications.
  • one aspect provides a system for capturing images utilizing a depth sensor and image sensor according to claim 1.
  • the system includes: a light modulation system that receives light from a scene, wherein the light includes infrared light reflections and visible light reflections; a wavelength separation device positioned between the light modulation system and the depth sensor and image sensor, wherein the infrared light reflections reflected from the light modulation system pass through the wavelength separation device onto the depth sensor and wherein the visible light reflections reflected from the light modulation system are reflected from the wavelength separation device onto the image sensor; the image sensor that captures the visible light reflections, wherein the system identifies objects of the scene from the visible light reflections; and the depth sensor that captures the infrared light reflections, wherein the system identifies depth information for the objects of the scene from the infrared light reflections.
  • depth information is, for example, to be understood as describing the distance to the system, in particular to the depth sensor.
  • Another aspect provides a method for capturing images utilizing a depth sensor and image sensor according to claim 14.
  • the method includes: receiving, at a light modulation system, light from a scene, wherein the light includes infrared light reflections and visible light reflections; reflecting, using the light modulation system, the light from the scene onto a wavelength separation device positioned between the light modulation system and the depth sensor and the image sensor, wherein the infrared light reflections reflected from the light modulation system pass through the wavelength separation device onto the depth sensor and wherein the visible light reflections reflected from the light modulation system are reflected from the wavelength separation device onto the image sensor; identifying depth information for objects of the scene from the infrared light reflections captured at the depth sensor; and identifying objects of the scene from the visible light reflections captured at the image sensor.
  • FIG. 1 illustrates an example block diagram of an image capture device including a wavelength separation device, depth sensor, and image sensor.
  • FIG. 2 illustrates example patterns that can be utilized by the system to increase the resolution of captured images.
  • FIG. 3 illustrates an example method for capturing an image of a scene utilizing a depth sensor and image sensor where infrared light travels through a wavelength separation device to the depth sensor and visible light is reflected from the wavelength separation device to the image sensor.
  • Inaccuracies in the object and object position information can cause significant repercussions in an application where a system is moving and making decisions with respect to objects within an environment. For example, if a self-driving car application receives inaccurate object position information, the car may run into one or more objects within the environment.
  • these applications (e.g., self-driving cars, robotics, object recognition for system movement, etc.) have a continuous demand for sensors that enable understanding of the physical world or environment around them and require accurate information regarding the environment.
  • The two most common sensors for capturing image information from an environment are LiDAR (Light Detection and Ranging) sensors, which capture depth information, and cameras, which capture two-dimensional color images of the environment.
  • the cameras provide high resolution color images of the environment, but do not include depth information identifying a three-dimensional position of an object within the environment.
  • utilizing the LiDAR or other depth sensors in conjunction with the cameras or other image sensors allows for the generation of a three-dimensional image of the environment containing an image of the environment and also object position information.
  • the LiDARs provide accurate distance measurements, but only work on certain wavelengths, with the preferred being infrared so as to minimize the interference with other sensors such as the image sensor, human eye, and/or the like.
  • Some conventional systems utilize cameras and LiDARs side-by-side.
  • the camera and LiDAR systems are positioned next to each other and each utilizes its own lens for capturing the scene.
  • the output of both of these systems must be aligned with each other. Therefore, the parallax error has to be accounted for when combining the outputs.
  • the systems may be calibrated based upon their position with respect to each other. However, the output still has to be processed to align the outputs. This processing requires computational processing to transform either the LiDAR point cloud to camera space or the camera space to LiDAR point cloud. To perform this processing, the system needs to utilize external markers, calibration images, or intrinsic features found in the captured scenes.
  • the described system and method provides an image capture device including a wavelength separation device, depth sensor, and image sensor.
  • the system includes a light modulation system that receives light from a scene.
  • the reflected light includes both infrared light reflections and visible light reflections.
  • the system may include an infrared light source to illuminate the scene with infrared light that can then be captured by the light modulation system.
  • the system includes a wavelength separation device positioned between the light modulation system and the depth sensor and image sensor of the system.
  • the light received from the scene may also include light that has originated in the scene, i.e. both infrared light originated in the scene and visible light originated in the scene.
  • the term "light reflections" is used, which is to be understood to also include light that has originated in the scene.
  • the infrared light reflections reflected from or having passed through the light modulation system pass through the wavelength separation device onto the depth sensor.
  • the visible light reflections reflected from or having passed through the light modulation system are reflected from the wavelength separation device onto the image sensor.
  • the image sensor captures the visible light reflections which are used to generate the camera image (e.g., the two-dimensional color image).
  • the camera image is used to identify objects of the scene.
  • the depth sensor captures the infrared light reflections which are used to generate the depth map.
  • the depth map is used to identify depth information for the objects of the scene.
  • the depth information can also be used to distinguish objects having similar colors.
  • the described system and method allow for the capture of both image information and depth information of a scene practically simultaneously.
  • the described system allows for the utilization of LiDAR sensors that capture infrared so as to minimize the interference with other sensors.
  • the described system utilizes a single lens to capture the scene.
  • the described system does not require the complicated alignment and calibration of the conventional systems that have separate lenses for the LiDAR and camera. Therefore, the described system minimizes the extensive post-processing required by conventional systems. Accordingly, the described system and method provides a technique that allows for capturing images of scenes including both high-resolution color images and depth information for the scene with less alignment requirements, computational requirements, and parallax errors as compared to the conventional systems.
  • the described system combines a LiDAR and two-dimensional color camera into a single unit that utilizes a single lens, also referred to as an objective.
  • the LiDAR and two-dimensional color camera share the same optical path.
  • the scene is obtained through the lens and then is directed, via a light modulation system, to each of the image sensors which corresponds to the two-dimensional color camera and the depth sensor which corresponds to the LiDAR.
  • a wavelength separation device is placed which splits the wavelengths of the scene and directs the wavelengths to a corresponding sensor.
  • the wavelength separation device allows light of a first wavelength to pass through onto the depth sensor and light of a second wavelength is reflected by the wavelength separation device onto the image sensor.
  • the light of the second wavelength may have a wide spectrum, meaning the light may include multiple wavelengths that are reflected onto the image sensor.
  • visible light is made of multiple wavelengths and is not usually monochromatic.
  • the described system and method can also be utilized with monochromatic light.
  • the described system eliminates the parallax effect found in conventional camera and LiDAR system fusions.
  • the light modulation system is an active device that can be used as a built-in calibration device to minimize the amount of calibration and post-processing for image alignment as opposed to conventional systems. It should be noted that while the description refers to a two-dimensional color camera and corresponding sensor and a LiDAR and corresponding sensor, it should be understood that the same system can be applied to any image capture device that captures images of environments or scenes not including depth information and any depth information sensor that captures depth information of the environment or scene.
  • FIG. 1 illustrates a block diagram of an example image capture device including a wavelength separation device, depth sensor, and image sensor.
  • the image capture device 100 is being utilized to capture the scene 101. Capturing the scene 101 includes capturing both a two-dimensional image of the scene and depth information for the scene. Thus, the image capture device 100 includes both an image sensor 11 and a depth sensor 13.
  • the image sensor 11 captures light reflections of a second wavelength from the scene 101.
  • the light reflections of a second wavelength will be referred to as visible light reflections.
  • this is not intended to limit the wavelengths to only the wavelength range of the traditional visible light wavelength range.
  • the wavelength range of the light reflections of the second wavelength may be within the 200 nm - 800 nm range.
  • the depth sensor 13 captures light reflections of a first wavelength from the scene 101.
  • the light reflections of a first wavelength will be referred to as infrared light reflections.
  • this is not intended to limit the wavelengths to only the wavelength range of the traditional infrared light wavelength range.
  • the wavelength range of the light reflections of the first wavelength may be within the 800 nm - 2000 nm range.
  • the two wavelength ranges that are used as examples (i.e., 200 nm - 800 nm for the light reflections of the second wavelength and 800 nm - 2000 nm for the light reflections of the first wavelength) are completely unique, meaning there is no overlap between the wavelength ranges; it should be noted that this is not strictly necessary. Actually, there may be instances where it may be beneficial to allow some overlap between the wavelength ranges. In other words, there are some applications where it may be beneficial or useful to design the system such that some of the light of the first wavelength can reach the image sensor. For example, in low light conditions it may be useful to allow some of the infrared light reflections to reach the image sensor to enhance the two-dimensional image. Thus, the first wavelength and the second wavelength do not have to be completely unique and may have some overlap.
  • infrared light will refer to light in wavelengths that are usable by the LiDAR system, even though the light may have a wavelength that is technically in the visible light range.
  • the depth sensor may also work with ultraviolet light.
  • the image sensor may also be a red-green-blue (RGB) and near infrared (NIR) camera.
  • infrared and visible light will refer to two separate wavelength bands that are separated using a separator, as described in more detail herein, and are used to collect three-dimensional and two-dimensional data, respectively, without regard to exact or actual wavelength ranges for either the infrared or visible light.
  • the infrared and visible light wavelengths may occur in the 200 nm - 2000 nm range, thereby excluding radio and x-rays.
  • the scene 101 is illuminated with both visible light, for example, natural light, flash of a camera, or the like, and infrared light, for example, using an infrared light source 2.
  • the infrared light source 2 may be a short pulsed near infrared (NIR) light source included in a transmitter module 1.
  • the infrared light source 2 uniformly illuminates the scene 101 with infrared laser light; any other light source capable of projecting short pulses, for example, a light-emitting diode light source, may be used instead.
  • the example used throughout will be the infrared laser light.
  • the infrared laser light will illuminate the scene 101 with short infrared laser pulses.
  • the light, both visible and infrared, reflected from the scene 101 is captured or collected by the image capture device 100 at a receiver module 4.
  • the light is collected by the receiver module 4 through a lens or objective 5.
  • the objective 5 may have a fixed focal length and the focus of the objective 5 may be set to its hyperfocal distance.
  • the objective 5 images the scene 101 onto a light modulation system 6.
  • the light modulation system 6 may be a digital micromirror device (DMD); light modulation systems other than a DMD may be utilized, for example, any digital or analog light modulator (e.g., liquid crystal display, liquid crystal on silicon display, etc.)
  • the light modulation system 6 receives the light reflections, both visible light reflections, or light reflections of a second wavelength, and infrared light reflections, or light reflections of a first wavelength, reflected from the scene 101.
  • the image capture device 100 includes a wavelength separation device 9 that is positioned between the light modulation system 6 and the depth sensor 13 and the image sensor 11.
  • the wavelength separation device 9 may be a dichroic mirror, dichroic filter, longpass or shortpass dichroic mirror, or other wavelength separation device that can separate the infrared from the visible light.
  • the wavelength separation device 9 is positioned such that the infrared light reflections, or light reflections of a first wavelength, from the light modulation system 6 pass through the wavelength separation device 9 onto the depth sensor 13.
  • the visible light reflections, or light reflections of a second wavelength, from the light modulation system 6 are reflected from the wavelength separation device 9 onto the image sensor 11.
  • the depth sensor 13 and the image sensor 11 can be coaxially located and receive the light reflections from the scene simultaneously through the same objective 5.
  • simultaneously refers to within the same frame of acquisition of the scene.
  • simultaneously means that both the image sensor 11 and depth sensor 13 are receiving light from the same frame of the scene captured by the image capture device 100.
  • the depth sensor 13 generally has a much higher temporal resolution, for example, picoseconds, to measure the depth.
  • the image sensor 11, on the other hand does not require such a high temporal resolution and only needs to receive the image information during the same frame acquisition as the depth information is received at the depth sensor 13.
  • the image sensor 11 may have a lower temporal resolution as compared to the depth sensor, for example, milliseconds.
  • the infrared light reflections and the visible light reflections are received simultaneously, that is, the infrared light reflections are received during a frame acquisition of the visible light reflections, with the note that multiple infrared light reflections may be received at the depth sensor 13 during the frame acquisition of a set of the visible light reflections.
  • the receiver module 4 may include additional components located between the light modulation system 6 and the wavelength separation device 9, as shown in FIG. 1.
  • FIG. 1 illustrates a total internal reflection (TIR) prism 7 that is located between the light modulation system 6 and the wavelength separation device 9.
  • TIR prism 7 is used to fold the light or beam path.
  • Other devices capable of folding the beam path can be utilized, for example, a suitable mirror system, a series of prisms, and/or the like.
  • Another component that may be included is imaging optics 8, for example, a lens, set of lenses, or mirrors, positioned between either the TIR prism 7, if included, or the light modulation system 6 and the wavelength separation device 9.
  • the imaging optics 8 may be used to relay the image on the light modulation system 6 to both the image sensor 11 and the depth sensor 13.
  • the receiver module 4 may include additional components located between the wavelength separation device 9 and the image sensor 11 or depth sensor 13, as shown in FIG. 1.
  • the receiver module 4 may include a NIR blocking filter 10 positioned between the wavelength separation device 9 and the image sensor 11. This filter blocks any infrared light that may have been reflected from the wavelength separation device 9 instead of being passed through the wavelength separation device 9.
  • a laser line infrared-transmitting filter 12 may be included and positioned between the wavelength separation device 9 and the depth sensor 13. This filter 12 matches the wavelength of the infrared light source 2 and, thus, is used to enhance the signal-to-noise ratio by filtering noise from the infrared light reflections.
  • Other components not illustrated in FIG. 1 may also be included, for example, additional imaging elements (e.g., lenses, mirrors, etc.) between the wavelength separation device 9 and one of the sensors; if these components are included, the imaging optics 8 may or may not be included.
  • the depth sensor 13 captures the infrared light reflections that pass through the wavelength separation device 9. These infrared light reflections are those infrared light reflections reflected by the scene 101 from, for example, the infrared light source 2.
  • the depth sensor 13 may be a low-resolution sensor. However, used in conjunction with the light modulation system and the infrared flood illumination of the scene, a high-resolution depth image can be generated.
  • the light modulation system 6 and the depth sensor 13 can be used in a computational ghost imaging or single-pixel imaging manner to increase the resolution of the captured depth and infrared data by using suitable patterns displayed on the light modulation system 6.
  • the patterns can be linearly independent and form a complete basis, for example, patterns based on the Walsh-Hadamard basis set shown in FIG. 2.
  • other patterns can be utilized, including those that do not require the inverse to be shown.
  • the patterns are applied in complementary pairs containing a pattern together with its inverse so that an even number of patterns are shown within the camera exposure. Since the patterns are changed at a much higher frequency than the images are acquired by the camera, the light modulation system 6 effectively acts as a mirror for the camera system.
  • the number of patterns can be extended to obtain depth images of certain areas with higher lateral resolution.
  • the depth sensor 13 may be a low-resolution photodetector array that is capable of measuring the time-of-flight (ToF) of the emitted laser pulses and is used to detect the light reflected from the actively illuminated NIR light.
  • the ToF information is used to determine how far a portion of the scene corresponding to the reflection is from the image capture device 100 or a component of the image capture device 100.
  • the ToF information can be combined together to generate a depth map corresponding to the scene 101, thereby providing information regarding a three-dimensional position of each object or portion of an object within the scene.
  • Example ToF detectors include, but are not limited to, direct ToF detectors (e.g., avalanche photodiode or photodiode array, single-photon avalanche diode or diode array, silicon photomultiplier, etc.) or indirect time-of-flight detectors (e.g., amplitude-modulated time-of-flight array detector, frequency-modulated continuous-wave array detector, etc.).
  • in the event that a shortpass dichroic mirror is used instead of a longpass dichroic mirror, the position of the image sensor 11 and any corresponding components (for example, the blocking filter 10) and the position of the depth sensor 13 and any corresponding components (for example, the infrared-transmitting filter 12) may be swapped.
  • Any optical elements in the system for example, the objective 5, TIR prism 7, relay optics 8, and/or the like, may include suitable anti-reflective coatings that cover both visible and NIR wavelengths.
  • the image sensor 11 captures the visible light reflections reflected from the scene 101 from, for example, natural light, a flash, other lights, and/or the like.
  • the visible light reflections are utilized by the image sensor 11 to generate a two-dimensional image of the scene.
  • the image sensor 11 is a color image sensor and, therefore, allows for the generation of a color two-dimensional image of the scene.
  • objects within the scene can be identified.
  • the system can identify people, animals, landmarks, vehicles, traffic signals, facility objects, and any other objects that are within the scene 101 and environment surrounding the image capture device 100.
  • the depth map generated from the information captured by the depth sensor 13 and the two-dimensional image generated from the information captured by the image sensor 11 can be combined by the system into a three-dimensional image of the scene 101. Since the depth sensor 13 and the image sensor 11 received the information through the same objective 5, post-processing, if any, is minimal to align the depth map and two-dimensional image. However, to ensure alignment the system may be calibrated before first use. Since the light modulation system is an active component and shared by both the visible light and the infrared light beam paths, it can be used as a common point to directly calibrate and align the sensors 11 and 13 on a pixel-by-pixel level.
  • a single pixel of the light modulation system 6 is switched to a reflective or transmissive state at a time. It should be noted that it is also possible to define a faster calibration technique where multiple pixels are reflective at the same time, while still allowing the same calibration result.
  • the physical pixels from both the image sensor 11 and the depth sensor 13 can be directly mapped to that light modulation system 6 pixel. In other words, when a single pixel of the light modulation system 6 is switched to cause a reflection or transmission of both visible and infrared light through the system to the respective image sensor 11 and depth sensor 13, a single pixel of both of the sensors will be illuminated.
  • border pixels may also be illuminated where light from the digital micromirror device will illuminate multiple pixels on the depth sensor.
  • the calibration will also identify these instances.
  • the illuminated pixels of the sensors can then be correlated to the single pixel of the light modulation system 6. Performing this calibration for every pixel of the light modulation system 6 allows for a 1:1 mapping of pixels between the image sensor 11 and the depth sensor 13, thus allowing for alignment of the depth map and two-dimensional image (a sketch of this per-pixel mapping is given after this list). Since the calibration allows for a 1:1 mapping of pixels, any post-processing and/or analysis of the depth map and/or two-dimensional image for combining into the three-dimensional image will be greatly reduced as compared to conventional systems.
  • R(TOF sensor) is a native resolution of a TOF sensor, meaning the number of independent areas from which the time of flight signal (e.g., photon arrival times) can be recorded simultaneously. This resolution can be rather low and may not be practical enough for use in depth imaging.
  • R(depth) is a resolution of the reconstructed depth image which uses multiple exposures of different patterns on the light modulation system 6 via ghost imaging or single-pixel imaging techniques.
  • R(depth) is generally larger than R(TOF sensor), so it can be useful in depth imaging, but on the upper side it is limited by R(DMD), which is the resolution (number of mirrors) of the light modulation system 6.
  • R(DMD) is a resolution (number of mirrors) of the light modulation system 6.
  • the resolution of a 2D color sensor, R(2D), can be anything suitable for the application. However, since high-resolution cameras are readily available, it is beneficial to have R(2D) higher than or equal to R(DMD).
  • the calibration algorithm finds the correspondence of each of the pixels in R(depth) to the (set of) pixels of R(2D).
  • the calibration occurs when the system is first built. Calibration does not have to occur as frequently as with conventional systems. Rather, calibration can occur a single time and may only need to occur again if the system is jarred or experiences heavy vibration.
  • the system may determine if a calibration needs to occur based upon a metric, such as object matching between the two images, the two-dimensional image and the depth map. For example, the system could determine that there is a mismatch between the two-dimensional image and the depth map, thereby indicating a calibration is needed. Additionally, if the system is having problems identifying or detecting objects, the system may determine that a calibration is needed.
  • FIG. 3 illustrates a method for capturing an image of a scene utilizing a depth sensor and image sensor where infrared light travels through a wavelength separation device to the depth sensor and visible light is reflected from the wavelength separation device to the image sensor.
  • the image capture device at the light modulation system, receives light reflected from a scene. The reflected light includes both infrared light reflections and visible light reflections.
  • the image capture device using the light modulation system, reflects or transmits light from the scene onto a depth sensor and image sensor via a wavelength separation device. In other words, the light from the scene is reflected from or transmitted through the light modulation system to the wavelength separation device.
  • the infrared light reflections pass through the wavelength separation device onto the depth sensor.
  • the visible light reflections are reflected from the wavelength separation device to the image sensor.
  • the system may, if necessary, determine if objects of the scene and the depth of the objects of the scene can be identified at 303.
  • the object identification may be performed by a host or other component of the system and is likely not performed by the imaging system itself.
  • Object identification includes identifying depth information for objects of the scene from the infrared light reflections captured at the depth sensor. Specifically, the depth sensor uses the infrared light reflections to generate a depth map for the scene. Additionally, objects of the scene are identified from the visible light reflections captured at the image sensor. Specifically, the image sensor uses the visible light reflections to generate a two-dimensional image for the scene. If the sensors are able to generate the corresponding images, the depth map and the two-dimensional image can be used to identify the objects in the scene and the depth of the objects in the scene.
  • the system ignores the input or takes no action at 305. If, on the other hand, objects and depths of objects can be identified at 303, the system generates a three-dimensional image for the scene at 304. In other words, the two images, the depth map and the two-dimensional image, can be combined into a three-dimensional image of the scene.
  • the described system can be used in different practical applications, for example, object detection, object tracking, semantic segmentation, object recognition, and/or the like.
  • An example use case is self-driving vehicles, service robotics, and automatic control systems.
  • Sensor fusion is an important problem for the autonomous operation of vehicles and robots, most importantly for object detection. It is important for distinguishing objects of the same color but at different distances, and objects at the same distance but of different colors, that would otherwise be difficult to distinguish using only one camera image, particularly a two-dimensional image.
  • adding depth information can be crucial for detecting objects that may be at least partially transparent in the visible wavelength range.
  • a color camera can be combined with the LiDAR sensor with precise alignment, and at low computational cost.
  • Example embodiments are described herein with reference to the figures, which illustrate example methods, devices and program products according to various example embodiments. It will be understood that the actions and functionality may be implemented at least in part by program instructions. These program instructions may be provided to a processor of a device, a special purpose information handling device, or other programmable data processing device to produce a machine, such that the instructions, which execute via a processor of the device implement the functions/acts specified.
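Referring back to the calibration bullets above: the following is a hedged sketch of the per-pixel correspondence step, in which one light-modulator pixel is switched at a time and the responding pixel on each sensor is recorded. The resolutions, the brightest-pixel criterion, and the stand-in respond() function are assumptions made for illustration; a real device would record measured sensor frames instead.

```python
import numpy as np

# Illustrative resolutions, honouring R(2D) >= R(DMD) >= R(depth) as discussed above.
DMD_SHAPE   = (8, 8)      # light modulation system pixels (mirrors)
IMG_SHAPE   = (16, 16)    # image sensor (higher resolution)
DEPTH_SHAPE = (4, 4)      # native ToF sensor (lower resolution)

def respond(sensor_shape, dmd_pixel):
    """Stand-in for a measured sensor frame when one DMD pixel is reflective."""
    frame = np.zeros(sensor_shape)
    r = dmd_pixel[0] * sensor_shape[0] // DMD_SHAPE[0]
    c = dmd_pixel[1] * sensor_shape[1] // DMD_SHAPE[1]
    frame[r, c] = 1.0     # in reality this comes from the optics, not a formula
    return frame

# Switch one modulator pixel at a time; record which pixel lights up on each
# sensor. Doing this for every DMD pixel yields the pixel correspondence used
# to align the depth map and the two-dimensional image.
mapping = {}
for i in range(DMD_SHAPE[0]):
    for j in range(DMD_SHAPE[1]):
        iy, ix = np.unravel_index(np.argmax(respond(IMG_SHAPE, (i, j))), IMG_SHAPE)
        dy, dx = np.unravel_index(np.argmax(respond(DEPTH_SHAPE, (i, j))), DEPTH_SHAPE)
        mapping[(i, j)] = {"image_px": (int(iy), int(ix)),
                           "depth_px": (int(dy), int(dx))}

print(mapping[(3, 5)])    # e.g. {'image_px': (6, 10), 'depth_px': (1, 2)}
```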

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Physics & Mathematics (AREA)
  • Electromagnetism (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Measurement Of Optical Distance (AREA)

Abstract

One aspect provides a system for capturing images utilizing a depth sensor and image sensor. The system includes a light modulation system that receives light from a scene. The system also includes a wavelength separation device positioned between the light modulation system and the sensors. The infrared light reflections reflected from the light modulation system pass through the wavelength separation device onto the depth sensor. The visible light reflections reflected from the light modulation system are reflected from the wavelength separation device onto the image sensor. The system also includes the image sensor that captures the visible light reflections and the depth sensor that captures the infrared light reflections.

Description

IMAGE CAPTURE DEVICE WITH WAVELENGTH SEPARATION DEVICE
BACKGROUND
[0001] Image capture devices are utilized for many different purposes. In addition to the common purpose of capturing images by users for preserving memories, image capture devices are also commonly used to capture images of scenes or environments. The images are processed to identify and learn information about the scene or environment. This information can be used as input to other systems for decision making by the system. For example, image capture devices may be utilized to capture images of a scene to identify objects contained within the environment. This information may be useful for systems to make decisions based upon the objects contained within the environment, for example, in self-driving car applications, service robotics applications, and/or other applications.
[0002] In these applications, not only is the information regarding objects of the scene useful, but the positioning of the objects within the environment is also very important. For example, in order to correctly make decisions regarding the environment, a self-driving car application needs to know the exact position of the object within the environment. Thus, these applications need image capture devices that can capture not only two-dimensional images of the environment to identify objects within the environment, but also depth information to identify a three-dimensional position of the object within the environment.
BRIEF SUMMARY
[0003] In summary, one aspect provides a system for capturing images utilizing a depth sensor and image sensor according to claim 1.
[0004] In an embodiment, the system includes: a light modulation system that receives light from a scene, wherein the light includes infrared light reflections and visible light reflections; a wavelength separation device positioned between the light modulation system and the depth sensor and image sensor, wherein the infrared light reflections reflected from the light modulation system pass through the wavelength separation device onto the depth sensor and wherein the visible light reflections reflected from the light modulation system are reflected from the wavelength separation device onto the image sensor; the image sensor that captures the visible light reflections, wherein the system identifies objects of the scene from the visible light reflections; and the depth sensor that captures the infrared light reflections, wherein the system identifies depth information for the objects of the scene from the infrared light reflections.
[0005] Within this disclosure, "depth information" is, for example, to be understood as describing the distance to the system, in particular to the depth sensor. [0006] Another aspect provides a method for capturing images utilizing a depth sensor and image sensor according to claim 14.
[0007] In an embodiment, the method includes: receiving, at a light modulation system, light from a scene, wherein the light includes infrared light reflections and visible light reflections; reflecting, using the light modulation system, the light from the scene onto a wavelength separation device positioned between the light modulation system and the depth sensor and the image sensor, wherein the infrared light reflections reflected from the light modulation system pass through the wavelength separation device onto the depth sensor and wherein the visible light reflections reflected from the light modulation system are reflected from the wavelength separation device onto the image sensor; identifying depth information for objects of the scene from the infrared light reflections captured at the depth sensor; and identifying objects of the scene from the visible light reflections captured at the image sensor.
[0008] Further advantageous embodiments are set forth in the dependent claims.
[0009] The foregoing is a summary and thus may contain simplifications, generalizations, and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. [0010] For a better understanding of the embodiments, together with other and further features and advantages thereof, reference is made to the following description, taken in conjunction with the accompanying drawings. The scope of the invention will be pointed out in the appended claims.
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
[0011] FIG. 1 illustrates an example block diagram of an image capture device including a wavelength separation device, depth sensor, and image sensor.
[0012] FIG. 2 illustrates example patterns that can be utilized by the system to increase the resolution of captured images.
[0013] FIG. 3 illustrates an example method for capturing an image of a scene utilizing a depth sensor and image sensor where infrared light travels through a wavelength separation device to the depth sensor and visible light is reflected from the wavelength separation device to the image sensor.
DETAILED DESCRIPTION
[0014] It will be readily understood that the components of the embodiments, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations in addition to the described example embodiments. Thus, the following more detailed description of the example embodiments, as represented in the figures, is not intended to limit the scope of the embodiments, as claimed, but is merely representative of example embodiments.
[0015] Reference throughout this specification to “one embodiment” or “an embodiment” (or the like) means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, appearances of the phrases “in one embodiment” or “in an embodiment” or the like in various places throughout this specification are not necessarily all referring to the same embodiment.
[0016] Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of claimed embodiments. One skilled in the relevant art will recognize, however, that the various described embodiments can be practiced without one or more of the specific details, or with other methods, components, materials, et cetera. In other instances, well-known structures, materials, or operations are not shown or described in detail. The following description is intended only by way of example, and simply illustrates certain example embodiments. [0017] Not only do applications that work utilizing both object information and object position information need the object and depth information, but the applications also need the information to be accurate. Inaccuracies in the object and object position information can cause significant repercussions in an application where a system is moving and making decisions with respect to objects within an environment. For example, if a self-driving car application receives inaccurate object position information, the car may run into one or more objects within the environment. Thus, these applications (e.g., self-driving cars, robotics, object recognition for system movement, etc.) have a continuous demand for sensors that enable understanding of the physical world or environment around them and require accurate information regarding the environment.
[0018] The two most common sensors for capturing image information from an environment are LiDAR (Light Detection and Ranging) sensors, which capture depth information, and cameras, which capture two-dimensional color images of the environment. The cameras provide high resolution color images of the environment, but do not include depth information identifying a three-dimensional position of an object within the environment. Thus, utilizing the LiDAR or other depth sensors in conjunction with the cameras or other image sensors allows for the generation of a three-dimensional image of the environment containing an image of the environment and also object position information. The LiDARs provide accurate distance measurements, but only work on certain wavelengths, with the preferred being infrared so as to minimize the interference with other sensors such as the image sensor, human eye, and/or the like.
[0019] Some conventional systems utilize cameras and LiDARs side-by-side. In other words, the camera and LiDAR systems are positioned next to each other and each utilizes its own lens for capturing the scene. However, in order to fully utilize the information from different kinds of sensors, for example, semantic segmentation, object detection, object recognition, object tracking, and/or the like, the output of both of these systems must be aligned with each other. Therefore, the parallax error has to be accounted for when combining the outputs. To achieve this alignment, the systems may be calibrated based upon their position with respect to each other. However, the output still has to be processed to align the outputs. This processing requires computational processing to transform either the LiDAR point cloud to camera space or the camera space to LiDAR point cloud. To perform this processing, the system needs to utilize external markers, calibration images, or intrinsic features found in the captured scenes.
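To make the alignment burden of side-by-side rigs concrete, the sketch below shows the kind of per-point transform (LiDAR frame to camera frame, then projection) that such systems must run after calibrating with markers or scene features. All matrices and values here are hypothetical illustrations, and the coaxial design described in this application is intended to make this step unnecessary.

```python
import numpy as np

# Hypothetical extrinsic calibration mapping LiDAR coordinates into the
# camera frame; rotation, translation, and intrinsics are illustrative only.
R = np.eye(3)                       # assume the two units point the same way
t = np.array([0.10, 0.0, 0.0])      # 10 cm baseline between LiDAR and camera
K = np.array([[1000.0,    0.0, 640.0],   # hypothetical pinhole intrinsics
              [   0.0, 1000.0, 360.0],
              [   0.0,    0.0,   1.0]])

def lidar_point_to_pixel(p_lidar):
    """Project one LiDAR point (metres, LiDAR frame) to camera pixel (u, v)."""
    p_cam = R @ p_lidar + t         # extrinsic transform into the camera frame
    uvw = K @ p_cam                 # perspective projection
    return uvw[:2] / uvw[2]         # divide by depth

print(lidar_point_to_pixel(np.array([0.5, 0.2, 10.0])))   # -> [700. 380.]
```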
[0020] Accordingly, the described system and method provides an image capture device including a wavelength separation device, depth sensor, and image sensor. The system includes a light modulation system that receives light from a scene. The reflected light includes both infrared light reflections and visible light reflections. Thus, the system may include an infrared light source to illuminate the scene with infrared light that can then be captured by the light modulation system. The system includes a wavelength separation device positioned between the light modulation system and the depth sensor and image sensor of the system.
[0021] The light received from the scene may also include light that has originated in the scene, i.e. both infrared light originated in the scene and visible light originated in the scene. For simplification, in the following description solely the term "light reflections" is used, which is to be understood to also include light that has originated in the scene.
[0022] The infrared light reflections reflected from or having passed through the light modulation system pass through the wavelength separation device onto the depth sensor. The visible light reflections reflected from or having passed through the light modulation system are reflected from the wavelength separation device onto the image sensor. The image sensor captures the visible light reflections which are used to generate the camera image (e.g., the two-dimensional color image). The camera image is used to identify objects of the scene. The depth sensor captures the infrared light reflections which are used to generate the depth map. The depth map is used to identify depth information for the objects of the scene. The depth information can also be used to distinguish objects having similar colors. [0023] Thus, the described system provides an improvement over conventional systems for capturing image and depth information for a scene. The described system and method allow for the capture of both image information and depth information of a scene practically simultaneously. The described system allows for the utilization of LiDAR sensors that capture infrared so as to minimize the interference with other sensors. Additionally, instead of conventional systems which utilize cameras and LiDARs side- by-side, the described system utilizes a single lens to capture the scene. Thus, the described system does not require the complicated alignment and calibration of the conventional systems that have separate lenses for the LiDAR and camera. Therefore, the described system minimizes the extensive post-processing required by conventional systems. Accordingly, the described system and method provides a technique that allows for capturing images of scenes including both high-resolution color images and depth information for the scene with less alignment requirements, computational requirements, and parallax errors as compared to the conventional systems.
[0024] Generally, the described system combines a LiDAR and two-dimensional color camera into a single unit that utilizes a single lens, also referred to as an objective. In other words, the LiDAR and two-dimensional color camera share the same optical path. The scene is obtained through the lens and then is directed, via a light modulation system, to each of the image sensors which corresponds to the two-dimensional color camera and the depth sensor which corresponds to the LiDAR. Before the image sensor and the depth sensor, a wavelength separation device is placed which splits the wavelengths of the scene and directs the wavelengths to a corresponding sensor. Specifically, the wavelength separation device allows light of a first wavelength to pass through onto the depth sensor and light of a second wavelength is reflected by the wavelength separation device onto the image sensor. It should be noted that the light of the second wavelength may have a wide spectrum, meaning the light may include multiple wavelengths that are reflected onto the image sensor. For example, it is widely understood that visible light is made of multiple wavelengths and is not usually monochromatic. However, the described system and method can also be utilized with monochromatic light.
[0025] Since the two sensors are placed coaxially and share the same objective, the described system eliminates the parallax effect found in conventional camera and LiDAR system fusions. Additionally, the light modulation system is an active device that can be used as a built-in calibration device to minimize the amount of calibration and post-processing for image alignment as opposed to conventional systems. It should be noted that while the description refers to a two-dimensional color camera and corresponding sensor and a LiDAR and corresponding sensor, it should be understood that the same system can be applied to any image capture device that captures images of environments or scenes not including depth information and any depth information sensor that captures depth information of the environment or scene.
[0026] FIG. 1 illustrates a block diagram of an example image capture device including a wavelength separation device, depth sensor, and image sensor. The image capture device 100 is being utilized to capture the scene 101. Capturing the scene 101 includes capturing both a two-dimensional image of the scene and depth information for the scene. Thus, the image capture device 100 includes both an image sensor 11 and a depth sensor 13. The image sensor 11 captures light reflections of a second wavelength from the scene 101. For ease of readability, the light reflections of a second wavelength will be referred to as visible light reflections. However, as further noted herein, this is not intended to limit the wavelengths to only the wavelength range of the traditional visible light wavelength range. For example, the wavelength range of the light reflections of the second wavelength may be within the 200 nm - 800 nm range.
[0027] The depth sensor 13 captures light reflections of a first wavelength from the scene 101. For ease of readability, the light reflections of a first wavelength will be referred to as infrared light reflections. However, as further noted herein, this is not intended to limit the wavelengths to only the wavelength range of the traditional infrared light wavelength range. For example, the wavelength range of the light reflections of the first wavelength may be within the 800 nm - 2000 nm range. [0028] While the two wavelength ranges that are used as examples (i.e., 200 nm -
800 nm for the light reflections of the second wavelength and 800 nm - 2000 nm for the light reflections of the first wavelength) are completely unique, meaning there is no overlap between the wavelength ranges, it should be noted that this is not strictly necessary. Actually, there may be instances where it may be beneficial to allow some overlap between the wavelength ranges. In other words, there are some applications where it may be beneficial or useful to design the system such that some of the light of the first wavelength can reach the image sensor. For example, in low light conditions it may be useful to allow some of the infrared light reflections to reach the image sensor to enhance the two-dimensional image. Thus, the first wavelength and the second wavelength do not have to be completely unique and may have some overlap.
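As a rough illustration of the band split discussed in [0026] through [0028], the sketch below routes a wavelength to the sensor(s) it would reach, assuming a longpass cutoff near 800 nm and an optional deliberate overlap. The cutoff and overlap widths are assumptions made for illustration, not values specified by this disclosure.

```python
CUTOFF_NM = 800.0    # assumed longpass edge between the two example bands

def route(wavelength_nm, overlap_nm=0.0):
    """Return the sensor(s) reached by light of the given wavelength.

    overlap_nm models a deliberately soft band edge: light up to overlap_nm
    above the cutoff also partially reaches the image sensor, e.g. to
    brighten the two-dimensional image in low-light conditions.
    """
    sensors = []
    if wavelength_nm >= CUTOFF_NM:                  # first wavelength band
        sensors.append("depth sensor (transmitted)")
        if wavelength_nm < CUTOFF_NM + overlap_nm:
            sensors.append("image sensor (partially reflected)")
    else:                                           # second wavelength band
        sensors.append("image sensor (reflected)")
    return sensors

print(route(550.0))                   # visible -> image sensor
print(route(905.0))                   # NIR     -> depth sensor
print(route(820.0, overlap_nm=50.0))  # overlap -> both sensors
```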
[0029] The scene is illuminated using both visible light and infrared light. It should be noted that in this case, infrared light will refer to light in wavelengths that are usable by the LiDAR system, even though the light may have a wavelength that is technically in the visible light range. Additionally, the depth sensor may also work with ultraviolet light. The image sensor may also be a red-green-blue (RGB) and near infrared (NIR) camera. Thus, in this disclosure infrared and visible light will refer to two separate wavelength bands that are separated using a separator, as described in more detail herein, and are used to collect three-dimensional and two-dimensional data, respectively, without regard to exact or actual wavelength ranges for either the infrared or visible light.
However, the infrared and visible light wavelengths may occur in the 200 nm - 2000 nm range, thereby excluding radio and x-rays.
[0030] The scene 101 is illuminated with both visible light, for example, natural light, flash of a camera, or the like, and infrared light, for example, using an infrared light source 2. The infrared light source 2 may be a short pulsed near infrared (NIR) light source included in a transmitter module 1. In combination with a projection lens 3, the infrared light source 2 uniformly illuminates the scene 101 with infrared laser light; any other light source capable of projecting short pulses, for example, a light-emitting diode light source, may be used instead. The example used throughout will be the infrared laser light. However, this is not intended to limit the scope of this disclosure to only an infrared laser light as any type of light source capable of emitting short pulses can be utilized. When using the NIR light source, the infrared laser light will illuminate the scene 101 with short infrared laser pulses. The light, both visible and infrared, reflected from the scene 101 is captured or collected by the image capture device 100 at a receiver module 4.
[0031] The light, both visible, or light of a second wavelength, and infrared, or light of a first wavelength, is collected by the receiver module 4 through a lens or objective 5. The objective 5 may have a fixed focal length and the focus of the objective 5 may be set to its hyperfocal distance. The objective 5 images the scene 101 onto a light modulation system 6. The light modulation system 6 may be a digital micromirror device
(DMD) that contains several hundred thousand small mirrors arranged in a rectangular array. Each of the mirrors can be individually switched on and off by changing the angle of the mirror. Light modulation systems other than a DMD may be utilized. For example, any digital or analog light modulator (e.g., liquid crystal display, liquid crystal on silicon display, etc.) may be utilized in the described system. Thus, the light modulation system 6 receives the light reflections, both visible light reflections, or light reflections of a second wavelength, and infrared light reflections, or light reflections of a first wavelength, reflected from the scene 101.
[0032] The image capture device 100 includes a wavelength separation device 9 that is positioned between the light modulation system 6 and the depth sensor 13 and the image sensor 11. The wavelength separation device 9 may be a dichroic mirror, dichroic filter, longpass or shortpass dichroic mirror, or other wavelength separation device that can separate the infrared from the visible light. The wavelength separation device 9 is positioned such that the infrared light reflections, or light reflections of a first wavelength, from the light modulation system 6 pass through the wavelength separation device 9 onto the depth sensor 13. The visible light reflections, or light reflections of a second wavelength, from the light modulation system 6 are reflected from the wavelength separation device 9 onto the image sensor 11. Thus, the depth sensor 13 and the image sensor 11 can be coaxially located and receive the light reflections from the scene simultaneously through the same objective 5.
[0033] It should be noted that the term simultaneously refers to within the same frame of acquisition of the scene. In other words, simultaneously means that both the image sensor 11 and depth sensor 13 are receiving light from the same frame of the scene captured by the image capture device 100. The depth sensor 13 generally has a much higher temporal resolution, for example, picoseconds, to measure the depth. The image sensor 11, on the other hand, does not require such a high temporal resolution and only needs to receive the image information during the same frame acquisition as the depth information is received at the depth sensor 13. Thus, the image sensor 11 may have a lower temporal resolution as compared to the depth sensor, for example, milliseconds. Thus, using the millisecond time scale of the image sensor 11 and the fact that the depth sensor 13 receives information more frequently, the infrared light reflections and the visible light reflections are received simultaneously, that is, the infrared light reflections are received during a frame acquisition of the visible light reflections, with the note that multiple infrared light reflections may be received at the depth sensor 13 during the frame acquisition of a set of the visible light reflections.
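A back-of-the-envelope reading of "simultaneously" as used in [0033], assuming an illustrative 10 ms image-sensor exposure and a 10 kHz pulse repetition rate for the NIR source; both numbers are assumptions for this sketch, not values from this disclosure.

```python
# Illustrative timing only: exposure time and pulse rate are assumed values.
CAMERA_EXPOSURE_S = 0.010     # millisecond-scale frame acquisition (image sensor)
PULSE_RATE_HZ = 10_000        # short NIR pulses fired during that frame

pulses_per_frame = int(CAMERA_EXPOSURE_S * PULSE_RATE_HZ)
print(f"{pulses_per_frame} ToF returns can be timed (at picosecond resolution)")
print("within one visible-light frame, so both sensors see the same scene frame.")
```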
[0034] The receiver module 4 may include additional components located between the light modulation system 6 and the wavelength separation device 9, as shown in FIG. 1. FIG. 1 illustrates a total internal reflection (TIR) prism 7 that is located between the light modulation system 6 and the wavelength separation device 9. The TIR prism 7 is used to fold the light or beam path. Other devices capable of folding the beam path can be utilized, for example, a suitable mirror system, a series of prisms, and/or the like. Another component that may be included is imaging optics, for example, a lens, set of lenses, or mirrors, 8 positioned between either the TIR prism 7, if included, or the light modulation system 6 and the wavelength separation device 9. The imaging optics 8 may be used to relay the image on the light modulation system 6 to both the image sensor 11 and the depth sensor 13.
[0035] Additionally, the receiver module 4 may include additional components located between the wavelength separation device 9 and the image sensor 11 or depth sensor 13, as shown in FIG. 1. The receiver module 4 may include a NIR blocking filter 10 positioned between the wavelength separation device 9 and the image sensor 11. This filter blocks any infrared light that may have been reflected from the wavelength separation device 9 instead of being passed through the wavelength separation device 9. A laser line infrared-transmitting filter 12 may be included and positioned between the wavelength separation device 9 and the depth sensor 13. This filter 12 matches the wavelength of the infrared light source 2 and, thus, is used to enhance the signal-to-noise ratio by filtering noise from the infrared light reflections. Other components not illustrated in FIG. 1 may also be included, for example, additional imaging elements
(e.g., lenses, mirrors, etc.) between the wavelength separation device 9 and one of the sensors 11 or 13. If these components are included, then the imaging optics 8 may or may not be included.
[0036] The depth sensor 13 captures the infrared light reflections that pass through the wavelength separation device 9. These infrared light reflections are those reflected by the scene 101 from, for example, the infrared light source 2. The depth sensor 13 may be a low-resolution sensor. However, when used in conjunction with the light modulation system and the infrared flood illumination of the scene, it allows a high-resolution depth image to be generated. The light modulation system 6 and the depth sensor 13 can be used in a computational ghost imaging or single-pixel imaging manner to increase the resolution of the captured depth and infrared data by using suitable patterns displayed on the light modulation system 6.
[0037] The patterns can be linearly independent and form a complete basis, for example, patterns based on the Walsh-Hadamard basis set shown in FIG. 2. However, other patterns can be utilized, including those that do not require the inverse to be shown. Using the example patterns shown in FIG. 2, the patterns are applied in complementary pairs containing a pattern together with its inverse, so that an even number of patterns is shown within the camera exposure. Since the patterns are changed at a much higher frequency than the images are acquired by the camera, the light modulation system 6 effectively acts as a mirror for the camera system. Additionally, based on the color camera image, it is possible to estimate the complexity of the scene and, therefore, limit the number of encoded patterns for the LiDAR and achieve a higher framerate than otherwise possible. Alternatively, the number of patterns can be extended to obtain depth images of certain areas with higher lateral resolution.
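The following Python sketch illustrates, under stated assumptions, how complementary Walsh-Hadamard patterns on the light modulation system together with a bucket-style depth detector support single-pixel (ghost-imaging) reconstruction. The 8x8 scene, the Sylvester construction of the Hadamard matrix, and the differential measurement of a pattern and its inverse are illustrative choices, not the exact pattern set, size, or reconstruction used by the described device.

```python
import numpy as np

def hadamard(n):
    """Sylvester-construction Hadamard matrix; n must be a power of two."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h

side = 8
n = side * side
scene = np.random.rand(side, side)        # stand-in for the per-pixel NIR return

H = hadamard(n)                           # each row, reshaped, is one 2-D pattern
measurements = np.empty(n)
for k in range(n):
    pattern = H[k].reshape(side, side)    # +1/-1 Walsh-Hadamard pattern
    on = (pattern > 0).astype(float)      # pattern shown on the light modulation system
    off = 1.0 - on                        # its complementary (inverse) pattern
    m_on = np.sum(on * scene)             # bucket signal for the pattern
    m_off = np.sum(off * scene)           # bucket signal for the inverse
    measurements[k] = m_on - m_off        # differential measurement = pattern . scene

# Hadamard matrices satisfy H @ H.T == n * I, so reconstruction is a transpose.
reconstruction = (H.T @ measurements / n).reshape(side, side)
assert np.allclose(reconstruction, scene)
```

Limiting the loop to fewer patterns, as suggested above for simple scenes, trades reconstruction fidelity for a higher LiDAR framerate.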
[0038] The depth sensor 13 may be a low-resolution photodetector array that is capable of measuring the time-of-flight (ToF) of the emitted laser pulses and is used to detect the light reflected from the actively illuminated NIR light. The ToF information is used to determine how far a portion of the scene corresponding to the reflection is from the image capture device 100 or a component of the image capture device 100. Thus, the ToF information can be combined to generate a depth map corresponding to the scene 101, thereby providing information regarding a three-dimensional position of each object or portion of an object within the scene. Example ToF detectors include, but are not limited to, direct ToF detectors (e.g., avalanche photodiode or photodiode array, single-photon avalanche diode or diode array, silicon photomultiplier, etc.) or indirect time-of-flight detectors (e.g., amplitude-modulated time-of-flight array detector, frequency-modulated continuous-wave array detector, etc.).

[0039] In the event that a shortpass dichroic mirror is used instead of a longpass dichroic mirror, the position of the image sensor 11 and any corresponding components (for example, the blocking filter 10) and the position of the depth sensor 13 and any corresponding components (for example, the infrared-transmitting filter 12) may be swapped. Any optical elements in the system, for example, the objective 5, TIR prism 7, relay optics 8, and/or the like, may include suitable anti-reflective coatings that cover both visible and NIR wavelengths.
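As a point of reference for the direct time-of-flight relation underlying paragraph [0038], the minimal Python sketch below converts photon arrival times into distances using depth = c·t/2; the 4x4 detector size and the example arrival times are assumptions chosen only to make the arithmetic visible.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum, m/s

def tof_to_depth(arrival_times_s, emission_time_s=0.0):
    """Direct-ToF round-trip model: depth = c * (t_arrival - t_emission) / 2."""
    round_trip = np.asarray(arrival_times_s, dtype=float) - emission_time_s
    return C * round_trip / 2.0

# A 4x4 array of arrival times near 66.7 ns corresponds to a depth of roughly 10 m.
arrival = np.full((4, 4), 66.7e-9)
depth_map = tof_to_depth(arrival)
print(round(float(depth_map[0, 0]), 2), "m")  # ~10.0 m
```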
[0040] The image sensor 11 captures the visible light reflections reflected from the scene 101 from, for example, natural light, a flash, other lights, and/or the like. The visible light reflections are utilized by the image sensor 11 to generate a two-dimensional image of the scene. Generally, the image sensor 11 is a color image sensor and, therefore, allows for the generation of a color two-dimensional image of the scene. From the two-dimensional image, objects within the scene can be identified. For example, the system can identify people, animals, landmarks, vehicles, traffic signals, facility objects, and any other objects that are within the scene 101 and environment surrounding the image capture device 100.
[0041] Thus, the depth map generated from the information captured by the depth sensor 13 and the two-dimensional image generated from the information captured by the image sensor 11 can be combined by the system into a three-dimensional image of the scene 101. Since the depth sensor 13 and the image sensor 11 receive the information through the same objective 5, post-processing to align the depth map and the two-dimensional image, if any, is minimal. However, to ensure alignment, the system may be calibrated before first use. Since the light modulation system is an active component and shared by both the visible light and the infrared light beam paths, it can be used as a common point to directly calibrate and align the sensors 11 and 13 on a pixel-by-pixel level.
[0042] To perform the calibration, a single pixel of the light modulation system 6 is switched to a reflective or transmissive state at a time. It should be noted that it is also possible to define a faster calibration technique where multiple pixels are reflective at the same time, while still allowing the same calibration result. The physical pixels from both the image sensor 11 and the depth sensor 13 can be directly mapped to that light modulation system 6 pixel. In other words, when a single pixel of the light modulation system 6 is switched to cause a reflection or transmission of both visible and infrared light through the system to the respective image sensor 11 and depth sensor 13, a single pixel of each of the sensors will be illuminated. In some cases, border pixels may also be illuminated, where light from the digital micromirror device illuminates multiple pixels on the depth sensor. The calibration will also identify these instances. The illuminated pixels of the sensors can then be correlated to the single pixel of the light modulation system 6. Performing this calibration for every pixel of the light modulation system 6 allows for a 1:1 mapping of pixels between the image sensor 11 and the depth sensor 13, thus allowing for alignment of the depth map and the two-dimensional image. Since the calibration allows for a 1:1 mapping of pixels, any post-processing and/or analysis of the depth map and/or two-dimensional image for combining into the three-dimensional image will be greatly reduced as compared to conventional systems.
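A minimal Python sketch of the pixel-by-pixel calibration loop is given below. The names switch_pixel, read_image_sensor, and read_depth_sensor are hypothetical hardware hooks standing in for the actual device interfaces, and the brightness threshold used to decide which sensor pixels are illuminated is an assumed detection criterion; the toy simulation at the end exists only so the sketch runs end to end.

```python
import numpy as np

def calibrate_pixel_mapping(switch_pixel, read_image_sensor, read_depth_sensor,
                            dmd_shape, threshold=0.5):
    """Map each light-modulation-system pixel to the sensor pixels it illuminates,
    including border cases where one mirror spills onto several depth-sensor pixels."""
    mapping = {}
    for row in range(dmd_shape[0]):
        for col in range(dmd_shape[1]):
            switch_pixel(row, col)                  # one modulator pixel on at a time
            img = read_image_sensor()
            dep = read_depth_sensor()
            bright_img = np.argwhere(img > threshold * img.max())
            bright_dep = np.argwhere(dep > threshold * dep.max())
            mapping[(row, col)] = {
                "image": [tuple(int(v) for v in p) for p in bright_img],
                "depth": [tuple(int(v) for v in p) for p in bright_dep],
            }
    return mapping

# Toy simulation: a 2x2 modulator imaged onto 4x4 sensors with a fixed 2x scale.
state = {"pix": (0, 0)}
def switch(r, c):
    state["pix"] = (r, c)
def frame(shape=(4, 4)):
    f = np.zeros(shape)
    r, c = state["pix"]
    f[2 * r, 2 * c] = 1.0
    return f

mapping = calibrate_pixel_mapping(switch, frame, frame, dmd_shape=(2, 2))
print(mapping[(1, 1)]["depth"])  # [(2, 2)]
```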
[0043] It should be noted that the resolution of the depth sensor 13 and the image sensor 11 may be different. Thus, when performing the calibration, the system takes into account the discrepancy between the resolutions. For example, a sample ordering of resolutions (numbers of pixels) may be: R(2D) >= R(DMD) >= R(depth) > R(ToF sensor). R(ToF sensor) is the native resolution of the ToF sensor, meaning the number of independent areas from which the time-of-flight signal (e.g., photon arrival times) can be recorded simultaneously. This resolution can be rather low and may not be practical enough for use in depth imaging. R(depth) is the resolution of the reconstructed depth image, which uses multiple exposures of different patterns on the light modulation system 6 via ghost imaging or single-pixel imaging techniques. R(depth) is generally larger than R(ToF sensor), so it can be useful in depth imaging, but on the upper side it is limited by R(DMD), which is the resolution (number of mirrors) of the light modulation system 6. The resolution of the 2D color sensor, R(2D), can be anything suitable for the application. However, since high-resolution cameras are readily available, it is beneficial to have R(2D) higher than or equal to R(DMD). Thus, the calibration algorithm finds the correspondence of each of the pixels in R(depth) to the (set of) pixels of R(2D).
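The Python sketch below shows one way the correspondence between R(depth) pixels and sets of R(2D) pixels could be represented. It assumes an integer resolution ratio and a regular block layout purely for illustration; in the described system the mapping would come from the pixel-by-pixel calibration rather than from this simplification.

```python
def depth_to_color_blocks(depth_res, color_res):
    """Map each reconstructed depth pixel to the block of 2-D color pixels it covers,
    assuming R(2D) is an integer multiple of R(depth)."""
    dh, dw = depth_res
    ch, cw = color_res
    sy, sx = ch // dh, cw // dw
    return {
        (r, c): [(r * sy + dy, c * sx + dx) for dy in range(sy) for dx in range(sx)]
        for r in range(dh)
        for c in range(dw)
    }

# Each pixel of a 64x64 depth image corresponds to a 4x4 block of a 256x256 color image.
blocks = depth_to_color_blocks((64, 64), (256, 256))
print(len(blocks[(0, 0)]))  # 16
```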
[0044] The calibration occurs when the system is first built. Calibration does not have to occur as frequently as with conventional systems. Rather, calibration can occur a single time and may only need to occur again if the system is jarred or experiences heavy vibration. The system may determine whether a calibration needs to occur based upon a metric derived from object matching between the two images, that is, the two-dimensional image and the depth map. For example, the system could determine that there is a mismatch between the two-dimensional image and the depth map, thereby indicating that a calibration is needed. Additionally, if the system is having problems identifying or detecting objects, the system may determine that a calibration is needed.
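As one hedged illustration of such a mismatch metric, the Python sketch below compares edge maps of the two-dimensional image and the depth map (assumed to be grayscale and already resampled to a common resolution) and flags a possible recalibration when they barely correlate. Both the metric and the threshold are assumptions for illustration, not a method prescribed by this disclosure.

```python
import numpy as np

def needs_recalibration(gray_image, depth_map, threshold=0.2):
    """Return True when edges in the 2-D image and the depth map no longer line up."""
    def edge_magnitude(a):
        gy, gx = np.gradient(a.astype(float))
        return np.hypot(gx, gy)

    e_img = edge_magnitude(gray_image)
    e_dep = edge_magnitude(depth_map)
    e_img = (e_img - e_img.mean()) / (e_img.std() + 1e-9)   # normalize both edge maps
    e_dep = (e_dep - e_dep.mean()) / (e_dep.std() + 1e-9)
    correlation = float(np.mean(e_img * e_dep))
    return correlation < threshold

rng = np.random.default_rng(0)
scene = rng.random((32, 32))
print(needs_recalibration(scene, scene))                      # False: edges line up
print(needs_recalibration(scene, np.roll(scene, 8, axis=1)))  # likely True: misaligned
```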
[0045] FIG. 3 illustrates a method for capturing an image of a scene utilizing a depth sensor and image sensor where infrared light travels through a wavelength separation device to the depth sensor and visible light is reflected from the wavelength separation device to the image sensor. At 301 the image capture device, at the light modulation system, receives light reflected from a scene. The reflected light includes both infrared light reflections and visible light reflections. At 302 the image capture device, using the light modulation system, reflects or transmits light from the scene onto a depth sensor and image sensor via a wavelength separation device. In other words, the light from the scene is reflected from or transmitted through the light modulation system to the wavelength separation device. The infrared light reflections pass through the wavelength separation device onto the depth sensor. The visible light reflections are reflected from the wavelength separation device to the image sensor.
[0046] The system may, if necessary, determine if objects of the scene and the depth of the objects of the scene can be identified at 303. The object identification may be performed by a host or other component of the system and is likely not performed by the imaging system itself. Object identification includes identifying depth information for objects of the scene from the infrared light reflections captured at the depth sensor. Specifically, the depth sensor uses the infrared light reflections to generate a depth map for the scene. Additionally, objects of the scene are identified from the visible light reflections captured at the image sensor. Specifically, the image sensor uses the visible light reflections to generate a two-dimensional image for the scene. If the sensors are able to generate the corresponding images, the depth map and the two-dimensional image can be used to identify the objects in the scene and the depth of the objects in the scene.
[0047] If objects of the scene and depths of the objects cannot be identified at 303, the system ignores the input or takes no action at 305. If, on the other hand, objects and depths of objects can be identified at 303, the system generates a three-dimensional image for the scene at 304. In other words, the two images, the depth map and the two-dimensional image, can be combined into a three-dimensional image of the scene.
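The Python sketch below illustrates one conventional representation of that combination, fusing the pixel-aligned color image and depth map into a colored point cloud with a pinhole camera model; the intrinsic parameters fx, fy, cx and cy are hypothetical values, not parameters of the objective 5 or of the described system.

```python
import numpy as np

def to_point_cloud(color_image, depth_map, fx, fy, cx, cy):
    """Back-project a pixel-aligned depth map and color image into (x, y, z) points
    with per-point colors, using an assumed pinhole model."""
    h, w = depth_map.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_map
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    colors = color_image.reshape(-1, color_image.shape[-1])
    return points, colors

# Example with a flat scene 2 m away and a uniform gray image.
depth = np.full((480, 640), 2.0)
rgb = np.full((480, 640, 3), 128, dtype=np.uint8)
pts, cols = to_point_cloud(rgb, depth, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
print(pts.shape, cols.shape)  # (307200, 3) (307200, 3)
```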
[0048] The described system can be used in different practical applications, for example, object detection, object tracking, semantic segmentation, object recognition, and/or the like. Example use cases include self-driving vehicles, service robotics, and automatic control systems. Sensor fusion is an important problem for autonomous operation of vehicles and robots, most notably for object detection. It is important for distinguishing objects of the same color at different distances, and objects at the same distance but of different colors, which would otherwise be difficult to distinguish using only one camera image, particularly a two-dimensional image. In addition, adding depth information can be crucial for detecting objects that may be at least partially transparent in the visible wavelength range. Thus, using the described system, a color camera can be combined with the LiDAR sensor with precise alignment and at low computational cost.
[0049] Example embodiments are described herein with reference to the figures, which illustrate example methods, devices and program products according to various example embodiments. It will be understood that the actions and functionality may be implemented at least in part by program instructions. These program instructions may be provided to a processor of a device, a special purpose information handling device, or other programmable data processing device to produce a machine, such that the instructions, which execute via a processor of the device, implement the functions/acts specified.
[0050] It is worth noting that while specific blocks are used in the figures, and a particular ordering of blocks has been illustrated, these are non-limiting examples. In certain contexts, two or more blocks may be combined, a block may be split into two or more blocks, or certain blocks may be re-ordered or re-organized as appropriate, as the explicitly illustrated examples are used only for descriptive purposes and are not to be construed as limiting.
[0051] As used herein, the singular “a” and “an” may be construed as including the plural “one or more” unless clearly indicated otherwise.
[0052] This disclosure has been presented for purposes of illustration and description but is not intended to be exhaustive or limiting. Many modifications and variations will be apparent to those of ordinary skill in the art. The example embodiments were chosen and described in order to explain principles and practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.
[0053] Thus, although illustrative example embodiments have been described herein with reference to the accompanying figures, it is to be understood that this description is not limiting and that various other changes and modifications may be effected therein by one skilled in the art without departing from the scope or spirit of the disclosure.

Claims

What is claimed is:
1. A system for capturing images utilizing a depth sensor and image sensor, the system comprising:
a light modulation system that receives light from a scene, wherein the light comprises light reflections of a first wavelength and light reflections of a second wavelength, and/or light of a first wavelength originated in the scene and/or light of a second wavelength originated in the scene;
a wavelength separation device positioned between the light modulation system and the depth sensor and image sensor,
wherein the light reflections of a first wavelength and/or the light of a first wavelength originated in the scene reflected from or having passed through the light modulation system pass through the wavelength separation device onto the depth sensor and wherein the light reflections of a second wavelength and/or the light of a second wavelength originated in the scene reflected from or having passed through the light modulation system are reflected from the wavelength separation device onto the image sensor; or
wherein the light reflections of a first wavelength and/or the light of a first wavelength originated in the scene reflected from or having passed through the light modulation system are reflected from the wavelength separation device onto the depth sensor and wherein the light reflections of a second wavelength and/or the light of a second wavelength originated in the scene reflected from or having passed through the light modulation system pass through the wavelength separation device onto the image sensor;
the image sensor that captures the light reflections of a second wavelength and/or the light of a second wavelength originated in the scene; and
the depth sensor that captures the light reflections of a first wavelength and/or the light of a first wavelength originated in the scene.
2. The system of claim 1, wherein the light reflections of a second wavelength and/or the light of a second wavelength originated in the scene are utilized to identify objects of the scene, in particular wherein the system is configured to utilize the light reflections of a second wavelength and/or the light of a second wavelength originated in the scene to identify objects of the scene.
3. The system of claim 1 or 2, wherein the light reflections of a first wavelength and/or the light of a first wavelength originated in the scene are utilized to identify depth information for the objects of the scene, in particular wherein the system is configured to utilize the light reflections of a first wavelength and/or the light of a first wavelength originated in the scene to identify depth information for the objects of the scene.
4. The system of any of the preceding claims, wherein the wavelength separation device is positioned with respect to a beam path between the light modulation system on the one hand and the depth sensor and the image sensor on the other hand.
5. The system of any of the preceding claims, further comprising an infrared light source that projects infrared light towards the scene, and wherein the light reflections of a first wavelength captured by the depth sensor comprise reflections off the scene resulting from the infrared light source.
6. The system of any of the preceding claims, wherein the light modulation system comprises an array of mirrors.
7. The system of any of the preceding claims, wherein the wavelength separation device comprises a dichroic mirror.
8. The system of any of the preceding claims, wherein the light reflections of a second wavelength and/or the light of a second wavelength originated in the scene are captured at the image sensor during the same frame acquisition as the light reflections of a first wavelength and/or the light of a first wavelength originated in the scene are captured at the depth sensor.
9. The system of any of the preceding claims, further comprising a total internal reflection prism positioned as a part of the light modulation system and the wavelength separation device, wherein the total internal reflection prism folds the light reflected from the scene and/or the light originated in the scene.
10. The system of any of the preceding claims, further comprising an infrared-blocking filter positioned between the wavelength separation device and the image sensor to block reflected infrared light at the image sensor.
11. The system of any of the preceding claims, further comprising an infrared-transmitting filter positioned between the wavelength separation device and the depth sensor to reduce noise of the infrared reflections.
12. The system of any of the preceding claims, further comprising at least one imaging optics positioned between the light modulation system and the wavelength separation device to relay the scene from the light modulation system to the depth sensor and the image sensor.
13. The system of any of the preceding claims, wherein the first wavelength comprises a wavelength range of 800 nm to 2000 nm and the second wavelength comprises a wavelength range of 400 nm to 800 nm.
14. A method for capturing images utilizing a depth sensor and image sensor, the method comprising:
receiving, at a light modulation system, light from a scene, wherein the light comprises light reflections of a first wavelength and light reflections of a second wavelength, and/or light of a first wavelength originated in the scene and/or light of a second wavelength originated in the scene;
reflecting or transmitting, using the light modulation system, the light from the scene onto a wavelength separation device positioned between the light modulation system and the depth sensor and the image sensor,
wherein the light reflections of a first wavelength and/or the light of a first wavelength originated in the scene reflected from or having passed through the light modulation system pass through the wavelength separation device onto the depth sensor and wherein the light reflections of a second wavelength and/or the light of a second wavelength originated in the scene reflected from or having passed through the light modulation system are reflected from the wavelength separation device onto the image sensor; or
wherein the light reflections of a first wavelength and/or the light of a first wavelength originated in the scene reflected from or having passed through the light modulation system are reflected from the wavelength separation device onto the depth sensor and wherein the light reflections of a second wavelength and/or the light of a second wavelength originated in the scene reflected from or having passed through the light modulation system pass through the wavelength separation device onto the image sensor;
generating a three-dimensional image of the scene from the light reflections of a first wavelength and/or the light of a first wavelength originated in the scene captured at the depth sensor and the light reflections of a second wavelength and/or the light of a second wavelength originated in the scene captured at the image sensor, wherein the light reflections of a first wavelength and/or the light of a first wavelength originated in the scene provide depth information for objects of the scene and wherein the light reflections of a second wavelength and/or the light of a second wavelength originated in the scene provide information to identify the objects of the scene.
15. The method of claim 14, wherein the wavelength separation device is positioned with respect to a beam path between the light modulation system on the one hand and the depth sensor and the image sensor on the other hand.
16. The method of claim 14 or 15, further comprising projecting, using an infrared light source, light towards the scene, and wherein the light reflections of a first wavelength captured by the depth sensor comprise reflections off the scene resulting from the infrared light source.
17. The method of any of claims 14 to 16, wherein the light modulation system comprises an array of mirrors.
18. The method of any of claims 14 to 17, wherein the wavelength separation device comprises a dichroic mirror.
19. The method of any of claims 14 to 18, wherein the light reflections of a second wavelength and/or the light of a second wavelength originated in the scene are captured at the image sensor during the same frame acquisition as the light reflections of a first wavelength and/or the light of a first wavelength originated in the scene are captured at the depth sensor.
20. The method of any of claims 14 to 19, further comprising folding the light reflected from and/or originated in the scene.
21. The method of any of claims 14 to 20, further comprising blocking reflected infrared light.
22. The method of any of claims 14 to 21, further comprising reducing noise of the infrared reflections.
23. The method of any of claims 14 to 22, further comprising relaying the scene from the light modulation system to the depth sensor and the image sensor.
24. The method of any of claims 14 to 23, wherein the first wavelength comprises a wavelength range of 800 nm to 2000 nm and the second wavelength comprises a wavelength range of 400 nm to 800 nm.


