WO2021253308A1 - Image acquisition apparatus - Google Patents

Image acquisition apparatus


Publication number
WO2021253308A1
Authority
WO
WIPO (PCT)
Prior art keywords
pixel
image
pixels
dimensional data
tof
Application number
PCT/CN2020/096727
Other languages
French (fr)
Chinese (zh)
Inventor
吕萌 (Lü Meng)
Original Assignee
Shenzhen Goodix Technology Co., Ltd. (深圳市汇顶科技股份有限公司)
Application filed by Shenzhen Goodix Technology Co., Ltd. (深圳市汇顶科技股份有限公司)
Priority to PCT/CN2020/096727
Publication of WO2021253308A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/204Image signal generators using stereoscopic image cameras

Definitions

  • the embodiments of the present application relate to the field of imaging, and more specifically, to an image acquisition device.
  • TOF: time of flight
  • the embodiment of the present application provides an image acquisition device, which can improve the quality of the acquired three-dimensional data.
  • an image acquisition device including: a microlens array including a plurality of microlenses; a plurality of pixel units in one-to-one correspondence with the plurality of microlenses, where each pixel unit includes a first pixel and a second pixel, and each microlens converges the light signal returned by the object onto its corresponding pixel unit such that the angles of the light signals received by the plurality of first pixels in the plurality of pixel units are the same, and the angles of the light signals received by the plurality of second pixels are the same; and a processing circuit configured to obtain a first perspective image according to the output of the plurality of first pixels, obtain a second perspective image according to the output of the plurality of second pixels, obtain a time-of-flight (TOF) image of the object according to the output of the plurality of pixel units, and generate target three-dimensional data of the object according to at least the TOF image, the first perspective image, and the second perspective image.
  • the structure in this application, in which a pixel unit includes multiple pixels, is similar in principle to a multi-camera setup. Because its baseline is small (about a few millimeters), it mainly measures the three-dimensional data of nearby objects, and because it images passively, it does not suffer from the multipath problem of the TOF imaging mode. Traditional TOF, by contrast, requires no baseline and can measure three-dimensional data at medium and long distances, but its error at close range is larger. Combining the two measurement methods therefore yields three-dimensional data over a larger range; the two kinds of data complement each other, which alleviates the multipath problem of TOF and produces three-dimensional data of better quality than either method alone. The solution of the embodiments of the present application can thus improve the quality of the obtained three-dimensional data.
  • the embodiment of the present application only changes the correspondence between the microlens array and the pixels; it adds no significant new components or materials and does not cause a significant increase in cost.
  • angles of the light signals received by the first pixel and the second pixel are different.
  • the first pixel and the second pixel in each pixel unit are arranged in a 1×2 matrix, or the first pixel and the second pixel in each pixel unit are arranged in a 2×1 matrix.
  • the processing circuit is configured to: generate first three-dimensional data of the object according to the TOF image; generate second three-dimensional data of the object according to the first perspective image and the second perspective image; and generate the target three-dimensional data based on the first three-dimensional data and the second three-dimensional data.
  • each pixel unit further includes a third pixel and a fourth pixel, and each microlens is used to converge the light signal returned by the object onto a corresponding pixel unit, such that the angles of the light signals received by the plurality of third pixels in the plurality of pixel units are the same, and the angles of the light signals received by the plurality of fourth pixels are the same.
  • the first pixel, the second pixel, the third pixel, and the fourth pixel in each pixel unit are arranged in a 2×2 matrix form.
  • angles of the light signals received by the third pixel and the fourth pixel are different.
  • the processing circuit is configured to obtain a third perspective image according to the output of the plurality of third pixels, obtain a fourth perspective image according to the output of the plurality of fourth pixels, and generate target three-dimensional data of the object according to the TOF image, the first perspective image, the second perspective image, the third perspective image, and the fourth perspective image.
  • the processing circuit is configured to generate first three-dimensional data of the object according to the TOF image; generate second three-dimensional data of the object according to the first perspective image, the second perspective image, the third perspective image, and the fourth perspective image; and generate the target three-dimensional data based on the first three-dimensional data and the second three-dimensional data.
  • the processing circuit is used to fuse the first three-dimensional data and the second three-dimensional data to obtain the target three-dimensional data.
  • the processing circuit is used to fuse the first three-dimensional data and the second three-dimensional data by using the iterative closest point (ICP) algorithm.
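  • The ICP fusion step can be sketched with a minimal point-to-point variant: alternate brute-force nearest-neighbour correspondence search with a closed-form rigid fit (Kabsch/SVD). This is an editor's illustration, not the application's implementation; a real fusion of the TOF and multi-view point clouds would add a k-d tree for correspondence search, outlier rejection, and weighting by each method's per-range error.

```python
import numpy as np

def icp_point_to_point(src, dst, iterations=20):
    """Minimal point-to-point ICP aligning src (N,3) onto dst (N,3)."""
    src = src.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iterations):
        # Brute-force nearest-neighbour correspondences.
        d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        nn = dst[d2.argmin(axis=1)]
        # Best-fit rigid transform via the Kabsch/SVD method.
        mu_s, mu_d = src.mean(0), nn.mean(0)
        H = (src - mu_s).T @ (nn - mu_d)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:  # fix an improper (reflected) rotation
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_d - R @ mu_s
        src = src @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return src, R_total, t_total

# Demo: a cloud rigidly perturbed by a small rotation and translation
# should be recovered almost exactly.
rng = np.random.default_rng(0)
dst_cloud = rng.normal(size=(20, 3))
angle = 0.05
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0,            0.0,           1.0]])
src_cloud = dst_cloud @ rot_z.T + np.array([0.02, -0.01, 0.015])
aligned, R_est, t_est = icp_point_to_point(src_cloud, dst_cloud)
```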
  • the plurality of pixel units are pulsed TOF sensing units, and the TOF image is generated according to the light signals sensed by the plurality of pixel units at least once.
  • the plurality of pixel units are phase TOF sensing units, and the TOF image is generated according to the light signals sensed by the plurality of pixel units at least three times.
  • a main lens is further included, and the main lens is arranged above the microlens array for imaging the object.
  • the microlens array is arranged on the focal plane of the main lens.
  • a light source is further included, the light source is configured to emit a light signal of a specified wavelength to the object, and the plurality of pixel units are configured to sense the light signal of a specified wavelength returned by the object.
  • the optical signal of the specified wavelength is an infrared optical signal.
  • a control unit is further included, and the control unit is configured to control the light source to synchronize clocks with the plurality of pixel units.
  • a filter structure is further included, and the filter structure is disposed above the microlens array for filtering optical signals of non-designated wavelengths.
  • the filter structure is a filter film, and the filter film is provided on the upper surface of the microlens array by coating.
  • the filter structure is a filter.
  • an image acquisition device including: a microlens array including a plurality of microlenses; and a plurality of pixel units in one-to-one correspondence with the plurality of microlenses, where each pixel unit includes a first pixel and a second pixel, and each microlens converges the light signal returned by the object onto its corresponding pixel unit such that the angles of the light signals received by the plurality of first pixels in the plurality of pixel units are the same, and the angles of the light signals received by the plurality of second pixels are the same; wherein the output of the plurality of first pixels is used to generate a first perspective image, the output of the plurality of second pixels is used to generate a second perspective image, the output of the plurality of pixel units is used to generate a time-of-flight (TOF) image of the object, and the TOF image, the first perspective image, and the second perspective image are used to generate the target three-dimensional data of the object.
  • Figure 1 is a schematic structural diagram of a conventional TOF system.
  • Fig. 2 is a schematic structural diagram of an image acquisition device provided by an embodiment of the present application.
  • Figs. 3 and 4 are schematic structural diagrams of a pixel unit provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of the propagation path of optical signals reflected from different points on an object provided by an embodiment of the present application.
  • Fig. 6 is a schematic structural diagram of another image acquisition device provided by an embodiment of the present application.
  • Typical technologies include dual (multi-) vision solutions, structured light solutions, and time of flight (TOF) solutions.
  • the principle of binocular vision is similar to that of human eyes: two cameras capture images of an object from different directions, the captured images are matched to establish correspondences between features, and the image points of the same physical point in the different images are associated to obtain the three-dimensional data of the object.
  • Binocular stereo vision acquires three-dimensional information based on the principle of triangulation: the image planes of the two cameras and the measured object form a triangle. Knowing the positional relationship between the two cameras and the coordinates of the object in the left and right images, the three-dimensional size of the object in the common field of view of the two cameras and the three-dimensional coordinates of the feature points of the spatial object can be obtained. A binocular vision system is therefore generally composed of two cameras.
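  • The triangulation relation above can be made concrete. In the standard rectified two-camera model, depth Z follows from the focal length f (in pixels), the baseline B, and the disparity d as Z = f·B/d. A minimal sketch; the numeric values below are illustrative assumptions, not values from the application:

```python
def binocular_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth from rectified stereo triangulation: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px

# Illustrative: a 1000 px focal length, 60 mm baseline, and 20 px disparity
# place the point 3 m from the cameras.
depth = binocular_depth(1000.0, 0.06, 20.0)
```

Note how depth error grows as disparity shrinks, which is why a few-millimeter baseline (as in this application) is only accurate at close range.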
  • a multi-eye vision system is generally composed of multiple cameras that capture images of an object from different directions, and merge the captured images to obtain three-dimensional data of the object.
  • the structured light scheme uses an infrared laser to project light with certain structural characteristics onto the object to be photographed, and then a special infrared camera collects the reflected structured light pattern, and calculates the depth information according to the principle of triangulation.
  • the basic principle of ToF is to continuously emit light pulses (usually invisible light) toward the observed object, use a sensor to receive the light returned from the object, and obtain the distance between the object and the camera by detecting the flight (round-trip) time of the light pulses.
  • the TOF method can generally be divided into two types according to different modulation methods: pulsed modulation and continuous wave modulation.
  • the principle of the pulse modulation scheme is relatively simple: it calculates the distance directly from the time difference t between pulse emission and reception. This method can also be called the pulse TOF method.
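  • The pulse-TOF relation can be written directly: the round trip covers twice the object distance, so d = c·t/2. A minimal sketch (the 10 ns round-trip time is an assumed example value):

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def pulse_tof_distance(round_trip_s: float) -> float:
    """Distance from a measured pulse round-trip time: d = c * t / 2."""
    return SPEED_OF_LIGHT * round_trip_s / 2.0

# A 10 ns round trip corresponds to roughly 1.5 m.
d = pulse_tof_distance(10e-9)
```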
  • since the phase offset of the sine wave between the receiving end and the transmitting end is proportional to the distance between the object and the camera, the phase offset can be used to measure the distance.
  • This method can also be called the phase TOF method or the indirect TOF method.
  • the distance is calculated from the phase delay of the light from emission to reception, which is computed over multiple frames.
  • the specific calculation formula can be: d = c·Δφ/(4πf), where c is the speed of light, f is the operating frequency of the light source, and Δφ is the phase delay.
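  • The formula d = c·Δφ/(4πf) can be checked numerically; the 100 MHz modulation frequency below is an assumption for illustration, not a value from the application:

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def phase_tof_distance(phase_delay_rad: float, mod_freq_hz: float) -> float:
    """Distance from a CW-TOF phase delay: d = c * dphi / (4 * pi * f)."""
    return SPEED_OF_LIGHT * phase_delay_rad / (4.0 * math.pi * mod_freq_hz)

# A phase delay of pi at 100 MHz gives c / (4e8) ~ 0.75 m, i.e. half the
# unambiguous range c / (2f).
d = phase_tof_distance(math.pi, 100e6)
```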
  • current consumer TOF depth cameras mainly include Microsoft's Kinect 2, MESA's SR4000, and the PMD Tech TOF depth camera used in Google's Project Tango. These products have found many applications in somatosensory recognition, gesture recognition, environment modeling, and so on; the most typical is Microsoft's Kinect 2.
  • a TOF depth camera can change its measurement distance by adjusting the frequency of the emitted pulses. Unlike depth cameras based on the feature-matching principle, its measurement accuracy does not decrease as the measurement distance increases, and its measurement error is basically constant over the entire measurement range; a TOF depth camera also has strong anti-interference ability. Therefore, in applications that require a relatively long measurement distance (such as unmanned driving), the TOF depth camera has a very obvious advantage.
  • TOF depth cameras have high requirements for time-measurement accuracy; even with the most precise electronic components, it is difficult to achieve millimeter-level accuracy. Therefore, in close-range measurement, especially within 1 m, the accuracy of TOF depth cameras still lags well behind that of other depth cameras, which limits their use in close-range high-precision scenarios such as face-recognition unlocking and face-recognition payment.
  • the TOF solution is used more and more widely due to its advantages of no baseline requirement, compact structure, fast speed, and simple algorithms; it has been adopted in smartphones such as the Huawei P30 Pro and the Samsung Note 10+. However, consumer TOF generally demands low price, small size, and low power consumption, which leads to poor quality of the three-dimensional data obtained by TOF. For example, inaccurate depth information of the measured object clearly limits its effectiveness in applications such as 3D face recognition, photo background blurring, and somatosensory games.
  • the structure of a conventional TOF image acquisition device may be as shown in FIG. 1, and the device includes a main lens 120 and a pixel array 140.
  • the main lens 120 is used to converge the optical signals returned from the object 110
  • the pixel array 140 is used to sense the optical signals passing through the main lens 120 to generate a TOF image of the object 110.
  • the main lens 120 may also be referred to as an imaging lens.
  • the pixels in the pixel array 140 in FIG. 1 are TOF sensing units.
  • a layer of microlens array 130 can cover the pixel array 140, where one pixel corresponds to one microlens; that is, the pixel 141 corresponds to the microlens 131, the pixel 142 corresponds to the microlens 132, and the pixel 143 corresponds to the microlens 133.
  • a microlens can converge the light signal reflected by the object 110 to a corresponding pixel. Through the converging effect of the microlens, the pixel can receive more light energy.
  • FIG. 1 shows a schematic diagram of the propagation path of an optical signal reflected by a point 111 on an object 110.
  • the optical signal reflected by the point 111 is condensed by the main lens 120 and then reaches the microlens 131, and the pixel 141 can receive the optical signal condensed by the microlens 131.
  • the embodiment of the present application can generate a TOF image of the object according to the light signals sensed by the pixel array to obtain depth information of the object.
  • the TOF image can be used to generate three-dimensional data of the object through a depth reconstruction algorithm.
  • the TOF image in the embodiments of the present application may represent the depth information of the object, or the TOF image may represent the depth information and image information of the object.
  • This application is improved on the basis of this solution to obtain an image acquisition device that can improve the quality of the acquired three-dimensional data.
  • the image capture device may include a microlens array 230 and a plurality of pixel units 240.
  • the microlens array 230 includes a plurality of microlenses, and the plurality of pixel units 240 can correspond to the plurality of microlenses one-to-one.
  • Each pixel unit of the plurality of pixel units 240 may include a first pixel and a second pixel, and each microlens converges the light signal returned by the object onto its corresponding pixel unit, such that the angles of the light signals received by the plurality of first pixels in the plurality of pixel units 240 are the same, and the angles of the light signals received by the plurality of second pixels are the same.
  • the output of the plurality of first pixels is used to generate a first perspective image
  • the output of the plurality of second pixels is used to generate a second perspective image
  • the output of the plurality of pixel units is used to generate the time-of-flight TOF image of the object, and the TOF image, the first perspective image, and the second perspective image are used to generate target three-dimensional data of the object.
  • the image acquisition device also includes a processing circuit for obtaining a first perspective image according to the output of the plurality of first pixels, obtaining a second perspective image according to the output of the plurality of second pixels, obtaining the time-of-flight TOF image of the object according to the output of the plurality of pixel units, and generating the target three-dimensional data of the object according to at least the TOF image, the first perspective image, and the second perspective image.
  • the light signal sensed by one pixel unit can be used to generate the depth information at that pixel unit's position, and the light signals sensed by the multiple pixel units can be used to generate the depth information at each of the multiple pixel units, so that the depth information of the object is obtained and a TOF image of the object is generated.
  • the first pixels in the embodiment of the present application may refer to pixels located at the same relative position across the multiple pixel units, and the second pixels may refer to pixels located at another same relative position across the multiple pixel units; the first pixel and the second pixel are two different pixels.
  • the pixels located at the same relative position may refer to the pixel a in one pixel unit and the pixel b corresponding to the pixel a in another pixel unit.
  • each pixel unit contains two pixels whose relative positions within the unit are left and right. All pixels located on the left across the plurality of pixel units are pixels located at the same relative position, and all pixels located on the right across the plurality of pixel units are likewise pixels located at the same relative position.
  • one pixel unit includes two pixels, where the pixels 241a, 242a, and 243a may be the first pixels, and the pixels 241b, 242b, and 243b may be the second pixels.
  • the pixels 241a, 242a, and 243a can be used to generate a first-view image, and the pixels 241b, 242b, and 243b can be used to generate a second-view image.
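  • For the 1×2 layout just described, extracting the two view images from a raw frame is a matter of de-interleaving columns. A sketch, under the assumption (for illustration only) that the "a" pixels occupy the even columns and the "b" pixels the odd columns of an (H, 2W) raw frame:

```python
import numpy as np

def split_views(raw: np.ndarray):
    """De-interleave a raw frame of 1x2 pixel units into two view images.

    Even columns are taken as the first pixels (241a, 242a, ...), odd
    columns as the second pixels (241b, 242b, ...); the exact column
    parity is an assumption for illustration.
    """
    first_view = raw[:, 0::2]
    second_view = raw[:, 1::2]
    return first_view, second_view

# Example: a 2 x 6 raw frame holds a 2 x 3 grid of 1x2 pixel units.
raw = np.arange(12).reshape(2, 6)
first, second = split_views(raw)
```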
  • when a pixel unit includes x pixels, where x is an integer greater than 2, the embodiment of the present application may generate images of the object from only 2 views, or generate images of the object from all x views.
  • each pixel unit in the embodiment of the present application may further include a third pixel and a fourth pixel, and each microlens is used to converge the light signal returned by the object to a corresponding pixel unit, and make all The angles of the light signals received by the plurality of third pixels in the plurality of pixel units are the same, and the angles of the light signals received by the plurality of fourth pixels are the same.
  • the angles of the light signals received by the plurality of third pixels and the plurality of fourth pixels are different. Specifically, the angles of the light signals received by the plurality of first pixels, the plurality of second pixels, the plurality of third pixels, and the plurality of fourth pixels are all different.
  • the three-dimensional data in the embodiments of the present application refers to data that can describe the shape and spatial position of an object in a three-dimensional space.
  • the three-dimensional data may include, but is not limited to, a three-dimensional point cloud (point cloud), a depth image (depth image/range image), or a three-dimensional network (mesh).
  • the one-to-one correspondence between the multiple pixel units and the multiple microlenses means that each pixel unit has a corresponding microlens, and the optical signal passing through one microlens can be received by all pixels in the corresponding pixel unit.
  • the light signal passing through the microlens 231 can be received by the pixels 241a and 241b
  • the light signal passing through the microlens 232 can be received by the pixels 242a and 242b
  • the light signal passing through the microlens 233 can be received by the pixels 243a and 243b.
  • one pixel unit may include 2 pixels or 4 pixels or even more pixels.
  • one pixel unit may also include 3 pixels or other numbers of pixels.
  • one pixel unit includes 2 pixels.
  • the plurality of pixel units 240 includes a pixel unit 241, a pixel unit 242, and a pixel unit 243.
  • the pixel unit 241 includes a pixel 241a and a pixel 241b
  • the pixel unit 242 includes a pixel 242a and a pixel 242b
  • the pixel unit 243 includes a pixel 243a and a pixel 243b.
  • the embodiment of the present application does not specifically limit the arrangement of pixels in a pixel unit.
  • a pixel unit includes an even number of pixels
  • the even number of pixels can be arranged in a matrix.
  • when a pixel unit includes 2 pixels, that is, when it includes a first pixel and a second pixel, the first pixel and the second pixel may be arranged in a 1×2 or 2×1 matrix.
  • when a pixel unit includes 4 pixels, that is, when it includes the first pixel, the second pixel, the third pixel, and the fourth pixel, these four pixels can be arranged in a 2×2 matrix.
  • each pixel unit contains an even number of pixels, the arrangement is relatively neat, which is beneficial to save space.
  • the two pixels can be arranged in an up-and-down structure, as shown in the left part of Figure 3, or in a left-right structure, as shown in the right part of Figure 3.
  • when a pixel unit includes 4 pixels, the 4 pixels can be arranged in a 2×2 matrix, as shown in FIG. 4.
  • the embodiment of the present application does not specifically limit the shape of a single pixel.
  • the pixel may be square or circular.
  • the pixel values of the multiple pixels in a pixel unit can be combined, that is, summed, and the depth value at each pixel unit can be calculated from the summed pixel value to generate the TOF image of the object.
  • alternatively, the pixel values of the multiple pixels in one pixel unit can be averaged, and the depth value at each pixel unit can be calculated from the averaged pixel value to obtain the TOF image of the object.
  • for phase TOF, the pixel value of one pixel can be used to represent the integral value of the phase of the optical signal; for pulsed TOF, the pixel value of one pixel can be used to represent the flight time of the optical signal.
  • the pixel values of pixels 241a and 241b can be averaged to obtain the pixel value of pixel unit 241; the pixel values of pixels 242a and 242b can be averaged to obtain the pixel value of pixel unit 242; and the pixel values of pixels 243a and 243b can be averaged to obtain the pixel value of pixel unit 243. Then, based on the pixel values of the pixel unit 241, the pixel unit 242, and the pixel unit 243, a TOF image of the object is generated.
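  • The averaging just described can be sketched in the same de-interleaved form: each pixel unit's two pixel values are averaged to give one TOF sample per unit (the 1×2 column layout is an assumption for illustration):

```python
import numpy as np

def pixel_unit_values(raw: np.ndarray) -> np.ndarray:
    """Average each 1x2 pixel unit of a raw frame to one value per unit.

    raw has shape (H, 2*W); the result has shape (H, W) and would feed
    the depth calculation that produces the TOF image.
    """
    return (raw[:, 0::2] + raw[:, 1::2]) / 2.0

raw = np.arange(12, dtype=float).reshape(2, 6)
unit_vals = pixel_unit_values(raw)  # averages column pairs (0,1), (2,3), (4,5)
```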
  • The number of pixels included in each pixel unit is the same, and the arrangement is also the same. In this way, the angles of the light signals received by pixels at the same relative position across the multiple pixel units are the same, while the angles of the light signals received by pixels at different relative positions within one pixel unit are different; that is, the angles of the light signals received by the first pixel and the second pixel are different. The light signals sensed by the pixels at the same relative position can therefore generate an image of one viewing angle: if a pixel unit includes n pixels, images of n viewing angles can be generated, where n is an integer greater than or equal to 2.
  • the pixel 241a can receive the light signal within the angle 2
  • the pixel 241b can receive the light signal within the angle 1
  • the pixel 242a can receive the light signal within the angle 4
  • the pixel 242b can receive the light signal within the angle 3.
  • the pixel 243a can receive the light signal within the range of angle 6, and the pixel 243b can receive the light signal within the range of angle 5.
  • angle 2, angle 4, angle 6 have the same size and direction
  • angle 1, angle 3, angle 5 have the same size and direction
  • the pixel 241a, the pixel 242a, and the pixel 243a receive light signals reflected from the object at one same viewing angle, and the pixels 241b, 242b, and 243b receive light signals reflected from the object at another same viewing angle. Therefore, the light signals received by the pixel 241a, the pixel 242a, and the pixel 243a can be used to generate an image of one view of the object, and the light signals received by the pixels 241b, 242b, and 243b can be used to generate an image of another view of the object.
  • the structure of a pixel unit including multiple pixels in the embodiment of this application is similar in principle to a multi-camera setup. Because its baseline is small (about several millimeters), it mainly measures the three-dimensional data of nearby objects, and because its imaging method is passive, it is not affected by the multipath effect present in the TOF method. Traditional TOF, which requires no baseline, can measure three-dimensional data at medium and long distances, but the error in the three-dimensional data it measures at close range is larger. Combining the two measurement methods therefore yields three-dimensional data over a larger range; the two kinds of data complement each other, the multipath effect of TOF is alleviated, and three-dimensional data of better quality than either method alone can be obtained. Therefore, the solution of the embodiment of the present application can improve the quality of the obtained three-dimensional data.
  • the multipath effect means that the light signal emitted by the light source is reflected multiple times in the target scene: an object reflects not only the light signal arriving directly from the light source but also light signals arriving via other, indirect paths. The interference among reflected light from multiple sources causes data errors in the TOF method.
  • the embodiment of the present application only changes the correspondence between the microlens array and the pixels; it adds no significant new components or materials and does not cause a significant increase in cost.
  • the pixel in the embodiment of the present application is the smallest sensing unit for sensing light signals.
  • the image acquisition device in the embodiment of the present application may further include a main lens 220, which is disposed above the microlens array 230, and is used to converge the light signals returned from the object.
  • the microlens array 230 can be used to converge the light signals passing through the main lens 220 to a plurality of pixel units 240.
  • the micro lens array 230 can be arranged on the focal plane of the main lens 220, that is, the distance between the micro lens array 230 and the main lens 220 is the focal length of the main lens 220, which can simplify the process of generating three-dimensional data of the object.
  • the position of the microlens array 230 may also be offset to a certain extent from the imaging plane of the main lens 220, in which case the offset needs to be corrected when performing three-dimensional data processing.
  • the distance between the plurality of pixel units 240 and the microlens array 230 may be determined according to the focal length and diameter of a single microlens and the size of the pixel unit; for example, the distance between the plurality of pixel units 240 and the microlens array 230 may be greater than the focal length of one microlens.
  • FIG. 5 shows a schematic diagram of the propagation path of the optical signal reflected by the point 211 in the object, and the distance between the point 211 and the main lens 220 is G.
  • the optical signal reflected by the point 211 passes through the main lens 220 and is received by a microlens 232.
  • the optical signal passing through the microlens 232 is sensed by the pixel unit below it: part of the optical signal is sensed by pixel a, and another part of the optical signal is sensed by pixel b.
  • white pixels in a plurality of pixel units 240 are used to generate an image of one view of the object, and shaded pixels are used to generate an image of another view of the object.
  • the image acquisition device in the embodiment of the present application may further include a light source 520, the light source 520 may be used to emit a light signal of a specified wavelength to the object 510, and a plurality of pixel units 540 may be used to sense the light signal returned by the object 510 .
  • the light source may be a vertical-cavity surface-emitting laser (VCSEL) or a light emitting diode (LED).
  • VCSEL vertical-cavity surface-emitting laser
  • LED light emitting diode
  • the optical signal of the specified wavelength may be an invisible light signal, and the invisible light signal is not easily observed by the user's eyes, which is beneficial to improve the user experience.
  • visible light signals can also be used as the light source, which is not specifically limited in the embodiment of the present application.
  • the optical signal of the designated wavelength may be an infrared optical signal.
  • the image acquisition device in the embodiment of the present application may further include a control unit that can be used to control the light source 520 to be time-synchronized with the multiple pixel units 540, so that the time difference or phase difference of the light signal from emission to reception by the pixels can be accurately calculated.
  • the image acquisition device may further include a filter structure 530, which is disposed above the microlens array 560 and is used to filter out optical signals of non-specified wavelengths, so that only optical signals of the specified wavelength are received by each pixel unit 540. This can reduce the interference of ambient light on the TOF imaging process and improve the signal-to-noise ratio of the light signal collected by the TOF pixel units.
  • the filter structure 530 can also be arranged between the micro lens array 560 and the plurality of pixel units 540, as long as it is ensured that only the optical signals of the specified wavelength can reach the plurality of pixel units 540.
  • the filter structure 530 may be a filter film, and the filter film may be provided on the upper surface of the microlens array by coating.
  • the filter structure 530 may be a filter, and the filter is disposed on the upper surface of the micro lens array.
  • the image capture device in the embodiment of the present application may be set on a base 550, and the base 550 is used to fix the image capture device.
  • the multiple pixel units 540 in the embodiment of the present application may be TOF sensing units.
  • the TOF sensing unit can be a pulsed TOF sensing unit or a phase-type TOF sensing unit.
  • the TOF image may be generated based on the light signals sensed by the multiple pixel units at least once.
  • a pulsed TOF sensing unit needs to emit and collect at least once to obtain the TOF image.
  • the TOF image may be generated based on the light signals sensed by the plurality of pixel units at least 3 times.
  • a phase-type TOF sensing unit requires at least three emission and acquisition processes to obtain the TOF image.
  • Take the phase-type TOF sensing unit as an example. Assuming that a pixel unit includes 2 pixels and the TOF image is generated from light signals sensed 4 times, the average of the values sensed by the 2 pixels in a pixel unit is computed for each acquisition, giving 4 average values denoted x1, x2, x3, x4; the depth information of the object can then be obtained through the arctangent formula.
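As a concrete illustration of the averaging-plus-arctangent step above, the following is a minimal sketch. It assumes a 2-pixel unit, four acquisitions sampled at phase offsets of 0°, 90°, 180°, and 270°, and the common 4-phase convention φ = atan2(x4 − x2, x1 − x3); the patent does not specify the sensor's actual sampling convention, so these are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def phase_tof_depth(samples, mod_freq_hz):
    """Depth from four phase-shifted acquisitions of one pixel unit.

    `samples` is a list of four lists, one per acquisition; each inner
    list holds the raw values of the pixels in the unit (two here),
    which are averaged as described in the text.
    """
    x1, x2, x3, x4 = (sum(s) / len(s) for s in samples)
    # One common 4-phase convention (0/90/180/270 degrees).
    phase = math.atan2(x4 - x2, x1 - x3) % (2 * math.pi)
    # Phase delay -> distance: L = c * phi / (4 * pi * f)
    return C * phase / (4 * math.pi * mod_freq_hz)
```

Other sensors use a different sign or ordering convention for the four samples; only the atan2 arguments change.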
  • the TOF image in the embodiments of the present application may include not only depth information, but also image information of an object.
  • the pixel values of the multiple pixel units in the embodiments of the present application can be used to generate image information of an object in addition to being used to generate depth information.
  • the pixel value of the pixel unit can be used to obtain the depth information and the image information through different algorithms.
  • the light source 520 emits a light signal of a specified wavelength ⁇ to the object 510.
  • the light signal is reflected by the object 510 and then reaches the filter 530.
  • the filter can be used to block light signals of other wavelengths and transmit only light of wavelength λ, so the plurality of pixel units 540 will only receive the light signal of wavelength λ.
  • multiple pixel units can collect a frame of image, and each pixel in the multiple pixel units has a corresponding gray value.
  • For pulsed TOF, the multiple pixel units need to collect at least one image to calculate the three-dimensional data; for phase TOF, the multiple pixel units need to collect at least three images to calculate the three-dimensional data.
  • the TOF image, the first-view image, and the second-view image may be obtained based on the same one or multiple frames of images, or may be obtained based on different frames of images.
  • the light source emits a primary light signal, and multiple pixel units receive the corresponding primary light signal.
  • the TOF image, the first-view image, and the second-view image are all generated based on the light signals received by the multiple pixel units at the same time. In this way, the TOF image, first-view image, and second-view image can be obtained through a single emission and acquisition process, which can improve the processing speed of the three-dimensional data.
  • Alternatively, the light source emits light signals twice, and the multiple pixel units receive the corresponding light signals twice.
  • the TOF image is generated based on the light signals received by the multiple pixel units the first time.
  • the first-view and second-view images are generated based on the light signals received by the multiple pixel units the second time.
  • a pixel unit may include multiple pixels, and the pixels located at the same relative position in each pixel unit can be extracted to form an image of one view. In this way, if a pixel unit includes n pixels, images of n views can be obtained.
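The per-view extraction described above can be sketched as follows. The raw frame layout (contiguous unit_h × unit_w pixel blocks, one block per microlens) is an assumption made for illustration.

```python
import numpy as np

def extract_views(raw, unit_h, unit_w):
    """Split a raw frame into per-view images.

    `raw` is the full sensor frame; every unit_h x unit_w block of
    pixels sits under one microlens (assumed layout). Pixels at the
    same relative position inside each unit form one view image, so a
    unit of n = unit_h * unit_w pixels yields n view images.
    """
    h, w = raw.shape
    blocks = raw.reshape(h // unit_h, unit_h, w // unit_w, unit_w)
    # One (h/unit_h, w/unit_w) image per relative position (i, j).
    return [blocks[:, i, :, j] for i in range(unit_h) for j in range(unit_w)]
```

For the 1×2 or 2×1 arrangements mentioned earlier, this yields exactly the two view images; for a 2×2 unit it yields four.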
  • the embodiment of the present application does not specifically limit the manner in which the processing circuit generates the target three-dimensional data.
  • the processing circuit can directly generate the target three-dimensional data of the object according to the TOF image, the first-perspective image, and the second-perspective image through a preset algorithm.
  • the processing circuit may generate first three-dimensional data of the object based on the TOF image, generate second three-dimensional data of the object based on the first-view and second-view images, and then generate the target three-dimensional data based on the first three-dimensional data and the second three-dimensional data.
  • In some embodiments, a pixel unit includes a first pixel, a second pixel, a third pixel, and a fourth pixel.
  • the output of the plurality of third pixels in the plurality of pixel units is used to generate a third-view image, and the output of the plurality of fourth pixels is used to generate a fourth-view image.
  • the outputs of the multiple pixel units are used to generate a time-of-flight TOF image of the object, and the TOF image, together with the first-view, second-view, third-view, and fourth-view images, is used to generate target three-dimensional data of the object.
  • the processing circuit is further configured to obtain a third-view image according to the output of the plurality of third pixels, obtain a fourth-view image according to the output of the plurality of fourth pixels, and generate target three-dimensional data of the object according to the TOF image, the first-view image, the second-view image, the third-view image, and the fourth-view image.
  • the processing circuit is configured to generate first three-dimensional data of the object according to the TOF image; generate second three-dimensional data of the object according to the first-view image, the second-view image, the third-view image, and the fourth-view image; and generate the target three-dimensional data based on the first three-dimensional data and the second three-dimensional data.
  • the first three-dimensional data may be generated according to the TOF algorithm; the second three-dimensional data may be generated according to a binocular (multi-view) vision algorithm, a light-field camera algorithm, or a machine learning algorithm.
  • the processing circuit can fuse the first three-dimensional data and the second three-dimensional data to obtain the target three-dimensional data.
  • the processing circuit may use an iterative closest point (ICP) algorithm to fuse the first three-dimensional data and the second three-dimensional data.
  • Grayscale image acquisition: the plurality of pixel units 540 collect an image of the returned signal light. For pulsed TOF, at least one emission and acquisition process is used to obtain a grayscale image, which is the final grayscale image. For phase TOF, m (m ≥ 3) emission and acquisition processes are used to obtain m grayscale images; the m grayscale images are averaged, and the resulting average grayscale image is the final grayscale image.
  • Multi-pixel solving of three-dimensional data: multi-view vision algorithms or deep learning algorithms are applied to the final grayscale images obtained in the previous step to obtain the second three-dimensional data.
  • TOF solving of three-dimensional data: according to the data collected by the multiple pixel units, the pixel values of the multiple pixels in each pixel unit are averaged to obtain an average pixel value, thereby recovering the raw data of a traditional TOF structure; the first three-dimensional data is then obtained according to the TOF algorithm.
  • Three-dimensional data fusion: the first three-dimensional data solved by TOF and the second three-dimensional data solved from the multiple pixels are fused, for example using the ICP algorithm and/or its various evolved variants, to obtain three-dimensional data of better quality.
  • Post-processing: burr points in the fused three-dimensional data are processed, holes in the data are filled, and the data is smoothed and filtered to obtain optimized three-dimensional data.
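The per-unit pixel averaging used in the TOF solving step (recovering the raw data of a traditional one-pixel-per-microlens TOF structure) can be sketched as follows; the contiguous-block layout of each pixel unit in the raw frame is an assumption for illustration.

```python
import numpy as np

def unit_average(raw, unit_h, unit_w):
    """Average the pixels inside each unit_h x unit_w pixel unit,
    yielding one value per microlens, i.e. the raw data a traditional
    TOF sensor would produce for the same scene."""
    h, w = raw.shape
    blocks = raw.reshape(h // unit_h, unit_h, w // unit_w, unit_w)
    return blocks.mean(axis=(1, 3))
```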
  • the corresponding confidence maps can be generated respectively:
  • (1) Set two empirical thresholds, threshold one and threshold two.
  • When the confidence-map value corresponding to a data point in the first three-dimensional data is greater than threshold one, the three-dimensional information of that point is considered highly reliable; otherwise its reliability is low.
  • Similarly, by comparing the confidence-map values of the second three-dimensional data with threshold two, the reliability of each data point in the second three-dimensional data can be judged.
  • The fused three-dimensional data can be used directly. Alternatively, the distance between the object and the multiple pixel units can be judged first to determine which three-dimensional data to use. For example, if the distance between the object and the multiple pixel units is greater than a preset threshold, the first three-dimensional data can be used; in this case, the target three-dimensional data is the first three-dimensional data. If the distance is less than or equal to the preset threshold, the second three-dimensional data can be used; in this case, the target three-dimensional data is the second three-dimensional data.
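One way to realize the confidence-threshold comparison above as a per-point selection between the two data sets is sketched below. The text only states the thresholding idea; the concrete selection rule and the tie-break (prefer the higher confidence when both sources clear their thresholds) are assumptions.

```python
import numpy as np

def fuse_points(tof_pts, tof_conf, mv_pts, mv_conf, thr_one, thr_two):
    """Per-point selection between the TOF result and the multi-view
    result based on their confidence maps.

    For each point, take the TOF value when its confidence clears
    threshold one and it is at least as confident as the multi-view
    value (or the multi-view value fails threshold two); otherwise
    take the multi-view value.
    """
    take_tof = (tof_conf >= thr_one) & ((tof_conf >= mv_conf) | (mv_conf < thr_two))
    return np.where(take_tof[..., None], tof_pts, mv_pts)
```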
  • the image acquisition device of the embodiment of the present application can be applied to various occasions, for example, scenes such as three-dimensional face recognition, image background blurring, and somatosensory games.
  • Three-dimensional face recognition can be applied to scenes or devices related to three-dimensional data such as mobile phones, door locks, access control, and payment devices.
  • An embodiment of the present application also provides an electronic device, which includes any one of the image acquisition devices described above.
  • the electronic device can be, for example, a mobile phone, a computer, or other devices.
  • the technical solutions of the embodiments of the present application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which can be a personal computer, a server, a network device, etc.) execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disk.
  • the division of units, modules, or components in the device embodiments described above is only a logical function division, and there may be other divisions in actual implementation.
  • multiple units, modules, or components can be combined or integrated into another system, or some units, modules, or components can be ignored or not executed.
  • the units/modules/components described above as separate components may or may not be physically separated; that is, they may be located in one place, or they may be distributed over multiple network units. Some or all of the units/modules/components can be selected according to actual needs to achieve the objectives of the embodiments of the present application.


Abstract

The embodiments of the present application disclose an image acquisition apparatus, which can improve the quality of acquired three-dimensional data. The image acquisition apparatus comprises: a microlens array comprising a plurality of microlenses; a plurality of pixel units corresponding to the plurality of microlenses on a one-to-one basis, each pixel unit among the plurality of pixel units comprising a first pixel and a second pixel, and each microlens being used for converging an optical signal returned by an object onto the corresponding pixel unit, such that the optical signals received by the plurality of first pixels in the plurality of pixel units have the same angle and the optical signals received by the plurality of second pixels have the same angle; and a processing circuit for obtaining a first-view image according to the output of the plurality of first pixels, obtaining a second-view image according to the output of the plurality of second pixels, obtaining a TOF image of the object according to the output of the plurality of pixel units, and generating target three-dimensional data of the object at least according to the TOF image, the first-view image, and the second-view image.

Description

Image acquisition apparatus
Technical field
The embodiments of the present application relate to the field of imaging, and more specifically, to an image acquisition apparatus.
Background
Depth measurement and three-dimensional data are increasingly being used in consumer electronic products. Typical technologies include binocular vision solutions, structured light solutions, and time-of-flight (TOF) solutions. The TOF solution has become more and more widely used due to advantages such as no baseline requirement, compact structure, fast speed, and simple algorithms, and has been applied to more and more smartphones. However, consumer-grade TOF generally requires low price, small size, and low power consumption, and the three-dimensional data acquired by TOF under these constraints is of poor quality, which obviously limits the application effect and user experience in three-dimensional face recognition, photo background blurring, somatosensory games, and the like.
Summary of the invention
The embodiments of the present application provide an image acquisition apparatus that can improve the quality of acquired three-dimensional data.
In a first aspect, an image acquisition apparatus is provided, including: a microlens array including a plurality of microlenses; a plurality of pixel units in one-to-one correspondence with the plurality of microlenses, each of the plurality of pixel units including a first pixel and a second pixel, each microlens being used to converge the light signal returned by an object onto the corresponding pixel unit such that the light signals received by the plurality of first pixels in the plurality of pixel units have the same angle and the light signals received by the plurality of second pixels have the same angle; and a processing circuit, used to obtain a first-view image according to the output of the plurality of first pixels, obtain a second-view image according to the output of the plurality of second pixels, obtain a time-of-flight TOF image of the object according to the output of the plurality of pixel units, and generate target three-dimensional data of the object at least according to the TOF image, the first-view image, and the second-view image.
In this application, the structure in which one pixel unit includes multiple pixels is similar in principle to a multi-camera setup. Because its baseline is small, about a few millimeters, it mainly measures three-dimensional data of nearby objects, and because this imaging mode is passive, the multipath problem of TOF imaging does not arise. Traditional TOF, which needs no baseline, can measure three-dimensional data at medium and long distances, but the three-dimensional data it measures at close range has larger errors. Combining the two measurement modes therefore makes it possible to obtain three-dimensional data over a larger range, and making the two sets of data complement each other alleviates the multipath problem of TOF, yielding three-dimensional data of better quality than either mode alone. The solution of the embodiments of the present application can thus improve the quality of the obtained three-dimensional data.
In addition, compared with the structure of traditional TOF, the embodiments of the present application only change the correspondence between the microlens array and the pixels, without notably adding new component materials, and therefore do not cause a significant increase in cost.
In a possible implementation manner, the angles of the light signals received by the first pixel and the second pixel are different.
In a possible implementation manner, the first pixel and the second pixel in each pixel unit are arranged in a 1×2 matrix, or the first pixel and the second pixel in each pixel unit are arranged in a 2×1 matrix.
In a possible implementation manner, the processing circuit is configured to: generate first three-dimensional data of the object according to the TOF image; generate second three-dimensional data of the object according to the first-view image and the second-view image; and generate the target three-dimensional data according to the first three-dimensional data and the second three-dimensional data.
In a possible implementation manner, each pixel unit further includes a third pixel and a fourth pixel, and each microlens is used to converge the light signal returned by the object onto the corresponding pixel unit such that the light signals received by the plurality of third pixels in the plurality of pixel units have the same angle and the light signals received by the plurality of fourth pixels have the same angle.
In a possible implementation manner, the first pixel, the second pixel, the third pixel, and the fourth pixel in each pixel unit are arranged in a 2×2 matrix.
In a possible implementation manner, the angles of the light signals received by the third pixel and the fourth pixel are different.
In a possible implementation manner, the processing circuit is configured to obtain a third-view image according to the output of the plurality of third pixels, obtain a fourth-view image according to the output of the plurality of fourth pixels, and generate the target three-dimensional data of the object according to the TOF image, the first-view image, the second-view image, the third-view image, and the fourth-view image.
In a possible implementation manner, the processing circuit is configured to generate first three-dimensional data of the object according to the TOF image; generate second three-dimensional data of the object according to the first-view image, the second-view image, the third-view image, and the fourth-view image; and generate the target three-dimensional data according to the first three-dimensional data and the second three-dimensional data.
In a possible implementation manner, the processing circuit is used to fuse the first three-dimensional data and the second three-dimensional data to obtain the target three-dimensional data.
In a possible implementation manner, the processing circuit is used to fuse the first three-dimensional data and the second three-dimensional data by an iterative closest point (ICP) algorithm.
In a possible implementation manner, the plurality of pixel units are pulsed TOF sensing units, and the TOF image is generated according to light signals sensed by the plurality of pixel units at least once.
In a possible implementation manner, the plurality of pixel units are phase-type TOF sensing units, and the TOF image is generated according to light signals sensed by the plurality of pixel units at least three times.
In a possible implementation manner, the apparatus further includes a main lens disposed above the microlens array for imaging the object.
In a possible implementation manner, the microlens array is disposed on the focal plane of the main lens.
In a possible implementation manner, the apparatus further includes a light source used to emit a light signal of a specified wavelength toward the object, and the plurality of pixel units are used to sense the light signal of the specified wavelength returned by the object.
In a possible implementation manner, the light signal of the specified wavelength is an infrared light signal.
In a possible implementation manner, the apparatus further includes a control unit used to keep the light source clock-synchronized with the plurality of pixel units.
In a possible implementation manner, the apparatus further includes a filter structure disposed above the microlens array for filtering out light signals of non-specified wavelengths.
In a possible implementation manner, the filter structure is a filter film, and the filter film is provided on the upper surface of the microlens array by coating.
In a possible implementation manner, the filter structure is a filter.
In a second aspect, an image acquisition apparatus is provided, including: a microlens array including a plurality of microlenses; a plurality of pixel units in one-to-one correspondence with the plurality of microlenses, each of the plurality of pixel units including a first pixel and a second pixel, each microlens being used to converge the light signal returned by an object onto the corresponding pixel unit such that the light signals received by the plurality of first pixels in the plurality of pixel units have the same angle and the light signals received by the plurality of second pixels have the same angle; wherein the output of the plurality of first pixels is used to generate a first-view image, the output of the plurality of second pixels is used to generate a second-view image, the outputs of the plurality of pixel units are used to generate a time-of-flight TOF image of the object, and the TOF image, the first-view image, and the second-view image are used to generate target three-dimensional data of the object.
Brief description of the drawings
FIG. 1 is a schematic structural diagram of a traditional TOF system.
FIG. 2 is a schematic structural diagram of an image acquisition apparatus provided by an embodiment of the present application.
FIG. 3 and FIG. 4 are schematic structural diagrams of a pixel unit provided by an embodiment of the present application.
FIG. 5 is a schematic diagram of the propagation paths of light signals reflected from different points on an object provided by an embodiment of the present application.
FIG. 6 is a schematic structural diagram of another image acquisition apparatus provided by an embodiment of the present application.
Detailed description
The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings.
Depth measurement and three-dimensional data are increasingly being applied in consumer electronic products. Typical technologies include binocular (multi-view) vision solutions, structured light solutions, and time-of-flight (TOF) solutions.
The principle of binocular vision is similar to that of human eyes: two cameras capture images of an object from different directions, the captured images are fused, correspondences between features are established, and the image points of the same physical point in space are matched across the different images to obtain the three-dimensional data of the object.
Binocular stereo vision acquires three-dimensional information by the principle of triangulation; that is, a triangle is formed between the image planes of the two cameras and the measured object. Given the positional relationship between the two cameras and the coordinates of the object in the left and right images, the three-dimensional size of the object in the common field of view of the two cameras and the three-dimensional coordinates of the feature points of the object in space can be obtained. A binocular vision system is therefore generally composed of two cameras.
Similarly, a multi-view vision system is generally composed of multiple cameras, which capture images of an object from different directions and fuse the captured images to obtain the three-dimensional data of the object.
The structured light solution uses an infrared laser to project light with certain structural features onto the object to be photographed; a dedicated infrared camera then collects the reflected structured light pattern, and depth information is calculated according to the principle of triangulation.
The basic principle of TOF is to continuously emit light pulses (generally invisible light) toward the observed object, receive the light returning from the object with a sensor, and obtain the distance between the object and the camera by detecting the flight (round-trip) time of the light pulses.
Depending on the modulation method, TOF methods can generally be divided into two types: pulsed modulation and continuous wave modulation.
The principle of the pulsed modulation scheme is relatively simple: it calculates the distance directly from the time difference t between pulse emission and reception; this method can also be called the pulsed TOF method. The specific calculation formula can be: distance L = c × t / 2, where c is the speed of light.
In practical applications, continuous wave modulation usually uses sine wave modulation or other periodic signal modulation. Since the phase offset of the sine wave between the receiving end and the transmitting end is proportional to the distance between the object and the camera, the phase offset can be used to measure the distance; this method can also be called the phase-type TOF method or the indirect TOF method. The distance is calculated from the phase delay φ of the light from emission to reception, computed over multiple frames. The specific calculation formula can be: distance L = c·φ / (4πf), where c is the speed of light, f is the operating frequency of the light source, and φ is the phase delay.
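The pulsed and continuous-wave distance formulas above can be sketched as follows. The unambiguous-range helper (the distance at which the phase wraps past 2π, c / (2f)) is a standard property of continuous-wave TOF, not stated in the text.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def pulsed_tof_distance(round_trip_seconds):
    """Pulsed TOF: L = c * t / 2."""
    return C * round_trip_seconds / 2

def cw_tof_distance(phase_delay, mod_freq_hz):
    """Continuous-wave (phase-type) TOF: L = c * phi / (4 * pi * f)."""
    return C * phase_delay / (4 * math.pi * mod_freq_hz)

def unambiguous_range(mod_freq_hz):
    """Maximum distance before the phase wraps: c / (2 * f)."""
    return C / (2 * mod_freq_hz)
```

For a 20 MHz modulation frequency the unambiguous range is about 7.5 m, which illustrates why lower modulation frequencies extend the measurement range at the cost of depth resolution.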
Current consumer-grade TOF depth cameras mainly include Microsoft's Kinect 2, MESA's SR4000, and the PMD Tech TOF depth camera used in Google Project Tango. These products have found many applications in motion sensing, gesture recognition, environment modeling, and so on; the most typical is Microsoft's Kinect 2.

A TOF depth camera can change its measurement range by adjusting the frequency of the emitted pulses. Unlike depth cameras based on the feature-matching principle, its measurement accuracy does not decrease as the measurement distance increases; its measurement error is essentially constant over the entire measurement range. TOF depth cameras also have strong anti-interference capability. Therefore, in scenarios requiring relatively long measurement distances (such as autonomous driving), TOF depth cameras have a very obvious advantage.

TOF depth cameras place high requirements on the accuracy of time measurement; even with the highest-precision electronic components, it is difficult to reach millimeter-level accuracy. Therefore, in close-range measurement, especially within 1 m, the accuracy of TOF depth cameras still lags considerably behind that of other depth cameras, which limits their application in close-range high-precision scenarios such as face-recognition unlocking and face-recognition payment.

At present, the TOF solution is increasingly widely used because it requires no baseline and is compact, fast, and algorithmically simple; it has already been applied in smartphones from brands such as Huawei (Huawei P30 Pro, Samsung Note 10+). However, consumer-grade TOF generally requires low price, small size, and low power consumption, which results in poor quality of the three-dimensional data obtained by TOF, such as inaccurate depth information of the measured object. This clearly limits its effectiveness in applications such as three-dimensional face recognition, photo background blurring, and motion-sensing games.
The structure of a conventional TOF image acquisition device may be as shown in FIG. 1. The device includes a main lens 120 and a pixel array 140. The main lens 120 is used to converge the optical signal returned by the object 110, and the pixel array 140 is used to sense the optical signal passing through the main lens 120 to generate a TOF image of the object 110.

The main lens 120 may also be referred to as an imaging lens.

The pixels in the pixel array 140 in FIG. 1 are TOF sensing units.

To improve light sensitivity, a microlens array 130 may be placed over the pixel array 140, with one microlens per pixel: pixel 141 corresponds to microlens 131, pixel 142 to microlens 132, and pixel 143 to microlens 133. Each microlens converges the optical signal reflected by the object 110 onto its corresponding pixel, so that, through the converging effect of the microlens, the pixel can receive more light energy.

FIG. 1 shows a schematic diagram of the propagation path of the optical signal reflected by a point 111 on the object 110. The optical signal reflected by the point 111 is converged by the main lens 120 and then reaches the microlens 131, and the pixel 141 receives the optical signal converged by the microlens 131.

Different pixels in the pixel array 140 can receive optical signals reflected from different points on the object 110. Therefore, the embodiments of the present application can generate a TOF image of the object from the optical signals sensed by the pixel array, so as to obtain depth information of the object. For example, the TOF image can be used to generate three-dimensional data of the object through a depth reconstruction algorithm.

The TOF image in the embodiments of the present application may represent the depth information of the object, or it may represent both the depth information and the image information of the object.
The present application improves on this solution to obtain an image acquisition device that can improve the quality of the acquired three-dimensional data.
As shown in FIG. 2, the image acquisition device may include a microlens array 230 and a plurality of pixel units 240. The microlens array 230 includes a plurality of microlenses, and the plurality of pixel units 240 may correspond to the plurality of microlenses one-to-one. Each of the plurality of pixel units 240 may include a first pixel and a second pixel. Each microlens is used to converge the optical signal returned by the object onto its corresponding pixel unit, such that the angles of the optical signals received by the plurality of first pixels in the plurality of pixel units 240 are the same, and the angles of the optical signals received by the plurality of second pixels are the same.

The outputs of the plurality of first pixels are used to generate a first-view image, the outputs of the plurality of second pixels are used to generate a second-view image, and the outputs of the plurality of pixel units are used to generate a time-of-flight (TOF) image of the object. The TOF image, the first-view image, and the second-view image are used to generate target three-dimensional data of the object.

The image acquisition device may further include a processing circuit, which is used to obtain the first-view image from the outputs of the plurality of first pixels, obtain the second-view image from the outputs of the plurality of second pixels, and obtain the TOF image of the object from the outputs of the plurality of pixel units, and to generate the target three-dimensional data of the object at least according to the TOF image, the first-view image, and the second-view image.

The optical signal sensed by one pixel unit can be used to generate depth information at the position of that pixel unit, and the optical signals sensed by the plurality of pixel units can be used to generate depth information at the respective pixel unit positions, so as to obtain the depth information of the object and generate the TOF image of the object.
In the embodiments of the present application, the first pixels may refer to pixels located at the same relative position within their respective pixel units, and the second pixels may likewise refer to pixels located at the same relative position within their respective pixel units, with the first pixels and the second pixels being different.

Pixels located at the same relative position may refer to a pixel a in one pixel unit and the pixel b corresponding to pixel a in another pixel unit. For example, suppose each pixel unit contains two pixels, whose relative positions within the pixel unit are the left side and the right side, respectively. All the left pixels of the plurality of pixel units are then pixels located at the same relative position, and all the right pixels are likewise pixels located at the same relative position.

Taking FIG. 2 as an example, one pixel unit includes two pixels, where pixels 241a, 242a, and 243a may be the first pixels, and pixels 241b, 242b, and 243b may be the second pixels.

Pixels 241a, 242a, and 243a can be used to generate the first-view image, and pixels 241b, 242b, and 243b can be used to generate the second-view image.

Only the first-view image and the second-view image are described above. Of course, if a pixel unit includes x pixels, where x is an integer greater than 2, the embodiments of the present application may generate images of the object from only two views, or may generate images of the object from x views.
For example, each pixel unit in the embodiments of the present application may further include a third pixel and a fourth pixel. Each microlens is used to converge the optical signal returned by the object onto its corresponding pixel unit, such that the angles of the optical signals received by the plurality of third pixels in the plurality of pixel units are the same, and the angles of the optical signals received by the plurality of fourth pixels are the same.

Due to the converging effect of the microlenses, the angles of the optical signals received by the plurality of third pixels and by the plurality of fourth pixels are different. Specifically, the angles of the optical signals received by the plurality of first pixels, the plurality of second pixels, the plurality of third pixels, and the plurality of fourth pixels are all different from one another.

The three-dimensional data in the embodiments of the present application refers to data that can describe the shape and spatial position of an object in three-dimensional space. The three-dimensional data may include, but is not limited to, a three-dimensional point cloud, a depth image (range image), or a three-dimensional mesh.
In the embodiments of the present application, the one-to-one correspondence between the plurality of pixel units and the plurality of microlenses means that each pixel unit has a corresponding microlens, and the optical signal passing through one microlens can be received by all the pixels in the corresponding pixel unit.

For example, referring to FIG. 2, the optical signal passing through the microlens 231 can be received by pixels 241a and 241b, the optical signal passing through the microlens 232 can be received by pixels 242a and 242b, and the optical signal passing through the microlens 233 can be received by pixels 243a and 243b.

The embodiments of the present application do not specifically limit the number of pixels included in a pixel unit. For example, a pixel unit may include 2 pixels, 4 pixels, or even more pixels. Of course, a pixel unit may also include 3 pixels or some other number of pixels.

Taking FIG. 2 as an example, one pixel unit includes 2 pixels. The plurality of pixel units 240 includes a pixel unit 241, a pixel unit 242, and a pixel unit 243. The pixel unit 241 includes pixels 241a and 241b, the pixel unit 242 includes pixels 242a and 242b, and the pixel unit 243 includes pixels 243a and 243b.
The embodiments of the present application do not specifically limit the arrangement of the pixels within a pixel unit. For example, if a pixel unit includes an even number of pixels, those pixels may be arranged as a matrix.

For example, when a pixel unit includes 2 pixels, that is, a first pixel and a second pixel, the first pixel and the second pixel may be arranged as a 1×2 or 2×1 matrix. When a pixel unit includes 4 pixels, that is, a first pixel, a second pixel, a third pixel, and a fourth pixel, these four pixels may be arranged as a 2×2 matrix.

When each pixel unit contains an even number of pixels, the arrangement is relatively regular, which helps to save space.

When a pixel unit includes two pixels, the two pixels may be arranged one above the other, as shown in the left diagram of FIG. 3, or side by side, as shown in the right diagram of FIG. 3.

When a pixel unit includes four pixels, the four pixels may be arranged as a 2×2 matrix, as shown in FIG. 4.

The embodiments of the present application do not specifically limit the shape of a single pixel; for example, a pixel may be square or circular.
Combining the multiple pixels of a pixel unit into a single effective pixel reverts to the traditional TOF mode; that is, a TOF image of the object can be generated from the optical signal sensed by a pixel unit as a whole.

For example, the pixel values of the multiple pixels within a pixel unit may be combined, that is, summed, and the depth value at each pixel unit calculated from the summed pixel value, to generate the TOF image of the object. As another example, the pixel values of the multiple pixels within a pixel unit may be averaged, and the depth value at each pixel unit calculated from the averaged pixel value, to obtain the TOF image of the object. As yet another example, a depth value may be calculated for each pixel within a pixel unit, and the average of these depth values taken, to obtain the TOF image of the object. Whichever calculation method is used, it only needs to be applied consistently across all pixel units.

For phase-based TOF, the pixel value of a pixel can be used to represent the integral of the phase of the optical signal; for pulse-based TOF, the pixel value of a pixel can be used to represent the flight time of the optical signal.
When using the TOF mode, the pixel values of pixels 241a and 241b can be averaged to obtain the pixel value of pixel unit 241; the pixel values of pixels 242a and 242b can be averaged to obtain the pixel value of pixel unit 242; and the pixel values of pixels 243a and 243b can be averaged to obtain the pixel value of pixel unit 243. Then, a TOF image of the object is generated based on the pixel values of pixel units 241, 242, and 243.
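As a hedged illustration of the averaging described above (not part of the claimed apparatus), the following sketch merges the sub-pixels of each pixel unit into one value per unit; the function name and the assumption that units tile the frame in fixed unit_rows × unit_cols blocks are choices made for this example:

```python
def merge_pixel_units(frame, unit_rows=1, unit_cols=2):
    """Average the sub-pixel values of each pixel unit.

    frame: 2D list of raw pixel values; each pixel unit is assumed
    to span unit_rows x unit_cols adjacent pixels.
    Returns a 2D list with one averaged value per pixel unit, from
    which the depth value at each unit can then be computed.
    """
    rows = len(frame) // unit_rows
    cols = len(frame[0]) // unit_cols
    merged = []
    for r in range(rows):
        row = []
        for c in range(cols):
            total = 0
            for i in range(unit_rows):
                for j in range(unit_cols):
                    total += frame[r * unit_rows + i][c * unit_cols + j]
            row.append(total / (unit_rows * unit_cols))
        merged.append(row)
    return merged
```

With the default 1×2 unit (a left and a right pixel, as for units 241/242/243), a row of four raw pixels collapses to two unit values.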
The number of pixels included in each pixel unit is the same, and so is their arrangement. In this way, the angles of the optical signals received by pixels at the same relative position in different pixel units remain consistent, while pixels at different relative positions within one pixel unit receive optical signals at different angles; that is, the first pixel and the second pixel receive optical signals at different angles. The optical signals sensed by the pixels at the same relative position can thus generate an image of one view; if a pixel unit includes n pixels, images of n views can be generated, where n is an integer greater than or equal to 2.

According to the imaging principle of the microlens, pixel 241a can receive the optical signal within angle ②, pixel 241b within angle ①, pixel 242a within angle ④, pixel 242b within angle ③, pixel 243a within angle ⑥, and pixel 243b within angle ⑤. Angles ②, ④, and ⑥ have the same size and direction, and angles ①, ③, and ⑤ have the same size and direction, but the direction of angles ②, ④, and ⑥ differs from that of angles ①, ③, and ⑤. In other words, pixels 241a, 242a, and 243a receive optical signals reflected by the object from one viewing angle, while pixels 241b, 242b, and 243b receive optical signals reflected by the object from another viewing angle. Therefore, the optical signals received by pixels 241a, 242a, and 243a can be used to generate an image of the object from one view, and pixels 241b, 242b, and 243b can be used to generate an image of the object from another view.
In the embodiments of the present application, the structure in which one pixel unit includes multiple pixels is similar in principle to a multi-camera arrangement. Because its baseline is very small, on the order of a few millimeters, it is mainly suited to measuring the three-dimensional data of nearby objects, and because this imaging is passive, it is not affected by the multipath effect present in the TOF method. Traditional TOF, which requires no baseline, can measure three-dimensional data at medium and long distances, but the three-dimensional data it measures at close range has larger errors. Combining the two measurement methods therefore makes it possible to acquire three-dimensional data over a larger range, and the two sets of data complement each other, mitigating the multipath effect of TOF, so that the resulting data quality is better than that obtained by either method alone. The solution of the embodiments of the present application can thus improve the quality of the obtained three-dimensional data.
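For background on the multi-camera principle mentioned above, the disparity between the two view images relates to depth through the baseline. A minimal sketch under the standard pinhole stereo model; the function name, the calibration-derived focal length in pixels, and the example baseline are assumptions, not values from the application:

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Pinhole stereo model: depth = f * B / d.

    disparity_px: pixel offset of the same object point between the
    first-view and second-view images.
    focal_px: focal length expressed in pixels (assumed known from
    calibration).
    baseline_m: separation of the two effective viewpoints; for
    sub-pixels under one main lens this is only a few millimeters,
    which is why the method favors close-range objects.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

With a 4 mm baseline and a 1000-pixel focal length, a 10-pixel disparity corresponds to a depth of 0.4 m; at longer ranges the disparity shrinks below a pixel, matching the text's observation that the passive method mainly measures nearby objects.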
The multipath effect means that the optical signal emitted by the light source undergoes multiple reflections in the target scene: an object reflects not only the optical signal emitted directly by the light source but also optical signals arriving via other, indirect paths. Interference between reflected light from multiple sources causes errors in the data obtained by the TOF method.

In addition, compared with the structure of traditional TOF, the embodiments of the present application only change the correspondence between the microlens array and the pixels, without notably adding new component materials, and therefore do not cause a significant increase in cost.

A pixel in the embodiments of the present application is the smallest sensing unit used for sensing optical signals.
The image acquisition device in the embodiments of the present application may further include a main lens 220, which is disposed above the microlens array 230 and is used to converge the optical signal returned by the object. The microlens array 230 can be used to converge the optical signals passing through the main lens 220 onto the plurality of pixel units 240.

The microlens array 230 may be disposed on the focal plane of the main lens 220, that is, the distance between the microlens array 230 and the main lens 220 equals the focal length of the main lens 220, which can simplify the processing for generating the three-dimensional data of the object. Of course, the microlens array 230 may also be positioned with a certain offset from the imaging plane of the main lens 220, in which case the offset needs to be corrected during the three-dimensional data processing.

The distance between the plurality of pixel units 240 and the microlens array 230 may be determined according to the focal length of a single microlens and the size of the pixel units; for example, the distance between the plurality of pixel units 240 and the microlens array 230 may be greater than the focal length of a single microlens.

FIG. 5 shows a schematic diagram of the propagation path of the optical signal reflected by a point 211 on the object, where the distance between the point 211 and the main lens 220 is G. According to the design of the main lens 220, the optical signal reflected by the point 211 passes through the main lens 220 and is received by one microlens 232; the optical signal passing through the microlens 232 is sensed by the pixel unit below it, with one part of the optical signal sensed by pixel a and another part sensed by pixel b.

In FIG. 5, the white pixels among the plurality of pixel units 240 are used to generate an image of the object from one view, and the shaded pixels are used to generate an image of the object from another view.
As shown in FIG. 6, the image acquisition device in the embodiments of the present application may further include a light source 520, which can be used to emit an optical signal of a specified wavelength toward the object 510, and the plurality of pixel units 540 can be used to sense the optical signal returned by the object 510.

The light source may be a vertical-cavity surface-emitting laser (VCSEL) or a light-emitting diode (LED).

The optical signal of the specified wavelength may be an invisible light signal; invisible light is not easily noticed by the user's eyes, which helps improve the user experience. Of course, in some less demanding scenarios, a visible light signal may also be used as the light source, which is not specifically limited in the embodiments of the present application.

Preferably, the optical signal of the specified wavelength may be an infrared light signal.

The image acquisition device in the embodiments of the present application may further include a control unit, which can be used to keep the light source 520 time-synchronized with the plurality of pixel units 540, so that the time difference or phase difference of the optical signal from emission to reception by a pixel can be calculated accurately.
Optionally, the image acquisition device may further include a filter structure 530, which is disposed above the microlens array 560 and is used to filter out optical signals of wavelengths other than the specified wavelength, so that only the optical signal of the specified wavelength is received by the plurality of pixel units 540. This can reduce the interference of ambient light with the TOF imaging process and improve the signal-to-noise ratio of the optical signals collected by the TOF pixel units.

Of course, the filter structure 530 may also be disposed between the microlens array 560 and the plurality of pixel units 540, as long as it is ensured that only the optical signal of the specified wavelength can reach the plurality of pixel units 540.

The filter structure 530 may be a filter film, which may be formed on the upper surface of the microlens array by coating. Alternatively, the filter structure 530 may be a filter plate disposed on the upper surface of the microlens array.

The image acquisition device in the embodiments of the present application may be mounted on a base 550, which is used to fix the image acquisition device.
The plurality of pixel units 540 in the embodiments of the present application may be TOF sensing units. A TOF sensing unit may be a pulse-based TOF sensing unit or a phase-based TOF sensing unit.

If the plurality of pixel units are pulse-based TOF sensing units, the TOF image may be generated from optical signals sensed by the plurality of pixel units at least once; a pulse-based TOF sensing unit needs only at least one emission-and-acquisition cycle to obtain a TOF image.

If the plurality of pixel units are phase-based TOF sensing units, the TOF image may be generated from optical signals sensed by the plurality of pixel units at least three times; a phase-based TOF sensing unit needs at least three emission-and-acquisition cycles to obtain a TOF image.
As an example, take phase-based TOF sensing units, and suppose each pixel unit includes 2 pixels and the TOF image is generated from optical signals sensed 4 times. The average of the values sensed by the 2 pixels of a pixel unit is calculated for each acquisition, yielding 4 averages denoted x1, x2, x3, and x4, from which the depth information of the object can then be obtained through the arctangent formula.
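A minimal sketch of the 4-sample arctangent calculation just mentioned, assuming the four averages x1..x4 are integrated at modulation phase offsets of 0°, 90°, 180°, and 270° (a common convention; the sample ordering of the described sensor is not specified in the application):

```python
import math

def four_phase_depth(x1, x2, x3, x4, mod_freq_hz):
    """Estimate distance from four phase samples (indirect TOF).

    Assumes x1..x4 were integrated at 0, 90, 180, and 270 degree
    offsets of the modulation signal.
    """
    c = 299_792_458.0  # speed of light, m/s
    # Phase delay of the returned signal relative to emission.
    phi = math.atan2(x2 - x4, x1 - x3)
    if phi < 0:
        phi += 2 * math.pi  # map into [0, 2*pi)
    # distance = c * phi / (4 * pi * f): the light covers the
    # object distance twice (out and back).
    return c * phi / (4 * math.pi * mod_freq_hz)
```

Repeating this per pixel unit yields the depth value at every unit position, i.e. the TOF image.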
In addition, the TOF image in the embodiments of the present application may include not only depth information but also image information of the object. That is, the pixel values of the plurality of pixel units can be used not only to generate depth information but also to generate image information of the object. For example, applying different algorithms to the pixel values of the pixel units can yield the depth information and the image information, respectively.

The image acquisition process of the embodiments of the present application is described below with reference to FIG. 6.

As shown in FIG. 6, the light source 520 emits an optical signal of a specified wavelength λ toward the object 510. The optical signal is reflected by the object 510 and reaches the filter 530; the filter blocks optical signals of other wavelengths and transmits only the optical signal of wavelength λ, so the plurality of pixel units 540 receive only the optical signal of wavelength λ. In one emission-and-acquisition cycle, the plurality of pixel units can capture one frame of image, in which every pixel of the plurality of pixel units has a corresponding gray value.

For pulse-based TOF, the plurality of pixel units need to capture at least one image to compute the three-dimensional data. For phase-based TOF, the plurality of pixel units need to capture at least three images to compute the three-dimensional data.
The TOF image and the first-view and second-view images may be obtained from the same frame or frames, or from different frames.

Taking pulse-based TOF as an example, the light source emits an optical signal once, and the plurality of pixel units receive the corresponding optical signal once; the TOF image, the first-view image, and the second-view image are all generated from the optical signal received by the plurality of pixel units in that same acquisition. In this way, the TOF image and the first-view and second-view images can be obtained through a single emission-and-acquisition cycle, which can improve the processing speed of the three-dimensional data.

As another example, the light source emits optical signals twice, and the plurality of pixel units receive the corresponding optical signals twice; the TOF image is generated from the optical signal received by the plurality of pixel units the first time, and the first-view image and the second-view image are generated from the optical signal received the second time.

As described above, a pixel unit may include multiple pixels, and extracting the pixels located at the same relative position in each pixel unit yields an image of one view. Thus, if a pixel unit includes n pixels, images of n views can be obtained.
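An illustrative sketch of this extraction, assuming a raw frame in which each pixel unit occupies `pixels_per_unit` horizontally adjacent pixels (left = first pixel, right = second pixel for the default of 2); the function name and the packing layout are assumptions for the example:

```python
def extract_views(frame, pixels_per_unit=2):
    """Split a raw frame into per-view images.

    frame: 2D list where each row packs pixel units of
    `pixels_per_unit` horizontally adjacent pixels.
    Returns a list of `pixels_per_unit` images; view k collects
    the k-th pixel of every unit, i.e. the pixels located at the
    same relative position across all pixel units.
    """
    views = [[] for _ in range(pixels_per_unit)]
    for row in frame:
        for k in range(pixels_per_unit):
            views[k].append(row[k::pixels_per_unit])
    return views
```

For n = pixels_per_unit pixels per unit, this yields the n view images described above, each at the pixel-unit resolution.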
The embodiments of the present application do not specifically limit the manner in which the processing circuit generates the target three-dimensional data. For example, the processing circuit may generate the target three-dimensional data of the object directly from the TOF image, the first-view image, and the second-view image through a preset algorithm. As another example, the processing circuit may generate first three-dimensional data of the object from the TOF image, generate second three-dimensional data of the object from the first-view image and the second-view image, and then generate the target three-dimensional data from the first three-dimensional data and the second three-dimensional data.
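One conceivable form of the second approach, offered purely as an illustration since the application does not specify a fusion rule, is a per-point weighted blend in which the stereo-derived depth dominates at close range (where the small-baseline multi-view data is more accurate) and the TOF-derived depth dominates farther away; the distance limits below are assumed values:

```python
def fuse_depths(tof_depth, stereo_depth, near_limit=1.0, far_limit=3.0):
    """Blend two depth estimates for one point (illustrative only).

    Below near_limit (meters), trust the stereo (multi-view)
    estimate; above far_limit, trust the TOF estimate; blend
    linearly in between. The limits are assumptions, not values
    from the application.
    """
    ref = tof_depth  # use the TOF value to pick the blending regime
    if ref <= near_limit:
        w_tof = 0.0
    elif ref >= far_limit:
        w_tof = 1.0
    else:
        w_tof = (ref - near_limit) / (far_limit - near_limit)
    return w_tof * tof_depth + (1.0 - w_tof) * stereo_depth
```

Applied over all points, such a rule yields fused target three-dimensional data whose near-range values come mostly from the passive views and whose far-range values come mostly from TOF, consistent with the complementarity described above.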
If a pixel unit includes a first pixel, a second pixel, a third pixel, and a fourth pixel, the outputs of the multiple third pixels in the multiple pixel units are used to generate a third-view image and the outputs of the multiple fourth pixels are used to generate a fourth-view image; the outputs of the multiple pixel units are used to generate the time-of-flight (TOF) image of the object, and the TOF image, the first-view image, the second-view image, the third-view image, and the fourth-view image are used to generate the target three-dimensional data of the object.
If the image acquisition apparatus includes a processing circuit, the processing circuit is further configured to obtain the third-view image from the outputs of the multiple third pixels, obtain the fourth-view image from the outputs of the multiple fourth pixels, and generate the target three-dimensional data of the object from the TOF image, the first-view image, the second-view image, the third-view image, and the fourth-view image.
Optionally, the processing circuit is configured to generate first three-dimensional data of the object from the TOF image; generate second three-dimensional data of the object from the first-view image, the second-view image, the third-view image, and the fourth-view image; and generate the target three-dimensional data from the first three-dimensional data and the second three-dimensional data.
The first three-dimensional data may be generated by a TOF algorithm; the second three-dimensional data may be generated by a binocular (or multi-view) stereo vision algorithm, by a light-field-camera-type algorithm, or by a machine learning algorithm.
The processing circuit may fuse the first three-dimensional data and the second three-dimensional data to obtain the target three-dimensional data.
The present application does not specifically limit the fusion method. For example, the processing circuit may fuse the first three-dimensional data and the second three-dimensional data by an iterative closest point (ICP) algorithm.
The process of solving for the three-dimensional data is described below.
Each time the light source 520 actively emits a light signal, the multiple pixel units 540 collect one image of the returned signal light. For pulsed TOF, a single emission-and-acquisition cycle suffices to obtain one grayscale image, which serves as the final grayscale image. For phase-based TOF, m (m ≥ 3) emission-and-acquisition cycles are needed, yielding m grayscale images; these m grayscale images are averaged, and the resulting average grayscale image is the final grayscale image.
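The grayscale-combination step above can be sketched as follows; the function name is an illustrative assumption. Pulsed TOF simply passes a single frame, while phase-based TOF passes its m ≥ 3 frames:

```python
import numpy as np

def final_grayscale(frames):
    """Combine per-acquisition grayscale frames into the final grayscale image.

    frames : sequence of 2D arrays, one per emission-acquisition cycle.
             A single pulsed-TOF frame averages to itself; m phase-TOF
             frames are averaged element-wise.
    """
    stack = np.stack(frames).astype(np.float64)
    return stack.mean(axis=0)
```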
Solving three-dimensional data from multiple pixels: the final grayscale image obtained in the previous step is solved with a multi-view vision algorithm or a deep learning algorithm to obtain the view-based three-dimensional data (the second three-dimensional data above). For example, the ICP algorithm and/or its various derivative algorithms may be used in this solution.
Solving three-dimensional data by TOF: from the data collected by the multiple pixel units, the pixel values of the multiple pixels within each pixel unit are averaged to obtain an average pixel value, yielding the raw data of a conventional TOF structure; the TOF algorithm then gives the TOF-based three-dimensional data (the first three-dimensional data above).
Three-dimensional data fusion: the multi-pixel solution is fused with the TOF solution to obtain three-dimensional data of better quality.
Post-processing: spike points in the fused three-dimensional data are removed, holes in the data are filled, and the data is smoothed and filtered, giving the optimized three-dimensional data.
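The per-unit averaging in the TOF-solving step, which recovers the raw frame a conventional TOF sensor would output, can be sketched as follows (the function name and row-major unit layout are illustrative assumptions):

```python
import numpy as np

def unit_average(raw, unit_h, unit_w):
    """Average the pixels within each pixel unit.

    raw     : 2D array of shape (H, W), tiled with unit_h x unit_w units.
    returns : array of shape (H // unit_h, W // unit_w) holding one
              average value per microlens -- the conventional-TOF raw data.
    """
    h, w = raw.shape
    blocks = raw.reshape(h // unit_h, unit_h, w // unit_w, unit_w).astype(np.float64)
    # Average over the two intra-unit axes.
    return blocks.mean(axis=(1, 3))
```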
A fusion method applicable to the embodiments of the present application is described below, although the embodiments of the present application are not limited to it.
In the course of computing the first three-dimensional data and the second three-dimensional data, a corresponding confidence map can be generated for each:
(1) Two empirical thresholds are set: threshold 1 and threshold 2. When the confidence-map value corresponding to a data point in the first three-dimensional data is greater than threshold 1, the three-dimensional information at that point is considered highly reliable; otherwise its reliability is considered low. Likewise, comparing the confidence-map values of the second three-dimensional data against threshold 2 determines the reliability of each data point in the second three-dimensional data.
(2) When a data point in the first three-dimensional data has high reliability and the corresponding point in the second three-dimensional data has low reliability, the value of the first data point is taken as the value of the corresponding point in the target three-dimensional data;
(3) when a data point in the first three-dimensional data has low reliability and the corresponding point in the second three-dimensional data has high reliability, the value of the second data point is taken as the value of the corresponding point in the target three-dimensional data;
(4) when the data point in the first three-dimensional data and the corresponding point in the second three-dimensional data both have high reliability, the mean of the first and second data-point values is taken as the value of the corresponding point in the target three-dimensional data;
(5) when both points have low reliability, the average of the values surrounding the corresponding point in the target three-dimensional data is taken as that point's value.
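The per-point rules (1)-(5) above can be sketched as follows. The array names and thresholds are illustrative, and the single-pass 8-neighbour fill for rule (5) is one simplifying assumption among several possible treatments of the "surrounding values":

```python
import numpy as np

def fuse_depth(d1, c1, d2, c2, t1, t2):
    """Fuse two depth maps per point by confidence, following rules (1)-(5).

    d1, d2 : depth maps from the two solvers (same shape)
    c1, c2 : their confidence maps
    t1, t2 : the two empirical thresholds of rule (1)
    """
    hi1, hi2 = c1 > t1, c2 > t2
    fused = np.where(hi1 & ~hi2, d1,                  # rule (2): trust d1
            np.where(~hi1 & hi2, d2,                  # rule (3): trust d2
            np.where(hi1 & hi2, (d1 + d2) / 2.0,      # rule (4): average
                     np.nan)))                        # rule (5): filled below
    # Rule (5): low/low points take the mean of their valid neighbours.
    bad = np.isnan(fused)
    if bad.any():
        padded = np.pad(fused, 1, constant_values=np.nan)
        neigh = np.stack([padded[i:i + fused.shape[0], j:j + fused.shape[1]]
                          for i in range(3) for j in range(3) if (i, j) != (1, 1)])
        fused[bad] = np.nanmean(neigh, axis=0)[bad]
    return fused
```

A production implementation would iterate the rule-(5) fill (or fall back to a larger window) when low-confidence points cluster together.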
In practical applications of the embodiments of the present application, the fused three-dimensional data can be used directly. Alternatively, the distance between the object and the multiple pixel units can be assessed first to decide how the three-dimensional data is used. For example, if the distance between the object and the multiple pixel units is greater than a preset threshold, the first three-dimensional data may be used, in which case the target three-dimensional data is the first three-dimensional data; if the distance is less than or equal to the preset threshold, the second three-dimensional data may be used, in which case the target three-dimensional data is the second three-dimensional data.
The image acquisition apparatus of the embodiments of the present application can be applied in a variety of scenarios, for example three-dimensional face recognition, image background blurring, and motion-sensing games. Three-dimensional face recognition can be used in scenarios or devices involving three-dimensional data, such as mobile phones, door locks, access control systems, and payment devices.
An embodiment of the present application further provides an electronic device that includes any one of the image acquisition apparatuses described above. The electronic device may be, for example, a mobile phone or a computer.
It can be understood that the structures in the drawings of the present application are only schematic and do not represent actual sizes or proportions; this does not limit the embodiments of the present application.
It should be noted that the terms used in the embodiments of the present application and the appended claims are for the purpose of describing specific embodiments only and are not intended to limit the embodiments of the present application.
For example, the singular forms "a", "said", "the above", and "the" used in the embodiments of the present application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise.
Those skilled in the art may appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled artisans may implement the described functions differently for each particular application, but such implementations should not be considered beyond the scope of the embodiments of the present application.
If implemented in the form of a software functional unit and sold or used as an independent product, the functions may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application in essence, or the part contributing to the prior art, or a part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the devices, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed electronic device, apparatus, and method may be implemented in other ways.
For example, the division into units, modules, or components in the apparatus embodiments described above is only a division by logical function; in actual implementation there may be other ways of division. For example, multiple units, modules, or components may be combined or integrated into another system, or some units, modules, or components may be omitted or not executed.
As another example, the units, modules, or components described above as separate parts may or may not be physically separate; that is, they may be located in one place or distributed over multiple network units. Some or all of them may be selected according to actual needs to achieve the objectives of the embodiments of the present application.
Finally, it should be noted that the couplings, direct couplings, or communication connections shown or discussed above may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The above is only a specific implementation of the embodiments of the present application, but the protection scope of the embodiments of the present application is not limited thereto. Any change or substitution readily conceivable by a person skilled in the art within the technical scope disclosed in the embodiments of the present application shall fall within the protection scope of the embodiments of the present application. Therefore, the protection scope of the embodiments of the present application shall be subject to the protection scope of the claims.

Claims (22)

  1. An image acquisition apparatus, characterized by comprising:
    a microlens array comprising a plurality of microlenses;
    a plurality of pixel units in one-to-one correspondence with the plurality of microlenses, each pixel unit of the plurality of pixel units comprising a first pixel and a second pixel, each microlens being configured to converge a light signal returned by an object onto a corresponding pixel unit such that the angles of the light signals received by the plurality of first pixels in the plurality of pixel units are the same and the angles of the light signals received by the plurality of second pixels are the same; and
    a processing circuit configured to obtain a first-view image from the outputs of the plurality of first pixels, obtain a second-view image from the outputs of the plurality of second pixels, obtain a time-of-flight (TOF) image of the object from the outputs of the plurality of pixel units, and generate target three-dimensional data of the object at least from the TOF image, the first-view image, and the second-view image.
  2. The image acquisition apparatus according to claim 1, wherein the angles of the light signals received by the first pixel and the second pixel are different.
  3. The image acquisition apparatus according to claim 1 or 2, wherein the first pixel and the second pixel in each pixel unit are arranged in a 1×2 matrix, or the first pixel and the second pixel in each pixel unit are arranged in a 2×1 matrix.
  4. The image acquisition apparatus according to any one of claims 1-3, wherein the processing circuit is configured to:
    generate first three-dimensional data of the object from the TOF image; generate second three-dimensional data of the object from the first-view image and the second-view image; and generate the target three-dimensional data from the first three-dimensional data and the second three-dimensional data.
  5. The image acquisition apparatus according to claim 1 or 2, wherein each pixel unit further comprises a third pixel and a fourth pixel, and each microlens is configured to converge the light signal returned by the object onto the corresponding pixel unit such that the angles of the light signals received by the plurality of third pixels in the plurality of pixel units are the same and the angles of the light signals received by the plurality of fourth pixels are the same.
  6. The image acquisition apparatus according to claim 5, wherein the first pixel, the second pixel, the third pixel, and the fourth pixel in each pixel unit are arranged in a 2×2 matrix.
  7. The image acquisition apparatus according to claim 5 or 6, wherein the angles of the light signals received by the third pixel and the fourth pixel are different.
  8. The image acquisition apparatus according to any one of claims 5-7, wherein the processing circuit is configured to obtain a third-view image from the outputs of the plurality of third pixels, obtain a fourth-view image from the outputs of the plurality of fourth pixels, and generate the target three-dimensional data of the object from the TOF image, the first-view image, the second-view image, the third-view image, and the fourth-view image.
  9. The image acquisition apparatus according to claim 8, wherein the processing circuit is configured to generate first three-dimensional data of the object from the TOF image; generate second three-dimensional data of the object from the first-view image, the second-view image, the third-view image, and the fourth-view image; and generate the target three-dimensional data from the first three-dimensional data and the second three-dimensional data.
  10. The image acquisition apparatus according to claim 4 or 9, wherein the processing circuit is configured to fuse the first three-dimensional data and the second three-dimensional data to obtain the target three-dimensional data.
  11. The image acquisition apparatus according to claim 10, wherein the processing circuit is configured to fuse the first three-dimensional data and the second three-dimensional data by an iterative closest point (ICP) algorithm.
  12. The image acquisition apparatus according to any one of claims 1-11, wherein the plurality of pixel units are pulsed TOF sensing units, and the TOF image is generated from light signals sensed at least once by the plurality of pixel units.
  13. The image acquisition apparatus according to any one of claims 1-11, wherein the plurality of pixel units are phase-type TOF sensing units, and the TOF image is generated from light signals sensed at least three times by the plurality of pixel units.
  14. The image acquisition apparatus according to any one of claims 1-13, further comprising a main lens arranged above the microlens array and configured to image the object.
  15. The image acquisition apparatus according to claim 14, wherein the microlens array is arranged on the focal plane of the main lens.
  16. The image acquisition apparatus according to any one of claims 1-15, further comprising a light source configured to emit a light signal of a specified wavelength toward the object, the plurality of pixel units being configured to sense the light signal of the specified wavelength returned by the object.
  17. The image acquisition apparatus according to claim 16, wherein the light signal of the specified wavelength is an infrared light signal.
  18. The image acquisition apparatus according to claim 16 or 17, further comprising a control unit configured to keep the light source clock-synchronized with the plurality of pixel units.
  19. The image acquisition apparatus according to any one of claims 16-18, further comprising a filter structure arranged above the microlens array and configured to filter out light signals of wavelengths other than the specified wavelength.
  20. The image acquisition apparatus according to claim 19, wherein the filter structure is a filter film provided on the upper surface of the microlens array by coating.
  21. The image acquisition apparatus according to claim 19, wherein the filter structure is an optical filter.
  22. An image acquisition apparatus, characterized by comprising:
    a microlens array comprising a plurality of microlenses; and
    a plurality of pixel units in one-to-one correspondence with the plurality of microlenses, each pixel unit of the plurality of pixel units comprising a first pixel and a second pixel, each microlens being configured to converge a light signal returned by an object onto a corresponding pixel unit such that the angles of the light signals received by the plurality of first pixels in the plurality of pixel units are the same and the angles of the light signals received by the plurality of second pixels are the same;
    wherein the outputs of the plurality of first pixels are used to generate a first-view image, the outputs of the plurality of second pixels are used to generate a second-view image, and the outputs of the plurality of pixel units are used to generate a time-of-flight (TOF) image of the object; and the TOF image, the first-view image, and the second-view image are used to generate target three-dimensional data of the object.
PCT/CN2020/096727 2020-06-18 2020-06-18 Image acquisition apparatus WO2021253308A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/096727 WO2021253308A1 (en) 2020-06-18 2020-06-18 Image acquisition apparatus


Publications (1)

Publication Number Publication Date
WO2021253308A1

Family

ID=79268862



Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6373557B1 (en) * 1997-12-23 2002-04-16 Siemens Aktiengesellschaft Method and apparatus for picking up a three-dimensional range image
JP2003007994A (en) * 2001-06-27 2003-01-10 Konica Corp Solid-state image pickup element, stereoscopic camera apparatus, and range finder
CN104221369A (en) * 2012-03-28 2014-12-17 富士胶片株式会社 Image capture element and image capture device and image capture method employing same
CN105306921A (en) * 2014-06-18 2016-02-03 中兴通讯股份有限公司 Three-dimensional photo shooting method based on mobile terminal and mobile terminal
US20170339363A1 (en) * 2016-05-17 2017-11-23 Canon Kabushiki Kaisha Image capturing apparatus, image capturing method, and storage medium using compressive sensing
CN110729320A (en) * 2019-10-18 2020-01-24 深圳市光微科技有限公司 Pixel unit, TOF image sensor including the same, and imaging apparatus



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20941057

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20941057

Country of ref document: EP

Kind code of ref document: A1