CN111510700A - Image acquisition device - Google Patents

Image acquisition device

Info

Publication number
CN111510700A
Authority
CN
China
Prior art keywords
pixel
image
dimensional data
pixels
tof
Prior art date
Legal status
Pending
Application number
CN202010557645.0A
Other languages
Chinese (zh)
Inventor
吕萌
Current Assignee
Shenzhen Goodix Technology Co Ltd
Original Assignee
Shenzhen Goodix Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Goodix Technology Co Ltd filed Critical Shenzhen Goodix Technology Co Ltd
Priority to CN202010557645.0A
Publication of CN111510700A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 Image signal generators
    • H04N13/204 Image signal generators using stereoscopic image cameras
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00 Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/48 Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
    • G01S7/483 Details of pulse systems
    • G01S7/486 Receivers
    • G01S7/4865 Time delay measurement, e.g. time-of-flight measurement, time of arrival measurement or determining the exact position of a peak

Abstract

The embodiments of the present application disclose an image acquisition device that can improve the quality of acquired three-dimensional data. The image acquisition device includes: a microlens array including a plurality of microlenses; and a plurality of pixel units arranged below the microlens array and corresponding to the plurality of microlenses in a one-to-one manner, wherein one microlens is used for converging optical signals returned by an object onto the corresponding pixel unit, each pixel unit includes n pixels, the angles of the optical signals received by the pixels located at the same relative position in the plurality of pixel units are the same, and n is an integer greater than or equal to 2. The optical signals sensed by each of the plurality of pixel units are used for generating a TOF image of the object; the optical signals sensed by the pixels located at the same relative position in the plurality of pixel units are used for generating an image of one viewing angle of the object, so as to obtain images of at least two viewing angles; and the TOF image and the images of the at least two viewing angles are used for generating target three-dimensional data of the object.

Description

Image acquisition device
Technical Field
The embodiment of the application relates to the field of images, in particular to an image acquisition device.
Background
Depth measurement or three-dimensional data is increasingly being applied in consumer-grade electronics; typical technologies include binocular vision solutions, structured light solutions and time of flight (TOF) solutions. The TOF scheme is being adopted ever more widely thanks to advantages such as requiring no baseline, a compact structure, high speed and a simple algorithm, and it is already used in a growing number of smartphones. However, consumer TOF generally demands low price, small size and low power consumption, and under these constraints the three-dimensional data acquired by TOF is of poor quality, which clearly limits the application effect and user experience of TOF in three-dimensional face recognition, photo background blurring, motion-sensing games and the like.
Disclosure of Invention
The embodiment of the application provides an image acquisition device, which can improve the quality of acquired three-dimensional data.
In a first aspect, an image capturing apparatus is provided, including: a microlens array including a plurality of microlenses; and a plurality of pixel units arranged below the microlens array and corresponding to the plurality of microlenses in a one-to-one manner, each pixel unit including n pixels, wherein one microlens is used for converging an optical signal returned by an object onto the corresponding pixel unit, the angles of the optical signals received by the pixels located at the same relative position in the plurality of pixel units are the same, and n is an integer greater than or equal to 2; the optical signal sensed by each pixel unit in the plurality of pixel units is used for generating a time of flight (TOF) image of the object; the light signals sensed by the pixels located at the same relative position in the plurality of pixel units are used for generating an image of one viewing angle of the object, and the pixels located at at least two relative positions in each pixel unit are used for generating images of at least two viewing angles of the object; the TOF image and the images of the at least two viewing angles are used to generate target three-dimensional data of the object.
The structure in which one pixel unit comprises a plurality of pixels is similar in principle to a multi-camera arrangement. Because its baseline is very small, on the order of a few millimeters, it mainly measures three-dimensional data of nearby objects, and because its imaging mode is passive it is free of the multipath problem of the TOF imaging mode. Conventional TOF, which needs no baseline, can measure three-dimensional data at medium and long range, but the three-dimensional data it measures at close range has large errors. Combining the two measurement modes therefore yields three-dimensional data over a larger range: the data from the two modes complement each other, the multipath problem of TOF is alleviated, and three-dimensional data of better quality than either mode alone can be obtained. The scheme of the embodiments of the present application can therefore improve the quality of the obtained three-dimensional data.
In addition, compared with the structure of conventional TOF, the embodiments of the present application only change the correspondence between the microlens array and the pixels; no significant new components or materials are added, so there is no obvious increase in cost.
In one possible implementation, the one pixel unit includes 2 pixels or 4 pixels.
In a possible implementation manner, the plurality of pixel units are pulsed TOF sensing units, and the TOF image is generated according to the light signals sensed by the plurality of pixel units at least 1 time.
In a possible implementation manner, the plurality of pixel units are phase TOF sensing units, and the TOF image is generated according to the optical signals sensed by the plurality of pixel units at least 3 times.
In one possible implementation, pixels located at different positions in the same pixel unit receive different angles of light signals.
In one possible implementation, the n pixels in the one pixel unit are arranged in a matrix form.
In one possible implementation, the one pixel unit includes 2 pixels, and the 2 pixels are arranged in a matrix form of 1 × 2 or 2 × 1.
In one possible implementation, the one pixel unit includes 4 pixels, and the 4 pixels are arranged in a matrix form of 2 × 2.
In a possible implementation, the image capturing apparatus further comprises a main lens, and the main lens is arranged above the microlens array and used for imaging the object.
In one possible implementation, the microlens array is disposed at a focal plane of the main lens.
In a possible implementation manner, the image capturing apparatus further includes a light source, the light source is configured to emit a light signal of a specified wavelength toward the object, and the plurality of pixel units are configured to sense the light signal of the specified wavelength returned by the object.
In one possible implementation, the optical signal of the specified wavelength is an infrared optical signal.
In a possible implementation manner, the image capturing apparatus further comprises a control unit, and the control unit is used for keeping the light source in clock synchronization with the plurality of pixel units.
In a possible implementation manner, the image capturing apparatus further comprises a filter structure, and the filter structure is arranged above the microlens array and used for filtering out optical signals of non-specified wavelengths.
In a possible implementation manner, the filtering structure is a filter film, and the filter film is disposed on the upper surface of the microlens array in a film coating manner.
In one possible implementation, the filtering structure is a filter.
In one possible implementation, the image capturing apparatus further includes a processor configured to generate first three-dimensional data of the object from the TOF image; generate second three-dimensional data of the object from the images of the at least two viewing angles; and generate the target three-dimensional data from the first three-dimensional data and the second three-dimensional data.
In a possible implementation manner, the processor is configured to fuse the first three-dimensional data and the second three-dimensional data to obtain the target three-dimensional data.
In one possible implementation, the processor is configured to fuse the first three-dimensional data and the second three-dimensional data by an iterative closest point (ICP) algorithm.
Drawings
Fig. 1 is a schematic configuration diagram of a conventional TOF system.
Fig. 2 is a schematic structural diagram of an image capturing device provided in an embodiment of the present application.
Fig. 3 and 4 are schematic structural diagrams of a pixel unit provided in an embodiment of the present application.
Fig. 5 is a schematic diagram of propagation paths of optical signals reflected by different points on an object according to an embodiment of the present disclosure.
Fig. 6 is a schematic structural diagram of another image capturing device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
Depth measurements or three-dimensional data are increasingly being applied in consumer-grade electronics; typical technologies are binocular (multi-view) vision schemes, structured light schemes and time of flight (TOF) schemes.
The principle of binocular vision is similar to that of human eyes, images of an object are captured from different directions through two cameras respectively, the captured images are fused, the corresponding relation between features is established, and mapping points of physical points in the same space in different images are matched to obtain three-dimensional data of the object.
Binocular stereo vision obtains three-dimensional information by the triangulation principle, i.e., a triangle is formed between the image planes of the two cameras and the object to be measured. Knowing the positional relationship between the two cameras and the coordinates of the object in the left and right images, the three-dimensional size of the object within the common field of view of the two cameras and the three-dimensional coordinates of the feature points of the object in space can be obtained. A binocular vision system is therefore generally composed of two cameras.
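As an illustration of the triangulation relationship just described, the following sketch recovers depth from disparity for a rectified binocular pair; this is the generic textbook relation, not a formula stated in this application, and the function and variable names are illustrative only.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth from disparity for a rectified binocular pair: Z = f * B / d."""
    disparity = np.asarray(disparity_px, dtype=np.float64)
    depth = np.full(disparity.shape, np.nan)
    valid = disparity > 0
    # Similar triangles between the two image planes and the measured point.
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth
```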
Similarly, a multi-view vision system is generally composed of a plurality of cameras that capture images of an object from different directions and fuse the captured images to obtain three-dimensional data of the object.
In the structured light scheme, light with certain structural characteristics is projected onto the photographed object by an infrared laser, a dedicated infrared camera collects the reflected structured-light pattern, and depth information is calculated according to the triangulation principle.
The basic principle of TOF is to continuously emit light pulses (typically invisible light) toward the observed object, receive the light returning from the object with a sensor, and obtain the distance between the object and the camera by measuring the round-trip time of flight of the light pulses.
TOF methods can be generally classified into two types according to modulation methods: pulse modulation (pulsed modulation) and continuous wave modulation (continuous wave modulation).
The principle of the pulse modulation scheme is simple: the distance is measured directly from the time difference t between pulse emission and pulse reception. This may also be called the pulsed TOF method.
In practical applications, continuous wave modulation usually employs sine wave modulation or another periodic signal. Since the phase shift between the sine waves at the receiving end and the transmitting end is proportional to the distance between the object and the camera, the phase shift can be used to measure the distance; this may also be called the phase TOF method or indirect TOF method. The distance is calculated from the phase delay φ of the light from emission to reception, estimated over multiple frames. The specific calculation formula may be:

d = c · φ / (4π · f)

where d is the distance, c is the speed of light, f is the modulation frequency of the light source, and φ is the phase delay.
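The two distance conversions described above (time difference for pulsed TOF, phase delay for phase TOF) can be sketched as follows; this is a minimal illustration of the stated formulas, with names chosen here rather than taken from this application.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def pulsed_tof_distance(round_trip_time_s):
    # Pulsed TOF: the light travels to the object and back, so halve the path.
    return C * round_trip_time_s / 2.0

def phase_tof_distance(phase_delay_rad, modulation_freq_hz):
    # Phase TOF: d = c * phi / (4 * pi * f); the unambiguous range is c / (2 * f).
    return C * phase_delay_rad / (4.0 * math.pi * modulation_freq_hz)
```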
Current consumer TOF depth cameras mainly include: Microsoft's Kinect 2, MESA's SR4000, and the PMD Tech TOF depth camera used in Google's Project Tango, among others. These products have already been applied in somatosensory recognition, gesture recognition, environment modeling and the like; the most typical is Microsoft's Kinect 2.
A TOF depth camera can change its measurement range by adjusting the frequency of the emitted pulses. Unlike depth cameras based on the feature-matching principle, its measurement accuracy does not decrease as the measurement distance increases, and the measurement error is essentially constant over the entire measurement range; a TOF depth camera also has strong anti-interference capability. Therefore, in situations where the target is far away (for example, autonomous driving), the TOF depth camera has a significant advantage.
A TOF depth camera places high demands on the precision of time measurement; even with the highest-precision electronic components, millimeter-level accuracy is difficult to achieve. Therefore, in the near-range measurement field, especially within 1 m, the accuracy of a TOF depth camera lags noticeably behind that of other depth cameras, which limits its application in near-range high-accuracy scenarios such as face-recognition unlocking and face-recognition payment.
At present, the TOF scheme is being applied more and more widely thanks to its advantages of requiring no baseline, a compact structure, high speed and a simple algorithm, and it has already been adopted in smartphones from Huawei and other brands (for example, the Huawei P30 Pro and the Samsung Note 10+). However, consumer TOF generally requires low price, small size and low power consumption, which leads to poor quality of the three-dimensional data acquired by TOF; if the measured depth information of an object is inaccurate, the application effect of TOF in three-dimensional face recognition, photo background blurring, motion-sensing games and the like is clearly limited.
The structure of a conventional TOF image acquisition apparatus, which may include a main lens 120 and a pixel array 140, is shown in fig. 1. The main lens 120 is used for converging the optical signal returned from the object 110, and the pixel array 140 is used for sensing the optical signal passing through the main lens 120 to generate a TOF image of the object 110.
The main lens 120 may also be referred to as an imaging lens.
The pixels in the pixel array 140 in fig. 1 are TOF sensing units.
In order to improve the light sensing capability, a microlens array 130 may be covered on the pixel array 140, wherein one microlens corresponds to one pixel, i.e., the pixel 141 corresponds to the microlens 131, the pixel 142 corresponds to the microlens 132, and the pixel 143 corresponds to the microlens 133. One microlens can converge the light signal reflected by the object 110 to a corresponding pixel, and the pixel can receive more light energy through the converging action of the microlens.
Fig. 1 shows a schematic representation of the propagation path of an optical signal after reflection at a point 111 on an object 110. The optical signal reflected by the point 111 is converged by the main lens 120 and reaches the microlens 131, and the pixel 141 can receive the optical signal converged by the microlens 131.
Different pixels in the pixel array 140 can receive light signals reflected by different points on the object 110, so a TOF image of the object can be generated from the light signals sensed by the pixel array to obtain depth information of the object. For example, three-dimensional data of the object can be generated from the TOF image by a depth reconstruction algorithm.
The TOF image in the embodiments of the present application may represent depth information of an object, or the TOF image may represent both depth information and image information of an object.
The embodiments of the present application improve on this scheme to obtain an image acquisition device that can improve the quality of the acquired three-dimensional data.
As shown in fig. 2, the image pickup device may include a microlens array 230 and a plurality of pixel units 240. The microlens array 230 includes a plurality of microlenses. Each of the plurality of pixel units 240 may include n pixels, n is an integer greater than or equal to 2, and the plurality of pixel units 240 correspond to the plurality of microlenses in a one-to-one manner. One microlens is used to converge an optical signal returned by an object to a corresponding pixel unit, and the angles of the optical signals received by the pixels located at the same relative position in the plurality of pixel units 240 are the same.
The light signal sensed by each of the plurality of pixel units is used for generating a TOF image of the object; the light signals sensed by the pixels located at the same relative position in the plurality of pixel units are used for generating an image of one viewing angle of the object; the pixels at at least two relative positions in each pixel unit can be used for generating images of at least two viewing angles of the object; and the TOF image and the images of the at least two viewing angles are used for generating target three-dimensional data of the object.
The light signal sensed by one pixel unit can be used for generating depth information at the position of the pixel unit, and the light signals sensed by a plurality of pixel units can be respectively used for generating depth information at the plurality of pixel units, so that the depth information of the object is obtained, and a TOF image of the object is generated.
The pixels at the same relative positions in the embodiments of the present application may refer to a pixel a in one pixel unit and a pixel b corresponding to the pixel a in another pixel unit. For example, it is assumed that each pixel unit includes four pixels, and the relative positions of the four pixels in this pixel unit are upper left, lower left, upper right, and lower right, respectively. All the pixels positioned at the upper left in the four pixel units are referred to as pixels positioned at the same relative position, all the pixels positioned at the lower left in the four pixel units are referred to as pixels positioned at the same relative position, all the pixels positioned at the upper right in the four pixel units are referred to as pixels positioned at the same relative position, and all the pixels positioned at the lower right in the four pixel units are referred to as pixels positioned at the same relative position.
In addition, all pixels positioned at the upper left are used for generating an image of one viewing angle, all pixels positioned at the lower left are used for generating an image of one viewing angle, all pixels positioned at the upper right are used for generating an image of one viewing angle, all pixels positioned at the lower right are used for generating an image of one viewing angle, and the images of four viewing angles are used for generating target three-dimensional data.
The target three-dimensional data of the object may be generated using the images of all n viewing angles, or using the images of at least two of the n viewing angles.
For example, if one pixel unit includes 4 pixels, the embodiment of the present application may generate images of 4 viewing angles of the object to obtain the target three-dimensional data, or it may generate images of only 2 viewing angles to obtain the target three-dimensional data.
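To make the grouping by relative position concrete, the sketch below splits a raw sensor frame whose pixel units are 2 × 2 blocks into four single-view images; the array layout and names are assumptions made for illustration, not taken from this application.

```python
import numpy as np

def split_views(raw_frame, unit_rows=2, unit_cols=2):
    """Return one low-resolution image per relative position within a pixel unit.

    raw_frame is assumed to be laid out so that each unit_rows x unit_cols block
    of adjacent pixels belongs to one pixel unit (one microlens).
    """
    views = []
    for r in range(unit_rows):
        for c in range(unit_cols):
            # All pixels at the same relative position (r, c) across every unit.
            views.append(raw_frame[r::unit_rows, c::unit_cols])
    return np.stack(views)

# Example: an 8x8 frame with 2x2 pixel units yields 4 quarter-resolution views.
frame = np.arange(64, dtype=float).reshape(8, 8)
print(split_views(frame).shape)  # (4, 4, 4)
```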
The three-dimensional data in the embodiments of the present application refers to data capable of describing the shape and spatial position of an object in three-dimensional space. The three-dimensional data may include, but is not limited to, a point cloud, a depth image (range image), or a three-dimensional mesh.
In the embodiment of the present application, the one-to-one correspondence between the plurality of pixel units and the plurality of microlenses means that each pixel unit has a corresponding microlens, and an optical signal passing through one microlens can be received by n pixels in the corresponding pixel unit.
For example, the light signal passing through the microlens 231 may be received by the pixel 241a and the pixel 241b, the light signal passing through the microlens 232 may be received by the pixel 242a and the pixel 242b, and the light signal passing through the microlens 233 may be received by the pixel 243a and the pixel 243b. In this example, n = 2.
The number of pixels included in one pixel unit is not particularly limited in the embodiments of the present application. For example, one pixel unit may comprise 2 pixels or 4 pixels or even more pixels. Of course, a pixel unit may also include 3 pixels or other numbers of pixels.
Taking fig. 2 as an example, one pixel unit includes 2 pixels. The plurality of pixel units 240 includes a pixel unit 241, a pixel unit 242, and a pixel unit 243. The pixel unit 241 includes a pixel 241a and a pixel 241b, the pixel unit 242 includes a pixel 242a and a pixel 242b, and the pixel unit 243 includes a pixel 243a and a pixel 243b.
The arrangement of the n pixels in one pixel unit is not specifically limited in the embodiments of the present application. For example, when n is an even number, the n pixels may be arranged in a matrix.
For example, when one pixel unit includes 2 pixels, the 2 pixels are arranged in a matrix form of 1 × 2 or 2 × 1, and when one pixel unit includes 4 pixels, the 4 pixels are arranged in a matrix form of 2 × 2.
When each pixel unit comprises an even number of pixels, the arrangement is regular and saves space.
When one pixel unit includes 2 pixels, the 2 pixels may be arranged in an up-down structure as shown in the left diagram of fig. 3, or in a left-right structure as shown in the right diagram of fig. 3.
When one pixel unit includes 4 pixels, the 4 pixels may be arranged in a 2 × 2 matrix manner, as shown in fig. 4.
The above arrangement is merely an example, and the present application is not limited to the above arrangement, for example, when one pixel unit includes 2 pixels, the 2 pixels may be arranged diagonally, and the like.
In the embodiment of the present application, the shape of a single pixel is not particularly limited, and for example, the pixel may be a square, a circle, or the like.
Combining n pixels into one pixel unit changes the conventional TOF arrangement: a TOF image of the object is generated from the light signal sensed by each pixel unit rather than by each individual pixel.
For example, the pixel values of the n pixels within a pixel unit may be combined, i.e., summed, and a depth value at each pixel unit calculated from the summed pixel value to generate a TOF image of the object. Alternatively, the pixel values of the n pixels within a pixel unit may be averaged, and a depth value at each pixel unit calculated from the averaged pixel value to obtain a TOF image of the object. Alternatively, a depth value may be calculated for each pixel within a pixel unit and the depth values then averaged to obtain a TOF image of the object. Whichever calculation method is used, it is sufficient that the same method is applied to every pixel unit.
For phase TOF, the pixel value of one pixel may be used to represent an integrated value of the phase of the optical signal; for pulsed TOF, the pixel value of one pixel may be used to represent the time of flight of the light signal.
When the TOF mode is used, the pixel values of the pixel 241a and the pixel 241b may be averaged to obtain the pixel value of the pixel unit 241; the pixel values of the pixels 242a and 242b may be averaged to obtain the pixel value of the pixel unit 242; and the pixel values of the pixels 243a and 243b may be averaged to obtain the pixel value of the pixel unit 243. A TOF image of the object is then generated from the pixel values of the pixel unit 241, the pixel unit 242, and the pixel unit 243.
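The per-unit averaging described above can be sketched as follows; the 1 × 2 unit layout matches fig. 2, but the function and its parameters are illustrative assumptions.

```python
import numpy as np

def bin_pixel_units(raw_frame, unit_rows=1, unit_cols=2, mode="mean"):
    """Combine the pixels of each pixel unit into one TOF raw value per unit."""
    h, w = raw_frame.shape
    blocks = raw_frame.reshape(h // unit_rows, unit_rows, w // unit_cols, unit_cols)
    if mode == "mean":
        return blocks.mean(axis=(1, 3))  # average the n pixels of each unit
    return blocks.sum(axis=(1, 3))       # or sum them; either works if applied consistently
```

A 480 × 640 sensor with 1 × 2 units would then give a 480 × 320 TOF raw frame, from which depth is computed exactly as for a conventional TOF sensor.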
The number of pixels included in each pixel unit is the same, and their arrangement is also the same. In this way, the angles at which pixels located at the same relative position in the plurality of pixel units receive optical signals remain consistent, while pixels located at different relative positions within one pixel unit receive optical signals at different angles. Thus, the light signals sensed by the pixels located at the same relative position can generate an image of one viewing angle, and if one pixel unit includes n pixels, images of n viewing angles can be generated.
The pixels located at the same relative position in the embodiment of the present application mean that the positions of the pixels in the plurality of pixel units are the same. As shown in fig. 2, the pixel 241a, the pixel 242a, and the pixel 243a are pixels located at the same relative position, and the pixel 241b, the pixel 242b, and the pixel 243b are pixels located at the same relative position.
According to the imaging principle of the microlens, the pixel 241a can receive a light signal within the angle ②, the pixel 241b within the angle ①, the pixel 242a within the angle ④, the pixel 242b within the angle ③, the pixel 243a within the angle ⑥, and the pixel 243b within the angle ⑤. The angles ②, ④ and ⑥ are the same in size and direction, the angles ①, ③ and ⑤ are the same in size and direction, and the angles ②, ④ and ⑥ are different from the angles ①, ③ and ⑤.
The structure in which one pixel unit includes a plurality of pixels in the embodiments of the present application is similar in principle to a multi-camera arrangement. Because the baseline is small, about several millimeters, it mainly measures three-dimensional data of nearby objects, and because its imaging mode is passive it is not affected by the multipath effect present in the TOF mode. Conventional TOF, which needs no baseline, can measure three-dimensional data at medium and long range, but the three-dimensional data it measures at close range has large errors. Combining the two measurement modes therefore yields three-dimensional data over a larger range: the data from the two modes complement each other, the multipath effect of TOF is alleviated, and three-dimensional data of better quality than either mode alone can be obtained. The scheme of the embodiments of the present application can therefore improve the quality of the obtained three-dimensional data.
The multipath effect means that an optical signal emitted by the light source may be reflected multiple times in the target scene, so that an object may reflect not only the optical signal arriving directly from the light source but also optical signals from other, indirect paths. The reflected light from these multiple sources may interfere, causing errors in the data obtained in the TOF mode.
In addition, compared with the structure of conventional TOF, the embodiments of the present application only change the correspondence between the microlens array and the pixels; no significant new components or materials are added, so there is no obvious increase in cost.
A pixel in the embodiments of the present application is the smallest sensing unit for sensing a light signal.
The image capturing device in the embodiment of the present application may further include a main lens 220, where the main lens 220 is disposed above the micro lens array 230, and is used for converging the optical signal returned by the object. The microlens array 230 may serve to condense the light signal passing through the main lens 220 to the plurality of pixel units 240.
The microlens array 230 may be disposed on a focal plane of the main lens 220, that is, a distance between the microlens array 230 and the main lens 220 is a focal length of the main lens 220, which may simplify a process of generating three-dimensional data of an object. Of course, the position of the microlens array 230 may be offset from the image plane of the main lens 220, and thus, the offset needs to be corrected when three-dimensional data processing is performed.
The distance between the plurality of pixel units 240 and the microlens array 230 may be determined according to the focal length of a single microlens and the size of the pixel unit, for example, the distance between the plurality of pixel units 240 and the microlens array 230 may be greater than the focal length of one microlens.
Fig. 5 shows a schematic diagram of the propagation path of an optical signal reflected by a point 211 in an object, the distance between the point 211 and the main lens 220 being G. According to the design of the main lens 220, the optical signal reflected by the point 211 is received by one microlens 232 after passing through the main lens 220, and the optical signal passing through the microlens 232 is sensed by the pixel unit below the microlens 232, wherein a part of the optical signal is sensed by the pixel a, and another part of the optical signal is sensed by the pixel b.
In fig. 5, white pixels in the plurality of pixel units 240 are used to generate an image of one viewing angle of the object, and shaded pixels are used to generate an image of another viewing angle of the object.
As shown in fig. 6, the image capturing device in the embodiment of the present application may further include a light source 520, where the light source 520 may be configured to emit a light signal with a specific wavelength to the object 510, and the plurality of pixel units 540 may be configured to sense the light signal returned by the object 510.
The light source may be a vertical-cavity surface-emitting laser (VCSEL) or a light-emitting diode (LED).
The optical signal of the specified wavelength may be an invisible light signal; since invisible light is not easily noticed by the user's eyes, the user experience is improved. Of course, in some scenarios with less demanding requirements, a visible light signal may also be used as the light source, which is not specifically limited in the embodiments of the present application.
Preferably, the optical signal of the specified wavelength may be an infrared optical signal.
The image capturing device in the embodiment of the present application may further include a control unit, and the control unit may be configured to control the light source 520 and the plurality of pixel units 540 to be time-synchronized, so that a time difference or a phase difference between the emitted light signal and the received light signal may be accurately calculated.
Optionally, the image capturing apparatus may further include a filtering structure 530 disposed above the microlens array 560 and configured to filter out optical signals of non-specified wavelengths, so that only optical signals of the specified wavelength are received by the plurality of pixel units 540. This reduces the interference of ambient light with the TOF imaging process and improves the signal-to-noise ratio of the optical signals collected by the TOF pixel units.
Of course, the filtering structure 530 may be disposed between the microlens array 560 and the plurality of pixel units 540 as long as it is ensured that only optical signals of a specified wavelength can reach the plurality of pixel units 540.
The filtering structure 530 may be a filter film disposed on the upper surface of the microlens array by coating. Alternatively, the filtering structure 530 may be an optical filter element disposed over the microlens array.
The image capturing device in the embodiment of the present application may be disposed on a base 550, and the base 550 is used for fixing the image capturing device.
The plurality of pixel units 540 in the embodiment of the present application may be TOF sensing units. The TOF induction unit can be a pulse TOF induction unit and also can be a phase TOF induction unit.
If the plurality of pixel cells are pulsed TOF sensing cells, the TOF image may be generated from the light signals sensed by the plurality of pixel cells at least 1 time. The pulse type TOF induction unit can obtain a TOF image only through at least one emission and collection process.
If the plurality of pixel cells are phase TOF sensing cells, the TOF image may be generated from the light signals sensed by the plurality of pixel cells at least 3 times. The phase type TOF sensing unit needs at least three emission acquisition processes to obtain a TOF image.
For example, consider a phase TOF sensing unit and assume that one pixel unit includes 2 pixels and that a TOF image is generated from 4 sensed acquisitions. The average of the optical signals sensed by the 2 pixels in one pixel unit is calculated for each acquisition, giving 4 average values denoted x1, x2, x3 and x4, from which the depth information of the object can be obtained through an arctangent formula.
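A hedged sketch of that arctangent step, using the standard four-phase (0°, 90°, 180°, 270°) demodulation; the sample ordering and sign convention are assumptions, since the application does not spell out the exact formula.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def phase_tof_depth(x1, x2, x3, x4, modulation_freq_hz):
    """Depth from four per-unit averaged samples, assumed taken at 0/90/180/270 degrees."""
    phase = np.arctan2(x4 - x2, x1 - x3)   # phase delay of the returned light
    phase = np.mod(phase, 2.0 * np.pi)     # wrap into [0, 2*pi)
    return C * phase / (4.0 * np.pi * modulation_freq_hz)
```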
Furthermore, the TOF image in the embodiments of the present application may include image information of an object in addition to depth information. That is, the pixel values of the plurality of pixel units in the embodiment of the present application may be used to generate image information of an object in addition to depth information. For example, the depth information and the image information may be obtained by different algorithms on the pixel values of the pixel units, respectively.
The image acquisition process of the embodiment of the present application is described below with reference to fig. 6.
As shown in fig. 6, the light source 520 emits an optical signal with a specific wavelength λ to the object 510, the optical signal is reflected by the object 510 and reaches the optical filter 530, and the optical filter can be used to block optical signals with other wavelengths and only transmit optical signals with the wavelength λ, so that the plurality of pixel units 540 only receive optical signals with the wavelength λ. Through one emission and collection process, a frame of image can be collected by the multiple pixel units, and each pixel in the multiple pixel units has a corresponding gray value.
In the case of pulsed TOF, the plurality of pixel units need to acquire at least one image to resolve the three-dimensional data. In the case of phase TOF, the plurality of pixel units need to acquire at least three images to resolve the three-dimensional data.
The TOF image and the images of at least two viewing angles may be derived from the same one or more frame images, or may be derived from different frame images.
Taking pulsed TOF as an example, the light source emits a light signal once, the plurality of pixel units receive the corresponding light signal once, and the TOF image and the images of the at least two viewing angles are both generated from the light signals received by the plurality of pixel units at that same time. In this mode, the TOF image and the images of the at least two viewing angles can be obtained through a single emission and acquisition process, which can increase the processing speed of the three-dimensional data.
For another example, the light source emits the light signal twice and the plurality of pixel units receive the corresponding light signal twice; the TOF image is generated from the light signal received the first time, and the images of the at least two viewing angles are generated from the light signal received the second time.
As described above, a pixel unit may include a plurality of pixels, and the pixels located at the same relative positions in each pixel unit are extracted to form an image of a viewing angle. Thus, if one pixel unit includes n pixels, an image of n viewing angles can be obtained.
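Putting the two previous sketches together, one raw frame captured in a single emission and acquisition pass could yield both the TOF raw frame and the per-view images roughly as follows (again an illustrative assumption rather than the application's own implementation):

```python
import numpy as np

def process_frame(raw_frame, unit_rows=1, unit_cols=2):
    """From one raw capture, derive the TOF raw frame and the per-view images."""
    h, w = raw_frame.shape
    # TOF raw frame: average the pixels of every pixel unit (one value per microlens).
    tof_raw = raw_frame.reshape(h // unit_rows, unit_rows,
                                w // unit_cols, unit_cols).mean(axis=(1, 3))
    # View images: gather the pixels at each relative position across all units.
    views = np.stack([raw_frame[r::unit_rows, c::unit_cols]
                      for r in range(unit_rows) for c in range(unit_cols)])
    return tof_raw, views
```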
Optionally, the image capturing device may further comprise a processor operable to generate the target three-dimensional data of the object from the pixel values of the plurality of pixel units. In particular, the processor may be operative to generate a TOF image of the object from the light signal sensed by each of the plurality of pixel units; generate an image of one viewing angle of the object from the optical signals sensed by the pixels located at the same relative position in the plurality of pixel units, so as to obtain images of at least two viewing angles of the object; and generate the target three-dimensional data of the object from the TOF image and the images of the at least two viewing angles.
The embodiment of the present application does not specifically limit the way in which the processor generates the target three-dimensional data. For example, the processor may generate the target three-dimensional data of the object directly from the TOF image and the images of the at least two viewing angles through a preset algorithm. For another example, the processor may generate first three-dimensional data of the object from the TOF image and second three-dimensional data of the object from the images of the at least two perspectives, and then generate the target three-dimensional data from the first three-dimensional data and the second three-dimensional data.
The first three-dimensional data may be generated according to a TOF algorithm; the second three-dimensional data may be generated according to a binocular (multi-view) vision algorithm, a light-field-camera-type algorithm, or a machine learning algorithm.
The processor can fuse the first three-dimensional data and the second three-dimensional data to obtain target three-dimensional data.
The present application does not specifically limit the fusion method. For example, the processor may fuse the first three-dimensional data and the second three-dimensional data by an iterative closest point (ICP) algorithm.
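For reference, a minimal point-to-point ICP sketch using SciPy's KD-tree; this is the generic textbook formulation of ICP, not the specific fusion pipeline of this application, and all names are chosen for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, iterations=30, tol=1e-6):
    """Rigidly align a source point cloud (N, 3) to a target point cloud (M, 3)."""
    src = source.copy()
    tree = cKDTree(target)
    R_total, t_total = np.eye(3), np.zeros(3)
    prev_err = np.inf
    for _ in range(iterations):
        dist, idx = tree.query(src)                 # nearest target point for each source point
        matched = target[idx]
        src_c, dst_c = src.mean(axis=0), matched.mean(axis=0)
        H = (src - src_c).T @ (matched - dst_c)     # cross-covariance of the centred point sets
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:                    # guard against a reflection
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = dst_c - R @ src_c
        src = src @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
        err = dist.mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_total, t_total, src
```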
The following describes a process of resolving three-dimensional data.
The plurality of pixel units 540 collect one image of the returned signal light each time the light source 520 actively emits a light signal. If pulsed TOF is used, at least one emission and acquisition process yields one grayscale image, which is the final grayscale image. If phase TOF is used, m (m ≥ 3) emission and acquisition processes are needed to obtain m grayscale images; these m grayscale images are averaged, and the resulting average grayscale image is the final grayscale image.
Multi-pixel resolution of three-dimensional data: the final grayscale image obtained in the previous step is resolved with a multi-view vision algorithm or a deep learning algorithm to obtain the first three-dimensional data. For example, the first three-dimensional data may be solved using an ICP algorithm and/or various evolutions of the ICP algorithm.
TOF resolution of three-dimensional data: from the data collected by the plurality of pixel units, the pixel values of the pixels within each pixel unit are averaged to obtain an average pixel value, which yields the raw data of a conventional TOF structure; the second three-dimensional data is then obtained with a TOF algorithm.
Three-dimensional data fusion: the first three-dimensional data resolved from the multiple pixels and the second three-dimensional data resolved by TOF are fused to obtain three-dimensional data of better quality.
Post-processing: spike (burr) points in the fused three-dimensional data are processed, holes in the data are filled, and the data is smoothed by filtering to obtain the optimized three-dimensional data.
A fusion method applicable to the embodiments of the present application is described below, but the embodiments of the present application are not limited thereto.
In the process of calculating the first three-dimensional data and the second three-dimensional data, corresponding confidence maps may be generated. The fused value at each point may then be chosen as follows (a code sketch follows this list):
(1) Two empirical thresholds are set, a first threshold and a second threshold. When the confidence map value corresponding to a data point in the first three-dimensional data is greater than the first threshold, the three-dimensional information at that point is considered highly reliable; otherwise it is considered unreliable. Similarly, comparing the confidence map values of the second three-dimensional data with the second threshold gives the reliability of each data point in the second three-dimensional data.
(2) When a data point in the first three-dimensional data is reliable and the corresponding data point in the second three-dimensional data is unreliable, the value of the first three-dimensional data point is taken as the value of the corresponding point of the target three-dimensional data.
(3) When a data point in the first three-dimensional data is unreliable and the corresponding data point in the second three-dimensional data is reliable, the value of the second three-dimensional data point is taken as the value of the corresponding point of the target three-dimensional data.
(4) When the data point in the first three-dimensional data and the corresponding data point in the second three-dimensional data are both reliable, the average of the two values is taken as the value of the corresponding point of the target three-dimensional data.
(5) When the data point in the first three-dimensional data and the corresponding data point in the second three-dimensional data are both unreliable, the average of the surrounding values of the corresponding point of the target three-dimensional data is taken as its value.
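The five cases above can be summarised in a short sketch; the threshold values, array names and the 5 × 5 neighbourhood used for case (5) are illustrative assumptions, and case (5) is simplified here to a local average of the two input maps.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fuse_depth(d1, conf1, d2, conf2, thr1, thr2):
    """Fuse two depth maps following the confidence rules (2)-(5) described above."""
    ok1, ok2 = conf1 > thr1, conf2 > thr2
    fused = np.where(ok1 & ~ok2, d1, 0.0)                 # (2) only the first map is reliable
    fused = np.where(~ok1 & ok2, d2, fused)               # (3) only the second map is reliable
    fused = np.where(ok1 & ok2, 0.5 * (d1 + d2), fused)   # (4) both reliable: take the average
    neighbourhood = uniform_filter(0.5 * (d1 + d2), size=5)
    fused = np.where(~ok1 & ~ok2, neighbourhood, fused)   # (5) neither reliable: local average
    return fused
```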
In the practical application process, the fused three-dimensional data can be directly used. Or the distances between the object and the plurality of pixel units may be determined first and then how to use the three-dimensional data may be determined. For example, if the distance between the object and the plurality of pixel units is greater than a preset threshold, the first three-dimensional data may be used, in which case, the target three-dimensional data is the first three-dimensional data; the second three-dimensional data may be used if the distance between the object and the plurality of pixel units is less than or equal to a preset threshold, in which case the target three-dimensional data is the second three-dimensional data.
The image acquisition device of the embodiment of the application can be applied to various occasions, such as three-dimensional face recognition, image background blurring, body sensing games and other scenes. The three-dimensional face recognition can be applied to scenes or equipment such as mobile phones, door locks, access controls, payment equipment and the like related to three-dimensional data.
An embodiment of the present application further provides an electronic device, which includes any one of the image capturing apparatuses described above. The electronic device may be, for example, a mobile phone, a computer, or the like.
It is to be understood that the structures shown in the drawings of the present application are merely schematic representations, not actual dimensions and proportions, and are not intended to limit the embodiments of the present application.
It is to be understood that the terminology used in the embodiments of the present application and the appended claims is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the present application.
For example, as used in the embodiments of this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
Those of skill in the art would appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the embodiments of the present application.
If the functions are implemented in the form of software functional units and sold or used as a standalone product, they may be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, devices and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed electronic device, apparatus and method may be implemented in other ways.
For example, the division of a unit or a module or a component in the above-described device embodiments is only one logical function division, and there may be other divisions in actual implementation, for example, a plurality of units or modules or components may be combined or may be integrated into another system, or some units or modules or components may be omitted, or not executed.
Also for example, the units/modules/components described above as separate components may or may not be physically separate; they may be located in one place or distributed over a plurality of network elements. Some or all of the units/modules/components may be selected according to actual needs to achieve the purposes of the embodiments of the present application.
Finally, it should be noted that the above shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The above description is only a specific implementation of the embodiments of the present application, but the scope of the embodiments of the present application is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the embodiments of the present application, and all the changes or substitutions should be covered by the scope of the embodiments of the present application. Therefore, the protection scope of the embodiments of the present application shall be subject to the protection scope of the claims.

Claims (19)

1. An image acquisition apparatus, comprising:
a microlens array including a plurality of microlenses;
a plurality of pixel units, each of the pixel units including n pixels, the pixel units being disposed below the microlens array, the pixel units corresponding to the microlenses in a one-to-one manner, wherein one microlens is used to converge an optical signal returned from an object to a corresponding pixel unit, the pixels located at the same relative position in the pixel units receive the same angle of the optical signal, and n is an integer greater than or equal to 2,
wherein the light signal sensed by each pixel cell of the plurality of pixel cells is used to generate a time of flight (TOF) image of the object; the light signals sensed by the pixels located at the same relative position in the plurality of pixel units are used for generating an image of one visual angle of the object, and the pixels located at least two relative positions in each pixel unit are used for generating images of at least two visual angles of the object; the TOF image and the images of the at least two view angles are used to generate target three-dimensional data of the object.
2. The image capturing device according to claim 1, wherein the one pixel unit includes 2 pixels or 4 pixels.
3. The image acquisition device according to claim 1, wherein the plurality of pixel units are pulsed TOF sensing units, and the TOF image is generated according to the light signals sensed by the plurality of pixel units at least 1 time.
4. The image acquisition device according to claim 1, wherein the plurality of pixel units are phase TOF sensing units, and the TOF image is generated according to the light signals sensed by the plurality of pixel units at least 3 times.
5. The image capturing device of any of claims 1-4, wherein pixels at different positions in a same pixel unit receive different angles of light signals.
6. The image capturing device according to any one of claims 1 to 4, wherein n pixels in the one pixel unit are arranged in a matrix form, and n is an even number.
7. The image pickup device according to claim 6, wherein said one pixel unit includes 2 pixels, and said 2 pixels are arranged in a matrix form of 1 × 2 or 2 × 1.
8. The image pickup device according to claim 6, wherein said one pixel unit includes 4 pixels, and said 4 pixels are arranged in a matrix form of 2 × 2.
9. The image capture device of any of claims 1-4, further comprising a main lens disposed over the microlens array for imaging the object.
10. The image capturing device of claim 9, wherein the microlens array is disposed at a focal plane of the main lens.
11. The image acquisition device according to any one of claims 1-4, further comprising a light source for emitting light signals of a specified wavelength toward the object, wherein the plurality of pixel units are configured to sense the light signals of the specified wavelength returned by the object.
12. The image capturing device as claimed in claim 11, wherein the optical signal of the specified wavelength is an infrared optical signal.
13. The image capturing device of claim 11, further comprising a control unit for controlling the light source to be clock synchronized with the plurality of pixel cells.
14. The image capture device of claim 11, further comprising a filter structure disposed over the microlens array for filtering optical signals of unspecified wavelengths.
15. The image capturing device as claimed in claim 14, wherein the filter structure is a filter film, and the filter film is disposed on the upper surface of the microlens array by coating.
16. The image capturing device of claim 14, wherein the filter structure is a filter.
17. The image acquisition device of any one of claims 1-4, further comprising a processor for generating first three-dimensional data of the object from the TOF image; generating second three-dimensional data of the object according to the images of the at least two visual angles; and generating the target three-dimensional data according to the first three-dimensional data and the second three-dimensional data.
18. The image capturing device as claimed in claim 17, wherein the processor is configured to fuse the first three-dimensional data and the second three-dimensional data to obtain the target three-dimensional data.
19. The image acquisition device of claim 18, wherein the processor is configured to fuse the first three-dimensional data and the second three-dimensional data by an Iterative Closest Point (ICP) algorithm.
CN202010557645.0A 2020-06-18 2020-06-18 Image acquisition device Pending CN111510700A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010557645.0A CN111510700A (en) 2020-06-18 2020-06-18 Image acquisition device

Publications (1)

Publication Number Publication Date
CN111510700A true CN111510700A (en) 2020-08-07

Family

ID=71878800

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010557645.0A Pending CN111510700A (en) 2020-06-18 2020-06-18 Image acquisition device

Country Status (1)

Country Link
CN (1) CN111510700A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101656835A (en) * 2008-08-21 2010-02-24 索尼株式会社 Image pickup apparatus, display and image processing apparatus
US20160191768A1 (en) * 2013-07-31 2016-06-30 Samsung Electronics Co., Ltd. Light field image capturing apparatus including shifted microlens array
CN105115445A (en) * 2015-09-14 2015-12-02 杭州光珀智能科技有限公司 Three-dimensional imaging system and imaging method based on combination of depth camera and binocular vision
CN106303175A (en) * 2016-08-17 2017-01-04 李思嘉 A kind of virtual reality three dimensional data collection method based on single light-field camera multiple perspective
CN110970452A (en) * 2018-10-01 2020-04-07 三星电子株式会社 Three-dimensional image sensor, depth correction method, and three-dimensional image generation method
CN109343070A (en) * 2018-11-21 2019-02-15 深圳奥比中光科技有限公司 Time flight depth camera
CN110012280A (en) * 2019-03-22 2019-07-12 盎锐(上海)信息科技有限公司 TOF mould group and VSLAM calculation method for VSLAM system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113671606A (en) * 2021-08-02 2021-11-19 维沃移动通信有限公司 Super lens, camera module and electronic equipment that constitute
CN116679461A (en) * 2022-09-29 2023-09-01 华为技术有限公司 Image sensor, imaging device and method
CN116679461B (en) * 2022-09-29 2024-01-05 华为技术有限公司 Image sensor, imaging device and method

Legal Events

Code  Description
PB01  Publication
SE01  Entry into force of request for substantive examination
RJ01  Rejection of invention patent application after publication (application publication date: 20200807)