WO2022265216A1 - Depth measurement error correction method for ToF camera
- Publication number: WO2022265216A1 (application PCT/KR2022/005923)
- Authority: WIPO (PCT)
- Prior art keywords: psf, pixel, depth map, generating, tof camera
Classifications
- G01S 7/4914 — Circuits for detection, sampling, integration or read-out of detector arrays, e.g. charge-transfer gates
- G01S 7/493 — Extracting wanted echo signals
- G01S 7/497 — Means for monitoring or calibrating
- G01S 17/36 — Systems determining position data of a target for measuring distance only, using transmission of continuous waves, with phase comparison between the received signal and the contemporaneously transmitted signal
- G01S 17/46 — Indirect determination of position data
- G01S 17/894 — 3D imaging with simultaneous measurement of time-of-flight at a 2D array of receiver pixels, e.g. time-of-flight cameras or flash lidar
Definitions
- PSF Point Spread Function
- step S13 - PSF model optimization using machine learning
- The scattering signal is specified as Scat_{i,j}.
- Parameters of the PSF kernel should be chosen so that the compensated signal S'_{i,j} is as close as possible to the real (ground-truth) signal.
- The PSF is defined by A(r) and G, where A(r) is specified by a third-degree polynomial and G is a 3×3 array. Thus, the coefficients of the polynomial A(r) and the weights of the G array are the unknown variables to be found.
- This embodiment can improve error compensation quality and system robustness to noise.
Abstract
The present invention relates to image capturing and processing and, more specifically, to correction of depth measurement errors of a three-dimensional (3D) ToF camera. The invention provides a simple and low-cost solution for generating a depth map, which increases processing speed and improves the quality of ToF camera distance measurement error compensation. The method for generating a depth map with distance measurement error correction for a ToF camera comprises: preliminarily estimating a calculated Point Spread Function (PSF) distribution model for the ToF camera; capturing a scene image with the ToF camera; compensating for ghost signals in the captured scene image data using said PSF; and generating a depth map using the corrected data.
Description
The present invention relates to image capturing and processing and, more specifically, to depth measurement error correction in a three-dimensional (3D) ToF camera.
A Time-of-Flight (ToF) sensor is a sensor that emits light and, using the signal scattered (reflected) from an object, determines the distance to that object. Knowing the times of emission and subsequent reception of the reflected light, the exact distance to the object can be calculated from the speed of light, in a manner similar to a laser rangefinder.
Such sensors are becoming more and more popular in various fields, including application in mobile electronic devices (for example, for biometric identification of user by facial image and gesture recognition), in autonomous and semi-autonomous vehicles, robotics (when generating environment image to enable robot activity), photo and video cameras, virtual/augmented reality (VR/AR) applications, etc.
A 3D ToF camera (a range imaging camera that calculates the distance to specific points of the scene using "time of flight" (ToF) measurement techniques), based on the continuous wave (CW) modulation method, uses the principle of phase shift between transmitted and received light to determine the distance between the object and the camera. The camera has a light source that emits modulated light s(t); the light reflects from the object, and then a signal r(t) is received on the sensor plane. The emitted modulated signal s(t) and the reflected signal r(t) received with a delay are specified as:

s(t) = a·cos(2πf·t)

r(t) = A·cos(2πf·t − φ) + B

where f - modulation frequency, φ - phase shift corresponding to the delay, a - emitted signal amplitude, A - received signal amplitude, and B - offset representing an additional level of intensity.
The ToF camera calculates the cross-correlation of s(t) and r(t) as:

c(x) = lim_{T→∞} (1/T) ∫₀^T s(t)·r(t + x) dt = (a·A/2)·cos(2πf·x − φ)

where c(x) - cross-correlation function for s(t) and r(t), s(t) - emitted signal, and r(t) - signal scattered by the object and received by the camera.
In order to recover the phase and amplitude of the received signal, the four-bucket method is typically used. The basic principle of this method is to obtain four equidistant samples of c(x) at locations x0, x1, x2, x3, separated from each other by 90 degrees within a modulation cycle:

C_k = c(x_k), k = 0, 1, 2, 3

The phase shift φ between s(t) and r(t) can be calculated from the raw phases as:

φ = arctan((C1 − C3) / (C0 − C2))

The distance to the object for each (i,j) pixel is calculated as:

d_{i,j} = c·φ_{i,j} / (4π·f)

where c - speed of light.
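The phase-recovery arithmetic above can be sketched as follows. This is an illustrative sketch only: the function name, the default 20 MHz modulation frequency, and the sample convention C_k = c(x_k) are assumptions, and arctangent sign conventions vary between sensors.

```python
import math

def depth_from_buckets(c0, c1, c2, c3, f_mod=20e6, c_light=299_792_458.0):
    """Recover phase, amplitude and distance from four equidistant
    correlation samples taken 90 degrees apart within a modulation cycle."""
    # Phase shift between emitted and received signal, wrapped to [0, 2*pi).
    phi = math.atan2(c1 - c3, c0 - c2) % (2 * math.pi)
    # Amplitude of the modulated component (a*A/2 in the notation above);
    # the constant offset B cancels in the differences.
    amp = 0.5 * math.hypot(c1 - c3, c0 - c2)
    # The light travels the camera-object path twice, hence 4*pi rather than 2*pi.
    d = c_light * phi / (4 * math.pi * f_mod)
    return phi, amp, d
```

With a 20 MHz modulation frequency the unambiguous range is c/(2f) ≈ 7.5 m, reached when the recovered phase wraps at 2π.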
Currently, low-cost configurations of ToF systems targeting the mass market suffer from multipath interference (MPI) problems, including, inter alia, the appearance of ghost (i.e. double or echo) images, referred to as ghosting. Ghost images are scattering artifacts in the image, resulting from multiple internal reflections occurring inside the optical system. These artifacts, which are indicative of depth measurement errors, are highly undesirable in ToF camera applications and largely limit their use.
In situations where a wide range of depths is imaged, the strong signal scattered from close objects competes with the weak signal from far objects, causing errors in the depth measurements.
Antireflection coating applied to optical system elements significantly decreases amount of scattered light, but does not solve the ghosting problem completely.
Even if ghosting error is partially compensated for, remaining depth map artifacts still may limit application of ToF cameras. Therefore, it is important to provide high-quality compensation for distance measurement errors.
Most solutions use only experimental data for ghosting error analysis and compensation. Experimental data is substantially affected by noise, and since the ghost signal is several orders of magnitude weaker than the signal from the object, the ghost signal is almost indistinguishable and the resulting data is unreliable.
Real-time compensation solution is necessary for widespread and efficient use of ToF cameras. Excessive complexity of compensation method can make real-time processing infeasible.
Ghosting problem can be solved by additional modification of camera's hardware (for example, coating the uppermost surface of the receiver's array structure with black ink, leaving sensitive areas unpainted), but this solution is cumbersome and significantly increases ToF camera cost.
Development of a new solution employing traditional image processing approach with comprehensive analysis of camera scatter phenomena and effective ghost image distribution model is essential for provision of efficient real-time ghosting compensation.
US 9,760,837 B1 discloses multi-path interference compensation using a trained machine learning component. Each training data pair comprises at least one simulated raw time-of-flight sensor data value and a corresponding simulated ground-truth depth value. The simulated raw time-of-flight sensor data frames are calculated using a modified computer graphics renderer, which uses ray tracing to render an image from a model of a 3D object or environment. The method is highly sophisticated and does not demonstrate the required ghosting compensation quality.
US 2012/0008128 A1 discloses both a hardware approach (coating the uppermost surface of the array structure with black ink, leaving the sensitive regions of the array unpainted) and a software approach (stray light reflection error compensation using correction values). The correction values are generated using an experimental setup.
US 8,964,028 B2 discloses a method of stray light compensation, comprising determining a distance to a reference target, comparing it to a known value to determine a stray light correction value, determining a distance to an object of interest using stray light correction.
US 8,554,009 B2 describes stray light correction in a conventional imaging instrument (not a ToF camera). The method comprises determining a set of point spread functions experimentally, deriving a stray-light distribution function, obtaining a stray-light distribution array, deriving a stray-light correction array, and correcting stray light errors in an image from the imaging instrument.
However, the above solutions suffer from low distance error compensation quality.
Therefore, there is a need in the prior art for a simple and inexpensive solution exhibiting high speed (real-time operation) and high quality of ToF camera distance measurement error compensation.
The present invention is aimed at solution of at least some of the above problems.
In accordance with the invention, there is provided a method for generating a depth map with distance measurement error correction for ToF camera, comprising: preliminary estimating calculated Point Spread Function (PSF) distribution model for ToF camera; capturing a scene image with the ToF camera; compensating for ghost signals in captured scene image data using said PSF; generating a depth map using corrected data, wherein said preliminary estimating of calculated PSF distribution model includes the steps of: initializing optical model of optical system of the camera; simulating light scattering in the optical system on the basis of said initialized optical system model; calculating PSF of the optical system; approximating PSF with a polynomial function of radius, where radius is distance from given pixel to image center.
In an embodiment of the method, said calculating of the PSF of the optical system is performed according to the following equation:

[Equation 1]

PSF_{i,j} = Scat_{i,j} / I_{i,j}

where I_{i,j} - simulated input signal measure that has a non-zero value only at the position of the (i,j) pixel, Scat_{i,j} - level (intensity) of the scattering signal over the camera sensor, represented as a two-dimensional function (array), for an input signal arriving at the (i,j) pixel, and PSF_{i,j} - PSF kernel for an input signal arriving at the (i,j) pixel.
According to another embodiment of the method, the approximating step comprises approximating with PSF(r) function, each PSF(r) kernel depending on distance from given pixel to image center.
According to another embodiment of the method, PSF(r) is specified by the equation:

[Equation 2]

PSF(r) = A(r)·G

where A(r) is an amplification parameter depending on the distance from the given pixel to the image center, and G is a 3×3 kernel, which approximates local light aberrations in the pixel neighborhood.
Compensating for ghost signals in the captured scene data is performed according to the expression:

S'_{i,j} = S_{i,j} − Scat_{i,j}

where S'_{i,j} - level of the compensated (reconstructed) signal at the (i,j) pixel, Scat_{i,j} - level of the scattering signal at the (i,j) pixel, and S_{i,j} - level of the measured signal at the (i,j) pixel of the sensor.
According to another embodiment of the method, said preliminary estimating of calculated PSF distribution model further comprises applying machine learning optimization on the obtained PSF based on datasets including ghost-affected image/ghost-free image pairs.
According to another embodiment of the method, in optical model of the ToF camera optical system a point object is simulated as a point light source.
According to another embodiment, the method further comprises, before generating a depth map, iterative estimation of corrected data of the captured scene image.
Therefore, the present invention offers a simple and low-cost solution for generating a depth map with high speed (real-time operation) and high-quality compensation of ToF camera distance measurement errors.
The invention is further illustrated by the description of preferred embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a flowchart of a method for generating a depth map with ToF camera distance measurement error correction in accordance with the present invention;
Fig. 2 is a flowchart illustrating in more detail the step of preliminary estimating calculated PSF distribution model of a method in accordance with the present invention;
Fig. 3 is a flowchart illustrating in more detail the step of analyzing the optical system of a method in accordance with the present invention;
Fig. 4 is a simplified ToF camera model;
Fig. 5 is an example of simulating a ghost signal of a star-shaped object;
Fig. 6 is a flowchart illustrating in more detail the step of preliminary estimating calculated PSF distribution model (step S1) of a method according to an alternative embodiment of the present invention.
The present invention combines existing error compensation approaches with novel approach of light scattering simulation in an optical system, which can be described using PSF (Point Spread Function), in order to significantly improve performance of ToF camera. Point Spread Function describes the pattern produced by imaging system when a point light source or point object is observed.
To obtain accurate PSF distribution model of ToF camera optical system, optical simulation is carried out.
Although hereinafter the method in accordance with the present invention is described in relation to ToF camera and distance error correction when generating a depth map of a captured scene, the present invention can similarly find application in cameras based on RGB (red, green, blue), ARGB (alpha, red, green, blue) or any other light sensor. In that case, the present invention accordingly allows ghosting compensation when color image of a captured scene is generated.
A method for generating a depth map with distance measurement error correction for ToF camera (see Fig.1) according to the present invention comprises:
- preliminary estimating calculated PSF distribution model for the ToF camera (step S1);
- capturing a scene image with the ToF camera (step S2);
- compensating for ghost signals in captured scene image data using said PSF (step S3);
- generating a depth map using the corrected data (step S4).
As a result of implementing the method in accordance with the present invention, a depth map with compensated ToF camera distance measurement errors is generated.
Further, the steps of the method for generating a depth map with ToF camera distance measurement error correction shown in Fig. 1 will be described in detail with reference to Figs. 1 to 6.
Further, step S1 of preliminary estimating calculated PSF distribution model includes the following steps (see Fig. 2):
Step S10 - Analyzing the optical system;
Step S11 - Approximating the PSF distribution model.
In step S10, the optical system is analyzed for PSF optical simulation through the following steps (see Fig. 3):
- Optical model initializing (step S101).
Using known optical system parameters, an optical system model is created in ray-tracing-based simulation software such as Synopsys LightTools, ZEMAX OpticStudio, etc. To do this, at least the following optical system parameters need to be known: the design parameters of the optical system (radii, thicknesses, component materials), information about the coatings of the optical elements, the sensor reflection plane, and the mount parameters.
- Ray-tracing based simulation of scattering inside the optical system (step S102).
For the optical system model initialized in step S101, scattering inside the optical system is simulated based on ray tracing. Fig. 4 shows a simplified ToF camera model for PSF calculation, which includes an infrared (IR) light source (1), a "point" object (2), an optical system (3), and a camera receiver (4). The IR light source (1) emits an infrared light beam. Part of the emitted rays reach the "point" object (2) and scatter (reflect) from it. Part of the scattered (reflected) rays reach the receiving optical system (3). Most of these rays propagate through the optical system (3) and form an object image on the receiver (4). However, even if all optical system surfaces have antireflection coatings, a small amount of light still reflects from the surfaces of the optical components. This light can be absorbed by the optical system mounts, or it can form a ghost signal on the receiver (4).
- Calculating PSF for each pixel (step S103).
Simulation results are used to form PSF kernels for each pixel, which describe response of the optical system to the point object. To calculate each pixel PSF, light scattering inside the optical system is simulated.
A ghost signal generated by the point object propagates over the entire sensor, which means that the PSF kernel for each pixel is of image size. The PSF kernel may be described as an array representing a sequence of pairs {weight, pixel}:

PSF_{i,j} = { (w_{l,p}, (i+l, j+p)) }

where w_{l,p} is the weight defining the light intensity at pixel (i+l, j+p) relative to the (i,j) pixel, which receives the input signal, and l and p are varied so as to cover all pixels of the receiver array (except for the (i,j) pixel, at which the received light is measured).
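The pair-sequence representation above can be sketched as follows. This is a toy illustration: the function name, offsets, and weights are invented for the sketch, not taken from the patent.

```python
def apply_psf_pairs(signal, i, j, pairs):
    """Distribute the input signal at pixel (i, j) over the sensor
    according to a kernel given as a sequence of (weight, (di, dj))
    offset pairs, i.e. the {weight, pixel} array described above."""
    h, w = len(signal), len(signal[0])
    scattered = [[0.0] * w for _ in range(h)]
    for weight, (di, dj) in pairs:
        y, x = i + di, j + dj
        if 0 <= y < h and 0 <= x < w:        # ignore offsets outside the array
            scattered[y][x] += weight * signal[i][j]
    return scattered
```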
In the conventional approach, PSF distribution model is determined using experimental data. Intensity of ghost signal is several orders of magnitude lower than that of signal from the object; therefore, in noise-affected experimental data, ghost signal is almost indistinguishable and received data is unreliable. As a consequence, in the conventional approach, only strong ghosting components can be detected and analyzed. Furthermore, precise experiment requires significant resources. These factors result in low accuracy of the PSF model created in the approach based on only experimental data.
In optical simulation, the PSF kernels are generated by the ToF camera optical model as the ratio of the scattering signal to the measured signal according to the equation:

[Equation 3]

PSF_{i,j} = Scat_{i,j} / I_{i,j}

where I_{i,j} - simulated measure that has a non-zero signal value only at the location of the (i,j) pixel, Scat_{i,j} - level (intensity) of the scattering signal over the camera sensor, represented as a two-dimensional function (array), for an input signal arriving at the (i,j) pixel, and PSF_{i,j} - PSF kernel for an input signal arriving at the (i,j) pixel.
This contrasts with the conventional model, in which the scattering signal is simulated by the PSF according to the equation:

[Equation 4]

Scat_{i,j} = PSF_{i,j} · S_{i,j}

where S_{i,j} - level of the measured signal at the (i,j) pixel of the sensor, which includes both the direct (useful) signal and the scattering signal.
A ghost signal produced by a point-type object is distributed over the entire sensor, meaning that the PSF kernel of each pixel is of image size. Thus, according to the conventional model, the PSF is represented by a position-variable kernel of image size. That is, for an image of 240×180 pixels, we get 240×180 = 43,200 PSF kernels, each of 240×180 size. Although such an approximation is accurate, it cannot be computed in real time, and with increasing image resolution the processing becomes correspondingly more complex.
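A quick back-of-the-envelope calculation makes the scale concrete; the sensor size is from the text, while float32 storage (4 bytes per entry) is an assumption of this sketch.

```python
# Full position-variable PSF model for a 240x180 sensor: one image-sized
# kernel per pixel, stored as float32 (4 bytes per entry).
w, h = 240, 180
n_kernels = w * h               # one kernel per pixel
entries = n_kernels * w * h     # each kernel is itself 240x180
gib = entries * 4 / 2**30       # total size in GiB
```

That is roughly 1.87 billion entries, on the order of 7 GiB, far beyond what a real-time pipeline on a low-power device can afford; the sparse model described further below reduces this to a handful of scalars (four polynomial coefficients plus nine 3×3 kernel weights).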
Optical simulation in accordance with the present invention can provide a more accurate PSF than one calculated from experimental data only. Optical simulation allows separation of the ghosting components to provide the most accurate mathematical model: all ghosting components can be detected and analyzed. There is no need to perform an experiment, so resources are significantly saved.
However, the use of such large size PSF is incompatible with real-time mode, so a sparse approximation is needed.
In step S11, the PSF is approximated as a polynomial function of radius, where radius is the distance from the (i,j) pixel to the image center.
PSF_{i,j} is a two-dimensional discrete function (kernel). In accordance with the present invention, PSF_{i,j} is approximated with a continuous one-dimensional function PSF(r), which is represented by a third-degree polynomial. The polynomial provides sufficient approximation of the kernel values.
Approximation is carried out as follows.
Let (i,j) be a pixel whose distance r to the image center is known. Denote by e the per-pixel approximation error e = |PSF_{i,j} − PSF(r)|, and denote the sum of all e over the sensor pixels as E. The polynomial coefficients are selected so as to minimize the total error E. The selection can be carried out, for example, by the bisection method or any other standard minimum-search method.
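As an illustration of the coefficient selection, an ordinary least-squares fit is one standard way to minimize the total error; the patent itself only requires some minimum-search method. Everything in this sketch is synthetic: the radii, the "true" coefficients, and the noise level are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
r = rng.uniform(0.0, 150.0, size=500)              # pixel radii (toy values)
a_true = np.array([2e-10, -3e-8, 1e-6, 1e-4])      # hypothetical A(r) coefficients
# Noisy "simulated" kernel amplitudes sampled at those radii.
amp = np.polyval(a_true, r) + rng.normal(0.0, 1e-6, r.size)
coeffs = np.polyfit(r, amp, deg=3)                 # third-degree polynomial fit
residual = np.abs(np.polyval(coeffs, r) - amp).mean()
```

The fit minimizes the sum of squared per-radius errors, which plays the role of the total error E over the sensor pixels.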
Thus, in accordance with the present invention, there is provided an accurate approximation of scattering by modeling a sparse representation of the position-variable PSF kernel, so that each PSF kernel is of 3×3 size with kernel weights depending only on the pixel distance from the image center, according to the following equation:

[Equation 5]

Scat_{i,j} = PSF(r) ⊗ S

where PSF(r) - position-variable kernel of 3×3 size, r - distance from the (i,j) pixel to the image center, S - measured signal at the (i,j) pixel of the sensor, and ⊗ - convolution.
Based on the analysis of signal distribution over the sensor, optimal mathematical formulation of PSF(r) was developed:
PSF(r) = A(r)·G

where A(r) is an amplification parameter and G is a 3×3 kernel.
To calculate scattering signal value at given pixel P, pixel P(r) symmetric with respect to image center is found. For example, if center is at pixel with coordinates [120, 90] and pixel P has coordinates [80, 60], then pixel P(r) has coordinates [160, 120] and r=50 is the distance to image center (radius).
Then, scattering signal at point P is calculated as convolution of kernel G with measured signal at point P(r), multiplied by amplification parameter A(r). This processing is performed for all pixels of the receiver array.
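The mirror-pixel procedure just described can be sketched as follows; the function and variable names are assumptions, and the naive per-pixel loop is written for clarity rather than speed.

```python
import numpy as np

def scattering_estimate(S, G, A_coeffs):
    """Estimate the scattering signal: for each pixel P, take the 3x3
    neighborhood of the mirror pixel P(r) about the image center,
    convolve it with kernel G, and scale by the polynomial A(r)."""
    h, w = S.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    Sp = np.pad(S, 1, mode="edge")            # pad so a 3x3 window always fits
    scat = np.zeros_like(S, dtype=float)
    for y in range(h):
        for x in range(w):
            my = int(round(2 * cy - y))       # mirror row of (y, x) ...
            mx = int(round(2 * cx - x))       # ... with respect to the center
            r = np.hypot(y - cy, x - cx)      # radius of the current pixel
            window = Sp[my:my + 3, mx:mx + 3] # 3x3 neighborhood at P(r)
            scat[y, x] = np.polyval(A_coeffs, r) * float(np.sum(G * window))
    return scat
```

For the worked example in the text, the mirror is `2*c - p` per coordinate: with the center at [120, 90], pixel P = [80, 60] maps to P(r) = [160, 120] at radius r = 50.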
A(r) is a dimensionless quantity that depends on the radius r (the distance from P to the center). A(r) is a one-dimensional function that is specified, in an exemplary embodiment, by a third-degree polynomial; in the general case, it can be specified as a piecewise linear function. A(r) determines the scattering signal level.
G is a constant kernel (a 3×3 array), which does not depend on the radius. The G array approximates local light aberrations in the pixel neighborhood.
Fig. 5 shows an example of simulating a ghost signal of a star-shaped object. The white cross marks the image center, and the square defines the 3×3 G-kernel to convolve with. The ghost signal is formed mirror-symmetrically with respect to the image center, based on the distance from the current point to the image center.
Values of G array and coefficients of polynomial defining A(r) can be obtained by machine learning, i.e. they are selected automatically so that the result of applying scattering approximation mathematical model is as close as possible to real (measured) values. A(r) and G define ghost imaging model. In the simplest case, G array can be neglected, then PSF(r) = A(r).
Note that ghost and direct signal intensities are visualized in Fig. 5 on logarithmic scale so that ghost signal can be distinguished, because on linear scale ghost would be hardly noticeable.
The ghost image is simulated as:

Ghost(p̄_R) = A(R)·(G ⊗ Image)(p_R)

where p_R is a pixel of the image positioned at distance R from the image center, p̄_R is the ghost pixel at the mirror position of p_R with respect to the sensor center, and Image and Ghost are the pixel intensities of the signal from the object and of the ghost signal, respectively. The 3×3 convolution of the G kernel with the image signal is taken at position p_R; the convolution result, amplified by A(R), forms the ghost signal at point p̄_R.
In step S2, a scene is captured with ToF camera, resulting in a data set characterizing the captured scene.
Next, in step S3, ghost signals in the captured scene data set from step S2 are compensated using the PSF distribution model produced in step S1. To this end, the estimated scattering signal level at each pixel (i, j) is iteratively subtracted from the measured signal using the PSF sequence:

Comp(i, j) = Meas(i, j) − Scat(i, j),

where Comp(i, j) is the level of the compensated (reconstructed) signal at pixel (i, j), Scat(i, j) is the level of the scattering signal at pixel (i, j), and Meas(i, j) is the level of the measured signal at pixel (i, j) of the sensor.
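The iterative subtraction can be sketched as follows. This is a toy NumPy illustration: the ghost model, the fixed iteration count, and all variable names are assumptions for demonstration, not the patented implementation.

```python
import numpy as np

def compensate(measured, ghost_model, n_iter=5):
    """Iteratively subtract the scattering signal predicted by the ghost
    model from the measured signal, refining the reconstruction each pass."""
    measured = np.asarray(measured, dtype=float)
    estimate = measured.copy()
    for _ in range(n_iter):
        scatter = ghost_model(estimate)   # scattering predicted from current estimate
        estimate = measured - scatter     # refined (compensated) signal
    return estimate

# Toy forward model: the ghost is the center-mirrored image scaled by 0.1
ghost = lambda img: 0.1 * img[::-1, ::-1]
true_signal = np.array([[1.0, 0.0], [0.0, 0.0]])
observed = true_signal + ghost(true_signal)   # measurement = signal + ghost
recovered = compensate(observed, ghost)
```

Because the ghost amplitude is small relative to the direct signal, the fixed-point iteration converges quickly and the recovered signal approaches the true one.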
In step S4, the corrected scene data with compensated ghost signals is used to generate a scene depth map.
The developed compact PSF model enables real-time implementation of the ghost signal compensation algorithm for images captured with a ToF camera, while maintaining the compensation quality of the conventional approach. The algorithm can also be executed by low-power devices on a SoC (System-on-a-Chip) platform.
Using the pre-calculated PSF model and the captured scene data (raw phases) allows real-time compensation of distance measurement errors. The use of optical simulation provides an accurate PSF distribution model and compensates even relatively weak ghosting components, thereby ensuring high-quality compensation. At the same time, approximating the PSF distribution model significantly reduces its size and, consequently, the amount of memory occupied by the data and the complexity of processing, while maintaining high compensation quality. As a result, the present invention can be implemented even on devices with modest computing power. In addition, the present invention does not require additional expensive mechanical modifications of existing hardware, which shows its simplicity and cost-effectiveness in terms of material, time, and other resources.
In an alternative embodiment, where not all optical system parameters are known, machine learning can be applied to produce a highly accurate optical PSF model. To this end, the described step S1 further includes the following steps (see Fig. 6):
- Experimental data acquisition (step S12).
In step S12, a plurality of image pairs is captured with the image capturing device (ToF camera). Each pair includes a ghost-affected image, captured with an object positioned close to the lens of the image capturing device (which produces said ghost artifacts), and a "ground truth" ghost-free image, captured without an object close to the lens while the background scene is left untouched.
- PSF model optimization using machine learning (step S13).
In step S13, PSF model optimization is performed during the training phase of the ML model based on the datasets obtained in step S12, which consist of ghost-affected/ghost-free image pairs.
The conventional forward-backward propagation machine learning approach is used to tune the PSF weights, initialized by the optical model, to achieve maximum convergence with the ground-truth data. The model can be built in any common machine learning framework.
As an example, this embodiment can be described as follows:
In steps S10 and S11, the initial PSF distribution model is formed as described above. It should be noted that since not all parameters of the optical system are known, the PSF model is generated under some assumptions, i.e. model parameters may deviate from the parameters of the real optical system.
Next, in step S12, a plurality of image pairs is captured with the ToF camera, as described above, each pair of scene images including a ghost-affected image and a ghost-free (ground truth) image.
The scattering signal is expressed through the PSF kernel, whose parameters should be chosen so that the compensated signal is as close as possible to the real (ground-truth) signal. The PSF is defined by A(r) and G, where A(r) is specified by a third-degree polynomial and G is a 3×3 array. Thus, the coefficients of the polynomial A(r) and the weights of the G array are the unknown variables to be found.
Given reliable ground-truth measurements, it is possible to specify an L2 loss function that quantifies the proximity of the compensated signal to the ground-truth signal.
By specifying the input signal, the PSF transformation of the input signal, the varied variables (the coefficients of A(r) and the weights of G), the loss function, and the output signal, a complete machine learning pipeline is formed. In an exemplary embodiment of the present invention, the pipeline was implemented on the MXNet platform. In alternative embodiments, any other machine learning platform can be used.
The search for the optimal values of the varied variables is carried out automatically by a gradient search method using the standard tools of the chosen platform. The starting values are initialized with the pre-calculated parameters obtained from optical simulation.
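Because the ghost model is linear in the polynomial coefficients of A(r) once G is fixed, the parameter fit can even be illustrated with ordinary least squares on synthetic data. This is a simplification of the gradient-based pipeline described above; the coefficient values, sample counts, and variable names are illustrative assumptions.

```python
import numpy as np

# Ground-truth polynomial A(r) used to synthesize data (highest degree first)
a_true = np.array([0.001, -0.01, 0.05, 0.2])

rng = np.random.default_rng(0)
r = rng.uniform(0.0, 10.0, 200)          # radii of the sampled pixels
c = rng.uniform(0.5, 2.0, 200)           # values of (G * Image) at those pixels
scatter = np.polyval(a_true, r) * c      # "observed" scattering: affected minus ghost-free

# The model A(r) * c is linear in the polynomial coefficients,
# so the coefficients follow from a linear least-squares fit.
X = np.stack([r**3 * c, r**2 * c, r * c, c], axis=1)
a_fit, *_ = np.linalg.lstsq(X, scatter, rcond=None)
```

On noiseless synthetic data the fit recovers the generating coefficients; with real image pairs, the same objective is minimized by gradient descent in the ML framework, with G fitted jointly.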
Therefore, PSF distribution model can be verified and refined based on real data, which improves its accuracy.
In yet another alternative embodiment of the present invention, in the optical system model generated in step S101, the point object is simulated as a point light source, for example a Lambertian source. The infrared light source can then be omitted from the optical system model without affecting ghost formation, the ghost amplitude relative to the signal from the object, or the ghost position. Consequently, when simulating scattering in the optical system in step S102, the propagation of light from the light source to the point object and its scattering can be excluded from the calculations, compared to the exemplary embodiment described above, which significantly reduces computational complexity and time.
In yet another alternative embodiment, before generating a depth map in step S4, the method further comprises a step of iterative estimation of the corrected scene data resulting from step S3. The iterative estimation is performed on the basis of a preliminary estimate: the weights of the pre-calculated kernels are refined iteratively to increase robustness to noise. This embodiment can be implemented in ToF cameras operating simultaneously at two modulation frequencies (for example, 20 MHz and 100 MHz), i.e. depth is measured using signals at both frequencies simultaneously. If there is a discrepancy (error) between the depth measurements obtained at the different frequencies, iterative estimation of the corrected scene data is performed to eliminate this error.
The iterative estimation is based on a physical constraint applied to the measured data, according to the following expression:

where A_comp and φ_comp are the amplitude and phase of the compensated signal, A_scat and φ_scat are the amplitude and phase of the scattering signal, A_meas and φ_meas are the amplitude and phase of the measured signal, and n is the random noise that is always present in real data.
The constraint is enforced by the two modulation frequencies f1 and f2 of the ToF camera. The closed-form solution may be formulated as a system of nonlinear equations:
[Equation 6]
[Equation 7]
[Equation 8]
[Equation 9]
where φ_comp,1 and φ_scat,1 are the phases of the compensated and scattering signals at the first frequency, Im_1 and Re_1 are the imaginary and real parts of the measured signal at the first frequency, φ_comp,2 and φ_scat,2 are the phases of the compensated and scattering signals at the second frequency, and Im_2 and Re_2 are the imaginary and real parts of the measured signal at the second frequency.
Thus, there is a system of four nonlinear equations with four unknowns. To increase the stability of the solution, two of the unknowns can be fixed; in this case, only the two remaining variables need to be varied.
Since the equations are nonlinear, the search for the optimal solution is carried out iteratively, by enumerating possible values of the varied variables and estimating, for example, the L2-norm of the residual for each combination. The values giving the lowest L2-norm correspond to the optimal solution.
Thus, the optimal solution to the system of equations, i.e. the one with the minimal residual, is found by iterative search. The solution corresponding to the minimal residual is taken as the final solution.
Since exhaustive search is computationally expensive, an efficient search strategy is required. In an exemplary embodiment of the present invention, the bisection method is used. The result obtained using the pre-calculated PSF kernel is taken as the starting point of the search for the optimal solution.
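A coarse-to-fine grid search in the spirit of this bisection strategy can be sketched as follows. The two varied variables and the stand-in residual function are illustrative assumptions; the patented method minimizes the residual of its own system of nonlinear equations instead.

```python
import numpy as np

def refine_search(residual, lo, hi, levels=6, grid=9):
    """Coarse-to-fine 2-D search: at each level, evaluate the residual on a
    grid over the current box and shrink the box around the best point."""
    lo = np.array(lo, dtype=float)
    hi = np.array(hi, dtype=float)
    best = (lo[0], lo[1])
    for _ in range(levels):
        xs = np.linspace(lo[0], hi[0], grid)
        ys = np.linspace(lo[1], hi[1], grid)
        vals = [(residual(x, y), x, y) for x in xs for y in ys]
        _, bx, by = min(vals)              # lowest L2-style residual on this grid
        best = (bx, by)
        span = (hi - lo) / grid            # shrink the search box around the best point
        lo = np.array([bx - span[0], by - span[1]])
        hi = np.array([bx + span[0], by + span[1]])
    return best

# Stand-in residual with a known minimum at (1.5, -0.7)
res = lambda x, y: (x - 1.5) ** 2 + (y + 0.7) ** 2
x_opt, y_opt = refine_search(res, (-5.0, -5.0), (5.0, 5.0))
```

Each level reduces the box width by roughly a factor of grid/2, so a handful of levels narrows a wide starting range to sub-percent precision, which is why such a strategy is far cheaper than exhaustive enumeration.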
In alternative embodiments of the present invention, other conventional optimization techniques may be employed.
This embodiment can improve error compensation quality and system robustness to noise.
Therefore, the present invention provides a simple and low-price solution for generating a depth map with high speed (real-time operation) and high quality compensation of ToF camera distance measurement errors.
At least some of the steps described above can be implemented by an AI (artificial intelligence) model. AI-related functions can be performed using nonvolatile memory, volatile memory, and a processor.
The processor may include one or more processors. The one or more processors may be a general-purpose processor such as a central processing unit (CPU) or application processor (AP), a graphics-only processing unit such as a graphics processing unit (GPU) or visual processing unit (VPU), and/or a specialized AI processor such as a neural processing unit (NPU).
The one or more processors control the processing of input data in accordance with a predefined operation rule or artificial intelligence (AI) model stored in the nonvolatile and volatile memory. The predefined operation rule or AI model is provided through learning or training.
Here, being provided through training means that a predefined operation rule or AI model with a desired characteristic is created by applying a training algorithm to a plurality of training data. The training may be performed on the same device in which the AI is executed according to the embodiment, and/or may be implemented by a separate server/system.
An AI model can consist of many neural network layers. Each layer has multiple weights and performs its layer operation by applying those weights to the output of the previous layer. Examples of neural networks include, but are not limited to, Convolutional Neural Networks (CNN), Deep Neural Networks (DNN), Recurrent Neural Networks (RNN), Restricted Boltzmann Machines (RBM), Deep Belief Networks (DBN), Bidirectional Recurrent Deep Neural Networks (BRDNN), Generative Adversarial Networks (GAN), and deep Q-networks.
A learning algorithm is a method of training a predetermined target device (e.g. a robot) using a variety of training data to invoke, enable, or control the target device to perform a determination or prediction. Examples of learning algorithms include, but are not limited to, supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.
It should be understood that while terms such as "first", "second", "third", etc. may be used herein to describe various elements, components, regions, layers and/or sections, these elements, components, regions, layers and/or sections should not be limited by these terms. These terms are used only to distinguish one element, component, region, layer or section from another. Thus, a first element, component, region, layer, or section may be referred to as a second element, component, region, layer, or section without departing from the scope of the present invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. Elements mentioned in the singular do not exclude the plurality of elements, unless otherwise specified.
Functionality of an element mentioned in the description or claims as a single element may be implemented in practice by several device components, and vice versa, functionality of elements mentioned in the description or claims as several separate elements can be implemented in practice by a single component.
Embodiments of the present invention are not limited to the embodiments described herein. Those skilled in the art will envisage, based on the information set forth in the description and knowledge of the prior art, other embodiments of the invention within the scope and spirit of this invention.
Those skilled in the art should understand that the essence of the invention is not limited to a specific software or hardware implementation, and therefore any software and hardware known in the prior art can be used to implement the invention. Thus, hardware can be implemented in one or more specialized integrated circuits, digital signal processors, digital signal processing devices, programmable logic devices, user-programmable gate arrays, processors, controllers, microcontrollers, microprocessors, electronic devices, other electronic modules adapted to perform functions described in this document, a computer or a combination of the above.
It is apparent that when referring to storing data, programs, etc., the availability of a computer-readable medium is implied. Examples of computer-readable media include read only memory, random access memory, register, cache memory, semiconductor storage devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROMs and digital versatile discs (DVDs), and any other conventional storage media.
While exemplary embodiments have been described and shown in the accompanying drawings, it should be understood that such embodiments are illustrative only and are not intended to limit the broader invention, and that the present invention should not be limited to the specific arrangements and structures shown and described since various other modifications may be apparent to those skilled in the art.
Features mentioned in various dependent claims, as well as embodiments disclosed in various parts of the description, can be combined to achieve beneficial effects, even if the possibility of such combination is not explicitly disclosed.
Claims (8)
- A method for generating a depth map with distance measurement error correction for a ToF camera, comprising: preliminarily estimating a calculated Point Spread Function (PSF) distribution model for the ToF camera; capturing a scene image with the ToF camera; compensating for ghost signals in the captured scene image data using said PSF; and generating a depth map using the corrected data, wherein said preliminary estimating of the calculated PSF distribution model includes the steps of: initializing an optical model of the optical system of the camera; simulating light scattering in the optical system on the basis of said initialized optical system model; calculating the PSF of the optical system; and approximating the PSF with a polynomial function of radius, where radius is the distance from a given pixel to the image center.
- The method for generating a depth map according to claim 1, wherein said calculating of the PSF of the optical system is performed according to the following equation: where the first quantity is a simulated input signal measure that has a non-zero value only at the position of pixel (i, j); the second is the level (intensity) of the scattering signal over the camera sensor, represented as a two-dimensional function (array), for an input signal arriving at pixel (i, j); and the third is the PSF kernel for an input signal arriving at pixel (i, j).
- The method for generating a depth map according to claim 1, wherein said compensating of ghost signals in captured scene image data is performed according to the expression:
- The method for generating a depth map according to claim 1, wherein said preliminary estimating of calculated PSF distribution model further comprises applying machine learning optimization on the obtained PSF based on datasets including ghost-affected image/ghost-free image pairs.
- The method for generating a depth map according to claim 1, wherein, in the optical model of the ToF camera optical system, a point object is simulated as a point light source.
- The method for generating a depth map according to claim 1, further comprising, before generating a depth map, iterative estimation of corrected data of the captured scene image.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
RU2021117320A RU2770153C1 (en) | 2021-06-15 | 2021-06-15 | METHOD FOR CORRECTING THE DEPTH MEASUREMENT ERROR OF THE ToF CAMERA |
RU2021117320 | 2021-06-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022265216A1 true WO2022265216A1 (en) | 2022-12-22 |
Family
ID=81212544
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2022/005923 WO2022265216A1 (en) | 2021-06-15 | 2022-04-26 | Depth measurement error correction method for tof camera |
Country Status (2)
Country | Link |
---|---|
RU (1) | RU2770153C1 (en) |
WO (1) | WO2022265216A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120050580A1 (en) * | 2010-08-26 | 2012-03-01 | Sony Corporation | Imaging apparatus, imaging method, and program |
US20170091910A1 (en) * | 2015-09-25 | 2017-03-30 | Intel Corporation | Facilitating projection pre-shaping of digital images at computing devices |
US20180068424A1 (en) * | 2016-09-07 | 2018-03-08 | Samsung Electronics Co., Ltd. | Time-of-flight measuring apparatus and image processing method for reducing blur of depth image therein |
US20180343432A1 (en) * | 2017-05-23 | 2018-11-29 | Microsoft Technology Licensing, Llc | Reducing Blur in a Depth Camera System |
US20200096637A1 (en) * | 2017-04-04 | 2020-03-26 | pmdtechnologies ag | Time-of-flight camera |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9002511B1 (en) * | 2005-10-21 | 2015-04-07 | Irobot Corporation | Methods and systems for obstacle detection using structured light |
US10230934B2 (en) * | 2013-06-14 | 2019-03-12 | Microsoft Technology Licensing, Llc | Depth map correction using lookup tables |
EP2955544B1 (en) * | 2014-06-11 | 2020-06-17 | Sony Depthsensing Solutions N.V. | A TOF camera system and a method for measuring a distance with the system |
US10598768B2 (en) * | 2017-05-24 | 2020-03-24 | Microsoft Technology Licensing, Llc | Multipath mitigation for time of flight system |
Also Published As
Publication number | Publication date |
---|---|
RU2770153C1 (en) | 2022-04-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22825145 Country of ref document: EP Kind code of ref document: A1 |
NENP | Non-entry into the national phase |
Ref country code: DE |