WO2022000147A1 - Depth image processing method and device - Google Patents


Info

Publication number
WO2022000147A1
Authority
WO
WIPO (PCT)
Prior art keywords
matrix
depth image
pixel
depth
fitted
Prior art date
Application number
PCT/CN2020/098640
Other languages
French (fr)
Chinese (zh)
Inventor
周鸿彬
武雪飞
罗鹏飞
唐样洋
董晨
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to CN202080101640.5A priority Critical patent/CN115667989A/en
Priority to PCT/CN2020/098640 priority patent/WO2022000147A1/en
Publication of WO2022000147A1 publication Critical patent/WO2022000147A1/en

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/02Systems using the reflection of electromagnetic waves other than radio waves
    • G01S17/06Systems determining position data of a target
    • G01S17/08Systems determining position data of a target for measuring distance only

Definitions

  • the n initial depth images are depth images obtained in the first situation, where the first situation includes that the start-up moment of the light source of the photographing device and the start-up moment of the sensor of the photographing device are different moments; the preprocessing includes mean value processing according to the n initial depth images, where n is an integer greater than 1.
  • the depth nonlinear error is a nonlinear error caused by the waveform of the light signal not being a standard waveform (for example, not a standard sine wave); the error amount of this nonlinear error resembles a sine wave, oscillating between positive and negative values as the distance changes. Therefore, in the present application, by sampling multiple times with the light source of the photographing device started later than the sensor and performing mean value processing according to the above n initial depth images, the positive and negative error amounts of the nonlinear error can cancel each other, which reduces the depth nonlinear error introduced in the measurement process.
  • the above field-of-view phase correction coefficient matrix being calculated from n fitted depth images includes: the field-of-view phase correction coefficient matrix is the ratio matrix obtained by taking the ratio of the minimum of the pixel values in the fitted average pixel matrix to each pixel value of the fitted average pixel matrix, where the fitted average pixel matrix is obtained by averaging the pixel matrices of the above n fitted depth images.
  • the execution subject of the depth image processing method provided by the present application may be the above-mentioned photographing device or another photographing device other than the above-mentioned photographing device, and the above-mentioned photographing device or the other photographing device stores the data of the above-mentioned FPN matrix.
  • the fixed pattern noise FPN can also be called pixel fixed noise.
  • the manufacturing process, reading order, and circuit design of each pixel are not exactly the same, so errors between different pixels will be introduced.
  • the fixed pattern noise of each pixel caused by the hardware of the photographing device can also be corrected.
  • two properties of the depth nonlinear error are exploited: periodic oscillation and zero sum (the sum of the errors over a complete period is zero). By sampling multiple times with the light source started later than the sensor, such that the sum of the multiple delays is an integer multiple of the period of the sampled light signal, and then averaging the pixel matrices of the depth images obtained with these delays, the depth nonlinear error is reduced, which improves the accuracy of the obtained field-of-view phase correction coefficient matrix. Therefore, the field-of-view phase correction coefficient matrix obtained by the present application can correct the field-of-view phase error of each pixel value in the depth image obtained by the photographing device, thereby reducing the distortion rate of the depth image and improving its quality.
  • the initial depth image obtained for the i-th time is the i-th initial depth image; the above-mentioned acquisition of n initial depth images includes:
  • n measurement depth images are obtained by n actual measurements of the above photographing device through the time-of-flight TOF ranging method, and the i-th of the n measurement depth images is represented by a pixel matrix Di(x, y)';
  • the above-mentioned calculation to obtain a field of view phase correction coefficient matrix after performing mean value processing according to the above-mentioned n fitted depth images including:
  • the method further includes:
  • the field-of-view phase correction coefficient matrix is calculated from n fitted depth images, the n fitted depth images are obtained by performing surface fitting on the n initial depth images respectively, and the n initial depth images are depth images of the second object obtained n times by the photographing device through the time-of-flight TOF ranging method, where in the i-th of the n times the start-up moment of the light source is delayed by (i-1)*Δt from the start-up moment of the sensor, and the value of i ranges over [1, n];
  • the initial depth image obtained the i-th time is the i-th initial depth image, represented by a pixel matrix Di(x, y);
  • the pixel matrix Di(x, y) is obtained by adding c*[(1-i)*Δt]/2 to each pixel value in the pixel matrix Di(x, y)', where Di(x, y)' is the pixel matrix of the depth image obtained by the i-th measurement, and c*[(1-i)*Δt]/2 is the distance difference caused by the start-up moment of the light source being delayed by (i-1)*Δt from the start-up moment of the sensor.
  • the difference matrix is a fixed pattern noise FPN matrix, and each value in the FPN matrix includes the hardware-caused fixed noise of the pixel with the same index as that value.
  • the field-of-view phase correction coefficient matrix is obtained by calculation after averaging the n fitted depth images, and it is used to correct the field-of-view phase error of the depth image obtained by the above photographing device through the above TOF ranging method.
  • the above difference matrix is a fixed pattern noise FPN matrix, and each value in the FPN matrix includes the hardware-caused fixed noise of the pixel in the depth image with the same index as that value.
  • the present application provides a computer program product, when the computer program product is read and executed by a computer, the method described in any one of the above-mentioned second aspects will be executed.
  • the calculation of the depth value based on the measurement data obtained by the TOF ranging method may be performed in the above-mentioned photographing device 101, or the photographing device 101 may obtain these measurement data. These measurement data are then sent to other devices, such as servers in the cloud, for depth value calculation.
  • the controller 1012 can also be used for delay control, the function of which is to adjust the time difference between the activation of the light source 1011 and the sensor 1013, which can be used for various calibrations.
  • the controller 1012 can be a control circuit in an integrated chip, the principle of which is to use a resistor-capacitor RC circuit to generate a delay.
  • the controller 1012 can make the light source 1011 start later than the sensor 1013 or start earlier than the sensor 1013 through the RC circuit.
  • the controller may also be a control unit implemented by software, and the time difference between the activation of the light source 1011 and the sensor 1013 is adjusted by software logic.
  • the present application provides a depth image processing method, which can correct the field-of-view phase error of the depth image through the field-of-view phase correction coefficient matrix and/or reduce the fixed pattern noise through the FPN matrix, so that the distortion rate of the obtained depth image can be greatly reduced and its quality improved.
  • the ⁇ t may be a positive number or a negative number.
  • when Δt is a positive number, the start-up moment of the light source being delayed by (i-1)*Δt from the start-up moment of the sensor means that the sensor is started (i-1)*Δt earlier than the light source; when Δt is a negative number, it means that the light source is started (i-1)*(-Δt) earlier than the sensor.
  • the ⁇ t can be any value other than 0.
  • the light source and the sensor are the light source and the sensor in the photographing device, for example, the light source 1011 and the sensor 1013 shown in FIG. 1 .
  • for example, at the 1st second the light source starts to emit light signals, that is, the start-up moment of the light source is the 1st second; at the 2nd second the sensor becomes able to receive reflected light signals, that is, the start-up moment of the sensor is the 2nd second; at the 3rd second the sensor begins to receive the reflected light signal. The sensor can start receiving at the 2nd second, but the reflected light signal has not yet propagated to the sensor, so no reflected light signal is received at the 2nd second.
  • the activation timing of the above-mentioned light sources and sensors may be controlled by a controller, and the controller may be the controller 1012 described in FIG. 1 . That is, the controller can use the resistor-capacitor RC circuit to generate a delay or use software to generate a delay, so that the light source can be started later than the sensor by (i-1)* ⁇ t.
  • the above-mentioned delay in starting the light source from the sensor means that the time when the light source starts to emit the light signal is delayed from the time when the sensor starts to receive the reflected light signal.
  • the light source starts earlier than the sensor means that the time when the light source starts to emit the light signal is earlier than the time when the sensor starts to receive the reflected light signal.
  • the i-th initial depth image finally calculated based on the i-th measurement can be represented by a pixel matrix Di(x, y). The pixel matrix Di(x, y) is obtained by adding c*[(1-i)*Δt]/2 to each pixel value in the pixel matrix Di(x, y)', which is equivalent to subtracting c*[(i-1)*Δt]/2 from each pixel value in Di(x, y)'.
  • the above-mentioned surface fitting method may be a polynomial surface fitting method of least squares or a surface fitting method using other fitting functions.
  • the approximate function type of the image can be roughly judged according to these pixel matrices.
  • drawing software or simulation software can be used to draw or simulate the scatter images corresponding to these pixel matrices.
  • the approximate function type of the scatter image can then be judged according to experience.
  • the function can be used to fit the n initial depth images respectively to obtain n fitted depth images.
  • the depth nonlinear error has two properties: periodic oscillation and zero sum. The measurement sampling process from Df1(x,y) to Dfn(x,y) happens to cover complete periodic oscillations; therefore, averaging the n fitted depth images Df1(x,y)~Dfn(x,y) to obtain Dfa(x,y) cancels the nonlinear periodic oscillations and reduces the depth nonlinear error.
  • the above-mentioned FPN matrix can also be calculated in the following manner:
  • the difference matrix S2(x, y)' may also be an FPN matrix.
  • the above-mentioned field of view phase correction coefficient matrix and FPN matrix can be obtained by training to obtain one of the matrices. For example, if you want to correct the field of view phase error in the depth image, you can train to obtain the field of view phase correction coefficient matrix; if you want to reduce the fixed pattern noise in the depth image, you can train to obtain the FPN matrix. Of course, if you want to correct the field of view phase error in the depth image and reduce the fixed pattern noise in the depth image at the same time, then both the field of view phase correction coefficient matrix and the FPN matrix can be obtained by training.
  • the initial average depth image Da(x,y)' is obtained by averaging, and then the minimum value d0' in Da(x,y)' is extracted; the point corresponding to the minimum value d0' may be the point on the photographed object within the area closest to the lens of the photographing device, for example point A in the above FIG. 3 or a point in the area near point A. The ratio matrix S1(x,y)" is the above field-of-view phase correction coefficient matrix.
  • a field-of-view phase correction coefficient matrix may also be used.
  • the first object may be any object photographed by the photographing device, and may be a planar object, a three-dimensional object, a spatial object, and so on. The first depth image may also be the first initial depth image D1(x, y) or the like in the training process described in FIG. 2. This application does not limit the specific photographed object.
  • the n fitted depth images are obtained by performing surface fitting on the above n initial depth images respectively, and the n initial depth images are depth images obtained n times by the above photographing device through the time-of-flight TOF ranging method during training, where the second object is the plane used for the above training, and in the i-th of the n times the start-up moment of the light source is delayed by (i-1)*Δt from the start-up moment of the sensor.
  • the field-of-view phase correction coefficient matrix in S502 may be the above S1(x,y)'. The field-of-view phase correction coefficient matrix S1(x,y)' is then calculated from the above n fitted depth images; specifically, it is the ratio matrix obtained by taking the ratio of each pixel value of the fitted average pixel matrix to the minimum of the pixel values in the fitted average pixel matrix. The other descriptions are the same as those for the field-of-view phase correction coefficient matrix S1(x,y) and are not repeated here. The values with the same subscript in S1(x,y)' and the above S1(x,y) are reciprocals of each other, so the ratio matrix and the above product matrix DS(x,y) may be the same matrix, and the ratio matrix may be represented by DS(x,y); that is, the second depth image may be represented by DS(x,y).
  • obtaining the third depth image by compensating the depth value of each pixel in the second depth image with the fixed pattern noise FPN matrix is specifically: in the case that the matrix DS(x, y) is obtained by the calculation in S502 above, calculating the difference between the matrix DS(x, y) and the FPN matrix S2(x, y)' to obtain the difference matrix Dc(x, y)" as the third depth image, where the specific calculation formula is:
  • correction unit 702 is also used for:
  • the second depth image is obtained by calculating, for each subscript, the product of the pixel value in the pixel matrix of the first depth image and the pixel value with the same subscript in the field-of-view phase correction coefficient matrix.
  • the difference matrix is a fixed pattern noise FPN matrix, and each value in the FPN matrix includes the hardware-caused fixed noise of the pixel with the same index as that value.
  • the device is a chip or a System on a Chip (SoC).
  • An embodiment of the present application further provides an apparatus, where the apparatus includes a processor and a communication interface, and the apparatus is configured to execute the method described in FIG. 5 or FIG. 6 and possible embodiments thereof.
  • the words "first", "second" and the like are used to distinguish identical or similar items having substantially the same function; it should be understood that there is no logical or temporal dependency among "first", "second" and "n-th", and no restriction on number or execution order. It should also be understood that, although the following description uses the terms first, second, etc. to describe various elements, these elements should not be limited by these terms, which are only used to distinguish one element from another. For example, a first image may be referred to as a second image and, similarly, a second image may be referred to as a first image without departing from the scope of the various described examples. Both the first image and the second image may be images and, in some cases, may be separate and distinct images.
  • the sequence numbers of the processes do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.

Abstract

Provided are a depth image processing method and a device. The method comprises: acquiring a first depth image of a first object; and correcting a depth value of a pixel in the first depth image by means of a field-of-view phase correction coefficient matrix to obtain a second depth image, wherein the field-of-view phase correction coefficient matrix is a matrix that is obtained after preprocessing n initial depth images and is used for correcting a field-of-view phase error, the n initial depth images are depth images obtained in a first case, the first case is that a starting time of a light source of a photographing device is different from a starting time of a sensor of the photographing device, the preprocessing comprises mean processing performed according to the n initial depth images, and n is an integer greater than 1. By using the embodiments of the present application, the distortion rate of a depth image can be reduced.

Description

Depth image processing method and device
Technical Field
The present invention relates to the technical field of image processing, and in particular, to a depth image processing method and device.
Background
A depth image is an image in which the pixel values are the vertical distances (depths) from the image collector to the points in the scene; it directly reflects the geometry of the visible surfaces of the scene. Depth images can be acquired by lidar depth imaging, computer stereo vision, coordinate measuring machines, Moiré fringe methods, structured light, and so on. A depth image is a three-dimensional representation of an object and is generally acquired by a stereo camera or a time-of-flight (TOF) camera. However, the depth maps acquired by a TOF camera contain errors, whose main sources are:
Depth nonlinear error: when the wave emitted by the active light source of the TOF sensor is not a perfect sine wave, the depth value measured by TOF contains a nonlinear error. The nonlinear error takes different values at different distances. Taking a square wave as an example, the error amount resembles a sine wave: it oscillates as the distance changes and repeats periodically.
Field-of-view phase error: a time-of-flight ranging system uses the time-of-flight difference to calculate distance, so the measured value is actually the straight-line distance between the image collector and the pixel point, not the depth along the Z axis, i.e., not the vertical distance between the image collector and the pixel point. The field-of-view phase error is the error between this vertical distance and the straight-line distance.
In the prior art there is a technical scheme that calculates the field of view from lens parameters and then uses cosine values to correct the field-of-view phase error. However, that scheme only corrects the field-of-view phase error and does not account for the residual depth nonlinear error, so the obtained depth image has a high distortion rate. In summary, how to correct the field-of-view phase error of a depth image while also reducing the depth nonlinear error, so as to lower the distortion rate of the depth image, is a technical problem that those skilled in the art urgently need to solve.
SUMMARY OF THE INVENTION
The present application provides a depth image processing method and device, which can correct the field-of-view phase error of a depth image while also reducing the depth nonlinear error, so as to lower the distortion rate of the depth image.
In a first aspect, the present application provides a depth image processing method, the method comprising:
acquiring a first depth image of a first object; and correcting the depth values of the pixels in the first depth image by a field-of-view phase correction coefficient matrix to obtain a second depth image, where the field-of-view phase correction coefficient matrix is a matrix, obtained after preprocessing n initial depth images, used to correct the field-of-view phase error; the n initial depth images are depth images obtained in a first situation, the first situation including that the start-up moment of the light source of the photographing device and the start-up moment of the sensor of the photographing device are different moments; the preprocessing includes mean value processing according to the n initial depth images; and n is an integer greater than 1.
Specifically, the above field-of-view phase correction coefficient matrix is used to reduce the field-of-view phase error of the depth value of each pixel in the first depth image. The field-of-view phase error is the error caused by the angular deviation of the field of view.
The n initial depth images all have the same size, that is, the pixel matrices of the n initial depth images are n matrices with equal numbers of rows and equal numbers of columns.
The execution subject of the depth image processing method provided by the present application may be the above photographing device or another photographing device; the photographing device or the other photographing device stores the data of the above field-of-view phase correction coefficient matrix.
It should be noted that, in the present application, the start-up moment of the light source may also be called the moment when the light source starts to emit light signals, and the start-up moment of the sensor may also be called the moment when the sensor becomes able to receive reflected light signals.
The depth nonlinear error is a nonlinear error caused by the waveform of the light signal not being a standard waveform (for example, not a standard sine wave), and the error amount of this nonlinear error resembles a sine wave: it oscillates as the distance changes, taking both positive and negative values. Therefore, in the present application, by sampling multiple times with the light source of the photographing device started later than the sensor, and by performing mean value processing according to the above n initial depth images, the positive and negative error amounts of the nonlinear error cancel each other, which reduces the depth nonlinear error introduced in the measurement process. Meanwhile, the field-of-view phase correction coefficient matrix obtained based on this sampling and mean value processing can correct the depth value of each pixel in the depth image and reduce the error caused by the field-of-view phase, thereby greatly lowering the distortion rate of the final depth image and improving its quality.
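As a minimal sketch of the correction step (shapes and values are illustrative, not from the patent), correcting a depth image with a field-of-view phase correction coefficient matrix is an elementwise product of same-index entries:

```python
import numpy as np

# Hypothetical 4x4 first depth image D1 (metres) and a previously trained
# field-of-view phase correction coefficient matrix S1 (dimensionless,
# values in (0, 1]).  Shapes must match: the correction is per-pixel.
rng = np.random.default_rng(0)
D1 = 2.0 + 0.1 * rng.random((4, 4))           # first depth image
S1 = 1.0 / (1.0 + 0.05 * rng.random((4, 4)))  # correction coefficients

# Second depth image: elementwise product of same-index pixel values.
D2 = D1 * S1
```

Because the coefficients are at most 1, the correction can only shorten the slant distances toward vertical (Z-axis) depths, never lengthen them.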
In one possible implementation, the preprocessing further includes surface fitting processing, and the field-of-view phase correction coefficient matrix is calculated from n fitted depth images, which are obtained by performing surface fitting on the n initial depth images respectively; the n initial depth images are depth images of a second object obtained n times by the photographing device through the time-of-flight TOF ranging method, where in the i-th of the n times the start-up moment of the light source is delayed by (i-1)*Δt from the start-up moment of the sensor, the value of i ranges over [1, n], n*|Δt| = k*T, T is the period of the light signal, k is an integer, and Δt is a preset duration.
The relation n*|Δt| = k*T indicates that the total amount of the n delays is an integer multiple of the period of the light signal emitted by the light source. The n fitted depth images all have the same size, that is, their pixel matrices are n matrices with equal numbers of rows and equal numbers of columns. In addition, the n fitted depth images have the same size as the n initial depth images, that is, the pixel matrices of the n initial depth images and of the n fitted depth images are 2*n matrices with equal numbers of rows and equal numbers of columns.
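The delay schedule can be sketched numerically; the period T = 50 ns, n = 8 and k = 1 below are illustrative values, not taken from the patent:

```python
# The schedule must satisfy n*|dt| = k*T so that the sample positions
# (i-1)*dt, for i = 1..n, uniformly cover whole periods of the light
# signal, letting the periodic nonlinear error average to zero.
T = 50e-9        # illustrative light-signal period, seconds
n = 8            # number of delayed acquisitions
dt = T / n       # k = 1: the n delays span exactly one period

delays = [(i - 1) * dt for i in range(1, n + 1)]  # 0, dt, ..., 7*dt
```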
Optionally, the surface fitting method may be least-squares polynomial surface fitting or a surface fitting method using another fitting function.
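A least-squares polynomial surface fit of one pixel matrix might look like the following sketch (the degree-2 polynomial and the synthetic depth surface are illustrative choices; the patent does not fix the polynomial degree):

```python
import numpy as np

def fit_surface(D):
    """Fit z = a + b*x + c*y + d*x^2 + e*x*y + f*y^2 to a pixel matrix D
    by least squares and return the fitted pixel matrix Df."""
    h, w = D.shape
    y, x = np.mgrid[0:h, 0:w]
    x, y, z = x.ravel(), y.ravel(), D.ravel()
    A = np.column_stack([np.ones_like(x), x, y, x**2, x*y, y**2]).astype(float)
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return (A @ coeffs).reshape(h, w)

# Synthetic bowl-shaped depth surface (a plane seen through a wide lens
# looks farthest at the corners); exactly quadratic, so the fit is exact.
D = np.fromfunction(lambda i, j: 2.0 + 0.01 * (i - 8) ** 2 + 0.01 * (j - 8) ** 2,
                    (16, 16))
Df = fit_surface(D)
```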
It should be noted that the start-up moment of the light source being delayed from that of the sensor may also be described as the moment when the light source starts to emit light signals being delayed from the moment when the sensor becomes able to receive reflected light signals.
In the present application, two properties of the depth nonlinear error are exploited: periodic oscillation and zero sum (the sum of the errors over a complete period is zero). By sampling multiple times with the light source of the photographing device started later than the sensor, such that the sum of the multiple delays is an integer multiple of the period of the sampled light signal, and then averaging the pixel matrices of the depth images obtained with these delays, the depth nonlinear error is reduced, which improves the accuracy of the obtained field-of-view phase correction coefficient matrix.
In one possible implementation, the initial depth image obtained the i-th time is the i-th initial depth image, represented by a pixel matrix Di(x, y); the pixel matrix Di(x, y) is obtained by adding c*[(1-i)*Δt]/2 to each pixel value in the pixel matrix Di(x, y)', where Di(x, y)' is the pixel matrix of the depth image obtained by the i-th measurement, and c*[(1-i)*Δt]/2 is the distance difference caused by the start-up moment of the light source being delayed by (i-1)*Δt from the start-up moment of the sensor.
In the present application, the operation of adjusting the depth by c*[(1-i)*Δt]/2 after the delay of (i-1)*Δt compensates, with the theoretical depth difference of time ranging, for the distance difference caused by the delay, so that the initial depth image is obtained at the same time as the depth nonlinear error amounts of the same pixel at different time phase points.
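The compensation Di(x,y) = Di(x,y)' + c*[(1-i)*Δt]/2 can be sketched as follows (the 1 ns delay step and the measured depths are made-up values for illustration):

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def compensate(Di_meas, i, dt):
    """Compensate the i-th measured pixel matrix Di(x,y)' for the distance
    offset caused by starting the light source (i-1)*dt later than the
    sensor: Di(x,y) = Di(x,y)' + c*[(1-i)*dt]/2."""
    return Di_meas + C * ((1 - i) * dt) / 2.0

dt = 1e-9                        # illustrative 1 ns delay step
Di_meas = np.full((2, 2), 3.0)   # hypothetical measured depths, metres
D3 = compensate(Di_meas, i=3, dt=dt)
# i = 3: every pixel shifts by c*(1-3)*1e-9/2 ≈ -0.2998 m
```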
In one possible implementation, the field-of-view phase correction coefficient matrix being calculated from the n fitted depth images includes: the field-of-view phase correction coefficient matrix is a ratio matrix obtained by dividing the minimum of the pixel values in a fitted average pixel matrix by each pixel value of that fitted average pixel matrix, where the fitted average pixel matrix is obtained by averaging the pixel matrices of the n fitted depth images.
In this application, the field-of-view phase correction coefficient matrix is calculated, so that the field-of-view phase error of a depth image acquired by the photographing device through the TOF ranging method can be corrected.
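The ratio-matrix construction just described can be sketched as follows (function name assumed for illustration):

```python
import numpy as np

def fov_phase_correction_matrix(fitted_images):
    """Average the pixel matrices of the n fitted depth images, then divide the
    minimum pixel value of that average by every pixel value, giving the ratio
    matrix used as the field-of-view phase correction coefficient matrix.
    All ratios lie in (0, 1], with 1 at the pixel holding the minimum."""
    avg = np.mean(fitted_images, axis=0)  # fitted average pixel matrix
    return avg.min() / avg                # element-wise ratio matrix
```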
In one possible implementation, the depth image processing method further includes: correcting the second depth image through a fixed pattern noise (FPN) matrix to obtain a third depth image, where each value in the FPN matrix includes the fixed noise, caused by hardware, of the corresponding pixel in the first depth image. It should be noted that this application may be implemented by combining the method in the first aspect with one or more of the methods in the possible implementations described above.
The execution body of the depth image processing method provided in this application may be the photographing device described above or another photographing device, and the photographing device or the other photographing device stores the data of the FPN matrix. Specifically, fixed pattern noise (FPN) may also be called pixel fixed noise: individual pixels differ in manufacturing process, readout order, and circuit design, which introduces errors between different pixels. In this application, in addition to correcting the field-of-view phase error of a depth image, the fixed pattern noise of each pixel caused by the hardware of the photographing device can also be corrected. In one possible implementation, the FPN matrix is a difference matrix obtained by subtracting an initial average pixel matrix from a fitted average pixel matrix, where the fitted average pixel matrix is obtained by averaging the pixel matrices of the n fitted depth images, and the initial average pixel matrix is obtained by averaging the pixel matrices of the n initial depth images.
In this application, the fixed pattern noise matrix obtained through this calculation can be used to reduce the fixed pattern noise of a depth image.
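The FPN difference matrix described above amounts to one subtraction of two averages (a sketch; the name `fpn_matrix` is illustrative):

```python
import numpy as np

def fpn_matrix(fitted_images, initial_images):
    """FPN(x, y) = fitted average pixel matrix - initial average pixel matrix.
    The smooth fitted surface approximates the noise-free depth, so the
    per-pixel residual of the raw average is attributed to fixed pattern noise."""
    return np.mean(fitted_images, axis=0) - np.mean(initial_images, axis=0)
```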
In one possible implementation, correcting the first depth image through the field-of-view phase correction coefficient matrix to obtain the second depth image includes: multiplying, element by element, the pixel values with the same subscript in the pixel matrix of the first depth image and in the field-of-view phase correction coefficient matrix to obtain the second depth image.
The size of the first depth image may be the same as the size of the field-of-view phase correction coefficient matrix. For example, if the size of the field-of-view phase correction coefficient matrix S1(x, y) is 1024*1024, then the size of the pixel matrix of the first depth image is also 1024*1024.
This application provides the calculation process for correcting the field-of-view phase error.
In one possible implementation, correcting the second depth image through the fixed pattern noise FPN matrix to obtain the third depth image includes: multiplying, element by element, the pixel values with the same subscript in the pixel matrix of the first depth image and in the field-of-view phase correction coefficient matrix to obtain a product matrix; and calculating the sum of the product matrix and the FPN matrix to obtain the third depth image.
The size of the first depth image and the size of the FPN matrix are the same as the size of the field-of-view phase correction coefficient matrix. For example, if the size of the field-of-view phase correction coefficient matrix is 1024*1024, then the size of the pixel matrix of the first depth image and the size of the FPN matrix are also 1024*1024.
This application provides the calculation process for simultaneously correcting fixed pattern noise and field-of-view phase error.
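Both correction steps reduce to elementwise NumPy arithmetic (a sketch; the function name is illustrative, and matrices are assumed to share one shape, e.g. 1024*1024):

```python
import numpy as np

def correct_depth(d1, s1, fpn=None):
    """Second depth image: element-wise product of the first depth image D1(x, y)
    with the correction coefficient matrix S1(x, y). When an FPN matrix is
    supplied, the third depth image is that product plus the FPN matrix."""
    d2 = d1 * s1
    return d2 if fpn is None else d2 + fpn
```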
In a second aspect, this application provides a depth image processing method, the method including: acquiring n initial depth images, where the n initial depth images are depth images of a second object acquired n times by a photographing device through the time-of-flight (TOF) ranging method, and for the i-th of the n times the start-up moment of the light source of the photographing device is delayed by (i-1)*Δt relative to the start-up moment of the sensor of the photographing device, the value of i ranges over [1, n], n*|Δt| = k*T, T is the period of the light signal, n is an integer greater than 1, k is an integer, and Δt is a preset duration;
performing surface fitting on the n initial depth images respectively to obtain n fitted depth images; and
calculating a field-of-view phase correction coefficient matrix after averaging the n fitted depth images, where the field-of-view phase correction coefficient matrix is used to correct the field-of-view phase error of a depth image acquired by the photographing device through the TOF ranging method.
The second object may be a flat plane.
In this application, two properties of the depth nonlinear error are exploited: it oscillates periodically, and it is zero-sum (the sum of the errors over one complete period is zero). Sampling is performed multiple times with the light source of the photographing device started later than the sensor, the delay times are chosen so that their total is an integer multiple of the period of the sampled light signal, and the pixel matrices of the depth images obtained across these delays are then averaged. This reduces the depth nonlinear error and thus improves the accuracy of the resulting field-of-view phase correction coefficient matrix. The field-of-view phase correction coefficient matrix obtained in this application can therefore correct the field-of-view phase error of each pixel value in a depth image acquired by the photographing device, reducing the distortion rate of the depth image and improving its quality.
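The zero-sum property can be illustrated numerically: under a hypothetical sinusoidal model of the depth nonlinear error (an assumption for this sketch, not the patent's error model), n samples whose delays are spread uniformly over k full periods of the light signal average back to the true depth.

```python
import numpy as np

T = 1.0                    # one period of the light signal (arbitrary units)
n, k = 8, 1
dt = k * T / n             # delays 0, Δt, ..., (n-1)*Δt with n*Δt = k*T
true_depth = 2.5
err = lambda t: 0.05 * np.sin(2 * np.pi * t / T)  # hypothetical periodic error
samples = [true_depth + err(i * dt) for i in range(n)]
# uniformly spaced samples over a full period sum to zero error
assert abs(np.mean(samples) - true_depth) < 1e-9
```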
In one possible implementation, the initial depth image obtained at the i-th time is the i-th initial depth image, and acquiring the n initial depth images includes:
acquiring n measured depth images actually measured n times by the photographing device through the time-of-flight TOF ranging method, where the i-th measured depth image among the n measured depth images is represented by a pixel matrix Di(x, y)'; and
calculating the sum of each pixel value in the pixel matrix Di(x, y)' and c*[(1-i)*Δt]/2 to obtain the i-th initial depth image Di(x, y), where c*[(1-i)*Δt]/2 is the distance difference produced because the start-up moment of the light source is delayed by (i-1)*Δt relative to the start-up moment of the sensor.
In this application, the operation of increasing the depth by c*[(1-i)*Δt]/2 after applying the delay time (i-1)*Δt uses the theoretical depth offset of time-of-flight ranging to compensate for the distance difference caused by the delay, so that the initial depth image is obtained while the depth nonlinear error of the same pixel at different temporal phase points is also captured.
In one possible implementation, calculating the field-of-view phase correction coefficient matrix after averaging the n fitted depth images includes:
averaging the pixel matrices of the n fitted depth images to obtain a fitted average pixel matrix; and
extracting the minimum of the pixel values in the fitted average pixel matrix, and calculating the ratio of this minimum to each pixel value of the fitted average pixel matrix to obtain a ratio matrix, where the ratio matrix is the field-of-view phase correction coefficient matrix.
In this application, the field-of-view phase correction coefficient matrix is calculated, so that the field-of-view phase error of a depth image acquired by the photographing device through the TOF ranging method can be corrected.
In one possible implementation, after the surface fitting is performed on the n initial depth images respectively to obtain the n fitted depth images, the method further includes:
averaging the pixel matrices of the n initial depth images to obtain an initial average pixel matrix;
averaging the pixel matrices of the n fitted depth images to obtain a fitted average pixel matrix; and
calculating the difference between the fitted average pixel matrix and the initial average pixel matrix to obtain a difference matrix, where the difference matrix is a fixed pattern noise FPN matrix, and each value in the FPN matrix includes the fixed noise, caused by hardware, of the pixel with the same subscript in a depth image.
In this application, the fixed pattern noise matrix obtained through this calculation can be used to reduce the fixed pattern noise of a depth image.
In a third aspect, this application provides a depth image processing device, the device including:
an acquisition unit, configured to acquire a first depth image of a first object; and
a correction unit, configured to correct the depth values of the pixels in the first depth image through a field-of-view phase correction coefficient matrix to obtain a second depth image, where the field-of-view phase correction coefficient matrix is a matrix, obtained by preprocessing n initial depth images, for correcting field-of-view phase error, the n initial depth images are depth images obtained in a first situation, the first situation includes that the start-up moment of the light source of a photographing device and the start-up moment of the sensor of the photographing device are different moments, the preprocessing includes averaging based on the n initial depth images, and n is an integer greater than 1.
In one possible implementation, the preprocessing further includes surface fitting processing, and
the field-of-view phase correction coefficient matrix is calculated from n fitted depth images, where the n fitted depth images are obtained by performing surface fitting on the n initial depth images respectively, the n initial depth images are depth images of a second object obtained n times by the photographing device through the time-of-flight TOF ranging method, for the i-th of the n times the start-up moment of the light source is delayed by (i-1)*Δt relative to the start-up moment of the sensor, the value of i ranges over [1, n], n*|Δt| = k*T, T is the period of the light signal, k is an integer, and Δt is a preset duration.
In one possible implementation, the initial depth image obtained at the i-th time is the i-th initial depth image, and the i-th initial depth image is represented by a pixel matrix Di(x, y). The pixel matrix Di(x, y) is obtained by adding c*[(1-i)*Δt]/2 to each pixel value in a pixel matrix Di(x, y)', where Di(x, y)' is the pixel matrix of the depth image obtained in the i-th measurement, and c*[(1-i)*Δt]/2 is the distance difference produced because the start-up moment of the light source is delayed by (i-1)*Δt relative to the start-up moment of the sensor.
In one possible implementation, the field-of-view phase correction coefficient matrix being calculated from the n fitted depth images includes:
the field-of-view phase correction coefficient matrix is a ratio matrix obtained by dividing the minimum of the pixel values in a fitted average pixel matrix by each pixel value of that fitted average pixel matrix, where the fitted average pixel matrix is obtained by averaging the pixel matrices of the n fitted depth images.
In one possible implementation, the correction unit is further configured to:
correct the second depth image through a fixed pattern noise FPN matrix to obtain a third depth image, where each value in the FPN matrix includes the fixed noise, caused by hardware, of the pixel with the same subscript in the first depth image.
In one possible implementation, the FPN matrix is a difference matrix obtained by subtracting an initial average pixel matrix from a fitted average pixel matrix, where the fitted average pixel matrix is obtained by averaging the pixel matrices of the n fitted depth images, and the initial average pixel matrix is obtained by averaging the pixel matrices of the n initial depth images.
In one possible implementation, the correction unit is specifically configured to:
multiply, element by element, the pixel values with the same subscript in the pixel matrix of the first depth image and in the field-of-view phase correction coefficient matrix to obtain the second depth image.
In one possible implementation, the correction unit is specifically configured to:
multiply, element by element, the pixel values with the same subscript in the pixel matrix of the first depth image and in the field-of-view phase correction coefficient matrix to obtain a product matrix; and
calculate the sum of the product matrix and the FPN matrix to obtain the third depth image.
In a fourth aspect, this application provides a depth image processing device, the device including:
an acquisition unit, configured to acquire n initial depth images, where the n initial depth images are depth images of a second object acquired n times by a photographing device through the time-of-flight TOF ranging method, for the i-th of the n times the start-up moment of the light source of the photographing device is delayed by (i-1)*Δt relative to the start-up moment of the sensor of the photographing device, the value of i ranges over [1, n], n*|Δt| = k*T, T is the period of the light signal, n is an integer greater than 1, k is an integer, and Δt is a preset duration;
a surface fitting unit, configured to perform surface fitting on the n initial depth images respectively to obtain n fitted depth images; and a calculation unit, configured to calculate a field-of-view phase correction coefficient matrix after averaging the n fitted depth images, where the field-of-view phase correction coefficient matrix is used to correct the field-of-view phase error of a depth image acquired by the photographing device through the TOF ranging method.
In one possible implementation, the initial depth image obtained at the i-th time is the i-th initial depth image, and the acquisition unit is specifically configured to:
acquire n measured depth images actually measured n times by the photographing device through the time-of-flight TOF ranging method, where the i-th measured depth image among the n measured depth images is represented by a pixel matrix Di(x, y)'; and
calculate the sum of each pixel value in the pixel matrix Di(x, y)' and c*[(1-i)*Δt]/2 to obtain the i-th initial depth image Di(x, y), where c*[(1-i)*Δt]/2 is the distance difference produced because the start-up moment of the light source is delayed by (i-1)*Δt relative to the start-up moment of the sensor.
In one possible implementation, the calculation unit is specifically configured to:
average the pixel matrices of the n fitted depth images to obtain a fitted average pixel matrix; and
extract the minimum of the pixel values in the fitted average pixel matrix, and calculate the ratio of this minimum to each pixel value of the fitted average pixel matrix to obtain a ratio matrix, where the ratio matrix is the field-of-view phase correction coefficient matrix. In one possible implementation, the calculation unit is further configured to, after the surface fitting unit performs surface fitting on the n initial depth images respectively to obtain the n fitted depth images:
average the pixel matrices of the n initial depth images to obtain an initial average pixel matrix;
average the pixel matrices of the n fitted depth images to obtain a fitted average pixel matrix; and
calculate the difference between the fitted average pixel matrix and the initial average pixel matrix to obtain a difference matrix, where the difference matrix is a fixed pattern noise FPN matrix, and each value in the FPN matrix includes the fixed noise, caused by hardware, of the pixel with the same subscript in a depth image.
In a fifth aspect, this application provides a depth image processing device, the device including a processor, a communication interface, and a memory, where the memory, the communication interface, and the processor may be integrated together or coupled through a coupler, the memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory to perform the following operations:
acquiring a first depth image of a first object; and correcting the depth values of the pixels in the first depth image through a field-of-view phase correction coefficient matrix to obtain a second depth image, where the field-of-view phase correction coefficient matrix is a matrix, obtained by preprocessing n initial depth images, for correcting field-of-view phase error, the n initial depth images are depth images obtained in a first situation, the first situation includes that the start-up moment of the light source of a photographing device and the start-up moment of the sensor of the photographing device are different moments, the preprocessing includes averaging based on the n initial depth images, and n is an integer greater than 1.
The execution body of the depth image processing method may be the photographing device described above or another photographing device, and the photographing device or the other photographing device stores the data of the field-of-view phase correction coefficient matrix.
In one possible implementation, the preprocessing further includes surface fitting processing, and the field-of-view phase correction coefficient matrix is calculated from n fitted depth images, where the n fitted depth images are obtained by performing surface fitting on the n initial depth images respectively, the n initial depth images are depth images of a second object obtained n times by the photographing device through the time-of-flight TOF ranging method, for the i-th of the n times the start-up moment of the light source is delayed by (i-1)*Δt relative to the start-up moment of the sensor, the value of i ranges over [1, n], n*|Δt| = k*T, T is the period of the light signal, k is an integer, and Δt is a preset duration.
Optionally, the surface fitting method may be least-squares polynomial surface fitting or a surface fitting method using another fitting function.
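As one possible realization of this optional step, a least-squares polynomial surface fit over the pixel grid might look as follows (the function name and the default degree are illustrative assumptions, not specified by the patent):

```python
import numpy as np

def fit_surface(depth, deg=2):
    """Fit one depth image with a bivariate polynomial of total degree <= deg
    by ordinary least squares, and return the fitted (smoothed) depth image."""
    h, w = depth.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    # design matrix of monomials x^p * y^q with p + q <= deg
    cols = [(x**p * y**q).ravel() for p in range(deg + 1)
                                   for q in range(deg + 1 - p)]
    A = np.stack(cols, axis=1)
    coeff, *_ = np.linalg.lstsq(A, depth.ravel(), rcond=None)
    return (A @ coeff).reshape(h, w)
```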
In one possible implementation, the initial depth image obtained at the i-th time is the i-th initial depth image, and the i-th initial depth image is represented by a pixel matrix Di(x, y). The pixel matrix Di(x, y) is obtained by adding c*[(1-i)*Δt]/2 to each pixel value in a pixel matrix Di(x, y)', where Di(x, y)' is the pixel matrix of the depth image obtained in the i-th measurement, and c*[(1-i)*Δt]/2 is the distance difference produced because the start-up moment of the light source is delayed by (i-1)*Δt relative to the start-up moment of the sensor.
In one possible implementation, the field-of-view phase correction coefficient matrix being calculated from the n fitted depth images includes: the field-of-view phase correction coefficient matrix is a ratio matrix obtained by dividing the minimum of the pixel values in a fitted average pixel matrix by each pixel value of that fitted average pixel matrix, where the fitted average pixel matrix is obtained by averaging the pixel matrices of the n fitted depth images.
In one possible implementation, the depth image processing method further includes: correcting the second depth image through a fixed pattern noise FPN matrix to obtain a third depth image, where each value in the FPN matrix includes the fixed noise, caused by hardware, of the corresponding pixel in the first depth image. It should be noted that this application may be implemented by combining the method in the first aspect and/or the methods in the possible implementations described above.
In one possible implementation, the FPN matrix is a difference matrix obtained by subtracting an initial average pixel matrix from a fitted average pixel matrix, where the fitted average pixel matrix is obtained by averaging the pixel matrices of the n fitted depth images, and the initial average pixel matrix is obtained by averaging the pixel matrices of the n initial depth images.
In one possible implementation, correcting the first depth image through the field-of-view phase correction coefficient matrix to obtain the second depth image includes: multiplying, element by element, the pixel values with the same subscript in the pixel matrix of the first depth image and in the field-of-view phase correction coefficient matrix to obtain the second depth image.
In one possible implementation, correcting the second depth image through the fixed pattern noise FPN matrix to obtain the third depth image includes: multiplying, element by element, the pixel values with the same subscript in the pixel matrix of the first depth image and in the field-of-view phase correction coefficient matrix to obtain a product matrix; and calculating the sum of the product matrix and the FPN matrix to obtain the third depth image.
In a sixth aspect, this application provides a depth image processing device, the device including a processor, a communication interface, and a memory, where the memory, the communication interface, and the processor may be integrated together or coupled through a coupler, the memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory to perform the following operations: acquiring n initial depth images, where the n initial depth images are depth images of a second object acquired n times by a photographing device through the time-of-flight TOF ranging method, for the i-th of the n times the start-up moment of the light source of the photographing device is delayed by (i-1)*Δt relative to the start-up moment of the sensor of the photographing device, the value of i ranges over [1, n], n*|Δt| = k*T, T is the period of the light signal, n is an integer greater than 1, k is an integer, and Δt is a preset duration;
performing surface fitting on the n initial depth images respectively to obtain n fitted depth images; and
calculating a field-of-view phase correction coefficient matrix after averaging the n fitted depth images, where the field-of-view phase correction coefficient matrix is used to correct the field-of-view phase error of a depth image acquired by the photographing device through the TOF ranging method.
In one possible implementation, the initial depth image obtained at the i-th time is the i-th initial depth image, and acquiring the n initial depth images includes:
acquiring n measured depth images actually measured n times by the photographing device through the time-of-flight TOF ranging method, where the i-th measured depth image among the n measured depth images is represented by a pixel matrix Di(x, y)'; and
calculating the sum of each pixel value in the pixel matrix Di(x, y)' and c*[(1-i)*Δt]/2 to obtain the i-th initial depth image Di(x, y), where c*[(1-i)*Δt]/2 is the distance difference produced because the start-up moment of the light source is delayed by (i-1)*Δt relative to the start-up moment of the sensor.
在其中一种可能的实施方式中,上述根据上述n个拟合深度图像做均值处理后计算得到视场相位补正系数矩阵,包括:In one of the possible implementations, the above-mentioned calculation to obtain a field of view phase correction coefficient matrix after performing mean value processing according to the above-mentioned n fitted depth images, including:
将上述n个拟合深度图像的像素矩阵取平均得到拟合平均像素矩阵;Average the pixel matrices of the above n fitted depth images to obtain a fitted average pixel matrix;
提取上述拟合平均像素矩阵中的多个像素值的最小值，并计算上述最小值分别与上述拟合平均像素矩阵的各个像素值的比值得到比值矩阵，上述比值矩阵为上述视场相位补正系数矩阵。Extract the minimum of the pixel values in the fitted average pixel matrix, and calculate the ratio of this minimum to each pixel value of the fitted average pixel matrix to obtain a ratio matrix; this ratio matrix is the field-of-view phase correction coefficient matrix.
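A minimal sketch of the two steps above (an illustration, not the application's implementation; nested lists stand in for the pixel matrices of the fitted depth images):

```python
# Hedged sketch: field-of-view phase correction coefficient matrix from
# n fitted depth images of a flat calibration plane.
def fov_correction_matrix(fitted_images):
    n = len(fitted_images)
    rows, cols = len(fitted_images[0]), len(fitted_images[0][0])
    # element-wise mean over the n fitted depth images
    avg = [[sum(img[r][c] for img in fitted_images) / n for c in range(cols)]
           for r in range(rows)]
    d_min = min(min(row) for row in avg)  # smallest value of the fitted average matrix
    # ratio of that minimum to every pixel of the fitted average matrix
    return [[d_min / v for v in row] for row in avg]
```

Multiplying a depth image element-wise by this coefficient matrix scales each pixel toward the depth of the point nearest the lens, which is how the field-of-view phase error is corrected.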
在其中一种可能的实施方式中,上述对上述n个初始深度图像分别进行曲面拟合得到n个拟合深度图像之后,还包括:In one possible implementation manner, after performing surface fitting on the n initial depth images to obtain n fitted depth images, the method further includes:
将上述n个初始深度图像的像素矩阵取平均得到初始平均像素矩阵;Average the pixel matrices of the above n initial depth images to obtain an initial average pixel matrix;
将上述n个拟合深度图像的像素矩阵取平均得到拟合平均像素矩阵;Average the pixel matrices of the above n fitted depth images to obtain a fitted average pixel matrix;
计算上述拟合平均像素矩阵与上述初始平均像素矩阵取差值得到差值矩阵，上述差值矩阵为固定模式噪声FPN矩阵，该FPN矩阵中的每个值分别包括由硬件导致的深度图像中与所述每个值相同下标的像素的固定噪声。Calculate the difference between the fitted average pixel matrix and the initial average pixel matrix to obtain a difference matrix; the difference matrix is the fixed pattern noise (FPN) matrix, and each value in the FPN matrix includes the fixed noise, caused by hardware, of the pixel in the depth image with the same index as that value.
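For illustration only, the FPN matrix described above can be sketched as the element-wise difference of the two average matrices (nested lists used as matrices; names are assumptions):

```python
def fpn_matrix(initial_images, fitted_images):
    """Fixed pattern noise sketch: fitted average pixel matrix minus
    initial average pixel matrix, element-wise, over n images each."""
    n = len(initial_images)
    rows, cols = len(initial_images[0]), len(initial_images[0][0])

    def avg(images):
        # element-wise mean of n pixel matrices
        return [[sum(img[r][c] for img in images) / n for c in range(cols)]
                for r in range(rows)]

    init_avg, fit_avg = avg(initial_images), avg(fitted_images)
    return [[fit_avg[r][c] - init_avg[r][c] for c in range(cols)]
            for r in range(rows)]
```

Subtracting this matrix from a later depth image would then remove the per-pixel fixed noise contribution.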
第七方面,本申请提供一种装置,该装置包括处理器和通信接口,该装置被配置为执行上述第一方面任意一项所述的方法。In a seventh aspect, the present application provides an apparatus comprising a processor and a communication interface, the apparatus being configured to perform the method described in any one of the above-mentioned first aspect.
在其中一种可能的实施方式中，该装置为芯片或系统芯片(System on a Chip,SoC)。In one possible implementation, the apparatus is a chip or a system on a chip (System on a Chip, SoC).
第八方面，本申请提供一种装置，该装置包括处理器和通信接口，该装置被配置为执行上述第二方面任意一项所述的方法。In an eighth aspect, the present application provides an apparatus, the apparatus including a processor and a communication interface, the apparatus being configured to perform the method described in any one of the second aspect above.
在其中一种可能的实施方式中,该装置为芯片或系统芯片SoC。In one of the possible implementations, the device is a chip or a system-on-a-chip SoC.
第九方面,本申请提供一种计算机可读存储介质,该计算机可读存储介质存储有计算机程序,该计算机程序被处理器执行以实现上述第一方面任意一项所述的方法。In a ninth aspect, the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the method described in any one of the above-mentioned first aspect.
第十方面,本申请提供一种计算机可读存储介质,该计算机可读存储介质存储有计算机程序,该计算机程序被处理器执行以实现上述第二方面任意一项所述的方法。In a tenth aspect, the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and the computer program is executed by a processor to implement the method described in any one of the foregoing second aspects.
第十一方面,本申请提供一种计算机程序产品,当该计算机程序产品被计算机读取并执行时,上述第一方面任意一项所述的方法将被执行。In an eleventh aspect, the present application provides a computer program product, when the computer program product is read and executed by a computer, the method described in any one of the above-mentioned first aspect will be executed.
第十二方面,本申请提供一种计算机程序产品,当该计算机程序产品被计算机读取并执行时,上述第二方面任意一项所述的方法将被执行。In a twelfth aspect, the present application provides a computer program product, when the computer program product is read and executed by a computer, the method described in any one of the above-mentioned second aspects will be executed.
第十三方面,本申请提供一种计算机程序,当该计算机程序在计算机上执行时,将会使该计算机实现上述第一方面任意一项所述的方法。In a thirteenth aspect, the present application provides a computer program, which, when executed on a computer, enables the computer to implement the method described in any one of the above-mentioned first aspect.
第十四方面,本申请提供一种计算机程序,当该计算机程序在计算机上执行时,将会使该计算机实现上述第二方面任意一项所述的方法。In a fourteenth aspect, the present application provides a computer program, which, when executed on a computer, enables the computer to implement the method described in any one of the above-mentioned second aspects.
综上所述，深度非线性误差是由于光信号的波形不是标准波形(例如不是标准的正弦波等)导致的非线性误差，且该非线性误差的误差量类似于正弦波，随距离改变而震荡，误差量有正有负。因此，在本申请中，通过多次在拍摄设备的光源比传感器延迟启动的情况下采样以及根据上述n个初始深度图像所做的均值处理，使得非线性误差中正负的误差量可以互相抵消，从而可以减少测量过程中引入的深度非线性误差。同时基于该采样和均值处理等操作得到的视场相位补正系数矩阵可以补正深度图像中每个像素的深度值，减少了视场相位造成的误差，从而极大地降低了最终得到的深度图像的失真率，提高深度图像的质量。To sum up, the depth nonlinear error is a nonlinear error caused by the waveform of the optical signal not being a standard waveform (for example, not a standard sine wave); the magnitude of this error resembles a sine wave, oscillating as the distance changes, with both positive and negative error amounts. Therefore, in the present application, by sampling multiple times with the light source of the photographing device started later than the sensor, and by averaging the n initial depth images, the positive and negative error amounts of the nonlinear error cancel each other, reducing the depth nonlinear error introduced during measurement. Meanwhile, the field-of-view phase correction coefficient matrix obtained from this sampling and averaging can correct the depth value of each pixel in a depth image, reducing the error caused by the field-of-view phase and thereby greatly lowering the distortion rate of the final depth image and improving its quality.
附图说明Description of drawings
图1所示为本申请实施例提供的一种深度图像处理方法适用的场景示意图;FIG. 1 shows a schematic diagram of a scene to which a depth image processing method provided by an embodiment of the present application is applicable;
图2所示为本申请实施例提供的一种深度图像补正量的训练流程示意图;FIG. 2 shows a schematic diagram of a training flow of a depth image correction amount provided by an embodiment of the present application;
图3所示为本申请实施例提供的一种拍摄场景示意图;FIG. 3 shows a schematic diagram of a shooting scene provided by an embodiment of the present application;
图4所示为本申请实施例提供的一种深度非线性误差的示意图;FIG. 4 shows a schematic diagram of a depth nonlinear error provided by an embodiment of the present application;
图5所示为本申请实施例提供的一种深度图像处理方法的流程示意图;FIG. 5 shows a schematic flowchart of a depth image processing method according to an embodiment of the present application;
图6为本方案实施例提供的另一种深度图像处理方法的流程示意图;FIG. 6 is a schematic flowchart of another depth image processing method provided by the embodiment of this solution;
图7为本方案实施例提供的拍摄设备的逻辑结构示意图;FIG. 7 is a schematic diagram of the logical structure of the photographing device provided by the embodiment of this solution;
图8为本方案实施例提供的计算设备的逻辑结构示意图;FIG. 8 is a schematic diagram of a logical structure of a computing device provided by an embodiment of this solution;
图9为本方案实施例提供的拍摄设备的硬件结构示意图;FIG. 9 is a schematic diagram of the hardware structure of the photographing device provided by the embodiment of this solution;
图10为本方案实施例提供的计算设备的硬件结构示意图。FIG. 10 is a schematic diagram of a hardware structure of a computing device according to an embodiment of this solution.
具体实施方式Detailed Description
下面结合附图对本申请的实施例进行描述。The embodiments of the present application will be described below with reference to the accompanying drawings.
图1所示为本申请实施例提供的一种深度图像处理方法适用的一种场景示意图。该场景包括拍摄设备101和拍摄对象102。拍摄设备101中包括光源1011、控制器1012、传感器1013和处理器1014。其中,光源1011、控制器1012、传感器1013和处理器1014之间可以互相通过线路和/或接口连接。FIG. 1 is a schematic diagram of a scene to which a depth image processing method provided by an embodiment of the present application is applicable. The scene includes a photographing device 101 and a photographing subject 102 . The photographing device 101 includes a light source 1011 , a controller 1012 , a sensor 1013 and a processor 1014 . The light source 1011 , the controller 1012 , the sensor 1013 and the processor 1014 may be connected to each other through lines and/or interfaces.
拍摄设备101用于通过时间测距法对拍摄对象102进行摄像获取该拍摄对象102的深度图像。时间测距法可以包括例如飞行时间(time of flight,TOF)测距法和激光雷达测距法等。The photographing device 101 is used to photograph the photographing object 102 through a time ranging method to obtain a depth image of the photographing object 102. Time ranging methods may include, for example, the time-of-flight (TOF) ranging method and the lidar ranging method.
以TOF测距法为例介绍获取深度图像的过程。具体的，拍摄设备101中的控制器1012可以向光源1011和传感器1013发送启动信号启动光源1011和传感器1013。然后，光源1011向拍摄对象102发射光信号，光信号到达拍摄对象102之后反射回来，拍摄设备101通过传感器1013接收反射回来的反射光信号。然后，拍摄设备101可以通过接收的反射光信号和发送的光信号之间的相位差计算出光信号在拍摄设备101与拍摄对象102的任意一点之间传输的时间差t1j，该j的取值范围为[1,m]，该m为拍摄设备101向该拍摄对象102采样的点数，对应到拍摄对象102的深度图像，那么该m为该深度图像的像素矩阵中的像素个数。根据该时间差t1j和光信号的传播速度c计算出拍摄设备101与该任意一点之间的距离D1j：D1j=c*t1j/2。其中，t1j=φj/(2πf)，f为光信号的频率，φj=arctan[(Q3j-Q1j)/(Q0j-Q2j)]，Q0、Q1、Q2和Q3分别为传感器以0度、90度、180度和270度的时间域相位差曝光所获取的4张影像，Q0j、Q1j、Q2j和Q3j分别为Q0、Q1、Q2和Q3这四张影像中与该任意一点对应的像素点的像素值。Taking the TOF ranging method as an example, the process of acquiring a depth image is introduced. Specifically, the controller 1012 in the photographing device 101 may send an activation signal to the light source 1011 and the sensor 1013 to activate them. The light source 1011 then emits an optical signal toward the photographing object 102; the optical signal reaches the photographing object 102 and is reflected back, and the photographing device 101 receives the reflected optical signal through the sensor 1013. The photographing device 101 can then calculate, from the phase difference between the received reflected optical signal and the transmitted optical signal, the time difference t1j of the optical signal travelling between the photographing device 101 and any point of the photographing object 102, where the value range of j is [1, m], m being the number of points the photographing device 101 samples on the photographing object 102; mapped to the depth image of the photographing object 102, m is the number of pixels in the pixel matrix of the depth image. From the time difference t1j and the propagation speed c of the optical signal, the distance D1j between the photographing device 101 and that point is calculated: D1j=c*t1j/2. Here, t1j=φj/(2πf), where f is the frequency of the optical signal, φj=arctan[(Q3j-Q1j)/(Q0j-Q2j)], Q0, Q1, Q2 and Q3 are the four images acquired by the sensor with time-domain phase-shift exposures of 0, 90, 180 and 270 degrees respectively, and Q0j, Q1j, Q2j and Q3j are the pixel values of the pixel corresponding to that point in the four images Q0, Q1, Q2 and Q3.
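The four-phase depth computation just described can be sketched for a single pixel as follows. This is an illustration with assumed names and values, not the application's implementation; in particular, `atan2` is used in place of the plain arctangent so that the phase is resolved over a full [0, 2π) cycle:

```python
import math

C = 299_792_458.0  # propagation speed of the optical signal, m/s

def tof_depth(q0, q1, q2, q3, f):
    """Depth at one pixel from the exposures Q0..Q3 taken at time-domain
    phase shifts of 0/90/180/270 degrees; f is the modulation frequency in Hz."""
    phi = math.atan2(q3 - q1, q0 - q2) % (2 * math.pi)  # φ = arctan[(Q3-Q1)/(Q0-Q2)]
    t = phi / (2 * math.pi * f)                          # round-trip time difference
    return C * t / 2.0                                   # D = c * t / 2
```

For example, at f = 100 MHz a phase of π/2 corresponds to a depth of roughly 0.375 m.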
或者，拍摄设备101也可以记录发射光信号的时刻和记录接收到反射光信号的时刻，计算该两个时刻的差即可计算出光信号在拍摄设备101与拍摄对象102的任意一点之间传输的时间差t2j，然后，再根据该时间差t2j和光信号的传播速度c计算出拍摄设备101与该任意一点之间的距离D2j：D2j=c*t2j/2。Alternatively, the photographing device 101 may record the moment the optical signal is emitted and the moment the reflected optical signal is received; the difference between these two moments gives the time difference t2j of the optical signal travelling between the photographing device 101 and any point of the photographing object 102, and the distance D2j between the photographing device 101 and that point is then calculated from t2j and the propagation speed c of the optical signal: D2j=c*t2j/2.
通过这两种方式中的任一种方式，拍摄设备101可以获取到拍摄对象102的深度图像中每一个像素点的深度值，该深度值即为拍摄设备101与该像素点对应的拍摄对象102中的某一点之间的距离的值，也可以称为该像素点的像素值。获取该深度图像的每一个像素点的像素值之后即可得到该深度图像的像素矩阵，对于设备来说，该像素矩阵即为该拍摄对象102的深度图像。In either of these two ways, the photographing device 101 can obtain the depth value of each pixel in the depth image of the photographing object 102; the depth value is the value of the distance between the photographing device 101 and the point of the photographing object 102 corresponding to that pixel, and may also be called the pixel value of the pixel. After the pixel value of every pixel of the depth image is obtained, the pixel matrix of the depth image is obtained; for the device, this pixel matrix is the depth image of the photographing object 102.
需要说明的是,上述两种获取深度图像的方式中,根据TOF测距法测量得到的测量数据进行的深度值的计算可以在上述拍摄设备101中执行,也可以是拍摄设备101得到这些测量数据之后将这些测量数据发送给其它设备例如云端的服务器等来进行深度值的计算。另外,控制器1012还可以用于延迟控制,其功能为调整光源1011与传感器1013之间的启动的时间差,可用于各种校准。该控制器1012可以是集成芯片中的控制电路,其原理为利用电阻电容RC电路产生延迟,例如,控制器1012可以通过RC电路使得光源1011比传感器1013延迟启动或者提前启动。或者,该控制器也可以是通过软件来实现的控制单元,通过软件逻辑来调整光源1011与传感器1013之间的启动的时间差。It should be noted that, in the above two methods of acquiring depth images, the calculation of the depth value based on the measurement data obtained by the TOF ranging method may be performed in the above-mentioned photographing device 101, or the photographing device 101 may obtain these measurement data. These measurement data are then sent to other devices, such as servers in the cloud, for depth value calculation. In addition, the controller 1012 can also be used for delay control, the function of which is to adjust the time difference between the activation of the light source 1011 and the sensor 1013, which can be used for various calibrations. The controller 1012 can be a control circuit in an integrated chip, the principle of which is to use a resistor-capacitor RC circuit to generate a delay. For example, the controller 1012 can make the light source 1011 start later than the sensor 1013 or start earlier than the sensor 1013 through the RC circuit. Alternatively, the controller may also be a control unit implemented by software, and the time difference between the activation of the light source 1011 and the sensor 1013 is adjusted by software logic.
需要说明的是，在本申请中，光源1011的启动指的是光源1011开始向拍摄对象102发射光信号，传感器1013的启动指的是传感器1013开始能够接收拍摄对象102反射回来的光信号。而光源1011比传感器1013延迟启动指的是光源1011开始发射光信号的时刻比传感器1013开始能够接收反射光信号的时刻延迟。光源1011比传感器1013提前启动指的是光源1011开始发射光信号的时刻比传感器1013开始能够接收反射光信号的时刻提前。激光雷达测距法可以通过激光扫描的方式得到场景的三维信息。其基本原理是按照一定时间间隔向空间发射激光，并记录各个扫描点的信号从激光雷达到被测场景中的物体(例如拍摄对象102)，随后又经过物体反射回到激光雷达的相隔时间，据此推算出物体表面与激光雷达之间的距离，从而可以得到被测场景中的物体的深度图像。It should be noted that, in this application, the activation of the light source 1011 means that the light source 1011 starts to emit optical signals toward the photographing object 102, and the activation of the sensor 1013 means that the sensor 1013 becomes able to receive the optical signal reflected by the photographing object 102. That the light source 1011 starts later than the sensor 1013 means that the moment the light source 1011 starts to emit optical signals is later than the moment the sensor 1013 becomes able to receive reflected optical signals; that the light source 1011 starts earlier than the sensor 1013 means that the moment the light source 1011 starts to emit optical signals is earlier than the moment the sensor 1013 becomes able to receive reflected optical signals. The lidar ranging method can obtain three-dimensional information of a scene by laser scanning. Its basic principle is to emit laser pulses into space at certain time intervals and, for each scanning point, record the elapsed time for the signal to travel from the lidar to an object in the measured scene (for example, the photographing object 102) and back to the lidar after reflection; from this, the distance between the object surface and the lidar is derived, so that a depth image of the objects in the measured scene can be obtained.
拍摄设备101可以是各种类型的相机、手机、平板、电脑、各种类型的摄像头以及激光雷达等等。光源1011可以是发光二极管(light emitting diode,LED)或激光器等,激光器可以包括垂直腔面发射激光器(vertical-cavity surface-emitting laser,VCSEL)或激光二极管等。控制器1012可以是控制电路或者控制芯片等。传感器1013可以是由感光元件组成的光感传感器或者图像传感器或者说影像传感器等。The photographing device 101 may be various types of cameras, mobile phones, tablets, computers, various types of cameras, lidars, and the like. The light source 1011 may be a light emitting diode (LED) or a laser or the like, and the laser may include a vertical-cavity surface-emitting laser (VCSEL) or a laser diode or the like. The controller 1012 may be a control circuit or a control chip or the like. The sensor 1013 may be a photosensitive sensor or an image sensor or an image sensor or the like composed of photosensitive elements.
处理器1014可以包括一个或多个处理器,例如,处理器1014可以包括一个或多个中央处理器(central processing unit,CPU)和一个或多个图形处理器(graphics processing unit,GPU)。当处理器1014包括多个处理器时,这多个处理器可以集成在同一块芯片上,也可以各自为独立的芯片。The processor 1014 may include one or more processors, for example, the processor 1014 may include one or more central processing units (CPUs) and one or more graphics processing units (GPUs). When the processor 1014 includes multiple processors, the multiple processors may be integrated on the same chip, or may be independent chips.
在本申请实施例中,CPU可以用于控制上述控制器1012、光源1011和传感器1013完成上述时间测距法的实现,从而获取到测量的数据。GPU可以用于根据这些测量的数据进行计算获得深度图像。这些计算可以包括获取视场相位补正系数矩阵的计算、获取FPN矩阵的计算、补正深度图像的像素值的视场相位误差的计算以及消减深度图像的像素值的固定模式噪声的计算中的一项或多项。这些具体的计算过程可以参见下面的描述,此处暂不详述。In this embodiment of the present application, the CPU may be used to control the above-mentioned controller 1012 , the light source 1011 and the sensor 1013 to implement the above-mentioned time ranging method, thereby acquiring the measured data. The GPU can be used to perform calculations based on these measured data to obtain depth images. These calculations may include one of a calculation of acquiring a field-of-view phase correction coefficient matrix, a calculation of acquiring an FPN matrix, a calculation of correcting a field-of-view phase error of pixel values of a depth image, and a calculation of reducing fixed pattern noise of pixel values of a depth image or more. These specific calculation processes can be found in the following description, which will not be described in detail here.
当然,在另一种可能的实施方式中,处理器1014可以包括一个或多个中央处理器CPU,上述各种计算都可以由CPU来完成。Of course, in another possible implementation, the processor 1014 may include one or more central processing units (CPUs), and the above-mentioned various calculations may be performed by the CPUs.
本申请实施例提供的一种深度图像处理方法适用的场景示意图不限于上述图1所示场景,可以应用本申请实施例提供的深度图像处理方法的场景均在本申请的保护范围之内。A schematic diagram of a scene to which the depth image processing method provided by the embodiment of the present application is applicable is not limited to the scene shown in FIG. 1 above, and the scenes to which the depth image processing method provided by the embodiment of the present application can be applied are all within the protection scope of the present application.
在具体实施例中，拍摄设备在获取拍摄对象的深度图像时，由于温度偏移造成的误差、深度非线性误差、全局时间偏移造成的误差、固定模式噪声FPN以及视场相位误差等导致获得的深度图像失真。其中，视场相位误差为由于视场的角度偏差导致的误差，固定噪声为由硬件导致的在深度图像的每一个像素中产生的固定误差。In specific embodiments, when the photographing device acquires a depth image of the photographed object, errors caused by temperature offset, depth nonlinear errors, errors caused by global time offset, fixed pattern noise (FPN) and field-of-view phase errors distort the obtained depth image. The field-of-view phase error is the error caused by the angular deviation of the field of view, and the fixed noise is the fixed error, caused by hardware, produced in every pixel of the depth image.
为了降低深度图像的失真率，获取高质量的深度图像，本申请提供了一种深度图像处理方法，可以通过视场相位补正系数矩阵来补正深度图像的视场相位误差和/或通过FPN矩阵来减少固定模式噪声，从而能够极大地降低获得的深度图像的失真率，提高深度图像的质量。In order to reduce the distortion rate of depth images and obtain high-quality depth images, the present application provides a depth image processing method that can correct the field-of-view phase error of a depth image through a field-of-view phase correction coefficient matrix and/or reduce fixed pattern noise through an FPN matrix, which can greatly lower the distortion rate of the obtained depth image and improve its quality.
在本申请实施例中，可以提前训练获得视场相位补正系数矩阵和/或FPN矩阵，训练获得的视场相位补正系数矩阵和/或FPN矩阵是消减了深度非线性误差之后获得的，从而可以进一步降低深度图像的失真率。In the embodiments of this application, the field-of-view phase correction coefficient matrix and/or the FPN matrix can be obtained by training in advance; the trained matrices are obtained after the depth nonlinear error has been reduced, which can further lower the distortion rate of depth images.
下面首先介绍获得视场相位补正系数矩阵和FPN矩阵的训练过程。参见图2,该训练过程可以包括但不限于如下步骤:The following first introduces the training process for obtaining the field-of-view phase correction coefficient matrix and the FPN matrix. Referring to Figure 2, the training process may include but is not limited to the following steps:
S201、搭建拍摄设备和拍摄对象。S201. Build shooting equipment and shooting objects.
该训练过程中，拍摄对象可以是一个平整的平面，即以平整平面作为标定基准进行训练，可以参见图3，图3所示为训练的场景示意图。在图3中，该平面301在拍摄设备的光源302发出的光信号可到达的范围之内，这样可以保证采样到平面上的每一个点。且拍摄设备的镜头光轴与该平面301垂直。图3中所示的平面301中的A点可以是离镜头最近的点，该A点可以是该平面301中的任意一点，不限于是该平面301的中心点。搭建好平面和拍摄设备之后，即可开始进行拍摄。In this training process, the photographed object may be a flat plane, that is, a flat plane is used as the calibration reference for training; see FIG. 3, a schematic diagram of the training scene. In FIG. 3, the plane 301 is within the range reachable by the optical signal emitted by the light source 302 of the photographing device, which guarantees that every point on the plane can be sampled, and the optical axis of the lens of the photographing device is perpendicular to the plane 301. Point A in the plane 301 shown in FIG. 3 may be the point closest to the lens; point A may be any point in the plane 301 and is not limited to the center point of the plane 301. Once the plane and the photographing device are set up, shooting can begin.
S202、基于时间测距法测量获取拍摄对象的深度图像。S202. Measure and acquire a depth image of the photographed object based on the time ranging method.
在本申请实施例的训练过程中不管是采用TOF测距法还是激光雷达测距法或者其它测距法，需要多次测量获取拍摄对象的多个深度图像，每次测量都可以得到拍摄对象的一个深度图像。In the training process of the embodiments of this application, whether the TOF ranging method, the lidar ranging method or another ranging method is used, multiple measurements are needed to acquire multiple depth images of the photographed object; each measurement yields one depth image of the photographed object.
需要说明的是,通过多次测量得到测量数据之后,可以在拍摄设备上根据这些测量数据进行后续的各种计算,或者,也可以是拍摄设备得到这些测量数据之后将这些测量数据发送给其它设备例如云端的服务器等来进行后续的各种计算,或者,也可以是拍摄设备和该其它设备例如云端的服务器各自完成一部分的计算等。这些计算可以包括基于这些测量数据进行的深度值的计算、曲面拟合、取平均和取比值等计算,下面会介绍这些计算过程,此处暂不详述。为了便于描述,下面将执行这些计算操作的设备称为计算设备,该计算设备既可以是拍摄设备,或者,也可以是云端服务器等。It should be noted that after obtaining the measurement data through multiple measurements, various subsequent calculations can be performed on the shooting device according to the measurement data, or the shooting device can send the measurement data to other devices after obtaining the measurement data. For example, a server in the cloud performs various subsequent calculations, or, the photographing device and the other device, such as a server in the cloud, each perform a part of the calculation and the like. These calculations may include calculations of depth values, surface fitting, averaging, and ratio calculations based on these measurement data. These calculation processes will be introduced below, and will not be described in detail here. For ease of description, a device that performs these computing operations is referred to as a computing device below, and the computing device may be a photographing device or a cloud server or the like.
在具体实施例中，可以进行n次测量，该n次中第i次的光源的启动时刻比传感器的启动时刻延迟(i-1)*Δt，该i的取值范围为[1,n]，n*︱Δt︱=k*T，该T为光源发出的光信号的周期，该n为大于1的整数，该k为整数，该Δt为预设时长。In a specific embodiment, n measurements may be performed, where at the i-th of the n measurements the start-up moment of the light source is delayed by (i-1)*Δt from the start-up moment of the sensor; the value range of i is [1, n], n*︱Δt︱=k*T, T is the period of the optical signal emitted by the light source, n is an integer greater than 1, k is an integer, and Δt is a preset duration.
该Δt可以是正数也可以负数。当该Δt为正数时,那么,上述光源的启动时刻比传感器的启动时刻延迟(i-1)*Δt指的是传感器比光源早启动(i-1)*Δt时长;当该Δt为负数时,那么,上述光源的启动时刻比传感器的启动时刻延迟(i-1)*Δt指的是光源比传感器早启动(i-1)*(-Δt)时长。该Δt可以是0以外的任意一个取值。该光源和传感器即为拍摄设备中的光源和传感器,例如,可以是图1中所示的光源1011和传感器1013。The Δt may be a positive number or a negative number. When the Δt is a positive number, then the activation time of the light source is delayed (i-1)*Δt from the activation time of the sensor, which means that the sensor is activated earlier than the light source by (i-1)*Δt; when the Δt is a negative number Then, the activation time of the light source is delayed by (i-1)*Δt from the activation time of the sensor, which means that the light source is activated earlier than the sensor by (i-1)*(-Δt) time. The Δt can be any value other than 0. The light source and the sensor are the light source and the sensor in the photographing device, for example, the light source 1011 and the sensor 1013 shown in FIG. 1 .
上述光源的启动时刻指的是光源开始发射光信号的时刻,具体的,光源的启动时刻可以是光源向拍摄对象例如上述图3所示的平面301发射光信号的时刻。The start-up time of the light source refers to the time when the light source starts to emit light signals. Specifically, the start-up time of the light source may be the time when the light source emits light signals to the photographed object such as the plane 301 shown in FIG. 3 .
上述传感器的启动时刻指的是传感器开始能够接收反射光信号的时刻，具体的，传感器开始能够接收反射光信号的时刻可以是传感器开始能够接收拍摄对象例如上述图3所示的平面301反射的光信号的时刻。该反射的光信号是光源发射的光信号到达拍摄对象例如上述图3所示的平面301后向拍摄设备反射回来的光信号，然后拍摄设备通过该传感器来接收该反射的光信号。The start-up moment of the sensor refers to the moment the sensor becomes able to receive the reflected optical signal; specifically, it may be the moment the sensor becomes able to receive the optical signal reflected by the photographed object, for example the plane 301 shown in FIG. 3. The reflected optical signal is the optical signal that, after the optical signal emitted by the light source reaches the photographed object such as the plane 301 shown in FIG. 3, is reflected back toward the photographing device, which then receives it through the sensor.
为了便于理解上述光源的启动时刻和传感器的启动时刻,下面举例说明。示例性地,在第1秒时,光源开始发射光信号,即光源的启动时刻为该第1秒;在第2秒时,传感器开始能够接收反射光信号,即该传感器的启动时刻为该第2秒;在第3秒时,传感器开始接收到反射光信号。这里,在第2秒时传感器可以开始去接收反射光信号,但是还没有反射光信号传输到达传感器,因此第2秒时传感器没有接收到反射光信号。在第3秒时,反射光信号开始传输到达传感器,因此传感器在第3秒开始接收到反射光信号。这里只是示例性地介绍说明,至于具体的光源的启动时刻、传感器的启动时刻以及传感器开始接收到反射光信号的时刻根据实际情况确定,本方案对此不作限制。In order to facilitate the understanding of the start-up time of the light source and the start-up time of the sensor, an example is given below. Exemplarily, at the first second, the light source starts to emit light signals, that is, the start time of the light source is the first second; at the second second, the sensor begins to receive the reflected light signal, that is, the start time of the sensor is the first second. 2 seconds; at 3 seconds, the sensor begins to receive reflected light signals. Here, the sensor can start to receive the reflected light signal at the second second, but the reflected light signal has not yet transmitted to the sensor, so the sensor does not receive the reflected light signal at the second second. At the 3rd second, the reflected light signal starts to transmit to the sensor, so the sensor starts to receive the reflected light signal at the 3rd second. This is just an exemplary description. The specific start-up time of the light source, the start-up time of the sensor, and the time when the sensor starts to receive the reflected light signal are determined according to the actual situation, which is not limited in this solution.
上述n次中第i次的光源的启动时刻比传感器的启动时刻延迟(i-1)*Δt表明，对于光源的启动时刻比传感器的启动时刻延迟，每一次的测量都比前一次的测量多延迟Δt。That the start-up moment of the light source at the i-th of the n times is delayed by (i-1)*Δt from the start-up moment of the sensor indicates that, with the light source starting later than the sensor, each measurement is delayed by Δt more than the previous one.
示例性地,假设n为4,那么:Exemplarily, assuming n is 4, then:
第1次测量时,光源的启动时刻比传感器的启动时刻延迟0,即光源和传感器同时启动,该同时启动指的是光源开始发射光信号的时刻和传感器开始能够接收反射光信号的时刻相同;In the first measurement, the start-up time of the light source is 0 delayed from the start-up time of the sensor, that is, the light source and the sensor start at the same time, which means that the time when the light source starts to emit light signals is the same as the time when the sensor starts to receive reflected light signals;
第2次测量时,光源的启动时刻比传感器的启动时刻延迟Δt;In the second measurement, the start-up time of the light source is delayed by Δt from the start-up time of the sensor;
第3次测量时,光源的启动时刻比传感器的启动时刻延迟2*Δt;In the third measurement, the start-up time of the light source is delayed by 2*Δt from the start-up time of the sensor;
第4次测量时,光源的启动时刻比传感器的启动时刻延迟3*Δt。In the fourth measurement, the activation time of the light source is delayed by 3*Δt from the activation time of the sensor.
这里只是举例说明,至于进行几次测量,n的具体取值根据实际需要确定,本方案对此不做限制。This is just an example. As for several measurements, the specific value of n is determined according to actual needs, which is not limited in this solution.
上述n*︱Δt︱=k*T表明,n次延迟的总时间量是光源发出的光信号的周期的整数倍。具体的,上述光源和传感器的启动时刻可以由控制器来控制,该控制器可以是图1中所述的控制器1012。即控制器可以利用电阻电容RC电路产生延迟或者利用软件来产生延迟,从而实现光源比传感器延迟启动(i-1)*Δt。The above n*︱Δt︱=k*T indicates that the total time amount of n delays is an integer multiple of the period of the optical signal emitted by the light source. Specifically, the activation timing of the above-mentioned light sources and sensors may be controlled by a controller, and the controller may be the controller 1012 described in FIG. 1 . That is, the controller can use the resistor-capacitor RC circuit to generate a delay or use software to generate a delay, so that the light source can be started later than the sensor by (i-1)*Δt.
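As an illustration only (names and values are assumptions, not from the application), the delay schedule and the n*︱Δt︱=k*T constraint could be written as:

```python
def delay_schedule(n, delta_t, period):
    """Start-up delays (i-1)*Δt of the light source relative to the sensor,
    for i = 1..n; rejects schedules where n*|Δt| is not an integer multiple
    of the optical-signal period T."""
    k = n * abs(delta_t) / period
    if abs(k - round(k)) > 1e-9 or round(k) == 0:
        raise ValueError("n*|Δt| must equal k*T for a nonzero integer k")
    return [(i - 1) * delta_t for i in range(1, n + 1)]
```

For example, n = 4 with Δt = T/4 yields the delays 0, Δt, 2Δt, 3Δt used in the four measurements above, and the total n*|Δt| spans exactly one signal period.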
上述光源比传感器延迟启动指的是光源开始发射光信号的时刻比传感器开始能够接收反射光信号的时刻延迟。光源比传感器提前启动指的是光源开始发射光信号的时刻比传感器开始能够接收反射光信号的时刻提前。The above-mentioned delay in starting the light source from the sensor means that the time when the light source starts to emit the light signal is delayed from the time when the sensor starts to receive the reflected light signal. The light source starts earlier than the sensor means that the time when the light source starts to emit the light signal is earlier than the time when the sensor starts to receive the reflected light signal.
上述进行n次测量，并设置延迟Δt使得n*︱Δt︱=k*T是为了减少深度非线性误差。当主动式光源的发射波型与传感器的激活波型不是完美的正弦波时，所测得的深度值会产生非线性误差。非线性误差在不同的距离下有不同的误差值，其误差量与深度的关系与正弦波类似，随距离改变而震荡，且具备周期重复性。如图4所示为100MHz频率下的深度非线性误差示例，其中x轴为深度值，y轴为深度对应的非线性误差量，可观察到此非线性误差存在周期性震荡与0和(完整周期内之误差总和为0)两种特性。下面再介绍如何利用周期性震荡与0和这两个特性来消减深度非线性误差，此处暂不详述。Performing the above n measurements and setting the delay Δt so that n*︱Δt︱=k*T is intended to reduce the depth nonlinear error. When the emission waveform of the active light source and the activation waveform of the sensor are not perfect sine waves, the measured depth values contain a nonlinear error. The nonlinear error has different values at different distances; the relationship between the error amount and the depth resembles a sine wave, oscillating as the distance changes, with periodic repeatability. FIG. 4 shows an example of the depth nonlinear error at a frequency of 100 MHz, where the x-axis is the depth value and the y-axis is the nonlinear error corresponding to that depth. It can be observed that this nonlinear error has two characteristics: periodic oscillation, and zero sum (the sum of the errors over a complete period is 0). How these two characteristics are used to cancel the depth nonlinear error is introduced later and not detailed here.
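To make the cancellation concrete, the following toy model (an assumption for illustration, not the application's error model) treats the nonlinear error as a sine of the depth with spatial period P. Shifting the i-th sample by an extra known distance P*(i-1)/n and then subtracting that shift back makes the n error terms sample n equally spaced phases of one full sine period, so they sum to zero in the mean:

```python
import math

def averaged_depth(true_depth, n, amp, period_m):
    """Simulate n measurements whose start-up delays add known distance
    shifts of period_m*(i-1)/n; the sinusoidal nonlinear error then takes
    n equally spaced phases and cancels when the results are averaged."""
    total = 0.0
    for i in range(n):
        shift = period_m * i / n  # distance offset implied by the (i-1)*Δt delay
        err = amp * math.sin(2 * math.pi * (true_depth + shift) / period_m)
        total += (true_depth + shift + err) - shift  # undo the known offset
    return total / n
```

With n = 8, an error amplitude of several centimetres collapses to numerical noise, whereas a single measurement can be off by up to the full amplitude.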
示例性地，第1次测量时，拍摄设备中的光源和传感器同时启动，即光源启动向拍摄对象发射光信号的同时传感器也开始感应拍摄对象反射回来的光信号，然后，计算设备基于发射的光信号的信息和反射回来的光信号的信息根据上述对图1的描述中介绍的计算方法计算得到拍摄对象的深度图像的各个像素点的深度值，从而得到该拍摄对象的深度图像D1(x,y)，该深度图像D1(x,y)可以称为第1初始深度图像。其中，该D1(x,y)为该第1初始深度图像的像素矩阵，对于设备来说D1(x,y)即为该第1初始深度图像。该像素矩阵D1(x,y)中每个像素值为该拍摄设备与该每一个像素点对应的拍摄对象中某一点之间的距离的值即深度值。Exemplarily, in the first measurement, the light source and the sensor in the photographing device start at the same time, that is, while the light source starts emitting an optical signal toward the photographed object, the sensor also starts sensing the optical signal reflected by the photographed object. The computing device then calculates, based on the information of the emitted optical signal and the information of the reflected optical signal and according to the calculation method introduced in the description of FIG. 1 above, the depth value of each pixel of the depth image of the photographed object, thereby obtaining the depth image D1(x,y) of the photographed object; this depth image D1(x,y) may be called the first initial depth image. Here, D1(x,y) is the pixel matrix of the first initial depth image, and for the device D1(x,y) is the first initial depth image. Each pixel value in the pixel matrix D1(x,y) is the value of the distance between the photographing device and the point of the photographed object corresponding to that pixel, that is, the depth value.
上述D1(x,y)中,x表示像素矩阵的行数,y表示像素矩阵的列数,即D1(x,y)为x行y列的像素矩阵。下面所涉及的矩阵中的x和y也是分别表示矩阵的行数和列数,在此说明,下面将不再赘述。In the above D1(x,y), x represents the number of rows of the pixel matrix, and y represents the number of columns of the pixel matrix, that is, D1(x,y) is a pixel matrix of x rows and y columns. The x and y in the matrix involved below also represent the number of rows and columns of the matrix, respectively, which are described here and will not be repeated below.
In the i-th measurement (for i greater than 1), the activation time of the light source in the photographing device is delayed by (i-1)*Δt relative to the activation time of the sensor. In this case the light source emits a light signal toward the photographed object, the sensor senses the light signal reflected by the object, and the computing device calculates, based on the information of the emitted light signal and of the reflected light signal and according to the calculation method introduced in the description of FIG. 1 above, the depth value of each pixel of the depth image of the photographed object, thereby obtaining the depth image Di(x,y)' of the photographed object. Since the depth image Di(x,y)' is calculated directly from the measured data, it may be called the i-th measured depth image. Here, Di(x,y)' is the pixel matrix of the i-th measured depth image; for the device, Di(x,y)' is the i-th measured depth image. Each pixel value in the pixel matrix Di(x,y)' is the value, calculated from the measured data, of the distance between the photographing device and the point of the photographed object corresponding to that pixel, that is, a depth value.

However, since the activation of the light source is delayed by (i-1)*Δt, the reflected light signal received by the sensor is effectively delayed by (i-1)*Δt, so the calculated transmission time of the light signal increases by (i-1)*Δt, and the depth value calculated from that transmission time increases by c*[(i-1)*Δt]/2. Therefore, the true depth value of each pixel in the i-th measured depth image is obtained by subtracting c*[(i-1)*Δt]/2, or equivalently by adding c*[(1-i)*Δt]/2. That is, c*[(1-i)*Δt]/2 (or c*[(i-1)*Δt]/2) is the distance difference caused by the light source being activated (i-1)*Δt later than the sensor.

The i-th initial depth image finally calculated from the i-th measurement can then be represented by the pixel matrix Di(x,y); for the device, Di(x,y) is the i-th initial depth image. The pixel matrix Di(x,y) is obtained by adding c*[(1-i)*Δt]/2 to each pixel value in the pixel matrix Di(x,y)', or equivalently by subtracting c*[(i-1)*Δt]/2 from each pixel value in Di(x,y)'.
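The per-pixel compensation above can be sketched in code as follows; the 2x2 pixel matrix, the value of Δt, and the choice i=3 are hypothetical illustration values, not from the embodiment:

```python
import numpy as np

C = 299_792_458.0  # speed of light c, in m/s

def compensate_delay(di_measured, i, dt):
    """Add c*[(1-i)*Δt]/2 to each pixel of the i-th measured depth image
    Di(x,y)', removing the extra distance caused by starting the light
    source (i-1)*Δt later than the sensor."""
    return di_measured + C * (1 - i) * dt / 2.0

# Hypothetical 2x2 measured depth image from the 3rd measurement, Δt = 1 ns
dt = 1e-9
di_measured = np.array([[2.0, 2.1],
                        [2.2, 2.3]])   # meters, still includes the offset
di = compensate_delay(di_measured, i=3, dt=dt)
# The removed offset is c*2*1e-9/2 ≈ 0.2998 m per pixel
```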
For the first measurement above, since the sensor and the light source are activated at the same time and there is no time delay, the first measured depth image obtained from the measurement data is the first initial depth image itself, and no distance difference caused by a time delay needs to be considered. It should be noted that the n initial depth images D1(x,y)~Dn(x,y) calculated according to the above steps all have the same size, that is, D1(x,y)~Dn(x,y) are n matrices with equal numbers of rows and equal numbers of columns.

S203. Perform surface fitting on the n initial depth images obtained from the above measurements.
After obtaining the n initial depth images D1(x,y)~Dn(x,y), the computing device may perform surface fitting on each of them to obtain n fitted depth images, whose pixel matrices may be denoted Df1(x,y)~Dfn(x,y). The fitted depth image obtained by surface fitting of the i-th initial depth image is the i-th fitted depth image Dfi(x,y); for the device, Dfi(x,y) is the i-th fitted depth image.

It should be noted that the n fitted depth images Df1(x,y)~Dfn(x,y) calculated according to the above steps all have the same size, that is, Df1(x,y)~Dfn(x,y) are n matrices with equal numbers of rows and equal numbers of columns. In addition, the n fitted depth images have the same size as the n initial depth images, that is, D1(x,y)~Dn(x,y) and Df1(x,y)~Dfn(x,y) are 2*n matrices with equal numbers of rows and equal numbers of columns.
Optionally, the surface fitting method may be least-squares polynomial surface fitting or surface fitting with another suitable function. Specifically, after the pixel matrices of the n initial depth images are obtained, the approximate function type of the image may first be roughly judged from these pixel matrices: for example, drawing or simulation software may be used to plot the scatter images corresponding to these pixel matrices, and the approximate function type of the scatter image is then judged from experience. Once the approximate function has been determined, it can be used to fit the n initial depth images respectively, obtaining n fitted depth images.

Exemplarily, assuming that the n initial depth images can be fitted by least-squares polynomial surface fitting, the fitted polynomial may be equation (1): surface(x,y)=p00+p10*x+p01*y+p20*x^2+p11*x*y+p02*y^2; or the fitted polynomial may be equation (2): surface(x,y)=p00+p10*x+p01*y+p20*x^2+p11*x*y+p02*y^2+p30*x^3+p21*x^2*y+p12*x*y^2+p03*y^3+p40*x^4+p31*x^3*y+p22*x^2*y^2+p13*x*y^3+p04*y^4. Here, surface(x,y) is the above Dfi(x,y), and the p terms are the coefficients of the polynomial. From the depth image obtained by fitting, the depth value of any pixel can be calculated from the values of x and y.

In equation (1) the highest degree of any monomial is 2, so surface fitting with equation (1) may be called quadratic polynomial surface fitting. In equation (2) the highest degree of any monomial is 4, so surface fitting with equation (2) may be called quartic polynomial surface fitting. The highest monomial degree of the polynomial needs to be determined according to actual needs: if it is too small, the fitted surface differs considerably from the actual surface; if it is too large, overfitting occurs.

Polynomial surface fitting is introduced here only as an example; in specific embodiments, other suitable functions may also be used for surface fitting, which is not limited in this solution.
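A minimal NumPy sketch of least-squares fitting with equation (1) is shown below; the 8x8 test surface is a made-up example, and the solver (`numpy.linalg.lstsq`) is one possible implementation of the least-squares step, not the one prescribed by the embodiment:

```python
import numpy as np

def fit_quadratic_surface(d):
    """Least-squares fit of equation (1):
    surface(x,y) = p00 + p10*x + p01*y + p20*x^2 + p11*x*y + p02*y^2,
    where x is the row index and y is the column index of pixel matrix d.
    Returns the fitted depth image of the same shape."""
    rows, cols = d.shape
    x, y = np.meshgrid(np.arange(rows, dtype=float),
                       np.arange(cols, dtype=float), indexing="ij")
    x, y = x.ravel(), y.ravel()
    # Design matrix: one column per monomial of equation (1)
    a = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])
    p, *_ = np.linalg.lstsq(a, d.ravel(), rcond=None)
    return (a @ p).reshape(rows, cols)

# A surface that is itself quadratic is reproduced up to rounding error
xv, yv = np.meshgrid(np.arange(8.0), np.arange(8.0), indexing="ij")
d = 1.5 + 0.01 * xv + 0.02 * yv + 0.001 * xv**2
df = fit_quadratic_surface(d)
```

Fitting equation (2) would only add the degree-3 and degree-4 monomial columns to the design matrix; the solve step is unchanged.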
S204. Calculate the fitted average pixel matrix from the n fitted depth images calculated in S203.

After the n fitted depth images Df1(x,y)~Dfn(x,y) are obtained, the pixel matrices of the n fitted depth images may be averaged to obtain the fitted average depth image. The fitted average depth image may be represented by the fitted average pixel matrix Dfa(x,y); for the device, the fitted average pixel matrix Dfa(x,y) is the fitted average depth image. Specifically, the fitted average pixel matrix Dfa(x,y) may be obtained by averaging, for each position, the pixel values with the same subscript in the pixel matrices Df1(x,y)~Dfn(x,y) of the n fitted depth images. Exemplarily, the specific calculation formula may be as follows:
Dfa(x,y) = [Df1(x,y) + Df2(x,y) + … + Dfn(x,y)] / n
In the embodiments of this application, the same subscript refers to the position with the same row number and the same column number in a matrix; accordingly, pixel values with the same subscript are the pixel values at positions with the same row number and the same column number.

As described above, the depth nonlinearity error has the two characteristics of periodic oscillation and zero sum, and the measurement sampling process from Df1(x,y) to Dfn(x,y) happens to cover a complete oscillation period. Therefore, averaging the n fitted depth images Df1(x,y)~Dfn(x,y) to obtain Dfa(x,y) makes the nonlinear periodic oscillations cancel each other out, thereby reducing the depth nonlinearity error.
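The element-wise ("same subscript") averaging in S204 can be sketched as follows; the four 2x2 fitted pixel matrices are hypothetical values used only for illustration:

```python
import numpy as np

# Hypothetical fitted depth images Df1..Df4 (n = 4), each a 2x2 pixel matrix
fitted = [
    np.array([[1.00, 1.10], [1.20, 1.30]]),
    np.array([[1.02, 1.12], [1.22, 1.32]]),
    np.array([[0.98, 1.08], [1.18, 1.28]]),
    np.array([[1.00, 1.10], [1.20, 1.30]]),
]

# Average the pixel values with the same subscript (same row, same column):
# Dfa(x,y) = [Df1(x,y) + ... + Dfn(x,y)] / n
dfa = np.mean(np.stack(fitted), axis=0)
```

Stacking the n matrices along a new axis and averaging over that axis performs exactly the per-position mean described in the text.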
S205. Calculate the field-of-view phase correction coefficient matrix from the calculated fitted average pixel matrix.

After the fitted average pixel matrix Dfa(x,y) is calculated, the minimum value d0 in Dfa(x,y) is extracted; the point corresponding to this minimum value d0 may be a point in the region of the photographed object closest to the lens of the photographing device, for example point A in FIG. 3 above or a point in the region near point A. Then, the ratio of this minimum value d0 to each pixel value in the fitted average pixel matrix Dfa(x,y) is calculated to obtain the ratio matrix S1(x,y). This ratio matrix S1(x,y) is the above-mentioned field-of-view phase correction coefficient matrix.

Since the field-of-view phase correction coefficient matrix S1(x,y) is obtained as the ratio of the minimum value d0 to each pixel value in the fitted average pixel matrix Dfa(x,y), each value in the field-of-view phase correction coefficient matrix S1(x,y) is less than or equal to 1.
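S205 can be sketched as follows; the 2x2 fitted average pixel matrix is a hypothetical example:

```python
import numpy as np

# Hypothetical fitted average pixel matrix Dfa(x,y), values in meters
dfa = np.array([[1.00, 1.25],
                [1.25, 2.00]])

d0 = dfa.min()       # minimum value: point closest to the lens
s1 = d0 / dfa        # ratio matrix S1(x,y) = d0 over each pixel value

# Every coefficient is <= 1; the pixel holding d0 maps to exactly 1
```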
S206. Calculate the FPN matrix from the pixel matrices of the n initial depth images and the fitted average pixel matrix.

First, the pixel matrices of the n initial depth images are averaged to obtain the initial average depth image, which may be represented by the initial average pixel matrix Da(x,y); for the device, the initial average pixel matrix Da(x,y) is the initial average depth image. Specifically, the initial average pixel matrix Da(x,y) may be obtained by averaging, for each position, the pixel values with the same subscript in the pixel matrices D1(x,y)~Dn(x,y) of the n initial depth images. Exemplarily, the specific calculation formula may be as follows:
Da(x,y) = [D1(x,y) + D2(x,y) + … + Dn(x,y)] / n
Then, the difference between the fitted average pixel matrix Dfa(x,y) and the initial average pixel matrix Da(x,y) is calculated to obtain the difference matrix S2(x,y); the specific calculation formula may be as follows:

S2(x,y)=Dfa(x,y)-Da(x,y).

This difference matrix S2(x,y) is the above-mentioned FPN matrix.
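S206 can be sketched as follows; the 2x2 initial and fitted pixel matrices (here with n = 2) are hypothetical illustration values:

```python
import numpy as np

# Hypothetical initial depth images D1, D2 and their fitted counterparts
initial = [np.array([[1.03, 1.10], [1.21, 1.29]]),
           np.array([[1.01, 1.10], [1.19, 1.31]])]
fitted = [np.array([[1.00, 1.10], [1.20, 1.30]]),
          np.array([[1.00, 1.10], [1.20, 1.30]])]

da = np.mean(np.stack(initial), axis=0)   # initial average pixel matrix Da
dfa = np.mean(np.stack(fitted), axis=0)   # fitted average pixel matrix Dfa

s2 = dfa - da                             # FPN matrix S2(x,y) = Dfa - Da
```

Each entry of s2 is the per-pixel offset between the smooth fitted surface and the raw average, which is what the fixed pattern noise correction later subtracts out.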
It should be noted that S205 and S206 may be executed in any order: S205 may be executed first and then S206, or S206 first and then S205, or S205 and S206 may be executed simultaneously.
In one possible implementation, the field-of-view phase correction coefficient matrix may also be calculated as follows:

The ratio of each pixel value in the fitted average pixel matrix Dfa(x,y) to the minimum value d0 is calculated to obtain the ratio matrix S1(x,y)'; this ratio matrix S1(x,y)' may also serve as the field-of-view phase correction coefficient matrix.

Since the field-of-view phase correction coefficient matrix S1(x,y)' is obtained as the ratio of each pixel value in the fitted average pixel matrix Dfa(x,y) to the minimum value d0, each value in the field-of-view phase correction coefficient matrix S1(x,y)' is greater than or equal to 1.
In one possible implementation, the FPN matrix may also be calculated as follows:

The difference between the initial average pixel matrix Da(x,y) and the fitted average pixel matrix Dfa(x,y) is calculated to obtain the difference matrix S2(x,y)'; the specific calculation formula may be as follows:

S2(x,y)'=Da(x,y)-Dfa(x,y)

This difference matrix S2(x,y)' may also serve as the FPN matrix.
It should be noted that, in specific embodiments, only one of the field-of-view phase correction coefficient matrix and the FPN matrix needs to be obtained by training. For example, to correct the field-of-view phase error in a depth image, it suffices to train and obtain the field-of-view phase correction coefficient matrix; to reduce the fixed pattern noise in a depth image, it suffices to train and obtain the FPN matrix. Of course, to both correct the field-of-view phase error and reduce the fixed pattern noise in a depth image, both the field-of-view phase correction coefficient matrix and the FPN matrix may be obtained by training.

In one possible implementation, when calculating the field-of-view phase correction coefficient matrix, the surface fitting of the n initial depth images may be skipped: the pixel matrices of the n initial depth images are directly averaged to obtain the initial average depth image Da(x,y)', and the minimum value d0' in Da(x,y)' is extracted; the point corresponding to this minimum value d0' may be a point in the region of the photographed object closest to the lens of the photographing device, for example point A in FIG. 3 above or a point in the region near point A. Then, the ratio of the minimum value d0' to each pixel value in the initial average pixel matrix Da(x,y)' is calculated to obtain the ratio matrix S1(x,y)”; this ratio matrix S1(x,y)” is the above-mentioned field-of-view phase correction coefficient matrix. Alternatively, the ratio of each pixel value in the initial average pixel matrix Da(x,y)' to the minimum value d0' is calculated to obtain the ratio matrix S1(x,y)”', which may also serve as the field-of-view phase correction coefficient matrix.
The training method described above in relation to FIG. 2 exploits the two characteristics of the depth nonlinearity error, periodic oscillation and zero sum (the sum of the errors within a complete period is zero): sampling is performed multiple times with the light source of the photographing device activated later than the sensor, with the sum of the delays equal to an integer multiple of the period of the sampled light signal, and the pixel matrices of the depth images obtained with these delays are then averaged. This reduces the depth nonlinearity error and thereby improves the accuracy of the obtained field-of-view phase correction coefficient matrix. Therefore, the field-of-view phase correction coefficient matrix obtained in the embodiments of this application can correct the field-of-view phase error of each pixel value in a depth image obtained by the photographing device, thereby reducing the distortion rate of the depth image and improving its quality.

After the field-of-view phase correction coefficient matrix and/or the FPN matrix are obtained as above, the data of the obtained matrices may be stored in a photographing device and used to correct the errors of the pixel values in depth images acquired by that device. Optionally, this photographing device may be the same device as the one used in the above training to obtain the field-of-view phase correction coefficient matrix and the FPN matrix; or it may be a device of the same model as, or with the same or similar hardware performance to, the device used in that training; or it may be any device capable of acquiring depth images.
Based on the above introduction, a depth image processing method provided by an embodiment of this application is described below; the method includes a process in which a photographing device corrects the pixel errors of a depth image based on the above field-of-view phase correction coefficient matrix and/or FPN matrix. Referring to FIG. 5, the method may include but is not limited to the following steps:

S501. The photographing device acquires a first depth image of a first object.

The first object may be any object photographed by the photographing device, such as a planar object, a three-dimensional object, or a spatial object. Alternatively, the first depth image may also be the first initial depth image D1(x,y) from the training process described in FIG. 2 above. This solution places no restriction on the specific object being photographed.

Specifically, the controller in the photographing device may send a simultaneous activation signal to the light source and the sensor so that both are activated at the same time to measure the data used for calculating the first depth image. For details on how the first depth image is calculated from the measured data, refer to the relevant description in the introduction to FIG. 1 above, which is not repeated here.
S502. The photographing device corrects the depth values of the pixels in the first depth image through the field-of-view phase correction coefficient matrix to obtain a second depth image, where the field-of-view phase correction coefficient matrix is a matrix for correcting the field-of-view phase error obtained by preprocessing n initial depth images, the n initial depth images are depth images obtained in a first situation, the first situation includes the activation time of the light source of the photographing device being different from the activation time of the sensor of the photographing device, the preprocessing includes averaging performed on the n initial depth images, and n is an integer greater than 1.

In a specific implementation process, the first situation also includes the case where the activation time of the light source of the photographing device and the activation time of the sensor of the photographing device are the same. The above averaging performed on the n initial depth images may include averaging, for each position, the pixel values with the same subscript in the pixel matrices of the n initial depth images; or it may include first performing surface fitting on the n initial depth images to obtain n fitted depth images and then averaging, for each position, the pixel values with the same subscript in the pixel matrices of the n fitted depth images. The preprocessing also includes processing such as surface fitting and taking ratios. For the specific implementation of the averaging, surface fitting, and ratio processing, refer to the description of the training process in FIG. 2 above, which is not repeated here.
In specific embodiments, the data of the field-of-view phase correction coefficient matrix is already stored in the photographing device; this matrix may be the field-of-view phase correction coefficient matrix S1(x,y) obtained by training in the training process described in FIG. 2 above. The size of the first depth image may be the same as that of the field-of-view phase correction coefficient matrix S1(x,y); for example, assuming the size of S1(x,y) is 1024*1024, the size of the pixel matrix of the first depth image is also 1024*1024.

The field-of-view phase correction coefficient matrix S1(x,y) is calculated from the above n fitted depth images. Specifically, the field-of-view phase correction coefficient matrix is the ratio matrix obtained by taking the ratio of the minimum of the pixel values in the fitted average pixel matrix to each pixel value of the fitted average pixel matrix, where the fitted average pixel matrix is obtained by averaging, for each position, the pixel values with the same subscript in the pixel matrices of the n fitted depth images.

The n fitted depth images are obtained by performing surface fitting on the above n initial depth images respectively, and the n initial depth images are depth images of a second object obtained n times by the time-of-flight (TOF) ranging method with the photographing device used for the above training, where the second object is the plane used for that training. In the i-th of the n measurements, the activation time of the light source is delayed by (i-1)*Δt relative to the activation time of the sensor, where i takes values in [1,n], n*︱Δt︱=k*T, T is the period of the light signal, k is an integer, and Δt is a preset duration.

In addition, the initial depth image obtained in the i-th measurement is the i-th initial depth image, represented by the pixel matrix Di(x,y), which is obtained by adding c*[(1-i)*Δt]/2 to each pixel value in the pixel matrix Di(x,y)', where Di(x,y)' is the pixel matrix of the depth image obtained in the i-th measurement and c*[(1-i)*Δt]/2 is the distance difference caused by the light source being activated (i-1)*Δt later than the sensor.

For the specific training process for obtaining the field-of-view phase correction coefficient matrix S1(x,y), refer to the corresponding description of FIG. 2 above, which is not repeated here.
Then, correcting the depth values of the pixels in the first depth image through the field-of-view phase correction coefficient matrix to obtain the second depth image is specifically: calculating, for each position, the product of the pixel values with the same subscript in the pixel matrix of the first depth image and in the field-of-view phase correction coefficient matrix S1(x,y) to obtain a product matrix as the second depth image. This product matrix may be denoted DS(x,y), that is, the second depth image may be represented by DS(x,y).

Multiplying the elements with the same subscript of two matrices is called the dot product, whose symbol is ".*". The pixel matrix of the first depth image may be denoted A(x,y); the product matrix obtained by multiplying the pixel values with the same subscript in the pixel matrix of the first depth image and in the field-of-view phase correction coefficient matrix S1(x,y) may then be expressed as:

DS(x,y)=A(x,y).*S1(x,y).
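The ".*" (element-wise) correction maps directly onto array multiplication; in the sketch below, the 2x2 depth image and coefficient matrix are hypothetical values:

```python
import numpy as np

# Hypothetical first depth image A(x,y) and stored coefficient matrix S1(x,y)
a = np.array([[1.00, 1.25],
              [1.25, 2.00]])      # measured depth values, in meters
s1 = np.array([[1.00, 0.80],
               [0.80, 0.50]])     # coefficients trained as d0 / Dfa

# Element-wise ("dot") product: DS(x,y) = A(x,y) .* S1(x,y)
ds = a * s1                       # second (corrected) depth image
```

With these values every corrected pixel lands on the same depth, which is the intended effect when the first object is the flat training plane.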
In a possible implementation, the field-of-view phase correction coefficient matrix in S502 may be the above S1(x,y)'. In that case, the field-of-view phase correction coefficient matrix S1(x,y)' is calculated from the above n fitted depth images; specifically, it is the ratio matrix obtained by taking the ratio of each pixel value of the fitted average pixel matrix to the minimum of the pixel values in the fitted average pixel matrix, where the fitted average pixel matrix is obtained by averaging, for each position, the pixel values with the same subscript in the pixel matrices of the n fitted depth images. The other descriptions are the same as those for the field-of-view phase correction coefficient matrix S1(x,y) and are not repeated here.

Then, correcting the depth values of the pixels in the first depth image through the field-of-view phase correction coefficient matrix to obtain the second depth image is specifically: calculating, for each position, the ratio of the pixel values with the same subscript in the pixel matrix of the first depth image and in the field-of-view phase correction coefficient matrix S1(x,y)' to obtain a ratio matrix as the second depth image. From the training process shown in FIG. 2 above and its possible implementations, the values with the same subscript in S1(x,y)' and in S1(x,y) are reciprocals of each other, so this ratio matrix and the above product matrix DS(x,y) may be the same matrix; the ratio matrix may therefore be denoted DS(x,y), that is, the second depth image may be represented by DS(x,y).

In a possible implementation, the field-of-view phase correction coefficient matrix in S502 may be the above S1(x,y)”. Then, correcting the depth values of the pixels in the first depth image through the field-of-view phase correction coefficient matrix to obtain the second depth image is specifically: calculating, for each position, the product of the pixel values with the same subscript in the pixel matrix of the first depth image and in the field-of-view phase correction coefficient matrix S1(x,y)” to obtain a product matrix as the second depth image. This product matrix may be denoted DS(x,y)', that is, the second depth image may be represented by DS(x,y)'. The calculation process may be expressed as:

DS(x,y)'=A(x,y).*S1(x,y)”.

In a possible implementation, the field-of-view phase correction coefficient matrix in S502 may be the above S1(x,y)”'. Then, correcting the depth values of the pixels in the first depth image through the field-of-view phase correction coefficient matrix to obtain the second depth image is specifically: calculating, for each position, the ratio of the pixel values with the same subscript in the pixel matrix of the first depth image and in the field-of-view phase correction coefficient matrix S1(x,y)”' to obtain a ratio matrix as the second depth image. From the training process shown in FIG. 2 above and its possible implementations, the values with the same subscript in S1(x,y)”' and in S1(x,y)” are reciprocals of each other, so this ratio matrix and the above product matrix DS(x,y)' may be the same matrix; the ratio matrix may therefore be denoted DS(x,y)', that is, the second depth image may be represented by DS(x,y)'.
In the embodiments of this application, by sampling multiple times during the training process with the light source of the photographing device started later than the sensor, and by performing mean-value processing on the above n initial depth images, the positive and negative components of the nonlinear error cancel each other out, which reduces the depth nonlinear error introduced in the measurement process. Meanwhile, the field-of-view phase correction coefficient matrix obtained from the sampling and averaging operations can correct the depth value of each pixel in the depth image, reducing the error caused by the field-of-view phase, thereby greatly reducing the distortion rate of the final depth image and improving its quality.
In a possible implementation, referring to FIG. 6, the depth image processing method provided by the above embodiment of this application may further include, but is not limited to, the following steps:
S503: The photographing device corrects the depth value of each pixel in the second depth image with a fixed pattern noise (FPN) matrix to obtain a third depth image.
In a specific embodiment, the data of the FPN matrix is already stored in the photographing device, and the FPN matrix may be the FPN matrix S2(x,y) obtained by training in the training process described in FIG. 2.
The size of the first depth image and the size of the FPN matrix are the same as the size of the field-of-view phase correction coefficient matrix S1(x,y). For example, if the size of S1(x,y) is 1024*1024, then the size of the pixel matrix of the first depth image and the size of the FPN matrix are also 1024*1024.
That is, the FPN matrix is the difference matrix S2(x,y) obtained by subtracting the initial average pixel matrix from the fitted average pixel matrix, where the fitted average pixel matrix is obtained by averaging the pixel values with the same subscripts in the pixel matrices of the n fitted depth images, and the initial average pixel matrix is obtained by averaging the pixel values with the same subscripts in the pixel matrices of the n initial depth images. For the specific training process of obtaining the FPN matrix S2(x,y), refer to the corresponding description of FIG. 2 above, which is not repeated here.
Then, correcting the depth value of each pixel in the second depth image with the fixed pattern noise FPN matrix to obtain the third depth image is specifically: in the case where the matrix DS(x,y) is obtained in S502, computing the sum of the matrix DS(x,y) and the FPN matrix S2(x,y) to obtain the sum matrix Dc(x,y) as the third depth image. The specific formula is:
Dc(x,y) = DS(x,y) + S2(x,y).
Alternatively, in the case where the matrix DS(x,y)' is obtained in S502, computing the sum of the matrix DS(x,y)' and the FPN matrix S2(x,y) to obtain the sum matrix Dc(x,y)' as the third depth image. The specific formula is:
Dc(x,y)' = DS(x,y)' + S2(x,y).
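The FPN correction step above is plain element-wise addition. A minimal NumPy sketch with made-up values (not the patented implementation) also shows why the sign-flipped matrix S2(x,y)' described below gives the same result via subtraction:

```python
import numpy as np

# Field-corrected second depth image DS(x,y)' and FPN matrix S2(x,y);
# the 2x2 values are made up for illustration.
DS = np.array([[1.50, 1.52],
               [1.49, 1.51]])
S2 = np.array([[0.01, -0.02],
               [0.00, 0.015]])

# Dc(x,y)' = DS(x,y)' + S2(x,y)
Dc = DS + S2

# With the sign-flipped FPN matrix S2(x,y)' = -S2(x,y), subtraction
# yields the same third depth image.
S2_flipped = -S2
Dc_alt = DS - S2_flipped

assert np.allclose(Dc, Dc_alt)
```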
In one possible implementation, the FPN matrix in S503 may be S2(x,y)'. In that case, the FPN matrix S2(x,y)' is the difference matrix obtained by subtracting the fitted average pixel matrix Dfa(x,y) from the initial average pixel matrix Da(x,y). The other descriptions are the same as those for the FPN matrix S2(x,y) and are not repeated here.
Then, correcting the depth value of each pixel in the second depth image with the fixed pattern noise FPN matrix to obtain the third depth image is specifically: in the case where the matrix DS(x,y) is obtained in S502, computing the difference between the matrix DS(x,y) and the FPN matrix S2(x,y)' to obtain the difference matrix Dc(x,y)" as the third depth image. The specific formula is:
Dc(x,y)" = DS(x,y) - S2(x,y)'.
Based on the training process shown in FIG. 2 and its possible implementations, the values with the same subscripts in S2(x,y)' and the above S2(x,y) are opposite numbers of each other, so the difference matrix Dc(x,y)" and the above sum matrix Dc(x,y) can be the same matrix.
Alternatively, in the case where the matrix DS(x,y)' is obtained in S502, computing the difference between the matrix DS(x,y)' and the FPN matrix S2(x,y)' to obtain the difference matrix Dc(x,y)"' as the third depth image. The specific formula is:
Dc(x,y)"' = DS(x,y)' - S2(x,y)'.
Similarly, based on the training process shown in FIG. 2 and its possible implementations, the values with the same subscripts in S2(x,y)' and the above S2(x,y) are opposite numbers of each other, so the difference matrix Dc(x,y)"' and the above sum matrix Dc(x,y)' can be the same matrix.
In the embodiments of this application, besides correcting the field-of-view phase error of the depth image, the fixed pattern noise of each pixel caused by the hardware of the photographing device can also be corrected.
The foregoing mainly describes the depth image processing method provided by the embodiments of this application. It can be understood that, to implement the corresponding functions, each device includes corresponding hardware structures and/or software modules for performing the functions. Those skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented by hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
In the embodiments of this application, the device may be divided into functional modules according to the foregoing method examples. For example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of this application is schematic and is merely a logical function division; there may be other division manners in actual implementation.
In the case where each functional module is divided corresponding to each function, FIG. 7 shows a schematic diagram of a possible logical structure of a device, which may be the photographing device in the method described in FIG. 5 or FIG. 6. The photographing device 700 includes an acquisition unit 701 and a correction unit 702, where:
the acquisition unit 701 is configured to acquire a first depth image of a first object; and
the correction unit 702 is configured to correct the depth values of the pixels in the first depth image with a field-of-view phase correction coefficient matrix to obtain a second depth image, where the field-of-view phase correction coefficient matrix is a matrix for correcting the field-of-view phase error obtained after preprocessing n initial depth images, the n initial depth images are depth images obtained in a first situation, the first situation includes that the start-up moment of the light source of the photographing device and the start-up moment of the sensor of the photographing device are different moments, the preprocessing includes mean-value processing performed on the n initial depth images, and n is an integer greater than 1.
In a possible implementation, the correction unit 702 is further configured to:
correct the second depth image with a fixed pattern noise FPN matrix to obtain a third depth image, where each value in the FPN matrix includes the fixed noise, caused by hardware, of the pixel with the same subscript as that value in the first depth image.
In a possible implementation, the correction unit 702 is specifically configured to:
compute the products of the pixel matrix of the first depth image and the pixel values with the same subscripts in the field-of-view phase correction coefficient matrix to obtain the second depth image.
In a possible implementation, the correction unit 702 is specifically configured to:
compute the products of the pixel matrix of the first depth image and the pixel values with the same subscripts in the field-of-view phase correction coefficient matrix to obtain a product matrix; and
compute the sum of the product matrix and the FPN matrix to obtain the third depth image.
For the specific operations and beneficial effects of each unit in the photographing device 700 shown in FIG. 7, refer to the description of the method embodiments shown in FIG. 5 or FIG. 6 and their possible implementations, which is not repeated here.
In the case where each functional module is divided corresponding to each function, FIG. 8 shows a schematic diagram of a possible logical structure of a device, which may be the computing device in the method described in FIG. 2. The computing device 800 includes an acquisition unit 801, a surface fitting unit 802, and a calculation unit 803, where:
the acquisition unit 801 is configured to acquire n initial depth images, where the n initial depth images are depth images of a second object obtained n times by the photographing device using the time-of-flight (TOF) ranging method, the start-up moment of the light source of the photographing device at the i-th of the n times is delayed by (i-1)*Δt relative to the start-up moment of the sensor of the photographing device, i ranges over [1, n], n*|Δt| = k*T, T is the period of the optical signal, n is an integer greater than 1, k is an integer, and Δt is a preset duration;
the surface fitting unit 802 is configured to perform surface fitting on the n initial depth images respectively to obtain n fitted depth images; and
the calculation unit 803 is configured to perform mean-value processing on the n fitted depth images and then calculate a field-of-view phase correction coefficient matrix, where the field-of-view phase correction coefficient matrix is used to correct the field-of-view phase error of a depth image obtained by the photographing device using the TOF ranging method.
In one possible implementation, the initial depth image obtained at the i-th time is the i-th initial depth image, and the acquisition unit 801 is specifically configured to:
acquire n measured depth images actually measured n times by the photographing device using the time-of-flight TOF ranging method, where the i-th measured depth image of the n measured depth images is represented by a pixel matrix Di(x,y)'; and
compute the sum of each pixel value in the pixel matrix Di(x,y)' and c*[(1-i)*Δt]/2 to obtain the i-th initial depth image Di(x,y), where c*[(1-i)*Δt]/2 is the distance difference caused by the start-up moment of the light source being delayed by (i-1)*Δt relative to the start-up moment of the sensor.
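This offset-recovery step can be sketched as follows. Here c is the speed of light, and the Δt value and measured matrix are assumed illustrative numbers not specified in the text:

```python
import numpy as np

c = 3.0e8      # speed of light in m/s
dt = 1.25e-9   # preset delay step Δt (assumed illustrative value)

def initial_depth(Di_measured, i):
    """Recover the i-th initial depth image Di(x,y) from the measured
    matrix Di(x,y)' by adding the distance offset c*[(1-i)*Δt]/2 caused
    by starting the light source (i-1)*Δt after the sensor."""
    return Di_measured + c * (1 - i) * dt / 2.0

measured = np.array([[2.000, 2.010],
                     [2.005, 2.020]])    # made-up Di(x,y)' in metres
D1 = initial_depth(measured, 1)          # i = 1: no delay, no offset
D3 = initial_depth(measured, 3)          # i = 3: offset of -c*Δt
```

For i = 1 the offset is zero, so the initial depth image equals the measured one; for larger i every pixel shifts by the same delay-induced distance.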
In one possible implementation, the calculation unit 803 is specifically configured to:
average the pixel values with the same subscripts in the pixel matrices of the n fitted depth images respectively to obtain a fitted average pixel matrix; and
extract the minimum of the pixel values in the fitted average pixel matrix, and compute the ratios of the minimum to the respective pixel values of the fitted average pixel matrix to obtain a ratio matrix, where the ratio matrix is the field-of-view phase correction coefficient matrix.
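The averaging and minimum-ratio steps above can be sketched with NumPy; the 2*2 fitted images are hypothetical values, not data from the disclosure:

```python
import numpy as np

# Made-up fitted depth images from n = 2 delayed-start captures.
fitted = [np.array([[2.0, 2.2],
                    [2.1, 2.4]]),
          np.array([[2.0, 2.0],
                    [2.3, 2.0]])]

# Average pixel values with the same subscript across the n images.
Dfa = np.mean(fitted, axis=0)

# Ratio of the minimum averaged value to every element gives the
# field-of-view phase correction coefficient matrix S1(x,y)''.
S1 = Dfa.min() / Dfa
```

Every coefficient is at most 1, and the pixel holding the minimum average depth gets coefficient exactly 1, so multiplying a depth image by S1 pulls the field-of-view-induced surplus of the other pixels back toward that reference pixel.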
In one possible implementation, the calculation unit 803 is further configured to, after the surface fitting unit 802 performs surface fitting on the n initial depth images respectively to obtain the n fitted depth images:
average the pixel values with the same subscripts in the pixel matrices of the n initial depth images respectively to obtain an initial average pixel matrix;
average the pixel values with the same subscripts in the pixel matrices of the n fitted depth images respectively to obtain a fitted average pixel matrix; and
compute the difference between the fitted average pixel matrix and the initial average pixel matrix to obtain a difference matrix, where the difference matrix is a fixed pattern noise FPN matrix, and each value in the FPN matrix includes the fixed noise, caused by hardware, of the pixel with the same subscript as that value in a depth image.
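The FPN-matrix derivation above can be sketched in the same style; the matrices below are hypothetical illustration data:

```python
import numpy as np

# Made-up pixel matrices: n = 2 initial depth images and their
# surface-fitted counterparts.
initial = [np.array([[2.00, 2.21],
                     [2.12, 2.39]]),
           np.array([[2.00, 2.19],
                     [2.10, 2.41]])]
fitted = [np.array([[2.01, 2.20],
                    [2.10, 2.40]]),
          np.array([[2.01, 2.20],
                    [2.10, 2.40]])]

Da = np.mean(initial, axis=0)    # initial average pixel matrix
Dfa = np.mean(fitted, axis=0)    # fitted average pixel matrix

# FPN matrix: per-pixel difference between the fitted and initial
# averages; residuals that survive fitting are treated as fixed noise.
S2 = Dfa - Da
```

Pixels where the averaged measurements already lie on the fitted surface get an FPN value of zero; systematic per-pixel offsets appear as nonzero entries of S2(x,y).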
For the specific operations and beneficial effects of each unit in the computing device 800 shown in FIG. 8, refer to the description of the method embodiments shown in FIG. 2 and its possible implementations, which is not repeated here.
FIG. 9 is a schematic diagram of a possible hardware structure of a device provided by this application, which may be the photographing device in the method described in FIG. 5 or FIG. 6. The photographing device 900 includes a processor 901, a memory 902, and a communication interface 903. The processor 901, the communication interface 903, and the memory 902 may be connected to each other, or connected to each other through a bus 904.
Exemplarily, the memory 902 is configured to store the computer programs and data of the photographing device 900, and may include, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), compact disc read-only memory (CD-ROM), and the like. In the case where the embodiment shown in FIG. 7 is implemented and the units described in the embodiment of FIG. 7 are implemented by software, the software or program code required to perform the functions of the acquisition unit 701 and the correction unit 702 in FIG. 7 is stored in the memory 902.
The communication interface 903 is configured to support the photographing device 900 in communicating, for example, receiving or sending data or signals.
Exemplarily, the processor 901 may be a central processing unit, a graphics processing unit (GPU), a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The processor may also be a combination implementing a computing function, for example, a combination including one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The processor 901 may be configured to read the program stored in the memory 902 and perform the operations performed by the photographing device in the method described in FIG. 5 or FIG. 6 and the possible implementations.
FIG. 10 is a schematic diagram of a possible hardware structure of a device provided by this application, which may be the computing device in the method described in FIG. 2. The computing device 1000 includes a processor 1001, a memory 1002, and a communication interface 1003. The processor 1001, the communication interface 1003, and the memory 1002 may be connected to each other, or connected to each other through a bus 1004.
Exemplarily, the memory 1002 is configured to store the computer programs and data of the computing device 1000, and may include, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), compact disc read-only memory (CD-ROM), and the like. In the case where the embodiment shown in FIG. 8 is implemented and the units described in the embodiment of FIG. 8 are implemented by software, the software or program code required to perform the functions of the acquisition unit 801, the surface fitting unit 802, and the calculation unit 803 in FIG. 8 may be stored in the memory 1002.
The communication interface 1003 is configured to support the computing device 1000 in communicating, for example, receiving or sending data or signals.
Exemplarily, the processor 1001 may be a central processing unit, a graphics processing unit (GPU), a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The processor may also be a combination implementing a computing function, for example, a combination including one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The processor 1001 may be configured to read the program stored in the memory 1002 and perform the operations performed by the computing device in the method described in FIG. 2 and the possible implementations.
An embodiment of this application further provides an apparatus, where the apparatus includes a processor and a communication interface, and the apparatus is configured to perform the method described in FIG. 2 and its possible embodiments.
In one possible implementation, the apparatus is a chip or a system on a chip (SoC). An embodiment of this application further provides an apparatus, where the apparatus includes a processor and a communication interface, and the apparatus is configured to perform the method described in FIG. 5 or FIG. 6 and their possible embodiments.
In one possible implementation, the apparatus is a chip or a system on a chip (SoC).
An embodiment of this application further provides a computer-readable storage medium storing a computer program, where the computer program is executed by a processor to implement the method described in FIG. 2 and its possible embodiments.
An embodiment of this application further provides a computer-readable storage medium storing a computer program, where the computer program is executed by a processor to implement the method described in FIG. 5 or FIG. 6 and their possible embodiments.
An embodiment of this application further provides a computer program product; when the computer program product is read and executed by a computer, the method described in FIG. 2 and its possible embodiments is performed.
An embodiment of this application further provides a computer program product; when the computer program product is read and executed by a computer, the method described in FIG. 5 or FIG. 6 and their possible embodiments is performed.
An embodiment of this application further provides a computer program; when the computer program is executed on a computer, the computer is caused to implement the method described in FIG. 2 and its possible embodiments.
An embodiment of this application further provides a computer program; when the computer program is executed on a computer, the computer is caused to implement the method described in FIG. 5 or FIG. 6 and their possible embodiments.
In summary, the depth nonlinear error is a nonlinear error caused by the waveform of the optical signal not being a standard waveform (for example, not a standard sine wave), and the error amount of the nonlinear error resembles a sine wave, oscillating as the distance changes, with both positive and negative error amounts. Therefore, in this application, by sampling multiple times with the light source of the photographing device started later than the sensor, and by performing mean-value processing on the n initial depth images, the positive and negative components of the nonlinear error cancel each other out, which reduces the depth nonlinear error introduced in the measurement process. Meanwhile, the field-of-view phase correction coefficient matrix obtained from the sampling and averaging operations can correct the depth value of each pixel in the depth image, reducing the error caused by the field-of-view phase, thereby greatly reducing the distortion rate of the final depth image and improving its quality.
In this application, the terms "first", "second", and so on are used to distinguish identical or similar items having substantially the same role and function. It should be understood that there is no logical or temporal dependency among "first", "second", and "n-th", and no limitation on quantity or execution order. It should also be understood that although the following description uses the terms first, second, and so on to describe various elements, these elements should not be limited by the terms. These terms are only used to distinguish one element from another. For example, without departing from the scope of the various described examples, a first image may be called a second image, and similarly, a second image may be called a first image. Both the first image and the second image may be images and, in some cases, may be separate and distinct images.
It should also be understood that, in the embodiments of this application, the magnitude of the sequence numbers of the processes does not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.
It should also be understood that the term "includes" (also "including", "comprises", and/or "comprising"), when used in this specification, specifies the presence of the stated features, integers, steps, operations, elements, and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that "one embodiment", "an embodiment", or "a possible implementation" mentioned throughout the specification means that particular features, structures, or characteristics related to the embodiment or implementation are included in at least one embodiment of this application. Therefore, "in one embodiment", "in an embodiment", or "a possible implementation" appearing in various places throughout the specification does not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Finally, it should be noted that the above embodiments are merely intended to describe the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalent replacements may be made to some or all of the technical features thereof; and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of this application.

Claims (28)

  1. A depth image processing method, characterized in that the method comprises:
    acquiring a first depth image of a first object; and
    correcting depth values of pixels in the first depth image with a field-of-view phase correction coefficient matrix to obtain a second depth image, wherein the field-of-view phase correction coefficient matrix is a matrix for correcting a field-of-view phase error obtained after preprocessing n initial depth images, the n initial depth images are depth images obtained in a first situation, the first situation is that a start-up moment of a light source of a photographing device and a start-up moment of a sensor of the photographing device are different moments, the preprocessing comprises mean-value processing performed on the n initial depth images, and n is an integer greater than 1.
  2. The method according to claim 1, characterized in that the preprocessing further comprises surface fitting processing,
    the field-of-view phase correction coefficient matrix is calculated from n fitted depth images, the n fitted depth images are obtained by performing surface fitting on the n initial depth images respectively, the n initial depth images are depth images of a second object obtained n times by the photographing device using a time-of-flight (TOF) ranging method, the start-up moment of the light source at the i-th of the n times is delayed by (i-1)*Δt relative to the start-up moment of the sensor, i ranges over [1, n], n*|Δt| = k*T, T is the period of the optical signal, k is an integer, and Δt is a preset duration.
  3. 根据权利要求2所述的方法，其特征在于，所述第i次获得的初始深度图像为第i初始深度图像，所述第i初始深度图像用像素矩阵Di(x,y)表示，所述像素矩阵Di(x,y)为像素矩阵Di(x,y)’中每个像素值与c*[(1-i)*Δt]/2取和得到的，所述Di(x,y)’为所述第i次测量获得的深度图像的像素矩阵，所述c*[(1-i)*Δt]/2为由于所述光源的启动时刻比所述传感器的启动时刻延迟了所述(i-1)*Δt产生的距离差。The method according to claim 2, wherein the initial depth image obtained at the i-th time is an i-th initial depth image, the i-th initial depth image being represented by a pixel matrix Di(x,y); the pixel matrix Di(x,y) is obtained by adding c*[(1-i)*Δt]/2 to each pixel value in a pixel matrix Di(x,y)', where Di(x,y)' is the pixel matrix of the depth image obtained by the i-th measurement, and c*[(1-i)*Δt]/2 is the distance difference caused by the start-up moment of the light source being delayed by (i-1)*Δt relative to the start-up moment of the sensor.
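For illustration only (not part of the claims): the relation Di(x,y) = Di(x,y)' + c*[(1-i)*Δt]/2 of claim 3 can be sketched as follows, with c the speed of light; the function and variable names are assumptions.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def compensate_delay(measured, i, dt):
    """Add the distance offset c*[(1-i)*dt]/2 to every pixel of the
    i-th measured depth image Di', yielding the i-th initial depth
    image Di (illustrative sketch of the relation in claim 3, where
    the light source starts (i-1)*dt later than the sensor)."""
    offset = C * (1 - i) * dt / 2.0
    return np.asarray(measured, dtype=float) + offset

# Toy example: i = 2, dt = 1 ns -> per-pixel offset of roughly -0.15 m.
di = compensate_delay(np.full((2, 2), 3.0), i=2, dt=1e-9)
```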
  4. 根据权利要求2或3所述的方法，其特征在于，所述视场相位补正系数矩阵为根据n个拟合深度图像计算获得，包括：The method according to claim 2 or 3, wherein the calculating of the field of view phase correction coefficient matrix according to the n fitted depth images comprises:
    所述视场相位补正系数矩阵为拟合平均像素矩阵的多个像素值中的最小值分别与所述拟合平均像素矩阵的各个像素值取比值得到的比值矩阵，所述拟合平均像素矩阵为所述n个拟合深度图像的像素矩阵取平均获得。The field of view phase correction coefficient matrix is a ratio matrix obtained by taking the ratios of the minimum of the pixel values of a fitted average pixel matrix to each pixel value of the fitted average pixel matrix, the fitted average pixel matrix being obtained by averaging the pixel matrices of the n fitted depth images.
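For illustration only (not part of the claims): the ratio-matrix construction of claim 4 — the minimum of the fitted average matrix divided by each of its pixel values — can be sketched as follows; names and values are illustrative.

```python
import numpy as np

def fov_coeff_matrix(fitted_images):
    """Ratio matrix of claim 4: average the fitted depth images, then
    divide the minimum pixel value of that average by every pixel
    value (illustrative sketch)."""
    avg = np.mean(np.asarray(fitted_images, dtype=float), axis=0)
    return avg.min() / avg  # coefficients in (0, 1]

# Two identical toy fitted images; larger depths get smaller coefficients.
fitted = [np.array([[2.0, 4.0], [4.0, 8.0]]),
          np.array([[2.0, 4.0], [4.0, 8.0]])]
coeff = fov_coeff_matrix(fitted)
```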
  5. 根据权利要求2至4任一项所述的方法,其特征在于,所述方法还包括:The method according to any one of claims 2 to 4, wherein the method further comprises:
    通过固定模式噪声FPN矩阵对所述第二深度图像补正获得第三深度图像，其中，所述FPN矩阵中的每个值分别包括由硬件导致的所述第一深度图像中与所述每个值相同下标的像素的固定噪声。A third depth image is obtained by correcting the second depth image through a fixed pattern noise (FPN) matrix, wherein each value in the FPN matrix comprises the hardware-induced fixed noise of the pixel in the first depth image that has the same subscript as that value.
  6. 根据权利要求5所述的方法，其特征在于，所述FPN矩阵为拟合平均像素矩阵与初始平均像素矩阵取差值得到的差值矩阵，所述拟合平均像素矩阵为所述n个拟合深度图像的像素矩阵取平均获得，所述初始平均像素矩阵为所述n个初始深度图像的像素矩阵取平均获得。The method according to claim 5, wherein the FPN matrix is a difference matrix obtained by taking the difference between a fitted average pixel matrix and an initial average pixel matrix, the fitted average pixel matrix being obtained by averaging the pixel matrices of the n fitted depth images, and the initial average pixel matrix being obtained by averaging the pixel matrices of the n initial depth images.
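For illustration only (not part of the claims): the FPN matrix of claim 6, the difference between the fitted average pixel matrix and the initial average pixel matrix, can be sketched as follows; names and values are illustrative.

```python
import numpy as np

def fpn_matrix(fitted_images, initial_images):
    """FPN matrix of claim 6: fitted average pixel matrix minus
    initial average pixel matrix (illustrative sketch)."""
    fitted_avg = np.mean(np.asarray(fitted_images, dtype=float), axis=0)
    initial_avg = np.mean(np.asarray(initial_images, dtype=float), axis=0)
    return fitted_avg - initial_avg

# Toy example with n = 2: the smooth fit exposes per-pixel offsets.
fitted = [np.full((2, 2), 2.0), np.full((2, 2), 2.0)]
initial = [np.array([[1.5, 2.5], [2.0, 2.0]])] * 2
fpn = fpn_matrix(fitted, initial)
```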
  7. 根据权利要求1至6任一项所述的方法,其特征在于,所述通过视场相位补正系数矩阵对所述第一深度图像补正获得第二深度图像,包括:The method according to any one of claims 1 to 6, wherein the obtaining the second depth image by compensating the first depth image through a field of view phase correction coefficient matrix comprises:
    分别计算所述第一深度图像的像素矩阵与所述视场相位补正系数矩阵中相同下标的像素值的乘积得到所述第二深度图像。separately calculating the products of the pixel values with the same subscripts in the pixel matrix of the first depth image and in the field of view phase correction coefficient matrix to obtain the second depth image.
  8. 根据权利要求5或6所述的方法,其特征在于,所述通过固定模式噪声FPN矩阵对所述第二深度图像补正获得第三深度图像,包括:The method according to claim 5 or 6, wherein the obtaining a third depth image by compensating the second depth image through a fixed pattern noise FPN matrix, comprising:
    分别计算所述第一深度图像的像素矩阵与所述视场相位补正系数矩阵中相同下标的像素值的乘积得到乘积矩阵；separately calculating the products of the pixel values with the same subscripts in the pixel matrix of the first depth image and in the field of view phase correction coefficient matrix to obtain a product matrix;
    计算所述乘积矩阵与所述FPN矩阵的和得到所述第三深度图像。The third depth image is obtained by calculating the sum of the product matrix and the FPN matrix.
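For illustration only (not part of the claims): claims 7 and 8 together describe an element-wise product with the correction coefficient matrix followed by addition of the FPN matrix; a minimal sketch with illustrative values:

```python
import numpy as np

def third_depth_image(first_depth, fov_coeff, fpn):
    """Claims 7 and 8 combined: element-wise product of the first
    depth image with the correction coefficient matrix, then addition
    of the FPN matrix (illustrative sketch)."""
    product = np.asarray(first_depth, dtype=float) * np.asarray(fov_coeff, dtype=float)
    return product + np.asarray(fpn, dtype=float)

d3 = third_depth_image([[1000.0, 1000.0], [1000.0, 1000.0]],
                       [[1.0, 0.5], [0.5, 1.0]],
                       [[1.0, 2.0], [3.0, 4.0]])
```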
  9. 一种深度图像处理方法,其特征在于,所述方法包括:A depth image processing method, characterized in that the method comprises:
    获取n个初始深度图像，所述n个初始深度图像为基于拍摄设备n次通过飞行时间TOF测距法获取到的第二对象的深度图像，其中，所述n次中第i次的所述拍摄设备的光源的启动时刻比所述拍摄设备的传感器的启动时刻延迟(i-1)*Δt，所述i的取值范围为[1,n]，n*︱Δt︱=k*T，所述T为所述光信号的周期，所述n为大于1的整数，所述k为整数，所述Δt为预设时长；acquiring n initial depth images, the n initial depth images being depth images of a second object acquired n times by a photographing device through a time-of-flight (TOF) ranging method, wherein in the i-th of the n times the start-up moment of the light source of the photographing device is delayed by (i-1)*Δt relative to the start-up moment of the sensor of the photographing device, the value range of i is [1,n], n*|Δt|=k*T, T is the period of the optical signal, n is an integer greater than 1, k is an integer, and Δt is a preset duration;
    对所述n个初始深度图像分别进行曲面拟合得到n个拟合深度图像;performing surface fitting on the n initial depth images respectively to obtain n fitted depth images;
    根据所述n个拟合深度图像做均值处理后计算得到视场相位补正系数矩阵，所述视场相位补正系数矩阵用于补正所述拍摄设备通过所述TOF测距法获取的深度图像的视场相位误差。calculating a field of view phase correction coefficient matrix after performing mean value processing according to the n fitted depth images, the field of view phase correction coefficient matrix being used to correct a field of view phase error of depth images acquired by the photographing device through the TOF ranging method.
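For illustration only (not part of the claims): the claims leave the surface-fitting model of the preceding step unspecified; the sketch below assumes a least-squares second-order polynomial surface purely as an example of how one fitted depth image could be produced from one initial depth image.

```python
import numpy as np

def fit_surface(depth, order=2):
    """Fit a smooth polynomial surface z = f(x, y) to a depth image by
    least squares (illustrative; the fitting model is an assumption,
    not specified by the claims)."""
    h, w = depth.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    # Design matrix of monomials x**i * y**j up to the given total order.
    cols = [x**i * y**j for i in range(order + 1)
            for j in range(order + 1 - i)]
    A = np.stack([c.ravel() for c in cols], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, depth.ravel(), rcond=None)
    return (A @ coeffs).reshape(h, w)  # the fitted depth image

# A planar toy depth image is reproduced almost exactly by the fit.
z = np.fromfunction(lambda r, c: 2.0 + 0.1 * r + 0.05 * c, (4, 4))
z_fit = fit_surface(z)
```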
  10. 根据权利要求9所述的方法,其特征在于,所述第i次获得的初始深度图像为第i初始深度图像;所述获取n个初始深度图像,包括:The method according to claim 9, wherein the initial depth image obtained for the i-th time is the i-th initial depth image; and the acquiring n initial depth images comprises:
    获取所述拍摄设备n次通过飞行时间TOF测距法实际测量得到的n个测量深度图像，所述n个测量深度图像中的第i测量深度图像用像素矩阵Di(x,y)’表示；acquiring n measured depth images actually measured by the photographing device n times through the time-of-flight (TOF) ranging method, the i-th of the n measured depth images being represented by a pixel matrix Di(x,y)';
    分别计算所述像素矩阵Di(x,y)’中每个像素值与c*[(1-i)*Δt]/2的和得到所述第i初始深度图像Di(x,y)，所述c*[(1-i)*Δt]/2为由于所述光源的启动时刻比所述传感器的启动时刻延迟了所述(i-1)*Δt产生的距离差。separately calculating the sum of each pixel value in the pixel matrix Di(x,y)' and c*[(1-i)*Δt]/2 to obtain the i-th initial depth image Di(x,y), where c*[(1-i)*Δt]/2 is the distance difference caused by the start-up moment of the light source being delayed by (i-1)*Δt relative to the start-up moment of the sensor.
  11. 根据权利要求9或10所述的方法，其特征在于，所述根据所述n个拟合深度图像做均值处理后计算得到视场相位补正系数矩阵，包括：The method according to claim 9 or 10, wherein the calculating of the field of view phase correction coefficient matrix after performing mean value processing according to the n fitted depth images comprises:
    将所述n个拟合深度图像的像素矩阵取平均得到拟合平均像素矩阵;averaging the pixel matrices of the n fitted depth images to obtain a fitted average pixel matrix;
    提取所述拟合平均像素矩阵中的多个像素值的最小值，并计算所述最小值分别与所述拟合平均像素矩阵的各个像素值的比值得到比值矩阵，所述比值矩阵为所述视场相位补正系数矩阵。extracting the minimum of the pixel values in the fitted average pixel matrix, and calculating the ratios of the minimum to each pixel value of the fitted average pixel matrix to obtain a ratio matrix, the ratio matrix being the field of view phase correction coefficient matrix.
  12. 根据权利要求9至11任一项所述的方法,其特征在于,所述对所述n个初始深度图像分别进行曲面拟合得到n个拟合深度图像之后,还包括:The method according to any one of claims 9 to 11, wherein after performing surface fitting on the n initial depth images to obtain n fitted depth images, the method further comprises:
    将所述n个初始深度图像的像素矩阵取平均得到初始平均像素矩阵;averaging the pixel matrices of the n initial depth images to obtain an initial average pixel matrix;
    将所述n个拟合深度图像的像素矩阵取平均得到拟合平均像素矩阵;averaging the pixel matrices of the n fitted depth images to obtain a fitted average pixel matrix;
    计算所述拟合平均像素矩阵与所述初始平均像素矩阵取差值得到差值矩阵，所述差值矩阵为固定模式噪声FPN矩阵，所述FPN矩阵中的每个值分别包括由硬件导致的深度图像中与所述每个值相同下标的像素的固定噪声。calculating the difference between the fitted average pixel matrix and the initial average pixel matrix to obtain a difference matrix, the difference matrix being a fixed pattern noise (FPN) matrix, wherein each value in the FPN matrix comprises the hardware-induced fixed noise of the pixel in a depth image that has the same subscript as that value.
  13. 一种深度图像处理设备,其特征在于,所述设备包括:A depth image processing device, characterized in that the device includes:
    获取单元,用于获取第一对象的第一深度图像;an acquisition unit for acquiring the first depth image of the first object;
    补正单元，用于通过视场相位补正系数矩阵对所述第一深度图像中像素的深度值补正获得第二深度图像，其中，所述视场相位补正系数矩阵为对n个初始深度图像预处理后获得的用于补正视场相位误差的矩阵，所述n个初始深度图像为在第一情况下获得的深度图像，所述第一情况为拍摄设备的光源的启动时刻与所述拍摄设备的传感器的启动时刻为不同时刻，所述预处理包括根据所述n个初始深度图像所做的均值处理，所述n为大于1的整数。a correction unit, configured to correct the depth values of pixels in the first depth image through a field of view phase correction coefficient matrix to obtain a second depth image, wherein the field of view phase correction coefficient matrix is a matrix for correcting a field of view phase error, obtained by preprocessing n initial depth images; the n initial depth images are depth images obtained in a first situation, the first situation being that a start-up moment of a light source of a photographing device and a start-up moment of a sensor of the photographing device are different moments; the preprocessing comprises mean value processing performed according to the n initial depth images, and n is an integer greater than 1.
  14. 根据权利要求13所述的设备,其特征在于,所述预处理还包括曲面拟合处理,The device according to claim 13, wherein the preprocessing further comprises surface fitting processing,
    所述视场相位补正系数矩阵为根据n个拟合深度图像计算获得，所述n个拟合深度图像由所述n个初始深度图像分别进行曲面拟合获得，所述n个初始深度图像为基于所述拍摄设备n次通过飞行时间TOF测距法获得的第二对象的深度图像，所述n次中第i次的所述光源的启动时刻比所述传感器的启动时刻延迟(i-1)*Δt，所述i的取值范围为[1,n]，n*︱Δt︱=k*T，所述T为所述光信号的周期，所述k为整数，所述Δt为预设时长。The field of view phase correction coefficient matrix is calculated according to n fitted depth images, the n fitted depth images being obtained by separately performing surface fitting on the n initial depth images; the n initial depth images are depth images of a second object obtained by the photographing device n times through a time-of-flight (TOF) ranging method, wherein in the i-th of the n times the start-up moment of the light source is delayed by (i-1)*Δt relative to the start-up moment of the sensor, the value range of i is [1,n], n*|Δt|=k*T, T is the period of the optical signal, k is an integer, and Δt is a preset duration.
  15. 根据权利要求14所述的设备，其特征在于，所述第i次获得的初始深度图像为第i初始深度图像，所述第i初始深度图像用像素矩阵Di(x,y)表示，所述像素矩阵Di(x,y)为像素矩阵Di(x,y)’中每个像素值与c*[(1-i)*Δt]/2取和得到的，所述Di(x,y)’为所述第i次测量获得的深度图像的像素矩阵，所述c*[(1-i)*Δt]/2为由于所述光源的启动时刻比所述传感器的启动时刻延迟了所述(i-1)*Δt产生的距离差。The device according to claim 14, wherein the initial depth image obtained at the i-th time is an i-th initial depth image, the i-th initial depth image being represented by a pixel matrix Di(x,y); the pixel matrix Di(x,y) is obtained by adding c*[(1-i)*Δt]/2 to each pixel value in a pixel matrix Di(x,y)', where Di(x,y)' is the pixel matrix of the depth image obtained by the i-th measurement, and c*[(1-i)*Δt]/2 is the distance difference caused by the start-up moment of the light source being delayed by (i-1)*Δt relative to the start-up moment of the sensor.
  16. 根据权利要求14或15所述的设备，其特征在于，所述视场相位补正系数矩阵为根据n个拟合深度图像计算获得，包括：The device according to claim 14 or 15, wherein the calculating of the field of view phase correction coefficient matrix according to the n fitted depth images comprises:
    所述视场相位补正系数矩阵为拟合平均像素矩阵中的多个像素值的最小值分别与所述拟合平均像素矩阵的各个像素值取比值得到的比值矩阵，所述拟合平均像素矩阵为所述n个拟合深度图像的像素矩阵取平均获得。The field of view phase correction coefficient matrix is a ratio matrix obtained by taking the ratios of the minimum of the pixel values in a fitted average pixel matrix to each pixel value of the fitted average pixel matrix, the fitted average pixel matrix being obtained by averaging the pixel matrices of the n fitted depth images.
  17. 根据权利要求14至16任一项所述的设备,其特征在于,所述补正单元还用于:The device according to any one of claims 14 to 16, wherein the correction unit is also used for:
    通过固定模式噪声FPN矩阵对所述第二深度图像补正获得第三深度图像，其中，所述FPN矩阵中的每个值分别包括由硬件导致的所述第一深度图像中与所述每个值相同下标的像素的固定噪声。A third depth image is obtained by correcting the second depth image through a fixed pattern noise (FPN) matrix, wherein each value in the FPN matrix comprises the hardware-induced fixed noise of the pixel in the first depth image that has the same subscript as that value.
  18. 根据权利要求17所述的设备，其特征在于，所述FPN矩阵为拟合平均像素矩阵与初始平均像素矩阵取差值得到的差值矩阵，所述拟合平均像素矩阵为所述n个拟合深度图像的像素矩阵取平均获得，所述初始平均像素矩阵为所述n个初始深度图像的像素矩阵取平均获得。The device according to claim 17, wherein the FPN matrix is a difference matrix obtained by taking the difference between a fitted average pixel matrix and an initial average pixel matrix, the fitted average pixel matrix being obtained by averaging the pixel matrices of the n fitted depth images, and the initial average pixel matrix being obtained by averaging the pixel matrices of the n initial depth images.
  19. 根据权利要求13至18任一项所述的设备,其特征在于,所述补正单元具体用于:The device according to any one of claims 13 to 18, wherein the correction unit is specifically used for:
    分别计算所述第一深度图像的像素矩阵与所述视场相位补正系数矩阵中相同下标的像素值的乘积得到所述第二深度图像。separately calculate the products of the pixel values with the same subscripts in the pixel matrix of the first depth image and in the field of view phase correction coefficient matrix to obtain the second depth image.
  20. 根据权利要求17或18所述的设备,其特征在于,所述补正单元具体用于:The device according to claim 17 or 18, wherein the correction unit is specifically used for:
    分别计算所述第一深度图像的像素矩阵与所述视场相位补正系数矩阵中相同下标的像素值的乘积得到乘积矩阵；separately calculate the products of the pixel values with the same subscripts in the pixel matrix of the first depth image and in the field of view phase correction coefficient matrix to obtain a product matrix;
    计算所述乘积矩阵与所述FPN矩阵的和得到所述第三深度图像。The third depth image is obtained by calculating the sum of the product matrix and the FPN matrix.
  21. 一种深度图像处理设备,其特征在于,所述设备包括:A depth image processing device, characterized in that the device includes:
    获取单元，用于获取n个初始深度图像，所述n个初始深度图像为基于拍摄设备n次通过飞行时间TOF测距法获取到的第二对象的深度图像，其中，所述n次中第i次的所述拍摄设备的光源的启动时刻比所述拍摄设备的传感器的启动时刻延迟(i-1)*Δt，所述i的取值范围为[1,n]，n*︱Δt︱=k*T，所述T为所述光信号的周期，所述n为大于1的整数，所述k为整数，所述Δt为预设时长；an acquisition unit, configured to acquire n initial depth images, the n initial depth images being depth images of a second object acquired n times by a photographing device through a time-of-flight (TOF) ranging method, wherein in the i-th of the n times the start-up moment of the light source of the photographing device is delayed by (i-1)*Δt relative to the start-up moment of the sensor of the photographing device, the value range of i is [1,n], n*|Δt|=k*T, T is the period of the optical signal, n is an integer greater than 1, k is an integer, and Δt is a preset duration;
    曲面拟合单元,用于对所述n个初始深度图像分别进行曲面拟合得到n个拟合深度图像;a surface fitting unit, configured to perform surface fitting on the n initial depth images respectively to obtain n fitted depth images;
    计算单元，用于根据所述n个拟合深度图像做均值处理后计算得到视场相位补正系数矩阵，所述视场相位补正系数矩阵用于补正所述拍摄设备通过所述TOF测距法获取的深度图像的视场相位误差。a calculation unit, configured to calculate a field of view phase correction coefficient matrix after performing mean value processing according to the n fitted depth images, the field of view phase correction coefficient matrix being used to correct a field of view phase error of depth images acquired by the photographing device through the TOF ranging method.
  22. 根据权利要求21所述的设备,其特征在于,所述第i次获得的初始深度图像为第i初始深度图像;所述获取单元具体用于:The device according to claim 21, wherein the initial depth image obtained for the i-th time is the i-th initial depth image; and the obtaining unit is specifically used for:
    获取所述拍摄设备n次通过飞行时间TOF测距法实际测量得到的n个测量深度图像，所述n个测量深度图像中的第i测量深度图像用像素矩阵Di(x,y)’表示；acquire n measured depth images actually measured by the photographing device n times through the time-of-flight (TOF) ranging method, the i-th of the n measured depth images being represented by a pixel matrix Di(x,y)';
    分别计算所述像素矩阵Di(x,y)’中每个像素值与c*[(1-i)*Δt]/2的和得到所述第i初始深度图像Di(x,y)，所述c*[(1-i)*Δt]/2为由于所述光源的启动时刻比所述传感器的启动时刻延迟了所述(i-1)*Δt产生的距离差。separately calculate the sum of each pixel value in the pixel matrix Di(x,y)' and c*[(1-i)*Δt]/2 to obtain the i-th initial depth image Di(x,y), where c*[(1-i)*Δt]/2 is the distance difference caused by the start-up moment of the light source being delayed by (i-1)*Δt relative to the start-up moment of the sensor.
  23. 根据权利要求21或22所述的设备,其特征在于,所述计算单元具体用于:The device according to claim 21 or 22, wherein the computing unit is specifically configured to:
    将所述n个拟合深度图像的像素矩阵取平均得到拟合平均像素矩阵;averaging the pixel matrices of the n fitted depth images to obtain a fitted average pixel matrix;
    提取所述拟合平均像素矩阵中的多个像素值的最小值，并计算所述最小值分别与所述拟合平均像素矩阵的各个像素值的比值得到比值矩阵，所述比值矩阵为所述视场相位补正系数矩阵。extract the minimum of the pixel values in the fitted average pixel matrix, and calculate the ratios of the minimum to each pixel value of the fitted average pixel matrix to obtain a ratio matrix, the ratio matrix being the field of view phase correction coefficient matrix.
  24. 根据权利要求21至23任一项所述的设备，其特征在于，所述计算单元，还用于在所述曲面拟合单元对所述n个初始深度图像分别进行曲面拟合得到n个拟合深度图像之后，将所述n个初始深度图像的像素矩阵取平均得到初始平均像素矩阵；The device according to any one of claims 21 to 23, wherein the calculation unit is further configured to: after the surface fitting unit separately performs surface fitting on the n initial depth images to obtain n fitted depth images, average the pixel matrices of the n initial depth images to obtain an initial average pixel matrix;
    将所述n个拟合深度图像的像素矩阵取平均得到拟合平均像素矩阵;averaging the pixel matrices of the n fitted depth images to obtain a fitted average pixel matrix;
    计算所述拟合平均像素矩阵与所述初始平均像素矩阵取差值得到差值矩阵，所述差值矩阵为固定模式噪声FPN矩阵，所述FPN矩阵中的每个值分别包括由硬件导致的深度图像中与所述每个值相同下标的像素的固定噪声。calculate the difference between the fitted average pixel matrix and the initial average pixel matrix to obtain a difference matrix, the difference matrix being a fixed pattern noise (FPN) matrix, wherein each value in the FPN matrix comprises the hardware-induced fixed noise of the pixel in a depth image that has the same subscript as that value.
  25. 一种深度图像处理设备，其特征在于，所述设备包括处理器、通信接口和存储器，其中，所述存储器用于存储计算机程序，所述处理器用于执行所述存储器中存储的计算机程序以实现如权利要求1至8任一项所述的方法；或者，所述处理器用于执行所述存储器中存储的计算机程序以实现如权利要求9至12任一项所述的方法。A depth image processing device, wherein the device comprises a processor, a communication interface and a memory; the memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory to implement the method according to any one of claims 1 to 8, or to execute the computer program stored in the memory to implement the method according to any one of claims 9 to 12.
  26. 一种装置,所述装置包括处理器和通信接口,其特征在于,所述装置被配置为执行权利要求1至8任意一项或9至12任意一项所述的方法。An apparatus comprising a processor and a communication interface, characterized in that, the apparatus is configured to perform the method of any one of claims 1 to 8 or any one of 9 to 12.
  27. 一种计算机可读存储介质，其特征在于，所述计算机可读存储介质存储有计算机程序，所述计算机程序被处理器执行以实现权利要求1至8任意一项或9至12任意一项所述的方法。A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the method according to any one of claims 1 to 8 or any one of claims 9 to 12.
  28. 一种计算机程序产品,其特征在于,当所述计算机程序产品被计算机读取并执行时,如权利要求1至8任意一项或9至12任意一项所述的方法将被执行。A computer program product, characterized in that, when the computer program product is read and executed by a computer, the method according to any one of claims 1 to 8 or any one of claims 9 to 12 will be executed.
PCT/CN2020/098640 2020-06-28 2020-06-28 Depth image processing method and device WO2022000147A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080101640.5A CN115667989A (en) 2020-06-28 2020-06-28 Depth image processing method and device
PCT/CN2020/098640 WO2022000147A1 (en) 2020-06-28 2020-06-28 Depth image processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/098640 WO2022000147A1 (en) 2020-06-28 2020-06-28 Depth image processing method and device

Publications (1)

Publication Number Publication Date
WO2022000147A1 true WO2022000147A1 (en) 2022-01-06

Family

ID=79317754

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/098640 WO2022000147A1 (en) 2020-06-28 2020-06-28 Depth image processing method and device

Country Status (2)

Country Link
CN (1) CN115667989A (en)
WO (1) WO2022000147A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115311372A (en) * 2022-10-12 2022-11-08 荣耀终端有限公司 Camera error correction method and related device
CN115546306A (en) * 2022-01-30 2022-12-30 荣耀终端有限公司 Camera calibration method and device, electronic equipment and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190086520A1 (en) * 2017-09-19 2019-03-21 Rockwell Automation Technologies, Inc. Pulsed-based time of flight methods and system
CN109991583A (en) * 2019-03-14 2019-07-09 深圳奥比中光科技有限公司 A kind of jamproof distance measurement method and depth camera
CN110456370A (en) * 2019-07-30 2019-11-15 炬佑智能科技(苏州)有限公司 Flight time sensor-based system and its distance measuring method
CN110456369A (en) * 2019-07-30 2019-11-15 炬佑智能科技(苏州)有限公司 Flight time sensor-based system and its distance measuring method
EP3594716A1 (en) * 2018-07-10 2020-01-15 Rockwell Automation Technologies, Inc. High dynamic range for sensing systems and methods



Also Published As

Publication number Publication date
CN115667989A (en) 2023-01-31


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20943589

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20943589

Country of ref document: EP

Kind code of ref document: A1