WO2021036135A1 - Depth image completion method and device, and computer-readable storage medium - Google Patents

Depth image completion method and device, and computer-readable storage medium

Info

Publication number
WO2021036135A1
WO2021036135A1 (PCT/CN2019/128828)
Authority
WO
WIPO (PCT)
Prior art keywords
pixel
diffused
image
map
diffusion
Prior art date
Application number
PCT/CN2019/128828
Other languages
English (en)
French (fr)
Inventor
许龑
祝新革
石建萍
章国锋
李鸿升
Original Assignee
上海商汤临港智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤临港智能科技有限公司 (Shanghai SenseTime Lingang Intelligent Technology Co., Ltd.)
Priority to SG11202012443SA
Priority to JP2020568542A (JP7143449B2)
Priority to KR1020207036589A (KR20210027269A)
Priority to US17/107,065 (US20210082135A1)
Publication of WO2021036135A1


Classifications

    • G06T7/50 Depth or shape recovery
    • G06T7/529 Depth or shape recovery from texture
    • G06T5/77 Retouching; Inpainting; Scratch removal
    • G01S17/89 Lidar systems specially adapted for mapping or imaging
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T5/60 Image enhancement or restoration using machine learning, e.g. neural networks
    • G06T5/70 Denoising; Smoothing
    • G06T7/593 Depth or shape recovery from multiple images from stereo images
    • G06T7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T2207/10016 Video; Image sequence
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G06T2207/10032 Satellite or aerial image; Remote sensing
    • G06T2207/10044 Radar image
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle

Definitions

  • The present disclosure relates to image processing technology, and in particular to a depth image completion method and device, and a computer-readable storage medium.
  • A common depth image acquisition approach is to obtain a depth image of a three-dimensional scene using a light detection and ranging (LiDAR) sensor, a binocular camera, or a time-of-flight (TOF) sensor.
  • The effective range of binocular cameras and TOF sensors is generally within 10 m, so they are usually used in terminals such as smartphones, whereas the effective range of LiDAR is longer, reaching tens or even hundreds of meters, which makes it usable in fields such as autonomous driving and robotics.
  • When LiDAR is used, a laser beam is emitted into the three-dimensional scene, the laser beam reflected from the surface of each object in the scene is received, and the time difference between the emission time and the reception time is used to obtain the depth image of the three-dimensional scene.
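  • For reference, the time-of-flight relation underlying this step is the standard one (a known physical fact rather than a quotation from this text): with $c$ the speed of light and $\Delta t$ the measured time difference between emission and reception, the depth is $D = c\,\Delta t / 2$.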
  • In practice, 32- or 64-line LiDAR is usually used, so only sparse depth images can be obtained.
  • Depth image completion refers to the process of restoring a sparse depth map to a dense depth map. In related technologies, depth image completion directly inputs the depth map into a neural network to obtain a dense depth map, but this approach does not make full use of the sparse point cloud data, so the accuracy of the resulting dense depth map is low.
  • The present disclosure provides a depth image completion method and device, and a computer-readable storage medium, which can make full use of sparse point cloud data and improve the accuracy of the completed depth map.
  • In a first aspect, embodiments of the present disclosure provide a depth image completion method, including: collecting a depth map of a target scene through a set radar, and collecting a two-dimensional image of the target scene through a set camera; determining a to-be-diffused map and a feature map based on the acquired depth map and the two-dimensional image; determining the diffusion intensity of each pixel in the to-be-diffused map based on the to-be-diffused map and the feature map, where the diffusion intensity represents the intensity with which the pixel value of each pixel in the to-be-diffused map diffuses to adjacent pixels; and determining the completed depth map based on the pixel value of each pixel in the to-be-diffused map and the diffusion intensity of each pixel in the to-be-diffused map.
  • A depth image completion device, including:
  • An acquisition module configured to acquire a depth map of a target scene through a set radar, and acquire a two-dimensional image of the target scene through a set camera;
  • the processing module is configured to determine the to-be-diffused map and the feature map based on the acquired depth map and the two-dimensional image, and to determine the diffusion intensity of each pixel in the to-be-diffused map based on the to-be-diffused map and the feature map; the diffusion intensity characterizes the intensity with which the pixel value of each pixel in the to-be-diffused map diffuses to adjacent pixels;
  • the diffusion module is configured to determine the completed depth map based on the pixel value of each pixel in the to-be-diffused image and the diffusion intensity of each pixel in the to-be-diffused image.
  • an embodiment of the present disclosure also provides a depth image completion device, including: a memory and a processor;
  • the memory is configured to store executable depth image completion instructions;
  • the processor is configured to execute the executable depth image completion instruction stored in the memory, and implement the method according to any one of the first aspects above.
  • an embodiment of the present disclosure provides a computer-readable storage medium that stores an executable depth image completion instruction, which is used to cause a processor to execute the method described in any one of the first aspects.
  • the embodiments of the present disclosure provide a depth image completion method and device, and a computer-readable storage medium.
  • the depth map of a target scene is collected by a set radar, and a two-dimensional image of the target scene is collected by a set camera;
  • the to-be-diffused map and the feature map are determined based on the acquired depth map and the two-dimensional image; based on the to-be-diffused map and the feature map, the diffusion intensity of each pixel in the to-be-diffused map is determined, where the diffusion intensity represents the intensity with which the pixel value of each pixel in the to-be-diffused map diffuses to adjacent pixels; and based on the pixel value of each pixel in the to-be-diffused map and the diffusion intensity of each pixel in the to-be-diffused map, the completed depth map is determined.
  • the to-be-diffused image can be obtained according to the acquired depth map and the two-dimensional image.
  • the to-be-diffused map retains all the point cloud data in the acquired depth map, so that when the diffused pixel value of each pixel in the to-be-diffused map is determined, all the point cloud data in the acquired depth map is used; the point cloud data in the acquired depth map is therefore fully exploited, which in turn makes the depth information of each 3D point in the three-dimensional scene more accurate and improves the accuracy of the completed depth map.
  • FIG. 1 is a first flowchart of a depth image completion method provided by an embodiment of the disclosure
  • FIG. 2 is a second flowchart of a depth image completion method provided by an embodiment of the disclosure.
  • FIG. 3 is a schematic diagram of calculating a distance to the origin of a first plane according to an embodiment of the disclosure
  • FIG. 4(a) is a noise schematic diagram of a collected depth map provided by an embodiment of the present disclosure
  • FIG. 4(b) is a schematic diagram of a first confidence graph provided by an embodiment of the disclosure.
  • FIG. 5 is a third flowchart of a depth image completion method provided by an embodiment of the present disclosure.
  • FIG. 6 is a first schematic diagram of a process of a depth image completion method provided by an embodiment of the present disclosure
  • FIG. 7 is a second schematic diagram of the process of a depth image completion method provided by an embodiment of the present disclosure.
  • FIG. 8 is a third schematic diagram of the process of a depth image completion method provided by an embodiment of the present disclosure.
  • FIG. 9 is a fourth flowchart of a depth image completion method provided by an embodiment of the disclosure.
  • FIG. 10 is a fifth flowchart of a depth image completion method provided by an embodiment of the disclosure.
  • FIG. 11 is a schematic diagram of a pixel value after diffusion of a second pixel point of a to-be-diffused image provided by an embodiment of the present disclosure
  • FIG. 12(a) is the first schematic diagram of the influence of the value of the preset number of repetitions provided by the embodiments of the present disclosure on the error of the completed depth map;
  • FIG. 12(b) is a second schematic diagram of the influence of the value of the preset number of repetitions provided by the embodiments of the present disclosure on the error of the completed depth map;
  • FIG. 13(a) is a schematic diagram of the influence of a preset error tolerance parameter provided by an embodiment of the present disclosure on the truth map of the first confidence map;
  • FIG. 13(b) is a schematic diagram of the influence of the preset error tolerance parameters provided by the embodiments of the present disclosure on the distribution of the true value-absolute error curve of the confidence level;
  • FIG. 14(a) is a first schematic diagram of the influence of the sampling rate of the preset prediction model provided by an embodiment of the present disclosure on the depth map after completion;
  • FIG. 14(b) is a second schematic diagram of the influence of the sampling rate of the preset prediction model provided by an embodiment of the present disclosure on the depth map after completion;
  • FIG. 15(a) is a schematic diagram of an acquired depth map and a two-dimensional image of a three-dimensional scene provided by an embodiment of the present disclosure
  • FIG. 15(b) is a completed depth map obtained by using a convolutional space propagation network provided by an embodiment of the disclosure.
  • FIG. 15(c) is a completed depth map obtained by using NConv-convolutional neural network according to an embodiment of the disclosure.
  • Figure 15(d) is a completed depth map obtained by using the sparse-dense method in related technologies
  • FIG. 15(e) is a normal prediction diagram provided by an embodiment of the disclosure.
  • FIG. 15(f) is a first confidence graph provided by an embodiment of the disclosure.
  • Figure 15(g) is a completed depth map provided by an embodiment of the disclosure.
  • FIG. 16 is a schematic structural diagram of a depth image completion device provided by an embodiment of the disclosure
  • FIG. 17 is a schematic diagram of the composition structure of a depth image completion device provided by an embodiment of the disclosure
  • The effective range within which binocular cameras and TOF sensors can obtain depth images is generally within 10 m, and they are usually applied to terminals such as smartphones to obtain depth images of human faces and other targets; LiDAR has a longer effective range, reaching tens or even hundreds of meters, and can be used in areas such as autonomous driving and robotics.
  • When LiDAR is used to acquire depth images, it actively emits a laser beam into the three-dimensional scene, receives the laser beam reflected from the surface of each object in the scene, and obtains the depth image of the three-dimensional scene from the time difference between the emission time of the emitted laser beam and the receiving time of the reflected laser beam. Since LiDAR acquires depth images based on the time difference of the laser beam, the depth images obtained by LiDAR are composed of sparse point cloud data; in practical applications, 32/64-line LiDAR is mainly used, which leads to only sparse depth maps, so depth completion must be performed to convert the sparse depth maps into dense depth maps.
  • In related technologies, the depth image completion method relies on training data composed of a large number of sparse depth maps and two-dimensional images of three-dimensional scenes to supervise the training of a neural network model, and then directly inputs the sparse depth map and the two-dimensional image of the three-dimensional scene into the trained neural network model to complete the depth completion process and obtain a denser depth map. However, this method does not make full use of the point cloud data in the depth map, and the accuracy of the resulting completed depth map is low.
  • The basic idea of the embodiments of the present disclosure is to first obtain the to-be-diffused map based on the collected sparse depth map and the two-dimensional image of the three-dimensional scene, and then perform pixel-level diffusion on the to-be-diffused map to obtain the completed depth map, so as to make full use of each piece of sparse point cloud data in the sparse depth map and obtain a completed depth map with higher accuracy.
  • the embodiment of the present disclosure provides a depth image completion method.
  • the method may include:
  • S101 Collect a depth map of a target scene through a set radar, and acquire a two-dimensional image of the target scene through a set camera.
  • the embodiments of the present disclosure are implemented in a scene where depth image completion is performed on a collected sparse depth map.
  • The device collects the depth map of the target scene through the radar set on it, and at the same time collects the two-dimensional image of the target scene through the camera set on it.
  • The depth information of the 3D points in the three-dimensional scene corresponding to the laser beam can be calculated from the time difference between the emission time and the receiving time of the laser beam, and the calculated depth information is used as the pixel value to obtain the depth map.
  • the depth information of the 3D point corresponding to the laser beam can also be calculated by using other characteristics of the laser beam, such as phase information, to obtain a depth map, which is not limited in the embodiment of the present disclosure.
  • the depth map collected by the radar is a sparse depth map.
  • the set radar may be a 32/64-line LiDAR sensor, a millimeter wave radar, or other types of radars, and the embodiment of the present disclosure is not limited herein.
  • The optical device of a color camera may be used to obtain the pixel value information of each 3D point in the three-dimensional scene, thereby obtaining a two-dimensional image, or the two-dimensional image of the target scene may be obtained by other means, which is not limited in the embodiments of the present disclosure.
  • the set camera may be a color camera to obtain a color two-dimensional image of a three-dimensional scene, or an infrared camera to obtain an infrared grayscale image of a three-dimensional scene.
  • The set camera may also be another type of camera, which is not limited in the embodiments of the present disclosure.
  • the resolution of the acquired depth map and the two-dimensional image may be the same or different.
  • When the resolutions differ, the resolutions of the acquired depth map and the two-dimensional image can be made consistent by scaling either the acquired depth map or the two-dimensional image.
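  • As an illustration only (the embodiments do not prescribe a specific library or interpolation scheme), the resolutions could be aligned with a simple scaling step such as the following sketch; the function name and the choice of nearest-neighbor interpolation are assumptions, the latter chosen so that valid depth samples are not blended with empty pixels:

```python
import cv2

def match_resolution(depth, image):
    """Scale the sparse depth map so its resolution matches the two-dimensional image."""
    h, w = image.shape[:2]
    if depth.shape[:2] != (h, w):
        # Nearest-neighbor keeps the original depth samples intact instead of interpolating
        # across the many empty (zero) pixels of a sparse depth map.
        depth = cv2.resize(depth, (w, h), interpolation=cv2.INTER_NEAREST)
    return depth
```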
  • the radar and the camera can be set and arranged according to actual needs, and the embodiment of the present disclosure is not limited herein.
  • S102 Obtain a to-be-diffused map and a feature map according to the acquired depth map and the two-dimensional image.
  • S104 Determine a completed depth map based on the pixel value of each pixel in the to-be-diffused image and the diffusion intensity of each pixel in the to-be-diffused image.
  • Since the image to be diffused is determined based on the depth map and the two-dimensional image, the image to be diffused retains all the point cloud data in the collected depth map, so that when the diffused pixel value of each pixel in the image to be diffused is determined, all the point cloud data in the depth map is used; this makes the depth information corresponding to each 3D point of the three-dimensional scene more accurate and improves the accuracy of the completed depth map.
  • The completed depth map is determined based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused; that is, the implementation process of S104 may include S1041-S1042, as follows:
  • the completed depth map in the embodiments of the present disclosure refers to a relatively dense depth map after completion. It has more comprehensive depth information of 3D scenes and can be directly applied to various scenes that require depth maps.
  • The pixel value of each pixel in the image to be diffused and its corresponding diffusion intensity are used to calculate the diffused pixel value of each pixel in the image to be diffused, and the completed depth map is determined according to the diffused pixel value of each pixel in the image to be diffused; all the point cloud data in the collected depth map is thereby used, making the depth information corresponding to each 3D point of the three-dimensional scene more accurate and improving the accuracy of the completed depth map.
  • The to-be-diffused map is a preliminarily completed depth map; the completed depth map is determined according to the diffused pixel value of each pixel in the to-be-diffused map, that is, the implementation process of S1042 can include S1042a-S1042b, as follows:
  • S1042a Use the diffused pixel value of each pixel in the image to be diffused as the pixel value of each pixel of the diffused image.
  • The preliminarily completed depth map is an image obtained based on the acquired depth map and the two-dimensional image, that is, an image obtained by performing operations such as plane division and depth information filling on the acquired depth map and the two-dimensional image. The density of the point cloud data in the preliminarily completed depth map is greater than the density of the point cloud data in the acquired depth map.
  • The diffused pixel value of each pixel in the to-be-diffused map can be used as the pixel value of each pixel of the diffused image, and the diffused image can be used as the completed depth map; this uses all the point cloud data in the acquired depth map, so as to make full use of the point cloud data in the depth map and obtain a better completed depth map.
  • the map to be diffused is the first plane origin distance map.
  • The to-be-diffused map and the feature map are determined according to the acquired depth map and the two-dimensional image; that is, the implementation process of S102 can include S1021-S1023, as follows:
  • the acquired parameter matrix is an inherent parameter matrix of the camera.
  • the parameter matrix may refer to the internal parameter matrix of the camera, which may include the projective transformation parameters and focal length of the camera.
  • the parameter matrix may also include other parameters required for calculating the distance map of the first plane origin, which is not limited in the embodiment of the present disclosure.
  • the normal prediction map refers to an image in which the normal vector of each point of the three-dimensional scene is used as the pixel value.
  • the normal prediction map refers to an image obtained by using the surface normal vector of each 3D point in the three-dimensional scene as the pixel value.
  • the surface normal vector of a 3D point is defined as a vector starting from the 3D point and perpendicular to the tangent plane of the 3D point.
  • The preliminarily completed depth map obtained for the first time refers to an image in which the preliminary depth information of each 3D point in the three-dimensional scene, determined using the acquired depth map and the two-dimensional image, is used as the pixel value.
  • Using the pixel value of each pixel in the preliminarily completed depth map, the parameter matrix, and the pixel value of each pixel in the normal prediction map, the first plane origin distance is calculated for each 3D point, and then the first plane origin distance of each 3D point is used as the pixel value to obtain the first plane origin distance map, so that the diffused pixel value can subsequently be calculated for each pixel in the first plane origin distance map based on the first plane origin distance map and the feature map, to obtain the completed depth map.
  • the first plane origin distance refers to the distance from the center of the camera calculated by using the preliminary complemented depth map to the tangent plane where each 3D point in the three-dimensional scene is located.
  • Since the first plane origin distance map uses the first plane origin distance of each 3D point, that is, the distance from the camera center to the tangent plane where the 3D point is located, as the pixel value, 3D points lying on the same tangent plane should have the same or similar first plane origin distances. If the first plane origin distance of a certain 3D point differs significantly from those of other 3D points on the same plane, this indicates that the first plane origin distance of that 3D point is an abnormal value that needs to be corrected; in other words, 3D points in the same plane are subject to a geometric constraint.
  • the abnormal values in the first plane origin distance map can be corrected.
  • the first plane origin distance map with a higher accuracy rate is obtained, and then the completed depth map with better effect can be obtained according to the first plane origin distance map with higher accuracy rate.
  • the first plane origin distance of each 3D point in the three-dimensional scene needs to be calculated first, and then the first plane origin distance of each 3D point is used as the pixel value to obtain the first plane origin distance map.
  • To calculate the first plane origin distance of each 3D point, it is necessary to determine the 2D projection of each 3D point on the image plane and to invert the camera parameter matrix to obtain the inverse of the parameter matrix; the preliminary depth information corresponding to each 3D point is then obtained from the preliminarily completed depth map, and the normal vector of the tangent plane where each 3D point is located is obtained from the normal prediction map. Finally, for each 3D point, the preliminary depth information, the normal vector of the tangent plane where the 3D point is located, the inverse of the parameter matrix, and the 2D projection of the 3D point on the image plane are multiplied to obtain the first plane origin distance of the 3D point.
  • P(x) represents the distance of the first plane origin of the 3D point
  • x represents the 2D projection of the 3D point on the image plane
  • D(x) represents the preliminary depth information corresponding to the 3D point
  • N(x) represents the normal vector of the tangent plane where the 3D point X is located
  • C represents the parameter matrix.
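  • The body of formula (1) is not reproduced in this text; from the variable definitions above and the geometric derivation that follows, it is consistent with the form

    $$P(x) = D(x)\,N(x)^{\top} C^{-1} \tilde{x},$$

    where $\tilde{x}$ denotes the homogeneous coordinates of the 2D projection $x$ (a notation assumed here).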
  • The calculation formula for the first plane origin distance of a 3D point can be derived from the geometric relationship: the distance from the camera center to the tangent plane where the 3D point is located is determined by any point on that plane and the normal vector of that plane, and the three-dimensional coordinates of the 3D point are obtained from its 2D projection on the image plane, its preliminary depth information, and the parameter matrix. Therefore, the distance from the camera center to the tangent plane where the 3D point is located can be determined from the preliminary depth information of the 3D point and the normal vector of the plane where the 3D point is located.
  • In the preliminarily completed depth map, the position information of each pixel is the 2D projection of a 3D point and the pixel value of each pixel is the depth information corresponding to that 3D point; in the normal prediction map, the position information of each pixel is the 2D projection of a 3D point and the pixel value of each pixel is the normal vector information of that 3D point. Therefore, the first plane origin distances of all 3D points can be calculated from the preliminarily completed depth map, the normal prediction map, and the parameter matrix.
  • The relationship between a 3D point in the three-dimensional scene and the tangent plane where the 3D point is located can be expressed as equation (2):
  • X represents the 3D point in the three-dimensional scene
  • x represents the 2D projection of the 3D point on the image plane
  • N(x) represents the normal vector starting from the 3D point X and perpendicular to the tangent plane where the 3D point X is located
  • P(x) represents the distance from the center of the camera to the tangent plane where the 3D point X is located, that is, the first plane origin distance of the 3D point.
  • X represents a 3D point in a three-dimensional scene
  • x represents a 2D projection of the 3D point on the image plane
  • D(x) represents the preliminary depth information corresponding to the 3D point
  • C represents a parameter matrix.
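  • The equation bodies are likewise not reproduced here; from the definitions above, the plane constraint of equation (2) and the back-projection of the 3D point are consistent with

    $$N(x)^{\top} X = P(x), \qquad X = D(x)\,C^{-1}\tilde{x},$$

    and substituting the back-projection into the plane constraint recovers the expression for the first plane origin distance given above.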
  • the embodiment of the present disclosure provides a schematic diagram of calculating the distance to the origin of the first plane.
  • O is the center of the camera
  • X is a 3D point in the three-dimensional scene
  • x is the 2D projection of the 3D point X on the image plane
  • F is the tangent plane of the 3D point
  • N(x) is the normal vector of the tangent plane where the 3D point is located
  • D(x) is the preliminary depth information corresponding to the 3D point.
  • The 2D projection x of the 3D point and the preliminary depth information corresponding to the 3D point can be obtained from the preliminarily completed depth map, and the normal vector N(x) of the tangent plane where the 3D point is located can be obtained from the normal prediction map. Since the parameter matrix C is known, the 2D projection x of the 3D point, the preliminary depth information D(x) corresponding to the 3D point, the normal vector N(x), and the parameter matrix C can be substituted into equation (1) to calculate the first plane origin distance of the 3D point. After the first plane origin distance of each 3D point in the three-dimensional scene has been obtained using formula (1), the first plane origin distance of each 3D point can be used as the pixel value to obtain the first plane origin distance map.
  • In the embodiments of the present disclosure, the acquired depth map and two-dimensional image can be used to obtain the preliminarily completed depth map, the feature map, and the normal prediction map; the first plane origin distance map is then calculated based on the preliminarily completed depth map, the normal prediction map, and the parameter matrix stored on the device, and the diffused pixel value is calculated for each pixel in the first plane origin distance map. In this way, geometric constraints can be used to eliminate the abnormal values in the first plane origin distance map and improve its accuracy, which in turn facilitates subsequently obtaining a better completed depth map based on the more accurate first plane origin distance map.
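  • The following sketch illustrates this computation under the reconstruction above; it is not the patented implementation, and all function and variable names are assumptions:

```python
import numpy as np

def plane_origin_distance_map(depth, normals, K):
    """Compute P(x) = D(x) * N(x)^T K^{-1} x_hom for every pixel.

    depth:   (H, W) preliminary completed depth map D(x)
    normals: (H, W, 3) normal prediction map N(x) (unit surface normals)
    K:       (3, 3) camera parameter (intrinsic) matrix C
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))      # pixel grid (2D projections)
    x_hom = np.stack([u, v, np.ones_like(u)], axis=-1)  # homogeneous coordinates of x
    rays = x_hom @ np.linalg.inv(K).T                   # C^{-1} x for every pixel
    X = depth[..., None] * rays                         # back-projected 3D points D(x) C^{-1} x
    return np.sum(normals * X, axis=-1)                 # N(x)^T X = first plane origin distance
```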
  • the method further includes: S1024-S1026, as follows
  • S1024. Determine a first confidence map according to the acquired depth map and the two-dimensional image; where the first confidence map refers to an image that uses the confidence corresponding to each pixel in the depth map as a pixel value.
  • the first confidence map refers to an image obtained by using the confidence of the preliminary depth information of each 3D point in the three-dimensional scene as the pixel value.
  • the second plane origin distance refers to the distance from the center of the camera calculated by using the depth map to the tangent plane where the 3D point in the three-dimensional scene is located.
  • formula (5) may be used to calculate the distance of the second plane origin of each 3D point:
  • N(x) is the normal vector of the tangent plane where the 3D point is located
  • x is the 2D projection of the 3D point on the image plane
  • C is the parameter matrix of the camera.
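  • The body of formula (5) is not reproduced here; by analogy with formula (1), it uses the depth value from the collected depth map (written $\bar{D}(x)$ below, an assumed notation) in place of the preliminary completed depth: $\bar{P}(x) = \bar{D}(x)\,N(x)^{\top} C^{-1} \tilde{x}$, where $\bar{P}(x)$ is the second plane origin distance of the 3D point.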
  • the first confidence map can be introduced to measure the reliability of the depth information.
  • the first confidence map refers to an image obtained by using the confidence of the depth information of each 3D point, that is, the confidence corresponding to each pixel in the depth map as the pixel value.
  • When the pixels in the first confidence map, the pixels in the second plane origin distance map, and the pixels in the first plane origin distance map are used to optimize the first plane origin distance map, the pixel value of a pixel in the first confidence map can be used to judge the credibility of the depth information of the 3D point corresponding to that pixel. When the confidence is high, the depth information of the corresponding 3D point is considered more reliable, that is, closer to the actual depth of the 3D point, and the second plane origin distance of the 3D point corresponding to that pixel is accordingly more reliable. As a result, some pixel values in the optimized first plane origin distance map become closer to the actual plane origin distance. In this way, when pixel diffusion is performed based on the optimized first plane origin distance map and the feature map, not only can the abnormal values in the first plane origin distance map be eliminated, but the influence of abnormal values in the collected depth map on the optimized first plane origin distance map can also be reduced, further improving the accuracy of the optimized first plane origin distance map.
  • the value range of the pixel value of the first confidence map may be used to indicate the reliability of the original depth information.
  • the pixel value range of the first confidence map can be set to [0, 1]. When the pixel value of the first confidence map is close to 1, it indicates that the original depth information of the 3D point corresponding to the pixel is reliable. When the pixel value of the first confidence map is close to 0, it indicates that the original depth information of the 3D point corresponding to the pixel point is unreliable.
  • The value range of the pixel values of the first confidence map can also be set according to actual conditions, which is not limited in the embodiments of the present disclosure.
  • an embodiment of the present disclosure provides a noise diagram of a collected depth map.
  • As shown in Figure 4(a), when the radar collects depth information of a moving car in area 1, some noise, such as the offset of the points in the small box, makes the obtained depth information inconsistent with the actual depth information; that is, the depth information is unreliable.
  • the reliability of the original depth information can be judged by the pixel value of each pixel in area 1 of FIG. 4(b). It can be seen from Figure 4(b) that the overall color of area 1 is darker, indicating that there are a large number of pixels with pixel values close to 0 in area 1, that is, there are a large number of pixels with unreliable depth information in area 1.
  • In the embodiments of the present disclosure, pixels with reliable second plane origin distances can be selected from the second plane origin distance map according to the first confidence map, and the pixel values of the corresponding pixels in the first plane origin distance map can be replaced with them to obtain the optimized first plane origin distance map, so that the completed depth map can be obtained based on the optimized first plane origin distance map. In this way, not only can the abnormal values in the first plane origin distance map be removed, but the influence of the abnormal values in the depth map collected by the radar on the optimized first plane origin distance map can also be reduced, improving the accuracy of the optimized first plane origin distance map and, in turn, the accuracy of the completed depth map.
  • The pixels in the first plane origin distance map are optimized to obtain the optimized first plane origin distance map; that is, the implementation process of S1026 may include S1026a-S1026e, as follows:
  • When determining the replacement pixel, the corresponding pixel in the second plane origin distance map is found based on the coordinate information of the first pixel in the first plane origin distance map, and the pixel value of that pixel is taken as the pixel value of the replacement pixel.
  • After the replacement pixel and the pixel value of the replacement pixel are determined, it is also necessary to find, in the first confidence map, the pixel corresponding to the replacement pixel according to the coordinate information of the replacement pixel, and to take the pixel value of that pixel as the confidence information of the replacement pixel. In this way, the confidence information corresponding to the replacement pixel is determined.
  • S1026c Determine the optimized pixel value of the first pixel of the first plane origin distance map according to the pixel value of the replaced pixel, the confidence information, and the pixel value of the first pixel of the first plane origin distance map.
  • When determining the optimized pixel value of the first pixel of the first plane origin distance map, it is first determined whether the pixel value of the replacement pixel is greater than 0, and the result is recorded with a truth function: when the pixel value of the replacement pixel is greater than 0, the function value of the truth function is 1; when the pixel value of the replacement pixel is less than or equal to 0, the function value of the truth function is 0. The optimized pixel value of the first pixel is then calculated from the function value of the truth function, the pixel value of the replacement pixel, the confidence information, and the pixel value of the first pixel in the first plane origin distance map.
  • Specifically, the function value of the truth function is multiplied by the confidence information and by the pixel value of the replacement pixel to obtain the first sub-optimized pixel value; the function value of the truth function is multiplied by the confidence information, the resulting product is subtracted from 1, and the difference is multiplied by the pixel value of the first pixel in the first plane origin distance map to obtain the second sub-optimized pixel value; finally, the first sub-optimized pixel value and the second sub-optimized pixel value are added to obtain the optimized pixel value of the first pixel.
  • The preset distance calculation model can also be set in other forms, which is not limited in the embodiments of the present disclosure.
  • The embodiments of the present disclosure provide a formula for calculating the optimized pixel value of the first pixel from the function value of the truth function, the pixel value of the replacement pixel, the confidence information, and the pixel value of the first pixel of the first plane origin distance map, as shown in formula (6):
  • M(x_i) is the confidence information of the replacement pixel
  • P(x_i) is the pixel value of the first pixel of the first plane origin distance map
  • P'(x_i) is the optimized pixel value of the first pixel of the first plane origin distance map.
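  • The body of formula (6) is not reproduced here; from the verbal description above, and writing the pixel value of the replacement pixel as $\bar{P}(x_i)$ and the truth (indicator) function as $\mathbb{1}[\cdot]$ (both notations assumed), it is consistent with

    $$P'(x_i) = \mathbb{1}[\bar{P}(x_i) > 0]\,M(x_i)\,\bar{P}(x_i) + \bigl(1 - \mathbb{1}[\bar{P}(x_i) > 0]\,M(x_i)\bigr)\,P(x_i).$$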
  • After the optimized pixel value of the first pixel in the first plane origin distance map has been calculated, an optimized pixel value is calculated in the same way for every pixel in the first plane origin distance map, and these optimized pixel values form the optimized first plane origin distance map.
  • It can be seen that the optimized pixel value can be calculated pixel by pixel for the first plane origin distance map to obtain the optimized first plane origin distance map, so that the diffusion intensity of each pixel of the optimized first plane origin distance map can subsequently be determined based on the optimized first plane origin distance map and the feature map, and a better completed depth map can be obtained according to this diffusion intensity and the pixel values of the optimized first plane origin distance map.
  • the diffusion intensity of each pixel in the to-be-diffused image is determined, that is, the implementation process of S103 may include: S1031-S1032, as follows:
  • S1031 Determine, from the image to be diffused, the set of pixels to be diffused corresponding to the second pixel of the image to be diffused, and determine the pixel value of each pixel in the set of pixels to be diffused; the second pixel is any pixel in the image to be diffused.
  • The set of pixels to be diffused refers to the pixels located in the neighborhood of the second pixel of the image to be diffused. According to the preset diffusion range, the neighborhood range of the second pixel of the image to be diffused is first determined, and then all the pixels in that neighborhood are extracted to form the set of pixels to be diffused for the second pixel of the image to be diffused.
  • the preset diffusion range can be set according to actual requirements, and the embodiments of the present disclosure are not limited herein.
  • For example, the preset diffusion range can be set to the 4-neighborhood, in which case 4 pixels are taken out to form the set of pixels to be diffused, or the preset diffusion range can be set to the 8-neighborhood, in which case the 8 pixels surrounding the second pixel of the image to be diffused are taken out to form the set of pixels to be diffused.
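  • A minimal sketch of this neighborhood selection follows; the constant and function names are assumptions:

```python
# 4-neighborhood and 8-neighborhood offsets around a pixel (row, column).
NEIGHBORHOOD_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
NEIGHBORHOOD_8 = NEIGHBORHOOD_4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def pixels_to_be_diffused(image, i, j, offsets=NEIGHBORHOOD_8):
    """Collect the coordinates and pixel values of the set of pixels to be diffused for pixel (i, j)."""
    h, w = image.shape[:2]
    neighbors = [(i + di, j + dj) for di, dj in offsets
                 if 0 <= i + di < h and 0 <= j + dj < w]
    return [(p, image[p]) for p in neighbors]
```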
  • S1032 Calculate the diffusion intensity corresponding to the second pixel of the image to be diffused by using the feature map, the second pixel of the image to be diffused, and each pixel in the set of pixels to be diffused.
  • The second pixel of the image to be diffused forms a pixel pair with each pixel in the set of pixels to be diffused; the sub-diffusion intensity of each of these pixel pairs is calculated, and these sub-diffusion intensities are collectively used as the diffusion intensity corresponding to the second pixel of the image to be diffused.
  • After the diffusion intensity corresponding to the second pixel of the image to be diffused is obtained, determining the diffused pixel value of each pixel in the image to be diffused based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused may include S1033-S1034, as follows:
  • After the diffusion intensity corresponding to the second pixel of the image to be diffused is obtained, the diffused pixel value of the second pixel of the image to be diffused is determined according to the diffusion intensity of the second pixel of the image to be diffused, the pixel value of the second pixel of the image to be diffused, and the pixel value of each pixel in the set of pixels to be diffused; in this way, the diffused pixel value of each pixel in the image to be diffused is determined based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused.
  • An embodiment of the present disclosure provides a process schematic diagram of a depth image completion method, as shown in FIG. 6.
  • the preliminary complemented depth map is used as the to-be-diffused image.
  • The depth map is collected by the radar and the two-dimensional image I of the three-dimensional scene is collected by the camera; the collected depth map and I are input into the preset prediction model 1 to obtain the preliminarily completed depth map D and the feature map G. Then, based on the preliminarily completed depth map D and the feature map G, the diffusion intensity 2 of each pixel in the preliminarily completed depth map D is determined, and based on the pixel value of each pixel in the preliminarily completed depth map D and the diffusion intensity 2, the diffused pixel value of each pixel in the preliminarily completed depth map D is obtained, so as to obtain the completed depth map D_r.
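  • The exact combination rule of S1033-S1034 is not reproduced in this excerpt; the sketch below shows one plausible pixel-level diffusion pass in which every pixel value is re-estimated as a weighted sum of itself and its neighbors, with the weights being the normalized diffusion intensities. All names are assumptions:

```python
import numpy as np

def diffuse_once(to_be_diffused, weights, offsets):
    """One diffusion pass over the map to be diffused.

    to_be_diffused: (H, W) map to be diffused (e.g. the preliminary completed depth map D)
    weights:        (H, W, K) diffusion intensities, one per offset, assumed normalized so
                    that they sum to 1 over the K offsets for every pixel
    offsets:        list of K (di, dj) offsets defining the preset diffusion range;
                    including (0, 0) lets each pixel's own value take part in the sum
    """
    out = np.zeros_like(to_be_diffused)
    for k, (di, dj) in enumerate(offsets):
        # Value of the neighbor at offset (di, dj) for every pixel (border handling is
        # simplified by wrap-around; a real implementation would pad or mask the borders).
        shifted = np.roll(to_be_diffused, shift=(-di, -dj), axis=(0, 1))
        out += weights[..., k] * shifted
    return out
```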
  • In the embodiments of the present disclosure, when the first plane origin distance map is used as the map to be diffused and the diffused pixel value of each pixel of the first plane origin distance map is calculated, a diffused first plane origin distance map is obtained.
  • the diffused first plane origin distance map is not a completed depth map, and the diffused first plane origin distance map needs to be inversely transformed to obtain the completed depth map.
  • Since the first plane origin distance map is calculated based on the preliminarily completed depth map, the normal prediction map, and the parameter matrix, a depth map can be calculated back from the diffused first plane origin distance map, the normal prediction map, and the parameter matrix, and the calculated depth map is used as the completed depth map.
  • Specifically, the normal vector of the tangent plane where each 3D point is located and the 2D projection of each 3D point on the image plane can be obtained from the normal prediction map, and the diffused first plane origin distance of each 3D point can be obtained from the diffused first plane origin distance map. The normal vector, the inverse of the parameter matrix, and the 2D projection are multiplied to obtain a product, the diffused first plane origin distance is divided by this product, and the resulting ratio is used as the depth completion information of each 3D point. The depth completion information corresponding to each 3D point can then be used as the pixel value to obtain the completed depth map.
  • The embodiments of the present disclosure provide a way of calculating the depth completion information corresponding to each 3D point, as shown in formula (7):
  • D'(x) represents the depth completion information corresponding to each 3D point
  • P_1(x) represents the diffused first plane origin distance of the 3D point
  • x represents the 2D projection of the 3D point on the image plane
  • N(x) represents the normal vector of the tangent plane where the 3D point X is located
  • C represents the parameter matrix.
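  • The body of formula (7) is not reproduced here; from the variable definitions above it is consistent with

    $$D'(x) = \frac{P_1(x)}{N(x)^{\top} C^{-1} \tilde{x}},$$

    that is, the diffused first plane origin distance divided by the product of the normal vector, the inverse parameter matrix, and the homogeneous 2D projection; formula (8) referenced later has the same form with $P'_1(x)$ in place of $P_1(x)$.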
  • an embodiment of the present disclosure provides a process diagram of a depth image completion method.
  • the first plane origin distance map is used as the image to be diffused.
  • The collected depth map and the two-dimensional image I are taken as input and sent to the preset prediction model 1 to obtain the preliminarily completed depth map D output by the sub-network 2 used for outputting the preliminarily completed depth map and the normal prediction map N output by the sub-network 3 used for predicting the normal map; at the same time, a convolutional layer is used to connect in series 4 the sub-network 2 used for outputting the preliminarily completed depth map and the sub-network 3 used for predicting the normal map, and the feature data in the convolutional layer is visualized to obtain the feature map G.
  • In this way, a diffused optimized first plane origin distance map can be obtained, and the diffused optimized first plane origin distance map then needs to be inversely transformed to obtain the completed depth map.
  • Specifically, the plane origin distance of each 3D point can be obtained from the diffused optimized first plane origin distance map, the normal vector of the tangent plane where each 3D point is located and the 2D projection of each 3D point on the image plane can be obtained from the normal prediction map, and the inverse of the parameter matrix is obtained at the same time.
  • The embodiments of the present disclosure may use formula (8) to calculate the depth completion information corresponding to each 3D point:
  • where D'(x) is the depth completion information corresponding to the 3D point, P'_1(x) is the diffused optimized first plane origin distance of the 3D point, N(x) is the normal vector of the tangent plane where the 3D point is located, x is the 2D projection of the 3D point on the image plane, and C is the parameter matrix of the camera.
  • an embodiment of the present disclosure provides a process schematic diagram of a depth image completion method.
  • The acquired depth map and the two-dimensional image I are sent to the preset prediction model 1 to obtain the preliminarily completed depth map D output by the sub-network 2 used for outputting the preliminarily completed depth map, the normal prediction map N output by the sub-network 3 used for predicting the normal map, and the first confidence map M output by the sub-network 4 used for outputting the first confidence map; at the same time, a convolutional layer is used to connect in series 5 the sub-network 2 used for outputting the preliminarily completed depth map and the sub-network 3 used for predicting the normal map, and the feature data in the convolutional layer is visualized to obtain the feature map G.
  • The optimized first plane origin distance map P′ is obtained, and based on the optimized first plane origin distance map P′ and the feature map G, the diffusion intensity 7 of each pixel in P′ is determined; then, based on the pixel value of each pixel in the optimized first plane origin distance map P′ and the diffusion intensity 7, the diffused pixel value of each pixel in P′ is obtained, yielding the diffused optimized first plane origin distance map P′_1. Finally, formula (8) is used to inversely transform the diffused optimized first plane origin distance map P′_1 together with the normal prediction map N and the parameter matrix, the depth completion information of each 3D point is calculated, and the completed depth map is then obtained.
  • In the embodiments of the present disclosure, the corresponding set of pixels to be diffused can be determined for each pixel of the image to be diffused according to the preset diffusion range, and the diffusion intensity of each pixel of the image to be diffused can then be calculated according to the feature map, each pixel of the image to be diffused, and the set of pixels to be diffused corresponding to each pixel; in this way, the diffused pixel value of each pixel in the image to be diffused can be calculated from the diffusion intensity, the pixel value of each pixel of the image to be diffused, and the set of pixels to be diffused corresponding to each pixel, so as to obtain the completed depth map.
  • The feature map, the second pixel of the image to be diffused, and each pixel in the set of pixels to be diffused are used to calculate the diffusion intensity corresponding to the second pixel of the image to be diffused; that is, the implementation process of S1032 can include S1032a-S1032f, as follows:
  • In the embodiments of the present disclosure, the preset feature extraction model is first used to extract the features of the second pixel of the image to be diffused, feature extraction is also performed on each pixel in the set of pixels to be diffused determined by the preset diffusion range, and the intensity normalization parameter corresponding to the second pixel of the image to be diffused is then calculated from the extracted feature information, so that the intensity normalization parameter can subsequently be used to obtain the diffusion intensity corresponding to the second pixel of the image to be diffused.
  • the intensity normalization parameter is a parameter used to normalize the result calculated by the feature information of the first feature pixel and the feature information of the second feature pixel to obtain the sub-diffusion intensity.
  • A small convolution kernel, such as a 1×1 convolution kernel, can be used as the preset feature extraction model, or other machine learning models that can achieve the same purpose can be used as the preset feature extraction model; the embodiments of the present disclosure are not limited here.
  • Since both the second pixel of the image to be diffused and each pixel in the set of pixels to be diffused are processed using a preset feature extraction model, at least two types of pixels are processed with such a model. Therefore, the same preset feature extraction model can be used to perform feature extraction on the second pixel of the image to be diffused and on each pixel in the set of pixels to be diffused, or different preset feature extraction models can be used to perform feature extraction on the second pixel of the image to be diffused and on each pixel in the set of pixels to be diffused separately.
  • The pixel corresponding to the second pixel of the image to be diffused is found in the feature map and regarded as the first feature pixel; at the same time, the pixel corresponding to the third pixel in the set of pixels to be diffused is found in the feature map and used as the second feature pixel.
  • the third pixel can be any pixel in the set of pixels to be diffused.
  • the feature map is an image obtained by visualizing the feature data of a certain layer in the preset prediction model
  • a convolutional layer with the same size as the image to be diffused is selected, and the feature data in the convolutional layer is visualized to obtain a feature map, so that the feature map corresponds to the pixels of the image to be diffused one-to-one.
  • the position information of the second pixel of the diffusion map is used to find the first characteristic pixel.
  • the second characteristic pixel can be found according to the position information of the third pixel in the set of pixels to be diffused.
  • the device may also search for the first characteristic pixel and the second characteristic pixel according to other methods, which are not limited in the embodiment of the present disclosure.
  • The pixel value of the first feature pixel is first extracted, and the preset feature extraction model is then applied to the pixel value of the first feature pixel to obtain the feature information of the first feature pixel; similarly, the pixel value of the second feature pixel is first extracted, and the preset feature extraction model is then applied to the pixel value of the second feature pixel to obtain the feature information of the second feature pixel.
  • the preset feature extraction model f can be used to perform feature extraction on the first feature pixel
  • the preset feature extraction model g can be used to perform feature extraction on the second feature pixel.
  • The first feature pixel is the pixel in the feature map corresponding to the second pixel of the image to be diffused, which can be expressed as G(x_i); the second feature pixel is the pixel in the feature map corresponding to the third pixel in the set of pixels to be diffused, which can be expressed as G(x_j).
  • The feature information of the first feature pixel is f(G(x_i)), and the feature information of the second feature pixel is g(G(x_j)). In this way, the device obtains the feature information of the first feature pixel and the feature information of the second feature pixel.
  • the preset diffusion control parameter is a parameter used to control the sub-diffusion intensity value.
  • the preset diffusion control parameter can be a fixed value set according to actual needs, or it can be a variable parameter that can be learned.
  • To calculate the sub-diffusion intensity, the feature information of the first feature pixel is first transposed, the transposed result is multiplied by the feature information of the second feature pixel, and the resulting product is subtracted from 1 to obtain a difference. The difference is then squared and divided by a multiple of the square of the preset diffusion control parameter, and the resulting ratio is used as the exponent of an exponential function with the natural constant e as its base. Finally, the intensity normalization parameter is used to normalize the result of this operation to obtain the final sub-diffusion intensity. It should be noted that the specific form of the preset diffusion intensity calculation model can also be set according to actual needs, which is not limited in the embodiments of the present disclosure.
  • the embodiment of the present disclosure provides a preset diffusion intensity calculation model, as shown in formula (9):
  • x i represents the second pixel of the image to be diffused
  • x j represents the third pixel in the set of pixels to be diffused
  • S(x i ) represents the intensity normalization parameter corresponding to the second pixel of the image to be diffused
  • G(x i ) represents the first feature pixel
  • G(x j ) represents the second feature pixel
  • f(G(x i )) is the feature information of the first feature pixel
  • g(G(x_j)) is the feature information of the second feature pixel
  • represents the preset diffusion control parameter
  • w(x_i, x_j) represents the sub-diffusion intensity of the diffusion pixel pair composed of the second pixel point of the image to be diffused and the third pixel point in the set of pixels to be diffused.
  • after obtaining the feature information f(G(x_i)) of the first feature pixel and the feature information g(G(x_j)) of the second feature pixel, and calculating the intensity normalization parameter corresponding to the second pixel of the image to be diffused, the specific values of these parameters can be substituted into formula (9) to calculate the sub-diffusion intensity w(x_i, x_j) of the diffusion pixel pair composed of the second pixel point of the image to be diffused and the third pixel point in the set of pixels to be diffused.
  • a sub-diffusion intensity can thus be calculated for every diffusion pixel pair composed of the second pixel point of the image to be diffused and a pixel in the set of pixels to be diffused, and all the calculated sub-diffusion intensities are collectively used as the diffusion intensity of the second pixel in the image to be diffused. In this way, the diffusion intensity of each pixel in the image to be diffused can be obtained, and the diffused pixel value of each pixel in the image to be diffused is then calculated according to the diffusion intensity, thus yielding a completed depth map with higher accuracy (a hedged sketch of this calculation is given below).
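As a concrete illustration of the calculation described above, the following is a minimal NumPy sketch of how the sub-diffusion intensity of one diffusion pixel pair could be computed. It assumes, as the verbal description suggests, that formula (9) has the Gaussian-like form w(x_i, x_j) = (1/S(x_i))·exp(−(1 − f(G(x_i))ᵀ g(G(x_j)))² / (2σ²)), where σ stands for the preset diffusion control parameter and the exponent is taken as the negated ratio so that more similar features give a larger intensity; the negation and the factor 2 are assumptions, and all names are illustrative rather than taken from the disclosure.

```python
import numpy as np

def sub_diffusion_intensity(feat_i, feat_j, s_norm, sigma=1.0):
    """Sub-diffusion intensity of one diffusion pixel pair (hypothetical
    reading of formula (9)): the transposed feature of the first feature
    pixel is multiplied with the feature of the second feature pixel, the
    product is subtracted from 1, squared, divided by a multiple of the
    squared diffusion control parameter sigma, exponentiated with base e,
    and finally normalized by the intensity normalization parameter."""
    affinity = (feat_i.T @ feat_j).item()        # f(G(x_i))^T g(G(x_j))
    w = np.exp(-(1.0 - affinity) ** 2 / (2.0 * sigma ** 2))
    return w / s_norm                            # normalize with S(x_i)

# Toy usage: two 8-dimensional feature vectors standing in for f(G(x_i)), g(G(x_j)).
rng = np.random.default_rng(0)
f_i = rng.normal(size=(8, 1))
g_j = rng.normal(size=(8, 1))
print(sub_diffusion_intensity(f_i, g_j, s_norm=1.0, sigma=1.0))
```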
  • the sub-diffusion intensity may be the similarity between the second pixel in the image to be diffused and the third pixel in the pixel set to be diffused.
  • the similarity between the second pixel point of the image to be diffused and the third pixel point in the set of pixels to be diffused can be used as the sub-diffusion intensity; that is, the degree of similarity between these two pixels determines how strongly the pixel value of the third pixel in the set of pixels to be diffused is diffused to the second pixel in the image to be diffused.
  • when the second pixel in the image to be diffused and the third pixel in the set of pixels to be diffused are relatively similar, the two pixels are considered most likely to lie on the same plane in the three-dimensional scene, so the third pixel in the set of pixels to be diffused will have a greater diffusion intensity towards the second pixel in the image to be diffused.
  • when the second pixel in the image to be diffused is not similar to the third pixel in the set of pixels to be diffused, the two pixels are considered not to lie on the same plane, so the diffusion intensity of the third pixel in the set of pixels to be diffused towards the second pixel in the image to be diffused will be small, which avoids errors in the pixel diffusion process.
  • in this way, the sub-diffusion intensity can be determined according to the degree of similarity between each pixel in the image to be diffused and every pixel in its set of pixels to be diffused, which ensures that only pixels lying on the same plane as a given pixel contribute to the diffusion of that pixel's value, so that a completed depth map with higher accuracy is obtained.
  • the second pixel of the image to be diffused and each pixel in the set of pixels to be diffused are used to calculate the intensity normalization parameter corresponding to the second pixel of the image to be diffused; that is, the implementation process of S1032a can include S201-S204, as follows:
  • S201 Extract the feature information of the second pixel of the image to be diffused and the feature information of the third pixel in the set of pixels to be diffused.
  • the pixel value of the second pixel of the image to be diffused is first obtained, and the preset feature extraction model is used to calculate this pixel value to obtain the feature information of the second pixel of the image to be diffused.
  • likewise, the pixel value of the third pixel in the set of pixels to be diffused is first obtained, and the preset feature extraction model is then used to calculate this pixel value to obtain the feature information of the third pixel in the set of pixels to be diffused.
  • the second pixel of the image to be diffused is expressed as x i and the third pixel in the set of pixels to be diffused is expressed as x j
  • the preset feature extraction model f is used to perform feature extraction on the second pixel of the image to be diffused, and the preset feature extraction model g is used to perform feature extraction on the third pixel in the set of pixels to be diffused.
  • the feature information of the second pixel in the image to be diffused can then be expressed as f(x_i), and the feature information of the third pixel in the set of pixels to be diffused can be expressed as g(x_j).
  • the feature information of the second pixel of the image to be diffused is matrix-transposed, and the transposed result is multiplied by the feature information of the third pixel in the set of pixels to be diffused; the product is subtracted from 1, and the obtained difference is squared to obtain a squared result. The squared result is divided by a multiple of the square of the preset diffusion control parameter, the obtained ratio is used as the exponent of an exponential function whose base is the natural constant e, and the final calculation result is used as the sub-normalization parameter corresponding to the third pixel in the set of pixels to be diffused.
  • the preset sub-normalized parameter calculation model can be set in other forms according to actual needs, and the embodiment of the present disclosure does not limit it here.
  • x i represents the second pixel of the image to be diffused
  • x j represents the third pixel in the set of pixels to be diffused
  • f(x i ) represents the feature information of the second pixel of the image to be diffused
  • g(x j ) Represents the characteristic information of the third pixel in the pixel set to be diffused
  • represents the preset diffusion control parameter
  • s(x j ) represents the sub-normalized parameter corresponding to the third pixel in the pixel set to be diffused.
  • S204 Accumulate the sub-normalized parameters of each pixel of the pixel set to be diffused to obtain the intensity normalized parameter corresponding to the second pixel of the image to be diffused.
  • the device can obtain the intensity normalization parameter corresponding to the second pixel point of the image to be diffused using equation (11):
  • N i denotes the set of pixels to be diffused
  • S(x_i) represents the intensity normalization parameter corresponding to the second pixel of the image to be diffused
  • the values of these sub-normalization parameters can be directly substituted into formula (11) for accumulation, and the obtained accumulation result is used as the intensity normalization parameter corresponding to the second pixel of the image to be diffused.
  • in this way, feature extraction can be performed on the second pixel of the image to be diffused and on each pixel in the set of pixels to be diffused, the preset sub-normalization parameter calculation model is used to calculate the sub-normalization parameters from the extracted feature information and the preset diffusion control parameter, and all the obtained sub-normalization parameters are accumulated to obtain the intensity normalization parameter, so that the device can subsequently use the intensity normalization parameter to calculate the diffusion intensity (a hedged sketch is given below).
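A minimal sketch of S201-S204 follows. It assumes the sub-normalization parameter of formula (10) has the same exponential form as the sub-diffusion intensity (the negation of the exponent and the factor 2 are assumptions, not stated verbatim in the text) and that formula (11) simply sums the sub-normalization parameters over the set of pixels to be diffused; the feature extraction models f and g are stood in for by fixed random projections, so every name here is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
W_f = rng.normal(size=(8, 1))   # stand-in for feature extraction model f
W_g = rng.normal(size=(8, 1))   # stand-in for feature extraction model g

def sub_normalization(p_i, p_j, sigma=1.0):
    """S202/S203: extract features of the second pixel (value p_i) and of a
    third pixel (value p_j), then compute the sub-normalization parameter."""
    f_i, g_j = W_f * p_i, W_g * p_j              # S201: feature extraction
    diff = 1.0 - (f_i.T @ g_j).item()
    return np.exp(-diff ** 2 / (2.0 * sigma ** 2))

def intensity_normalization(p_i, neighbor_values, sigma=1.0):
    """S204: accumulate the sub-normalization parameters of every pixel in
    the set of pixels to be diffused to obtain S(x_i) (formula (11))."""
    return sum(sub_normalization(p_i, p_j, sigma) for p_j in neighbor_values)

# Toy usage: the second pixel and the values of its 8-neighborhood.
print(intensity_normalization(2.0, [1.9, 2.1, 2.0, 5.0, 1.8, 2.2, 2.0, 1.95]))
```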
  • the implementation process of S1033 may include: S1033a-S1033d, as follows:
  • the pixel value of the second pixel of the image to be diffused and the diffusion intensity of the second pixel of the image to be diffused are acquired first. Within the diffusion intensity, the sub-diffusion intensity corresponding to the third pixel in the set of pixels to be diffused is multiplied by the pixel value of the second pixel of the image to be diffused to obtain a product; this is repeated until the sub-diffusion intensity of every pixel in the set of pixels to be diffused has been multiplied by the pixel value of the second pixel of the image to be diffused, and all the obtained products are then accumulated to calculate the first diffusion part of the second pixel in the image to be diffused.
  • the first diffusion portion of the second pixel of the image to be diffused may also be calculated according to other methods, which is not limited in the embodiment of the present disclosure.
  • the first diffusion part can be calculated by formula (12), and formula (12) is as follows:
  • w(x i , x j ) is the sub-diffusion intensity corresponding to the third pixel in the pixel set to be diffused
  • N(x i ) represents the pixel set to be diffused
  • P(x_i) represents the pixel value of the second pixel point of the image to be diffused, and p_1(x_i) represents the calculated first diffusion part of the second pixel point of the image to be diffused.
  • the pixel value of the second pixel of the image to be diffused and the value of the sub-diffusion intensity of each pixel in the set of pixels to be diffused can be substituted into formula (12) to calculate the first diffusion part of the second pixel point of the image to be diffused.
  • it should be noted that after each sub-diffusion intensity is multiplied by the pixel value of the second pixel of the image to be diffused and the products are accumulated, the accumulated result will not exceed the pixel value of the second pixel of the original image to be diffused (a hedged sketch is given below).
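The following is a minimal sketch of formula (12) as described above: every sub-diffusion intensity is multiplied by the pixel value of the second pixel of the image to be diffused, and the products are accumulated. The variable names are illustrative only.

```python
import numpy as np

def first_diffusion_part(p_i, sub_intensities):
    """p1(x_i) = sum_j w(x_i, x_j) * P(x_i): each sub-diffusion intensity is
    multiplied by the pixel value of the second pixel itself, and the
    products are accumulated."""
    return float(np.sum(np.asarray(sub_intensities) * p_i))

# Toy usage: pixel value 2.0 and normalized sub-diffusion intensities.
print(first_diffusion_part(2.0, [0.2, 0.1, 0.15, 0.05, 0.2, 0.1, 0.1, 0.1]))
```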
  • to calculate the second diffusion part, the sub-diffusion intensity corresponding to the third pixel in the set of pixels to be diffused is first multiplied by the pixel value of the third pixel in the set of pixels to be diffused to obtain a product; this is repeated until each sub-diffusion intensity has been multiplied by the corresponding pixel value in the set of pixels to be diffused, all the products are then accumulated, and the obtained accumulation result is used as the second diffusion part of the second pixel of the image to be diffused.
  • the second diffusion portion of the second pixel of the image to be diffused can also be calculated according to other methods, which is not limited in the embodiment of the present disclosure.
  • equation (13) can be used to calculate the second diffusion part:
  • w(x i , x j ) is the sub-diffusion intensity corresponding to the third pixel in the set of pixels to be diffused
  • N(x i ) represents the set of pixels to be diffused
  • P(x_j) represents the pixel value of the third pixel in the set of pixels to be diffused, and p_2(x_i) represents the calculated second diffusion part of the second pixel of the image to be diffused.
  • the pixel value of each pixel in the set of pixels to be diffused and the value of its sub-diffusion intensity can be substituted into formula (13) to calculate the second diffusion part of the second pixel point of the image to be diffused (a hedged sketch is given below).
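Analogously, a minimal sketch of formula (13): each sub-diffusion intensity is multiplied by the pixel value of the corresponding pixel in the set of pixels to be diffused, and the products are accumulated. Names are illustrative.

```python
import numpy as np

def second_diffusion_part(neighbor_values, sub_intensities):
    """p2(x_i) = sum_j w(x_i, x_j) * P(x_j): each sub-diffusion intensity is
    multiplied by the pixel value of the corresponding neighboring pixel,
    and the products are accumulated."""
    return float(np.dot(np.asarray(sub_intensities), np.asarray(neighbor_values)))

# Toy usage: 8-neighborhood pixel values and their sub-diffusion intensities.
print(second_diffusion_part([1.9, 2.1, 2.0, 5.0, 1.8, 2.2, 2.0, 1.95],
                            [0.2, 0.1, 0.15, 0.05, 0.2, 0.1, 0.1, 0.1]))
```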
  • the first diffusion part can first be subtracted from the pixel value of the second pixel of the image to be diffused, the second diffusion part is then added to the obtained difference, and the final addition result is used as the diffused pixel value. It should be noted that the embodiments of the present disclosure may also perform other processing on the pixel value of the second pixel of the image to be diffused, the first diffusion part and the second diffusion part to obtain the diffused pixel value of the second pixel of the image to be diffused, which is not limited in the embodiments of the present disclosure.
  • the embodiment of the present disclosure can obtain the diffused pixel value of the second pixel of the image to be diffused according to formula (14), and complete the pixel diffusion:
  • P(x i ) represents the pixel value of the second pixel of the image to be diffused
  • w(x i ,x j ) is the sub-diffusion intensity corresponding to the third pixel in the set of pixels to be diffused
  • N(x i ) represents The set of pixels to be diffused
  • P(x j ) represents the pixel value of the third pixel in the set of pixels to be diffused.
  • the diffused pixel value of the second pixel of the image to be diffused is calculated.
  • that is, the first diffusion part is subtracted from the pixel value of the second pixel of the image to be diffused, the second diffusion part is added to the difference, and the final addition result is used as the diffused pixel value, which can be expressed by formula (15):
  • p 1 (x i ) represents the calculated first diffusion part of the second pixel of the image to be diffused
  • p 2 (x i ) represents the calculated second diffusion part of the second pixel of the image to be diffused
  • P(x i ) represents the pixel value of the second pixel of the image to be diffused
  • the embodiment of the present disclosure provides a schematic diagram of calculating the diffused pixel value of the second pixel of the image to be diffused.
  • to calculate the diffused pixel value, the set of pixels to be diffused must first be determined for the second pixel in the image to be diffused.
  • as shown in FIG. 11, the set of pixels to be diffused is determined according to the 8-neighborhood: the second pixel point x_i of the image to be diffused is located at the center of the nine-square grid at the upper left, and the set of the 8 pixels around it is its set of pixels to be diffused.
  • the above steps are then repeated to calculate the diffused pixel value of each pixel in the image to be diffused, so as to obtain the completed depth map (a hedged sketch of this per-pixel diffusion is given below).
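Putting the pieces together, the sketch below performs one diffusion pass over a small depth map, assuming an 8-neighborhood as in FIG. 11 and following the description of formula (15): the first diffusion part is subtracted from the pixel value of the second pixel and the second diffusion part is added. The uniform sub-diffusion intensities used here are placeholders for the feature-based intensities of formula (9).

```python
import numpy as np

def diffuse_once(to_diffuse, weights):
    """One pixel-diffusion pass: for every interior pixel x_i, with its
    8-neighborhood as the set of pixels to be diffused, compute
    P'(x_i) = P(x_i) - sum_j w_ij * P(x_i) + sum_j w_ij * P(x_j)."""
    h, w = to_diffuse.shape
    out = to_diffuse.copy()
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            neigh = [to_diffuse[i + di, j + dj] for di, dj in offsets]
            p1 = sum(wt * to_diffuse[i, j] for wt in weights)    # formula (12)
            p2 = sum(wt * pv for wt, pv in zip(weights, neigh))  # formula (13)
            out[i, j] = to_diffuse[i, j] - p1 + p2               # formula (15)
    return out

depth = np.array([[2.0, 2.0, 2.0, 2.0],
                  [2.0, 0.0, 2.1, 2.0],
                  [2.0, 2.0, 2.0, 2.0],
                  [2.0, 2.0, 2.0, 2.0]])
print(diffuse_once(depth, [1.0 / 8] * 8))   # the missing value 0.0 gets filled in
```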
  • in this way, the diffused pixel value of each pixel in the image to be diffused is calculated one by one according to the diffusion intensity, the pixel value of each pixel in the image to be diffused and the pixel values of all pixels in the corresponding set of pixels to be diffused, so that the acquired depth map can be fully utilized to obtain a completed depth map with higher accuracy.
  • the method may further include: S105, as follows:
  • S105 Use the completed depth map as the image to be diffused, and repeat the step of determining the diffusion intensity of each pixel in the image to be diffused based on the image to be diffused and the feature map, the step of determining the diffused pixel value of each pixel in the image to be diffused based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused, and the step of determining the completed depth map according to the diffused pixel value of each pixel in the image to be diffused, until a preset number of repetitions is reached.
  • the preset number of repetitions can be set to 8; after the completed depth map is obtained, the above-mentioned steps are performed 7 more times on the completed depth map so that the pixels spread more fully (see the sketch below). It should be noted that the preset number of repetitions can be set according to actual requirements, and the embodiments of the present disclosure are not limited herein.
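A minimal sketch of S105 is shown below: the completed depth map is fed back as the new image to be diffused and the diffusion is repeated a preset number of times. The helper names are illustrative, and the trivial smoothing step stands in for the real feature-driven diffusion pass (such as the diffuse_once sketch above).

```python
import numpy as np

def repeat_diffusion(to_diffuse, diffuse_step, repetitions=8):
    """S105: use the completed depth map as the new image to be diffused and
    repeat the diffusion steps until the preset number of repetitions is
    reached. `diffuse_step` is any single-pass diffusion function."""
    completed = np.asarray(to_diffuse, dtype=float)
    for _ in range(repetitions):
        completed = diffuse_step(completed)
    return completed

# Toy usage with a trivial smoothing step standing in for the real diffusion.
smooth = lambda d: 0.5 * d + 0.5 * np.roll(d, 1, axis=1)
print(repeat_diffusion(np.arange(16, dtype=float).reshape(4, 4), smooth, repetitions=8))
```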
  • the method may further include: S106, as follows:
  • when the image to be diffused is the first plane origin distance map, using the completed depth map as the image to be diffused, and repeatedly executing the step of determining the diffusion intensity of each pixel in the image to be diffused based on the image to be diffused and the feature map and the step of determining the diffused pixel value of each pixel based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused, includes:
  • the step of calculating the first plane origin distance map based on the preliminary completed depth map, the camera parameter matrix and the normal prediction map; the step of determining the first confidence map based on the depth map and the two-dimensional image; the step of calculating the second plane origin distance map based on the depth map, the parameter matrix and the normal prediction map; and the step of optimizing the pixels in the first plane origin distance map according to the pixels in the first confidence map, the pixels in the second plane origin distance map and the pixels in the first plane origin distance map, to obtain an optimized first plane origin distance map, and using the optimized first plane origin distance map as the image to be diffused.
  • specifically, the second plane origin distance information is calculated to obtain the second plane origin distance map, and the first plane origin distance information of all pixels is calculated to obtain the first plane origin distance map. Then, when it is judged that the current number of repetitions is less than the preset number of repetitions, replacement distance information is calculated for each pixel value P(x) in the first plane origin distance map, and the pixel values are optimized to obtain the optimized first plane origin distance map.
  • the optimized first plane origin distance map is used as the image to be diffused, the set of pixels to be diffused corresponding to the second pixel point in the optimized first plane origin distance map is determined, the diffusion intensity corresponding to the second pixel point is calculated, and the diffused pixel value of the second pixel point of the optimized first plane origin distance map is calculated, so that the diffused optimized first plane origin distance map is obtained; the diffused optimized first plane origin distance map is then inversely transformed to obtain the completed depth map.
  • the embodiments of the present disclosure show the influence of the value of the preset number of repetitions on the error of the completed depth map, as illustrated in FIGS. 12(a) and 12(b).
  • the KITTI data set is used for testing; the abscissa is the value of the preset number of repetitions, the ordinate is the Root Mean Square Error (RMSE), and the unit of RMSE is mm.
  • in this way, after the completed depth map is obtained, it can continue to be repeatedly completed, thereby further improving the accuracy of the completed depth map.
  • the depth image completion method may be implemented by using a preset prediction model. After the depth map and the two-dimensional image of the target scene are acquired, the preset prediction model stored in the depth image completion device is first obtained, and the depth map and the two-dimensional image are then sent as input to the preset prediction model for calculation, so as to perform preliminary prediction processing; the image to be diffused and the feature map are obtained according to the output result of the preset prediction model, so that pixel diffusion can subsequently be implemented based on the image to be diffused and the feature map.
  • the preset prediction model is a model that has been trained.
  • a trained convolutional neural network (Convolutional Neural Networks, CNN) model can be used as the preset prediction model.
  • CNN convolutional Neural Networks
  • other network models that can achieve the same purpose or other machine learning models can also be used as the preset prediction model according to actual conditions, and the embodiments of the present disclosure are not limited herein.
  • Residual Networks in CNN, ResNet-34 or ResNet-50, may be used as the preset prediction model.
  • the prediction result obtained by the preset prediction model can be directly used as the to-be-diffused map, or the prediction result can be processed to obtain the to-be-diffused map.
  • the obtained image to be diffused refers to the map, obtained according to the output of the preset prediction model, that is used to diffuse the pixel values; and the obtained feature map refers to the image obtained by inputting the depth map and the two-dimensional image into the preset prediction model for calculation and then visualizing the feature data of a certain layer in the preset prediction model.
  • in some embodiments, the preset prediction model performs prediction on the depth map and the two-dimensional image and has two outputs. Therefore, when obtaining the feature map, only the feature data in the sub-network used to output the preliminary completed depth map may be visualized to obtain the feature map, or only the feature data in the sub-network used to output the normal prediction map may be visualized to obtain the feature map; the sub-network used to output the preliminary completed depth map and the sub-network used to output the normal prediction map may also be connected in series, and the feature data in the series network visualized to obtain the feature map. Of course, other methods may also be used to obtain the feature map, and the embodiments of the present disclosure are not limited herein.
  • for example, the depth map and the two-dimensional image can be sent to ResNet-34 for prediction, the feature data in the penultimate layer of ResNet-34 can then be visualized, and the visualization result is used as the feature map (a hedged sketch is given below).
  • the feature map can also be obtained in other ways, and the embodiments of the present disclosure are not limited herein.
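As a rough illustration of how such intermediate feature data might be read out in practice, the sketch below registers a forward hook on a stock torchvision ResNet-34 and captures the activations of its last residual stage, standing in for the "penultimate layer" mentioned above. The real preset prediction model in this disclosure takes a depth map plus a two-dimensional image and has two output branches, so this stock backbone and all names here are placeholders, not the disclosure's model.

```python
import torch
import torchvision.models as models

backbone = models.resnet34()   # placeholder backbone with random weights
captured = {}

def hook(module, inputs, output):
    # Store the intermediate feature data; turning it into a feature map
    # aligned with the input resolution (e.g. channel averaging plus
    # upsampling) is omitted here.
    captured["features"] = output.detach()

backbone.layer4.register_forward_hook(hook)

dummy = torch.randn(1, 3, 224, 224)   # stand-in for the RGB input
with torch.no_grad():
    backbone(dummy)
print(captured["features"].shape)      # e.g. torch.Size([1, 512, 7, 7])
```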
  • the preset prediction model can be obtained by training using the following methods:
  • the acquired training samples include at least training depth map samples, training two-dimensional image samples, and, corresponding to the training depth map samples and the training two-dimensional image samples, the truth map of the preliminary completed depth map, the truth map of the normal prediction map and the truth map of the first confidence map.
  • the truth map of the preliminary completed depth map refers to an image composed of the real depth information of the three-dimensional scene as pixel values
  • the truth map of the normal prediction map is calculated from the truth map of the preliminary completed depth map, for example by Principal Component Analysis (PCA).
  • the truth map of the first confidence map is the image calculated using the training depth map and the truth map of the depth map.
  • the true value of the confidence of each 3D point is calculated, and then the true value of the confidence of each 3D point is used as the pixel value to obtain the truth map of the first confidence map.
  • to calculate the true value of the confidence of each 3D point, the true value of the depth information of the 3D point is first subtracted from the depth information of the 3D point, and the absolute value of the obtained difference is taken to obtain an absolute value result; the absolute value result is then divided by the preset error tolerance parameter, the obtained ratio is used as the exponent of an exponential function whose base is the natural constant e, and the result is the true value of the confidence of the 3D point (see the sketch below).
  • formula (17) can be used to calculate the true value of the confidence level of 3D points, which is as follows:
  • D*(x) represents the true value of the depth information of the 3D point
  • b is the preset error tolerance parameter
  • M * (x) is the true value of the calculated confidence
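A minimal sketch of the confidence truth value computation described above. It assumes formula (17) has the form M*(x) = exp(−|D(x) − D*(x)| / b), with the exponent negated so that confidence decreases as the absolute depth error grows; the negation is suggested by the curve in FIG. 13(b) rather than stated verbatim, so treat it as an assumption.

```python
import numpy as np

def confidence_truth(train_depth, depth_truth, b=100.0):
    """Truth map of the first confidence map (hypothetical reading of
    formula (17)): per-pixel absolute error between the training depth and
    the depth truth, divided by the error tolerance parameter b, used as
    the (negated) exponent of an exponential with base e."""
    abs_err = np.abs(np.asarray(train_depth, float) - np.asarray(depth_truth, float))
    return np.exp(-abs_err / b)

# Toy usage (depths in mm): small errors give confidence near 1, large errors near 0.
print(confidence_truth([1000.0, 1200.0, 3000.0], [1005.0, 1200.0, 2000.0], b=100.0))
```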
  • the preset error tolerance parameter will affect the calculation process of the truth map of the first confidence map. Therefore, the preset error tolerance parameter can be set according to experience. It is not limited here.
  • FIG. 13(a) shows the influence of the preset error tolerance parameter on the truth map of the first confidence map provided by the embodiments of the present disclosure.
  • the abscissa is the value of the preset error tolerance parameter b.
  • the ordinate is the Root Mean Square Error (RMSE) of the truth map of the first confidence map calculated using different preset error tolerance parameters b, and the unit of RMSE is mm.
  • for example, the preset error tolerance parameter b can be set to 100.
  • the embodiments of the present disclosure also show the influence of the value of the preset error tolerance parameter on the distribution of the confidence truth value-absolute error (AE) curve.
  • the abscissa of FIG. 13(b) is the absolute error, where the unit of AE is m, and the ordinate is the confidence truth value M*.
  • the training samples are used to perform supervised training on the prediction model, and the training is stopped when the loss function meets the requirements, and the prediction parameters are obtained, so that the preset prediction model can be obtained subsequently.
  • the training depth map samples and the training two-dimensional image samples are used as input, and the truth map of the preliminary completed depth map, the truth map of the normal prediction map and the truth map of the first confidence map corresponding to the training depth map samples and the training two-dimensional image samples are used as supervision for supervised training.
  • sub-loss functions can be set for the truth map of the preliminary completed depth map, the truth map of the normal prediction map and the truth map of the first confidence map; these sub-loss functions are then respectively multiplied by the corresponding loss-function weight adjustment parameters, and the loss function of the preset prediction model is finally obtained from the multiplication results.
  • the loss function of the preset prediction model can be set as:
  • L D is the sub-loss function corresponding to the truth map of the preliminary completed depth map
  • L N is the sub-loss function corresponding to the truth map of the normal prediction map
  • L_C is the sub-loss function corresponding to the truth map of the first confidence map
  • the loss function of the preset prediction model can also be set to other forms, which is not limited in the embodiment of the present disclosure.
  • the weight adjustment parameters of the loss function can be set according to actual conditions, which is not limited in the embodiments of the present disclosure.
  • the sub-loss function corresponding to the truth map of the preliminary completed depth map can be set as:
  • D(x) represents the preliminary depth information of the 3D point predicted from the training sample
  • D * (x) represents the true value of the original depth information of the 3D point
  • n is the total number of pixels of the preliminary completed depth map.
  • the sub-loss function corresponding to the truth map of the normal prediction map can be set as:
  • N(x) represents the normal vector of the tangent plane where the 3D point is predicted from the training sample
  • N * (x) represents the true normal vector of the 3D point
  • n is the total number of pixels in the normal prediction image.
  • the sub-loss function corresponding to the truth map of the first confidence map can be set as:
  • M(x) represents the confidence information corresponding to the 3D point predicted from the training sample
  • M * (x) represents the truth value of the confidence information corresponding to the 3D point calculated by formula (17)
  • n is the total number of pixels in the first confidence map (a hedged sketch of assembling these losses is given below).
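The following sketch shows one plausible assembly of the training loss described above: per-pixel sub-losses for the preliminary completed depth map, the normal prediction map and the first confidence map, each multiplied by its own weight adjustment parameter. The text does not specify the exact norms or weights, so the mean squared error and the weights (1.0, 0.5, 0.5) are assumptions for illustration only.

```python
import numpy as np

def sub_loss(pred, truth):
    """Mean squared per-pixel error over n pixels (assumed form of the
    sub-loss functions L_D, L_N and L_C)."""
    pred, truth = np.asarray(pred, float), np.asarray(truth, float)
    return float(np.mean((pred - truth) ** 2))

def total_loss(depth, depth_gt, normal, normal_gt, conf, conf_gt,
               w_d=1.0, w_n=0.5, w_c=0.5):
    """Loss of the preset prediction model: each sub-loss is multiplied by
    its weight adjustment parameter and the results are summed."""
    return (w_d * sub_loss(depth, depth_gt)
            + w_n * sub_loss(normal, normal_gt)
            + w_c * sub_loss(conf, conf_gt))

# Toy usage with random predictions and ground truth.
rng = np.random.default_rng(2)
d, n, c = rng.random(10), rng.random((10, 3)), rng.random(10)
print(total_loss(d + 0.1, d, n, n, c, c))
```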
  • the device can select appropriate hyperparameters to train the prediction model, so that a preset prediction model with better effect can be obtained subsequently.
  • the obtained prediction parameters and the prediction model can be used to form the preset prediction model, so that the device can subsequently use the preset prediction model to make predictions on the depth map and the two-dimensional image it collects.
  • the embodiment of the present disclosure shows the influence of the sampling rate of the preset prediction model on the depth map after completion.
  • the test is performed on the KITTI data set; the abscissa is the sampling rate, the ordinate is RMSE, and the unit of RMSE is mm.
  • the prediction model can be trained to obtain the prediction parameters, and the prediction parameters and the prediction model are used to form the preset prediction model, so that the preset prediction model can subsequently be used to process the depth map and the two-dimensional image collected in real time.
  • the embodiment of the present disclosure provides a schematic diagram of comparing the effects of a depth image completion method with that of a depth completion technology in the related art.
  • FIG. 15(a) is a schematic diagram of the collected depth map and two-dimensional image of the three-dimensional scene.
  • Figure 15(b) is the completed depth map obtained by using the Convolutional Spatial Propagation Network (CSPN) in the related technology to perform depth completion
  • Figure 15(c) is the completed depth map obtained by using the NConv-Convolutional Neural Network (NConv-CNN) in the related technology, and Figure 15(d) is the completed depth map obtained by the Sparse-to-Dense method in the related technology.
  • Figure 15(e) is the normal prediction map provided by the embodiment of the disclosure
  • Figure 15(f) is the first confidence map predicted by the embodiment of the disclosure.
  • Figure 15(g) is the completed depth map obtained by using the depth image completion method provided by an embodiment of the present disclosure. Comparing FIG. 15(b), FIG. 15(c) and FIG. 15(d) with FIG. 15(g), it can be seen that, compared with the related technologies, the completed depth map obtained by the depth image completion method provided by the embodiments of the present disclosure has a better effect, fewer pixels with wrong depth information, and more comprehensive detail.
  • the writing order of the steps does not imply a strict execution order and does not constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible inner logic.
  • an embodiment of the present disclosure provides a depth image complementing device 1, and the depth image complementing device 1 may include:
  • the acquisition module 10 is configured to acquire a depth map of a target scene through a set radar, and to acquire a two-dimensional image of the target scene through a set camera;
  • the processing module 11 is configured to determine the image to be diffused and the feature map based on the acquired depth map and the two-dimensional image, and to determine the diffusion intensity of each pixel in the image to be diffused based on the image to be diffused and the feature map; the diffusion intensity represents the intensity with which the pixel value of each pixel in the image to be diffused diffuses to adjacent pixels;
  • the diffusion module 12 is configured to determine the completed depth map based on the pixel value of each pixel in the to-be-diffused image and the diffusion intensity of each pixel in the to-be-diffused image.
  • the diffusion module 12 is further configured to determine the diffused pixel value of each pixel in the image to be diffused based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused, and to determine the completed depth map according to the diffused pixel value of each pixel in the image to be diffused.
  • in some embodiments, the image to be diffused is the preliminary completed depth map, and when the diffusion module 12 is configured to determine the completed depth map according to the diffused pixel value of each pixel in the image to be diffused, it is further configured to use the diffused pixel value of each pixel in the image to be diffused as the pixel value of each pixel of the diffused image, and to use the diffused image as the completed depth map.
  • the map to be diffused is a first plane origin distance map
  • when the processing module 11 is configured to determine the image to be diffused and the feature map according to the depth map and the two-dimensional image, it is also configured to obtain the parameter matrix of the camera; determine the preliminary completed depth map, the feature map and the normal prediction map according to the depth map and the two-dimensional image, where the normal prediction map refers to an image that uses the normal vector of each point of the three-dimensional scene as the pixel value; and calculate the first plane origin distance map according to the preliminary completed depth map, the camera parameter matrix and the normal prediction map, where the first plane origin distance map is an image in which the distance from the camera to the plane where each point of the three-dimensional scene is located, calculated by using the preliminary completed depth map, is taken as the pixel value.
  • the processing module 11 is further configured to determine a first confidence map according to the depth map and the two-dimensional image, where the first confidence map refers to an image that uses the confidence corresponding to each pixel in the depth map as the pixel value; to calculate a second plane origin distance map according to the depth map, the parameter matrix and the normal prediction map, where the second plane origin distance map is an image in which the distance from the camera to the plane of each point of the three-dimensional scene, calculated by using the depth map, is taken as the pixel value; and to optimize the pixels in the first plane origin distance map according to the pixels in the first confidence map, the pixels in the second plane origin distance map and the pixels in the first plane origin distance map, to obtain an optimized first plane origin distance map.
  • when the processing module 11 is configured to optimize the pixels in the first plane origin distance map according to the pixels in the first confidence map, the pixels in the second plane origin distance map and the pixels in the first plane origin distance map to obtain the optimized first plane origin distance map, it is also configured to determine, from the second plane origin distance map, the pixel point corresponding to the first pixel point of the first plane origin distance map as a replacement pixel point and determine the pixel value of the replacement pixel point, where the first pixel point is any pixel in the first plane origin distance map; determine the confidence information corresponding to the replacement pixel from the first confidence map; determine the optimized pixel value of the first pixel of the first plane origin distance map according to the pixel value of the replacement pixel, the confidence information and the pixel value of the first pixel of the first plane origin distance map; and repeat the above steps until the optimized pixel value of each pixel in the first plane origin distance map is determined, so as to obtain the optimized first plane origin distance map.
  • when the processing module 11 is configured to determine the diffusion intensity of each pixel in the image to be diffused based on the image to be diffused and the feature map, it is also configured to determine, from the image to be diffused and according to a preset diffusion range, the set of pixels to be diffused corresponding to the second pixel point of the image to be diffused and determine the pixel value of each pixel in the set of pixels to be diffused, where the second pixel is any pixel in the image to be diffused; and to calculate the diffusion intensity corresponding to the second pixel of the image to be diffused by using the feature map, the second pixel of the image to be diffused and each pixel in the set of pixels to be diffused.
  • when the diffusion module 12 is configured to determine the diffused pixel value of each pixel in the image to be diffused based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused, it is further configured to determine the diffused pixel value of the second pixel in the image to be diffused based on the diffusion intensity of the second pixel of the image to be diffused, the pixel value of the second pixel of the image to be diffused and the pixel value of each pixel in the set of pixels to be diffused, and to repeat the above steps until the diffused pixel value of each pixel in the image to be diffused is determined.
  • when the processing module 11 is configured to calculate the diffusion intensity corresponding to the second pixel of the image to be diffused by using the feature map, the second pixel of the image to be diffused and each pixel in the set of pixels to be diffused, it is also configured to calculate the intensity normalization parameter corresponding to the second pixel of the image to be diffused by using the second pixel of the image to be diffused and each pixel in the set of pixels to be diffused; use the pixel in the feature map corresponding to the second pixel of the image to be diffused as the first feature pixel and the pixel corresponding to the third pixel point in the set of pixels to be diffused as the second feature pixel, where the third pixel is any pixel in the set of pixels to be diffused; extract the feature information of the first feature pixel and the feature information of the second feature pixel; and calculate the sub-diffusion intensities by using the feature information of the first feature pixel, the feature information of the second feature pixel, the intensity normalization parameter and the preset diffusion control parameter, so as to obtain the diffusion intensity corresponding to the second pixel of the image to be diffused.
  • when the processing module 11 is configured to calculate the intensity normalization parameter corresponding to the second pixel of the image to be diffused by using the second pixel of the image to be diffused and each pixel in the set of pixels to be diffused, it is also configured to extract the feature information of the second pixel point of the image to be diffused and the feature information of the third pixel point in the set of pixels to be diffused; calculate the sub-normalization parameter of the third pixel in the set of pixels to be diffused from the feature information of the second pixel in the image to be diffused, the feature information of the third pixel in the set of pixels to be diffused and the preset diffusion control parameter; repeat the above steps until the sub-normalization parameter of each pixel in the set of pixels to be diffused is obtained; and accumulate the sub-normalization parameters of each pixel in the set of pixels to be diffused to obtain the intensity normalization parameter corresponding to the second pixel of the image to be diffused.
  • when the diffusion module 12 is configured to determine the diffused pixel value of the second pixel point of the image to be diffused according to the diffusion intensity of the second pixel of the image to be diffused, the pixel value of the second pixel of the image to be diffused and the pixel value of each pixel in the set of pixels to be diffused, it is also configured to multiply each sub-diffusion intensity in the diffusion intensity by the pixel value of the second pixel of the image to be diffused and accumulate the products to obtain the first diffusion part of the second pixel of the image to be diffused; multiply each sub-diffusion intensity in the diffusion intensity by the pixel value of the corresponding pixel in the set of pixels to be diffused and accumulate the products to obtain the second diffusion part of the second pixel of the image to be diffused; and determine the diffused pixel value of the second pixel of the image to be diffused from the pixel value of the second pixel of the image to be diffused, the first diffusion part and the second diffusion part.
  • the diffusion module 12 is further configured to use the completed depth map as the image to be diffused, and to repeatedly execute the step of determining the diffusion intensity of each pixel in the image to be diffused based on the image to be diffused and the feature map, the step of determining the diffused pixel value of each pixel in the image to be diffused based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused, and the step of determining the completed depth map according to the diffused pixel value of each pixel in the image to be diffused, until the preset number of repetitions is reached.
  • the diffusion module 12 is further configured to use the completed depth map as the preliminary completed depth map, and to repeatedly execute the step of calculating the first plane origin distance map based on the preliminary completed depth map, the parameter matrix of the camera and the normal prediction map and using the first plane origin distance map as the image to be diffused, the step of determining the diffusion intensity of each pixel in the image to be diffused based on the image to be diffused and the feature map, and the step of determining the diffused pixel value of each pixel in the image to be diffused based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused, until the preset number of repetitions is reached.
  • when the diffusion module 12 is configured to execute, each time, the step of calculating the first plane origin distance map based on the preliminary completed depth map, the parameter matrix of the camera and the normal prediction map and using the first plane origin distance map as the image to be diffused, it is also configured to execute the step of calculating the first plane origin distance map based on the preliminary completed depth map, the parameter matrix of the camera and the normal prediction map; the step of determining the first confidence map based on the depth map and the two-dimensional image; the step of calculating the second plane origin distance map based on the depth map, the parameter matrix and the normal prediction map; and the step of optimizing the pixels in the first plane origin distance map according to the pixels in the first confidence map, the pixels in the second plane origin distance map and the pixels in the first plane origin distance map, to obtain an optimized first plane origin distance map, and using the optimized first plane origin distance map as the image to be diffused.
  • the functions or modules contained in the device provided in the embodiments of the present disclosure can be used to execute the methods described in the above method embodiments.
  • FIG. 17 is a schematic diagram of the composition structure of a depth image complement device proposed by an embodiment of the present disclosure.
  • the depth image completion device proposed by the present disclosure may include a processor 01 and a memory 02 storing instructions executable by the processor 01.
  • the processor 01 is configured to execute an executable depth image completion instruction stored in the memory, so as to implement a depth image completion method provided in an embodiment of the present disclosure.
  • the aforementioned processor 01 may be at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a CPU, a controller, a microcontroller and a microprocessor. It is understandable that, for different devices, the electronic device used to implement the above-mentioned processor function may also be other, which is not limited in the embodiments of the present disclosure.
  • the terminal also includes a memory 02, which may be connected to the processor 01, where the memory 02 may include a high-speed RAM memory, or may also include a non-volatile memory, for example, at least two disk memories.
  • the aforementioned memory 02 may be a volatile memory, such as a Random-Access Memory (RAM); or a non-volatile memory, such as a Read-Only Memory (ROM), a flash memory, a Hard Disk Drive (HDD) or a Solid-State Drive (SSD); or a combination of the above types of memory, and it provides instructions and data to the processor 01.
  • the functional modules in this embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be realized in the form of hardware or software function module.
  • the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer readable storage medium.
  • the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which can be, for example, a personal computer) to perform the method of the embodiments of the present disclosure.
  • the aforementioned storage media include: U disk, mobile hard disk, read-only memory, random access memory, magnetic disk or optical disk and other media that can store program codes.
  • the depth image completion device in the embodiment of the present disclosure may be a device with computing functions, such as a desktop computer, a notebook computer, a microcomputer, a vehicle-mounted computer, etc.
  • the specific device implementation form can be determined according to actual needs.
  • the embodiments of the present disclosure are not limited herein.
  • An embodiment of the present disclosure provides a computer-readable storage medium on which an executable depth image completion instruction is stored, which is applied to a terminal.
  • when the instruction is executed by a processor, the depth image completion method provided by the embodiments of the present disclosure is implemented.
  • the embodiments of the present disclosure can be provided as a method, a system, or a computer program product. Therefore, the present disclosure may adopt the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, optical storage, etc.) containing computer-usable program codes.
  • These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing equipment to generate a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment generate a device for implementing the functions specified in one or more processes in the schematic diagram and/or one or more blocks in the block diagram.
  • These computer program instructions can also be stored in a computer-readable memory that can guide a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including the instruction device.
  • the device realizes the functions specified in one or more processes in the schematic diagram and/or one block or more in the block diagram.
  • These computer program instructions can also be loaded on a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, so as to execute on the computer or other programmable equipment.
  • the instructions provide steps for implementing functions specified in one or more processes in the schematic diagram and/or one block or more in the block diagram.
  • the depth image completion device can obtain the to-be-diffused image based on the acquired depth map and the two-dimensional image.
  • the image to be diffused retains all the point cloud data in the acquired depth map, so that when the pixel value of each pixel in the image to be diffused and its corresponding diffusion intensity are used to determine the diffused pixel values, all the point cloud data collected in the depth map are used; the point cloud data in the collected depth map are thus fully utilized, which makes the depth information of each 3D point in the three-dimensional scene more accurate and improves the accuracy of the completed depth map.


Abstract

A depth image completion method and apparatus, and a computer-readable storage medium, comprising: collecting a depth map of a target scene through a configured radar, and collecting a two-dimensional image of the target scene through a configured camera (S101); determining an image to be diffused and a feature map according to the collected depth map and two-dimensional image (S102); determining the diffusion intensity of each pixel in the image to be diffused based on the image to be diffused and the feature map, the diffusion intensity representing the intensity with which the pixel value of each pixel in the image to be diffused diffuses to adjacent pixels (S103); and determining a completed depth map based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused (S104).

Description

A depth image completion method and apparatus, and a computer-readable storage medium

Cross-reference to related applications

This application is based on, and claims priority to, the Chinese patent application with application number 201910817815.1 filed on August 30, 2019, the entire contents of which are incorporated herein by reference.

Technical field

The present disclosure relates to image processing technology, and in particular to a depth image completion method and apparatus, and a computer-readable storage medium.

Background

At present, common depth image acquisition methods obtain depth images of a three-dimensional scene using a Light Detection And Ranging (LiDAR) sensor, a binocular camera, a Time of Flight (TOF) sensor, and the like. The effective range of binocular cameras and TOF sensors is generally within 10 m, so they are usually applied to terminals such as smartphones, whereas the effective range of LiDAR is longer, reaching tens or even hundreds of meters, so it can be applied to fields such as autonomous driving and robotics.

When LiDAR is used to acquire a depth image, laser beams are emitted towards the three-dimensional scene, the laser beams reflected from the surfaces of the objects in the scene are received, and the time difference between the emission moment and the reflection moment is calculated to obtain the depth image of the scene. In practice, however, 32/64-line LiDAR is mainly used, so only sparse depth images can be acquired. Depth image completion refers to the process of recovering a depth map into a dense depth map. In the related art, depth image completion feeds the depth map directly into a neural network to obtain a dense depth map, but this approach does not make full use of the sparse point cloud data, so the accuracy of the obtained dense depth map is low.

Summary

The present disclosure provides a depth image completion method and apparatus, and a computer-readable storage medium, which can make full use of sparse point cloud data and improve the accuracy of the completed depth map.

The technical solution of the present disclosure is implemented as follows:

In a first aspect, an embodiment of the present disclosure provides a depth image completion method, comprising:

collecting a depth map of a target scene through a configured radar, and collecting a two-dimensional image of the target scene through a configured camera;

determining an image to be diffused and a feature map according to the collected depth map and the two-dimensional image;

determining the diffusion intensity of each pixel in the image to be diffused based on the image to be diffused and the feature map, the diffusion intensity representing the intensity with which the pixel value of each pixel in the image to be diffused diffuses to adjacent pixels;

determining a completed depth map based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused.

In a second aspect, an embodiment of the present disclosure provides a depth image completion apparatus, comprising:

an acquisition module configured to collect a depth map of a target scene through a configured radar, and to collect a two-dimensional image of the target scene through a configured camera;

a processing module configured to determine an image to be diffused and a feature map according to the collected depth map and the two-dimensional image, and to determine the diffusion intensity of each pixel in the image to be diffused based on the image to be diffused and the feature map, the diffusion intensity representing the intensity with which the pixel value of each pixel in the image to be diffused diffuses to adjacent pixels;

a diffusion module configured to determine a completed depth map based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused.

In a third aspect, an embodiment of the present disclosure further provides a depth image completion apparatus, comprising a memory and a processor;

the memory is configured to store executable depth image completion instructions;

the processor is configured to execute the executable depth image completion instructions stored in the memory to implement the method described in any one of the first aspect.

In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing executable depth image completion instructions which, when executed by a processor, implement the method described in any one of the first aspect.

The embodiments of the present disclosure provide a depth image completion method and apparatus, and a computer-readable storage medium: a depth map of a target scene is collected through a configured radar, and a two-dimensional image of the target scene is collected through a configured camera; an image to be diffused and a feature map are determined according to the collected depth map and two-dimensional image; the diffusion intensity of each pixel in the image to be diffused is determined based on the image to be diffused and the feature map, the diffusion intensity representing the intensity with which the pixel value of each pixel in the image to be diffused diffuses to adjacent pixels; and the completed depth map is determined based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused. With this implementation, the image to be diffused is obtained from the collected depth map and two-dimensional image and retains all the point cloud data in the collected depth map, so that all the point cloud data of the collected depth map are used when the diffused pixel value of each pixel in the image to be diffused is determined from its pixel value and its corresponding diffusion intensity; the point cloud data of the collected depth map are thus fully utilized, which makes the depth information of each 3D point in the three-dimensional scene more accurate and improves the accuracy of the completed depth map.
Brief description of the drawings

FIG. 1 is a first flowchart of a depth image completion method provided by an embodiment of the present disclosure;

FIG. 2 is a second flowchart of a depth image completion method provided by an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of calculating a first plane origin distance map provided by an embodiment of the present disclosure;

FIG. 4(a) is a schematic diagram of noise in a collected depth map provided by an embodiment of the present disclosure;

FIG. 4(b) is a schematic diagram of a first confidence map provided by an embodiment of the present disclosure;

FIG. 5 is a third flowchart of a depth image completion method provided by an embodiment of the present disclosure;

FIG. 6 is a first process schematic diagram of a depth image completion method provided by an embodiment of the present disclosure;

FIG. 7 is a second process schematic diagram of a depth image completion method provided by an embodiment of the present disclosure;

FIG. 8 is a third process schematic diagram of a depth image completion method provided by an embodiment of the present disclosure;

FIG. 9 is a fourth flowchart of a depth image completion method provided by an embodiment of the present disclosure;

FIG. 10 is a fifth flowchart of a depth image completion method provided by an embodiment of the present disclosure;

FIG. 11 is a schematic diagram of the diffused pixel value of the second pixel point of the image to be diffused provided by an embodiment of the present disclosure;

FIG. 12(a) is a first schematic diagram of the influence of the value of the preset number of repetitions on the error of the completed depth map provided by an embodiment of the present disclosure;

FIG. 12(b) is a second schematic diagram of the influence of the value of the preset number of repetitions on the error of the completed depth map provided by an embodiment of the present disclosure;

FIG. 13(a) is a schematic diagram of the influence of the preset error tolerance parameter on the truth map of the first confidence map provided by an embodiment of the present disclosure;

FIG. 13(b) is a schematic diagram of the influence of the preset error tolerance parameter on the distribution of the confidence truth value-absolute error curve provided by an embodiment of the present disclosure;

FIG. 14(a) is a first schematic diagram of the influence of the sampling rate of the preset prediction model on the completed depth map provided by an embodiment of the present disclosure;

FIG. 14(b) is a second schematic diagram of the influence of the sampling rate of the preset prediction model on the completed depth map provided by an embodiment of the present disclosure;

FIG. 15(a) is a schematic diagram of a collected depth map and a two-dimensional image of a three-dimensional scene provided by an embodiment of the present disclosure;

FIG. 15(b) is a completed depth map obtained by using a convolutional spatial propagation network provided by an embodiment of the present disclosure;

FIG. 15(c) is a completed depth map obtained by using an NConv-convolutional neural network provided by an embodiment of the present disclosure;

FIG. 15(d) is a completed depth map obtained by using the sparse-to-dense method in the related art;

FIG. 15(e) is a normal prediction map provided by an embodiment of the present disclosure;

FIG. 15(f) is a first confidence map provided by an embodiment of the present disclosure;

FIG. 15(g) is a completed depth map provided by an embodiment of the present disclosure;

FIG. 16 is a schematic structural diagram of a depth image completion apparatus provided by an embodiment of the present disclosure;

FIG. 17 is a schematic diagram of the composition structure of a depth image completion apparatus provided by an embodiment of the present disclosure.
Detailed description

The technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present disclosure.

With the development of image processing technology, more and more devices can obtain depth images and process them further to realize various functions. Common depth image acquisition methods obtain depth images of a three-dimensional scene using a Light Detection And Ranging (LiDAR) sensor, millimeter-wave radar, a binocular camera, a Time of Flight (TOF) sensor, and the like. However, the effective range within which a binocular camera or a TOF sensor can acquire depth images is generally within 10 m, so they are usually applied to terminals such as smartphones to obtain depth images of targets such as faces; the effective range of LiDAR is longer, reaching tens or even hundreds of meters, so it can be applied to fields such as autonomous driving and robotics.

When LiDAR is used to acquire a depth image, laser beams are actively emitted towards the three-dimensional scene, the laser beams reflected from the surfaces of the objects in the scene are received, and the depth image is obtained from the time difference between the emission time of the emitted laser beam and the reception time of the reflected laser beam. Since LiDAR acquires depth from the time difference of laser beams, the depth image obtained by LiDAR consists of sparse point cloud data; moreover, 32/64-line LiDAR is mainly used in practical applications, so only a sparse depth map can be obtained, which must be converted into a dense depth map by depth completion. In the related art, a depth image completion method performs supervised training of a neural network model on training data composed of a large number of sparse depth maps and two-dimensional images of three-dimensional scenes to obtain a trained neural network model, and then directly inputs the sparse depth map and the two-dimensional image of the three-dimensional scene into the trained neural network model to complete the depth completion process and obtain a relatively dense depth map. However, this approach does not make full use of the point cloud data in the depth map, and the accuracy of the obtained depth completion is low.

Based on the problems of the above depth completion methods, the basic idea of the embodiments of the present disclosure is to first obtain an image to be diffused from the collected sparse depth map and the two-dimensional image of the three-dimensional scene, and then perform pixel-level diffusion on the image to be diffused to obtain a completed depth map, so that every sparse point cloud datum in the sparse depth map is fully utilized and a depth completion map with higher accuracy is obtained.

Based on the above idea of the embodiments of the present disclosure, an embodiment of the present disclosure provides a depth image completion method. Referring to FIG. 1, the method may include:

S101: Collect a depth map of a target scene through a configured radar, and collect a two-dimensional image of the target scene through a configured camera.

The embodiments of the present disclosure are implemented in the scenario of performing depth image completion on a collected sparse depth map. The depth map of the target scene is first collected through the radar configured on the device itself, and at the same time the two-dimensional image of the target scene is collected through the camera configured on the device.

It should be noted that when the depth map is collected through the configured radar, the depth information of the 3D points of the three-dimensional scene corresponding to the laser beams may be calculated from the time difference between the emission time and the reception time of the laser beams, and the calculated depth information is used as pixel values to obtain the depth map. Of course, the depth information of the 3D points corresponding to the laser beams may also be calculated from other characteristics of the laser beams, for example phase information, which is not limited in the embodiments of the present disclosure.

It should be noted that, in the embodiments of the present disclosure, the depth map collected by the radar is a sparse depth map.

In the embodiments of the present disclosure, the configured radar may be a 32/64-line LiDAR sensor, a millimeter-wave radar, or another type of radar, which is not limited in the embodiments of the present disclosure.

In the embodiments of the present disclosure, when the two-dimensional image is collected through the configured camera, the pixel value information of each 3D point in the three-dimensional scene may be obtained through the optics of a color camera to obtain the two-dimensional image; the two-dimensional image of the target scene may also be obtained in other ways, which is not limited in the embodiments of the present disclosure.

In some embodiments of the present disclosure, the configured camera may be a color camera, which obtains a color two-dimensional image of the three-dimensional scene, or an infrared camera, which obtains an infrared grayscale image of the three-dimensional scene; of course, the configured camera may also be another type of camera, which is not limited in the embodiments of the present invention.

It should be noted that, in the embodiments of the present disclosure, the resolutions of the collected depth map and the two-dimensional image may be the same or different. When the resolutions are different, either of them may be scaled so that the resolutions of the collected depth map and the two-dimensional image are consistent.

In the embodiments of the present disclosure, the radar and the camera may be configured and arranged according to actual requirements, which is not limited in the embodiments of the present disclosure.

S102: Obtain an image to be diffused and a feature map according to the collected depth map and the two-dimensional image.

S103: Determine the diffusion intensity of each pixel in the image to be diffused based on the image to be diffused and the feature map; the diffusion intensity represents the intensity with which the pixel value of each pixel in the image to be diffused diffuses to adjacent pixels, that is, it determines how much of the pixel value of each pixel in the image to be diffused needs to be diffused to the adjacent pixels.

It should be noted that when the diffusion intensity of each pixel in the image to be diffused is determined based on the image to be diffused and the feature map, some adjacent pixels must first be determined for each pixel in the image to be diffused, and the degree of similarity between each pixel and its corresponding adjacent pixels is then compared one by one according to the feature map, so as to determine the diffusion intensity.

S104: Determine the completed depth map based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused.

In the embodiments of the present disclosure, since the image to be diffused is determined from the depth map and the two-dimensional image, it retains all the point cloud data in the collected depth map, so that all the point cloud data of the collected depth map are used when the diffused pixel value of each pixel in the image to be diffused is determined from its pixel value and its corresponding diffusion intensity, which makes the depth information corresponding to each 3D point of the three-dimensional scene more accurate and improves the accuracy of the completed depth map.

In some embodiments of the present disclosure, determining the completed depth map based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused, that is, the implementation process of S104, may include S1041-S1042, as follows:

S1041: Determine the diffused pixel value of each pixel in the image to be diffused based on the pixel value of each pixel in the image to be diffused and the diffusion intensity of each pixel in the image to be diffused.

S1042: Determine the completed depth map according to the diffused pixel value of each pixel in the image to be diffused.

It should be noted that the completed depth map in the embodiments of the present disclosure refers to a completed, relatively dense depth map. It has relatively comprehensive depth information of the three-dimensional scene and can be directly applied to various scenarios that require depth maps.

In the embodiments of the present disclosure, the diffused pixel value of each pixel in the image to be diffused is calculated from the pixel value of each pixel in the image to be diffused and its corresponding diffusion intensity, and the completed depth map is determined from the diffused pixel values; all the point cloud data of the collected depth map are used, which makes the depth information corresponding to each 3D point of the three-dimensional scene more accurate and improves the accuracy of the completed depth map.

Based on the same inventive concept as the above embodiments, in some embodiments of the present disclosure the image to be diffused is the preliminary completed depth map; determining the completed depth map according to the diffused pixel value of each pixel in the image to be diffused, that is, the implementation process of S1042, may include S1042a-S1042b, as follows:

S1042a: Use the diffused pixel value of each pixel in the image to be diffused as the pixel value of each pixel of the diffused image.

S1042b: Use the diffused image as the completed depth map.

It should be noted that the preliminary completed depth map obtained for the first time is an image obtained from the collected depth map and two-dimensional image, that is, an image obtained by performing operations such as plane division and depth information filling on the collected depth map and two-dimensional image, obtaining the depth information of each 3D point in the three-dimensional scene, and using the obtained depth information of each 3D point as pixel values; in other words, the preliminary completed depth map obtained for the first time is obtained by processing the collected depth map and two-dimensional image with related techniques. The density of the point cloud data in the preliminary completed depth map is greater than that of the point cloud data in the collected depth map.

In the embodiments of the present disclosure, the diffused pixel value of each pixel in the image to be diffused can be used as the pixel value of each pixel of the diffused image, and the diffused image is used as the completed depth map; in this way, all the point cloud data in the collected depth map are used, so that the point cloud data in the depth map are fully utilized and a completed depth map with a better effect is obtained.
在本公开的一些实施例中,待扩散图为第一平面原点距离图,此时,如图2所示,根据采集到的深度图和二维图像,确定待扩散图以及特征图,即S102的实现过程可以包括S1021-S1023,如下:
S1021、获取摄像机的参数矩阵。
需要说明的是,所获取的参数矩阵,是摄像机所固有的参数矩阵,该参数矩阵可以是指摄像机的内参数矩阵,其中可以包括摄像机的射影变换参数和焦距。当然,参数矩阵中也可以包含其他计算第一平面原点距离图所需要的参数,本公开实施例在此不作限定。
S1022、根据采集到的深度图和二维图像确定初步补全的深度图、特征图和法向预测图;法向预测图是指将三维场景各点的法向量作为像素值的图像。
本公开实施例中,法向预测图是指由三维场景中的每个3D点的表面法向量作为像素值所得到的图像。3D点的表面法向量被定义为从该3D点开始,并垂直于该3D点的切平面的向量。
需要说明的是,首次得到的初步补全的深度图,是指利用采集到的深度图和二维图像,所确定出的三维场景中每个3D点的初步深度信息作为像素值的图像。
S1023、根据初步补全的深度图、摄像机的参数矩阵与法向预测图,计算出第一平面原点距离图;第一平面原点距离图是利用初步补全的深度图计算出的摄像机至三维场景各点所在平面的距离作为像素值的图像。
在获得初步补全的深度图、参数矩阵和法向预测图之后,就能够根据初步补全的深度图中每个像素的像素值、参数矩阵与法向预测图中每个像素的像素值,针对每个3D点都计算出第一平面原点距离,然后将每个3D点的第一平面原点距离作为像素值,得到第一平面原点距离图,以便于后续基于第一平面原点距离图和特征图,为第一平面原点距离图中的各个像素计算扩散后的像素值,从而得到补全后的深度图。
在本公开实施例中,第一平面原点距离,是指用初步补全的深度图所计算出的摄像机的中心到三维场景中的每个3D点所在的切平面的距离。
由于第一平面原点距离图是用每个3D点的第一平面原点距离,即摄像机的中心到3D点所在的切平面的距离作为像素值所得到的图像,因而,处于同一个切平面上的3D点应当是具有相同或相近的第一平面原点距离,若存在某个3D点的第一平面原点距离,与其他和该3D点处于同一切平面的3D点的第一平面原点距离相差较大时,表明该3D点的第一平面原点距离是需要修正的异常值,即处于同一切平面的3D点具备几何约束。基于该几何约束的思想,在基于第一平面原点距离图和特征图,对第一平面原点距离图中各个像素计算扩散后的像素值时,可以修正第一平面原点距离图中的异常值,获得准确率较高的第一平面原点距离图,进而可以根据准确率较高的第一平面原点距离图,得到效果较好的补全后的深度图。
本公开实施例中,需要先对三维场景中每个3D点的第一平面原点距离进行计算,然后再将每个3D点的第一平面原点距离作为像素值,得到第一平面原点距离图。在计算每个3D点的第一平面原点距离时,需要先确定出每个3D点在图像平面上的2D投影,并对摄像机的参数矩阵进行求逆,得到参数矩阵的逆矩阵,然后从初步补全的深度图中获得每个3D点所对应的初步深度信息,从法向预测图中获得每个3D点所在切平面的法向量,最后将每个3D点所对应的初步深度信息、每个3D点所在的切平面法向量、参数矩阵的逆矩阵以及3D点在平面图像上的2D投影进行相乘,得到每个3D点的第一平面原点距离。
示例性的,本公开实施例中,给出了一种计算3D点的第一平面原点距离的公式,如式(1)所示:
P(x) = D(x)N(x)C^{-1}x    (1)
其中,P(x)表示3D点的第一平面原点距离,x表示3D点在图像平面上的2D投影,D(x)表示3D点所对应的初步深度信息,N(x)表示3D点X所在的切平面的法向量,C表示参数矩阵。如此,在得到3D点在图像平面上的2D投影的坐标值,3D点所对应的初步深度信息的数值以及3D点所在切平面的法向量之后,可以将上述内容代入(1)中,计算出3D点的第一平面原点距离,之后,再将每个3D点的第一平面原点距离作为像素值,得到第一平面原点距离图。
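需要说明的是，以下给出一个按式(1)逐像素计算第一平面原点距离图的示意性参考实现（其中的函数名与数组形状约定均为假设，仅用于帮助理解计算过程）：

```python
import numpy as np

def plane_origin_distance(depth: np.ndarray, normals: np.ndarray, C: np.ndarray) -> np.ndarray:
    """按式(1)计算平面原点距离图 P(x) = D(x) * N(x)^T * C^{-1} * x。

    depth:   (H, W)    每个像素对应的深度信息
    normals: (H, W, 3) 每个像素所在切平面的单位法向量
    C:       (3, 3)    摄像机的参数矩阵（内参数矩阵）
    """
    H, W = depth.shape
    C_inv = np.linalg.inv(C)
    # 构造每个像素的齐次2D投影坐标 x = (u, v, 1)
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    x = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)  # (H, W, 3)
    # C^{-1} x：每个像素反投影得到的射线方向
    ray = x @ C_inv.T                                                  # (H, W, 3)
    # P(x) = D(x) * <N(x), C^{-1} x>
    return depth * np.sum(normals * ray, axis=-1)
```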
需要说明的是,可以通过几何关系,推导出3D点的第一平面原点距离的计算公式。由几何关系可知,摄像机的中心到3D点所在切平面的距离,可以由3D点所在的平面上的任意一点,和3D点所在平面的法向量所确定,而3D点三维坐标可以由3D点在图像平面上的2D投影、3D点的初步深度信息以及参数矩阵求得,因而,摄像机的中心到3D点所在的切平面的距离,可以由3D点的初步深度信息、3D点所在平面的法向量、参数矩阵和2D投影求得。由于对于初步补全的深度图而言,每个像素点的位置信息即为3D点的2D投影,每个像素点的像素值即为3D点所对应的深度信息,同理,对于法向预测图而言,每个像素点的位置信息即为3D点的2D投影,每个像素点的像素值即为3D点的法向量信息,因此,能够从初步补全的深度图、法向预测图和参数矩阵中获得计算所有3D点的第一平面原点距离。
示例性的,本公开实施例中,给出了利用几何关系对3D点的第一平面原点距离的计算公式进行推导的过程,即对式(1)进行推导的过程:
根据几何关系可知,三维场景中的3D点与3D点所在切平面的距离的关系可以式(2)所示:
N(x)·X-P(x)=0       (2)
其中，X表示三维场景中的3D点，x表示3D点在图像平面上的2D投影，N(x)表示从3D点X开始并垂直于3D点X所在的切平面的法向量，P(x)表示摄像机的中心到3D点X所在的切平面的距离，即3D点的第一平面原点距离。
对式(2)进行变换可得到式(3):
P(x)=N(x)·X       (3)
对于三维场景中的3D点,可以用式(4)来表示:
X = D(x)·C^{-1}x      (4)
其中,X表示三维场景中的3D点,x表示3D点在图像平面上的2D投影,D(x)表示3D点所对应的初步深度信息,C表示参数矩阵。
将式(4)代入式(3)之中,即可得到式(1)。
示例性的,本公开实施例提供了一种计算第一平面原点距离图的示意图,如图3所示,O为摄 像机的中心,X为三维场景中的一个3D点,x为3D点在图像平面上的2D投影,F为3D点的切平面,N(x)为3D点所在切平面的法向量,D(x)为3D点所对应的初步深度信息。在得到初步补全的深度图之后,就可以从初步补全的深度图中获知3D点的2D投影x、该3D点对应的初步深度信息,然后从法向预测图中获知3D点所在切平面的法向量,由于参数矩阵C是已知的,这时,就能够将3D点的2D投影x、3D点对应的初步深度信息D(x)、法向量N(x)以及参数矩阵C,代入到式(1)中,如此,就能计算出3D点的第一平面原点距离。在利用式(1)得到三维场景中每个3D点的第一平面原点距离之后,可以将每个3D点的第一平面原点距离作为像素值,得到第一平面原点距离图。
本公开实施例中,可以利用采集到的深度图和二维图像,得到初步补全的深度图、特征图和法向预测图,并根据初步补全的深度图、法向预测图以及存储于自身的参数矩阵,计算出第一平面原点距离图,并为第一平面原点距离图中的各个像素计算扩散后的像素值,使得可以利用几何约束清除第一平面原点距离图中存在的异常值,提高第一平面原点距离图的准确率,进而便于后续根据准确率较高的第一平面原点距离图得到效果较好的补全后的深度图。
在本公开的一些实施例中,在根据初步补全的深度图、摄像机的参数矩阵与法向预测图,计算出第一平面原点距离图,即S1023之后,该方法还包括:S1024-S1026,如下
S1024、根据采集到的深度图和二维图像确定第一置信度图;其中,第一置信度图是指采用深度图中各个像素对应的置信度作为像素值的图像。
本公开实施例中,第一置信度图是指用三维场景中每个3D点的初步深度信息的置信度,作为像素值所得到的图像。
S1025、根据采集到的深度图、参数矩阵与法向预测图,计算出第二平面原点距离图;所述第二平面原点距离图是利用采集到的深度图计算出的摄像机至三维场景各点所在平面的距离作为像素值的图像。
在本公开实施例中,第二平面原点距离是指用深度图所计算出的摄像机的中心到三维场景中的3D点所在的切平面的距离。
需要说明的是,根据深度图、参数矩阵与法向预测结果,计算第二平面距离原点图时,需要先计算三维场景中每个3D点的第二平面原点距离。在对每个3D点的第二平面原点距离进行计算时,需要先对每个3D点的在图像上的2D投影进行确定,以及对参数矩阵进行求逆运算,得到参数矩阵的逆矩阵,接着从采集到的深度图中,获取每个3D点所对应的深度信息,以及从法向预测图中获得每个3D点所在切平面的法向量,然后将每个3D点所对应的深度信息、每个3D点所在的切平面的法向量、参数矩阵的逆矩阵以及3D点在平面图像上的2D投影进行相乘,得到每个3D点的第二平面原点距离。
示例性的,本公开实施例中,可以使用式(5)来对每个3D点的第二平面原点距离进行计算:
\bar{P}(x) = \bar{D}(x)N(x)C^{-1}x      (5)
其中，\bar{P}(x) 为3D点的第二平面原点距离，\bar{D}(x) 为3D点对应的深度信息，N(x)为3D点所在的切平面的法向量，x为3D点在图像平面上的2D投影，C为摄像机的参数矩阵。在获取到每个3D点的深度信息的值、每个3D点所在切平面的法向量、参数矩阵以及每个3D点在图像上的2D投影的坐标之后，就可以将上述内容代入式(5)中，计算出每个3D点的第二平面原点距离。之后，就可以将所有3D点的第二平面原点距离作为像素值，得到第二平面原点距离图。
S1026、根据第一置信度图中的像素、第二平面原点距离图中的像素以及第一平面原点距离图中的像素,对第一平面原点距离图中的像素进行优化,得到优化后的第一平面原点距离图。
需要说明的是,由于雷达在对于移动的目标或是物体的边缘进行深度信息采集时,会不可避免地产生噪声,这使得采集到的深度图中存在一些不可靠的深度信息。对此,可以引入第一置信度图,来对深度信息的可靠程度进行衡量。
本公开实施例中,第一置信度图,是指用每个3D点的深度信息的置信度,即深度图中各个像素所对应的置信度作为像素值所得到的图像。
在利用第一置信度图中的像素、第二平面原点距离图中的像素以及第一平面原点距离图中的像素,对第一平面原点距离图进行优化时,可以根据第一置信度图中某个像素的像素值,来对该像素所对应的3D点的深度信息的可信程度进行判断,当第一置信度图中该像素的像素值较高时,认为该像素点所对应的3D点的深度信息较为可靠,即更贴近于3D点的实际深度,进而该像素点所对应 的3D点的第二平面原点距离也会更为可靠。此时,若用该像素点所对应的3D点的第二平面原点距离,来对该像素点所对应的3D点的第一平面原点距离进行替换优化,可以使得优化后的第一平面原点距离图中具有一部分像素值更贴近于实际平面原点距离的像素点。如此,在基于优化后的第一平面原点距离图和特征图,实现像素扩散时,不仅能够清除第一平面原点距离图中存在的异常值,还能降低采集到的深度图中的异常值,降低对优化后的第一平面原点距离图造成影响,进一步提高优化后的第一平面原点距离图的准确度。
在本公开的一些实施例中,可以通过第一置信度图的像素值设置取值范围,来表示原始深度信息的可靠程度。示例性的,第一置信度图的像素值范围可以设置为[0,1],当第一置信度图的像素值接近于1时,表明该像素点所对应的3D点的原始深度信息可靠,当第一置信度图的像素值接近于0时,表明该像素点所对应的3D点的原始深度信息不可靠。当然,还可以根据实际情况对第一置信度图的像素值进行范围设置,本公开实施例在此不作限定。
示例性的,本公开实施例提供了一种采集到的深度图的噪声示意,如图4(a)所示,当雷达对区域1中处于运动状态的汽车进行深度信息采集时,会出现一些噪声,例如小方框中的点出现偏移等,使得所获得的深度信息与实际深度信息不符,即深度信息不可靠。此时,可以通过图4(b)区域1中各像素点的像素值,来对原始深度信息的可靠性进行判断。从图4(b)可以看出,区域1的整体颜色较深,表明区域1中存在大量的像素值接近于0的像素点,也即区域1中存在大量深度信息不可靠的像素点。在进行像素替换时,可以根据这些像素点的置信度情况,选择不替换,从而降低这些像素点对优化后的第一平面原点距离图造成影响。
本公开实施例中,能够根据第一置信度图,从第二平面原点距离图中挑选出具有可靠的第二平面原点距离的像素点,并在第一平面原点距离图中,与该像素点所对应的像素点的像素值进行替换,得到优化后的第一平面原点距离图,从而可以基于优化后的第一平面原点距离图得到补全后的深度图,如此,不仅能够清除第一平面原点距离图中的异常值,还能降低雷达所采集到的深度图中的异常值对优化后的第一平面原点距离图所带来的影响,提高优化后的第一平面原点距离图的准确度,进而提高补全后的深度图的准确度。
在本公开的一些实施例中,在根据第一置信度图中的像素、第二平面原点距离图中的像素以及第一平面原点距离图中的像素,对第一平面原点距离图中的像素进行优化,得到优化后的第一平面原点距离图,即S1026的实现过程,可以包括:S1026a-S1026e,如下:
S1026a、从第二平面原点距离图中,确定出与第一平面原点距离图的第一像素点所对应的像素点,作为替换像素点,并确定替换像素点的像素值;第一像素点为第一平面原点距离图中的任一像素点。
需要说明的是,在确定替换像素点时,是根据第一平面原点距离图的第一像素点的坐标信息,在第二平面原点距离图中寻找与之对应的像素点,同时获取到该像素点的像素值,作为替换像素点的像素值。
S1026b、从第一置信度图中,确定出替换像素点所对应的置信度信息。
在确定出替换像素点,以及替换像素点的像素值之后,还需要从第一置信度图中,根据替换像素点的坐标信息,确定出替换像素点所对应的像素点,并获取该像素点的像素值,即该像素点的置信度信息,如此,就可以确定出替换像素点所对应的置信度信息。
S1026c、根据替换像素点的像素值、置信度信息以及第一平面原点距离图的第一像素点的像素值,确定第一平面原点距离图的第一像素点的优化后的像素值。
需要说明的是,在计算第一平面原点距离图的第一像素点的优化后的像素值时,会先判断替换像素点的像素值是否大于0,并利用真值函数记录判断结果,即当替换像素点的像素值大于0时,真值函数的函数值为1,当替换像素点的像素值小于等于0时,真值函数的函数值为0,然后根据真值函数的函数值、替换像素点的像素值、置信度信息以及第一平面原点距离图的第一像素点的像素值,计算出第一像素点的优化后的像素值。
本公开实施例中,可以利用真值函数的函数值,与置信度信息、替换像素点的像素值相乘,得到第一子优化像素值,同时用真值函数的函数值与置信度信息相乘,并将用1与所得到的乘积做差,然后将差值与第一平面原点距离图的第一像素点的像素值相乘,得到第二子优化像素值,最后将第一子优化像素值与第二子优化像素值相加,得到第一像素点的优化后的像素值。需要说明的是,还可以根据其他形式对预设距离计算模型进行设置,本发明实施例在此不作限定。
示例性的,本公开实施例给出了根据真值函数的函数值、替换像素点的像素值、置信度信息以及第一平面原点距离图的第一像素点的像素值,计算第一像素点的优化后的像素值的公式,如式(6) 所示:
P'(x_i) = \mathbb{1}\left[\bar{P}(x_i)>0\right] M(x_i)\,\bar{P}(x_i) + \left(1 - \mathbb{1}\left[\bar{P}(x_i)>0\right] M(x_i)\right) P(x_i)      (6)
其中，\mathbb{1}\left[\bar{P}(x_i)>0\right] 为真值函数，M(x_i)为替换像素点的置信度信息，\bar{P}(x_i) 为替换像素点的像素值，P(x_i)为第一平面原点距离图的第一像素点的像素值，P'(x_i)为第一平面原点距离图的第一像素点的优化后的像素值。
S1026d、重复上述步骤,直至确定第一平面原点距离图的每个像素的优化后的像素值,得到优化后的第一平面原点距离图。
按照上述步骤中,对第一平面原点距离图的第一像素点的优化后的像素值的计算方法,对第一平面原点距离图中的每个像素都计算出优化后的像素值,并利用这些优化后的像素值,组成优化后的第一平面原点距离图。
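需要说明的是，以下给出一个按式(6)对整幅第一平面原点距离图进行优化的示意性参考实现（函数名与输入约定均为假设）：

```python
import numpy as np

def refine_plane_distance(P: np.ndarray, P_bar: np.ndarray, M: np.ndarray) -> np.ndarray:
    """按式(6)用置信度加权的替换像素值优化第一平面原点距离图。

    P:     第一平面原点距离图（由初步补全的深度图计算得到）
    P_bar: 第二平面原点距离图（由采集到的稀疏深度图计算得到）
    M:     第一置信度图，像素值范围假设为 [0, 1]
    """
    valid = (P_bar > 0).astype(P.dtype)   # 真值函数：替换像素点的像素值大于0时取1，否则取0
    w = valid * M                         # 置信度加权
    return w * P_bar + (1.0 - w) * P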
本公开实施例中,能够逐个对第一平面原点距离图中的每个像素计算出优化后的像素值,从而得到优化后的第一平面原点距离图,使得后续能够基于优化后的第一平面原点距离图和特征图,确定优化后的第一平面原点距离图的各个像素的扩散强度,并根据扩散强度以及优化后的第一平面原点距离图的像素值,得到效果较好的补全后的深度图。
在本公开的一些实施例中,参见图5,基于待扩散图和特征图,确定待扩散图中各个像素的扩散强度,即S103的实现过程,可以包括:S1031-S1032,如下:
S1031、根据预设扩散范围,从待扩散图中确定出待扩散图的第二像素点对应的待扩散像素集合,并确定出待扩散像素集合中每个像素的像素值;第二像素点为待扩散图中的任一像素点。
需要说明的是,待扩散像素集合,指的是位于待扩散图的第二像素点的邻域之中的像素。在根据预设扩散范围,先确定出待扩散图的第二像素点的邻域范围,然后将位于该邻域范围内的所有像素提取出来,组成待扩散图的第二像素点对应的待扩散像素集合。
在本公开的一些实施例中,预设扩散范围可以根据实际需求进行设定,本公开实施例在此不作限定。示例性的,可以将预设扩散范围设置为4邻域,取出4个像素点组成待扩散像素集合,也可以将预设扩散范围设置为8邻域,取出位于待扩散图的第二像素点周围的8个像素组成待扩散像素集合。
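需要说明的是，以下给出一个按预设扩散范围为每个像素收集待扩散像素集合的示意性参考实现（其中边界处采用边缘复制填充，该处理方式为示例中的假设）：

```python
import numpy as np

def gather_neighbors(img: np.ndarray, radius: int = 1) -> np.ndarray:
    """为图中每个像素收集其 (2*radius+1)^2 - 1 个邻域像素（radius=1 即8邻域）。

    返回形状为 (H, W, K) 的数组，K 为邻域像素个数。
    """
    H, W = img.shape
    padded = np.pad(img, radius, mode="edge")
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)
               if not (dy == 0 and dx == 0)]
    neighbors = [padded[radius + dy: radius + dy + H, radius + dx: radius + dx + W]
                 for dy, dx in offsets]
    return np.stack(neighbors, axis=-1)
```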
S1032、利用特征图、待扩散图的第二像素点以及待扩散像素集合中的每个像素,计算出待扩散图的第二像素点对应的扩散强度。
从特征图中,获取待扩散图的第二像素点所对应的特征信息以及待扩散像素集合中的每个像素所对应的特征信息,并根据这些特征信息对待扩散图的第二像素点对应的扩散强度进行计算。
需要说明的是,由于待扩散像素集合是由多个像素组成的,因而,在计算待扩散图的第二像素点对应的扩散强度时,是将待扩散图的第二像素点与待扩散像素集合中的每个像素组成像素对,分别计算出这些像素对的子扩散强度,然后再将这些子扩散强度,共同作为待扩散图的第二像素点对应的扩散强度。
在得到待扩散图的第二像素点对应的扩散强度之后，基于待扩散图中的各个像素的像素值以及待扩散图中的各个像素的扩散强度，确定待扩散图中的各个像素的扩散后的像素值，可以包括：S1033-S1034，如下：
S1033、根据待扩散图的第二像素点的扩散强度、待扩散图的第二像素点的像素值以及待扩散像素集合中每个像素的像素值,确定待扩散图的第二像素点的扩散后的像素值。
当得到待扩散图的第二像素点对应的扩散强度之后，基于待扩散图中的各个像素的像素值以及待扩散图中的各个像素的扩散强度，确定待扩散图中各个像素的扩散后的像素值，就会变为根据待扩散图的第二像素点的扩散强度、待扩散图的第二像素点的像素值以及待扩散像素集合中每个像素的像素值，确定待扩散图的第二像素点的扩散后的像素值。
S1034、重复上述步骤,直至确定待扩散图中每个像素扩散后的像素值。
示例性的,本发明实施例给出了一种深度图像补全方法的过程示意图,如图6所示,在该示例中,以初步补全的深度图作为待扩散图。通过雷达采集到的深度图
\bar{D}，
同时通过摄像机采集到三维场景的二维图像I,将
\bar{D}
与I输入进预设预测模型1中，得到初步补全的深度图D和特征图G，然后基于初步补全的深度图D和特征图G，确定初步补全的深度图D中各个像素的扩散强度2，并基于初步补全的深度图D中各个像素的像素值，以及扩散强度2得到初步补全的深度图D中各个像素的扩散后的像素值，从而得到补全后的深度图D_r。
可以理解的是,当在将第一平面原点距离图作为待扩散图,计算得到第一平面原点距离图的扩 散后的像素值之后,会得到一个扩散后的第一平面原点距离图,但是,扩散后的第一平面原点距离图并不是补全后的深度图,还需要对扩散后的第一平面原点距离图进行反变换,得到补全后的深度图。
在本公开实施例中,由于第一平面原点距离图是根据初步补全的深度图、法向预测图和参数矩阵所计算出的,因而,可以根据扩散后的第一平面原点距离图、法向预测图和参数矩阵反向计算出一个深度图,并将计算得到深度图,作为补全后的深度图。
本公开实施例中,可以先从法向预测图中获取每个3D点所在切平面的法向量、每个3D点在图像平面上的2D投影,从扩散后的第一平面原点距离图中获取每个3D点的扩散后的第一平面原点距离,同时对参数矩阵求逆,得到参数矩阵的逆矩阵,然后将每个3D点所在切平面的法向量、每个3D点在图像平面上的2D投影以及参数矩阵的逆矩阵进行相乘,得到乘积结果,并用经过扩散后的第一平面原点距离与所得到的乘积结果相比,并将所得到的比值作为每个3D点对应的深度补全信息。之后,就可以将每个3D点对应的深度补全信息作为像素值,得到补全后的深度图。
示例性的,本公开实施例提供一种对每个3D点对应的深度补全信息进行计算的过程,如式(7)所示:
D'(x) = \frac{P_1(x)}{N(x)C^{-1}x}      (7)
其中，D'(x)表示每个3D点所对应的深度补全信息，P_1(x)表示经过3D点扩散后的第一平面原点距离，x表示3D点在图像平面上的2D投影，N(x)表示3D点X所在的切平面的法向量，C表示参数矩阵。
在获得每个3D点所在切平面的法向量、每个3D点在图像平面上的2D投影坐标、参数矩阵,以及每个3D点扩散后的第一平面原点距离的数值之后,就可以将这些参数代入式(7),计算出每个3D点对应的深度补全信息,从而根据每个3D点对应的深度补全信息,得到补全后的深度图。
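需要说明的是，以下给出一个按式(7)由扩散后的第一平面原点距离图反变换得到深度补全信息的示意性参考实现（函数名与数值保护方式均为假设）：

```python
import numpy as np

def depth_from_plane_distance(P1: np.ndarray, normals: np.ndarray, C: np.ndarray) -> np.ndarray:
    """按式(7)由扩散后的第一平面原点距离图反算深度：D'(x) = P1(x) / (N(x)^T C^{-1} x)。"""
    H, W = P1.shape
    C_inv = np.linalg.inv(C)
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    x = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)  # 齐次2D投影坐标
    ray = x @ C_inv.T                                                  # C^{-1} x
    denom = np.sum(normals * ray, axis=-1)                             # N(x)^T C^{-1} x
    denom = np.where(np.abs(denom) < 1e-8, 1e-8, denom)                # 避免除零（数值保护为假设的处理）
    return P1 / denom
```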
示例性的,参见图7,本公开实施例提供了一种深度图像补全方法的过程示意,在该示例中,以第一平面原点距离图作为待扩散图。将采集到的深度图
\bar{D}
和二维图像I作为输入，送入预设预测模型1中，得到用于输出初步补全的深度图的子网络2所输出的初步补全的深度图D，以及用于预测法向图的子网络3所输出的法向预测图N，同时，利用一个卷积层，将用于输出初步补全的深度图的子网络2与用于预测法向图的子网络3进行串联4，并对该卷积层中的特征数据进行可视化，得到特征图G。之后，根据初步补全的深度图D、法向预测图N以及所获取到的参数矩阵C，利用式(1)，对三维场景中每个3D点所对应的第一平面原点距离进行计算，进而得到第一平面原点距离图P，最后，基于所得到的第一平面原点距离图P和特征图G，确定第一平面原点距离图P中各个像素的扩散强度5，并基于第一平面原点距离图P中各个像素的像素值，以及扩散强度5得到第一平面原点距离图P中各个像素的扩散后的像素值，得到扩散后的第一平面原点距离图P_1，最后再用式(7)对扩散后的第一平面原点距离图P_1、法向预测图N进行反变换，得到补全后的深度图D_r。
同理的,在基于优化后的第一平面原点距离图作为待扩散图计算扩散后的像素值,可以得到一个经过扩散的优化后的第一平面原点距离图,之后,需要对经过扩散的优化后的第一平面原点距离图进行反变换,得到补全后的深度图。
本公开实施例中,可以先从经过扩散的优化后的第一平面原点距离图中获取每个3D点的平面原点距离,从法向预测图中获取每个3D点所在切平面的法向量,以及每个3D点在图像平面上的2D投影,同时求出参数矩阵的逆矩阵,接着,将每个3D点所在切平面的法向量、每个3D点在图像平面上的2D投影以及参数矩阵的逆矩阵进行相乘,得到乘积结果,再用每个3D点的平面原点距离图像与上述乘积结果相比,并将所得到的比值作为每个3D点对应的深度补全信息,最后将每个3D点对应的深度补全信息作为像素值,得到补全后的深度图。
示例性的,本公开实施例可以用式(8)来对每个3D点对应的深度补全信息进行计算:
D'(x) = \frac{P'_1(x)}{N(x)C^{-1}x}      (8)
其中，D'(x)为3D点对应的深度补全信息，P'_1(x)为像素扩散所得到的3D点的平面原点距离，N(x)为3D点所在的切平面的法向量，x为3D点在图像平面上的2D投影，C为摄像机的参数矩阵。
在获取到3D点的平面原点距离的具体数值、3D点所在切平面的法向量以及3D点在图像平面上的2D投影坐标之后,便能够将这些参数代入式(8),得到每个3D点对应的深度补全信息,进而将每个3D点对应的深度补全信息作为像素值,得到补全后的深度图。
示例性的,本公开实施例提供了一种深度图像补全方法的过程示意图,如图8所示,将所采集到的深度图
\bar{D}
和二维图像I送入预设预测模型1中，得到用于输出初步补全的深度图的子网络2所输出的初步补全的深度图D，用于预测法向图的子网络3所输出的法向预测图N，以及用于输出第一置信度图的子网络4所输出的第一置信度图M，同时，利用卷积层，将用于输出初步补全的深度图的子网络2与用于预测法向图的子网络3进行串联5，并对卷积层中的特征数据进行可视化，得到特征图G。之后，利用式(1)，以及所得到的初步补全的深度图D、法向预测图N和参数矩阵C，计算出每个3D点的第一平面原点距离，进而得到第一平面原点距离图P，同时，利用式(5)，以及雷达所采集到的深度图
\bar{D}、
法向预测图N和参数矩阵C,计算出每个3D点的第二平面原点距离,进而得到第二平面原点距离图
\bar{P}。
接着，会根据第一置信度图M，挑选出具有可靠的第二平面原点距离的像素点，并用可靠的第二平面原点距离，来对第一平面原点距离图P中的各个像素进行对应优化6，得到优化后的第一平面原点距离图P′，并基于优化后的第一平面原点距离图P′和特征图G，确定P′中各个像素的扩散强度7，并基于优化后的第一平面原点距离图P′中各个像素的像素值，以及扩散强度7得到优化后的第一平面原点距离图P′中各个像素的扩散后的像素值，得到经过扩散的优化后的第一平面原点距离图P′_1，最后利用式(8)对经过扩散后的优化后的第一平面原点距离图P′_1、法向预测图N进行反变换，计算得到每个3D点的深度补全信息，进而得到补全后的深度图。
本公开实施例中,能够根据预设扩散范围,为待扩散图的每个像素确定出对应的待扩散像素集合,进而根据特征图、待扩散图的每个像素,以及待扩散每个像素所对应的待扩散像素集合,计算出待扩散图的每个像素所拥有的扩散强度,从而能够根据扩散强度、待扩散图每个像素的像素值和待扩散图每个像素对应的待扩散像素集合,计算待扩散图中每个像素扩散后的像素值,从而得到补全后的深度图。
在本公开的一些实施例中,如图9所示,利用特征图、待扩散图的第二像素点以及待扩散像素集合中的每个像素,计算出待扩散图的第二像素点对应的扩散强度,即S1032的实现过程,可以包括:S1032a-S1032f,如下:
S1032a、利用待扩散图的第二像素点,以及待扩散像素集合中每个像素,计算待扩散图的第二像素点对应的强度归一化参数。
在对待扩散图的第二像素点对应的扩散强度进行计算时,会先用预先设置的预设特征提取模型,对待扩散图的第二像素点进行特征提取,以及对预设扩散范围所确定出的待扩散像素集合中的每个像素,也进行特征提取,然后根据提取到的特征信息计算出待扩散图的第二像素点对应的强度归一化参数,以便于后续利用强度归一化参数得到待扩散图的第二像素点对应的扩散强度。
需要说明的是,强度归一化参数是用来对第一特征像素的特征信息、第二特征像素的特征信息所计算出的结果进行归一化,得到子扩散强度的参数。
可以理解的是,可以用小尺寸的卷积核作为预设特征提取模型,例如1×1的卷积核,也可以用其他能达到相同目的的其他机器学习模型作为预设特征提取模型,本公开实施例在此不作限定。
需要说明的是,由于是利用预设特征提取模型对待扩散图的第二像素点,以及待扩散像素集合中的每个像素进行处理,即利用预设特征提取模型至少可以对两类像素进行处理。因而,可以用相同的预设特征提取模型对待扩散图的第二像素点,和待扩散像素集合中的每个像素进行特征提取,也可以用不同的预设特征提取模型,分别对待扩散图的第二像素点,和待扩散像素集合中的每个像素进行特征提取。
S1032b、将特征图中,与待扩散图的第二像素点对应的像素,作为第一特征像素,与待扩散像素集合中第三像素点对应的像素,作为第二特征像素;第三像素为待扩散像素集合中的任一像素点。
在计算出待扩散图的第二像素点的强度归一化参数之后,就会在特征图中,寻找与待扩散图的第二像素点所对应的像素,并将所找到的像素作为第一特征像素,同时,在特征图中,寻找与待扩散像素集合中第三像素点对应的像素,并将所找到的像素作为第二特征像素。第三像素点可以是待 扩散像素集合中的任何一个像素点。
需要说明的是,由于特征图是将预设预测模型中某一个层的特征数据可视化所得到的图像,为了能在特征图中寻找到与待扩散图的第二像素点对应的像素,可以在预设预测模型中选取与待扩散图具有相同尺寸的卷积层,并将该卷积层中的特征数据可视化得到特征图,使得特征图与待扩散图的像素一一对应,进而可以根据待扩散图的第二像素点的位置信息,找到第一特征像素,同理,可以根据待扩散像素集合中第三像素点的位置信息,找到第二特征像素。当然,装置还可以根据其他方式来寻找第一特征像素和第二特征像素,本公开实施例在此不作限定。
S1032c、提取第一特征像素的特征信息,以及第二特征像素的特征信息。
本公开实施例中,在提取第一特征像素的特征信息时,是先将第一特征像素的像素值提取出来,然后利用预设特征提取模型,对第一特征像素的像素值进行运算,得到第一特征像素的特征信息。同理,在提取第二特征像素的特征信息时,也是先将第二特征像素的像素值提取出来,再利用预设特征提取模型,对第二特征像素的像素值进行运算,得到第二特征像素的特征信息。
示例性的，可以用预设特征提取模型f来对第一特征像素进行特征提取，用预设特征提取模型g来对第二特征像素进行提取。而第一特征像素是特征图中与待扩散图的第二像素点对应的像素，可以表示为G(x_i)，第二特征像素是特征图中与待扩散像素集合中的第三像素点对应的像素，可以表示为G(x_j)，相应的，第一特征像素的特征信息为f(G(x_i))，第二特征像素的特征信息为g(G(x_j))。如此，装置就获得了第一特征像素的特征信息，与第二特征像素的特征信息。
S1032d、利用第一特征像素的特征信息、第二特征像素的特征信息、强度归一化参数以及预设扩散控制参数,计算出由待扩散图的第二像素点和待扩散像素集合中的第三像素点所组成的扩散像素对的子扩散强度。
本公开实施例中,预设扩散控制参数是用于对子扩散强度值进行控制的参数。预设扩散控制参数可以根据实际需求所设置的固定值,也可以是可以进行学习可变参数。
本公开实施例中,通过预设扩散强度计算模型,先对第一特征像素的特征信息求转置,得到转置结果,然后将转置结果与第二特征像素的特征信息相乘,并用1与所得到的乘积做差,得到差值结果,接着,将差值结果进行平方,并与预设扩散控制参数的平方的倍数相比,之后,将所得到比值作为指数函数的指数,将自然对数e作为指数函数的底数进行运算,最后利用强度归一化参数对所得到的运算结果进行归一化,得到最终的子扩散强度。需要说明的是,预设扩散强度计算模型的具体形式,还可以根据实际需求来进行设置,本公开实施例在此不作限定。
示例性的,本公开实施例提供了一种预设扩散强度计算模型,如式(9)所示:
w(x_i, x_j) = \frac{1}{S(x_i)} \exp\left(-\frac{\left(1 - f(G(x_i))^{\mathrm{T}} g(G(x_j))\right)^{2}}{2\sigma^{2}}\right)      (9)
其中，x_i表示待扩散图的第二像素点，x_j表示待扩散像素集合中的第三像素点，S(x_i)表示待扩散图的第二像素点对应的强度归一化参数，G(x_i)表示第一特征像素，G(x_j)表示第二特征像素，f(G(x_i))为第一特征像素的特征信息，g(G(x_j))为第二特征像素的特征信息，σ表示预设扩散控制参数，w(x_i,x_j)表示待扩散图的第二像素点和待扩散像素集合中的第三像素点所组成的扩散像素对的子扩散强度。
在得到第一特征像素的特征信息f(G(x_i))、第二特征像素的特征信息g(G(x_j))，以及计算出待扩散图的第二像素点对应的强度归一化参数S(x_i)之后，就可以将这些参数的具体数值代入式(9)之中，计算出待扩散图的第二像素点和待扩散像素集合中的第三像素点所组成的扩散像素对的子扩散强度w(x_i,x_j)。
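需要说明的是，以下给出一个按式(9)计算单个扩散像素对的子扩散强度的示意性参考实现（其中σ的取值与函数接口均为假设）：

```python
import numpy as np

def sub_diffusion_strength(feat_i: np.ndarray, feat_j: np.ndarray,
                           S_i: float, sigma: float = 1.0) -> float:
    """按式(9)计算扩散像素对 (x_i, x_j) 的子扩散强度。

    feat_i: f(G(x_i))，第一特征像素经 1×1 卷积 f 变换后的特征向量
    feat_j: g(G(x_j))，第二特征像素经 1×1 卷积 g 变换后的特征向量
    S_i:    待扩散图的第二像素点对应的强度归一化参数（见式(10)、式(11)）
    sigma:  预设扩散控制参数（此处取值仅为示意）
    """
    diff = 1.0 - float(feat_i @ feat_j)               # 1 - f(G(x_i))^T g(G(x_j))
    return float(np.exp(-(diff ** 2) / (2.0 * sigma ** 2))) / S_i
```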
S1032e、重复上述步骤,直至确定待扩散图的第二像素点,与待扩散像素集合中的每个像素所组成的像素对的子扩散强度。
S1032f、将待扩散图的第二像素点,与待扩散像素集合中的每个像素所组成的扩散像素对的子扩散强度,作为待扩散图的第二像素点所对应的扩散强度。
本公开实施例中，可以分别针对待扩散图的第二像素点与待扩散像素集合中的每个像素所组成的扩散像素对进行子扩散强度的计算，然后将计算出的所有子扩散强度，共同作为待扩散图的第二像素点的扩散强度。按照这种方式，能够得到待扩散图中每个像素的扩散强度，并根据扩散强度，对待扩散图中的每个像素计算扩散后的像素值，从而得到准确率较高的补全后的深度图。
在本公开的一些实施例中,子扩散强度可以为待扩散图的第二像素点和待扩散像素集合中的第三像素的相似度。
本公开实施例中，可以将待扩散图的第二像素点和待扩散像素集合中的第三像素点的相似度，作为子扩散强度，即可以根据待扩散图的第二像素点，和待扩散像素集合中的第三像素点的相似程度，来决定待扩散像素集合中的第三像素点向待扩散图的第二像素点的扩散的强度，当待扩散图的第二像素点和待扩散像素集合中的第三像素点较为相似时，认为待扩散图的第二像素点和待扩散像素集合中的第三像素点，极有可能是处于三维场景中的同一个平面上的，此时，待扩散像素集合中的第三像素点向待扩散图的第二像素点扩散强度会较大；而当待扩散图的第二像素点和待扩散像素集合中的第三像素点不相似时，认为待扩散图的第二像素点，和待扩散像素集合中的第三像素点不处于同一平面之上，此时，待扩散像素集合中的第三像素点向待扩散图的第二像素点扩散强度会较小，以免在像素扩散过程中发生错误。
本公开实施例中,可以根据待扩散图中的像素,和待扩散像素集合中每个像素的相似程度来确定子扩散强度,以确保用与待扩散图中的像素处于同一平面的像素,计算待扩散图中的各个像素进行扩散后的像素值,从而得到准确率较高的补全后的深度图。
在本公开的一些实施例中,利用待扩散图的第二像素点,以及待扩散像素集合中的每个像素,计算待扩散图的第二像素点对应的强度归一化参数,即S1032a的实现过程,可以包括S201-S204,如下:
S201、提取待扩散图的第二像素点的特征信息,以及待扩散像素集合中第三像素点的特征信息。
需要说明的是,利用预设特征提取模型,提取待扩散图的第二像素点的特征信息时,是先获取待扩散图的第二像素的像素值,并用预设特征提取模型对该像素值进行计算,得到待扩散图的第二像素点的特征信息。同理,在提取待扩散像素集合中第三像素点的特征信息时,也是先获取待扩散像素集合中第三像素点的像素值,然后对该像素值进行用预设特征提取模型进行运算,得到待扩散像素集合中第三像素点的特征信息。
示例性的，当待扩散图的第二像素点表示为x_i，待扩散像素集合中第三像素点表示为x_j时，若用预设特征提取模型f来对待扩散图的第二像素点进行特征提取，用预设特征提取模型g来对待扩散像素集合中第三像素点进行特征提取，则待扩散图的第二像素点的特征信息可以表示为f(x_i)，待扩散像素集合中第三像素点的特征信息可以表示为g(x_j)。当然，也可以利用其它预设特征提取模型来对待扩散图的第二像素点，和待扩散像素集合中第三像素点进行特征提取，本公开实施例在此不作限定。
S202、利用提取的待扩散图的第二像素的特征信息、待扩散像素集合中第三像素点的特征信息以及预设扩散控制参数,计算出待扩散像素集合中第三像素的子归一化参数。
需要说明的是,利用预设子归一化参数计算模型,先对待扩散图的第二像素点的特征信息进行矩阵转置,并将转置结果与待扩散像素集合中第三像素点的特征信息相乘,接着,用1与所得到的乘积结果做差,并对所得到的差值结果求平方,得到平方结果,然后,将平方结果与预设扩散控制参数的平方的倍数相比,最后,将所得到的比值作为指数函数的指数,将自然对数e作为指数函数的底数进行运算,并将最终的运算结果作为待扩散像素集合中第三像素点对应的子归一化参数。当然,预设子归一化参数计算模型可以根据实际需求设置为其他形式,本公开实施例在此不作限定。
示例性的,本公开实施例提供了一种预设子归一化参数计算模型,参见式(10):
s(x_j) = \exp\left(-\frac{\left(1 - f(x_i)^{\mathrm{T}} g(x_j)\right)^{2}}{2\sigma^{2}}\right)      (10)
其中，x_i表示待扩散图的第二像素点，x_j表示待扩散像素集合中的第三像素点，f(x_i)表示待扩散图的第二像素点的特征信息，g(x_j)表示待扩散像素集合中第三像素点的特征信息，σ表示预设扩散控制参数，s(x_j)表示待扩散像素集合中第三像素点对应的子归一化参数。
在得到待扩散图的第二像素点的特征信息f(x_i)、待扩散像素集合中第三像素点的特征信息g(x_j)，以及获取到了预设扩散控制参数σ之后，就可以将这些参数的具体数值代入至式(10)之中，计算出待扩散像素集合中第三像素点对应的子归一化参数。
S203、重复上述步骤,直至得到待扩散像素集合的每个像素的子归一化参数。
S204、将待扩散像素集合的每个像素的子归一化参数进行累加,得到待扩散图的第二像素点对应的强度归一化参数。
示例性的，当待扩散像素集合中第三像素点的子归一化参数为s(x_j)时，装置可用式(11)得到待扩散图的第二像素点对应的强度归一化参数：
S(x_i) = \sum_{x_j \in N_i} s(x_j)      (11)
其中，N_i表示待扩散像素集合，S(x_i)表示待扩散图的第二像素点的强度归一化参数。
在计算待扩散像素集合中每个像素的子归一化参数的数值时,可以将这些子归一化参数的数值直接代入至式(11)进行累加,将所得到的累加结果作为待扩散图的第二像素点对应的强度归一化参数。
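需要说明的是，以下给出一个按式(10)和式(11)计算强度归一化参数的示意性参考实现（函数接口为假设）：

```python
import numpy as np

def strength_normalization(feat_i: np.ndarray, neighbor_feats: np.ndarray,
                           sigma: float = 1.0) -> float:
    """按式(10)、式(11)计算待扩散图的第二像素点对应的强度归一化参数 S(x_i)。

    feat_i:         f(x_i)，待扩散图第二像素点的特征信息
    neighbor_feats: (K, C) 待扩散像素集合中 K 个像素的特征信息 g(x_j)
    sigma:          预设扩散控制参数
    """
    diff = 1.0 - neighbor_feats @ feat_i                 # 每个邻域像素的 1 - f(x_i)^T g(x_j)
    s = np.exp(-(diff ** 2) / (2.0 * sigma ** 2))        # 式(10)：子归一化参数
    return float(np.sum(s))                              # 式(11)：累加得到 S(x_i)
```

按照这种归一化方式，待扩散像素集合内各子扩散强度之和为1，与后文关于累加结果不会超过原像素值的说明相一致。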
本公开实施例中,可以先对待扩散图的第二像素点进行特征提取,对待扩散像素集合中的每个像素进行特征提取,然后利用预设子归一化参数计算模型对所提取到的特征信息,和预设扩散控制参数进行计算,得到子归一化参数,并对所得到的所有子归一化参数进行累加,得到强度归一化参数,以使得装置在后续能够利用强度归一化参数计算扩散强度。
在本公开的一些实施例中,如图10所示,根据待扩散图的第二像素点的扩散强度、待扩散图的第二像素点的像素值以及待扩散像素集合中每个像素的像素值,确定待扩散图的第二像素点的扩散后的像素值,即S1033的实现过程,可以包括:S1033a-S1033d,如下:
S1033a、将扩散强度中每个子扩散强度,分别与待扩散图的第二像素点的像素值相乘,并将所得到的乘积结果进行累加,得到待扩散图的第二像素点的第一扩散部分。
本公开实施例中,先获取到待扩散图的第二像素点的像素值,以及待扩散图的第二像素点的扩散强度,并用待扩散图的第二像素点的扩散强度中,待扩散像素集合中第三像素点的子扩散强度与待扩散图的第二像素点的像素值相乘,得到一个乘积结果,如此重复,直至待扩散像素集合中每个像素的子扩散强度与待扩散图的第二像素点的像素值都相乘后,将得到的所有乘积进行累加,计算出待扩散图的第二像素点的第一扩散部分。
需要说明的是,本公开实施例中,还可以根据其他方式来计算待扩散图的第二像素点的第一扩散部分,本公开实施例在此不作限定。
示例性的,本公开实施例可以用式(12)计算得到第一扩散部分,式(12)如下:
p_1(x_i) = \sum_{x_j \in N(x_i)} w(x_i, x_j)\,P(x_i)      (12)
其中，w(x_i,x_j)为待扩散像素集合中第三像素点对应的子扩散强度，N(x_i)表示待扩散像素集合，P(x_i)表示待扩散图的第二像素点的像素值，p_1(x_i)表示计算出的待扩散图的第二像素点的第一扩散部分。
在获得待扩散图的第二像素点的像素值,以及待扩散像素集合中每个像素的子扩散强度的数值之后,就可以将待扩散图的第二像素点的像素值,与待扩散像素集合中每个像素的子扩散强度的数值代入到式(12)中,计算出待扩散图的第二像素点的第一扩散部分。
需要说明的是,由于计算待扩散图的第二像素点的扩散强度时,用强度归一化参数对子扩散强度进行了归一化,因而,各个子扩散强度与待扩散图的第二像素点的像素值相乘并累加之后,所得到的累加结果的数值不会超过原先待扩散图的第二像素点的像素值。
S1033b、将扩散强度中的每个子扩散强度,分别与待扩散集合中每个像素的像素值对应相乘,并将所得到的乘积累加,得到待扩散图的第二像素点的第二扩散部分。
需要说明的是,在用子扩散强度,分别与待扩散像素集合中的每个像素值进行相乘时,是先用待扩散像素集合中的第三像素点对应的子扩散强度,与待扩散像素集合中的第三像素点的像素值相乘,得到乘积结果,如此循环往复,直到将每个子扩散强度,分别与待扩散像素集合中的每个像素值都进行相乘,最后,将所有的乘积进行累加,并将所得到的累加结果作为待扩散图的第二像素点 的第二扩散部分。
需要说明的是,本公开实施例中,还可以根据其他方法来计算待扩散图的第二像素点的第二扩散部分,本公开实施例在此不作限定。
示例性的,在本公开实施例中,可以用式(13)来计算第二扩散部分:
p_2(x_i) = \sum_{x_j \in N(x_i)} w(x_i, x_j)\,P(x_j)      (13)
其中，w(x_i,x_j)为待扩散像素集合中第三像素点对应的子扩散强度，N(x_i)表示待扩散像素集合，P(x_j)表示待扩散像素集合中第三像素点的像素值，p_2(x_i)表示计算出的待扩散图的第二像素点的第二扩散部分。
在获得待扩散像素集合中第三像素点的像素值,以及待扩散像素集合中每个像素的子扩散强度的数值之后,就可以将待扩散像素集合中第三像素点的像素值,与待扩散像素集合中每个像素的子扩散强度的数值代入到式(13)中,计算出待扩散图的第二像素点的第二扩散部分。
S1033c、根据待扩散图的第二像素点的像素值、待扩散图的第二像素点的第一扩散部分,以及待扩散图的第二像素点的第二扩散部分,计算出待扩散图的第二像素点的扩散后的像素值。
本公开实施例中,可以用待扩散图的第二像素点的像素值,先减去第一扩散像素部分,然后用得到的差值与第二扩散部分相加,将最后的相加结果作为扩散后的像素值。需要说明的是,本公开实施例还可以对待扩散图的第二像素点的像素值、第一扩散像素部分和第二扩散像素部分进行其他的处理,得到待扩散图第二像素点扩散后的像素值,本公开实施例在此不作限定。
示例性的,本公开实施例可以根据式(14)得到待扩散图的第二像素点的扩散像素值,并完成像素扩散:
P(x_i) \leftarrow \left(1 - \sum_{x_j \in N(x_i)} w(x_i, x_j)\right) P(x_i) + \sum_{x_j \in N(x_i)} w(x_i, x_j)\,P(x_j)      (14)
其中，P(x_i)表示待扩散图的第二像素点的像素值，w(x_i,x_j)为待扩散像素集合中第三像素点对应的子扩散强度，N(x_i)表示待扩散像素集合，P(x_j)表示待扩散像素集合中第三像素点的像素值。
在得到待扩散图的第二像素点的像素值、待扩散像素集合中每个像素所对应的子扩散强度、待扩散像素集合中每个像素的像素值之后,就可以将这些参数的具体数值代入式(14)中,计算出待扩散图的第二像素点的扩散后的像素值。
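需要说明的是，以下给出一个按式(12)至式(15)计算待扩散图的第二像素点的扩散后的像素值的示意性参考实现（函数接口为假设）：

```python
import numpy as np

def diffuse_pixel(p_i: float, neighbor_vals: np.ndarray, weights: np.ndarray) -> float:
    """按式(12)~式(15)计算待扩散图的第二像素点扩散后的像素值。

    p_i:           待扩散图第二像素点的像素值 P(x_i)
    neighbor_vals: 待扩散像素集合中各像素的像素值 P(x_j)
    weights:       对应的子扩散强度 w(x_i, x_j)（已用 S(x_i) 归一化）
    """
    p1 = float(np.sum(weights)) * p_i            # 式(12)：第一扩散部分
    p2 = float(np.sum(weights * neighbor_vals))  # 式(13)：第二扩散部分
    return p_i - p1 + p2                         # 式(15)：扩散后的像素值
```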
示例性的,本公开实施例给出了对式(14)进行推导的过程:
本公开实施例中,可以用待扩散图的第二像素点的像素值,先减去第一扩散像素部分,然后用得到的差值与第二扩散部分相加,将最后的相加结果作为扩散像素值,可以用式(15)来表示:
P(x_i) \leftarrow P(x_i) - p_1(x_i) + p_2(x_i)      (15)
其中，p_1(x_i)表示计算出的待扩散图的第二像素点的第一扩散部分，p_2(x_i)表示计算出的待扩散图的第二像素点的第二扩散部分，P(x_i)表示待扩散图的第二像素点的像素值。
将式(12)和式(13)代入到式(15)中,可以得到式(16):
P(x_i) \leftarrow P(x_i) - \sum_{x_j \in N(x_i)} w(x_i, x_j)\,P(x_i) + \sum_{x_j \in N(x_i)} w(x_i, x_j)\,P(x_j)      (16)
对式(16)进行合并整理,即可得到式(14)。
示例性的,本公开实施例提供了一种计算待扩散图的第二像素点的扩散后的像素值的示意,如图11所示,在基于待扩散图1和特征图2,计算待扩散图的第二像素点的扩散后的像素值时,先要为待扩散图的第二像素点确定出待扩散像素集合,在本公开实施例中,按照8邻域来确定待扩散像素集合3,如图11所示,待扩散图的第二像素点x i位于左上方九宫格的中心,周围的8个像素点所组成的集合为待扩散像素集合3。接着,要从特征图2中找到与待扩散图的第二像素点对应的第一特征像素,以及与待扩散像素集合中第三像素对应的第二特征像素,并利用预设特征提取模型f对第一特征像素进行特征提取,利用预设特征提取模型g对第二特征像素进行特征提取(特征提取过程未示出),其中,f和g都被设置为1×1的卷积核。接着,利用预设扩散强度计算模型4,即式(9),以及计算扩散强度所需要的参数,计算出扩散强度,然后将待扩散图的第二像素点的像素值、扩散强度、待扩散像素集合中每个像素的像素值代入式(14),计算出待扩散图的第二像素点扩散后的像 素值5,进而得到补全后的深度图6。如此,就完成了对待扩散图的第二像素点的扩散后的像素值的计算。
S1033d、重复上述步骤,直至计算出待扩散图中每个像素的扩散后的像素值。
在完成对待扩散图的第二像素点的像素扩散之后,便会继续重复上述步骤,计算出待扩散图中每个像素的扩散后的像素值,从而得到补全后的深度图。
本公开实施例中,能够根据待扩散图中的每个像素的像素值、以及待扩散图每个像素所对应的待扩散像素集合中的所有像素的像素值,以及所计算出的扩散强度,逐个对待扩散图中的每个像素的扩散后的像素值进行计算,以使得能够充分利用所采集到的深度图,得到准确率较高的补全后的深度图。
在本公开的一些实施例中,在基于待扩散图和特征图,实现像素扩散,得到补全后的深度图之后,即S104之后,该方法还可以包括:S105,如下:
S105、将补全后的深度图作为待扩散图,重复执行基于待扩散图和特征图确定待扩散图中的各个像素的扩散强度的步骤,基于待扩散图中的各个像素的像素值以及待扩散图中的各个像素的扩散强度确定待扩散图中的各个像素的扩散后的像素值的步骤,以及根据待扩散图中的各个像素的扩散后的像素值确定补全后的深度图的步骤,直至达到预设重复次数。
在得到补全后的深度图之后,还能够继续将该补全后的深度图,重新作为待扩散图,计算待扩散图中的各个像素的扩散后的像素值,使像素扩散更加充分,得到优化后的补全后的深度图。
在本公开的一些实施例中,可以将预设重复次数设置为8次,在得到补全后的深度图之后,会针对该补全后的深度图,继续进行7次上述步骤,以使像素扩散更加充分。需要说明的是,预设重复次数可以根据实际需求进行设定,本公开实施例在此不作限定。
在本公开的一些实施例中,在根据待扩散图中各个像素的扩散后的像素值确定补全后的深度图之后,即S104之后,该方法还可以包括:S106,如下:
S106、将补全后的深度图作为初步补全的深度图,重复执行基于初步补全的深度图、摄像机的参数矩阵与法向预测图,计算出第一平面原点距离图,并将第一平面原点距离图作为待扩散图的步骤,基于待扩散图和特征图确定待扩散图中各个像素的扩散强度的步骤,基于待扩散图中各个像素的像素值以及待扩散图中各个像素的扩散强度确定待扩散图中各个像素的扩散后的像素值的步骤,以及根据待扩散图中各个像素的扩散后的像素值确定补全后的深度图的步骤,直至达到预设重复次数。
在本公开的一些实施例中,每一次执行的基于初步补全的深度图、摄像机的参数矩阵与法向预测图,计算出第一平面原点距离,并将第一平面原点距离图作为待扩散图的步骤,包括:
基于初步补全的深度图、摄像机的参数矩阵与法向预测图,计算出第一平面原点距离图的步骤;基于深度图和二维图像确定第一置信度的步骤;基于深度图、参数矩阵与法向预测图,计算出第二平面原点距离图的步骤;以及根据第一置信度图中的像素、第二平面原点距离图中的像素以及第一平面原点距离图中的像素,对第一平面原点距离图中的像素进行优化,得到优化后的第一平面原点距离图,将优化后的第一平面原点距离图作为待扩散图的步骤。
本公开实施例中,基于采集到的深度图
\bar{D}
和二维图像得到初步补全的深度图D、法向预测图N和第一置信度图M之后,对于初步补全的深度图D中的所有像素x,计算出第二平面原点距离信息,进而得到第二平面原点距离图,并计算出所有像素的第一平面原点距离信息,进而得到第一平面原点距离图。然后,判断当前重复次数小于预设迭代次数时,对第一平面原点距离图中的每个像素值P(x),计算出替换距离信息,并进行像素值的优化,进而得到优化后的第一平面原点距离图。之后,将优化后的第一平面原点距离图作为待扩散图,对优化后的第一平面原点距离图中的第二像素点,确定出对应的待扩散像素集合,并计算第二像素点对应的扩散强度,然后根据扩散强度中的每个子扩散强度、待扩散像素集合中每个像素的像素值以及优化后的第一平面原点距离图中的第二像素点的像素值,计算出优化后的第一平面原点距离图的第二像素点扩散后的像素值,得到经过扩散的优化后的第一平面原点距离图,再对经过扩散的优化后的第一平面原点距离图进行反变换,得到补全后的深度图。当得到补全后的深度图之后,会给当前重复次数i加上1,得到新的当前重复次数,然后将新的当前重复次数与预设重复次数进行比较,在新的当前重复次数小于预设重复次数时继续进行上述过程,直至进行新的当前重复次数不再小于预设重复次数,得到最终的补全后的深度图。
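需要说明的是，以下给出一个将上述重复扩散流程串联起来的示意性参考实现，其中复用了前文各示意函数（plane_origin_distance、refine_plane_distance、gather_neighbors、depth_from_plane_distance 均为前文示例中假设的函数名），扩散强度张量 W 假设已根据特征图预先计算得到：

```python
import numpy as np

def diffuse_map(P: np.ndarray, W: np.ndarray) -> np.ndarray:
    """对整幅待扩散图执行一次式(14)的像素扩散。W 为 (H, W, K) 的子扩散强度张量。"""
    neigh = gather_neighbors(P)                         # (H, W, K) 邻域像素值
    return (1.0 - W.sum(-1)) * P + (W * neigh).sum(-1)

def complete_depth(sparse_depth, D0, N, M, C, W, num_repeats: int = 8):
    """重复执行“计算第一/第二平面原点距离 -> 置信度优化 -> 像素扩散 -> 反变换”的流程示意。"""
    P_bar = plane_origin_distance(sparse_depth, N, C)   # 式(5)：第二平面原点距离图（稀疏深度）
    D = D0                                              # 初步补全的深度图
    for _ in range(num_repeats):                        # 预设重复次数，此处取8仅为示意
        P = plane_origin_distance(D, N, C)              # 式(1)：第一平面原点距离图
        P = refine_plane_distance(P, P_bar, M)          # 式(6)：置信度优化
        P = diffuse_map(P, W)                           # 式(14)：像素扩散
        D = depth_from_plane_distance(P, N, C)          # 式(7)/(8)：反变换回深度图
    return D
```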
示例性的,本公开实施例给出了预设重复次数的取值对补全后的深度图的误差的影响,如图12(a)所示,用KITTI数据集进行测试,横坐标为预设重复次数,纵坐标为均方根误差(Root Mean  Square Error,RMSE),RMSE的单位为mm,图中3条曲线,分别为全样本测试次数(epoch)取不同值所得到的结果。从图12(a)可以看出,当epoch=10时,即KITTI数据集中的全部样本被测试了10次时,RMSE是随着预设重复次数的增加而下降的,当预设重复次数为20时,RMSE最小,接近于0;当epoch=20时,RMSE先是随着预设重复次数下降然后保持不变,RMSE接近于0;当epoch=30时,RMSE随着预设重复次数的增加先下降,然后有小幅度的上升,但是RMSE最高也不会超过5,直至最后RMSE接近于0。图12(b)是用NYU数据集进行测试的结果图,与图12(a)相同,图12(b)的也横坐标为预设重复次数,纵坐标为RMSE,图中3条曲线,分别为epoch取不同值所得到的结果。从图12(b)可以看出,不管是epoch=5、epoch=10还是epoch=15,随着预设重复次数的增加,RMSE都是先减小,直至接近于0,然后保持不变。从图12(a)和图12(b)可以看出,进行预设重复次数的像素扩散,能够显著减小补全后的深度图的RMSE,即进行预设重复次数的像素扩展,能够进一步提升补全后的深度图的准确度。
本公开实施例中,在得到补全后的深度图之后,还能继续对补全后的深度图重复进行补全,从而进一步提高补全后的深度图的准确度。
在本公开的一些实施例中，深度图像补全方法可以采用预设预测模型来实现。在采集到目标场景的深度图和二维图像之后，先获取预先存储于深度图像补全装置内部的预设预测模型，然后将深度图和二维图像作为输入送进预设预测模型中进行计算，以进行初步的预测处理，并根据预设预测模型输出的结果，得到待扩散图和特征图，以便于后续基于待扩散图和特征图，实现像素扩散。
可以理解的是,本公开实施例中,预设预测模型是已经训练好的模型。本公开实施例中,可以使用已经训练好的卷积神经网络(Convolutional Neural Networks,CNN)模型作为预设预测模型。当然,还可以根据实际情况使用其他能够达到的相同目的的网络模型,或是其他机器学习模型作为预设预测模型,本公开实施例在此不作限定。
示例性的,本公开实施例中,可以使用CNN中的残差网络(Residual Networks,ResNet)的变体ResNet-34或是ResNet-50作为预设预测模型。
需要说明的是,由于在利用预设预测模型对采集到的深度图和二维图像进行预测处理之后,可以根据实际设置获得多种预测结果,例如初步补全的深度图、法向预测图,乃至深度图对应的置信度图等,因此,可以将预设预测模型所得到的预测结果直接作为待扩散图,也可以对预测结果进行处理得到待扩散图。
需要说明的是,所得到的待扩散图是指根据预设预测模型的输出,所得到的用来进行像素值的扩散的图;而所得到的特征图,是指将深度图和二维图像输入预设预测模型中进行计算之后,将预设预测模型中的某一层的特征数据进行可视化,以得到特征图。
需要说明的是,由于在利用预设预测模型,对深度图和二维图像进行预测,可以得到初步补全的深度图和法向预测图,即预设预测模型具有两个输出,因此,在获得特征图时,可以仅将用于输出初步补全的深度图的子网络中的特征数据进行可视化,得到特征图,也可以是仅将用于输出法向预测图的子网络中的特征数据进行可视化,得到特征图,还可以将用于输出初步补全的深度图的子网络,与用于输出法向预测图的子网络进行串联,对串联网络中的特征数据进行可视化,得到特征图。当然,还可以利用其他方式得到特征图,本公开实施例在此不作限定。
示例性的,当预设预测模型为ResNet-34时,可以先将深度图和二维图像送入ResNet-34进行预测,然后对ResNet-34的倒数第二层中的特征数据进行可视化,并将可视化结果作为特征图。当然,还可以用其他方式得到特征图,本公开实施例在此不作限定。
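需要说明的是，以下给出一个利用前向钩子将预设预测模型中某一层的特征数据取出作为特征图的示意性参考做法（模型的前向接口与所选层均为假设）：

```python
import torch

def extract_feature_map(model: torch.nn.Module, layer: torch.nn.Module,
                        depth: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
    """通过前向钩子取出预设预测模型中某一层的特征数据，作为特征图 G 的示意做法。"""
    feats = {}

    def hook(_module, _inputs, output):
        feats["G"] = output.detach()

    handle = layer.register_forward_hook(hook)
    with torch.no_grad():
        model(depth, image)      # 假设模型的前向接口接受深度图与二维图像两个输入
    handle.remove()
    return feats["G"]
```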
在本公开的一些实施例中,预设预测模型可以采用如下方法训练得到:
S107、获取训练样本以及预测模型。
在用雷达采集目标场景的深度图,以及通过摄像机采集目标场景的二维图像之前,还需要获取训练样本以及预测模型,以便于后续利用训练样本对预测模型进行训练。
需要说明的是,由于通过预设预测模型可以得到初步补全的深度图、法向预测图、特征图和第一置信度图,因此,所获取的训练样本中,至少包含有训练深度图样本、训练二维图像样本、以及与训练深度图样本和训练二维图像样本所对应的初步补全的深度图的真值图、法向预测图的真值图和第一置信度图的真值图。其中,初步补全的深度图的真值图指的由三维场景的真实的深度信息作为像素值所构成的图像,法向预测图的真值图是对初步补全的深度图的真值图运用主成分分析(Principal Component Analysis,PCA)所计算出的图像,而第一置信度图的真值图则是用训练深度图,和深度图的真值图所计算出的图像。
本公开实施例中，对每个3D点的置信度的真值进行计算，然后将每个3D点的置信度的真值作为像素值，得到第一置信度图的真值图。在对每个3D点的置信度的真值进行计算时，先用3D点的深度信息，减去3D点深度信息的真值，并对所得到的差值取绝对值，得到绝对值结果，之后，将绝对值结果与预设的误差容错参数相比，最后，将所得到的比值的相反数作为指数函数的指数，将自然对数e作为指数函数的底数进行运算，得到每个3D点的置信度的真值。
示例性的,本公开实施例中,可以利用式(17)来计算3D点的置信度真值,式(17)如下:
M^{*}(x) = \exp\left(-\frac{\left|\bar{D}(x) - D^{*}(x)\right|}{b}\right)      (17)
其中，\bar{D}(x)表示3D点的深度信息，D^{*}(x)表示3D点的训练深度信息的真值，b为预设的误差容错参数，M^{*}(x)为计算得到的置信度的真值。
在获取到每个3D点的深度信息,以及与每个3D点的训练深度信息的真值,以及预设的误差容错参数的数值之后,可以将这些数据代入式(17),逐个计算出每个3D点的置信度的真值,进而将每个3D点的置信度的真值作为像素值,得到第一置信度图的真值图。
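需要说明的是，以下给出一个按式(17)计算第一置信度图的真值图的示意性参考实现（其中仅在稀疏深度图存在观测的位置计算置信度，其余位置置0，该处理为示例中的假设）：

```python
import numpy as np

def confidence_ground_truth(sparse_depth: np.ndarray, gt_depth: np.ndarray,
                            b: float = 1.0) -> np.ndarray:
    """按式(17)计算第一置信度图的真值：M*(x) = exp(-|D(x) - D*(x)| / b)。"""
    mask = sparse_depth > 0                              # 仅在有深度观测的像素处计算
    conf = np.exp(-np.abs(sparse_depth - gt_depth) / b)  # 式(17)
    return np.where(mask, conf, 0.0)
```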
需要说明的是,本公开实施例中,预设误差容错参数会对第一置信度图的真值图进行计算过程造成影响,因而,预设误差容错参数可以根据经验来设置,本公开实施例在此不作限定。
示例性的，本公开实施例提供了一种预设误差容错参数对第一置信度图的真值图的误差影响，如图13(a)所示，横坐标为预设误差容错参数b的取值，纵坐标为利用不同预设误差容错参数b，所计算出的第一置信度图的真值图的均方根误差（Root Mean Square Error，RMSE），RMSE的单位为mm。从图13(a)可以看出，当b的取值从10^{-1}逐渐增大，直至达到10^{1}时，第一置信度图的真值图的RMSE是先减小，然后增大的，且在当b为10^{0}时，第一置信度图的真值图的RMSE达到最小。由此可以看出，为了使第一置信度图的真值图的RMSE最小，可以将预设误差容错参数b设置为10^{0}。本公开实施例还提供了一种预设误差容错参数的取值对置信度的真值-绝对误差（Absolute Error，AE）曲线分布的影响，图13(b)横坐标为绝对误差，其中AE的单位为m，纵坐标为置信度的真值M^{*}，图13(b)中的5条曲线从左到右依次为b=0.1时的M^{*}-AE曲线分布，b=0.5时的M^{*}-AE曲线分布，b=1.0时的M^{*}-AE曲线分布，b=1.5时的M^{*}-AE曲线分布，b=2.0时的M^{*}-AE曲线分布以及b=5.0时的M^{*}-AE曲线分布。从这些曲线分布中可以看出，当b取值过小时，例如b=0.1、b=0.5时，即使AE很小，置信度的M^{*}也比较低，在实际应用时无法针对误差较低的置信度真值给出较高的置信度，即置信度不准确，同样的，当b取值过大时，即b=2.0、b=5.0时，虽然AE较大，但是置信度的真值M^{*}却比较高，在实际应用时对噪声的容忍程度比较高，也就无法针对误差较高的置信度的真值给出较低的置信度。当b取1时，对于小AE，置信度M^{*}较高，对于大AE，置信度M^{*}较低，能够针对置信度的真值给出合适的置信度。
S108、利用训练样本对预测模型进行训练,得到预测参数。
在得到训练样本之后,就会用训练样本对预测模型进行有监督训练,直至损失函数达到要求时停止训练,得到预测参数,以便于后续得到预设预测模型。
需要说明的是,在对预测模型进行训练时,是将训练深度图样本和训练二维图像样本作为输入,利用训练深度图样本和训练二维图像样本所对应的初步补全的深度图的真值图、法向预测图的真值图和第一置信度图的真值图作为监督,进行有监督训练。
本公开实施例中,可以为初步补全的深度图的真值图、法向预测图的真值图和第一置信度图的真值图分别设置子损失函数,然后将这些子损失函数,分别与其对应的损失函数的权重调整参数相乘,最后根据相乘结果,得到预设预测模型的损失函数。
示例性的,可以将预设预测模型的损失函数设置为:
L = L_D + βL_N + γL_C      (18)
其中，L_D为初步补全的深度图的真值图对应的子损失函数，L_N为法向预测图的真值图对应的子损失函数，L_C为第一置信度图的真值图对应的子损失函数，β和γ为损失函数的权重调整参数。当然，还可以将预设预测模型的损失函数设置为其他形式，本公开实施例在此不作限定。
需要说明的是，损失函数的权重调整参数可以根据实际情况来进行设置，本公开实施例在此不作限定。
初步补全的深度图的真值图对应的子损失函数可以设置为:
L_D = \frac{1}{n} \sum_{x} \left(D(x) - D^{*}(x)\right)^{2}      (19)
其中，D(x)表示从训练样本中预测出的3D点的初步深度信息，D^{*}(x)表示3D点的原始深度信息的真值，n为初步补全的深度图的像素总数目。
法向预测图的真值图对应的子损失函数可以设置为:
L_N = \frac{1}{n} \sum_{x} \left\|N(x) - N^{*}(x)\right\|^{2}      (20)
其中，N(x)表示从训练样本中预测出的3D点所在切平面的法向量，N^{*}(x)表示3D点的真实法向量，n为法向预测图的像素总数目。
第一置信度图的真值图对应的子损失函数可以设置为:
L_C = \frac{1}{n} \sum_{x} \left(M(x) - M^{*}(x)\right)^{2}      (21)
其中，M(x)表示从训练样本中预测出的3D点对应的置信度信息，M^{*}(x)表示通过式(17)计算出的3D点对应的置信度信息的真值，n为第一置信度图的像素总数。
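需要说明的是，以下给出一个按式(18)至式(21)计算总损失的示意性参考实现（β、γ的取值及各子损失的具体范数形式均为示例中的假设）：

```python
import numpy as np

def total_loss(D, D_gt, N, N_gt, M, M_gt, beta: float = 1.0, gamma: float = 1.0) -> float:
    """按式(18)~式(21)计算监督训练的总损失，beta、gamma 为损失权重调整参数（取值仅为示意）。"""
    L_D = np.mean((D - D_gt) ** 2)                       # 式(19)：初步补全的深度图对应的子损失
    L_N = np.mean(np.sum((N - N_gt) ** 2, axis=-1))      # 式(20)：法向预测图对应的子损失
    L_C = np.mean((M - M_gt) ** 2)                       # 式(21)：第一置信度图对应的子损失
    return L_D + beta * L_N + gamma * L_C                # 式(18)：总损失
```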
需要说明的是,在训练过程中,会有较多的超参数会对最后所得到的预设预测模型的性能造成影响,例如采样率等。因此,装置可以选择合适的超参数,对预测模型进行训练,以便于后续得到效果较好的预设预测模型。
S109、利用预测参数以及预测模型构成预设预测模型。
在对预测模型进行训练,得到预测参数之后,就可以用所得到的预测参数和预测模型,共同构成预设预测模型,以便于后续装置能够利用预设预测模型,对装置所采集的深度图和二维图像进行预测。
示例性的,本公开实施例给出了用预设预测模型的采样率对补全后的深度图的影响示意,如图14(a)所示,在KITTI数据集上进行测试,横坐标为采样率,纵坐标为RMSE,RMSE单位为mm,图中3条曲线,分别为epoch=10、epoch=20和epoch=30所得到的结果。从图14(a)中可以看出,无论是epoch=10、epoch=20还是epoch=30,当采样率从0开始向1.0递增时,RMSE越来越小的,且都在采样率为1.0时RMSE达到了最小。图14(b)为在NYU数据集上进行测试的结果,与图14(a)类似,图14(b)的横坐标为采样率,纵坐标为RMSE,RMSE单位为mm,图中3条曲线,分别为epoch=10、epoch=20和epoch=30所得到的结果。与图14(a)类似,在图14(b)中,无论是epoch=10、epoch=20还是epoch=30,当采样率从0开始向1.0递增时,RMSE会越来越小,并在采样率为1.0时达到最小。从图14(a)和图14(b)可以看出,为预设预测模型选择合适的采样率,能够使得补全后的深度图的RMSE显著下降,即得到效果较好的补全后深度图。
本公开实施例中,能够对预测模型进行训练,得到预测参数,并用预测参数和预测模型构成预设预测模型,使得在之后可以用预设预测模型对实时采集的深度图和二维图像进行预测处理。
示例性的,本公开实施例给出一种深度图像补全方法与相关技术中的深度补全技术的效果比较示意图,如图15(a)所示,为所采集的三维场景的深度图和二维图像示意图,为了便于观察,将深度图和二维图像重叠在一起进行示出。图15(b)为相关技术中利用卷积空间传播网络(Convolutional Spatial Propagation Network,CSPN)进行深度补全所得到的补全后的深度图,图15(c)为相关技术中利用NConv-卷积神经网络(NConv-Convolutional Neural Network,NConv-CNN)所得到的补全后的深度图,图15(d)为相关技术中利用稀疏-稠密(Sparse-to-Dense)方法所得到的补全后的深度图,图15(e)为利用本公开实施例提供的所预测出的法向预测图,图15(f)为本公开实施例提供的所预测出的第一置信度图,图15(g)为利用本公开实施例提供的一种深度图像补全方法所得到的补全后的深度图。将图15(b)、图15(c)、图15(d)与图15(g)进行比较,可以看出,相比与相关技术,本公开实施例提供的一种深度图像补全方法所得到的补全后的深度图的效果更好,出现具有错误深度信息的像素点的数目更少,并且补全后的深度图的细节信息也更为全面。
本领域技术人员可以理解,在具体实施方式的上述方法中,各步骤的撰写顺序并不意味着严格的执行顺序而对实施过程构成任何限定,各步骤的具体执行顺序应当以其功能和可能的内在逻辑确定。
在本公开的一些实施例中,如图16所示,本公开实施例提供了一种深度图像补全装置1,该深度图像补全装置1可以包括:
采集模块10,被配置为通过设置的雷达采集目标场景的深度图,以及通过设置的摄像机采集所述目标场景的二维图像;
处理模块11,被配置为根据采集到的深度图和所述二维图像,确定待扩散图以及特征图;基于所述待扩散图和所述特征图,确定所述待扩散图中的各个像素的扩散强度;所述扩散强度表征所述待扩散图中的各个像素的像素值向相邻像素扩散的强度;
扩散模块12,被配置为基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度,确定补全后的深度图。
在本公开的一些实施例中,所述扩散模块12,还被配置为基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度,确定所述待扩散图中的各个像素的扩散后的像素值;根据所述待扩散图中的各个像素的扩散后的像素值确定补全后的深度图。
在本公开的一些实施例中,所述待扩散图为初步补全的深度图;所述扩散模块12,在被配置为根据所述待扩散图中的各个像素的扩散后的像素值确定补全后的深度图时,还被配置为将所述待扩散图中的各个像素的扩散后的像素值作为扩散后的图像的各个像素的像素值;将扩散后的图像作为补全后的深度图。
在本公开的一些实施例中,所述待扩散图为第一平面原点距离图;所述处理模块11,在被配置为根据所述深度图和所述二维图像确定待扩散图以及特征图时,还被配置为获取所述摄像机的参数矩阵;根据所述深度图和所述二维图像确定所述初步补全的深度图、所述特征图和法向预测图;所述法向预测图是指将三维场景各点的法向量作为像素值的图像;根据所述初步补全的深度图、所述摄像机的参数矩阵与所述法向预测图,计算出第一平面原点距离图;所述第一平面原点距离图是利用所述初步补全的深度图计算出的所述摄像机至所述三维场景各点所在平面的距离作为像素值的图像。
在本公开的一些实施例中,所述处理模块11,还被配置为根据所述深度图和所述二维图像确定第一置信度图;其中,所述第一置信度图是指采用所述深度图中各个像素对应的置信度作为像素值的图像;根据所述深度图、所述参数矩阵与所述法向预测图,计算出第二平面原点距离图;所述第二平面原点距离图是利用所述深度图计算出的所述摄像机至所述三维场景各点所在平面的距离作为像素值的图像;根据所述第一置信度图中的像素、所述第二平面原点距离图中的像素以及所述第一平面原点距离图中的像素,对所述第一平面原点距离图中的像素进行优化,得到优化后的第一平面原点距离图。
在本公开的一些实施例中,所述处理模块11,在被配置为根据所述第一置信度图中的像素、所述第二平面原点距离图中的像素以及所述第一平面原点距离图中的像素,对所述第一平面原点距离图中的像素进优化,得到优化后的第一平面原点距离图时,还被配置为从所述第二平面原点距离图中,确定出与所述第一平面原点距离图的第一像素点所对应的像素点,作为替换像素点,并确定所述替换像素点的像素值;所述第一像素点为所述第一平面原点距离图中的任一像素点;从所述第一置信度图中,确定出所述替换像素点所对应的置信度信息;根据所述替换像素点的像素值、所述置信度信息以及所述第一平面原点距离图的第一像素点的像素值,确定所述第一平面原点距离图的所述第一像素点的优化后的像素值;重复上述步骤,直至确定所述第一平面原点距离图中每个像素的优化后的像素值,得到所述优化后的第一平面原点距离图。
在本公开的一些实施例中,所述处理模块11,在被配置为基于所述待扩散图和所述特征图,确定所述待扩散图中的各个像素的扩散强度时,还被配置为根据预设扩散范围,从所述待扩散图中确定出所述待扩散图的第二像素点对应的待扩散像素集合,并确定出所述待扩散像素集合中每个像素的像素值;所述第二像素点为所述待扩散图中的任一像素点;利用所述特征图、所述待扩散图的第二像素点以及所述待扩散像素集合中的每个像素,计算出所述待扩散图的第二像素点对应的扩散强度;
所述扩散模块12,在被配置为基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度,确定所述待扩散图中的各个像素的扩散后的像素值时,还被配置为根据所述待扩散图的第二像素点的扩散强度、所述待扩散图的第二像素点的像素值以及所述待扩散像素集合中每个像素的像素值,确定所述待扩散图的第二像素点的扩散后的像素值;重复上述步骤,直至确定所述待扩散图中每个像素扩散后的像素值。
在本公开的一些实施例中,所述处理模块11,在被配置为利用所述特征图、所述待扩散图的第二像素点以及所述待扩散像素集合中的每个像素,计算出所述待扩散图的第二像素点对应的扩散强度时,还被配置为利用待扩散图的第二像素点,以及所述待扩散像素集合中每个像素,计算所述待 扩散图的第二像素点对应的强度归一化参数;将所述特征图中,与所述待扩散图的第二像素点对应的像素,作为第一特征像素,与所述待扩散像素集合中第三像素点对应的像素,作为第二特征像素;所述第三像素点为待扩散像素集合中的任一像素;提取所述第一特征像素的特征信息,以及所述第二特征像素的特征信息;利用所述第一特征像素的特征信息、所述第二特征像素的特征信息、所述强度归一化参数以及预设扩散控制参数,计算出由所述待扩散图的第二像素点和所述待扩散像素集合中的第三像素点所组成的扩散像素对的子扩散强度;重复上述步骤,直至确定所述待扩散图的第二像素点,与所述待扩散像素集合中的每个像素所组成的像素对的子扩散强度;将所述待扩散图的第二像素点,与所述待扩散像素集合中的每个像素所组成的扩散像素对的子扩散强度,作为所述待扩散图的第二像素点所对应的扩散强度。
在本公开的一些实施例中,所述处理模块11,在被配置为利用待扩散图的第二像素点,以及所述待扩散像素集合中每个像素,计算所述待扩散图的第二像素点对应的强度归一化参数时,还被配置为提取所述待扩散图的第二像素点的特征信息,以及所述待扩散像素集合中第三像素点的特征信息;利用提取的待扩散图的第二像素点的特征信息、所述待扩散像素集合中第三像素点的特征信息以及所述预设扩散控制参数,计算出所述待扩散像素集合中第三像素点的子归一化参数;重复上述步骤,直至得到所述待扩散像素集合的每个像素的子归一化参数;将所述待扩散像素集合的每个像素的子归一化参数进行累加,得到所述待扩散图的第二像素点对应的强度归一化参数。
在本公开的一些实施例中,所述扩散模块12,在被配置为根据所述待扩散图的第二像素点的扩散强度、所述待扩散图的第二像素点的像素值以及所述待扩散像素集合中每个像素的像素值,确定所述待扩散图的第二像素点的扩散后的像素值时,还被配置为将所述扩散强度中每个子扩散强度,分别与所述待扩散图的第二像素点的像素值相乘,并将所得到的乘积结果进行累加,得到所述待扩散图的第二像素点的第一扩散部分;将所述扩散强度中的每个子扩散强度,分别与待扩散像素集合中每个像素的像素值对应相乘,并将所得到的乘积累加,得到所述待扩散图的第二像素点的第二扩散部分;根据所述待扩散图的第二像素点的像素值、所述待扩散图的第二像素点的第一扩散部分,以及所述待扩散图的第二像素点的第二扩散部分,计算出所述待扩散图的第二像素点的扩散后的像素值。
在本公开的一些实施例中,所述扩散模块12,还被配置为将所述补全后的深度图作为待扩散图,重复执行基于所述待扩散图和所述特征图确定所述待扩散图中的各个像素的扩散强度的步骤,基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度确定所述待扩散图中的各个像素的扩散后的像素值的步骤,以及根据所述待扩散图中的各个像素的扩散后的像素值确定补全后的深度图的步骤,直至达到预设重复次数。
在本公开的一些实施例中,所述扩散模块12,还被配置为将所述补全后的深度图作为初步补全的深度图,重复执行基于所述初步补全的深度图、所述摄像机的参数矩阵与所述法向预测图,计算出第一平面原点距离图,并将所述第一平面原点距离图作为待扩散图的步骤,基于所述待扩散图和所述特征图确定所述待扩散图中各个像素的扩散强度的步骤,基于所述待扩散图中各个像素的像素值以及所述待扩散图中各个像素的扩散强度确定所述待扩散图中各个像素的扩散后的像素值的步骤,以及根据所述待扩散图中各个像素的扩散后的像素值确定补全后的深度图的步骤,直至达到预设重复次数。
在本公开的一些实施例中,所述扩散模块12,在被配置为每一次执行的基于所述初步补全的深度图、所述摄像机的参数矩阵与所述法向预测图,计算出第一平面原点距离图,并将所述第一平面原点距离图作为待扩散图的步骤时,还被配置为基于所述初步补全的深度图、所述摄像机的参数矩阵与所述法向预测图,计算出第一平面原点距离图的步骤;基于所述深度图和所述二维图像确定第一置信度的步骤,基于所述深度图、参数矩阵与法向预测图,计算出第二平面原点距离图的步骤;以及根据所述第一置信度图中的像素、所述第二平面原点距离图中的像素以及所述第一平面原点距离图中的像素,对所述第一平面原点距离图中的像素进行优化,得到优化后的第一平面原点距离图,将优化后的第一平面原点距离图作为待扩散图的步骤。
在一些实施例中,本公开实施例提供的装置具有的功能或包含的模块可以用于执行上文方法实施例描述的方法,其具体实现可以参照上文方法是实施例的描述,为了简洁,这里不再赘述。
在本公开的一些实施例中,图17为本公开实施例提出的一种深度图像补全装置的组成结构示意图,如图17所示,本公开提出的一种深度图像补全装置可以包括处理器01、存储有处理器01可执行指令的存储器02。其中,处理器01被配置为执行存储器中存储的可执行深度图像补全指令,以实现本公开实施例提供的一种深度图像补全方法。
在本公开的实施例中,上述处理器01可以为特定用途集成电路(Application Specific Integrated Circuit,ASIC)、数字信号处理器(Digital Signal Processor,DSP)、数字信号处理装置(Digital Signal Processing Device,DSPD)、可编程逻辑装置(ProgRAMmable Logic Device,PLD)、现场可编程门阵列(Field ProgRAMmable Gate Array,FPGA)、CPU、控制器、微控制器、微处理器中的至少一种。可以理解地,对于不同的设备,用于实现上述处理器功能的电子器件还可以为其它,本公开实施例不作限定。该终端还包括存储器02,该存储器02可以与处理器01连接,其中,存储器02可能包含高速RAM存储器,也可能还包括非易失性存储器,例如,至少两个磁盘存储器。
在实际应用中,上述存储器02可以是易失性存储器(volatile memory),例如随机存取存储器(Random-Access Memory,RAM);或者非易失性存储器(non-volatile memory),例如只读存储器(Read-Only Memory,ROM),快闪存储器(flash memory),硬盘(Hard Disk Drive,HDD)或固态硬盘(Solid-State Drive,SSD);或者上述种类的存储器的组合,并向处理器01提供指令和数据。
另外,在本实施例中的各功能模块可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
集成的单元如果以软件功能模块的形式实现并非作为独立的产品进行销售或使用时,可以存储在一个计算机可读取存储介质中,基于这样的理解,本实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)或processor(处理器)执行本实施例方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器、随机存取存储器、磁碟或者光盘等各种可以存储程序代码的介质。
可以理解的是,本公开实施例中的深度图像补全装置可以是具有计算功能的设备,例如台式计算机、笔记本电脑、微型计算机、车载电脑等,具体的装置实施形式可以根据实际需求来确定,本公开实施例在此不作限制。
本公开实施例提供一种计算机可读存储介质,其上存储有可执行深度图像补全指令,应用于终端中,该程序被处理器执行时实现本公开实施例提供的一种深度图像补全方法。
本领域内的技术人员应明白,本公开的实施例可提供为方法、系统、或计算机程序产品。因此,本公开可采用硬件实施例、软件实施例、或结合软件和硬件方面的实施例的形式。而且,本公开可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器和光学存储器等)上实施的计算机程序产品的形式。
本公开是参照根据本公开实施例的方法、设备(系统)、和计算机程序产品的实现流程示意图和/或方框图来描述的。应理解可由计算机程序指令实现流程示意图和/或方框图中的每一流程和/或方框、以及实现流程示意图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在实现流程示意图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在实现流程示意图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在实现流程示意图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
在后续的描述中,使用用于表示元件的诸如“模块”、“部件”或“单元”的后缀仅为了有利于本公开的说明,其本身没有特定的意义。因此,“模块”、“部件”或“单元”可以混合地使用。
以上所述,仅为本公开的较佳实施例而已,并非用于限定本公开的保护范围。
工业实用性
本实施例中,深度图像补全装置能够根据采集到的深度图和二维图像得到待扩散图,待扩散图 中会保留采集到的深度图中所有的点云数据,使得在利用待扩散图中各个像素的像素值和其所对应的扩散强度,确定待扩散图中各个像素的扩散后的像素值时,会利用到采集到的深度图中所有的点云数据,从而充分利用采集到的深度图中的点云数据,进而使得三维场景中每个3D点的深度信息的准确度更高,提高了补全后的深度图的准确度。

Claims (29)

  1. 一种深度图像补全方法,所述方法包括:
    通过设置的雷达采集目标场景的深度图,以及通过设置的摄像机采集所述目标场景的二维图像;
    根据采集到的深度图和所述二维图像,确定待扩散图以及特征图;
    基于所述待扩散图和所述特征图,确定所述待扩散图中的各个像素的扩散强度;所述扩散强度表征所述待扩散图中的各个像素的像素值向相邻像素扩散的强度;
    基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度,确定补全后的深度图。
  2. 根据权利要求1所述的方法,其中,基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度,确定补全后的深度图,包括:
    基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度,确定所述待扩散图中的各个像素的扩散后的像素值;
    根据所述待扩散图中的各个像素的扩散后的像素值确定补全后的深度图。
  3. 根据权利要求2所述的方法,其中,所述待扩散图为初步补全的深度图;所述根据所述待扩散图中的各个像素的扩散后的像素值确定补全后的深度图,包括:
    将所述待扩散图中的各个像素的扩散后的像素值作为扩散后的图像的各个像素的像素值;
    将扩散后的图像作为补全后的深度图。
  4. 根据权利要求2所述的方法,其中,所述待扩散图为第一平面原点距离图;所述根据所述深度图和所述二维图像,确定待扩散图以及特征图,包括:
    获取所述摄像机的参数矩阵;
    根据所述采集到的深度图和所述二维图像确定所述初步补全的深度图、所述特征图和法向预测图;所述法向预测图是指将三维场景各点的法向量作为像素值的图像;
    根据所述初步补全的深度图、所述摄像机的参数矩阵与所述法向预测图,计算出第一平面原点距离图;所述第一平面原点距离图是利用所述初步补全的深度图计算出的所述摄像机至所述三维场景各点所在平面的距离作为像素值的图像。
  5. 根据权利要求4所述的方法,其中,所述方法还包括:
    根据所述采集到的深度图和所述二维图像确定第一置信度图;其中,所述第一置信度图是指采用所述采集到的深度图中各个像素对应的置信度作为像素值的图像;
    根据所述采集到的深度图、所述参数矩阵与所述法向预测图,计算出第二平面原点距离图;所述第二平面原点距离图是利用所述采集到的深度图计算出的所述摄像机至所述三维场景各点所在平面的距离作为像素值的图像;
    根据所述第一置信度图中的像素、所述第二平面原点距离图中的像素以及所述第一平面原点距离图中的像素,对所述第一平面原点距离图中的像素进行优化,得到优化后的第一平面原点距离图。
  6. 根据权利要求5所述的方法,其中,所述根据所述第一置信度图中的像素、所述第二平面原点距离图中的像素以及所述第一平面原点距离图中的像素,对所述第一平面原点距离图中的像素进优化,得到优化后的第一平面原点距离图,包括:
    从所述第二平面原点距离图中,确定出与所述第一平面原点距离图的第一像素点所对应的像素点,作为替换像素点,并确定所述替换像素点的像素值;所述第一像素点为所述第一平面原点距离图中的任一像素点;
    从所述第一置信度图中,确定出所述替换像素点所对应的置信度信息;
    根据所述替换像素点的像素值、所述置信度信息以及所述第一平面原点距离图的第一像素点的像素值,确定所述第一平面原点距离图的所述第一像素点的优化后的像素值;
    重复上述步骤,直至确定所述第一平面原点距离图中每个像素的优化后的像素值,得到所述优化后的第一平面原点距离图。
  7. 根据权利要求2-6任一所述的方法,其中,所述基于所述待扩散图和所述特征图,确定所述待扩散图中的各个像素的扩散强度,包括:
    根据预设扩散范围,从所述待扩散图中确定出所述待扩散图的第二像素点对应的待扩散像素集 合,并确定出所述待扩散像素集合中每个像素的像素值;所述第二像素点为所述待扩散图中的任一像素点;
    利用所述特征图、所述待扩散图的第二像素点以及所述待扩散像素集合中的每个像素,计算出所述待扩散图的第二像素点对应的扩散强度;
    基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度,确定所述待扩散图中的各个像素的扩散后的像素值,包括:
    根据所述待扩散图的第二像素点的扩散强度、所述待扩散图的第二像素点的像素值以及所述待扩散像素集合中每个像素的像素值,确定所述待扩散图的第二像素点的扩散后的像素值;
    重复上述步骤,直至确定所述待扩散图中每个像素扩散后的像素值。
  8. 根据权利要求7所述的方法,其中,所述利用所述特征图、所述待扩散图的第二像素点以及所述待扩散像素集合中的每个像素,计算出所述待扩散图的第二像素点对应的扩散强度,包括:
    利用待扩散图的第二像素点,以及所述待扩散像素集合中每个像素,计算所述待扩散图的第二像素点对应的强度归一化参数;
    将所述特征图中,与所述待扩散图的第二像素点对应的像素,作为第一特征像素;
    将所述特征图中,与所述待扩散像素集合中第三像素点对应的像素,作为第二特征像素;所述第三像素点为待扩散像素集合中的任一像素点;
    提取所述第一特征像素的特征信息,以及所述第二特征像素的特征信息;
    利用所述第一特征像素的特征信息、所述第二特征像素的特征信息、所述强度归一化参数以及预设扩散控制参数,计算出由所述待扩散图的第二像素点和所述待扩散像素集合中的第三像素点所组成的扩散像素对的子扩散强度;
    重复上述步骤,直至确定所述待扩散图的第二像素点,与所述待扩散像素集合中的每个像素所组成的扩散像素对的子扩散强度;
    将所述待扩散图的第二像素点,与所述待扩散像素集合中的每个像素所组成的扩散像素对的子扩散强度,作为所述待扩散图的第二像素点所对应的扩散强度。
  9. 根据权利要求8所述的方法,其中,所述子扩散强度为所述待扩散图的第二像素点和所述待扩散像素集合中的第三像素点的相似度。
  10. 根据权利要求8所述的方法,其中,所述利用待扩散图的第二像素点,以及所述待扩散像素集合中每个像素,计算所述待扩散图的第二像素点对应的强度归一化参数,包括:
    提取所述待扩散图的第二像素点的特征信息,以及所述待扩散像素集合中第三像素点的特征信息;
    利用提取的待扩散图的第二像素点的特征信息、所述待扩散像素集合中第三像素点的特征信息以及所述预设扩散控制参数,计算出所述待扩散像素集合中第三像素点的子归一化参数;
    重复上述步骤,直至得到所述待扩散像素集合的每个像素的子归一化参数;
    将所述待扩散像素集合的每个像素的子归一化参数进行累加,得到所述待扩散图的第二像素点对应的强度归一化参数。
  11. 根据权利要求8所述的方法,其中,所述根据所述待扩散图的第二像素点的扩散强度、所述待扩散图的第二像素点的像素值以及所述待扩散像素集合中每个像素的像素值,确定所述待扩散图的第二像素点的扩散后的像素值,包括:
    将所述扩散强度中的每个子扩散强度,分别与所述待扩散图的第二像素点的像素值相乘,并将所得到的乘积结果进行累加,得到所述待扩散图的第二像素点的第一扩散部分;
    将所述扩散强度中的每个子扩散强度,分别与待扩散像素集合中每个像素的像素值对应相乘,并将所得到的乘积累加,得到所述待扩散图的第二像素点的第二扩散部分;
    根据所述待扩散图的第二像素点的像素值、所述待扩散图的第二像素点的第一扩散部分,以及所述待扩散图的第二像素点的第二扩散部分,计算出所述待扩散图的第二像素点的扩散后的像素值。
  12. 根据权利要求3所述的方法,其中,在所述根据所述待扩散图中的各个像素的扩散后的像素值确定补全后的深度图之后,所述方法还包括:
    将所述补全后的深度图作为待扩散图,重复执行基于所述待扩散图和所述特征图确定所述待扩散图中的各个像素的扩散强度的步骤,基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度确定所述待扩散图中的各个像素的扩散后的像素值的步骤,以及根据所述待扩散图中的各个像素的扩散后的像素值确定补全后的深度图的步骤,直至达到预设重复次数。
  13. 根据权利要求4-11任一所述的方法,其中,在所述根据所述待扩散图中的各个像素的扩散 后的像素值确定补全后的深度图之后,所述方法还包括:
    将所述补全后的深度图作为初步补全的深度图,重复执行基于所述初步补全的深度图、所述摄像机的参数矩阵与所述法向预测图,计算出第一平面原点距离图,并将所述第一平面原点距离图作为待扩散图的步骤,基于所述待扩散图和所述特征图确定所述待扩散图中各个像素的扩散强度的步骤,基于所述待扩散图中各个像素的像素值以及所述待扩散图中各个像素的扩散强度确定所述待扩散图中各个像素的扩散后的像素值的步骤,以及根据所述待扩散图中各个像素的扩散后的像素值确定补全后的深度图的步骤,直至达到预设重复次数。
  14. 根据权利要求13所述的方法,其中,每一次执行的基于所述初步补全的深度图、所述摄像机的参数矩阵与所述法向预测图,计算出第一平面原点距离图,并将所述第一平面原点距离图作为待扩散图的步骤,包括:
    基于所述初步补全的深度图、所述摄像机的参数矩阵与所述法向预测图,计算出第一平面原点距离图的步骤;
    基于所述采集到的深度图和所述二维图像确定第一置信度图的步骤;
    基于所述采集到的深度图、参数矩阵与法向预测图,计算出第二平面原点距离图的步骤;
    以及根据所述第一置信度图中的像素、所述第二平面原点距离图中的像素以及所述第一平面原点距离图中的像素,对所述第一平面原点距离图中的像素进行优化,得到优化后的第一平面原点距离图,将优化后的第一平面原点距离图作为待扩散图的步骤。
  15. 一种深度图像补全装置,所述装置包括:
    采集模块,被配置为通过设置的雷达采集目标场景的深度图,以及通过设置的摄像机采集所述目标场景的二维图像;
    处理模块,被配置为根据采集到的深度图和所述二维图像,确定待扩散图以及特征图;基于所述待扩散图和所述特征图,确定所述待扩散图中的各个像素的扩散强度;所述扩散强度表征所述待扩散图中的各个像素的像素值向相邻像素扩散的强度;
    扩散模块,被配置为基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度,确定补全后的深度图。
  16. 根据权利要求15所述的深度图像补全装置,其中,
    所述扩散模块,还被配置为基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度,确定所述待扩散图中的各个像素的扩散后的像素值;根据所述待扩散图中的各个像素的扩散后的像素值确定补全后的深度图。
  17. 根据权利要求16所述的深度图像补全装置,其中,所述待扩散图为初步补全的深度图;
    所述扩散模块,在被配置为根据所述待扩散图中的各个像素的扩散后的像素值确定补全后的深度图时,还被配置为将所述待扩散图中的各个像素的扩散后的像素值作为扩散后的图像的各个像素的像素值;将扩散后的图像作为补全后的深度图。
  18. 根据权利要求16所述的深度图像补全装置,其中,所述待扩散图为第一平面原点距离图;
    所述处理模块,在被配置为根据所述深度图和所述二维图像确定待扩散图以及特征图时,还被配置为获取所述摄像机的参数矩阵;根据所述深度图和所述二维图像确定所述初步补全的深度图、所述特征图和法向预测图;所述法向预测图是指将三维场景各点的法向量作为像素值的图像;根据所述初步补全的深度图、所述摄像机的参数矩阵与所述法向预测图,计算出第一平面原点距离图;所述第一平面原点距离图是利用所述初步补全的深度图计算出的所述摄像机至所述三维场景各点所在平面的距离作为像素值的图像。
  19. 根据权利要求18所述的深度图像补全装置,其中,
    所述处理模块,还被配置为根据所述深度图和所述二维图像确定第一置信度图;其中,所述第一置信度图是指采用所述深度图中各个像素对应的置信度作为像素值的图像;根据所述深度图、所述参数矩阵与所述法向预测图,计算出第二平面原点距离图;所述第二平面原点距离图是利用所述深度图计算出的所述摄像机至所述三维场景各点所在平面的距离作为像素值的图像;根据所述第一置信度图中的像素、所述第二平面原点距离图中的像素以及所述第一平面原点距离图中的像素,对所述第一平面原点距离图中的像素进行优化,得到优化后的第一平面原点距离图。
  20. 根据权利要求19所述的深度图像补全装置,其中,
    所述处理模块,在被配置为根据所述第一置信度图中的像素、所述第二平面原点距离图中的像素以及所述第一平面原点距离图中的像素,对所述第一平面原点距离图中的像素进优化,得到优化后的第一平面原点距离图时,还被配置为从所述第二平面原点距离图中,确定出与所述第一平面原 点距离图的第一像素点所对应的像素点,作为替换像素点,并确定所述替换像素点的像素值;所述第一像素点为所述第一平面原点距离图中的任一像素点;从所述第一置信度图中,确定出所述替换像素点所对应的置信度信息;根据所述替换像素点的像素值、所述置信度信息以及所述第一平面原点距离图的第一像素点的像素值,确定所述第一平面原点距离图的所述第一像素点的优化后的像素值;重复上述步骤,直至确定所述第一平面原点距离图中每个像素的优化后的像素值,得到所述优化后的第一平面原点距离图。
  21. 根据权利要求16-20任一所述的深度图像补全装置,其中,
    所述处理模块,在被配置为基于所述待扩散图和所述特征图,确定所述待扩散图中的各个像素的扩散强度时,还被配置为根据预设扩散范围,从所述待扩散图中确定出所述待扩散图的第二像素点对应的待扩散像素集合,并确定出所述待扩散像素集合中每个像素的像素值;所述第二像素点为所述待扩散图中的任一像素点;利用所述特征图、所述待扩散图的第二像素点以及所述待扩散像素集合中的每个像素,计算出所述待扩散图的第二像素点对应的扩散强度;
    所述扩散模块,在被配置为基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度,确定所述待扩散图中的各个像素的扩散后的像素值时,还被配置为根据所述待扩散图的第二像素点的扩散强度、所述待扩散图的第二像素点的像素值以及所述待扩散像素集合中每个像素的像素值,确定所述待扩散图的第二像素点的扩散后的像素值;重复上述步骤,直至确定所述待扩散图中每个像素扩散后的像素值。
  22. 根据权利要求21所述的深度图像补全装置,其中,
    所述处理模块,在被配置为利用所述特征图、所述待扩散图的第二像素点以及所述待扩散像素集合中的每个像素,计算出所述待扩散图的第二像素点对应的扩散强度时,还被配置为利用待扩散图的第二像素点,以及所述待扩散像素集合中每个像素,计算所述待扩散图的第二像素点对应的强度归一化参数;将所述特征图中,与所述待扩散图的第二像素点对应的像素,作为第一特征像素,与所述待扩散像素集合中第三像素点对应的像素,作为第二特征像素;所述第三像素点为待扩散像素集合中的任一像素;提取所述第一特征像素的特征信息,以及所述第二特征像素的特征信息;利用所述第一特征像素的特征信息、所述第二特征像素的特征信息、所述强度归一化参数以及预设扩散控制参数,计算出由所述待扩散图的第二像素点和所述待扩散像素集合中的第三像素点所组成的扩散像素对的子扩散强度;重复上述步骤,直至确定所述待扩散图的第二像素点,与所述待扩散像素集合中的每个像素所组成的像素对的子扩散强度;将所述待扩散图的第二像素点,与所述待扩散像素集合中的每个像素所组成的扩散像素对的子扩散强度,作为所述待扩散图的第二像素点所对应的扩散强度。
  23. 根据权利要求22所述的深度图像补全装置,其中,
    所述处理模块,在被配置为利用待扩散图的第二像素点,以及所述待扩散像素集合中每个像素,计算所述待扩散图的第二像素点对应的强度归一化参数时,还被配置为提取所述待扩散图的第二像素点的特征信息,以及所述待扩散像素集合中第三像素点的特征信息;利用提取的待扩散图的第二像素点的特征信息、所述待扩散像素集合中第三像素点的特征信息以及所述预设扩散控制参数,计算出所述待扩散像素集合中第三像素点的子归一化参数;重复上述步骤,直至得到所述待扩散像素集合的每个像素的子归一化参数;将所述待扩散像素集合的每个像素的子归一化参数进行累加,得到所述待扩散图的第二像素点对应的强度归一化参数。
  24. 根据权利要求22所述的深度图像补全装置,其中,
    所述扩散模块,在被配置为根据所述待扩散图的第二像素点的扩散强度、所述待扩散图的第二像素点的像素值以及所述待扩散像素集合中每个像素的像素值,确定所述待扩散图的第二像素点的扩散后的像素值时,还配置为将所述扩散强度中每个子扩散强度,分别与所述待扩散图的第二像素点的像素值相乘,并将所得到的乘积结果进行累加,得到所述待扩散图的第二像素点的第一扩散部分;将所述扩散强度中的每个子扩散强度,分别与待扩散像素集合中每个像素的像素值对应相乘,并将所得到的乘积累加,得到所述待扩散图的第二像素点的第二扩散部分;根据所述待扩散图的第二像素点的像素值、所述待扩散图的第二像素点的第一扩散部分,以及所述待扩散图的第二像素点的第二扩散部分,计算出所述待扩散图的第二像素点的扩散后的像素值。
  25. 根据权利要求17所述的深度图像补全装置,其中,
    所述扩散模块,还被配置为将所述补全后的深度图作为待扩散图,重复执行基于所述待扩散图和所述特征图确定所述待扩散图中的各个像素的扩散强度的步骤,基于所述待扩散图中的各个像素的像素值以及所述待扩散图中的各个像素的扩散强度确定所述待扩散图中的各个像素的扩散后的像 素值的步骤,以及根据所述待扩散图中的各个像素的扩散后的像素值确定补全后的深度图的步骤,直至达到预设重复次数。
  26. 根据权利要求18-24任一所述的深度图像补全装置,其中,
    所述扩散模块,还被配置为将所述补全后的深度图作为初步补全的深度图,重复执行基于所述初步补全的深度图、所述摄像机的参数矩阵与所述法向预测图,计算出第一平面原点距离图,并将所述第一平面原点距离图作为待扩散图的步骤,基于所述待扩散图和所述特征图确定所述待扩散图中各个像素的扩散强度的步骤,基于所述待扩散图中各个像素的像素值以及所述待扩散图中各个像素的扩散强度确定所述待扩散图中各个像素的扩散后的像素值的步骤,以及根据所述待扩散图中各个像素的扩散后的像素值确定补全后的深度图的步骤,直至达到预设重复次数。
  27. 根据权利要求26所述的深度图像补全装置,其中,
    所述扩散模块,在被配置为每一次执行的基于所述初步补全的深度图、所述摄像机的参数矩阵与所述法向预测图,计算出第一平面原点距离图,并将所述第一平面原点距离图作为待扩散图的步骤时,还被配置为基于所述初步补全的深度图、所述摄像机的参数矩阵与所述法向预测图,计算出第一平面原点距离图的步骤;基于所述深度图和所述二维图像确定第一置信度的步骤,基于所述深度图、参数矩阵与法向预测图,计算出第二平面原点距离图的步骤;以及根据所述第一置信度图中的像素、所述第二平面原点距离图中的像素以及所述第一平面原点距离图中的像素,对所述第一平面原点距离图中的像素进行优化,得到优化后的第一平面原点距离图,将优化后的第一平面原点距离图作为待扩散图的步骤。
  28. 一种深度图像补全装置,其中,所述装置包括:存储器及处理器;
    所述存储器,被配置为存储可执行深度图像补全指令;
    所述处理器,被配置为执行所述存储器中存储的可执行深度图像补全指令,实现权利要求1-14任一项所述的方法。
  29. 一种计算机可读存储介质,其中,存储有可执行深度图像补全指令,用于引起处理器执行时,实现权利要求1-14任一项所述的方法。