CN109360235B - Hybrid depth estimation method based on light field data - Google Patents


Info

Publication number
CN109360235B
CN109360235B · CN201811151940.5A · CN201811151940A
Authority
CN
China
Prior art keywords
depth
image
pixel
parallax
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811151940.5A
Other languages
Chinese (zh)
Other versions
CN109360235A (en
Inventor
李贺建
黄建民
沙龙
张镇军
刘鹏程
冉广奎
王贺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AVIC Shanghai Aeronautical Measurement Controlling Research Institute
Original Assignee
AVIC Shanghai Aeronautical Measurement Controlling Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AVIC Shanghai Aeronautical Measurement Controlling Research Institute filed Critical AVIC Shanghai Aeronautical Measurement Controlling Research Institute
Priority to CN201811151940.5A priority Critical patent/CN109360235B/en
Publication of CN109360235A publication Critical patent/CN109360235A/en
Application granted granted Critical
Publication of CN109360235B publication Critical patent/CN109360235B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images

Abstract

The invention discloses a hybrid depth estimation method based on light field data. The method comprises the following steps: hybrid depth estimation based on parallax and blur detection is performed on the focus stack images, sub-images and full-focus image obtained from the data acquired by a light field camera; a fusion matching method performs parallax analysis on the sub-images, which differ in viewing angle; focus detection is performed on the simultaneously acquired focus stack images to obtain a depth map; and the depth map obtained from parallax and the depth map obtained from focusing are fused by a classification and adaptive mixing method, with the sub-image depth map up-sampled before fusion. The method can obtain scene depth information using only one light field camera, and the algorithm is simple and highly accurate.

Description

Hybrid depth estimation method based on light field data
Technical Field
The invention relates to the technical field of stereo image processing, and in particular to a hybrid depth estimation method based on light field data.
Background
Depth estimation is a key technology for three-dimensional reconstruction and a key intermediate step in three-dimensional display, machine vision, virtual reality, intelligent vision and related fields. Depth measurement initially relied on manual methods; more accurate laser measurement appeared later, and with the development of image processing technology, image-based depth estimation has become a research hotspot, driving progress in non-contact measurement and three-dimensional reconstruction. Image-based depth estimation requires only simple acquisition equipment and is easy to integrate; it extends the two-dimensional limitation of images shot by conventional cameras, gives mobile devices three-dimensional capability, and has become a key technology in applications such as stereoscopic television and virtual reality, attracting wide attention and research. However, image-based depth estimation algorithms still face difficulties such as high algorithmic complexity and accuracy that needs further improvement.
Existing depth estimation methods include active approaches such as laser ranging, TOF cameras and structured-light depth acquisition, passive multi-view image-based methods, and the more recent light-field-camera-based methods. Active methods acquire depth quickly and accurately, but the measurement equipment is complex; structured-light depth acquisition places high demands on the ambient illumination; TOF cameras measure reflected light and cannot return correct depth for objects with low reflectivity. Passive methods analyze visual cues: human depth perception integrates cues such as blur, parallax, motion and color, and blur and parallax are the cues most commonly used in computer vision.
A light field camera is a new type of camera that incorporates a microlens array and records both the two-dimensional spatial information and the angular information of light rays. Light field imaging can achieve a series of special imaging effects such as full-focus image synthesis, depth-of-field extension and digital refocusing. For blur cues, images of the same viewpoint focused at different depths are computed, a focus detection function evaluates and measures the degree of focus of each pixel in the image, and the in-focus pixels in the image are computed. Parallax-based depth estimation rests on the binocular parallax principle; it simplifies the video acquisition equipment and requires no additional hardware, but its algorithmic complexity is high and its accuracy still needs to be improved.
Disclosure of Invention
The invention aims to provide a low-complexity hybrid depth estimation method based on light field data that can accurately obtain scene depth information using only one light field camera.
The technical solution for realizing the purpose of the invention is as follows: a hybrid depth estimation method based on light field data comprises the following steps:
step 1, acquiring light field data: acquiring four-dimensional light field image data through a light field camera, and processing to obtain a scene focus stack image, a sub-image, a full focus image and a focus parameter, wherein the focus parameter comprises a white image calibration parameter and a sub-image calibration parameter;
step 2, parallax matching with the enhanced CT-SAD algorithm: selecting sub-images at horizontal positions to simulate horizontally placed cameras, and performing stereo matching of the left and right viewpoints, including enhanced CT transformation, SAD calculation, CT-SAD matching cost fusion, cost accumulation, parallax estimation, parallax post-processing and depth conversion, to obtain an initial-resolution depth image;
and 3, upsampling based on bilateral filtering: combining the full-focus image and the initial resolution depth image obtained in the parallax matching in the step 2, and performing an up-sampling process based on bilateral filtering to obtain a first depth image;
and 4, depth estimation based on focus detection: analyzing the focal stack image by adopting a gradient algorithm, obtaining a focusing pixel layer by adopting a two-dimensional gradient algorithm, and performing depth-of-field conversion by utilizing a focusing parameter to obtain the depth of the focusing pixel layer; repeatedly carrying out focusing detection until all focusing layers are detected to obtain a second depth map;
step 5, depth self-adaptive mixing: firstly, equalizing the depth grading of a first depth map obtained in a parallax matching mode and a second depth map obtained in a focusing detection mode, obtaining a substitution value according to a median approach principle, then performing cost fusion in a self-adaptive parameter mode, minimizing the cost to obtain a depth grading map, and performing depth conversion on the depth grading map by using grading equalization parameters to obtain a final depth map;
step 6, visualization of the depth map: and quantizing the final depth map obtained by the depth grading, and adjusting the depth display range to be in the range of 0-255.
Further, the enhanced CT transform in step 2 is specifically as follows:
The CT transformation process is formulated as follows:

L(u,v) = ⊗_{(m,n)∈M×N} ξ( l̄(u,v), l(u+m, v+n) )

R(u,v) = ⊗_{(m,n)∈M×N} ξ( r̄(u,v), r(u+m, v+n) )

ξ(p1, p2) = 0 if p1 ≤ p2, 1 if p1 > p2

where L, R are the transform results of the left and right images, the transform window size is M×N, and the variables m, n are integers that traverse the whole M×N window; r(u,v), l(u,v) are the right- and left-image pixel values at (u,v), respectively; ξ(·) is the comparison relation, computed as shown in the ξ(p1, p2) formula; p1, p2 are the original pixel values; l̄(u,v), r̄(u,v) are the pixel averages over the window area, (u,v) are the pixel coordinates; and ⊗ denotes that the obtained comparison values are concatenated in order.

The Census transform compares the center pixel with the surrounding pixels within the M×N window to obtain a corresponding array that represents the transformed center pixel, thus yielding two images R and L represented by Census transform values.
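As an illustration of the enhanced CT transform described above, the following Python sketch compares each pixel of the M×N window against the window mean (the reading of the "enhanced" variant suggested by the formulas) and keeps the ordered comparison bits; the window size, bit layout and helper names are illustrative choices, not taken from the patent.

```python
import numpy as np

def enhanced_census(img, win=9):
    """Enhanced CT (Census) transform: compare every pixel of an M x N window
    against the window mean and keep the ordered comparison bits."""
    h, w = img.shape
    r = win // 2
    pad = np.pad(img.astype(np.float32), r, mode='edge')
    bits = np.zeros((h, w, win * win), dtype=bool)
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + win, j:j + win]
            # xi(mean, neighbor): 1 where the neighbor exceeds the window mean
            bits[i, j] = (patch > patch.mean()).ravel()
    return bits

def hamming_cost(bits_a, bits_b):
    """Per-pixel Hamming distance between two Census bit strings."""
    return np.count_nonzero(bits_a != bits_b, axis=-1)
```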
Further, the enhanced CT-SAD algorithm in step 2 specifically comprises the following steps:
The luminance-based SAD method is combined with the Census transform method. C_SAD denotes the SAD matching cost, obtained by computing luminance differences:

C_SAD(u,v,d) = Σ_{(i,j)∈L×K} | I_r(u+i, v+j) − I_l(u+i+d, v+j) |

where C_SAD(u,v,d) is the accumulated SAD matching cost of the pixel at (u,v) for disparity d, L×K is the accumulation window constant, and I_r, I_l are the pixel values of the right and left views corresponding to disparity d.

The Census-based matching cost C_CT is obtained by comparing the Hamming distance between the Census transform values of corresponding pixels at the same position in the right and left images; the two initial matching costs are then fused through the parameters (ρ_SAD, ρ_CT) to obtain the matching cost C_CT-SAD(p,d) of the pixel:

C_CT-SAD(p,d) = ρ_SAD·C_SAD(p,d) + ρ_CT·C_CT(p,d)

where C_CT-SAD is the matching cost at disparity d obtained from the sub-image parallax principle and p denotes the pixel at (u,v); ρ_SAD, ρ_CT are the cost fusion parameters; C_SAD(p,d), C_CT(p,d) are the SAD matching cost and the CT matching cost of pixel p at disparity d.

The matching cost is accumulated over the neighborhood of each pixel; after accumulation the new matching cost C_agg is obtained, and the matching cost C_b based on the sub-image parallax principle is:

C_b(u,v,d) = C_agg(u,v,d) = Σ_{(i,j)∈m×n} C_CT-SAD(u+i, v+j, d)

where C_agg(u,v,d) is the accumulated cost of the pixel at (u,v) for disparity d and m×n is the accumulation window size.

The matching cost is minimized with the WTA optimization method to obtain the disparity match: for each pixel the position with the minimum matching cost is found, giving the corresponding disparity map.
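A sketch of the CT-SAD cost fusion and WTA disparity selection, reusing the enhanced_census and hamming_cost helpers from the previous sketch; the disparity range, window sizes and fusion weights (ρ_SAD, ρ_CT) are illustrative, the cost volume is built brute-force for clarity, and border wrap-around from np.roll is ignored.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def ctsad_disparity(left, right, d_max=16, win=9, rho_sad=1.0, rho_ct=1.0):
    """Fuse SAD and Census (Hamming) matching costs, accumulate them over a
    window, and select the per-pixel disparity by WTA (minimum cost)."""
    h, w = right.shape
    bits_l, bits_r = enhanced_census(left, win), enhanced_census(right, win)
    cost = np.empty((h, w, d_max + 1), dtype=np.float32)
    for d in range(d_max + 1):
        # align the left view with the right view for candidate disparity d
        shifted = np.roll(left.astype(np.float32), -d, axis=1)
        shifted_bits = np.roll(bits_l, -d, axis=1)
        c_sad = np.abs(right.astype(np.float32) - shifted)            # C_SAD(p, d)
        c_ct = hamming_cost(bits_r, shifted_bits).astype(np.float32)  # C_CT(p, d)
        fused = rho_sad * c_sad + rho_ct * c_ct                       # C_CT-SAD(p, d)
        cost[:, :, d] = uniform_filter(fused, size=win)               # C_agg = C_b(p, d)
    return np.argmin(cost, axis=2)  # WTA: disparity with the minimum accumulated cost
```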
Further, the disparity post-processing in step 2 is to perform denoising processing on the disparity map, which is specifically as follows:
Median filtering with a k×l window is applied to obtain the disparity value Di;
the disparity map is represented in pixels, and the disparity is converted into the initial-resolution depth image Z'_b using the camera calibration parameters and the disparity-to-depth relation.
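A brief sketch of the disparity post-processing, assuming the usual stereo relation Z = f·B/d for the disparity-to-depth conversion; the focal length f_px and baseline B stand in for the camera calibration parameters, and the window size is illustrative.

```python
import numpy as np
from scipy.ndimage import median_filter

def postprocess_disparity(disp, f_px, baseline, win=(5, 5)):
    """Median-filter the disparity map, then convert disparity to depth."""
    d = median_filter(disp.astype(np.float32), size=win)  # k x l median filter
    d = np.maximum(d, 1e-6)                               # avoid division by zero
    return f_px * baseline / d                            # Z'_b: initial-resolution depth
```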
Further, the upsampling based on bilateral filtering in step 3 specifically includes the following steps:
Joint bilateral upsampling combined with the original texture map is adopted; the expression of the joint bilateral upsampling is:

p̃(X) = (1/k_X) Σ_{Y_s∈Ω} p_s(Y_s) · r(X_s, Y_s) · d( Ĩ(X), Ĩ(Y) )

where p̃(X) is the up-sampled pixel value; p_s(Y_s) is a pixel of the small-size image, i.e. the value assigned to the corresponding new high-resolution position after upsampling; Ĩ is the original large-size pixel value; X_s, Y_s are the spatial positions of two pixels in the small-size image; X, Y are the spatial positions in the large-size image corresponding to X_s, Y_s; Ω is the kernel support domain; k_X, the normalization parameter, is the sum of the products of the distance weight r and the pixel-difference weight d:

k_X = Σ_{Y_s∈Ω} r(X_s, Y_s) · d( Ĩ(X), Ĩ(Y) )

The pixel-difference weight d is, in physical terms, a luminance factor; it is separated into luminance factors d_h, d_v in the horizontal and vertical directions:

d_h = exp( −| Ĩ(x,y) − Ĩ(x_h, y) |² / (2σ_d²) )

d_v = exp( −| Ĩ(x,y) − Ĩ(x, y_v) |² / (2σ_d²) )

where Ĩ(x,y) is the pixel value at point (x,y), and Ĩ(x_h, y), Ĩ(x, y_v) are the pixel values horizontally and vertically aligned with point (x,y), respectively.

The pixel values of the original texture image are used as the parameters for generating the pixel-difference weight, with the operation separated by direction. The distance weight r, i.e. the scale factor, operates in the window of the initial low-resolution depth image; the upsampling yields the high-resolution image, with the initial low-resolution distance used as the distance measure:

r(X_s, Y_s) = exp( −δ(X_s, Y_s)² / (2σ_r²) )

where r(X_s, Y_s) is the distance weight, δ(X_s, Y_s) is the distance, and σ_r is the distance weight parameter.

After upsampling, the first depth image Z_b based on the sub-image parallax is obtained.
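A compact (unoptimized) sketch of joint bilateral upsampling guided by the full-focus texture image, assuming Gaussian range and spatial kernels as in a standard bilateral filter; the scale factor, window radius and σ values are illustrative parameters.

```python
import numpy as np

def joint_bilateral_upsample(depth_lo, guide_hi, scale, radius=2, sigma_r=1.0, sigma_d=10.0):
    """Upsample a low-resolution depth map using a high-resolution guide image."""
    H, W = guide_hi.shape
    h, w = depth_lo.shape
    out = np.zeros((H, W), dtype=np.float32)
    for Y in range(H):
        for X in range(W):
            ys, xs = Y / scale, X / scale            # position on the low-resolution grid
            acc, norm = 0.0, 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    qy, qx = int(round(ys)) + dy, int(round(xs)) + dx
                    if not (0 <= qy < h and 0 <= qx < w):
                        continue
                    gy, gx = min(int(qy * scale), H - 1), min(int(qx * scale), W - 1)
                    r = np.exp(-((qy - ys) ** 2 + (qx - xs) ** 2) / (2 * sigma_r ** 2))
                    d = np.exp(-(float(guide_hi[Y, X]) - float(guide_hi[gy, gx])) ** 2
                               / (2 * sigma_d ** 2))
                    acc += depth_lo[qy, qx] * r * d   # p_s(Y_s) * r * d
                    norm += r * d                     # accumulates k_X
            out[Y, X] = acc / norm if norm > 0 else depth_lo[int(ys), int(xs)]
    return out
```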
Further, the depth estimation based on focus detection described in step 4 specifically includes the following steps:
The differently focused images obtained by the light field camera, i.e. the focus stack images, are used as the input data for depth estimation. A focus detection function evaluates and measures the degree of focus of each pixel in the images, the in-focus pixels in the intermediate images are computed, the sharp pixels in each fixed-focus image are computed, and the depth corresponding to each pixel is obtained from the lens imaging principle.

The lens imaging principle is:

1/f = 1/v + 1/Z

Z = f·v / (v − f)

where f is the focal length, v is the image distance, and the object distance Z is the depth data corresponding to the pixel.

The degree of focus F(p, L) in focus detection is expressed with a gradient and a threshold:

F(p, L) = (1/A) Σ_{q∈W} max( t_L − |∇I_L(q)|, 0 )

where ∇I_L(p) is the gradient of the candidate image L at point p; L indexes the images at the different focus positions, the depth of field having L levels; t_L is a gradient threshold obtained automatically with the Roberts edge-extraction algorithm; A is a normalization parameter and W is an N×N accumulation window.

Through the minimization

Z'_f(p) = argmin_L F(p, L)

the L value corresponding to the minimum F value is obtained and taken as the pre-conversion depth value Z'_f(p) of point p; the in-focus image is then converted into the depth map Z_f using the image-acquisition transformation parameters.
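A minimal sketch of the depth-from-focus step over the focal stack, using a plain gradient-energy focus measure accumulated over a small window (equivalently maximizing sharpness rather than minimizing the thresholded measure F) and the thin-lens relation above; focal_depths is an assumed per-layer depth parameter.

```python
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def thin_lens_depth(f, v):
    """Object distance Z from focal length f and image distance v (1/f = 1/v + 1/Z)."""
    return f * v / (v - f)

def depth_from_focus(stack, focal_depths, win=8):
    """stack: (L, H, W) focal stack; focal_depths: depth assigned to each layer.
    Pick, per pixel, the layer whose local gradient energy is highest."""
    sharpness = []
    for img in stack:
        a = img.astype(np.float32)
        g = np.hypot(sobel(a, 0), sobel(a, 1))
        sharpness.append(uniform_filter(g, size=win))     # accumulate over an N x N window
    best_layer = np.argmax(np.stack(sharpness), axis=0)   # layer in best focus per pixel
    return np.asarray(focal_depths, dtype=np.float32)[best_layer]   # Z_f
```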
Further, the depth adaptive mixing described in step 5 is specifically as follows:
Based on the different characteristics of the first depth map obtained by parallax matching and the second depth map obtained by focus detection, a feature analysis of the smooth regions and edge regions of the full-focus image is introduced and values are assigned separately; the parallax range and the focus levels are unified into an L-level representation, adaptive weights are assigned using the matching cost and the blur measurement parameter corresponding to the same depth level, and finally the smooth regions, the edges and the remaining regions are each filled with their respective depth values.

Smooth regions in the full-focus image are estimated with a gradient mean-square-error method. With a neighborhood window of size W = N×N in the gradient image, the mean square error S is:

S = (1/N²) Σ_{p∈W} ( |∇I(p)| − |∇I(p_center)| )²

where p_center is the center pixel position of the smoothing window and ∇I(p) is the gradient of the pixel value I(p) at point p.

Smooth regions in the image are labeled with the variance S and a threshold η:

I_smooth(p) = 1 if S(p) < η·S_max, otherwise 0

where I_smooth denotes the smooth regions, represented by 1 (other regions by 0), and S_max is the maximum value of the variance.

Image edge pixels are obtained with the Sobel edge operator, giving the edge map I_edge.

For the remaining regions other than smooth and edge, the adaptive fusion parameter λ_p is used as the focus weight:

[Equation images: σ_f and σ_b are computed over the L candidate depths from F(p,i) − F_i,min and C(p,i) − C_i,min respectively, and the adaptive weight λ_p is formed from σ_f and σ_b.]

where there are L candidate depths; F_i,min and C_i,min respectively denote the minimum cost of the focus cue and of the parallax cue of point p over the different candidate depths; σ_f and σ_b are the corresponding cue parameters; F(p,i) and C(p,i) are the degree of focus of point p at depth i and the cost value of the parallax cue.

Depth correction then combines the smooth, edge and remaining regions: smooth regions take the depth obtained from parallax, edge regions take the depth obtained from focus detection, and the remaining regions are determined with the cost-adaptive weighting method:

Z'(p) = Z_b(p) for p in I_smooth; Z'(p) = Z_f(p) for p in I_edge; Z'(p) = λ_p·Z_f(p) + (1 − λ_p)·Z_b(p) otherwise

where Z'(p) is the fused depth value; Z_b(p) and Z_f(p) are the depth values obtained from the parallax cue and the focus cue; λ_p is the adaptive parameter; I_smooth, I_edge and "others" denote the smooth regions, the edges and the remaining regions, respectively.

The depth map is then further filtered to denoise the obtained depth map.
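A sketch of the region-aware fusion: smooth regions are detected from the gradient variance of the full-focus image, edges with a Sobel operator, and the remaining pixels are blended with an adaptive weight. The exact form of λ_p is not recoverable from the text, so a simple confidence ratio built from the per-pixel cost minima F_min and C_min (assumed inputs) is used as a stand-in; the thresholds are illustrative.

```python
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def fuse_depths(Z_b, Z_f, full_focus, F_min, C_min, eta=0.3, edge_thresh=0.2, win=8):
    """Fuse parallax-based depth Z_b and focus-based depth Z_f using the
    full-focus image to classify smooth / edge / other regions."""
    a = full_focus.astype(np.float32)
    g = np.hypot(sobel(a, 0), sobel(a, 1))                     # gradient magnitude
    var = uniform_filter(g ** 2, win) - uniform_filter(g, win) ** 2
    smooth = var < eta * var.max()                             # I_smooth
    edge = g > edge_thresh * g.max()                           # I_edge (Sobel-based)
    # stand-in adaptive weight: lean on the focus cue where the parallax cost minimum is high
    lam = C_min / (F_min + C_min + 1e-6)
    Z = lam * Z_f + (1.0 - lam) * Z_b                          # other regions: adaptive blend
    Z[smooth] = Z_b[smooth]                                    # smooth regions: parallax depth
    Z[edge & ~smooth] = Z_f[edge & ~smooth]                    # edge regions: focus depth
    return Z
```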
Further, the visualization of the depth map in step 6 is specifically as follows:
The final depth map obtained from the depth levels is quantized and the depth display range is adjusted to 0–255; the depth inverse-quantization formula is:

I_d = round( 255 · (z_p − z_min) / (z_max − z_min) )

where z_max and z_min are respectively the maximum and minimum depth of field in the scene, z_p is the depth value of the pixel obtained from the parallax, and I_d is the 8-bit gray-level representation corresponding to the depth value.
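A one-function sketch of the visualization step, assuming a linear mapping of the scene depth range onto 8-bit gray levels (the direction of the mapping is an assumption).

```python
import numpy as np

def depth_to_gray(Z):
    """Linearly quantize a depth map to the 0-255 display range."""
    z_min, z_max = float(Z.min()), float(Z.max())
    if z_max == z_min:
        return np.zeros_like(Z, dtype=np.uint8)
    return np.round(255.0 * (Z - z_min) / (z_max - z_min)).astype(np.uint8)
```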
Compared with the prior art, the invention has the notable advantages that: (1) scene depth information can be obtained with only one light field camera, so the equipment is simple and the cost is low; (2) an optimized depth map is obtained by mixing the two cues of parallax and focus with a simplified algorithm suitable for hardware design, so the algorithm is simple and the accuracy is high.
Drawings
Fig. 1 is a flow chart of a hybrid depth estimation method based on light field data according to the present invention.
Detailed Description
The invention relates to a hybrid depth estimation method based on light field data, which comprises the following steps:
step 1, acquiring light field data: acquiring four-dimensional light field image data through a light field camera, and processing to obtain a scene focus stack image, a sub-image, a full focus image and a focus parameter, wherein the focus parameter comprises a white image calibration parameter and a sub-image calibration parameter;
step 2, enhancing parallax matching of the CT-SAD algorithm: selecting a sub-image of a horizontal position to simulate a horizontally placed camera based on the sub-image, and performing stereo matching of left and right viewpoints, including enhanced CT transformation, SAD calculation, CT-SAD matching cost fusion, cost accumulation, parallax estimation, parallax post-processing and depth conversion to obtain an initial resolution depth image;
and 3, upsampling based on bilateral filtering: combining the full-focus image and the initial resolution depth image obtained in the parallax matching in the step 2, and performing an up-sampling process based on bilateral filtering to obtain a first depth image;
and 4, depth estimation based on focus detection: analyzing the focal stack image by adopting a gradient algorithm, obtaining a focusing pixel layer by adopting a two-dimensional gradient algorithm, and performing depth-of-field conversion by utilizing a focusing parameter to obtain the depth of the focusing pixel layer; repeatedly carrying out focusing detection until all focusing layers are detected to obtain a second depth map;
step 5, depth self-adaptive mixing: firstly, equalizing the depth grading of a first depth map obtained in a parallax matching mode and a second depth map obtained in a focusing detection mode, obtaining a substitution value according to a median approach principle, then performing cost fusion in a self-adaptive parameter mode, minimizing the cost to obtain a depth grading map, and performing depth conversion on the depth grading map by using grading equalization parameters to obtain a final depth map;
step 6, visualization of the depth map: the final depth map resulting from the depth hierarchy is quantized and the depth display range is adjusted to be in the range of 0-255.
Further, the enhanced CT transform in step 2 is specifically as follows:

The CT transformation process is formulated as follows:

L(u,v) = ⊗_{(m,n)∈M×N} ξ( l̄(u,v), l(u+m, v+n) )

R(u,v) = ⊗_{(m,n)∈M×N} ξ( r̄(u,v), r(u+m, v+n) )

ξ(p1, p2) = 0 if p1 ≤ p2, 1 if p1 > p2

where L, R are the transform results of the left and right images, the transform window size is M×N, and the variables m, n are integers that traverse the whole M×N window; r(u,v), l(u,v) are the right- and left-image pixel values at (u,v), respectively; ξ(·) is the comparison relation, computed as shown in the ξ(p1, p2) formula; p1, p2 are the original pixel values; l̄(u,v), r̄(u,v) are the pixel averages over the window area, (u,v) are the pixel coordinates; and ⊗ denotes that the obtained comparison values are concatenated in order.

The Census transform compares the center pixel with the surrounding pixels within the M×N window to obtain a corresponding array that represents the transformed center pixel, thus yielding two images R and L represented by Census transform values.

Further, the enhanced CT-SAD algorithm in step 2 specifically comprises the following steps:

The luminance-based SAD method is combined with the Census transform method. C_SAD denotes the SAD matching cost, obtained by computing luminance differences:

C_SAD(u,v,d) = Σ_{(i,j)∈L×K} | I_r(u+i, v+j) − I_l(u+i+d, v+j) |

where C_SAD(u,v,d) is the accumulated SAD matching cost of the pixel at (u,v) for disparity d, L×K is the accumulation window constant, and I_r, I_l are the pixel values of the right and left views corresponding to disparity d.

The Census-based matching cost C_CT is obtained by comparing the Hamming distance between the Census transform values of corresponding pixels at the same position in the right and left images; the two initial matching costs are then fused through the parameters (ρ_SAD, ρ_CT) to obtain the matching cost C_CT-SAD(p,d) of the pixel:

C_CT-SAD(p,d) = ρ_SAD·C_SAD(p,d) + ρ_CT·C_CT(p,d)

where C_CT-SAD is the matching cost at disparity d obtained from the sub-image parallax principle and p denotes the pixel at (u,v); ρ_SAD, ρ_CT are the cost fusion parameters; C_SAD(p,d), C_CT(p,d) are the SAD matching cost and the CT matching cost of pixel p at disparity d.

The matching cost is accumulated over the neighborhood of each pixel; after accumulation the new matching cost C_agg is obtained, and the matching cost C_b based on the sub-image parallax principle is:

C_b(u,v,d) = C_agg(u,v,d) = Σ_{(i,j)∈m×n} C_CT-SAD(u+i, v+j, d)

where C_agg(u,v,d) is the accumulated cost of the pixel at (u,v) for disparity d and m×n is the accumulation window size.

The matching cost is minimized with the WTA optimization method to obtain the disparity match: for each pixel the position with the minimum matching cost is found, giving the corresponding disparity map.

Further, the disparity post-processing in step 2 is the denoising of the disparity map, specifically as follows:

Median filtering with a k×l window is applied to obtain the disparity value Di;

the disparity map is represented in pixels, and the disparity is converted into the initial-resolution depth image Z'_b using the camera calibration parameters and the disparity-to-depth relation.

Further, the upsampling based on bilateral filtering in step 3 specifically includes the following steps:

Joint bilateral upsampling combined with the original texture map is adopted; the expression of the joint bilateral upsampling is:

p̃(X) = (1/k_X) Σ_{Y_s∈Ω} p_s(Y_s) · r(X_s, Y_s) · d( Ĩ(X), Ĩ(Y) )

where p̃(X) is the up-sampled pixel value; p_s(Y_s) is a pixel of the small-size image, i.e. the value assigned to the corresponding new high-resolution position after upsampling; Ĩ is the original large-size pixel value; X_s, Y_s are the spatial positions of two pixels in the small-size image; X, Y are the spatial positions in the large-size image corresponding to X_s, Y_s; Ω is the kernel support domain; k_X, the normalization parameter, is the sum of the products of the distance weight r and the pixel-difference weight d:

k_X = Σ_{Y_s∈Ω} r(X_s, Y_s) · d( Ĩ(X), Ĩ(Y) )

The pixel-difference weight d is, in physical terms, a luminance factor; it is separated into luminance factors d_h, d_v in the horizontal and vertical directions:

d_h = exp( −| Ĩ(x,y) − Ĩ(x_h, y) |² / (2σ_d²) )

d_v = exp( −| Ĩ(x,y) − Ĩ(x, y_v) |² / (2σ_d²) )

where Ĩ(x,y) is the pixel value at point (x,y), and Ĩ(x_h, y), Ĩ(x, y_v) are the pixel values horizontally and vertically aligned with point (x,y), respectively.

The pixel values of the original texture image are used as the parameters for generating the pixel-difference weight, with the operation separated by direction. The distance weight r, i.e. the scale factor, operates in the window of the initial low-resolution depth image; the upsampling yields the high-resolution image, with the initial low-resolution distance used as the distance measure:

r(X_s, Y_s) = exp( −δ(X_s, Y_s)² / (2σ_r²) )

where r(X_s, Y_s) is the distance weight, δ(X_s, Y_s) is the distance, and σ_r is the distance weight parameter.

After upsampling, the first depth image Z_b based on the sub-image parallax is obtained.

Further, the depth estimation based on focus detection described in step 4 specifically includes the following steps:

The differently focused images obtained by the light field camera, i.e. the focus stack images, are used as the input data for depth estimation. A focus detection function evaluates and measures the degree of focus of each pixel in the images, the in-focus pixels in the intermediate images are computed, the sharp pixels in each fixed-focus image are computed, and the depth corresponding to each pixel is obtained from the lens imaging principle.

The lens imaging principle is:

1/f = 1/v + 1/Z

Z = f·v / (v − f)

where f is the focal length, v is the image distance, and the object distance Z is the depth data corresponding to the pixel.

The degree of focus F(p, L) in focus detection is expressed with a gradient and a threshold:

F(p, L) = (1/A) Σ_{q∈W} max( t_L − |∇I_L(q)|, 0 )

where ∇I_L(p) is the gradient of the candidate image L at point p; L indexes the images at the different focus positions, the depth of field having L levels; t_L is a gradient threshold obtained automatically with the Roberts edge-extraction algorithm; A is a normalization parameter and W is an N×N accumulation window.

Through the minimization

Z'_f(p) = argmin_L F(p, L)

the L value corresponding to the minimum F value is obtained and taken as the pre-conversion depth value Z'_f(p) of point p; the in-focus image is then converted into the depth map Z_f using the image-acquisition transformation parameters.

Further, the depth adaptive mixing described in step 5 is specifically as follows:

Based on the different characteristics of the first depth map obtained by parallax matching and the second depth map obtained by focus detection, a feature analysis of the smooth regions and edge regions of the full-focus image is introduced and values are assigned separately; the parallax range and the focus levels are unified into an L-level representation, adaptive weights are assigned using the matching cost and the blur measurement parameter corresponding to the same depth level, and finally the smooth regions, the edges and the remaining regions are each filled with their respective depth values.

Smooth regions in the full-focus image are estimated with a gradient mean-square-error method. With a neighborhood window of size W = N×N in the gradient image, the mean square error S is:

S = (1/N²) Σ_{p∈W} ( |∇I(p)| − |∇I(p_center)| )²

where p_center is the center pixel position of the smoothing window and ∇I(p) is the gradient of the pixel value I(p) at point p.

Smooth regions in the image are labeled with the variance S and a threshold η:

I_smooth(p) = 1 if S(p) < η·S_max, otherwise 0

where I_smooth denotes the smooth regions, represented by 1 (other regions by 0), and S_max is the maximum value of the variance.

Image edge pixels are obtained with the Sobel edge operator, giving the edge map I_edge.

For the remaining regions other than smooth and edge, the adaptive fusion parameter λ_p is used as the focus weight:

[Equation images: σ_f and σ_b are computed over the L candidate depths from F(p,i) − F_i,min and C(p,i) − C_i,min respectively, and the adaptive weight λ_p is formed from σ_f and σ_b.]

where there are L candidate depths; F_i,min and C_i,min respectively denote the minimum cost of the focus cue and of the parallax cue of point p over the different candidate depths; σ_f and σ_b are the corresponding cue parameters; F(p,i) and C(p,i) are the degree of focus of point p at depth i and the cost value of the parallax cue.

Depth correction then combines the smooth, edge and remaining regions: smooth regions take the depth obtained from parallax, edge regions take the depth obtained from focus detection, and the remaining regions are determined with the cost-adaptive weighting method:

Z'(p) = Z_b(p) for p in I_smooth; Z'(p) = Z_f(p) for p in I_edge; Z'(p) = λ_p·Z_f(p) + (1 − λ_p)·Z_b(p) otherwise

where Z'(p) is the fused depth value; Z_b(p) and Z_f(p) are the depth values obtained from the parallax cue and the focus cue; λ_p is the adaptive parameter; I_smooth, I_edge and "others" denote the smooth regions, the edges and the remaining regions, respectively.

The depth map is then further filtered to denoise the obtained depth map.

Further, the visualization of the depth map in step 6 is specifically as follows:

The final depth map obtained from the depth levels is quantized and the depth display range is adjusted to 0–255; the depth inverse-quantization formula is:

I_d = round( 255 · (z_p − z_min) / (z_max − z_min) )

where z_max and z_min are respectively the maximum and minimum depth of field in the scene, z_p is the depth value of the pixel obtained from the parallax, and I_d is the 8-bit gray-level representation corresponding to the depth value.
The invention is described in further detail below with reference to the figures and specific examples.
Example 1
In the hybrid depth estimation method based on light field data, the underlying light field camera data are acquired with existing technology: a light field image can be shot with a Lytro camera, and the focus stack images and the full-focus image are extracted with the Lytro Desktop software, using the focus depth as the parameter with a step length of 0.2 and extracting the focus stack over the maximum depth-of-field range from foreground to background. The sub-images are obtained with the MATLAB toolbox LFToolbox 0.4.
With reference to fig. 1, a method for estimating hybrid depth based on light field data includes the following steps:
step 1: acquiring light field data: acquiring four-dimensional light field image data through a light field camera, and processing to obtain a scene focus stack image, a sub-image, a full focus image and a focus parameter, wherein the focus parameter comprises a white image calibration parameter and a sub-image calibration parameter;
Sub-image calibration is based on the fact that each microlens can be regarded as capturing a different viewing angle, so the sub-images can be used for parallax-based depth estimation; the focus parameters are used for the depth conversion in focus detection.
Step 2: parallax matching with the enhanced CT-SAD algorithm: selecting sub-images at horizontal positions to simulate horizontally placed cameras, and performing stereo matching of the left and right viewpoints, including enhanced CT transformation, SAD calculation, CT-SAD matching cost fusion, cost accumulation, parallax estimation, parallax post-processing and depth conversion;
the sub-image pairs are read and the video format is converted: if the data are RGB, the RGB-to-gray function is executed; if the data are YUV, the Y-component extraction function is executed;
the RGB to gray formula is:
Gray=R*0.299+G*0.587+B*0.114
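A vectorized one-liner corresponding to the gray-conversion formula above (for an H×W×3 RGB array):

```python
import numpy as np

def rgb_to_gray(rgb):
    """Gray = 0.299*R + 0.587*G + 0.114*B, applied per pixel."""
    return rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114
```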
CT conversion is carried out on the original image data, and the formula is as follows:
L(u,v) = ⊗_{(m,n)∈M×N} ξ( l̄(u,v), l(u+m, v+n) )

R(u,v) = ⊗_{(m,n)∈M×N} ξ( r̄(u,v), r(u+m, v+n) )

ξ(p1, p2) = 0 if p1 ≤ p2, 1 if p1 > p2

where L, R are the transform results of the left and right images respectively, the transform window size is M×N, p1, p2 are the original pixel values, l̄(u,v), r̄(u,v) are the pixel averages over the window area, u and v are the pixel coordinates, and ⊗ denotes that the obtained comparison values are concatenated in order.

The Census transform compares the center pixel with the surrounding pixels within the M×N window to obtain a corresponding array that represents the transformed center pixel.
The window sizes are M = 9 and N = 9. The CT-transformed data are matched using the Hamming distance to obtain the CT-based matching cost C_CT.
The luminance-based SAD method is combined with the Census transform method; SAD compensates some deficiencies of the Census transform, such as errors in regions with repeated texture. C_SAD denotes the SAD matching cost, obtained by computing luminance differences:

C_SAD(u,v,d) = Σ_{(i,j)∈L×K} | I_r(u+i, v+j) − I_l(u+i+d, v+j) |

The Census-based matching cost C_CT is obtained by comparing the Hamming distance between the Census transform values of corresponding pixels; the two initial matching costs are then fused through the parameters (ρ_SAD, ρ_CT) to obtain the matching cost of the pixel; in this example (ρ_SAD, ρ_CT) = (1, 1):

C_CT-SAD(p,d) = ρ_SAD·C_SAD(p,d) + ρ_CT·C_CT(p,d)

where C_b is the matching cost obtained from the sub-image parallax principle and p denotes the pixel at (u,v).

The matching cost is accumulated over the neighborhood of each pixel to further improve its matching performance; after accumulation the new matching cost C_agg is obtained, and the matching cost C_b based on the sub-image parallax principle is:

C_b(u,v,d) = C_agg(u,v,d) = Σ_{(i,j)∈m×n} C_CT-SAD(u+i, v+j, d)

where C_b is the matching cost obtained from the sub-image parallax principle and the accumulation window is m = n = 8.
And performing minimum optimization on the matching cost by adopting a WTA optimization method to obtain a parallax matching value, and searching a corresponding pixel position with the minimum matching cost to obtain a corresponding parallax map.
Further, the disparity post-processing, that is, denoising the disparity map, is specifically implemented as follows:
Median filtering is applied with an m×n window:

D_n(u,v) = median_{(i,j)∈m×n} D(u+i, v+j)

with m = n = 8, giving the disparity value D_n; the disparity map is represented in pixels, and the disparity is converted into the low-resolution depth map Z'_b using the camera calibration parameters and the disparity-to-depth relation.
Step 3: upsampling based on bilateral filtering. In obtaining the sub-images from the light field data the pixel resolution is reduced, so an upsampling step is required. The light field full-focus image has sharp texture detail and can improve the quality of the depth map. Combining the full-focus color image with the small-resolution depth map obtained from the parallax estimation, an upsampling process based on bilateral filtering yields a high-definition depth map; using the full-focus texture map as the reference for the bilateral-filter upsampling of the depth map effectively preserves information such as edges. X, Y are pixel positions in the full-focus image and X_s, Y_s are depth-map pixel positions; the upsampling is completed with:

p̃(X) = (1/k_X) Σ_{Y_s∈Ω} p_s(Y_s) · r(X_s, Y_s) · d( Ĩ(X), Ĩ(Y) )

where p̃(X) is the up-sampled pixel value, p_s(Y_s) is a pixel of the depth map, and Ĩ(X) is the pixel value of the full-focus image at the corresponding new position after upsampling. Ĩ are the pixel values of the full-focus image, X_s, Y_s are the spatial positions of two pixels in the depth map, X, Y are the spatial positions in the full-focus image corresponding to X_s, Y_s, and Ω is the kernel support domain. k_X, the normalization parameter, is the sum of the products of the distance weight r and the pixel-difference weight d:

k_X = Σ_{Y_s∈Ω} r(X_s, Y_s) · d( Ĩ(X), Ĩ(Y) )

where d is a luminance factor, separated into luminance factors in the horizontal and vertical directions:

d_h = exp( −| Ĩ(x,y) − Ĩ(x_h, y) |² / (2σ_d²) )

d_v = exp( −| Ĩ(x,y) − Ĩ(x, y_v) |² / (2σ_d²) )

The pixel values of the high-resolution original texture image are used as the parameters for generating the pixel-difference weight, with the operation separated by direction.

The scale factor r operates on the small-resolution depth image, using the small-resolution pixel distance as the distance measure:

r(X_s, Y_s) = exp( −δ(X_s, Y_s)² / (2σ_r²) )

After upsampling, the high-resolution depth map Z_b based on the sub-image parallax is obtained.

For controlling the upsampling quality, the upsampling equation has three parameters: the support domain Ω, i.e. the radial window n of the filter, and the parameters σ_r, σ_d. The larger the window radius and the σ values, the stronger the smoothing effect, and vice versa. To suppress noise during upsampling, the σ value can be chosen adaptively: a larger luminance parameter σ_d is taken at noisy pixels, while in smooth regions and edge regions the luminance parameter σ_d is kept as close to zero as possible. The luminance factor is therefore set adaptively to suppress noise, using the difference between the pixel and the mean of the filter window as σ_d:

I_mean(x,y) = (1/n²) Σ_{(i,j)∈n×n} I(x+i, y+j)

σ_d = D(x,y) = | I(x,y) − I_mean(x,y) |
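A short sketch of the adaptive luminance parameter described above: σ_d is set per pixel to the absolute difference between the pixel and its local window mean, so noisy pixels receive stronger range smoothing; the window size and the small floor value are illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def adaptive_sigma_d(image, win=5, floor=1e-3):
    """Per-pixel range parameter: sigma_d = |I(x,y) - I_mean(x,y)|."""
    mean = uniform_filter(image.astype(np.float32), size=win)
    return np.maximum(np.abs(image - mean), floor)  # floor avoids a zero sigma
```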
and 4, step 4: depth estimation based on focus detection: analyzing the focal stack image by adopting a gradient algorithm, obtaining a focusing pixel layer by adopting a two-dimensional gradient algorithm, and performing depth-of-field conversion by utilizing a focusing parameter to obtain the depth of the focusing pixel layer; repeating the focusing detection until all focusing layers are detected; the focus parameters are used for depth conversion of the focus stack hierarchy.
The degree of focus F(p, L) in focus detection is expressed with a gradient and a threshold:

F(p, L) = (1/A) Σ_{q∈W} max( t_L − |∇I_L(q)|, 0 )

where ∇I_L(p) is the gradient at point p for the candidate depth L, t_L is a gradient threshold obtained automatically with the Roberts edge-extraction algorithm, A is a normalization parameter and W is an N×N accumulation window with N = 8.

According to the formula, the F value is small at in-focus points; through the minimization

Z'_f(p) = argmin_L F(p, L)

the L value corresponding to the minimum F value is obtained as the pre-conversion depth value Z'_f(p) of point p. The in-focus image is then converted to the depth map Z_f using the image-acquisition transformation parameters. The depth conversion uses the focus parameter and the image distance: the image distance is a constant, and the focus parameter is the correspondence between the focus parameter and the focal length obtained by manual measurement and curve fitting, used for the depth conversion of the in-focus pixels.
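Since the focus-parameter-to-focal-length correspondence is stated to come from manual measurement and curve fitting, the following sketch shows one possible fit (a quadratic polynomial) and the subsequent thin-lens depth conversion; the measured pairs and the image distance v are placeholder values, not data from the patent.

```python
import numpy as np

# placeholder (focus parameter, focal length) measurements used only to illustrate the fit
focus_params = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
focal_lengths = np.array([9.5, 9.8, 10.2, 10.7, 11.3, 12.0])
fit = np.polynomial.Polynomial.fit(focus_params, focal_lengths, deg=2)

def focus_param_to_depth(alpha, v=20.0):
    """Map a focus parameter to object depth via the fitted focal length and
    the thin-lens relation Z = f*v / (v - f), with image distance v held constant."""
    f = fit(alpha)
    return f * v / (v - f)
```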
And 5: depth adaptive mixing: the method comprises the steps of firstly balancing depth grading in a parallax matching mode and a focusing detection mode, obtaining a substitution value according to a median approach principle, then performing cost fusion in a self-adaptive parameter mode to obtain a depth grading graph with minimized cost, and performing depth conversion on the depth grading graph by using grading balance parameters to obtain the depth graph.
Since the parallax-based depth map and the focus-based depth map have different characteristics, a feature analysis of the smooth and edge regions of the full-focus image is introduced and values are assigned separately; the parallax range and the focus levels are unified into an L-level representation, adaptive weights are assigned using the corresponding matching cost and blur measurement parameters, and finally the smooth regions, edges and remaining regions are each filled with their corresponding depth values, improving the accuracy of the whole depth map.
Smooth regions in the full-focus image are estimated with a gradient mean-square-error method. With a neighborhood window of size W = N×N in the gradient image, the mean square error S is:

S = (1/N²) Σ_{p∈W} ( |∇I(p)| − |∇I(p_center)| )²

Smooth regions in the image are labeled with the variance S and the threshold η, with η = 0.3:

I_smooth(p) = 1 if S(p) < η·S_max, otherwise 0

Image edge pixels are obtained with the Sobel edge operator, giving the edge map I_edge. For the remaining regions other than smooth and edge, such as texture-rich regions, the adaptive fusion parameter λ_p is used as the focus weight:

[Equation images: σ_f and σ_b are computed over the L candidate depths from F(p,i) − F_i,min and C(p,i) − C_i,min respectively, and the adaptive weight λ_p is formed from σ_f and σ_b.]

Depth correction then combines the smooth, edge and remaining regions: smooth regions take the depth obtained from parallax, edge regions take the depth obtained from focus detection, and the remaining regions, such as texture-rich regions, are determined with the cost-adaptive weighting method:

Z'(p) = Z_b(p) for p in I_smooth; Z'(p) = Z_f(p) for p in I_edge; Z'(p) = λ_p·Z_f(p) + (1 − λ_p)·Z_b(p) otherwise

The obtained depth map is then further filtered to denoise it.
Step 6: visualization of the depth map. Because of the limits of the parallax levels and the focus levels, the range of the obtained depth-level map is generally small; to obtain a visualizable depth, the depth map obtained from the depth levels is quantized and the depth display range is adjusted to 0–255, i.e. the depth levels are mapped to the 0–255 gray-scale range. The inverse-quantization formula for depth is:

I_d = round( 255 · (Z_p − Z_min) / (Z_max − Z_min) )

where Z_max and Z_min are respectively the maximum and minimum depth of field in the scene, Z_p is the depth value of the pixel obtained from the parallax, and I_d is the 8-bit gray-level representation corresponding to the depth value.
In conclusion, scene depth information can be obtained with only one light field camera, so the equipment is simple and the cost is low; in addition, the optimized depth map is obtained by mixing the two cues of parallax and focus with a simplified algorithm suitable for hardware design, so the algorithm is simple and the accuracy is high.

Claims (2)

1. A hybrid depth estimation method based on light field data, characterized by comprising the following steps:
step 1, acquiring light field data: acquiring four-dimensional light field image data through a light field camera, and processing the four-dimensional light field image data to obtain a scene focus stack image, a sub-image, a full-focus image and a focus parameter, wherein the focus parameter comprises a white image calibration parameter and a sub-image calibration parameter;
step 2, enhancing parallax matching of the CT-SAD algorithm: selecting a sub-image of a horizontal position to simulate a horizontally placed camera based on the sub-image, and performing stereo matching of left and right viewpoints, including enhanced CT transformation, SAD calculation, CT-SAD matching cost fusion, cost accumulation, parallax estimation, parallax post-processing and depth conversion to obtain an initial resolution depth image;
and 3, upsampling based on bilateral filtering: combining the full-focus image and the initial resolution depth image obtained in the parallax matching in the step 2, and performing an up-sampling process based on bilateral filtering to obtain a first depth image;
and 4, depth estimation based on focus detection: analyzing the focal stack image by adopting a gradient algorithm, obtaining a focusing pixel layer by adopting a two-dimensional gradient algorithm, and performing depth-of-field conversion by utilizing a focusing parameter to obtain the depth of the focusing pixel layer; repeatedly carrying out focusing detection until all focusing layers are detected to obtain a second depth map;
step 5, depth self-adaptive mixing: firstly, equalizing the depth grading of a first depth map obtained in a parallax matching mode and a second depth map obtained in a focusing detection mode, obtaining a substitution value according to a median approach principle, then performing cost fusion in a self-adaptive parameter mode, minimizing the cost to obtain a depth grading map, and performing depth conversion on the depth grading map by using grading equalization parameters to obtain a final depth map;
step 6, visualization of the depth map: quantizing the final depth map obtained by depth grading, and adjusting the depth display range to be within the range of 0-255;
the enhanced CT transform described in step 2 is specifically as follows:
the CT transformation process is formulated as follows:
Figure FDA0003482743780000011
Figure FDA0003482743780000012
Figure FDA0003482743780000013
l, R is the conversion result of left image and right image, the conversion window size is M N, the variables M, N are integer and cover the whole window M N; r (u, v), l (u, v) are the right and left image pixel values at (u, v), respectively; xi (·) is a comparison relation, and the calculation method is like xi (p)1,p2) The formula is shown; p is a radical of1、p2Is the value of the original pixel(s),
Figure FDA0003482743780000014
is the average value of the pixels in the window area, (u, v) is the pixel coordinate,
Figure FDA0003482743780000021
indicating that each of the obtained comparison values is ordered;
census conversion is carried out in an MxN window, a corresponding array is obtained by comparing a central pixel with surrounding pixels and is used for representing the converted central pixel, and thus two images R and L represented by the Census conversion value are obtained;
the enhanced CT-SAD algorithm described in step 2 specifically includes the following steps:
combining the SAD method based on luminance with the Census transform method, using CSADRepresenting the matching cost of SAD, CSADObtaining by calculating a difference value of brightness:
Figure FDA0003482743780000022
wherein, formula CSAD(μ, v, d) represents the cumulative SAD matching cost for a pixel with disparity d at (u, v), L × K is the cumulative window constant, IrAnd IlThe disparity is d pixel values corresponding to a right view and a left view;
obtaining a matching cost C based on Census transformation by comparing the Hamming distance of Census transformation values of corresponding pixels at the same positions in the right image and the left imageCTThen the matching cost is passed through the parameter (rho) by two initializationsSADCT) Obtaining the matching cost C of the pixel by fusionCT-SAD(p,d):
CCT-SAD(p,d)=ρSADCSAD(p,d)+ρCTCCT(p,d)
Wherein CCT-SADP represents a (u, v) point pixel for the matching cost of the parallax d position obtained based on the parallax principle of the subimage; rhoSAD、ρCTFusing parameters by respective costs; cSAD(p,d)、CCT(p, d) are SAD matching cost and CT matching cost of the pixel p in the parallax d respectively;
carrying out matching cost accumulation in a pixel related region, and obtaining a new matching cost C after the cost accumulation is finishedaggThen, the matching cost C is obtained based on the principle of the parallax of the sub-imagebComprises the following steps:
Figure FDA0003482743780000023
wherein, Cagg(μ, v, d) is an accumulated cost of the (μ, v) point pixel when the parallax is d, and m, n is an accumulated window size;
performing minimum optimization on the matching cost by adopting a WTA optimization method to obtain a parallax matching value, and searching a corresponding pixel position with the minimum matching cost to obtain a corresponding parallax map;
the parallax post-processing in step 2 is to perform denoising processing on the parallax image, and specifically includes:
median filtering by adopting a k multiplied by l window to obtain a disparity value Di;
the parallax image is represented by pixels, and parallax is converted into an initial resolution depth image Z' b by utilizing a camera calibration parameter and a parallax conversion relation;
step 3, the upsampling based on the bilateral filtering is specifically as follows:
by adopting bilateral upsampling combined with an original texture map, the expression of the joint bilateral upsampling is as follows:
Figure FDA0003482743780000031
wherein
Figure FDA0003482743780000032
Is the up-sampled pixel value; p is a radical ofs(Y) is the pixel of the small-size image, namely the pixel value of the large resolution corresponding to the new position of the small-size image after up-sampling;
Figure FDA0003482743780000033
for original large-sized pixel values, Xs,YsThe spatial positions of two pixel points in the small-size image are obtained; x, Y is X in large-size images、YsA corresponding spatial position; omega is a kernel function support domain; k is a radical ofXThe sum of the products of the distance weight r and the pixel difference weight d, i.e. the normalization parameter, is given by the following formula:
Figure FDA0003482743780000034
wherein the pixel difference weight d is a brightness factor in physical sense, and is separated into a brightness factor d in horizontal and vertical directionsh、dv
Figure FDA0003482743780000035
Figure FDA0003482743780000036
Wherein the content of the first and second substances,
Figure FDA0003482743780000037
is the (x, y) point pixel value,
Figure FDA0003482743780000038
pixel values in the horizontal and vertical directions aligned with the (x, y) point, respectively;
adopting the pixel value of the original texture image as a parameter for generating pixel difference weight, and performing directional separation operation; the distance weight r, i.e. the scale factor, is an operation in the window of the initial low-resolution depth image, and is used for obtaining a high-resolution image by upsampling, and the distance of the initial low-resolution is used as a ranging mode, and the formula is as follows:
Figure FDA0003482743780000039
wherein r (X)s,Ys) Is the distance weight, δ (X)s,Ys) Is a distance, σrIs a distance weight parameter;
obtaining a first depth image Z based on the parallax of the sub-image after up-samplingb
The depth estimation based on focus detection described in step 4 specifically includes the following steps:
taking differently focused images obtained by a light field camera, namely focal point stack images, as input data of depth estimation, evaluating and measuring the focusing degree of each pixel point in the images by using a focusing detection function, calculating intermediate focusing pixels in the images, calculating clear pixels in fixed focus images, and obtaining the depth corresponding to the pixels by using a lens imaging principle;
the lens imaging principle formula is as follows:
Figure FDA0003482743780000041
Figure FDA0003482743780000042
wherein, f is the focal length, v is the image distance, and the object distance Z is the depth data corresponding to the pixel;
the degree of focus F (p, L) in focus detection is expressed by a gradient and a threshold value, and the formula is as follows:
Figure FDA0003482743780000043
wherein
Figure FDA0003482743780000044
The gradient of the candidate L image at the point p; l represents images on different focus points, and the depth of field is L in level; t is tLAutomatically acquiring a gradient threshold value through a roberts edge extraction algorithm; a is a normalization parameter, and W is a cumulative window of N × N;
by a minimization process
Figure FDA0003482743780000045
Obtaining an L value corresponding to the minimum F value, and taking the L value as the depth value Z 'before conversion of the point p'f(p); the focused image is then converted to a depth map Z using image acquisition transformation parametersf
The depth adaptive mixing in step 5 is specifically as follows:
according to the difference of the characteristics of a first depth map obtained in a parallax matching mode and a second depth map obtained in a focusing detection mode, introducing characteristic analysis of a smooth region and an edge region of a full-focusing image, respectively assigning values, unifying a parallax range and a focusing level to an L-dimension representation, performing self-adaptive weight distribution by using matching cost and fuzzy measurement parameters corresponding to the same depth in the L-dimension, and finally respectively filling the smooth region, the edge and other regions with respective depth values;
the smooth regions in the full-focus image are estimated with a gradient mean-square-error method; with a neighborhood window W of size N × N in the gradient image, the mean square error S is:
[formula image FDA0003482743780000046: definition of S]
wherein p_center is the position of the center pixel of the smoothing window, and [formula image FDA0003482743780000051] is the gradient of the pixel value I(p) at the point p;
the smooth regions in the image are labeled using the variance S and the threshold η:
[formula image FDA0003482743780000052: labeling rule for I_smooth]
wherein I_smooth denotes a smooth region, smooth regions being marked 1 and the other regions 0, and S_max is the maximum value of the variance;
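A minimal sketch of this smooth-region labeling is shown below, assuming (since the exact formulas are in the claim images) that S is the windowed mean-square deviation of the gradient magnitude from the window-center value and that a pixel is labeled smooth where S falls below η·S_max; the gradient operator, window size and η value are illustrative choices:

```python
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def smooth_region_mask(full_focus, window=7, eta=0.05):
    """Label smooth regions of the full-focus image: 1 = smooth, 0 = other."""
    img = full_focus.astype(np.float64)
    grad = np.hypot(sobel(img, axis=1), sobel(img, axis=0))
    # Windowed mean square deviation of the gradient from the window-centre value:
    # E[(g - g_center)^2] = E[g^2] - 2*g_center*E[g] + g_center^2 over each window.
    mean = uniform_filter(grad, size=window)
    mean_sq = uniform_filter(grad ** 2, size=window)
    S = np.maximum(mean_sq - 2.0 * grad * mean + grad ** 2, 0.0)
    return (S < eta * S.max()).astype(np.uint8)
```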
for the acquisition of image edge pixels, a sobel edge operator is adopted to obtain an edge Iedge
for the regions other than the smooth regions and the edges, an adaptive fusion parameter λ_p is used as the weight of the focus cue, with the formulas as follows:
[formula images FDA0003482743780000053, FDA0003482743780000054 and FDA0003482743780000055: definition of λ_p]
wherein there are L candidate depths; F_i,min and C_i,min respectively denote the minimum cost of the focus cue and of the parallax cue of the point p over the different candidate depths; σ_f and σ_b are the corresponding parameters of the focus cue and of the parallax cue; F(p, i) and C(p, i) are respectively the degree of focus of the point p at depth i and the cost value of the parallax cue;
and depth correction is carried out comprehensively over the smooth regions, the edge regions and the other regions: the depth obtained from parallax is selected in the smooth regions, the depth obtained by focus detection is selected in the edge regions, and in the other regions the depth is determined by the cost-adaptive weighting method, i.e.:
Z'(p) = Z_b(p),  p ∈ I_smooth
Z'(p) = Z_f(p),  p ∈ I_edge
Z'(p) = λ_p·Z_f(p) + (1 − λ_p)·Z_b(p),  p ∈ Others
wherein Z'(p) is the fused depth value; Z_b(p) and Z_f(p) respectively denote the depth values obtained from the parallax cue and the focus cue; λ_p is the adaptive parameter; I_smooth, I_edge and Others respectively denote the smooth regions, the edges and the remaining regions;
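For illustration, the region-wise fusion rule can be sketched as below. Because the λ_p formulas are only available as claim images, λ_p is assumed here to be a Gaussian-weighted confidence ratio built from the per-pixel minimum costs of the two cues; the function name, the F_min/C_min inputs and the σ values are hypothetical stand-ins, while the region rule itself (parallax depth in smooth regions, focus depth at edges, adaptive blend elsewhere) follows the claim text:

```python
import numpy as np

def fuse_depths(Z_b, Z_f, F_min, C_min, smooth_mask, edge_mask,
                sigma_f=1.0, sigma_b=1.0):
    """Fuse the parallax-based depth Z_b and the focus-based depth Z_f.

    F_min, C_min: per-pixel minimum focus-cue / parallax-cue costs over the
                  candidate depths (assumed inputs for deriving lambda_p).
    smooth_mask, edge_mask: binary masks of the smooth regions and the edges.
    """
    # Confidence of each cue: a lower minimum cost means a more reliable cue
    # (assumed Gaussian weighting; the exact lambda_p formula is not reproduced).
    conf_f = np.exp(-(F_min ** 2) / (2 * sigma_f ** 2))
    conf_b = np.exp(-(C_min ** 2) / (2 * sigma_b ** 2))
    lam = conf_f / (conf_f + conf_b + 1e-12)        # weight of the focus cue

    fused = lam * Z_f + (1.0 - lam) * Z_b           # other regions: adaptive blend
    fused = np.where(smooth_mask == 1, Z_b, fused)  # smooth regions: parallax depth
    fused = np.where(edge_mask == 1, Z_f, fused)    # edge regions: focus depth
    return fused
```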
the depth map is further filtered using the formula
[formula image FDA0003482743780000061]
to denoise the obtained depth map.
2. The light-field-data-based hybrid depth estimation method according to claim 1, wherein the visualization of the depth map in step 6 is specifically as follows:
the final depth map obtained by depth grading is quantized, and the depth display range is adjusted into the range 0–255; the depth inverse quantization formula is as follows:
[formula image FDA0003482743780000062: inverse quantization of depth to 8-bit gray levels]
wherein z_max and z_min are respectively the maximum and minimum depth of field in the scene, z_p is the depth value of the pixel obtained from the parallax, and I_d is the corresponding 8-bit grayscale representation of the depth value.
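As an illustration of this inverse quantization, the sketch below maps the fused depth map linearly into 8-bit gray levels; the linear form and the choice of rendering nearer points brighter are assumptions, since the exact formula exists only as a claim image:

```python
import numpy as np

def depth_to_gray(depth):
    """Linearly map a metric depth map into the 0-255 display range
    (assumed mapping: nearer points appear brighter)."""
    z_min, z_max = float(depth.min()), float(depth.max())
    if z_max == z_min:
        return np.zeros_like(depth, dtype=np.uint8)
    I_d = 255.0 * (z_max - depth) / (z_max - z_min)
    return np.round(I_d).astype(np.uint8)
```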
CN201811151940.5A 2018-09-29 2018-09-29 Hybrid depth estimation method based on light field data Active CN109360235B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811151940.5A CN109360235B (en) 2018-09-29 2018-09-29 Hybrid depth estimation method based on light field data


Publications (2)

Publication Number Publication Date
CN109360235A CN109360235A (en) 2019-02-19
CN109360235B true CN109360235B (en) 2022-07-19

Family

ID=65348332

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811151940.5A Active CN109360235B (en) 2018-09-29 2018-09-29 Hybrid depth estimation method based on light field data

Country Status (1)

Country Link
CN (1) CN109360235B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110120071B (en) * 2019-05-15 2023-03-24 南京工程学院 Depth estimation method for light field image
CN110246172A (en) * 2019-06-18 2019-09-17 首都师范大学 A kind of the light field total focus image extraction method and system of the fusion of two kinds of Depth cues
CN110246162A (en) * 2019-06-20 2019-09-17 首都师范大学 A kind of total focus light field image composing method and system
CN111340715B (en) * 2019-09-19 2024-02-06 杭州海康慧影科技有限公司 Grid pattern weakening method and device of image and electronic equipment
CN110738628B (en) * 2019-10-15 2023-09-05 湖北工业大学 Adaptive focus detection multi-focus image fusion method based on WIML comparison graph
CN113034568B (en) * 2019-12-25 2024-03-29 杭州海康机器人股份有限公司 Machine vision depth estimation method, device and system
CN111652817B (en) * 2020-05-28 2023-08-22 大连海事大学 Underwater image sharpening method based on human eye visual perception mechanism
CN112288791A (en) * 2020-11-06 2021-01-29 浙江中控技术股份有限公司 Disparity map obtaining method, and fish-eye camera-based three-dimensional model obtaining method and device
CN112669355B (en) * 2021-01-05 2023-07-25 北京信息科技大学 Method and system for splicing and fusing focusing stack data based on RGB-D super pixel segmentation
CN113052886A (en) * 2021-04-09 2021-06-29 同济大学 Method for acquiring depth information of double TOF cameras by adopting binocular principle
CN113706466A (en) * 2021-07-26 2021-11-26 桂林电子科技大学 Shadow clue and parallax method image depth measuring method
CN113610908B (en) * 2021-07-29 2023-08-18 中山大学 Depth estimation method for multi-baseline fusion in monocular endoscopic surgery
CN114757985A (en) * 2022-04-15 2022-07-15 湖南工程学院 Binocular depth sensing device based on ZYNQ improved algorithm and image processing method
CN114881907B (en) * 2022-06-30 2022-09-23 江苏集萃苏科思科技有限公司 Optical microscopic image multi-depth-of-field focus synthesis method and system and image processing method


Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102065313B (en) * 2010-11-16 2012-10-31 上海大学 Uncalibrated multi-viewpoint image correction method for parallel camera array
US20130070060A1 (en) * 2011-09-19 2013-03-21 Pelican Imaging Corporation Systems and methods for determining depth from multiple views of a scene that include aliasing using hypothesized fusion
CN104469183B (en) * 2014-12-02 2015-10-28 东南大学 A kind of light field of X-ray scintillation body imaging system catches and post-processing approach
EP3220351A1 (en) * 2016-03-14 2017-09-20 Thomson Licensing Method and device for processing lightfield data
CN106340041B (en) * 2016-09-18 2018-12-25 杭州电子科技大学 It is a kind of to block the light-field camera depth estimation method for filtering out filter based on cascade
CN107452031B (en) * 2017-03-09 2020-06-26 叠境数字科技(上海)有限公司 Virtual ray tracking method and light field dynamic refocusing display system
CN108564620B (en) * 2018-03-27 2020-09-04 中国人民解放军国防科技大学 Scene depth estimation method for light field array camera
CN108406731B (en) * 2018-06-06 2023-06-13 珠海一微半导体股份有限公司 Positioning device, method and robot based on depth vision

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103220545A (en) * 2013-04-28 2013-07-24 上海大学 Hardware implementation method of stereoscopic video real-time depth estimation system
CN106257537A (en) * 2016-07-18 2016-12-28 浙江大学 A kind of spatial depth extracting method based on field information
CN107135388A (en) * 2017-05-27 2017-09-05 东南大学 A kind of depth extraction method of light field image
CN108596965A (en) * 2018-03-16 2018-09-28 天津大学 A kind of light field image depth estimation method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"Depth from Combining Defocus and Correspondence Using Light-Field Cameras";Michael W. Tao等;《2013 IEEE International Conference on Computer Vision》;20131231;第673-680页 *
"三维视频中基于FPGA的实时深度估计研究与应用";李贺建;《中国优秀博士学位论文全文数据集》;20160315(第3期);正文第63-65页第4.2节 *
"基于光场分析的多线索融合深度估计方法";杨德刚等;《计算机学报》;20151231;第38卷(第12期);第2442-2445页第4节 *
"聚焦性检测与彩色信息引导的光场图像深度提取";胡良梅等;《中国图象图形学报》;20160229;第21卷(第2期);第159页第2.2节 *

Also Published As

Publication number Publication date
CN109360235A (en) 2019-02-19

Similar Documents

Publication Publication Date Title
CN109360235B (en) Hybrid depth estimation method based on light field data
JP6561216B2 (en) Generating intermediate views using optical flow
US10699476B2 (en) Generating a merged, fused three-dimensional point cloud based on captured images of a scene
CN107430782B (en) Method for full parallax compressed light field synthesis using depth information
TWI524734B (en) Method and device for generating a depth map
RU2382406C1 (en) Method of improving disparity map and device for realising said method
KR102464523B1 (en) Method and apparatus for processing image property maps
JP2015188234A (en) Depth estimation based on global motion
CN110352592B (en) Image forming apparatus and image forming method
WO2011014419A1 (en) Methods, systems, and computer-readable storage media for creating three-dimensional (3d) images of a scene
US8867826B2 (en) Disparity estimation for misaligned stereo image pairs
EP2382791A1 (en) Depth and video co-processing
Lee et al. Generation of multi-view video using a fusion camera system for 3D displays
WO2019122205A1 (en) Method and apparatus for generating a three-dimensional model
KR20130112311A (en) Apparatus and method for reconstructing dense three dimension image
CN107750370A (en) For the method and apparatus for the depth map for determining image
Sharma et al. A flexible architecture for multi-view 3DTV based on uncalibrated cameras
Jung A modified model of the just noticeable depth difference and its application to depth sensation enhancement
CN112102347B (en) Step detection and single-stage step height estimation method based on binocular vision
Seitner et al. Trifocal system for high-quality inter-camera mapping and virtual view synthesis
Orozco et al. HDR multiview image sequence generation: Toward 3D HDR video
JP2013200840A (en) Video processing device, video processing method, video processing program, and video display device
Jung et al. All-in-focus and multi-focus color image reconstruction from a database of color and depth image pairs
Kang et al. Generation of multi-view images using stereo and time-of-flight depth cameras
Liu Improving forward mapping and disocclusion inpainting algorithms for depth-image-based rendering and geomatics applications

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant