CN112488915B

CN112488915B - Depth reconstruction method and device based on superpixel relation analysis

Info

Publication number: CN112488915B
Application number: CN201910864317.2A
Authority: CN
Inventors: 季向阳; 邸研
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2019-09-12
Filing date: 2019-09-12
Publication date: 2022-06-21
Anticipated expiration: 2039-09-12
Also published as: CN112488915A

Abstract

The present disclosure relates to a depth reconstruction method and apparatus based on superpixel relationship analysis. The method comprises: determining a feature point matching result M between adjacent image frames in an image frame set and a superpixel segmentation result S of a target image frame; determining, according to the feature point matching result M and the superpixel segmentation result S, a homography matrix H corresponding to each superpixel and a motion relation R_e between adjacent superpixels; determining a spatial relationship R_s between adjacent superpixels according to the motion relation R_e between adjacent superpixels; determining a plane parameter θ of each superpixel according to the spatial relationship R_s between adjacent superpixels; and performing depth reconstruction on the dynamic scene corresponding to the image frame set according to the homography matrix H corresponding to each superpixel and the plane parameter θ of each superpixel, thereby effectively achieving depth reconstruction of complex motion scenes.

Description

Depth reconstruction method and device based on superpixel relationship analysis

Technical Field

The present disclosure relates to the field of image processing technologies, and in particular, to a depth reconstruction method and apparatus based on superpixel relationship analysis.

Background

Depth reconstruction based on a monocular camera has long been an important and extremely challenging research topic, with important applications in many fields, such as automatic driving, object recognition, and motion recognition. However, many practical applications involve complex motion scenes, such as rigidly moving vehicles and non-rigidly deforming pedestrians. When a conventional Structure From Motion (SFM) method is used to solve for depth in such a dynamic scene, the relative scale of each unit structure cannot be directly determined, so the conventional SFM method cannot perform depth reconstruction on complex motion scenes.

Disclosure of Invention

In view of this, the present disclosure provides a depth reconstruction method and apparatus based on superpixel relationship analysis, which effectively implement depth reconstruction of a complex motion scene.

According to a first aspect of the present disclosure, there is provided a depth reconstruction method based on superpixel relationship analysis, including: determining a feature point matching result M between adjacent image frames in an image frame set and a superpixel segmentation result S of a target image frame, wherein the image frame set comprises at least two image frames, the target image frame is one frame in the image frame set, and the superpixel segmentation result S comprises a plurality of superpixels; determining, according to the feature point matching result M and the superpixel segmentation result S, a homography matrix H corresponding to each superpixel and a motion relation R_e between adjacent superpixels; determining a spatial relationship R_s between the adjacent superpixels according to the motion relation R_e between the adjacent superpixels; determining a plane parameter θ of each superpixel according to the spatial relationship R_s between the adjacent superpixels; and performing depth reconstruction on the dynamic scene corresponding to the image frame set according to the homography matrix H corresponding to each superpixel and the plane parameter θ of each superpixel.

In a possible implementation, determining, according to the feature point matching result M and the superpixel segmentation result S, the homography matrix H corresponding to each superpixel and the motion relation R_e between adjacent superpixels includes: for any superpixel i, optimizing a first energy function E(H, R_e) to obtain the homography matrix H_i corresponding to superpixel i and the motion relation R_e(i, j) between superpixel i and an adjacent superpixel j; wherein the first energy function E(H, R_e) includes: a first optimization term E_data(H_i) related to the homography matrix H_i corresponding to superpixel i; a second optimization term E_pair(H_i, H_j, R_e(i, j)) related to the homography matrix H_i corresponding to superpixel i, the motion relation R_e(i, j) between superpixel i and the adjacent superpixel j, and the homography matrix H_j corresponding to the adjacent superpixel j; and a constant term E_o(R_e(i, j)) related to the motion relation R_e(i, j) between superpixel i and the adjacent superpixel j.

In a possible implementation, the homography matrix H_j corresponding to the adjacent superpixel j is an optimized homography matrix.

In a possible implementation, determining the spatial relationship R_s between the adjacent superpixels according to the motion relation R_e between the adjacent superpixels includes: determining the relationship type of the motion relation R_e between the adjacent superpixels as the relationship type of the spatial relationship R_s between the adjacent superpixels, wherein the relationship types include Coplanar, Hinge and Crack.

In a possible implementation, determining the plane parameter θ of each superpixel according to the spatial relationship R_s between the adjacent superpixels includes: determining a reference superpixel set S_t among the plurality of superpixels based on a preset algorithm, wherein the reference superpixel set S_t corresponds to a background part of the target image frame; setting the scale factor s of each reference superpixel in the reference superpixel set S_t to 1; and determining the plane parameter θ of each superpixel according to the reference superpixel set S_t and the spatial relationship R_s between the adjacent superpixels.

In a possible implementation, determining the plane parameter θ of each superpixel according to the reference superpixel set S_t and the spatial relationship R_s between the adjacent superpixels includes: for any superpixel i, optimizing a second energy function E(θ, s) to obtain the plane parameter θ_i and the scale factor s_i of superpixel i; wherein the second energy function E(θ, s) includes: a third optimization term E_fit(θ_i, s_i) related to the plane parameter θ_i and the scale factor s_i of superpixel i; a fourth optimization term E_Rel(θ_i, θ_j) related to the plane parameter θ_i of superpixel i and the plane parameter θ_j of an adjacent superpixel j of superpixel i; and a constant term E_occ(s_i) related to the scale factor s_i of superpixel i.

In a possible implementation, when the relationship types of the spatial relationships R_s(i, j) between superpixel i and all of its adjacent superpixels j are Crack, the second energy function E(θ, s) further includes: a fifth optimization term E_pair(θ_i, θ_k) related to the plane parameter θ_i of superpixel i and the plane parameter θ_k of a superpixel k, wherein superpixel k is a superpixel outside the superpixel set S_r, the superpixels in the superpixel set S_r can reach the reference superpixel set S_t via a target path, and every superpixel relation on the target path is of type Coplanar or Hinge.

In a possible implementation, the plane parameter θ_j of the adjacent superpixel j and the plane parameter θ_k of superpixel k are optimized plane parameters.

In a possible implementation, performing depth reconstruction on the dynamic scene corresponding to the image frame set according to the homography matrix H corresponding to each superpixel and the plane parameter θ of each superpixel includes: decomposing the homography matrix H corresponding to each superpixel to obtain a camera rotation parameter R₀ and a camera translation parameter t₀ corresponding to each superpixel; and constructing a dense depth map of the dynamic scene corresponding to the image frame set according to the camera rotation parameter R₀ and the camera translation parameter t₀ corresponding to each superpixel and the plane parameter θ of each superpixel.

According to a second aspect of the present disclosure, there is provided a depth reconstruction apparatus based on superpixel relationship analysis, including: a first determining module, configured to determine a feature point matching result M between adjacent image frames in an image frame set and a superpixel segmentation result S of a target image frame, wherein the image frame set comprises at least two image frames, the target image frame is one frame in the image frame set, and the superpixel segmentation result S comprises a plurality of superpixels; a second determining module, configured to determine, according to the feature point matching result M and the superpixel segmentation result S, a homography matrix H corresponding to each superpixel and a motion relation R_e between adjacent superpixels; a third determining module, configured to determine a spatial relationship R_s between the adjacent superpixels according to the motion relation R_e between the adjacent superpixels; a fourth determining module, configured to determine a plane parameter θ of each superpixel according to the spatial relationship R_s between the adjacent superpixels; and a reconstruction module, configured to perform depth reconstruction on the dynamic scene corresponding to the image frame set according to the homography matrix H corresponding to each superpixel and the plane parameter θ of each superpixel.

In the present disclosure, a feature point matching result M between adjacent image frames in an image frame set comprising at least two image frames, and a superpixel segmentation result S of a target image frame, are determined, wherein the target image frame is one frame in the image frame set and the superpixel segmentation result S comprises a plurality of superpixels. A homography matrix H corresponding to each superpixel and a motion relation R_e between adjacent superpixels are determined according to the feature point matching result M and the superpixel segmentation result S; a spatial relationship R_s between adjacent superpixels is determined according to the motion relation R_e; a plane parameter θ of each superpixel is determined according to the spatial relationship R_s; and depth reconstruction is performed on the dynamic scene corresponding to the image frame set according to the homography matrix H corresponding to each superpixel and the plane parameter θ of each superpixel. Because the spatial relationship of the superpixels is determined from their motion relationship, the plane parameter of each superpixel can be determined, and depth reconstruction of a complex motion scene is effectively achieved.

Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.

FIG. 1 is a schematic flow chart of a depth reconstruction method based on superpixel relationship analysis according to an embodiment of the present disclosure;

FIG. 2 shows a schematic diagram of a depth reconstruction system based on superpixel relationship analysis according to an embodiment of the present disclosure;

FIG. 3 shows a schematic diagram of the results of depth reconstruction using two consecutive image frames according to an embodiment of the present disclosure;

FIG. 4 shows a schematic structural diagram of a depth reconstruction apparatus based on superpixel relationship analysis according to an embodiment of the present disclosure.

Detailed Description

Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.

The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.

Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.

Fig. 1 shows a schematic flowchart of a depth reconstruction method based on superpixel relationship analysis according to an embodiment of the present disclosure. As shown in fig. 1, the method may include:

Step S11, determining a feature point matching result M between adjacent image frames in the image frame set and a superpixel segmentation result S of the target image frame, where the image frame set includes at least two image frames, the target image frame is one frame in the image frame set, and the superpixel segmentation result S includes a plurality of superpixels.

Step S12, determining, according to the feature point matching result M and the superpixel segmentation result S, the homography matrix H corresponding to each superpixel and the motion relation R_e between adjacent superpixels.

Step S13, determining the spatial relationship R_s between adjacent superpixels according to the motion relation R_e between adjacent superpixels.

Step S14, determining the plane parameter θ of each superpixel according to the spatial relationship R_s between adjacent superpixels.

Step S15, performing depth reconstruction on the dynamic scene corresponding to the image frame set according to the homography matrix H corresponding to each superpixel and the plane parameter θ of each superpixel.

The spatial relationship of the super-pixels is determined through the motion relationship between the super-pixels, so that the plane parameter of each super-pixel can be determined, and the depth reconstruction of a complex motion scene is effectively realized.

Fig. 2 shows a schematic diagram of a depth reconstruction system based on superpixel relationship analysis according to an embodiment of the present disclosure. As shown in fig. 2, the depth reconstruction system includes five modules: a preprocessing module, a relation analysis module, a motion selection module, a reconstruction module and an optimization module.

The at least two image frames included in the image frame set are input into the preprocessing module, which preprocesses them and outputs the feature point matching result M between adjacent image frames and the superpixel segmentation result S of the target image frame.

In one example, feature point matching results between adjacent image frames are determined by a CPM algorithm.

It should be noted that, the method for determining the feature point matching result M between adjacent image frames may also use other algorithms besides the CPM algorithm described above, and this disclosure is not limited in this respect.

In an example, a superpixel segmentation result S of the target image frame is determined by the SLIC algorithm.

It should be noted that, the method for determining the super-pixel segmentation result S of the target image frame may also use other algorithms besides the SLIC algorithm described above, and this disclosure is not limited in this regard.

Two kinds of relationship are defined between adjacent superpixels: the motion relationship and the spatial relationship. The motion relationship is related to the homography matrix H and can be obtained from point matching, while the spatial relationship is determined by the plane parameter θ. Each of the two has three relationship types: Coplanar, Hinge and Crack. Table 1 gives the specific definition of the motion relationship, and Table 2 gives the specific definition of the spatial relationship.

TABLE 1

TABLE 2

Here, H_i is the homography matrix corresponding to superpixel S_i, θ_i is the plane parameter of superpixel S_i, P is the canonical form of the pixel point p, and B_i,j is the set of pixels on the common boundary of the adjacent superpixels S_i and S_j.
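As an illustration of the boundary set B_i,j and of the set of adjacent superpixel pairs, the following is a minimal sketch that derives both from a label map. It assumes the superpixel segmentation result S is available as a 2-D integer array of labels and that adjacency means 4-connectivity; the function name `adjacent_pairs_and_boundaries` and both assumptions are hypothetical and are not taken from the disclosure.

```python
import numpy as np

def adjacent_pairs_and_boundaries(labels):
    """Return {(i, j): set of (row, col)} for 4-connected neighbouring labels, i < j.

    Each value is an illustrative stand-in for B_i,j: the pixels of both
    superpixels that touch the common boundary (an assumption of this sketch).
    """
    boundaries = {}
    h, w = labels.shape
    # Compare every pixel with its right neighbour, then with its bottom neighbour.
    for dr, dc in ((0, 1), (1, 0)):
        a = labels[: h - dr, : w - dc]
        b = labels[dr:, dc:]
        rows, cols = np.nonzero(a != b)
        for r, c in zip(rows, cols):
            i, j = int(a[r, c]), int(b[r, c])
            key = (min(i, j), max(i, j))
            pts = boundaries.setdefault(key, set())
            pts.add((int(r), int(c)))
            pts.add((int(r + dr), int(c + dc)))
    return boundaries

# Tiny 3x3 label map with three superpixels (0, 1 and 2).
labels = np.array([[0, 0, 1],
                   [0, 2, 1],
                   [2, 2, 1]])
B = adjacent_pairs_and_boundaries(labels)
E = sorted(B)  # adjacent superpixel pairs
```

The keys of `B` play the role of the adjacent superpixel pairs, and each value lists the boundary pixels shared by one pair.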

Still taking FIG. 2 as an example, the feature point matching result M between adjacent image frames and the superpixel segmentation result S of the target image frame, both output by the preprocessing module, are input into the relation analysis module. From these, the relation analysis module determines the homography matrix H corresponding to each superpixel and the motion relation R_e between adjacent superpixels.
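For readers less familiar with homographies, the sketch below shows how a 3x3 homography maps a pixel from the target frame into the adjacent frame, and how the distance to a matched point can be measured. It is a generic illustration: the names `apply_homography` and `transfer_error` are hypothetical, and the transfer error is a common way to score a match, not necessarily the term used inside E_data.

```python
import numpy as np

def apply_homography(H, p):
    """Map pixel p = (x, y) through the 3x3 homography H using homogeneous coordinates."""
    x, y, w = H @ np.array([p[0], p[1], 1.0])
    return np.array([x / w, y / w])

def transfer_error(H, p, q):
    """Euclidean distance between H applied to p and the matched point q."""
    return float(np.linalg.norm(apply_homography(H, p) - np.asarray(q, dtype=float)))

# A pure image-plane translation by (+2, -1) written as a homography.
H_shift = np.array([[1.0, 0.0, 2.0],
                    [0.0, 1.0, -1.0],
                    [0.0, 0.0, 1.0]])
err_good = transfer_error(H_shift, (3.0, 4.0), (5.0, 3.0))  # match agrees with H_shift
err_bad = transfer_error(H_shift, (3.0, 4.0), (5.0, 6.0))   # match is off by 3 pixels
```

A match whose error is small under H_i supports that homography; a large error suggests the match belongs to a differently moving superpixel.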

In one possible implementation, determining, according to the feature point matching result M and the superpixel segmentation result S, the homography matrix H corresponding to each superpixel and the motion relation R_e between adjacent superpixels includes: for any superpixel i, optimizing a first energy function E(H, R_e) to obtain the homography matrix H_i corresponding to superpixel i and the motion relation R_e(i, j) between superpixel i and an adjacent superpixel j; wherein the first energy function E(H, R_e) includes: a first optimization term E_data(H_i) related to the homography matrix H_i corresponding to superpixel i; a second optimization term E_pair(H_i, H_j, R_e(i, j)) related to the homography matrix H_i corresponding to superpixel i, the motion relation R_e(i, j) between superpixel i and the adjacent superpixel j, and the homography matrix H_j corresponding to the adjacent superpixel j; and a constant term E_o(R_e(i, j)) related to the motion relation R_e(i, j) between superpixel i and the adjacent superpixel j.

In one possible implementation, the homography matrix H_j corresponding to the adjacent superpixel j is an optimized homography matrix.

For example, the first energy function E(H, R_e) is expressed as follows:

The homography matrix H_i corresponding to superpixel i and the motion relation R_e(i, j) between superpixel i and the adjacent superpixel j are determined by optimizing the first energy function, where E is the set of adjacent superpixel pairs corresponding to superpixel i, and λ_s1 and λ_s2 are weight coefficients.

In the first energy function E(H, R_e), the first optimization term E_data(H_i), related to the homography matrix H_i corresponding to the superpixel, is used to fit the feature point matching result M, and is expressed as follows:

where |Z_i| is a normalization parameter, S_u(S_i) is the support matching of superpixel i, τ_g is a threshold parameter, and ω_c is expressed as follows:

ω_c(S_i, p_l) = exp(−ω_s(S_i, S_j)/γ), p_l ∈ S_j.

where ω_s is the geodesic distance between the centers of superpixel i and superpixel j, and γ is a constant parameter.
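The weight ω_c can be evaluated directly from its definition. The following minimal sketch assumes the geodesic distance ω_s(S_i, S_j) has already been computed elsewhere and is passed in as a number; the function name `support_weight` is hypothetical.

```python
import math

def support_weight(geodesic_distance, gamma):
    """omega_c = exp(-omega_s / gamma) for a match p_l lying in superpixel S_j."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return math.exp(-geodesic_distance / gamma)

# Matches inside S_i itself (distance 0) get full weight; farther superpixels get less.
w_same = support_weight(0.0, gamma=2.0)
w_near = support_weight(2.0, gamma=2.0)
w_far = support_weight(8.0, gamma=2.0)
```

Because the weight decays exponentially with distance, matches from distant superpixels contribute little to the support of H_i, and γ controls how quickly that influence falls off.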

In the first energy function E(H, R_e), the second optimization term E_pair(H_i, H_j, R_e(i, j)), related to the homography matrix H_i corresponding to superpixel i, the motion relation R_e(i, j) between superpixel i and the adjacent superpixel j, and the homography matrix H_j corresponding to the adjacent superpixel j, is expressed as follows:

In the first energy function E(H, R_e), the constant term E_o(R_e(i, j)), related to the motion relation R_e(i, j) between superpixel i and the adjacent superpixel j, is expressed as follows:

When the image frame set includes a plurality of image frames, the first energy function E(H, R_e) further includes a prior term E_mul(R_e(i, j)) related to the motion relation R_e(i, j) between superpixel i and the adjacent superpixel j. In this case, the first energy function E(H, R_e) is expressed as follows:

where the prior term E_mul(R_e(i, j)) is added according to the tracked superpixel relationships, and is expressed as follows:

where R_p(i, j) is the tracked prior motion relationship,

indicates how many frames the relationship holds, and τ_m is a threshold parameter.

Determining the homography matrix H_i corresponding to superpixel i and the motion relation R_e(i, j) between superpixel i and the adjacent superpixel j by optimizing the first energy function E(H, R_e) includes:

First, the homography matrix H_i is randomly initialized. H_i may be initialized from the feature point matching result M, or by other methods; the present disclosure does not specifically limit this.

Second, the second optimization term E_pair(H_i, H_j, R_e(i, j)) is determined according to the optimized homography matrix H_j corresponding to the adjacent superpixel, and the support vector, which treats all of the point matches involved as local, is optimized.

Finally, following a fast propagation method, the homography matrix H_i is optimized on the new support vector set by randomly generating model parameters.

The optimization process may be ended after a set number of optimization iterations, or when the homography matrix H_i converges, or in other ways; the present disclosure does not specifically limit this.
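The two stopping rules mentioned above (a fixed number of iterations, or convergence) can be sketched on a toy problem. The code below is a generic randomized refinement loop, not the propagation scheme of the disclosure: the proposal rule (Gaussian perturbation), the convergence test (no improvement for `patience` consecutive proposals), the toy energy, and all names are assumptions made for illustration.

```python
import random

def refine(energy, x0, max_iters=200, tol=1e-9, patience=30, step=0.5, seed=0):
    """Randomly propose parameters and keep a proposal only if it lowers the energy.

    Stops after max_iters proposals, or earlier once `patience` consecutive
    proposals fail to improve the energy by more than tol.
    """
    rng = random.Random(seed)
    x, e = list(x0), energy(x0)
    stalled = 0
    iters = 0
    while iters < max_iters and stalled < patience:
        iters += 1
        candidate = [v + rng.gauss(0.0, step) for v in x]
        e_candidate = energy(candidate)
        if e - e_candidate > tol:
            x, e, stalled = candidate, e_candidate, 0
        else:
            stalled += 1
    return x, e, iters

def toy_energy(x):
    """Stand-in for an energy function; its minimum is at x = (1, 1)."""
    return sum((v - 1.0) ** 2 for v in x)

x_opt, e_opt, n_iters = refine(toy_energy, [4.0, -2.0])
```

In a real system the parameter vector would be the entries of H_i and the energy would be E(H, R_e); only the accept-if-better structure and the two termination conditions carry over.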

By optimizing the first energy function E(H, R_e) as described above, the homography matrix H corresponding to each superpixel and the motion relation R_e between adjacent superpixels can be obtained.

In a possible implementation, determining the spatial relationship R_s between adjacent superpixels according to the motion relation R_e between adjacent superpixels includes: determining the relationship type of the motion relation R_e between adjacent superpixels as the relationship type of the spatial relationship R_s between adjacent superpixels, wherein the relationship types include Coplanar, Hinge and Crack.

In practical applications, the motion relationship and the spatial relationship between adjacent superpixels are in one-to-one correspondence. Therefore, after the motion relation R_e between adjacent superpixels is obtained, the relationship type of R_e is determined as the relationship type of the spatial relationship R_s between the adjacent superpixels. That is, when the relationship type of the motion relation R_e between adjacent superpixels is Coplanar, the relationship type of the spatial relationship R_s is also Coplanar; when the relationship type of R_e is Hinge, the relationship type of R_s is also Hinge; and when the relationship type of R_e is Crack, the relationship type of R_s is also Crack.
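Because the mapping from motion relation type to spatial relation type is the identity, it can be written in a few lines. The sketch below assumes relations are stored in a dictionary keyed by superpixel pair; the `RelationType` enum and the function name `spatial_from_motion` are hypothetical.

```python
from enum import Enum

class RelationType(Enum):
    COPLANAR = "Coplanar"
    HINGE = "Hinge"
    CRACK = "Crack"

def spatial_from_motion(motion_relations):
    """Copy the type of each motion relation R_e(i, j) to the spatial relation R_s(i, j)."""
    return {pair: RelationType(rel.value) for pair, rel in motion_relations.items()}

R_e = {(0, 1): RelationType.COPLANAR,
       (1, 2): RelationType.HINGE,
       (2, 3): RelationType.CRACK}
R_s = spatial_from_motion(R_e)
```

Returning a new dictionary keeps R_e and R_s as separate objects, so later stages can modify one without silently changing the other.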

Still taking FIG. 2 as an example, the relation analysis module outputs the homography matrix H corresponding to each superpixel and the spatial relationship R_s between adjacent superpixels to the motion selection module.

In one possible implementation, determining the plane parameter θ of each superpixel according to the spatial relationship R_s between adjacent superpixels includes: determining a reference superpixel set S_t among the plurality of superpixels based on a preset algorithm, wherein the reference superpixel set S_t corresponds to a background part of the target image frame; setting the scale factor s of each reference superpixel in the reference superpixel set S_t to 1; and determining the plane parameter θ of each superpixel according to the reference superpixel set S_t and the spatial relationship R_s between adjacent superpixels.

Still taking FIG. 2 as an example, the motion selection module determines the reference superpixel set S_t among the plurality of superpixels based on a preset algorithm, and decomposes the homography matrix H_i corresponding to each superpixel to obtain the inverse depth d_i and the plane normal vector n_i corresponding to each superpixel. It then outputs the reference superpixel set S_t, the spatial relationship R_s between adjacent superpixels, and the inverse depth d_i and plane normal vector n_i of each superpixel to the reconstruction module, so that the reconstruction module can determine the plane parameter θ of each superpixel.

In one possible implementation, determining the plane parameter θ of each superpixel according to the reference superpixel set S_t and the spatial relationship R_s between adjacent superpixels includes: for any superpixel i, optimizing a second energy function E(θ, s) to obtain the plane parameter θ_i and the scale factor s_i of superpixel i; wherein the second energy function E(θ, s) includes: a third optimization term E_fit(θ_i, s_i) related to the plane parameter θ_i and the scale factor s_i of superpixel i; a fourth optimization term E_Rel(θ_i, θ_j) related to the plane parameter θ_i of superpixel i and the plane parameter θ_j of an adjacent superpixel j of superpixel i; and a constant term E_occ(s_i) related to the scale factor s_i of superpixel i.

In one possible implementation, when the relationship types of the spatial relationships R_s(i, j) between superpixel i and all of its adjacent superpixels j are Crack, the second energy function E(θ, s) further includes: a fifth optimization term E_pair(θ_i, θ_k) related to the plane parameter θ_i of superpixel i and the plane parameter θ_k of a superpixel k, wherein superpixel k is a superpixel outside the superpixel set S_r, the superpixels in the superpixel set S_r can reach the reference superpixel set S_t via a target path, and every superpixel relation on the target path is of type Coplanar or Hinge.

In a possible implementation, the plane parameter θ_j of the adjacent superpixel j and the plane parameter θ_k of superpixel k are optimized plane parameters.

For example, the second energy function E (θ, s) is expressed as follows:

where λ_r1, λ_r2 and λ_r3 are weight coefficients.

In the second energy function E(θ, s), the fourth optimization term E_Rel(θ_i, θ_j), related to the plane parameter θ_i of superpixel i and the plane parameter θ_j of the adjacent superpixel j of superpixel i, is expressed as follows:

In the second energy function E(θ, s), the constant term E_occ(s_i), related to the scale factor s_i of superpixel i, is expressed as follows:

E_occ(s_i) = δ(s_i ≠ 1).

wherein,

In the second energy function E(θ, s), the third optimization term E_fit(θ_i, s_i), related to the plane parameter θ_i and the scale factor s_i of superpixel i, is mainly used to preserve local characteristics, and is expressed as follows:

where τ_r is a threshold parameter.

S_r is defined as the set of all superpixels that can reach the reference superpixel set S_t. A superpixel that is not in S_t is defined as able to reach S_t if and only if a target path can be found from that superpixel to S_t such that every superpixel relation on the target path is of type Coplanar or Hinge. On this basis, the following is defined:

E_c = {(i, k) | (i, k) ∈ E, S_i ∈ S_r, S_k ∈ S \ S_r}.
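The reachability definition lends itself to a breadth-first search over the superpixel adjacency graph, keeping only Coplanar and Hinge edges. The sketch below is one way to compute S_r and E_c under two assumptions that the text does not state explicitly: relations are given as a dictionary from superpixel pairs to type names, and the reference superpixels in S_t are themselves counted as members of S_r. The function names are hypothetical.

```python
from collections import deque

def reachable_set(reference, relations):
    """Superpixels connected to the reference set through Coplanar/Hinge relations only."""
    graph = {}
    for (i, j), rel in relations.items():
        if rel in ("Coplanar", "Hinge"):  # Crack edges never lie on a target path
            graph.setdefault(i, set()).add(j)
            graph.setdefault(j, set()).add(i)
    seen = set(reference)
    queue = deque(reference)
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def cross_edges(relations, s_r):
    """E_c: adjacent pairs (i, k) with superpixel i inside S_r and superpixel k outside it."""
    edges = set()
    for i, j in relations:
        if i in s_r and j not in s_r:
            edges.add((i, j))
        elif j in s_r and i not in s_r:
            edges.add((j, i))
    return edges

# A chain 0 - 1 - 2 - 3 - 4 in which the only Crack relation is between 2 and 3.
relations = {(0, 1): "Coplanar", (1, 2): "Hinge", (2, 3): "Crack", (3, 4): "Coplanar"}
S_r = reachable_set({0}, relations)
E_c = cross_edges(relations, S_r)
```

In this example superpixels 3 and 4 are cut off from the reference set by the Crack relation, so the single pair in `E_c` is where a term such as E_pair(θ_i, θ_k) would act.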

In the second energy function E(θ, s), the fifth optimization term E_pair(θ_i, θ_k), related to the plane parameter θ_i of superpixel i and the plane parameter θ_k of superpixel k, is expressed as follows:

The fifth optimization term E_pair(θ_i, θ_k) ensures that the plane parameters of all superpixels are covered by the second energy function E(θ, s) described above.

Determining the plane parameter θ_i and the scale factor s_i of superpixel i by optimizing the second energy function E(θ, s) includes:

First, the plane parameter θ_i is randomly initialized.

Second, the fourth optimization term E_Rel(θ_i, θ_j) and the fifth optimization term E_pair(θ_i, θ_k) are determined according to the optimized plane parameters θ_j and θ_k of the corresponding superpixels.

Finally, following a fast propagation method, the plane parameter θ_i is optimized by randomly generating model parameters.

The optimization process may be ended after a set number of optimization iterations, or when the plane parameter θ_i converges, or in other ways; the present disclosure does not specifically limit this.

In one possible implementation, performing depth reconstruction on the dynamic scene corresponding to the image frame set according to the homography matrix H corresponding to each superpixel and the plane parameter θ of each superpixel includes: decomposing the homography matrix H corresponding to each superpixel to obtain the camera rotation parameter R₀ and the camera translation parameter t₀ corresponding to each superpixel; and constructing a dense depth map of the dynamic scene corresponding to the image frame set according to the camera rotation parameter R₀ and the camera translation parameter t₀ corresponding to each superpixel and the plane parameter θ of each superpixel.

Still taking FIG. 2 as an example, the motion selection module decomposes the homography matrix H_i corresponding to each superpixel to obtain the camera rotation parameter R₀ and the camera translation parameter t₀ corresponding to each superpixel, and outputs them to the reconstruction module. The reconstruction module then constructs a dense depth map D of the dynamic scene corresponding to the image frame set according to the camera rotation parameter R₀ and the camera translation parameter t₀ corresponding to each superpixel and the plane parameter θ of each superpixel. The optimization module further optimizes D to obtain an optimized dense depth map D_R. FIG. 3 shows a schematic diagram of the result of depth reconstruction using two consecutive image frames according to an embodiment of the present disclosure.
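To illustrate how per-superpixel plane parameters can be turned into a dense depth map, the sketch below assumes a particular plane parameterization that the text does not spell out: each plane satisfies θᵀX = 1 for 3-D points X in the camera frame, so the depth at a pixel with homogeneous coordinates p̃ is 1 / (θᵀ K⁻¹ p̃), where K is the camera intrinsic matrix. This parameterization, the intrinsic matrix K, and the function name `dense_depth` are assumptions, and the sketch omits the roles of R₀ and t₀.

```python
import numpy as np

def dense_depth(labels, thetas, K):
    """Per-pixel depth from per-superpixel planes, assuming theta . X = 1 on each plane.

    labels: (h, w) integer superpixel label map
    thetas: (num_superpixels, 3) plane parameters, one row per label
    K:      (3, 3) camera intrinsic matrix
    Pixels whose viewing ray is parallel to their plane would divide by zero;
    that case is not handled in this sketch.
    """
    h, w = labels.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([us, vs, np.ones_like(us)], axis=-1).astype(float)  # (h, w, 3)
    rays = pixels @ np.linalg.inv(K).T           # K^-1 applied to every pixel
    inv_depth = np.einsum("hwc,hwc->hw", rays, thetas[labels])
    return 1.0 / inv_depth

# Two fronto-parallel planes: superpixel 0 at depth 2, superpixel 1 at depth 4.
labels = np.array([[0, 0, 1],
                   [0, 1, 1]])
thetas = np.array([[0.0, 0.0, 0.5],
                   [0.0, 0.0, 0.25]])
D = dense_depth(labels, thetas, np.eye(3))
```

With fronto-parallel planes the depth is constant within each superpixel; tilted planes (non-zero first or second component of θ) would produce a depth ramp across the superpixel.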

Fig. 4 shows a schematic structural diagram of a depth reconstruction apparatus based on superpixel relationship analysis according to an embodiment of the present disclosure. As shown in fig. 4, the apparatus 40 includes:

a first determining module 41, configured to determine a feature point matching result M between adjacent image frames in an image frame set and a super-pixel segmentation result S of a target image frame, where the image frame set includes at least two image frames, the target image frame is one frame in the image frame set, and the super-pixel segmentation result S includes multiple super-pixels;

a second determining module 42, configured to determine a homography matrix H corresponding to each super pixel and a motion relation R_e between adjacent super pixels according to the feature point matching result M and the super pixel segmentation result S;

a third determining module 43, configured to determine a spatial relationship R_s between adjacent superpixels according to the motion relation R_e between adjacent superpixels;

a fourth determining module 44, configured to determine a plane parameter θ of each super pixel according to the spatial relationship R_s between adjacent superpixels; and

a reconstruction module 45, configured to perform depth reconstruction on the dynamic scene corresponding to the image frame set according to the homography matrix H corresponding to each super pixel and the plane parameter θ of each super pixel.

In one possible implementation, the second determining module 42 is specifically configured to:

for any superpixel i, obtain the homography matrix H_i corresponding to the superpixel i and the motion relation R_e(i, j) between the superpixel i and an adjacent superpixel j by optimizing a first energy function E(H, R_e);

where the first energy function E(H, R_e) includes: a first optimization term E_data(H_i) related to the homography matrix H_i corresponding to the superpixel i; a second optimization term E_pair(H_i, H_j, R_e(i, j)) related to the homography matrix H_i corresponding to the superpixel i, the motion relation R_e(i, j) between the superpixel i and the adjacent superpixel j, and the homography matrix H_j corresponding to the adjacent superpixel j; and a constant term E_o(R_e(i, j)) related to the motion relation R_e(i, j) between the superpixel i and the adjacent superpixel j.
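
One way to read how the three terms combine is the schematic sum below. It is an assumed form for illustration only: the set 𝒩 of adjacent superpixel pairs and the weights λ₁ and λ₂ are hypothetical symbols introduced here, and the concrete definitions of E_data, E_pair and E_o are those given in the method embodiment rather than anything implied by this sketch.

```latex
E(H, R_e) \;=\; \sum_{i} E_{data}(H_i)
\;+\; \lambda_1 \sum_{(i,j) \in \mathcal{N}} E_{pair}\big(H_i, H_j, R_e(i,j)\big)
\;+\; \lambda_2 \sum_{(i,j) \in \mathcal{N}} E_{o}\big(R_e(i,j)\big)
```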

In a possible implementation manner, the third determining module 43 is specifically configured to:

determine the relationship type of the motion relation R_e between adjacent superpixels as the spatial relationship R_s between the adjacent superpixels, where the relationship types include: Coplanar, Hinge, and Crack.

In one possible implementation, the fourth determining module 44 includes:

a first determining submodule, configured to determine a reference superpixel set S_t among the plurality of superpixels based on a preset algorithm, where the reference superpixel set S_t corresponds to a background portion in the target image frame;

a second determining submodule, configured to set the scale factor s of each reference superpixel in the reference superpixel set S_t to 1; and

a third determining submodule, configured to determine the plane parameter θ of each superpixel according to the reference superpixel set S_t and the spatial relationship R_s between adjacent superpixels.

In a possible implementation manner, the third determining submodule is specifically configured to:

for any superpixel i, obtain the plane parameter θ_i and the scale factor s_i of the superpixel i by optimizing a second energy function E(θ, s);

where the second energy function E(θ, s) includes: a third optimization term E_fit(θ_i, s_i) related to the plane parameter θ_i and the scale factor s_i of the superpixel i; a fourth optimization term E_Rel(θ_i, θ_j) related to the plane parameter θ_i of the superpixel i and the plane parameter θ_j of an adjacent superpixel j corresponding to the superpixel i; and a constant term E_occ(s_i) related to the scale factor s_i of the superpixel i.

In one possible implementation, when the relationship types of the spatial relationships R_s(i, j) between a superpixel i and all adjacent superpixels j are all Crack, the second energy function E(θ, s) further includes: a fifth optimization term E_pair(θ_i, θ_k) related to the plane parameter θ_i of the superpixel i and the plane parameter θ_k of a superpixel k, where the superpixel k is another superpixel in a superpixel set S_r, the superpixel set S_r can reach the reference superpixel set S_t via a target path, and the relationship type of each superpixel relationship on the target path is Coplanar or Hinge.
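
The reachability condition on the superpixel set S_r can be illustrated with a breadth-first search over the superpixel adjacency graph. The sketch below is an assumption-laden illustration, not the disclosed procedure: the function names, the string encoding of the relationship types, and the choice to return the reference superpixels together with the reached ones are all hypothetical.

```python
from collections import deque

# Relationship types that keep two adjacent superpixels geometrically linked.
CONNECTING_TYPES = {"Coplanar", "Hinge"}


def reachable_superpixels(reference_set, relations):
    """Return all superpixels linked to the reference set by Coplanar/Hinge paths.

    reference_set: iterable of superpixel ids (the set S_t).
    relations: dict mapping an (i, j) pair of adjacent superpixel ids to
        "Coplanar", "Hinge" or "Crack" (hypothetical encoding of R_s).
    """
    adjacency = {}
    for (i, j), kind in relations.items():
        if kind in CONNECTING_TYPES:
            adjacency.setdefault(i, set()).add(j)
            adjacency.setdefault(j, set()).add(i)

    reached = set(reference_set)
    queue = deque(reached)
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor not in reached:
                reached.add(neighbor)
                queue.append(neighbor)
    return reached


def is_isolated_by_cracks(superpixel, relations):
    """True if every relation involving the superpixel is of type Crack."""
    kinds = [kind for pair, kind in relations.items() if superpixel in pair]
    return bool(kinds) and all(kind == "Crack" for kind in kinds)
```

Under this reading, a superpixel for which `is_isolated_by_cracks` holds would draw its additional pairwise terms from superpixels k in the set returned by `reachable_superpixels`.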

In one possible implementation, the reconstruction module 45 is specifically configured to:

decompose the homography matrix H corresponding to each super pixel to obtain the camera rotation parameter R₀ and the camera translation parameter t₀ corresponding to each super pixel; and

construct a dense depth map of the dynamic scene corresponding to the image frame set according to the camera rotation parameter R₀ and the camera translation parameter t₀ corresponding to each super pixel and the plane parameter θ of each super pixel.

The apparatus 40 provided in the present disclosure can implement each step of the method embodiments shown in fig. 1 and/or fig. 2 and achieve the same technical effects; to avoid repetition, details are not described here again.

Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen in order to best explain the principles of the embodiments, the practical application, or technical improvements to the techniques in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims

1. A depth reconstruction method based on superpixel relationship analysis is characterized by comprising the following steps:

determining a feature point matching result M between adjacent image frames in an image frame set and a super-pixel segmentation result S of a target image frame, wherein the image frame set comprises at least two image frames, the target image frame is one frame in the image frame set, and the super-pixel segmentation result S comprises a plurality of super-pixels;

determining a homography matrix H corresponding to each super pixel and a motion relation R_e between adjacent super pixels according to the feature point matching result M and the super pixel segmentation result S;

determining a spatial relationship R_s between said adjacent super pixels according to the motion relation R_e between said adjacent super pixels;

determining a plane parameter θ of each super pixel according to the spatial relationship R_s between said adjacent super pixels;

according to the homography matrix H corresponding to each super pixel and the plane parameter theta of each super pixel, carrying out depth reconstruction on the dynamic scene corresponding to the image frame set;

wherein said determining a homography matrix H corresponding to each super pixel and a motion relation R_e between adjacent super pixels according to the feature point matching result M and the super pixel segmentation result S comprises:

for any superpixel i, obtaining a homography matrix H_i corresponding to said superpixel i and a motion relation R_e(i, j) between said superpixel i and an adjacent superpixel j by optimizing a first energy function E(H, R_e);

wherein the first energy function E(H, R_e) comprises: a first optimization term E_data(H_i) related to the homography matrix H_i corresponding to said superpixel i; a second optimization term E_pair(H_i, H_j, R_e(i, j)) related to the homography matrix H_i corresponding to said superpixel i, the motion relation R_e(i, j) between said superpixel i and said adjacent superpixel j, and a homography matrix H_j corresponding to said adjacent superpixel j; and a constant term E_o(R_e(i, j)) related to the motion relation R_e(i, j) between said superpixel i and said adjacent superpixel j.

2. The method of claim 1, wherein the homography matrix H_j corresponding to said adjacent superpixel j is an optimized homography matrix.

3. The method according to claim 1, wherein said determining a spatial relationship R_s between said adjacent superpixels according to the motion relation R_e between said adjacent superpixels comprises:

determining the relationship type of the motion relation R_e between said adjacent superpixels as the spatial relationship R_s between said adjacent superpixels, wherein the relationship types include: Coplanar, Hinge, and Crack.

4. The method according to claim 3, wherein said determining a plane parameter θ of each superpixel according to the spatial relationship R_s between said adjacent superpixels comprises:

determining a reference superpixel set S_t among the plurality of superpixels based on a preset algorithm, wherein the reference superpixel set S_t corresponds to a background portion in the target image frame;

setting the scale factor s of each reference superpixel in the reference superpixel set S_t to 1;

determining the plane parameter θ of each superpixel according to the reference superpixel set S_t and the spatial relationship R_s between said adjacent superpixels.

5. The method of claim 4, wherein said determining the plane parameter θ of each superpixel according to the reference superpixel set S_t and the spatial relationship R_s between said adjacent superpixels comprises:

for any superpixel i, obtaining a plane parameter θ_i and a scale factor s_i of said superpixel i by optimizing a second energy function E(θ, s);

wherein the second energy function E(θ, s) comprises: a third optimization term E_fit(θ_i, s_i) related to the plane parameter θ_i and the scale factor s_i of said superpixel i; a fourth optimization term E_Rel(θ_i, θ_j) related to the plane parameter θ_i of said superpixel i and a plane parameter θ_j of an adjacent superpixel j corresponding to said superpixel i; and a constant term E_occ(s_i) related to the scale factor s_i of said superpixel i.

6. The method according to claim 5, wherein when the relationship types of the spatial relationships R_s(i, j) between said superpixel i and all adjacent superpixels j are all Crack, the second energy function E(θ, s) further comprises: a fifth optimization term E_pair(θ_i, θ_k) related to the plane parameter θ_i of said superpixel i and a plane parameter θ_k of a superpixel k, wherein the superpixel k is a superpixel in a superpixel set S_r, the superpixel set S_r can reach the reference superpixel set S_t via a target path, and the relationship type of each superpixel relationship on the target path is Coplanar or Hinge.

7. The method of claim 6, wherein the plane parameter θ_j of said adjacent superpixel j and the plane parameter θ_k of said superpixel k are optimized plane parameters.

8. The method according to claim 1, wherein the depth reconstruction of the dynamic scene corresponding to the image frame set according to the homography H corresponding to each super pixel and the plane parameter θ of each super pixel comprises:

decomposing the homography matrix H corresponding to each super pixel to obtain a camera rotation parameter R₀ and a camera translation parameter t₀ corresponding to each super pixel;

constructing a dense depth map of the dynamic scene corresponding to the image frame set according to the camera rotation parameter R₀ and the camera translation parameter t₀ corresponding to each super pixel and the plane parameter θ of each super pixel.

9. A depth reconstruction device based on superpixel relationship analysis, comprising:

a first determining module, configured to determine a feature point matching result M between adjacent image frames in an image frame set and a superpixel segmentation result S of a target image frame, wherein the image frame set comprises at least two image frames, the target image frame is one frame in the image frame set, and the superpixel segmentation result S comprises a plurality of superpixels;

a second determining module, configured to determine a homography matrix H corresponding to each super pixel and a motion relation R_e between adjacent super pixels according to the feature point matching result M and the super pixel segmentation result S;

a third determining module, configured to determine a spatial relationship R_s between said adjacent super pixels according to the motion relation R_e between said adjacent super pixels;

a fourth determining module, configured to determine a plane parameter θ of each super pixel according to the spatial relationship R_s between said adjacent super pixels;

a reconstruction module, configured to perform depth reconstruction on the dynamic scene corresponding to the image frame set according to the homography matrix H corresponding to each super pixel and the plane parameter θ of each super pixel;

the second determining module is specifically configured to:

wherein the first energy function E(H, R_e) comprises: a first optimization term E_data(H_i) related to the homography matrix H_i corresponding to said superpixel i; a second optimization term E_pair(H_i, H_j, R_e(i, j)) related to the homography matrix H_i corresponding to said superpixel i, the motion relation R_e(i, j) between said superpixel i and an adjacent superpixel j, and a homography matrix H_j corresponding to said adjacent superpixel j; and a constant term E_o(R_e(i, j)) related to the motion relation R_e(i, j) between said superpixel i and said adjacent superpixel j.