CN102446366B - Time-space jointed multi-view video interpolation and three-dimensional modeling method - Google Patents

Time-space jointed multi-view video interpolation and three-dimensional modeling method Download PDF

Info

Publication number
CN102446366B
CN102446366B · CN201110271761A · CN 201110271761
Authority
CN
China
Prior art keywords
interpolation
frame
dimensional
camera
angle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 201110271761
Other languages
Chinese (zh)
Other versions
CN102446366A (en)
Inventor
李坤 (Li Kun)
杨敬钰 (Yang Jingyu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Lingyun Shixun Technology Co ltd
Luster LightTech Co Ltd
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN 201110271761 priority Critical patent/CN102446366B/en
Publication of CN102446366A publication Critical patent/CN102446366A/en
Application granted granted Critical
Publication of CN102446366B publication Critical patent/CN102446366B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention belongs to the technical field of computer multimedia. To provide a simple and practical multi-view video interpolation and three-dimensional modeling method, the invention adopts the following technical scheme: in a space-time joint multi-view video interpolation and three-dimensional modeling method, a multi-camera array is divided into interleaved groups and the three-dimensional model of the scene at every moment is reconstructed. The method specifically comprises the following steps: (1) obtaining the uncaptured frames between two captured frames by interpolation; (2) obtaining images of the uncaptured views at each capture moment with a model-assisted weighting method; (3) computing an accumulated energy spectrum and extracting key points; (4) describing the extracted key points with shape contexts and solving the matching with the Hungarian method; (5) obtaining the final interpolated frames by solving a Poisson editing optimization problem; and (6) reconstructing and rendering the three-dimensional model of the scene. The method is mainly applied to multi-view video acquisition and three-dimensional scene reconstruction.

Description

Space-time joint multi-view video interpolation and three-dimensional modeling method
Technical Field
The invention belongs to the technical field of computer multimedia, and particularly relates to a space-time joint multi-view video interpolation and three-dimensional modeling method.
Background
For a long time, the acquisition, processing and communication of single-channel video have achieved important breakthroughs in key technologies; single-channel video has matured and is widely applied in fields such as broadcast television, internet video and intelligent transportation. However, the traditional single-camera acquisition format cannot convey a sense of depth or perspective, nor an all-around view of an object (a variable viewing angle). Multi-channel video acquisition based on multi-camera systems, together with reconstruction of the scene objects, can achieve such all-around visual perception, and related research became a hotspot starting in the mid-1990s. Real-time three-dimensional scene acquisition and reconstruction based on multi-camera systems is widely applied in free-viewpoint video, virtual reality, immersive video conferencing, film entertainment, stereoscopic video, motion analysis and other fields. International universities and research institutions such as Stanford University, MIT, Carnegie Mellon University, Columbia University, Mitsubishi Electric Research Laboratories, Microsoft Research and the Max Planck Institute have all built various multi-camera acquisition systems for scene geometry capture, motion analysis and stereoscopic production. At present, acquisition and reconstruction based on multi-camera systems still struggles to deliver reconstruction results that satisfy users, because of problems such as camera setup and synchronization, camera-side storage and transmission, high-dimensional data processing and high-speed motion capture. One approach is to use several high-speed cameras to capture fast motion, but high-speed cameras are expensive and have limited storage capacity. Another approach is to use many cheap low-frame-rate cameras, group them reasonably, sample simultaneously within each group and in an interleaved fashion across groups, so that sparsely sampled space-time information is obtained, and then achieve a high-frame-rate reconstruction by interpolation. Stanford University (Wilburn B, Joshi N, Vaish V, et al. High-speed videography using a dense camera array. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 2004. 294-301.) achieved single-viewpoint high-speed scene reconstruction with a dense light-field array of 52 cameras running at 30 frames per second. However, the result of that method is limited to a single viewpoint; multi-view images at every moment cannot be obtained, and hence a three-dimensional model at every moment cannot be reconstructed. The first inventor of the present invention previously proposed a method (ZL200810103684.2) for modeling a high-speed moving object with an annular low-frame-rate camera array to achieve full-view three-dimensional reconstruction, but that method simply intersects visual hull models to perform interpolation and reconstruction, so the interpolation and reconstruction results are mediocre and not robust. Although existing video interpolation and image fusion methods could be used to obtain the uncaptured multi-view video, the results contain blurred or unsmooth regions.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a simple and practical multi-view video interpolation and three-dimensional modeling method. The technical scheme adopted is a space-time joint multi-view video interpolation and three-dimensional modeling method that groups a multi-camera array in an interleaved manner: n cameras with a frame rate of f frames/second are arranged and evenly divided into m interleaved groups, where n and m are positive integers and n is an integral multiple of m. The cameras within a group synchronously capture videos of n/m views at the same moments, while different groups capture at moments interleaved by a time offset of 1/(fm) seconds (a minimal scheduling sketch follows the step list below). The proposed method obtains videos of all n views at all moments and then reconstructs a three-dimensional model of the scene at every moment, and comprises the following steps:
1) for each camera, the forward and backward optical flows between two adjacent captured frames are computed with an optical flow method, and the uncaptured frames between the two frames, i.e. the temporally interpolated frames, are then interpolated;
2) for each capture moment, an image of each uncaptured view at that moment, i.e. a spatially interpolated frame, is obtained with a model-assisted weighting method;
3) the accumulated energy spectrum in the dual-tree discrete wavelet domain is computed for the temporally interpolated frames obtained in step 1) and the spatially interpolated frames obtained in step 2), and key points are extracted;
4) the extracted key points are described with shape contexts, the shape-context-based key point matching problem is converted into a square assignment, i.e. weighted bipartite graph matching, problem and solved with the Hungarian method;
5) the final interpolated frames are obtained by solving a Poisson editing optimization problem;
6) at each moment, the images of all views, both captured and interpolated, are used to reconstruct and render a three-dimensional model of the scene with a multi-view stereo method.
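The interleaved grouping and sampling schedule described above can be illustrated with a short sketch. This is an illustrative sketch only: the function name and the round-robin firing order are assumptions added here, not part of the invention as claimed.

```python
# Minimal sketch of the interleaved capture schedule described above:
# n cameras at f frames/second, split into m interleaved groups; the groups
# fire in turn with an offset of 1/(f*m) seconds, so the union of all groups
# samples the scene every 1/(f*m) seconds while each camera runs at only f fps.

def capture_schedule(n, m, f, duration_s):
    """Return a list of (time_in_seconds, group_index, camera_indices)."""
    assert n % m == 0, "n must be an integral multiple of m"
    group_cams = {g: list(range(g, n, m)) for g in range(m)}  # interleaved grouping
    dt = 1.0 / (f * m)                     # offset between consecutive groups
    events, t, k = [], 0.0, 0
    while t < duration_s:
        g = k % m                          # groups fire in round-robin order (assumed)
        events.append((t, g, group_cams[g]))
        t += dt
        k += 1
    return events

# Example matching the embodiment described later: 20 cameras at 30 fps in
# 4 groups, i.e. one group of 5 views fires every 1/120 second.
for t, g, cams in capture_schedule(20, 4, 30, 0.05):
    print(f"t = {t:.4f} s  group {g}  cameras {cams}")
```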
The model-assisted weighting method specifically comprises the following steps:
21) extracting a silhouette map of the three-dimensional object from each captured view image by simple differencing or a blue-screen segmentation technique;
22) reconstructing a rough three-dimensional model, namely a visual hull model, from the silhouettes computed in step 21) with the EPVH method;
23) for each uncaptured view i, performing weighted interpolation using the images of the two captured views j and k adjacent to view i, with the weights computed as in equation (1):
(1) [weight formula: available only as an image in the original]
where θ and φ are two constant angles denoting, respectively, the maximum allowed angle between camera lines of sight and the maximum allowed angle between the normal of a three-dimensional point and a camera line of sight; θ_1 is the angle between camera line of sight r_i and camera line of sight r_j, and θ_2 is the angle between r_i and r_k; φ_1 is the angle between the normal of the three-dimensional point p and the line of sight r_j, and φ_2 is the angle between the normal of p and the line of sight r_k; p is the intersection of the three-dimensional model with the line of sight through a pixel of the view-i image.
The accumulated energy spectrum of the dual-tree discrete wavelet domain is computed and the key points are extracted through the following steps:
31) performing the dual-tree discrete wavelet transform on the spatially interpolated frame and the temporally interpolated frame, decomposing each into S scales;
32) computing the key-point energy spectra {M_s}_{1≤s≤S} at every scale, separately for the real part and the imaginary part, where the key-point energy at each pixel position is

E(s) = α^s ( ∏_{n=1}^{6} c_n )^β    (2)

where c_1, …, c_6 are the six subband coefficients at the pixel for the real or imaginary part, and the parameters α and β adjust the importance of the scale in the accumulated energy spectrum;
33) interpolating the energy spectrum obtained in step 32) to the size of the original image with a two-dimensional Gaussian kernel, the interpolated spectrum at scale s being denoted g_s(M_s);
34) computing the accumulated energy spectra A_r and A_i of the real part and the imaginary part respectively [accumulation formula: available only as an image in the original], and obtaining the final accumulated energy spectrum A = √(A_r² + A_i²);
35) extracting key points from the accumulated energy spectrum obtained in step 34) with the SIFT method.
Based on the shape-context key point matching, the final interpolated frame is obtained by solving the following optimization problem:

Δf|_Ω = div v,  s.t.  f|_∂Ω = f̂|_∂Ω    (3)

where f is the unknown frame to be interpolated, Δ is the Laplacian operator, v = (u, v) is the gradient vector field of the temporally interpolated frame, div v is the divergence of v, f̂ is the spatially interpolated frame, ∂Ω is the boundary of the closed set Ω, s.t. means "subject to", |_Ω denotes restriction to the closed set Ω, and |_∂Ω denotes restriction to the boundary of Ω.
The method has the following characteristics and effects:
The method avoids the need for expensive high-speed cameras and the blurring and unsmoothness of existing video interpolation and image fusion methods, and achieves high-frame-rate multi-view video recovery and three-dimensional scene reconstruction under low-frame-rate camera acquisition through space-time sampling, space-time interpolation and space-time optimization. It has the following characteristics:
1. The procedure is simple and easy to implement.
2. Space-time sampling and interpolation for non-planar camera systems. The system of low-frame-rate cameras is grouped reasonably so that the cameras within each group are synchronized and evenly distributed, and different groups capture the dynamic scene in an interleaved fashion in the time domain. At each sampling moment, the views of the cameras that did not sample are spatially interpolated with a weighting method, and temporal interpolation is performed with a bidirectional optical flow method.
3. An optimization method based on shape contexts and the dual-tree discrete wavelet transform. The space-time information optimization is converted into a Poisson editing problem on the image. Key points of interest near edges (high-frequency information) are extracted using the approximate shift invariance and direction selectivity of the dual-tree discrete wavelet transform, and the key points are then matched with the shape context method to serve as boundary constraints.
The invention can use a low-frame-rate camera system to achieve temporally dense three-dimensional reconstruction of dynamic scenes. The proposed method scales well: multi-view video recovery and dynamic three-dimensional scene reconstruction at even higher temporal resolution can be achieved simply by adding more cameras or using cameras with higher frame rates.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart of the space-time joint multi-view video interpolation and three-dimensional modeling method according to an embodiment of the invention;
FIG. 2 shows the results of recovering an uncaptured frame of sequence 1 with the proposed method and with two other methods, according to an embodiment of the invention;
FIG. 3 shows the visual hull model reconstructed for sequence 1 and the three-dimensional model reconstructed with the proposed method, according to an embodiment of the invention;
FIG. 4 shows the dynamic three-dimensional model reconstructed for sequence 2 with the proposed method, according to an embodiment of the invention.
Detailed Description
The invention converts space-time joint multi-view video interpolation into a Poisson editing problem on the image and solves it, where the extraction and matching of key points exploit the directionality of the dual-tree discrete wavelet transform (DDWT) and the robustness of shape contexts, thereby achieving high-frame-rate, high-quality multi-view video interpolation. The results feature good interpolation quality, high accuracy, and an accurately and completely reconstructed three-dimensional model, all obtained with a low-frame-rate camera array sampled in an interleaved, group-by-group fashion.
The invention discloses a space-time joint multi-view video interpolation and three-dimensional modeling method, which is characterized as follows:
A multi-camera array is grouped in an interleaved manner (n cameras with a frame rate of f frames/second are arranged and evenly divided into m interleaved groups, where n and m are positive integers and n is an integral multiple of m); the cameras within a group synchronously capture videos of n/m views at the same moments, and different groups capture at moments interleaved by a time offset of 1/(fm) seconds. The proposed space-time joint multi-view video interpolation and three-dimensional modeling method obtains videos of all n views at all moments and then reconstructs a three-dimensional model of the scene at every moment, and comprises the following steps:
1) for each camera, the optical flow method proposed by Brox et al. (Brox T, Bruhn A, Papenberg N, et al. High accuracy optical flow estimation based on a theory for warping. Proceedings of European Conference on Computer Vision, Vol. 3024, 2004. 25-36.) is used to compute the forward and backward optical flows between two adjacent captured frames, and the m-1 frames between the two frames are then interpolated; the temporal interpolation may specifically comprise the following sub-steps (a minimal code sketch follows these sub-steps):
11) assuming linear motion paths, i.e. the position of a pixel of the frame to be interpolated along its motion path is proportional to the relative temporal position of that frame between the two nearest captured frames, the forward-flow and backward-flow displacements of each pixel of the frame to be interpolated are computed from the forward and backward optical flow results;
12) for each pixel of the frame to be interpolated, the average of the forward-flow and backward-flow estimates is taken as the final estimate;
13) pixel holes that remain unassigned in the frame to be interpolated are filled with an eight-neighborhood filtering method;
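A minimal sketch of sub-steps 11)-13) follows, assuming the forward and backward flow fields have already been estimated (the patent uses the Brox et al. method; any dense optical-flow estimator would serve for the sketch). The forward splatting, averaging and 3×3 mean hole filling are simplified stand-ins for the exact procedure, and all names are illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def interpolate_frame(frame0, frame1, flow_fwd, flow_bwd, t):
    """Interpolate a single-channel frame at relative position t in (0, 1)
    between two captured frames, assuming linear motion paths.
    flow_fwd maps frame0 -> frame1 and flow_bwd maps frame1 -> frame0,
    both of shape (H, W, 2) in (dx, dy) order."""
    h, w = frame0.shape
    acc = np.zeros((h, w), dtype=np.float64)
    wgt = np.zeros((h, w), dtype=np.float64)
    ys, xs = np.mgrid[0:h, 0:w]

    def splat(src, flow, alpha):
        # 11) forward-splat each source pixel to its position on the linear motion path
        xt = np.clip(np.rint(xs + alpha * flow[..., 0]), 0, w - 1).astype(int)
        yt = np.clip(np.rint(ys + alpha * flow[..., 1]), 0, h - 1).astype(int)
        np.add.at(acc, (yt, xt), src.astype(np.float64))
        np.add.at(wgt, (yt, xt), 1.0)

    splat(frame0, flow_fwd, t)          # estimate from the forward optical flow
    splat(frame1, flow_bwd, 1.0 - t)    # estimate from the backward optical flow

    filled = wgt > 0
    out = np.zeros_like(acc)
    out[filled] = acc[filled] / wgt[filled]   # 12) average of the two estimates

    # 13) fill unassigned pixel holes from their neighbourhood
    # (a 3x3 mean filter stands in for the eight-neighbourhood filtering)
    if (~filled).any():
        out[~filled] = uniform_filter(out, size=3)[~filled]
    return out
```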
2) for each capture moment, an image of each uncaptured view at that moment is obtained with a model-assisted weighting method, which may specifically comprise the following sub-steps (a minimal code sketch follows these sub-steps):
21) extracting a silhouette map of the three-dimensional object from each captured view image by simple differencing or a blue-screen segmentation technique;
22) reconstructing a rough three-dimensional model, namely a visual hull model, from the silhouettes computed in step 21) with the EPVH method (Exact Polyhedral Visual Hulls; Franco J-S, Boyer E. Exact polyhedral visual hulls. Proceedings of British Machine Vision Conference, 2003. 329-338);
23) for each uncaptured view i, performing weighted interpolation using the images of the two captured views j and k adjacent to view i, with the weights computed as in equation (1):
(1) [weight formula: available only as an image in the original]
where θ and φ are two constant angles denoting, respectively, the maximum allowed angle between camera lines of sight and the maximum allowed angle between the normal of a three-dimensional point and a camera line of sight; θ_1 is the angle between camera line of sight r_i and camera line of sight r_j, and θ_2 is the angle between r_i and r_k; φ_1 is the angle between the normal of the three-dimensional point p and the line of sight r_j, and φ_2 is the angle between the normal of p and the line of sight r_k; p is the intersection of the line of sight through a pixel of the view-i image with the three-dimensional model;
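Because equation (1) is available only as an image, the weight used in the sketch below is a hypothetical one: it falls linearly to zero as θ_1 or θ_2 reaches the limit θ and as φ_1 or φ_2 reaches the limit φ, which matches the role the text assigns to those constants but is not necessarily the patented expression. The angle computations follow the definitions above; all names are illustrative.

```python
import numpy as np

def view_weights(p, normal_p, cam_i, cam_j, cam_k,
                 theta_max_deg=80.0, phi_max_deg=70.0):
    """Blending weights for an uncaptured view i from its two neighbouring
    captured views j and k.  p is the intersection of the visual hull with the
    line of sight through the pixel, normal_p its surface normal, and cam_*
    are camera centres (all 3-vectors).  The linear fall-off below is an
    assumption standing in for the patent's equation (1)."""
    def angle_deg(a, b):
        c = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))

    r_i, r_j, r_k = p - cam_i, p - cam_j, p - cam_k            # camera lines of sight to p
    theta1, theta2 = angle_deg(r_i, r_j), angle_deg(r_i, r_k)  # angles between lines of sight
    phi1, phi2 = angle_deg(normal_p, -r_j), angle_deg(normal_p, -r_k)  # normal vs. line of sight

    def w(theta, phi):  # hypothetical weight, zero beyond the angle limits
        return max(0.0, 1.0 - theta / theta_max_deg) * max(0.0, 1.0 - phi / phi_max_deg)

    wj, wk = w(theta1, phi1), w(theta2, phi2)
    s = wj + wk
    return (wj / s, wk / s) if s > 0 else (0.5, 0.5)           # normalised blend weights
```

The interpolated pixel is then the weighted combination of the colours that views j and k observe at the point p.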
3) the accumulated energy spectrum in the dual-tree discrete wavelet domain is computed for the temporally interpolated frame obtained in step 1) and the spatially interpolated frame obtained in step 2), and key points are extracted; this may specifically comprise the following sub-steps (a minimal code sketch follows these sub-steps):
31) performing the dual-tree discrete wavelet transform on the spatially interpolated frame and the temporally interpolated frame, decomposing each into S scales;
32) computing the key-point energy spectra {M_s}_{1≤s≤S} at every scale, separately for the real part and the imaginary part, where the key-point energy at each pixel position is

E(s) = α^s ( ∏_{n=1}^{6} c_n )^β    (2)

where c_1, …, c_6 are the six subband coefficients at the pixel for the real or imaginary part, and the parameters α and β adjust the importance of the scale in the accumulated energy spectrum;
33) interpolating the energy spectrum obtained in step 32) to the size of the original image with a two-dimensional Gaussian kernel, the interpolated spectrum at scale s being denoted g_s(M_s);
34) computing the accumulated energy spectra A_r and A_i of the real part and the imaginary part respectively [accumulation formula: available only as an image in the original], and obtaining the final accumulated energy spectrum A = √(A_r² + A_i²);
35) extracting key points from the accumulated energy spectrum obtained in step 34) with the Scale-Invariant Feature Transform (SIFT) method;
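A minimal sketch of sub-steps 31)-35) is given below. It assumes the third-party dtcwt package for the dual-tree transform and OpenCV for resizing and SIFT; the β value, the use of absolute subband values, and the Gaussian-smoothing-plus-resize stand-in for the two-dimensional Gaussian-kernel interpolation are assumptions, since the corresponding parts of the original are given only as formula images.

```python
import numpy as np
import cv2
import dtcwt                                   # third-party dual-tree complex wavelet package (assumed)
from scipy.ndimage import gaussian_filter

def accumulated_energy_spectrum(image, S=3, alpha=1.0, beta=1.0 / 6.0):
    """Accumulated key-point energy spectrum over S scales of the dual-tree
    discrete wavelet transform.  alpha = 1 follows the embodiment; beta = 1/6
    is an illustrative value only."""
    h, w = image.shape
    pyramid = dtcwt.Transform2d().forward(image.astype(float), nlevels=S)
    A_r = np.zeros((h, w), dtype=np.float32)
    A_i = np.zeros((h, w), dtype=np.float32)
    for s in range(1, S + 1):
        hp = pyramid.highpasses[s - 1]         # six complex oriented subbands at scale s
        for part, A in ((np.abs(hp.real), A_r), (np.abs(hp.imag), A_i)):
            # 32) E(s) = alpha^s * (product of the six subband coefficients)^beta
            #     (absolute values keep the product non-negative - an assumption)
            E = (alpha ** s) * np.prod(part, axis=-1) ** beta
            # 33) bring the scale-s spectrum back to the original image size
            E = gaussian_filter(E.astype(np.float32), sigma=1.0)
            A += cv2.resize(E, (w, h), interpolation=cv2.INTER_LINEAR)
    # 34) combine the real- and imaginary-part accumulations
    return np.sqrt(A_r ** 2 + A_i ** 2)

# 35) key points can then be extracted from the accumulated spectrum, e.g. with SIFT:
#   A = accumulated_energy_spectrum(gray_frame)
#   A8 = cv2.normalize(A, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
#   keypoints = cv2.SIFT_create().detect(A8, None)
```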
4) the extracted key points are described with shape contexts, the shape-context-based key point matching problem is converted into a square assignment (weighted bipartite graph matching) problem and solved with the Hungarian method (a minimal code sketch follows);
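A minimal sketch of step 4) follows. The log-polar shape-context histogram and the chi-squared matching cost are the standard formulation of shape contexts (bin counts and radii here are illustrative), and scipy's linear_sum_assignment stands in for the Hungarian solver of the square assignment problem.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment   # assignment solver standing in for the Hungarian method

def shape_context(points, n_r=5, n_theta=12):
    """Log-polar shape-context histogram for every keypoint."""
    pts = np.asarray(points, dtype=float)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    mean_d = d[d > 0].mean()                        # normalise radii by the mean distance
    r_edges = np.logspace(np.log10(0.125), np.log10(2.0), n_r + 1) * mean_d
    ang = np.arctan2(pts[None, :, 1] - pts[:, None, 1],
                     pts[None, :, 0] - pts[:, None, 0]) % (2 * np.pi)
    H = np.zeros((len(pts), n_r, n_theta))
    for i in range(len(pts)):
        for j in range(len(pts)):
            if i == j or d[i, j] >= r_edges[-1]:
                continue
            rb = max(np.searchsorted(r_edges, d[i, j]) - 1, 0)
            tb = min(int(ang[i, j] / (2 * np.pi) * n_theta), n_theta - 1)
            H[i, rb, tb] += 1
    H /= np.maximum(H.sum(axis=(1, 2), keepdims=True), 1)
    return H.reshape(len(pts), -1)

def match_keypoints(pts_a, pts_b):
    """Chi-squared cost between shape contexts, solved as a weighted
    bipartite (square assignment) matching."""
    Ha, Hb = shape_context(pts_a), shape_context(pts_b)
    cost = 0.5 * np.sum((Ha[:, None] - Hb[None, :]) ** 2 /
                        np.maximum(Ha[:, None] + Hb[None, :], 1e-9), axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows, cols))
```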
5) the final interpolated frame is obtained by solving the following optimization problem (a minimal code sketch follows):

Δf|_Ω = div v,  s.t.  f|_∂Ω = f̂|_∂Ω    (3)

where f is the unknown frame to be interpolated, Δ is the Laplacian operator, v = (u, v) is the gradient vector field of the temporally interpolated frame, div v is the divergence of v, f̂ is the spatially interpolated frame, and ∂Ω is the boundary of the closed set Ω.
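A minimal numerical sketch of equation (3) follows. For v = ∇(temporal frame), div v reduces to the Laplacian of the temporally interpolated frame; the spatially interpolated frame supplies the boundary values; plain Jacobi-style sweeps stand in for a proper sparse solver; and the key-point matches of step 4), which serve as additional constraints, are omitted from the sketch.

```python
import numpy as np

def poisson_interpolate(f_temporal, f_spatial, mask, n_iters=1000):
    """Solve  laplacian(f) = div v  inside Omega (mask == True) with
    f = f_spatial on the boundary, where v is the gradient field of the
    temporally interpolated frame (single-channel images)."""
    f = f_spatial.astype(np.float64).copy()
    t = f_temporal.astype(np.float64)
    b = np.zeros_like(f)                       # right-hand side: div v = laplacian(t)
    b[1:-1, 1:-1] = (t[:-2, 1:-1] + t[2:, 1:-1] + t[1:-1, :-2] + t[1:-1, 2:]
                     - 4.0 * t[1:-1, 1:-1])
    inner = mask.copy()
    inner[0, :] = inner[-1, :] = inner[:, 0] = inner[:, -1] = False
    ys, xs = np.nonzero(inner)
    for _ in range(n_iters):                   # Jacobi-style sweeps over Omega
        f[ys, xs] = 0.25 * (f[ys - 1, xs] + f[ys + 1, xs] +
                            f[ys, xs - 1] + f[ys, xs + 1] - b[ys, xs])
    return f                                   # pixels outside Omega keep the spatial-frame values
```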
6) At each moment, the images of all views (captured images and interpolated images) are used to reconstruct and render a three-dimensional model of the scene with a multi-view stereo method.
The space-time joint multi-view video interpolation and three-dimensional modeling method provided by the invention is described in detail below with reference to the accompanying drawings and an embodiment.
The system of the embodiment is structured as follows: 20 cameras with a frame rate of 30 frames/second are arranged in a ring around the scene to be captured. The multi-camera array is evenly divided into 4 interleaved groups; the cameras within a group synchronously capture videos of 5 views at the same moments, and different groups capture at moments interleaved by a time offset of 1/120 second. The proposed space-time joint multi-view video interpolation and three-dimensional modeling method is used to obtain videos of all 20 views at every moment, and a three-dimensional model of the scene at every moment is then reconstructed. FIG. 1 shows the flow chart of the space-time joint multi-view video interpolation and three-dimensional modeling method according to the embodiment of the invention, which comprises the following steps:
1) for each camera, the optical flow method proposed by Brox et al. (Brox T, Bruhn A, Papenberg N, et al. High accuracy optical flow estimation based on a theory for warping. Proceedings of European Conference on Computer Vision, Vol. 3024, 2004. 25-36.) is used to compute the forward and backward optical flows between two adjacent captured frames, and the 3 frames between the two frames are then interpolated; the temporal interpolation specifically comprises the following sub-steps:
11) assuming linear motion paths, i.e. the position of a pixel of the frame to be interpolated along its motion path is proportional to the relative temporal position of that frame between the two nearest captured frames, the forward-flow and backward-flow displacements of each pixel of the frame to be interpolated are computed from the forward and backward optical flow results;
12) for each pixel of the frame to be interpolated, the average of the forward-flow and backward-flow estimates is taken as the final estimate;
13) pixel holes that remain unassigned in the frame to be interpolated are filled with an eight-neighborhood filtering method;
2) for each capture moment, an image of each uncaptured view at that moment is obtained with a model-assisted weighting method, which specifically comprises the following sub-steps:
21) extracting a silhouette map of the three-dimensional object from each captured view image by simple differencing or a blue-screen segmentation technique;
22) reconstructing a rough three-dimensional model, namely a visual hull model, from the silhouettes computed in step 21) with the EPVH method (Exact Polyhedral Visual Hulls; Franco J-S, Boyer E. Exact polyhedral visual hulls. Proceedings of British Machine Vision Conference, 2003. 329-338);
23) for each uncaptured view i, performing weighted interpolation using the images of the two captured views j and k adjacent to view i, with the weights computed as in equation (1) [weight formula: available only as an image in the original], where θ = 80° and φ = 70° are two constant angles denoting, respectively, the maximum allowed angle between camera lines of sight and the maximum allowed angle between the normal of a three-dimensional point and a camera line of sight; θ_1 is the angle between camera line of sight r_i and camera line of sight r_j, and θ_2 is the angle between r_i and r_k; φ_1 is the angle between the normal of the three-dimensional point p and the line of sight r_j, and φ_2 is the angle between the normal of p and the line of sight r_k; p is the intersection of the line of sight through a pixel of the view-i image with the three-dimensional model;
3) the accumulated energy spectrum in the dual-tree discrete wavelet domain is computed for the temporally interpolated frame obtained in step 1) and the spatially interpolated frame obtained in step 2), and key points are extracted; this specifically comprises the following sub-steps:
31) performing the dual-tree discrete wavelet transform on the spatially interpolated frame and the temporally interpolated frame, decomposing each into S scales (S = 3 in this embodiment);
32) computing the key-point energy spectra {M_s}_{1≤s≤S} at every scale, separately for the real part and the imaginary part, where the key-point energy at each pixel position is

E(s) = α^s ( ∏_{n=1}^{6} c_n )^β    (2)

where c_1, …, c_6 are the six subband coefficients at the pixel for the real or imaginary part, and the parameters α and β adjust the importance of the scale in the accumulated energy spectrum; here α = 1, and the value of β is given in the original only as an image;
33) interpolating the energy spectrum obtained in step 32) to the size of the original image with a two-dimensional Gaussian kernel, the interpolated spectrum at scale s being denoted g_s(M_s);
34) computing the accumulated energy spectra A_r and A_i of the real part and the imaginary part respectively [accumulation formula: available only as an image in the original], and obtaining the final accumulated energy spectrum A = √(A_r² + A_i²);
35) extracting key points from the accumulated energy spectrum obtained in step 34) with the Scale-Invariant Feature Transform (SIFT) method;
4) the extracted key points are described with shape contexts, the shape-context-based key point matching problem is converted into a square assignment (weighted bipartite graph matching) problem and solved with the Hungarian method;
5) the final interpolated frame is obtained by solving the following optimization problem:

Δf|_Ω = div v,  s.t.  f|_∂Ω = f̂|_∂Ω    (3)

where f is the unknown frame to be interpolated, Δ is the Laplacian operator, v = (u, v) is the gradient vector field of the temporally interpolated frame, div v is the divergence of v, f̂ is the spatially interpolated frame, and ∂Ω is the boundary of the closed set Ω.
FIG. 2 shows the final optimized interpolation result for sequence 1 and a comparison with other methods: (a) the interpolated frame obtained with a wavelet-based SSIM fusion method (X. Luo, J. Zhang, and Q. Dai, "A classification-based wavelet fusion using wavelet transform," Proc. SPIE 8064, No. 806400, 2011); (b) the interpolated frame obtained with two-dimensional empirical mode decomposition (Y. Zhen and Z. Qin, "Region-based image fusion method," Journal of Electronic Imaging, vol. 18, no. 1, p. 013008, 2009); (c) the interpolated frame obtained with the method of the invention.
6) At each moment, the images of all views (captured images and interpolated images) are used to reconstruct and render a three-dimensional model of the scene with a multi-view stereo method.
FIG. 3 shows the three-dimensional model reconstructed for sequence 1 with the proposed method: (a) the visual hull model and (b) the model reconstructed with the method of the invention; the models are rendered with normal maps. FIG. 4 shows the dynamic three-dimensional model reconstructed for sequence 2 with the proposed method: the first image is an overview placing the models of all moments together, and the subsequent images are the modeling results at the individual moments.

Claims (4)

1. A space-time joint multi-view video interpolation and three-dimensional modeling method, characterized in that a multi-camera array is grouped in an interleaved manner: n cameras with a frame rate of f frames/second are arranged and evenly divided into m interleaved groups, where n and m are positive integers and n is an integral multiple of m; the cameras within a group synchronously capture videos of n/m views at the same moments, and different groups capture at moments interleaved by a time offset of 1/(fm) seconds; the proposed space-time joint multi-view video interpolation and three-dimensional modeling method obtains videos of all n views at all moments and then reconstructs a three-dimensional model of the scene at every moment, and comprises the following steps:
1) for each camera, the forward and backward optical flows between two adjacent captured frames are computed with an optical flow method, and the uncaptured frames between the two frames, i.e. the temporally interpolated frames, are then interpolated;
2) for each capture moment, an image of each uncaptured view at that moment, i.e. a spatially interpolated frame, is obtained with a model-assisted weighting method;
3) the accumulated energy spectrum in the dual-tree discrete wavelet domain is computed for the temporally interpolated frames obtained in step 1) and the spatially interpolated frames obtained in step 2), and key points are extracted;
4) the extracted key points are described with shape contexts, the shape-context-based key point matching problem is converted into a square assignment, i.e. weighted bipartite graph matching, problem and solved with the Hungarian method;
5) the final interpolated frames are obtained by solving a Poisson editing optimization problem;
6) at each moment, the images of all views, both captured and interpolated, are used to reconstruct and render a three-dimensional model of the scene with a multi-view stereo method.
2. The method of claim 1, wherein the model-assisted weighting method specifically comprises the steps of:
21) extracting a silhouette map of the three-dimensional object from each captured view image by differencing or a blue-screen segmentation technique;
22) reconstructing a rough three-dimensional model, namely a visual hull model, from the silhouettes computed in step 21) with the EPVH method;
23) for each uncaptured view i, performing weighted interpolation using the images of the two captured views j and k adjacent to view i, with the weights computed as in equation (1):
(1) [weight formula: available only as an image in the original]
where θ and φ are two constant angles denoting, respectively, the maximum allowed angle between camera lines of sight and the maximum allowed angle between the normal of a three-dimensional point and a camera line of sight; θ_1 is the angle between camera line of sight r_i and camera line of sight r_j, and θ_2 is the angle between r_i and r_k; φ_1 is the angle between the normal of the three-dimensional point p and the line of sight r_j, and φ_2 is the angle between the normal of p and the line of sight r_k; p is the intersection of the three-dimensional model with the line of sight through a pixel of the view-i image.
3. The method of claim 1, wherein the accumulated energy spectrum of the dual-tree discrete wavelet domain is computed and the key points are extracted through the following steps:
31) performing the dual-tree discrete wavelet transform on the spatially interpolated frame and the temporally interpolated frame, decomposing each into S scales;
32) computing the key-point energy spectra {M_s}_{1≤s≤S} at every scale, separately for the real part and the imaginary part, where the key-point energy at each pixel position is

E(s) = α^s ( ∏_{n=1}^{6} c_n )^β    (2)

where c_1, …, c_6 are the six subband coefficients at the pixel for the real or imaginary part, and the parameters α and β adjust the importance of the scale in the accumulated energy spectrum;
33) interpolating the energy spectrum obtained in step 32) to the size of the original image with a two-dimensional Gaussian kernel, the interpolated spectrum at scale s being denoted g_s(M_s);
34) computing the accumulated energy spectra A_r and A_i of the real part and the imaginary part respectively [accumulation formula: available only as an image in the original], and obtaining the final accumulated energy spectrum A = √(A_r² + A_i²);
35) extracting key points from the accumulated energy spectrum obtained in step 34) with the SIFT method.
4. The method as claimed in claim 1, wherein, based on the shape-context key point matching, the final interpolated frame is obtained by solving the following optimization problem:

Δf|_Ω = div v,  s.t.  f|_∂Ω = f̂|_∂Ω    (3)

where f is the unknown frame to be interpolated, Δ is the Laplacian operator, v = (u, v) is the gradient vector field of the temporally interpolated frame, div v is the divergence of v, f̂ is the spatially interpolated frame, ∂Ω is the boundary of the closed set Ω, s.t. means "subject to", |_Ω denotes restriction to the closed set Ω, and |_∂Ω denotes restriction to the boundary of Ω.
CN 201110271761 2011-09-14 2011-09-14 Time-space jointed multi-view video interpolation and three-dimensional modeling method Expired - Fee Related CN102446366B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110271761 CN102446366B (en) 2011-09-14 2011-09-14 Time-space jointed multi-view video interpolation and three-dimensional modeling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110271761 CN102446366B (en) 2011-09-14 2011-09-14 Time-space jointed multi-view video interpolation and three-dimensional modeling method

Publications (2)

Publication Number Publication Date
CN102446366A CN102446366A (en) 2012-05-09
CN102446366B true CN102446366B (en) 2013-06-19

Family

ID=46008840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110271761 Expired - Fee Related CN102446366B (en) 2011-09-14 2011-09-14 Time-space jointed multi-view video interpolation and three-dimensional modeling method

Country Status (1)

Country Link
CN (1) CN102446366B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103903300A (en) * 2012-12-31 2014-07-02 博世汽车部件(苏州)有限公司 Object surface height reconstructing method, object surface height reconstructing system, optical character extracting method and optical character extracting system
CN105519105B (en) * 2013-09-11 2019-03-08 索尼公司 Image processing equipment and method
CN104766304B (en) * 2015-02-26 2017-12-05 浙江工业大学 A kind of blood vessel method for registering based on multisequencing medical image
CN106844620B (en) * 2017-01-19 2020-05-12 天津大学 View-based feature matching three-dimensional model retrieval method
US10311630B2 (en) * 2017-05-31 2019-06-04 Verizon Patent And Licensing Inc. Methods and systems for rendering frames of a virtual scene from different vantage points based on a virtual entity description frame of the virtual scene
CN107901424B (en) * 2017-12-15 2024-07-26 北京中睿华信信息技术有限公司 Image acquisition modeling system
CN108806259B (en) * 2018-01-15 2021-02-12 江苏壹鼎崮机电科技有限公司 BIM-based traffic control model construction and labeling method
CN108833785B (en) * 2018-07-03 2020-07-03 清华-伯克利深圳学院筹备办公室 Fusion method and device of multi-view images, computer equipment and storage medium
CN109242950B (en) * 2018-07-11 2023-05-02 天津大学 Multi-view human dynamic three-dimensional reconstruction method under multi-person tight interaction scene
CN111797269A (en) * 2020-07-21 2020-10-20 天津理工大学 Multi-view three-dimensional model retrieval method based on multi-level view associated convolutional network
CN112819945B (en) * 2021-01-26 2022-10-04 北京航空航天大学 Fluid reconstruction method based on sparse viewpoint video

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03160575A (en) * 1989-11-20 1991-07-10 Toshiba Corp Picture display device
CN101271582B (en) * 2008-04-10 2010-06-16 清华大学 Three-dimensional reconstruction method based on multi-vision angle two-dimension image combined with SIFT algorithm
CN101271579B (en) * 2008-04-10 2010-06-16 清华大学 Method for modeling high-speed moving object adopting ring shaped low frame rate camera array
TWI492188B (en) * 2008-12-25 2015-07-11 Univ Nat Chiao Tung Method for automatic detection and tracking of multiple targets with multiple cameras and system therefor
CN101615304A (en) * 2009-07-31 2009-12-30 深圳先进技术研究院 Generate the method for the visual shell of robust
CN101833786B (en) * 2010-04-06 2011-12-28 清华大学 Method and system for capturing and rebuilding three-dimensional model

Also Published As

Publication number Publication date
CN102446366A (en) 2012-05-09

Similar Documents

Publication Publication Date Title
CN102446366B (en) Time-space jointed multi-view video interpolation and three-dimensional modeling method
CN111028150B (en) Rapid space-time residual attention video super-resolution reconstruction method
Kappeler et al. Video super-resolution with convolutional neural networks
CN113139898B (en) Light field image super-resolution reconstruction method based on frequency domain analysis and deep learning
EP3216216B1 (en) Methods and systems for multi-view high-speed motion capture
Hua et al. Holopix50k: A large-scale in-the-wild stereo image dataset
Tam et al. 3D-TV content generation: 2D-to-3D conversion
US9525858B2 (en) Depth or disparity map upscaling
CN113362223A (en) Image super-resolution reconstruction method based on attention mechanism and two-channel network
Lu et al. Learning spatial-temporal implicit neural representations for event-guided video super-resolution
CN101872491A (en) Free view angle relighting method and system based on photometric stereo
CN102741879A (en) Method for generating depth maps from monocular images and systems using the same
CN113538243B (en) Super-resolution image reconstruction method based on multi-parallax attention module combination
CN112906675B (en) Method and system for detecting non-supervision human body key points in fixed scene
CN114494050A (en) Self-supervision video deblurring and image frame inserting method based on event camera
CN104376547A (en) Motion blurred image restoration method
CN109523508B (en) Dense light field quality evaluation method
Chandramouli et al. A generative model for generic light field reconstruction
Wu et al. Depth mapping of integral images through viewpoint image extraction with a hybrid disparity analysis algorithm
CN109302600B (en) Three-dimensional scene shooting device
CN108615221B (en) Light field angle super-resolution method and device based on shearing two-dimensional polar line plan
CN111652922B (en) Binocular vision-based monocular video depth estimation method
CN116402908A (en) Dense light field image reconstruction method based on heterogeneous imaging
CN106355595A (en) Stereo vision matching method for target images
Van Duong et al. Lfdenet: Light field depth estimation network based on hybrid data representation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20200703

Address after: 411, block a, Zhizao street, Zhongguancun, No. 45, Chengfu Road, Haidian District, Beijing 100080

Patentee after: Beijing Youke Nuclear Power Technology Development Co.,Ltd.

Address before: 300072 Tianjin City, Nankai District Wei Jin Road No. 92

Patentee before: Tianjin University

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20201010

Address after: 100094 Beijing city Haidian District Cui Hunan loop 13 Hospital No. 7 Building 7 room 701

Patentee after: Beijing lingyunguang Technology Group Co.,Ltd.

Address before: 411, block a, Zhizao street, Zhongguancun, No. 45, Chengfu Road, Haidian District, Beijing 100080

Patentee before: Beijing Youke Nuclear Power Technology Development Co.,Ltd.

CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: 100094 701, 7 floor, 7 building, 13 Cui Hunan Ring Road, Haidian District, Beijing.

Patentee after: Lingyunguang Technology Co.,Ltd.

Address before: 100094 701, 7 floor, 7 building, 13 Cui Hunan Ring Road, Haidian District, Beijing.

Patentee before: Beijing lingyunguang Technology Group Co.,Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20210113

Address after: 518000 room 1101, 11th floor, building 2, C District, Nanshan Zhiyuan, Nanshan District, Shenzhen City, Guangdong Province

Patentee after: Shenzhen Lingyun Shixun Technology Co.,Ltd.

Address before: 100094 701, 7 floor, 7 building, 13 Cui Hunan Ring Road, Haidian District, Beijing.

Patentee before: Lingyunguang Technology Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130619