WO2013087880A1 - Method and system for interpolating a virtual image from a first and a second input images - Google Patents

Method and system for interpolating a virtual image from a first and a second input images

Info

Publication number
WO2013087880A1
Authority
WO
WIPO (PCT)
Prior art keywords
luminance
color
input images
pixels
input
Prior art date
Application number
PCT/EP2012/075634
Other languages
French (fr)
Inventor
Philippe Robert
Matthieu Fradet
Tomas CRIVELLI
Cedric Thebault
Original Assignee
Thomson Licensing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thomson Licensing filed Critical Thomson Licensing
Publication of WO2013087880A1 publication Critical patent/WO2013087880A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007 Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/111 Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Television Systems (AREA)

Abstract

The invention concerns a method of interpolating a virtual image (Ti) from a first and a second input images (T1, T0), the luminance (L) or color (CR, CG, CB) of each interpolated pixel of the virtual image being calculated from corresponding pixels of the first and the second input images, comprising the steps of, if a first input image (T1) is partially occluded, for the pixels belonging to an occluded area: estimating a luminance and/or color gain variation factor (β) between the first and second input images (T1, T0) from spatio-temporal neighboring information from pixels that do not belong to an occluded area; calculating a weighting coefficient (α) taking into account the distance of the virtual image to the second input image (Ti - T0) relative to the distance between the first and the second input images (T1 - T0); and computing the luminance and/or color of each interpolated pixel from the luminance and/or color of the corresponding pixel belonging to the second input image, taking account of the estimated luminance and/or color gain variation factor (β) and the calculated weighting coefficient (α).

Description

Method and system for interpolating a virtual image from a first and a second input images
The invention concerns the interpolation of images from at least one pair of images. It particularly concerns the interpolation of the luminance of the interpolated image.
Usually, interpolation is carried out from a dense correspondence map assigned to the virtual view. This map is either the motion or the disparity map. This map has been either directly computed for the view to be interpolated or derived from such a map first computed for one of the two input views.
Then, during the interpolation step, the assigned correspondence vector of each pixel in the virtual view allows linking it to the two input images, getting the corresponding points in these input images and then interpolating the pixel.
US 7,697,769 discloses an interpolation image generating method that includes dividing each of the first reference image and the second reference image into reference regions; correlation values between the regions of the first and second destination images are indicated by motion vectors. An interpolation image is generated using the reference region determined as the high-correlation region, and by mixing the interpolation image candidates using the motion vectors of that reference region.
WO 2011/105337 (CA 2 790 268) from Nippon Telegraph and Telephone Corporation discloses a method for encoding an object frame from a previously encoded reference frame. A correction parameter for correcting mismatches in terms of local brightness and color is estimated from the viewpoint-synthesized image and the reference frame, and is used to correct the viewpoint-synthesized image. Interpolation of disparity or depth values, as well as interpolation of luminance or color, is described, but it happens that some areas are occluded in one view and visible in only the other view. Thus, if a pixel is interpolated from one frame only, the value of the luminance and/or color in the interpolated view cannot be directly determined.
Luminance or color of the pixel to interpolate is obtained via the following equation:
L(x, Ti) = (1 - α) × L(x - α × dx, T0) + α × L(x + (1 - α) × dx, T1)    Eq. (1)
"T," is the image to be interpolated, "T0" and "Τ are input images. "dx" corresponds to motion or disparity information linking "T0" and "Τ and is assigned to point (x,Ti). In the following, it is assumed that Ti corresponds to a value: for example, in the case of interpolation along the temporal axis, Ti corresponds to a time value, the various frames of the sequence are referred by this value. In the case of view interpolation from multiple views, the input and output views are assumed to have their focal points aligned, and Ti corresponds to a position index along this line.
The (temporal or spatial) distance between the input frames can be set to 1 (T1 - T0 = 1) and Ti then refers to the relative position of image Ti with respect to T0. L can be a luminance map or a color channel map. In the case of color, the α value can be different for each color channel: the red one CR, the green one CG and the blue one CB.
In a simple interpolation case the coefficient α depends on the positions of the images T0, T1 and Ti as follows:
α = (Ti - T0) / (T1 - T0)
This equation combines the luminance L or color C component of the two input images T0 and T1. In particular, in case of a color or luminance difference between the two images, the result takes into consideration a weighting coefficient α between the input values, taking into account the distance of the virtual frame to the two input views.
Let us assume that there is a luminance gain factor β between the two input views "T0" and "T1":
L(x + (1 - α) × dx, T1) = β × L(x - α × dx, T0)    Eq. (3)
In this case, interpolation equation (1) becomes:
L(x, Ti) = (1 + α × (β - 1)) × L(x - α × dx, T0)
If the luminance gain factor β is equal to 1, there is no variation of luminance between the two views and the interpolated view:
L(x, Ti) = L(x - α × dx, T0) = L(x + (1 - α) × dx, T1)
But if the luminance gain factor β is different from 1 (β ≠ 1), the luminance or color component in the interpolated view Ti resulting from the interpolation from both views T0 and T1 becomes (introducing Eq. (3) in Eq. (1)):
L(x, Ti) = L(x - α × dx, T0) + α × (β - 1) × L(x - α × dx, T0)    Eq. (4)
But it happens that some areas are visible in only one view and occluded in the other view. This case requires special processing to locate such areas in the virtual view. Once this is done, the luminance in the virtual view can be computed from just one frame via the following equation if the pixel is occluded in T1:
L(x, Ti) = L(x - α × dx, T0)    Eq. (2)
Thus the luminance or color of the interpolated pixel is the luminance or color of the visible pixel.
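By way of illustration, Eqs. (1) and (2) for a single pixel can be sketched in Python as follows; the 1-D luminance arrays, the nearest-neighbor rounding and all names (L0, L1, dx, alpha) are illustrative assumptions, not part of the patent:

    def interpolate_pixel(L0, L1, x, dx, alpha, visible_in_T0=True, visible_in_T1=True):
        """Interpolate the luminance of pixel x in the virtual image Ti.

        L0, L1 : 1-D luminance arrays for the input images T0 and T1.
        dx     : motion/disparity value assigned to point (x, Ti).
        alpha  : relative position of Ti between T0 and T1 (0 <= alpha <= 1).
        """
        x0 = int(round(x - alpha * dx))        # corresponding point in T0
        x1 = int(round(x + (1 - alpha) * dx))  # corresponding point in T1
        if visible_in_T0 and visible_in_T1:
            # Eq. (1): distance-weighted blend of the two corresponding points
            return (1 - alpha) * L0[x0] + alpha * L1[x1]
        if visible_in_T0:
            return L0[x0]                      # Eq. (2): pixel occluded in T1
        return L1[x1]                          # symmetric case: occluded in T0

A real implementation would sample sub-pixel positions with bilinear interpolation; rounding keeps the sketch short.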
If a pixel is interpolated from one frame only, due to occlusion, for example from T0 only (Eq. (2)), the luminance gain factor β between the two input views "T0" and "T1" is not known, and the error on the pixel reconstruction is:
E(x, Ti) = α × (β - 1) × L(x - α × dx, T0)
The problem is that these partially occluded pixels are not compensated in a similar way as the pixels visible in both views. This may lead to annoying defects.
The invention remedies these disadvantages. It consists in a method of interpolating a virtual image from a first and a second input image, the luminance or color of each interpolated pixel of the virtual image being calculated from corresponding pixels of the first and the second input images,
Characterized in that the method comprises further the steps of:
If a first input image is partially occluded, for the pixels belonging to an occluded area,
estimating a luminance and/or color gain variation factor (β) between the first and second input images from spatio-temporal neighboring information from pixels that do not belong to an occluded area;
calculating a weighting coefficient taking into account the distance of the virtual image to the second input image relative to the distance between the first and the second input images;
computing the luminance and/or color of each interpolated pixel from the luminance and/or color of the corresponding pixel belonging to the second input image, taking account of the estimated luminance and/or color gain variation factor and the calculated weighting coefficient. Thus luminance and/or color are compensated in a similar way as for the pixels visible in both views.
According to an aspect of the invention, pixels belonging to an occluded area are determined by an occlusion map relative to the interpolated image.
According to another aspect of the invention, the luminance/color gain variation factor is a value estimated by correlation during matching between the first and second input images, assuming that it varies linearly between the first and second input images.
According to another aspect of the invention, if the luminance and/or color gain factor variations follow an affine model, the luminance and/or color gain variation factor takes account of the offset value defined by the affine model.
According to another aspect of the invention, if the luminance and/or color gain factor variations follow a parabolic model, the luminance and/or color gain variation factor takes account of the offset value defined by the parabolic model.
According to another aspect of the invention, for pixels occluded in both input images, the interpolated pixel luminance and/or color is filled from spatio-temporal neighboring pixels, without taking into account the luminance/color gain.
According to another aspect of the invention, for pixels belonging to occluded areas in the first and second input images, the luminance and/or color variation is filled from an estimation of the spatial luminance and/or color variation of pixels in the visible parts.
The present invention furthermore concerns a method of extrapolating a virtual image corresponding to the preceding method of interpolating a virtual image.
The present invention furthermore concerns a system for interpolating a virtual image from a first and a second input images, the luminance or color of each interpolated pixel of the virtual image being calculated from corresponding pixels of the first and the second input images. The system further comprises, if a first input image is partially occluded, for the pixels belonging to an occluded area: means for estimating a luminance and/or color gain variation factor between the first and second input images from spatio-temporal neighboring information from pixels that do not belong to an occluded area; means for calculating a weighting coefficient taking into account the distance of the virtual image to the second input image relative to the distance between the first and the second input images; and means for computing the luminance and/or color of each interpolated pixel from the luminance and/or color of the corresponding pixel belonging to the second input image, taking account of the estimated luminance and/or color gain variation factor and the calculated weighting coefficient.
The above and other aspects of the invention will become more apparent by the following detailed description of exemplary embodiments thereof with reference to the attached drawings, in which:
Figure 1 is an illustration of a simple interpolation case;
Figure 2 illustrates the situation of an orphan pixel which gets at least a region label and disparity/motion information;
Figure 3 represents a flowchart of a method of the invention.
Hereinafter, the present invention will be described more fully with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown. As described previously, the luminance or color of the pixel to interpolate is obtained via the following equation:
L(x, Ti) = (1 - α) × L(x - α × dx, T0) + α × L(x + (1 - α) × dx, T1)    Eq. (1)
As represented in Figure 1, "Ti" is the image to be interpolated, "T0" and "T1" are input images. "dx" corresponds to motion or disparity information linking "T0" and "T1".
This equation (1) combines the luminance L or color C component of the two input images T0 and T1. In particular, in case of a color or luminance difference between the two images, the result takes into consideration a weighting coefficient α between the input values, taking into account the distance of the virtual frame to one of the two input views (Ti - T0) and the distance between the two input views (T1 - T0).
In a simple interpolation case the coefficient α depends on the positions of the images T0, T1 and Ti:
α = (Ti - T0) / (T1 - T0)
If the pixel is visible in the two images "T0" and "T1", equation (1) is used for interpolation. This means that the luminance/color gain is not explicitly considered in the interpolation but is integrated in equation (1).
If a pixel is interpolated from one frame only, due to occlusion, for example from T0 only (Eq. (2)), the luminance gain factor β between the two input views "T0" and "T1" is not known, and the error on the pixel reconstruction is:
E(x, Ti) = α × (β - 1) × L(x - α × dx, T0)
In order to avoid the problem described above, it is proposed to estimate the luminance or color variation (β) between both views and to compensate this variation during interpolation. The presentation below assumes two input views, but it can be applied with more than two views.
During motion/disparity estimation, an additional luminance and/or color gain is estimated for each pixel of one input image with respect to the other. The result from correspondence between two frames is, for each frame:
• A disparity/motion map
• An occlusion map
• A luminance and/or color gain map
The gain (Eq. (3)) can be estimated for example via correlation during matching between views. Considering two blocks in two images (e.g. T0 and T1) that are candidates for matching, if X and Y represent the luminance or color component respectively in each block, then the luminance gain or color gain β can be given by:
β = E[XY] / E[X²], assuming Yi = β × Xi,
where E[XY] is the covariance between X and Y, E[X] is the average of X, E[X²] is the variance of X, and the index i runs over the pixels of the blocks X and Y. Thus, a luminance gain β(x) or a color gain β(x) can be estimated for each pixel in the image. The color gain can be obtained by applying this formula to each color channel CR, CG, CB, leading to a gain factor for each of these channels.
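By way of illustration, the block-wise least-squares estimate of the gain under the model Yi = β × Xi can be sketched as follows (names are illustrative; applying the same function per channel yields the per-channel color gains):

    import numpy as np

    def estimate_gain(X, Y):
        """Gain beta between two candidate matching blocks X (in T0) and Y (in T1),
        under the model Y_i = beta * X_i, i.e. beta = E[XY] / E[X^2]."""
        X = np.asarray(X, dtype=float).ravel()
        Y = np.asarray(Y, dtype=float).ravel()
        return float(np.mean(X * Y) / np.mean(X * X))

    # Per-channel color gains, assuming H x W x 3 blocks:
    # gains = [estimate_gain(block0[:, :, c], block1[:, :, c]) for c in range(3)]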
If the occlusion map indicates that the pixel is visible in one view only, then, as matching is not possible in these areas, the motion or disparity value and the luminance and/or color gain must be extrapolated from spatio-temporal neighboring information of the background, or from pixels that do not belong to the occluding object. For example, a parametric model can be estimated from the neighborhood and used for extrapolation of the disparity, motion or luminance/color gain.
Then, the interpolation can be the following:
L(x, Ti) = (1 + α × (β - 1)) × L(x - α × dx, T0)
if x is visible in T0 only (Eq. (4));
L(x, Ti) = (α + (1 - α) / β) × L(x + (1 - α) × dx, T1)
if x is visible in T1 only.
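A sketch of this gain-compensated one-sided interpolation, under the same illustrative conventions as the earlier snippet (the T1-only expression follows from substituting Eq. (3) into Eq. (1)):

    def interpolate_one_sided(L0, L1, x, dx, alpha, beta, visible_in_T0):
        """Interpolate a pixel visible in only one input view, compensating the
        (extrapolated) luminance gain beta between T0 and T1."""
        if visible_in_T0:
            # Eq. (4): x occluded in T1; the missing T1 value is predicted as
            # beta times the T0 value, collapsing Eq. (1) to a single term.
            x0 = int(round(x - alpha * dx))
            return (1 + alpha * (beta - 1)) * L0[x0]
        # x occluded in T0: the T0 value is predicted as the T1 value over beta.
        x1 = int(round(x + (1 - alpha) * dx))
        return (alpha + (1 - alpha) / beta) * L1[x1]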
The solution above concerns the interpolation of pixels visible in one image in the presence of a linear luminance and/or color gain.
Nothing changes for the interpolation of a pixel visible in both input views. But some variants with more complex luminance/color variation models are proposed:
• For example, one can even consider an affine model instead of a linear one:
Yi = β × Xi + γ
In this case, the gain can be computed as before, and the offset γ can be given by:
γ = E[Y] - β × E[X]
The relation between T1 and T0 is then supposed to be (Equation (3) modified):
L(x + (1 - α) × dx, T1) = β × L(x - α × dx, T0) + γ    Eq. (6)
Then, in case of occlusion in one frame, the interpolation of the luminance can be given (introducing Eq. (6) in Eq. (1)) by:
L(x, Ti) = (1 + α × (β - 1)) × L(x - α × dx, T0) + α × γ    Eq. (7a)
if x is visible in T0 only;
L(x, Ti) = (α + (1 - α) / β) × L(x + (1 - α) × dx, T1) - ((1 - α) / β) × γ    Eq. (7b)
if x is visible in T1 only.
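A sketch of the affine variant, with the gain computed as before and the offset taken as the residual mean (one plausible reading of the formulas above; names are illustrative):

    import numpy as np

    def estimate_affine_gain(X, Y):
        """Gain and offset for the affine model Y ~ beta * X + gamma."""
        X = np.asarray(X, dtype=float).ravel()
        Y = np.asarray(Y, dtype=float).ravel()
        beta = np.mean(X * Y) / np.mean(X * X)  # gain, computed as before
        gamma = np.mean(Y) - beta * np.mean(X)  # offset: E[Y] - beta * E[X]
        return float(beta), float(gamma)

    def eq_7a(L0_value, alpha, beta, gamma):
        """Eq. (7a): pixel visible in T0 only, affine compensation."""
        return (1 + alpha * (beta - 1)) * L0_value + alpha * gamma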
• Up to now, the gain β is supposed to vary linearly between the two views. However, a more complex model may be more appropriate. Actually, considering more than two views, more complex models can be considered that may better fit the variation: in motion estimation, 3 or 4 successive images can be used; in disparity estimation, 3 or 4 adjacent views. For example, with 3 images T0, T1, T2, one can identify a parabolic model: a first gain is estimated between T0 and T1, then a second one between T1 and T2:
L(x + (1 - α) × dx, T1) = β1 × L(x - α × dx, T0)
and, at the corresponding points of T1 and T2, L(·, T2) = β2 × L(x + (1 - α) × dx, T1).
From the triplet (1, β1, β2), the parabolic model β(α) can be estimated:
β(α) = a × α² + b × α + c
In the same way, the offset γ(α) can be estimated. Then interpolation is carried out via the same equations as previously (Eqs. (7)), except that the parameters β and γ depend on the α value.
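One way to realize this fit is to pass a parabola through three (α, gain) samples; taking the sample at α = 2 as the cumulative gain β1 × β2 is an assumption here, since the text only names the triplet (1, β1, β2):

    import numpy as np

    def fit_parabolic_gain(beta1, beta2):
        """Fit beta(alpha) = a*alpha^2 + b*alpha + c through three gain samples:
        beta(0) = 1 by definition, beta(1) = beta1 (gain T0 -> T1) and
        beta(2) = beta1 * beta2 (assumed cumulative gain T0 -> T2)."""
        alphas = np.array([0.0, 1.0, 2.0])
        gains = np.array([1.0, beta1, beta1 * beta2])
        a, b, c = np.polyfit(alphas, gains, 2)  # exact degree-2 fit to 3 points
        return a, b, c

    a, b, c = fit_parabolic_gain(1.10, 1.05)
    beta_at = lambda alpha: a * alpha ** 2 + b * alpha + c  # gain at the virtual position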
Furthermore, in this case, even the interpolation of the pixels visible in two images must be reconsidered, as the gain β depends on the relative position α. Equation (1) is no longer valid and is replaced by:
L(x, Ti) = β(α) × L(x - α × dx, T0) + (1 - β(α)) × L(x + (1 - α) × dx, T1)
with the following conditions: β(0) = 1 and β(1) = 0, which give
β(α) = a × α² - (a + 1) × α + 1
All these methods can be used for image extrapolation instead of image interpolation: in this case, Ti is no longer in the range [T0, T1] and α is out of the range [0, 1].
If the occlusion map indicates that the pixel is visible in neither input view, it is an "orphan" pixel: there is no correspondence vector that links this pixel to a point in another view. The pixel luminance/color is filled from spatio-temporal neighboring pixels without taking into account the luminance/color gain. A solution is to apply disparity/motion based segmentation to the input images to distinguish the various depth layers in these images and in particular to identify occluded and occluding regions.
So, these different occluded regions get a label and a relative occlusion order. Moreover, the most probable shape of the occluded regions in the hidden parts is determined.
Then, inpainting is applied to disparity/motion and possibly luminance/color and luminance/color change information in these hidden areas.
For a given region, color change data in the hidden area may be filled from the color change data in the visible parts:
• Via exemplar-based inpainting techniques
• from estimation of the spatial variation of the color change information inside the region via a color change (e.g. affine parametric) model. Similarly, disparity/motion and color can be filled in the same way.
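For instance, the affine parametric option can be sketched as a least-squares plane fitted to the gain values of the visible pixels of a region and evaluated over its hidden part (the plane model and all names are illustrative assumptions):

    import numpy as np

    def fill_gain_in_hidden_area(gain_map, visible_mask):
        """Fill a region's gain map in its hidden part from its visible part,
        using a spatial affine model gain(x, y) ~ p0 + p1*x + p2*y."""
        ys, xs = np.nonzero(visible_mask)             # visible pixels of the region
        A = np.column_stack([np.ones(xs.size), xs, ys])
        p, *_ = np.linalg.lstsq(A, gain_map[ys, xs], rcond=None)
        filled = gain_map.astype(float).copy()
        hy, hx = np.nonzero(~visible_mask)            # hidden pixels to fill
        filled[hy, hx] = p[0] + p[1] * hx + p[2] * hy
        return filled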
Figure 2 illustrates this situation. In the first line, A and B represent two input images, in which part O1 corresponds to the background and objects O2 and O3 are represented. Object O4 is partially occluded by objects O2 and O3; C represents the image to be interpolated. In the second line, the first image represents the estimated disparity map of the first input image A. Different parts are recognized, corresponding to different correspondence values: the disparity/motion of O1 is represented in black, the disparity/motion of O2 in white, O3 in light gray and O4 in dark gray. O4 is partially occluded by O2 and O3.
The middle image in the second line depicts the disparity/motion map of O1 and O4 from which objects O2 and O3 have been removed. The hidden parts of O1 and O4 have been filled as described above. Similarly, luminance/color variation maps have been filled (gain map and possibly offset map). This processing can also be carried out in the second image (B). The next step is the projection of the regions from the input images onto the virtual view via disparity/motion compensation. With the depth layer representation, each orphan pixel can now get at least a region label and disparity/motion information. Similarly, a luminance/color variation map can be built from the input luminance/color variation map(s).
The lower right image in Figure 2 shows the complete interpolated disparity/motion map, filled in the areas occluded in the input views according to the description above. A luminance/color variation map is obtained in the same way.
The next step is the interpolation of the virtual image. The pixels that are visible in both input images can then be interpolated via a known method or via the method described previously that computes a non-linear luminance/color gain. The pixels that are visible in only one image are interpolated according to what is described above.
Then, color filling of the orphan pixels is carried out via inpainting. In this context, three cases are considered:
• Either the pixel is filled from pixels belonging to the same object region in the current image, i.e. from spatial neighboring pixels. This process is called intra-frame filling, spatial filling or image inpainting.
• If the regions in the input views have been filled, i.e. if the color and the color change vector in their hidden parts have been filled, then the orphan pixels can be directly interpolated via disparity/motion compensation, as pixels visible in two images.
• If the color data in the hidden areas of the regions in the input views have not been filled, then the orphan pixels are filled from the visible parts of the regions they belong to in the input views. For example, exemplar-based inpainting is applied to the color and to the color change vector. The filling color value is then modified from the color change vector, as in the case of pixels visible in one view.
More generally, considering a pixel in an image to be interpolated: from any model that allows identifying the position of this pixel in a set of input images distributed in 3D space and time, and from any model that allows predicting the luminance or color appearance of this pixel from the luminance or color of the corresponding points in this set of input images, the predicted appearance value is assigned to the pixel.
It can be applied in the same way to extrapolation. In this case, Ti is no longer in the range [T0, T1] and α is out of the range [0, 1].
A flowchart of a method of the invention is represented in Figure 3. For pixels belonging to an occluded area, the method comprises a first step of estimating a luminance and/or color gain variation factor β between the first and second input images T1, T0 from spatio-temporal neighboring information from pixels that do not belong to an occluded area.
The method further comprises the step of calculating a weighting coefficient (α) taking into account the distance of the virtual image to the second input image (Ti - T0) relative to the distance between the first and the second input images (T1 - T0). Knowing the estimated luminance and/or color gain variation factor (β) and the calculated weighting coefficient (α), the method comprises the step of computing the luminance and/or color of each interpolated pixel from the luminance and/or color of the corresponding pixel belonging to the second input image, taking account of the estimated luminance and/or color gain variation factor (β) and the calculated weighting coefficient (α).
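Tying the three steps together for a single pixel occluded in the first input image T1, under the same illustrative conventions as the earlier snippets:

    def interpolate_occluded_pixel(L0, x, dx, Ti, T0, T1, beta):
        """Step 1 has produced beta from non-occluded neighbors; step 2 computes
        the weighting coefficient alpha; step 3 computes the pixel from the
        second input image T0 via Eq. (4)."""
        alpha = (Ti - T0) / (T1 - T0)              # relative position of the virtual image
        x0 = int(round(x - alpha * dx))            # corresponding point in T0
        return (1 + alpha * (beta - 1)) * L0[x0]   # gain-compensated luminance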

Claims

1. Method of interpolating a virtual image (Ti) from a first and a second input images (T1, T0), the luminance (L) or color (CR, CG, CB) of each interpolated pixel of the virtual image being calculated from corresponding pixels of the first and the second input images,
Characterized in that the method comprises further the steps of:
If a first input image (T1) is partially occluded, for the pixels belonging to an occluded area,
estimating a luminance and/or color gain variation factor (β) between the first and second input images (T1, T0) from spatio-temporal neighboring information from pixels that do not belong to an occluded area;
calculating a weighting coefficient (α) taking into account the distance of the virtual image to the second input image (Ti - T0) relative to the distance between the first and the second input images (T1 - T0);
computing the luminance and/or color of each interpolated pixel from the luminance and/or color of the corresponding pixel belonging to the second input image, taking account of the estimated luminance and/or color gain variation factor (β) and the calculated weighting coefficient (α).
2. Method as claimed in claim 1, wherein pixels belonging to an occluded area are determined by an occlusion map relative to the interpolated image.
3. Method as claimed in claim 1, wherein the luminance/color gain variation factor is a value estimated by correlation during matching between the first and second input images, assuming that it varies linearly between the first and second input images.
4. Method as claimed in claim 1, wherein, if the luminance and/or color gain factor variations follow an affine model, the luminance and/or color gain variation factor (β) takes account of the offset value defined by the affine model.
5. Method as claimed in claim 1, wherein, if the luminance and/or color gain factor variations follow a parabolic model, the luminance and/or color gain variation factor (β) takes account of the offset value defined by the parabolic model.
6. Method as claimed in claim 1, wherein, for pixels belonging to occluded areas in the first and second input images, the interpolated pixel luminance and/or color is filled from the luminance and/or color of spatio-temporal neighboring pixels, without taking into account the luminance/color gain factor.
7. Method as claimed in claim 1, wherein, for pixels belonging to occluded areas in the first and second input images, the luminance and/or color variation (β) is filled from an estimation of the spatial luminance and/or color variation of pixels in the visible parts.
8. System for interpolating a virtual image (Ti) from a first and a second input images (T1, T0), the luminance (L) or color (CR, CG, CB) of each interpolated pixel of the virtual image being calculated from corresponding pixels of the first and the second input images,
Characterized in that the system comprises further:
If a first input image (T1) is partially occluded, for the pixels belonging to an occluded area,
Means for estimating a luminance and/or color gain variation factor (β) between the first and second input images (T1, T0) from spatio-temporal neighboring information from pixels that do not belong to an occluded area;
Means for calculating a weighting coefficient (α) taking into account the distance of the virtual image to the second input image (Ti - T0) relative to the distance between the first and the second input images (T1 - T0); and means for computing the luminance and/or color of each interpolated pixel from the luminance and/or color of the corresponding pixel belonging to the second input image, taking account of the estimated luminance and/or color gain variation factor (β) and the calculated weighting coefficient (α).
PCT/EP2012/075634 2011-12-14 2012-12-14 Method and system for interpolating a virtual image from a first and a second input images WO2013087880A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP11306661 2011-12-14
EP11306661.7 2011-12-14

Publications (1)

Publication Number Publication Date
WO2013087880A1

Family

ID=47603536

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2012/075634 WO2013087880A1 (en) 2011-12-14 2012-12-14 Method and system for interpolating a virtual image from a first and a second input images

Country Status (1)

Country Link
WO (1) WO2013087880A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7257272B2 (en) * 2004-04-16 2007-08-14 Microsoft Corporation Virtual image generation
WO2008134829A2 (en) * 2007-05-04 2008-11-13 Interuniversitair Microelektronica Centrum A method and apparatus for real-time/on-line performing of multi view multimedia applications
US7697769B2 (en) 2004-01-15 2010-04-13 Kabushiki Kaisha Toshiba Interpolation image generating method and apparatus
US20100194858A1 (en) * 2009-02-02 2010-08-05 Samsung Electronics Co., Ltd. Intermediate image generation apparatus and method
US20110080463A1 (en) * 2009-10-07 2011-04-07 Fujifilm Corporation Image processing apparatus, method, and recording medium
WO2011105337A1 (en) 2010-02-24 2011-09-01 Nippon Telegraph And Telephone Corporation Multiview video coding method, multiview video decoding method, multiview video coding device, multiview video decoding device, and program

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7697769B2 (en) 2004-01-15 2010-04-13 Kabushiki Kaisha Toshiba Interpolation image generating method and apparatus
US7257272B2 (en) * 2004-04-16 2007-08-14 Microsoft Corporation Virtual image generation
WO2008134829A2 (en) * 2007-05-04 2008-11-13 Interuniversitair Microelektronica Centrum A method and apparatus for real-time/on-line performing of multi view multimedia applications
US20100194858A1 (en) * 2009-02-02 2010-08-05 Samsung Electronics Co., Ltd. Intermediate image generation apparatus and method
US20110080463A1 (en) * 2009-10-07 2011-04-07 Fujifilm Corporation Image processing apparatus, method, and recording medium
WO2011105337A1 (en) 2010-02-24 2011-09-01 Nippon Telegraph And Telephone Corporation Multiview video coding method, multiview video decoding method, multiview video coding device, multiview video decoding device, and program
CA2790268A1 (en) 2010-02-24 2011-09-01 Nippon Telegraph And Telephone Corporation Multiview video encoding method, multiview video decoding method, multiview video encoding apparatus, multiview video decoding apparatus, and program

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CRIMINISI A ET AL: "Efficient Dense Stereo with Occlusions for New View-Synthesis by Four-State Dynamic Programming", INTERNATIONAL JOURNAL OF COMPUTER VISION, KLUWER ACADEMIC PUBLISHERS, BO, vol. 71, no. 1, June 2006 (2006-06-01), pages 89 - 110, XP019410163, ISSN: 1573-1405, DOI: 10.1007/S11263-006-8525-1 *
IN-YONG SHIN ET AL: "Disparity Estimation at Virtual Viewpoint for Real-time Intermediate View Generation", PROCEEDINGS OF 3RD INTERNATIONAL CONFERENCE ON 3D SYSTEMS AND APPLICATIONS SEOUL, KOREA, 20~22 JUNE 2011., June 2011 (2011-06-01), pages 195 - 198, XP055058150 *
PAGLIARI C L ET AL: "Reconstruction of intermediate views from stereoscopic images using a rational filter", IMAGE PROCESSING, 1998. ICIP 98. PROCEEDINGS. 1998 INTERNATIONAL CONFERENCE ON CHICAGO, IL, USA 4-7 OCT. 1998, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, vol. 2, 4 October 1998 (1998-10-04), pages 627 - 631, XP010308477, ISBN: 978-0-8186-8821-8, DOI: 10.1109/ICIP.1998.723552 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108876841A (en) * 2017-07-25 2018-11-23 成都通甲优博科技有限责任公司 The method and system of interpolation in a kind of disparity map parallax refinement

Similar Documents

Publication Publication Date Title
US6625333B1 (en) Method for temporal interpolation of an image sequence using object-based image analysis
US8854425B2 (en) Method and apparatus for depth-related information propagation
US9361677B2 (en) Spatio-temporal disparity-map smoothing by joint multilateral filtering
US20220286660A1 (en) Concept for determining a measure for a distortion change in a synthesized view due to depth map modifications
US8649437B2 (en) Image interpolation with halo reduction
US9256926B2 (en) Use of inpainting techniques for image correction
TWI432034B (en) Multi-view video coding method, multi-view video decoding method, multi-view video coding apparatus, multi-view video decoding apparatus, multi-view video coding program, and multi-view video decoding program
US20100123792A1 (en) Image processing device, image processing method and program
US8331711B2 (en) Image enhancement
US8223839B2 (en) Interpolation method for a motion compensated image and device for the implementation of said method
EP3580924B1 (en) Method and apparatus for processing an image property map
Dikbas et al. Novel true-motion estimation algorithm and its application to motion-compensated temporal frame interpolation
US7787048B1 (en) Motion-adaptive video de-interlacer
US20120113093A1 (en) Modification of perceived depth by stereo image synthesis
CN102985949A (en) Multi-view rendering apparatus and method using background pixel expansion and background-first patch matching
Fu et al. Temporal consistency enhancement on depth sequences
JP3095141B2 (en) Motion compensation method of moving image using two-dimensional triangular lattice model
WO2013087880A1 (en) Method and system for interpolating a virtual image from a first and a second input images
US9538179B2 (en) Method and apparatus for processing occlusions in motion estimation
Robert et al. Disparity-compensated view synthesis for s3D content correction
JP2013114682A (en) Method for generating virtual image
KR101834952B1 (en) Apparatus and method for converting frame rate
Brites et al. Epipolar geometry-based side information creation for multiview Wyner–Ziv video coding
Lin et al. Key-frame-based depth propagation for semi-automatic stereoscopic video conversion
Oh et al. Frame interpolation method based on adaptive threshold and adjacent pixels

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12818497

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12818497

Country of ref document: EP

Kind code of ref document: A1