CN113179396B - Double-viewpoint stereo video fusion method based on K-means model - Google Patents

Double-viewpoint stereo video fusion method based on K-means model

Info

Publication number
CN113179396B
CN113179396B (application CN202110295931.9A)
Authority
CN
China
Prior art keywords
image
viewpoint
depth
foreground
depth image
Prior art date
Legal status
Active
Application number
CN202110295931.9A
Other languages
Chinese (zh)
Other versions
CN113179396A (en)
Inventor
周洋
张博文
崔金鹏
梁文青
殷海兵
陆宇
黄晓峰
Current Assignee
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date
Filing date
Publication date
Application filed by Hangzhou Dianzi University
Priority to CN202110295931.9A
Publication of CN113179396A
Application granted
Publication of CN113179396B
Active
Anticipated expiration


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/275Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
    • H04N13/279Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • H04N13/15Processing image signals for colour aspects of image signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • H04N13/156Mixing image signals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a double-viewpoint stereo video fusion method based on a K-means model. The method first preprocesses the left-viewpoint and right-viewpoint depth maps to obtain denoised, smoothed left-viewpoint and right-viewpoint depth images. The left and right depth images are then each segmented with the K-means method, and three-dimensional projection is applied to the segmented foreground-region and background-region depth images to obtain foreground and background rendered images for the left and right viewpoints. Where the foreground rendered image is blank, its empty areas are filled from the background rendered image, and the filled left-viewpoint and right-viewpoint rendered images are fused into a virtual-viewpoint rendered image. Finally, the hole areas of the virtual-viewpoint rendered image are filled by weighting the pixel information around each hole to obtain the final output image. Because the method processes the hole areas precisely at the pixel level, the rendered result is visually better and more consistent.

Description

Double-viewpoint stereo video fusion method based on K-means model
Technical Field
The invention belongs to the technical field of stereo video coding and decoding, relates to a double-viewpoint stereo video fusion method based on a K-means model, and aims to improve the double-viewpoint image fusion process.
Background
Depth Image Based Rendering (DIBR) is currently the main method for rendering images observed from different viewing angles: an image at another viewing angle is rendered from an existing image, so that views from different angles can be obtained. The most critical part of this rendering method is the 3D warping process, in which an image is first back-projected into a three-dimensional model and the model is then re-projected onto a target plane near another target viewpoint to obtain the image at the virtual viewpoint. During this back-projection and re-projection, the depth information of the image is essential: the depth of each pixel matters, and the relationship between the depth information of the two views directly influences the rendered result.
The most critical part of the DIBR technique is the 3D image transformation (3D image warping), an operation that relocates the pixels of an image: the pixel points of the reference image are mapped into the target view through the three-dimensional transformation, forming an initial target view corresponding to the reference image.
The whole virtual-viewpoint rendering process can be divided into two parts. First, a depth map corresponding to the virtual viewpoint is obtained by projecting the input depth image. The most convenient way to obtain the depth image at the virtual viewpoint is the three-dimensional projection operation (3D warping): an image is back-projected into three-dimensional space to form a three-dimensional model, and the model is then re-projected onto the target plane at the virtual viewpoint to obtain the virtual-viewpoint image.
Disclosure of Invention
The invention aims to provide a double-viewpoint stereo video fusion method based on a K-means model.
The method comprises the following steps:
Step (1): preprocess the left-viewpoint and right-viewpoint depth maps to obtain the left-viewpoint and right-viewpoint depth images;
Step (2): segment the left-viewpoint and right-viewpoint depth images separately with the K-means method into foreground-region and background-region depth images, and apply the three-dimensional projection operation to the foreground-region and background-region depth images to obtain the foreground and background rendered images of the left viewpoint and the foreground and background rendered images of the right viewpoint;
Step (3): fuse the foreground and background rendered images of the left viewpoint and of the right viewpoint, respectively: where the foreground rendered image is blank, fill its empty areas from the background rendered image, yielding a left-viewpoint rendered image and a right-viewpoint rendered image; then fuse the filled left-viewpoint and right-viewpoint rendered images into a virtual-viewpoint rendered image;
Step (4): fill the hole areas of the virtual-viewpoint rendered image by weighting the pixel information around each hole to obtain the final output image.
Further, the preprocessing in step (1) comprises noise removal and image smoothing: the noise removal uses a residual neural network, and the image smoothing applies a morphological opening operation to the denoised image.
Further, the step (2) is specifically:
The input left-viewpoint depth image is clustered and segmented with the K-means method, as follows:
The pixel-value distribution probabilities of the input depth image are {p_0, p_1, …, p_255}, where p_i (i = 0, 1, …, 255) is the probability that a pixel has value i;
Set k thresholds {τ_1, τ_2, …, τ_k} as the input to the K-means operation and compute, for each pixel value, the minimum Euclidean distance to the thresholds, d_min = min_{1≤j≤k} |i - τ_j|. The threshold corresponding to the minimum Euclidean distance is taken as the initial filling value; at the same time k new thresholds {τ′_1, τ′_2, …, τ′_k} are output and used as the input of the next iteration;
After each iteration the selected thresholds τ_j change: τ_j is adjusted to the statistical mean of all elements in the set C_j, where C_j is the set of pixels whose values lie between τ_j and τ_{j+1}:
τ′_j = (1/|C_j|) Σ_{x∈C_j} x,
where x denotes a pixel in C_j;
The foreground-region mask map MOD_FG is derived pixel-wise from the clustering result (its defining equation is shown as an image in the original), and the foreground-region depth image is FG = Depth × MOD_FG. The background-region mask map MOD_BG is defined analogously (equation shown as an image in the original), and the background-region depth image is BG = Depth × MOD_BG. Here K(i, j) denotes the filling-result pixel value at pixel coordinate (i, j), and Depth denotes the depth image;
Apply the three-dimensional projection operation to the foreground-region depth image and the background-region depth image to obtain the foreground-region rendered depth image and the background-region rendered depth image;
Apply the three-dimensional projection operation to the foreground-region rendered depth image and to the background-region rendered depth image together with the color image of the original viewpoint to obtain the foreground rendered image and the background rendered image of the left viewpoint;
Perform the same operations on the depth image of the right viewpoint to obtain the foreground rendered image and the background rendered image of the right viewpoint.
Further, in step (3) the filled left-viewpoint and right-viewpoint rendered images are fused into the virtual-viewpoint rendered image. The pixel value I_blend(x, y) of the pixel at coordinate (x, y) is given by a piecewise fusion equation (shown as an image in the original) that distinguishes the following cases: l(x, y) ∈ H and r(x, y) ∈ H denote that the left viewpoint and the right viewpoint, respectively, are hole regions at coordinate (x, y), while l(x, y) ∉ H and r(x, y) ∉ H denote that they are not hole regions; I_L(x, y) and I_R(x, y) denote the pixel values of the left-viewpoint and right-viewpoint rendered images at coordinate (x, y). The left-viewpoint weight W_L and the right-viewpoint weight W_R (their defining equations are shown as images in the original) are computed from R_L and R_R, the spatial rotation matrices of the left and right viewpoints, and T_L and T_R, the spatial translation vectors of the left and right viewpoints.
Further, in step (4) the hole area consists of all pixels with I_blend(x, y) = 0. For each pixel with I_blend(x, y) = 0, the 5 × 5 region centered on its coordinate (x, y) is taken as the filling region Ω, and the 24 pixels of Ω other than (x, y) are divided into four groups of six pixels each.
The finally output pixel value at image coordinate (x, y) is computed from the group statistics (the output equation is shown as an image in the original). The mean of the non-hole pixels in the m-th group (equation shown as an image in the original) uses H(x, y) to indicate whether coordinate (x, y) is a hole region: H(x, y) = 0 if it is a hole region and H(x, y) = 1 if it is not. The weighted pixel sum (equation shown as an image in the original) uses Y(x, y), the priority of coordinate (x, y), and W, the priority weight (defined by an equation shown as an image in the original); T is used as an abbreviation for the priority Y(x, y).
The method of the invention preprocesses the depth images with a noise-removal network and a smoothing filter, which effectively reduces rendering artifacts that the depth images may otherwise introduce during virtual-viewpoint rendering and benefits the subsequent refinement of the rendered color image. The method fuses the two viewpoint images and optimizes the fused image with a weighting filter based on geometric distance. Instead of the conventional search for a best matching block, it processes the hole areas precisely at the pixel level, so the rendered result is visually better and more consistent.
Detailed Description
The double-viewpoint stereo video fusion method based on the K-means model specifically comprises the following steps:
the method comprises the following steps of (1) preprocessing a left viewpoint depth map and a right viewpoint depth map to obtain a left viewpoint depth image and a right viewpoint depth image; the preprocessing comprises noise removal processing and image smoothing processing; the noise removing processing is to select a residual error neural network for processing, and the image smoothing processing is to perform opening operation processing on the image after the noise removing processing.
Step (2): segment the left-viewpoint and right-viewpoint depth images separately with the K-means method into foreground-region and background-region depth images, and apply the three-dimensional projection operation (3D warping) to the foreground-region and background-region depth images to obtain the foreground and background rendered images of the left viewpoint and the foreground and background rendered images of the right viewpoint. The procedure is as follows:
The input left-viewpoint depth image is clustered and segmented with the K-means method. K-means is a clustering method based on data statistics: data points with high similarity are gathered into the same class, the data are partitioned into k clusters, and the separation between clusters is made as large as possible. The procedure is as follows:
The pixel-value distribution probabilities of the input depth image are {p_0, p_1, …, p_255}, where p_i (i = 0, 1, …, 255) is the probability that a pixel has value i;
Set k thresholds {τ_1, τ_2, …, τ_k} as the input to the K-means operation and compute, for each pixel value, the minimum Euclidean distance to the thresholds, d_min = min_{1≤j≤k} |i - τ_j|. The threshold corresponding to the minimum Euclidean distance is taken as the initial filling value; at the same time k new thresholds {τ′_1, τ′_2, …, τ′_k} are output and used as the input of the next iteration;
After each iteration the selected thresholds τ_j change: τ_j is adjusted to the statistical mean of all elements in the set C_j, where C_j is the set of pixels whose values lie between τ_j and τ_{j+1}:
τ′_j = (1/|C_j|) Σ_{x∈C_j} x,
where x denotes a pixel in C_j;
The foreground-region mask map MOD_FG is derived pixel-wise from the clustering result (its defining equation is shown as an image in the original), and the foreground-region depth image is FG = Depth × MOD_FG. The background-region mask map MOD_BG is defined analogously (equation shown as an image in the original), and the background-region depth image is BG = Depth × MOD_BG. Here K(i, j) denotes the filling-result pixel value at pixel coordinate (i, j), and Depth denotes the depth image.
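A minimal Python sketch of this K-means segmentation follows. The initialisation of the thresholds and the rule that assigns the cluster with the largest centre to the foreground are assumptions, since the text defines the mask maps only through equations rendered as images.

```python
import numpy as np

def kmeans_thresholds(depth: np.ndarray, k: int = 2, iters: int = 20) -> np.ndarray:
    """1-D K-means over the pixel values of a depth image.

    Each pixel value is assigned to the threshold (cluster centre) at minimum
    Euclidean distance; each threshold is then updated to the mean of its
    cluster, matching the iteration described above.
    """
    pixels = depth.astype(np.float64).ravel()
    # Thresholds initialised uniformly over the value range (assumed strategy).
    tau = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(pixels[:, None] - tau[None, :]), axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if members.size:
                tau[j] = members.mean()
    return np.sort(tau)

def split_foreground_background(depth: np.ndarray, k: int = 2):
    """Split a depth image into foreground/background region depth images."""
    tau = kmeans_thresholds(depth, k)
    labels = np.argmin(np.abs(depth[..., None].astype(np.float64) - tau), axis=-1)
    filled = tau[labels]                              # K(i, j): pixel value replaced by its centre
    mod_fg = (filled >= tau[-1]).astype(depth.dtype)  # assumed rule: largest centre = foreground
    mod_bg = (1 - mod_fg).astype(depth.dtype)
    return depth * mod_fg, depth * mod_bg             # FG = Depth x MOD_FG, BG = Depth x MOD_BG
```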
Apply the three-dimensional projection operation to the foreground-region depth image and the background-region depth image to obtain the foreground-region rendered depth image and the background-region rendered depth image. The three-dimensional projection operation is as follows:
(The projection equation is shown as an image in the original.) Here M_v and M denote the coordinate vectors of the virtual viewpoint and the original viewpoint, respectively; I_v and I denote the camera intrinsic parameter matrices of the virtual and original viewpoints, which depend only on the internal characteristics of the camera; R_v and R denote the camera extrinsic rotation matrices of the virtual and original viewpoints, of size 3 × 3, representing the rotation of the camera in three-dimensional space; T_v and T denote the camera translation vectors of the virtual and original viewpoints, of size 3 × 1, representing the translation of the camera in three-dimensional space.
Taking a 4 × 1 original-viewpoint coordinate vector M as an example, M = [x, y, C, D(x, y)]^T, where T denotes the transpose, x and y are the abscissa and ordinate of M, C is a constant coefficient (C = 1 in this embodiment), and D(x, y) is the depth value of M. The depth value reflects the distance of the scene point from the camera and can be computed from the depth image information:
(The depth-conversion equation is shown as an image in the original.) Here Depth(x, y) is the pixel value of the depth map at (x, y), MAXZ is the maximum value the depth map can take (MAXZ = 255 in this embodiment), and Z_min and Z_max are the minimum and maximum actual depths, i.e. the closest and farthest distances from the photographed subject to the camera.
Respectively carrying out three-dimensional projection operation on the foreground area rendering depth image, the background area rendering depth image and the color image of the original viewpoint to obtain a foreground rendering image and a background rendering image of the left viewpoint;
and performing the same operation on the depth image of the right viewpoint to obtain a foreground rendering image and a background rendering image of the right viewpoint.
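The sketch below illustrates the 3D warping step in Python under stated assumptions. The exact projection and depth-conversion equations are given only as images in the original, so the code uses the pinhole back-projection/re-projection form and the inverse-depth conversion that are standard in DIBR; occlusion handling (z-buffering) and hole detection are omitted for brevity.

```python
import numpy as np

def depth_to_z(depth: np.ndarray, z_min: float, z_max: float, maxz: float = 255.0) -> np.ndarray:
    """Convert 8-bit depth values to metric Z (standard DIBR inverse-depth mapping,
    assumed here since the original equation is shown only as an image)."""
    return 1.0 / (depth.astype(np.float64) / maxz * (1.0 / z_min - 1.0 / z_max) + 1.0 / z_max)

def warp_to_virtual(color, z, K, R, T, K_v, R_v, T_v):
    """Forward-warp a colour image to the virtual viewpoint (minimal sketch).

    K, R, T       -- intrinsics, rotation (3x3) and translation (3,) of the original camera
    K_v, R_v, T_v -- the same for the virtual camera
    color         -- H x W x 3 image, z -- H x W metric depth
    """
    h, w = z.shape
    out = np.zeros_like(color)
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])   # homogeneous pixels, 3 x N
    cam = np.linalg.inv(K) @ pix * z.ravel()                   # back-project to camera space
    world = np.linalg.inv(R) @ (cam - T.reshape(3, 1))         # camera space -> world space
    proj = K_v @ (R_v @ world + T_v.reshape(3, 1))             # world -> virtual image plane
    u = np.round(proj[0] / proj[2]).astype(int)
    v = np.round(proj[1] / proj[2]).astype(int)
    valid = (proj[2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    out[v[valid], u[valid]] = color.reshape(-1, 3)[valid]      # no z-buffer: last write wins
    return out
```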
Step (3): fuse the foreground and background rendered images of the left viewpoint and of the right viewpoint, respectively: where the foreground rendered image is blank, fill its empty areas from the background rendered image, yielding a left-viewpoint rendered image and a right-viewpoint rendered image; then fuse the filled left-viewpoint and right-viewpoint rendered images into a virtual-viewpoint rendered image. The pixel value I_blend(x, y) of the pixel at coordinate (x, y) is given by a piecewise fusion equation (shown as an image in the original) that distinguishes the following cases: l(x, y) ∈ H and r(x, y) ∈ H denote that the left viewpoint and the right viewpoint, respectively, are hole regions at coordinate (x, y), while l(x, y) ∉ H and r(x, y) ∉ H denote that they are not hole regions; I_L(x, y) and I_R(x, y) denote the pixel values of the left-viewpoint and right-viewpoint rendered images at coordinate (x, y). The left-viewpoint weight W_L and the right-viewpoint weight W_R (their defining equations are shown as images in the original) are computed from R_L and R_R, the spatial rotation matrices of the left and right viewpoints, and T_L and T_R, the spatial translation vectors of the left and right viewpoints.
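A minimal Python sketch of the two-view fusion follows. The piecewise rule and the weight formulas are given only as images in the original, so the weights w_l and w_r appear as plain parameters here (a baseline-distance ratio derived from the camera translations is a typical choice), and the handling of hole/non-hole pixels is the assumed reading of the case distinctions above.

```python
import numpy as np

def blend_views(img_l, img_r, hole_l, hole_r, w_l=0.5, w_r=0.5):
    """Fuse left/right rendered images into the virtual view.

    hole_l / hole_r are boolean masks, True where the corresponding rendered
    view has a hole at that pixel.
    """
    img_l = img_l.astype(np.float64)
    img_r = img_r.astype(np.float64)
    both   = (~hole_l & ~hole_r)[..., None]   # both views valid -> weighted sum
    only_l = (~hole_l &  hole_r)[..., None]   # only the left view valid
    only_r = ( hole_l & ~hole_r)[..., None]   # only the right view valid
    blend  = both * (w_l * img_l + w_r * img_r) + only_l * img_l + only_r * img_r
    return blend                               # pixels left at 0 are the remaining holes
```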
Step (4): fill the hole areas of the virtual-viewpoint rendered image by weighting the pixel information around each hole to obtain the final output image. The hole area consists of all pixels with I_blend(x, y) = 0. For each pixel with I_blend(x, y) = 0, the 5 × 5 region centered on its coordinate (x, y) is taken as the filling region Ω, and the 24 pixels of Ω other than (x, y) are divided into four groups of six pixels each.
The finally output pixel value at image coordinate (x, y) is computed from the group statistics (the output equation is shown as an image in the original). The mean of the non-hole pixels in the m-th group (equation shown as an image in the original) uses H(x, y) to indicate whether coordinate (x, y) is a hole region: H(x, y) = 0 if it is a hole region and H(x, y) = 1 if it is not. The weighted pixel sum (equation shown as an image in the original) uses Y(x, y), the priority of coordinate (x, y), and W, the priority weight (defined by an equation shown as an image in the original); T is used as an abbreviation for the priority Y(x, y).

Claims (1)

1. The double-viewpoint stereo video fusion method based on the K-means model is characterized by comprising the following steps:
Step (1): preprocess the left-viewpoint and right-viewpoint depth maps to obtain the left-viewpoint and right-viewpoint depth images; the preprocessing comprises noise removal and image smoothing: the noise removal uses a residual neural network, and the image smoothing applies a morphological opening operation to the denoised image;
Step (2): segment the left-viewpoint and right-viewpoint depth images separately with the K-means method into foreground-region and background-region depth images, and apply the three-dimensional projection operation to the foreground-region and background-region depth images to obtain the foreground and background rendered images of the left viewpoint and the foreground and background rendered images of the right viewpoint, as follows:
The input left-viewpoint depth image is clustered and segmented with the K-means method:
The pixel-value distribution probabilities of the input depth image are {p_0, p_1, …, p_255}, where p_i (i = 0, 1, …, 255) is the probability that a pixel has value i;
Set k thresholds {τ_1, τ_2, …, τ_k} as the input to the K-means operation and compute, for each pixel value, the minimum Euclidean distance to the thresholds, d_min = min_{1≤j≤k} |i - τ_j|; the threshold corresponding to the minimum Euclidean distance is taken as the initial filling value, and at the same time k new thresholds {τ′_1, τ′_2, …, τ′_k} are output and used as the input of the next iteration;
After each iteration the selected thresholds τ_j change: τ_j is adjusted to the statistical mean of all elements in the set C_j, where C_j is the set of pixels whose values lie between τ_j and τ_{j+1}: τ′_j = (1/|C_j|) Σ_{x∈C_j} x, where x denotes a pixel in C_j;
The foreground-region mask map MOD_FG is derived pixel-wise from the clustering result (its defining equation is shown as an image in the original), and the foreground-region depth image is FG = Depth × MOD_FG; the background-region mask map MOD_BG is defined analogously (equation shown as an image in the original), and the background-region depth image is BG = Depth × MOD_BG; here K(i, j) denotes the filling-result pixel value at pixel coordinate (i, j), and Depth denotes the depth image;
Apply the three-dimensional projection operation to the foreground-region depth image and the background-region depth image to obtain the foreground-region rendered depth image and the background-region rendered depth image;
Apply the three-dimensional projection operation to the foreground-region rendered depth image and to the background-region rendered depth image together with the color image of the original viewpoint to obtain the foreground rendered image and the background rendered image of the left viewpoint; the three-dimensional projection operation is as follows (the projection equation is shown as an image in the original): M_v and M denote the coordinate vectors of the virtual viewpoint and the original viewpoint, respectively; I_v and I denote the camera intrinsic parameter matrices of the virtual and original viewpoints, which depend only on the internal characteristics of the camera; R_v and R denote the camera extrinsic rotation matrices of the virtual and original viewpoints, of size 3 × 3, representing the rotation of the camera in three-dimensional space; T_v and T denote the camera translation vectors of the virtual and original viewpoints, of size 3 × 1, representing the translation of the camera in three-dimensional space;
Perform the same operations on the depth image of the right viewpoint to obtain the foreground rendered image and the background rendered image of the right viewpoint;
Step (3): fuse the foreground and background rendered images of the left viewpoint and of the right viewpoint, respectively: where the foreground rendered image is blank, fill its empty areas from the background rendered image, yielding a left-viewpoint rendered image and a right-viewpoint rendered image; then fuse the filled left-viewpoint and right-viewpoint rendered images into a virtual-viewpoint rendered image; the pixel value I_blend(x, y) of the pixel at coordinate (x, y) is given by a piecewise fusion equation (shown as an image in the original) that distinguishes the following cases: l(x, y) ∈ H and r(x, y) ∈ H denote that the left viewpoint and the right viewpoint, respectively, are hole regions at coordinate (x, y), while l(x, y) ∉ H and r(x, y) ∉ H denote that they are not hole regions; I_L(x, y) and I_R(x, y) denote the pixel values of the left-viewpoint and right-viewpoint rendered images at coordinate (x, y); the left-viewpoint weight W_L and the right-viewpoint weight W_R (their defining equations are shown as images in the original) are computed from R_L and R_R, the spatial rotation matrices of the left and right viewpoints, and T_L and T_R, the spatial translation vectors of the left and right viewpoints;
Step (4): fill the hole areas of the virtual-viewpoint rendered image by weighting the pixel information around each hole to obtain the final output image;
The hole area consists of all pixels with I_blend(x, y) = 0; for each pixel with I_blend(x, y) = 0, the 5 × 5 region centered on its coordinate (x, y) is taken as the filling region Ω, and the 24 pixels of Ω other than (x, y) are divided into four groups of six pixels each;
The finally output pixel value at image coordinate (x, y) is computed from the group statistics (the output equation is shown as an image in the original); the mean of the non-hole pixels in the m-th group (equation shown as an image in the original) uses H(x, y) to indicate whether coordinate (x, y) is a hole region: H(x, y) = 0 if it is a hole region and H(x, y) = 1 if it is not; the weighted pixel sum (equation shown as an image in the original) uses Y(x, y), the priority of coordinate (x, y), and W, the priority weight (defined by an equation shown as an image in the original); T is used as an abbreviation for the priority Y(x, y).
CN202110295931.9A 2021-03-19 2021-03-19 Double-viewpoint stereo video fusion method based on K-means model Active CN113179396B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110295931.9A CN113179396B (en) 2021-03-19 2021-03-19 Double-viewpoint stereo video fusion method based on K-means model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110295931.9A CN113179396B (en) 2021-03-19 2021-03-19 Double-viewpoint stereo video fusion method based on K-means model

Publications (2)

Publication Number Publication Date
CN113179396A (en) 2021-07-27
CN113179396B true CN113179396B (en) 2022-11-11

Family

ID=76922161

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110295931.9A Active CN113179396B (en) 2021-03-19 2021-03-19 Double-viewpoint stereo video fusion method based on K-means model

Country Status (1)

Country Link
CN (1) CN113179396B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102055982A (en) * 2011-01-13 2011-05-11 浙江大学 Coding and decoding methods and devices for three-dimensional video
CN105141940A (en) * 2015-08-18 2015-12-09 太原科技大学 3D video coding method based on regional division

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102592275B (en) * 2011-12-16 2013-12-25 天津大学 Virtual viewpoint rendering method
CN102903110B (en) * 2012-09-29 2015-11-25 宁波大学 To the dividing method of image with deep image information
CN103269438A (en) * 2013-05-27 2013-08-28 中山大学 Method for drawing depth image on the basis of 3D video and free-viewpoint television
CN103996174B (en) * 2014-05-12 2017-05-10 上海大学 Method for performing hole repair on Kinect depth images
CN109712067B (en) * 2018-12-03 2021-05-28 北京航空航天大学 Virtual viewpoint drawing method based on depth image
CN111385554B (en) * 2020-03-28 2022-07-08 浙江工业大学 High-image-quality virtual viewpoint drawing method of free viewpoint video


Also Published As

Publication number Publication date
CN113179396A (en) 2021-07-27

Similar Documents

Publication Publication Date Title
CN113192179B (en) Three-dimensional reconstruction method based on binocular stereo vision
CN111325693B (en) Large-scale panoramic viewpoint synthesis method based on single viewpoint RGB-D image
CN110853151A (en) Three-dimensional point set recovery method based on video
KR100560464B1 (en) Multi-view display system with viewpoint adaptation
CN111462030A (en) Multi-image fused stereoscopic set vision new angle construction drawing method
CN115298708A (en) Multi-view neural human body rendering
Tomiyama et al. Algorithm for dynamic 3D object generation from multi-viewpoint images
Xu et al. Layout-guided novel view synthesis from a single indoor panorama
CN111951368A (en) Point cloud, voxel and multi-view fusion deep learning method
Zhu et al. An improved depth image based virtual view synthesis method for interactive 3D video
Ma et al. Depth-guided inpainting algorithm for free-viewpoint video
Jantet et al. Joint projection filling method for occlusion handling in depth-image-based rendering
Zhu et al. Occlusion-free scene recovery via neural radiance fields
CN117501313A (en) Hair rendering system based on deep neural network
CN113179396B (en) Double-viewpoint stereo video fusion method based on K-means model
CN117730530A (en) Image processing method and device, equipment and storage medium
CN116681839A (en) Live three-dimensional target reconstruction and singulation method based on improved NeRF
CN113450274B (en) Self-adaptive viewpoint fusion method and system based on deep learning
CN115409932A (en) Texture mapping and completion method of three-dimensional human head and face model
CN114742954A (en) Method for constructing large-scale diversified human face image and model data pairs
CN111178163B (en) Stereoscopic panoramic image salient region prediction method based on cube projection format
CN113763474A (en) Scene geometric constraint-based indoor monocular depth estimation method
CN112364711A (en) 3D face recognition method, device and system
Lee et al. Hole concealment for depth image using pixel classification in multiview system
Lee et al. Removing foreground objects by using depth information from multi-view images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant