CN111385554B - High-image-quality virtual viewpoint drawing method of free viewpoint video

High-image-quality virtual viewpoint drawing method of free viewpoint video

Info

Publication number
CN111385554B
Authority
CN
China
Prior art keywords
virtual viewpoint
pixel
viewpoint
depth
image
Prior art date
Legal status
Active
Application number
CN202010232831.7A
Other languages
Chinese (zh)
Other versions
CN111385554A (en)
Inventor
朱威 (Zhu Wei)
陈璐瑶 (Chen Luyao)
陈思洁 (Chen Sijie)
岑宽 (Cen Kuan)
郑雅羽 (Zheng Yayu)
Current Assignee
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT
Priority to CN202010232831.7A
Publication of CN111385554A
Application granted
Publication of CN111385554B
Status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00: Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10: Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106: Processing image signals
    • H04N13/122: Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
    • H04N13/133: Equalising the characteristics of different image components, e.g. their average brightness or colour balance
    • H04N13/15: Processing image signals for colour aspects of image signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a high-image-quality virtual viewpoint drawing method for free viewpoint video, which comprises the following steps: inputting the color images, depth images, camera parameters and virtual viewpoint position parameters of two reference viewpoints; calculating the camera parameter matrix of the virtual viewpoint; detecting edges in the reference viewpoint depth maps; forward-mapping the reference viewpoint depth maps; median-filtering the virtual viewpoint depth maps; backward-mapping to the virtual viewpoint; dilating the hole mask maps in combination with the edge information; performing brightness correction and image fusion; and interpolating the remaining holes and outputting the final virtual viewpoint color image. The invention repairs small holes, eliminates false edges while avoiding unnecessary dilation and protecting foreground edges, and fills large holes by fusing the two viewpoints, which effectively improves hole interpolation at foreground edges. It thereby resolves the holes, false edges and brightness differences that arise during virtual viewpoint drawing, and the output virtual viewpoints have good subjective and objective image quality.

Description

High-image-quality virtual viewpoint drawing method of free viewpoint video
Technical Field
The invention belongs to the application of digital image processing technology in the field of three-dimensional video processing, and particularly relates to a high-image-quality virtual viewpoint drawing method for free viewpoint video.
Background
Free viewpoint video is an interactive video technology that allows a user to freely change the viewing angle and can provide a novel immersive visual experience. In recent years it has found applications in fields such as live sports broadcasting, video conferencing, 3D television and distance education. Traditional multi-view video requires dense multi-channel color video, consuming huge transmission bandwidth and storage resources; moreover, the number of views cannot be increased without limit, the angle-switching transitions between views are unnatural, and the user experience is poor. Therefore, the new generation of free viewpoint video pairs color video with corresponding depth video at a limited number of viewing angles and uses virtual viewpoint rendering to obtain a viewpoint at any angle, which also greatly reduces the data volume of free viewpoint video.
Virtual viewpoint drawing is a technology for rendering the image seen from a virtual viewpoint at another position in space from one or more known reference viewpoint images. Depth Image Based Rendering (DIBR) renders a virtual viewpoint image using only the reference viewpoint images, their depth maps and the camera's intrinsic and extrinsic parameters; it offers good image quality and high rendering speed and is currently a research hotspot in the field of free viewpoint video. The core of DIBR is the 3D Image Warping equation (see W. Mark, L. McMillan, G. Bishop, Post-Rendering 3D Warping, Proc. Symposium on Interactive 3D Graphics, 1997, pp. 7-16), which can be divided into two steps: first, pixels in the reference viewpoint are mapped into three-dimensional space; second, the points in three-dimensional space are projected onto the virtual viewpoint image plane. DIBR can be divided into forward mapping and backward mapping: forward mapping computes, from the pixel coordinates of the original image, the corresponding pixel coordinates in the mapped image, while backward mapping computes, from the pixel coordinates of the target image, the corresponding pixel coordinates in the original image. Because every pixel must be computed and there is no data dependency between the computations of different pixels, the DIBR method is highly parallel under multi-threaded computation, which can greatly reduce computation time and meet real-time requirements.
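For concreteness, the 3D warping can be written in the following standard form (the notation is illustrative rather than quoted from the patent): a reference-view pixel $(u,v)$ with depth $z$, intrinsic matrix $K_r$ and world-to-camera extrinsics $(R_r, t_r)$ is first back-projected to the world point $X_w$, which is then projected through the virtual camera $(K_v, R_v, t_v)$:

$$X_w = R_r^{-1}\left(z\,K_r^{-1}\,[u,\,v,\,1]^T - t_r\right),\qquad z_v\,[u',\,v',\,1]^T = K_v\left(R_v X_w + t_v\right)$$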
Patent application No. CN201710304519.2 discloses a virtual viewpoint hole filling method based on inverse mapping: the left and right viewpoints are mapped to the virtual viewpoint respectively, contour artifacts are then eliminated through hole dilation, holes are filled based on inverse mapping, and finally the images are fused. However, brightness differences between the left and right viewpoints still leave artifacts in the image, and the hole dilation eliminates some normal boundaries along with the contour artifacts. Patent application No. CN201610334492.7 discloses a free viewpoint image synthesis method with color correction, which maps the left and right reference viewpoints to the virtual viewpoint, adjusts the color of one viewpoint with a histogram matching algorithm, and then performs image fusion and hole filling. However, the histogram matching and layer-by-layer hole filling of this method have poor real-time performance.
Disclosure of Invention
In order to realize high-quality image display of any viewpoint in a free viewpoint video, the invention provides a high-image-quality virtual viewpoint drawing method of the free viewpoint video, which specifically comprises the following steps:
(1) inputting a color image, a depth image, camera parameters and virtual viewpoint position parameters of two reference viewpoints:
the method comprises the steps that two input reference viewpoints comprise a left reference viewpoint and a right reference viewpoint, each reference viewpoint image comprises a color image and a corresponding depth image, and camera parameters of each reference viewpoint comprise an internal reference matrix and an external reference matrix which are calibrated; the virtual viewpoint is located between the connecting lines of the two reference viewpoints, and the position parameters of the virtual viewpoint comprise Euclidean distance D between the virtual viewpoint and the left reference viewpointLAnd Euclidean distance D from right reference viewpointR. The input color image adopts an RGB space format.
(2) Calculating a camera parameter matrix of the virtual viewpoint:
Using the camera parameters of the two reference viewpoints input in step (1) and the position parameters of the virtual viewpoint relative to the two reference viewpoints, the camera parameter matrix of the virtual viewpoint is calculated with a classical linear interpolation method.
(3) Edge detection on the reference viewpoint depth maps, to obtain the foreground edge marker maps of the two depth maps:
Morphological dilation with a 3 x 3 rectangular template is performed on each of the two reference viewpoint depth maps input in step (1), giving two dilated depth maps; Canny edge detection is then performed on each of the two dilated depth maps, giving two binary foreground edge marker maps. The purpose of the depth map dilation is to push the subsequently detected edges outside the foreground contours.
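A minimal OpenCV sketch of this step is given below; the Canny thresholds 50 and 150 are the values used in the embodiment later in the description:

```python
import cv2

def foreground_edge_mask(depth_map):
    """Dilate the depth map with a 3x3 rectangular template, then detect
    edges, so the detected edges fall outside the foreground contours."""
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    dilated = cv2.dilate(depth_map, kernel)
    return cv2.Canny(dilated, 50, 150)   # binary foreground edge marker map
```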
(4) Forward-mapping the reference viewpoint depth maps to obtain two virtual viewpoint depth maps, and forward-mapping the foreground edge marker maps of the two depth maps to obtain two mapped foreground edge marker maps:
With the DIBR method, using the reference viewpoint camera parameters input in step (1) and the virtual viewpoint camera parameter matrix calculated in step (2), forward-map the two reference viewpoint depth maps input in step (1) three-dimensionally onto the imaging plane of the virtual viewpoint, obtaining a virtual viewpoint depth map mapped from the left viewpoint and one mapped from the right viewpoint. During the mapping of each reference viewpoint, if two pixel values in the same reference viewpoint depth map are mapped to the same pixel position of the corresponding virtual viewpoint depth map, the two pixel values are compared and the smaller one is kept, to ensure that the foreground is not mistakenly occluded by the background.
In addition, with the DIBR method, forward-map the foreground edge marker maps of the left and right viewpoints obtained in step (3) three-dimensionally onto the imaging plane of the virtual viewpoint, obtaining the foreground edge marker map after left-viewpoint mapping and the one after right-viewpoint mapping.
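A sketch of the forward mapping with the depth-conflict rule is shown below; warp_to_virtual stands in for the 3D warping described in the Background section and is an assumed helper, not part of the patent text:

```python
import numpy as np

def forward_map_depth(depth_ref, warp_to_virtual):
    """Forward-map a reference depth map into the virtual view.  When two
    source pixels land on the same target pixel, the smaller depth value
    is kept so the foreground is not occluded by the background."""
    h, w = depth_ref.shape
    depth_virt = np.zeros_like(depth_ref)            # 0 marks unmapped pixels
    for v in range(h):
        for u in range(w):
            z = depth_ref[v, u]
            u2, v2 = warp_to_virtual(u, v, z)        # rounded target coords
            if 0 <= u2 < w and 0 <= v2 < h:
                if depth_virt[v2, u2] == 0 or z < depth_virt[v2, u2]:
                    depth_virt[v2, u2] = z
    return depth_virt
```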
(5) Median filtering of the virtual viewpoint depth map:
and (4) performing median filtering on the two virtual viewpoint depth maps mapped forwards by the left viewpoint and the right viewpoint in the step (4) by using a 3 x 3 rectangular template respectively to obtain two filtered virtual viewpoint depth maps. The median filtering can substantially eliminate a crack in the virtual viewpoint depth map due to a rounding error at the time of forward mapping, or the like.
(6) Backward-mapping the color images of the two reference viewpoints input in step (1) using the two filtered virtual viewpoint depth maps obtained in step (5), to obtain two mapped virtual viewpoint color images and the corresponding hole mask maps:
and (3) respectively carrying out backward three-dimensional mapping on the two color images of the reference viewpoint input in the step (1) to an imaging plane of the virtual viewpoint by using the camera parameters of the reference viewpoint input in the step (1) and the camera parameter matrix of the virtual viewpoint calculated in the step (2) and the median-filtered virtual viewpoint depth map obtained in the step (5) by adopting a DIBR method, so as to obtain a virtual viewpoint color map mapped by the left viewpoint and a virtual viewpoint color map mapped by the right viewpoint. Due to the view angle conversion, a hole due to occlusion exists in the mapped color image, the hole is a region that is not three-dimensionally mapped, and the corresponding pixel values are all RGB (0,0, 0). In the mapping process, the mapped pixel positions are marked, the left unmarked pixels become holes, and a hole mask map mapped by a left viewpoint and a hole mask map mapped by a right viewpoint are respectively marked, wherein the hole mask map is a binary map, the pixel value of the hole mask map is 0, which represents that the pixel at the current position is marked as a hole pixel, and the value of 255 represents that the pixel at the current position is not marked as a hole pixel.
(7) Dilating the hole mask maps in combination with the edge information of the two mapped foreground edge marker maps obtained in step (4), to obtain two dilated hole mask maps:
and (3) carrying out selective morphological dilation on the left viewpoint mapping hole mask image obtained in the step (6) by using the foreground edge mark image obtained in the step (4) after the left viewpoint mapping, namely carrying out dilation on the position with the edge mark only to obtain the dilated left hole mask image, wherein the region possibly generating the false edge artifact is changed into a hole as a processing result. And (4) carrying out the same selective morphological dilation treatment on the right viewpoint mapping cavity mask map according to the foreground edge marking map obtained in the step (4) after the right viewpoint mapping, so as to obtain the dilated right cavity mask map.
(8) Brightness correction and image fusion:
and (3) fusing the two virtual viewpoint color images obtained by mapping the left viewpoint and the right viewpoint in the step (6) by using the expanded left and right cavity mask images obtained in the step (7) to obtain a fused virtual viewpoint color image and a cavity mask image of the fused virtual viewpoint color image, wherein the specific steps are as follows:
(8-1) If the two pixels at the same position in the two hole mask maps are both non-hole pixels, convert the pixel values at that position in the two virtual viewpoint color images from the RGB space format to the HSV space format, compute the ratio of the brightness components of the left and right images at that position, and accumulate it; finally the average value k of all brightness ratios is obtained.
(8-2) First, convert the pixel value at position (u, v) in the virtual viewpoint color image mapped from the right viewpoint obtained in step (6) from the RGB space format to the HSV space format, multiply its brightness value by k, and convert it back from HSV to RGB, obtaining the brightness-corrected right-mapped virtual viewpoint color image pixel value I_R(u, v). Then fuse the pixel value I_R(u, v) with the co-located left-mapped virtual viewpoint color image pixel value I_L(u, v) according to equation (1) to obtain the fused color image pixel value I_F(u, v). This processing is applied at every pixel position, finally yielding the fused virtual viewpoint color image I_F. In equation (1), H_L and H_R denote the left and right hole mask maps obtained in step (7), in which a value of 0 means the pixel at the current position is a hole pixel and a non-zero value means it is not. The fusion splits into four cases. If neither of the two pixels at the current position in the left and right hole mask maps is a hole pixel, the pixel value of I_F at that position is a weighted combination of the pixel values of I_L and I_R at the corresponding position with proportionality coefficient α, computed by equation (2), where D_R and D_L are the Euclidean distances between the virtual viewpoint and the right and left viewpoints from step (1). If the current position is a non-hole pixel only in the left hole mask map, the pixel value of I_L is used directly as the pixel value of I_F; if it is a non-hole pixel only in the right hole mask map, the pixel value of I_R is used directly. Otherwise, if the current position is a hole pixel in both the left and right hole mask maps, the corresponding pixel of I_F is itself a hole pixel that cannot be computed from the left and right viewpoint pixels; its value is set to 0 and the pixel is given a binary class mark. After the class marking is finished, the hole mask map of the fused virtual viewpoint color image is obtained, in which a value of 0 indicates a hole and 255 a non-hole.
$$I_F(u,v)=\begin{cases}\alpha\,I_L(u,v)+(1-\alpha)\,I_R(u,v), & H_L(u,v)\neq 0,\ H_R(u,v)\neq 0\\ I_L(u,v), & H_L(u,v)\neq 0,\ H_R(u,v)=0\\ I_R(u,v), & H_L(u,v)=0,\ H_R(u,v)\neq 0\\ 0, & H_L(u,v)=0,\ H_R(u,v)=0\end{cases}\tag{1}$$

$$\alpha=\frac{D_R}{D_L+D_R}\tag{2}$$
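A vectorized NumPy/OpenCV sketch of the brightness correction and the four-case fusion of equations (1) and (2) is given below; the function and variable names are mine:

```python
import cv2
import numpy as np

def fuse_views(I_L, I_R, H_L, H_R, D_L, D_R):
    """Correct the right-mapped image's brightness by the mean V-ratio k,
    then fuse per equations (1) and (2)."""
    both = (H_L > 0) & (H_R > 0)
    hsv_l = cv2.cvtColor(I_L, cv2.COLOR_RGB2HSV).astype(np.float32)
    hsv_r = cv2.cvtColor(I_R, cv2.COLOR_RGB2HSV).astype(np.float32)
    # mean ratio k of the V (brightness) components over co-visible pixels
    k = np.mean(hsv_l[..., 2][both] / np.maximum(hsv_r[..., 2][both], 1e-6))
    hsv_r[..., 2] = np.clip(hsv_r[..., 2] * k, 0, 255)
    I_Rc = cv2.cvtColor(hsv_r.astype(np.uint8), cv2.COLOR_HSV2RGB)

    alpha = D_R / (D_L + D_R)                        # equation (2)
    I_F = np.zeros_like(I_L)
    I_F[both] = (alpha * I_L[both] + (1 - alpha) * I_Rc[both]).astype(np.uint8)
    only_l = (H_L > 0) & (H_R == 0)
    only_r = (H_L == 0) & (H_R > 0)
    I_F[only_l] = I_L[only_l]                        # left view only
    I_F[only_r] = I_Rc[only_r]                       # right view only
    hole_F = np.where(both | only_l | only_r, 255, 0).astype(np.uint8)
    return I_F, hole_F
```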
(9) Residual hole interpolation, and output of the final virtual viewpoint color image:
and (3) interpolating the holes in the virtual viewpoint color image fused in the step (8) by using the two filtered virtual viewpoint depth images obtained in the step (5) and the fused hole mask image obtained in the step (8), and outputting a final virtual viewpoint color image, wherein the specific processing method comprises the following steps:
(9-1) For each pixel belonging to a hole in the fused hole mask map obtained in step (8), search the mask map in the four directions up, down, left and right for the first non-hole pixel, obtaining 4 non-hole pixels; then obtain the depth values of these 4 non-hole pixels from the two filtered virtual viewpoint depth maps obtained in step (5), as follows: preferentially take the depth value at the corresponding pixel position in the left depth map as the depth value of the pixel; if the value at that point in the left depth map is 0, i.e. no depth value exists, take the depth value at the corresponding pixel position in the right depth map.
(9-2) Sort the depth values of the 4 found non-hole pixels from large to small and compute the difference between every two adjacent depth values, eliminating weakly correlated non-hole pixels according to the differences, as follows: if the difference between the larger and the smaller of two adjacent depth values is greater than a threshold Thd, discard the smaller value and the non-hole pixels corresponding to the depth values ranked after it, and keep the non-hole pixels with the larger depth values. The value range of the threshold Thd is [5, 20].
(9-3) Finally, using the retained non-hole pixels, compute the weighted average of equation (3) to obtain the interpolation result I_c for the current hole pixel in the virtual viewpoint color image, used as the pixel value of the current hole pixel in the fused virtual viewpoint color image obtained in step (8):
$$I_c=\frac{\sum_{i=1}^{n} I_i/d_i}{\sum_{i=1}^{n} 1/d_i}\tag{3}$$
In the formula, I_c denotes the interpolated pixel value of the current hole pixel in the virtual viewpoint color image, n is the number of retained non-hole pixels, i is the non-hole pixel index, I_i denotes the pixel value of non-hole pixel i in the fused virtual viewpoint color image obtained in step (8), and d_i is the pixel distance between non-hole pixel i and the current hole pixel; the weight of each non-hole pixel is inversely proportional to its pixel distance. This processing avoids mixing foreground colors into image holes while repairing them, and gives good results.
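The per-pixel interpolation of step (9) can be sketched as follows for a single hole pixel (u, v); the helper's name and the scan loop are mine, and Thd = 10 follows the embodiment:

```python
import numpy as np

def fill_hole_pixel(I_F, hole_mask, depth_l, depth_r, u, v, thd=10):
    """Depth-aware weighted interpolation of equation (3) for one hole pixel."""
    h, w = hole_mask.shape
    found = []                                         # (color, distance, depth)
    for du, dv in ((0, -1), (0, 1), (-1, 0), (1, 0)):  # up, down, left, right
        uu, vv, d = u + du, v + dv, 1
        while 0 <= uu < w and 0 <= vv < h:
            if hole_mask[vv, uu] != 0:                 # first non-hole pixel
                z = int(depth_l[vv, uu]) if depth_l[vv, uu] != 0 else int(depth_r[vv, uu])
                found.append((I_F[vv, uu].astype(np.float64), d, z))
                break
            uu, vv, d = uu + du, vv + dv, d + 1
    if not found:
        return I_F[v, u]
    found.sort(key=lambda c: c[2], reverse=True)       # depth, large to small
    kept = [found[0]]
    for prev, cur in zip(found, found[1:]):
        if prev[2] - cur[2] > thd:                     # large depth jump: stop
            break
        kept.append(cur)
    wsum = sum(1.0 / d for _, d, _ in kept)
    return (sum(c / d for c, d, _ in kept) / wsum).astype(np.uint8)
```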
The technical conception of the invention is as follows: first, Canny edge detection is performed on the dilated reference viewpoint depth maps in preparation for the subsequent hole dilation; the reference viewpoint depth maps are forward-mapped to obtain the virtual viewpoint depth maps and the mapped foreground edge marker maps, and the depth maps are median-filtered to reduce small holes; the filtered depth maps are then used for backward mapping to obtain the initial virtual viewpoint color images and hole mask maps; next, the hole mask maps undergo dilation based on edge detection to eliminate false edges; the two virtual viewpoint images are then fused after brightness correction to fill large holes; finally, the remaining holes are filled with a depth-aware interpolation method, improving the image quality of the virtual viewpoint.
Compared with the prior art, the method has the following beneficial effects: median filtering solves the small-hole problem; hole dilation based on edge detection removes the influence of false edges while avoiding unnecessary dilation and protecting foreground edges; brightness correction in the HSV color space makes the large holes filled by fusing the two viewpoints look natural; and a fast interpolation method that takes depth information into account effectively improves hole interpolation at foreground edges. The virtual viewpoints obtained by the method have good subjective and objective image quality.
Drawings
FIG. 1 is a block diagram of the process of the present invention.
In fig. 2, (a) is a left viewpoint color image, (b) is a left viewpoint depth image, (c) is a right viewpoint color image, and (d) is a right viewpoint depth image.
Fig. 3 is a diagram of foreground edge labeling for the right viewpoint.
Fig. 4 is a foreground edge label map after right viewpoint mapping.
Fig. 5 is a virtual viewpoint depth map after right viewpoint median filtering.
Fig. 6 is a color diagram of virtual viewpoints before fusion, where (a) is a left viewpoint and (b) is a right viewpoint.
Fig. 7 is a hole mask diagram after the right viewpoint dilation processing.
Fig. 8 is a color diagram of a virtual viewpoint after image fusion.
Fig. 9 is a virtual viewpoint color image of the final rendering output.
Detailed Description
The present invention will be described in detail below with reference to an embodiment and the drawings, but the invention is not limited thereto. "Ballet" is a multi-view-plus-depth free viewpoint video test sequence provided by Microsoft Research, with 8 reference viewpoints in total and a physical separation of 20 cm between adjacent viewpoints. This embodiment is based on the "Ballet" sequence data set and uses the CUDA technology of an NVIDIA GPU for parallel computation acceleration.
As shown in fig. 1, a method for rendering a high image quality virtual viewpoint of a free viewpoint video includes the following steps:
(1) inputting the color images, depth images and camera parameters corresponding to the two reference viewpoints; the camera parameters of each reference viewpoint comprise a calibrated intrinsic matrix and extrinsic matrix; the virtual viewpoint lies on the line connecting the two reference viewpoints, and its position parameters relative to the two reference viewpoints comprise the Euclidean distance D_L from the left reference viewpoint and the Euclidean distance D_R from the right reference viewpoint;
(2) Calculating to obtain a camera parameter matrix of the virtual viewpoint by adopting a linear interpolation method based on the camera parameters of the two reference viewpoints input in the step (1) and the position parameters of the virtual viewpoint relative to the two reference viewpoints respectively;
(3) performing edge detection on the depth maps corresponding to the two reference viewpoints input in the step (1) to obtain foreground edge marker maps of the two depth maps;
(4) forward mapping is carried out on the depth maps corresponding to the two reference viewpoints input in the step (1) to obtain two virtual viewpoint depth maps, and forward mapping is respectively carried out on the foreground edge mark maps of the two depth maps obtained in the step (3) to obtain two mapped foreground edge mark maps;
(5) performing median filtering on the two virtual viewpoint depth maps obtained in the step (4) respectively to obtain two filtered virtual viewpoint depth maps;
(6) backward-mapping the color images of the two reference viewpoints input in step (1) using the two filtered virtual viewpoint depth maps obtained in step (5), to obtain two mapped virtual viewpoint color images and the corresponding hole mask maps;
(7) dilating the two hole mask maps obtained in step (6) using the two mapped foreground edge marker maps obtained in step (4), to obtain two dilated hole mask maps;
(8) correspondingly fusing the two mapped virtual viewpoint color images obtained in step (6) using the two dilated hole mask maps obtained in step (7), to obtain a fused virtual viewpoint color image and its hole mask map;
(9) interpolating the holes in the fused virtual viewpoint color image obtained in step (8), and outputting the final virtual viewpoint color image.
The step (1) specifically comprises the following steps:
viewpoints 5 and 3 of the "pellet" sequence are input as left and right reference viewpoints, respectively, and each reference viewpoint image includes an RGB color image and a depth image, as shown in fig. 2. And inputting parameters of each reference viewpoint camera, including an internal reference matrix and an external reference matrix of the calibrated reference viewpoint camera. And copying the color image, the depth image matrix and the camera parameter matrix into a GPU video memory, and operating in the GPU in subsequent steps. The virtual viewpoint is on the connection line of the two reference viewpoints, and the position parameters of the virtual viewpoint comprise Euclidean distance D between the virtual viewpoint and the left viewpointLAnd Euclidean distance D from the right viewpointRHere, the virtual viewpoint is set at the middle of two reference viewpoints, with DL=DR
The step (2) specifically comprises the following steps:
Using the camera parameters of the two reference viewpoints input in step (1) and the position parameters of the virtual viewpoint relative to the two reference viewpoints, the camera parameter matrix of the virtual viewpoint is calculated with a classical linear interpolation method.
The step (3) specifically comprises the following steps:
performing morphological dilation treatment on the two reference viewpoint depth maps input in the step (1) by using rectangular templates with the size of 3 x 3 respectively to obtain two dilated depth maps; and performing Canny edge detection on the two expanded depth maps respectively, wherein the low threshold value of a Canny algorithm is set to be 50, and the high threshold value of the Canny algorithm is set to be 150, so as to obtain two binary foreground edge mark maps, as shown in fig. 3.
The step (4) specifically comprises the following steps:
distributing a thread for the mapping process of each pixel to perform parallel calculation according to the reference viewpoint camera parameter matrix input in the step (1) and the virtual viewpoint camera parameter matrix obtained in the step (2), and respectively mapping the two reference viewpoint depth maps input in the step (1) to an imaging plane corresponding to a virtual viewpoint in a forward three-dimensional manner by adopting a DIBR (depth image based rendering) method to obtain a virtual viewpoint depth map mapped by a left viewpoint and a virtual viewpoint depth map mapped by a right viewpoint; in the mapping process of each reference viewpoint, if two pixel values in the same reference viewpoint depth map are mapped to the same pixel position in the corresponding virtual viewpoint depth map, the sizes of the two pixel values are compared, and the smaller pixel value is reserved. In addition, the foreground edge label maps of the left viewpoint and the right viewpoint obtained in the step (3) are respectively mapped to the imaging plane corresponding to the virtual viewpoint forwards, so as to obtain the foreground edge label map after the mapping of the left viewpoint and the foreground edge label map after the mapping of the right viewpoint, as shown in fig. 4.
The step (5) specifically comprises the following steps:
and (3) allocating a thread to perform parallel computation in the median filtering process of each pixel, and performing median filtering on the two virtual viewpoint depth maps mapped forward from the left viewpoint and the right viewpoint in the step (4) by using a 3 × 3 rectangular template to obtain two filtered virtual viewpoint depth maps, as shown in fig. 5.
The step (6) specifically comprises the following steps:
and (3) allocating a thread for the mapping process of each pixel to perform parallel calculation, and performing backward three-dimensional mapping on the left viewpoint color image input in the step (1) to an imaging plane of a virtual viewpoint by using the camera parameter of the reference viewpoint input in the step (1), the camera parameter matrix of the virtual viewpoint calculated in the step (2) and the virtual viewpoint depth map obtained in the step (5) after median filtering of the left viewpoint, so as to obtain a virtual viewpoint color image mapped by the left viewpoint. In the mapping process, the mapped pixel positions are marked, the left unmarked pixels become holes, and a hole mask map of the left viewpoint mapping is obtained, wherein the map is a binary map, the pixel value of the binary map is 0, which represents that the current position pixel is marked as a hole pixel, and the value of the binary map is 255, which represents that the current position pixel is not marked as a hole pixel. And (4) carrying out the same treatment on the right viewpoint to obtain a virtual viewpoint color image mapped by the right viewpoint and a cavity mask image mapped by the right viewpoint. The color map of the virtual viewpoint of the right viewpoint map is shown in fig. 6.
The step (7) specifically comprises:
and (3) allocating a thread to the expansion processing process of each pixel, and performing selective morphological expansion processing on the left viewpoint mapping cavity mask map obtained in the step (6) by using a 3 x 3 rectangular template by using the foreground edge mark map obtained after the left viewpoint mapping obtained in the step (4), specifically performing expansion processing only on the position where the edge mark exists to obtain the expanded left cavity mask map. And (4) carrying out the same selective morphological dilation treatment on the right viewpoint mapping cavity mask map according to the obtained foreground edge marking map of the right viewpoint mapping in the step (4) to obtain a dilated right cavity mask map, wherein the dilated right cavity mask map is shown in fig. 7.
The step (8) specifically comprises:
and (3) fusing the two virtual viewpoint color images obtained by mapping the left viewpoint and the right viewpoint in the step (6) by using the expanded left and right cavity mask images obtained in the step (7) to obtain a fused virtual viewpoint color image, which comprises the following specific steps:
and (8-1) distributing threads which are in parallel computing and equal to the total number of pixels of a single image, wherein each thread processes two pixels of the same coordinate of the two images. If two pixels at the same position in the two cavity mask images do not belong to the cavity pixel, the pixel values of the position in the two virtual viewpoint color images are converted from an RGB space format to an HSV space format, the ratio of the pixel value brightness components of the left image and the right image at the position is obtained, the values are accumulated and counted, and finally the average value k of all the brightness ratios is obtained through statistics.
(8-2) A thread is assigned to each pixel of the output image for parallel computation. First, the pixel value at position (u, v) in the virtual viewpoint color image mapped from the right viewpoint obtained in step (6) is converted from the RGB space format to the HSV space format, its brightness value is multiplied by k, and the pixel value is converted from HSV back to RGB, giving the brightness-corrected right-mapped virtual viewpoint color image pixel value I_R(u, v). Then the pixel value I_R(u, v) is fused with the co-located left-mapped virtual viewpoint color image pixel value I_L(u, v) according to equation (1), giving the fused color image pixel value I_F(u, v). This processing is applied at every pixel position, finally yielding the fused virtual viewpoint color image I_F, as shown in fig. 8. In equation (1), α is calculated as in equation (2); in this embodiment D_L and D_R are equal, so α equals 0.5. H_L and H_R denote the dilated left and right hole mask maps obtained in step (7), where a value of 0 indicates that the current position pixel is a hole pixel and a non-zero value indicates it is not. In addition, during image fusion the regions that are holes in both hole masks H_R and H_L are marked as holes, and after the class marking is finished the hole mask map of the fused virtual viewpoint color image is obtained.
$$I_F(u,v)=\begin{cases}\alpha\,I_L(u,v)+(1-\alpha)\,I_R(u,v), & H_L(u,v)\neq 0,\ H_R(u,v)\neq 0\\ I_L(u,v), & H_L(u,v)\neq 0,\ H_R(u,v)=0\\ I_R(u,v), & H_L(u,v)=0,\ H_R(u,v)\neq 0\\ 0, & H_L(u,v)=0,\ H_R(u,v)=0\end{cases}\tag{1}$$

$$\alpha=\frac{D_R}{D_L+D_R}\tag{2}$$
The step (9) specifically comprises:
Using the two filtered virtual viewpoint depth maps obtained in step (5) and the fused hole mask map obtained in step (8), the holes in the virtual viewpoint color image fused in step (8) are interpolated and the final virtual viewpoint color image is output. During the computation, a GPU thread is allocated to each pixel to be processed for parallel computation. The specific processing is as follows:
(9-1) For each pixel belonging to a hole in the fused hole mask map obtained in step (8), the mask map is first searched in the four directions up, down, left and right for the first non-hole pixel, giving 4 non-hole pixels, whose depth values are then obtained from the two filtered virtual viewpoint depth maps as follows: the depth value at the corresponding pixel position in the left depth map is taken preferentially as the depth value of the pixel; if the value at that point in the left depth map is 0, i.e. no depth value exists, the depth value is taken at the corresponding pixel position in the right depth map.
(9-2) The depth values of the 4 found non-hole pixels are sorted from large to small and the difference between every two adjacent depth values is computed, eliminating weakly correlated non-hole pixels according to the differences, as follows: if the difference between the larger and the smaller of two adjacent depth values is greater than the threshold Thd (set to 10 here), the smaller value and the non-hole pixels corresponding to the depth values ranked after it are discarded, and the non-hole pixels with the larger depth values are kept.
(9-3) Finally, using the retained non-hole pixels, the weighted average of equation (3) gives the interpolation result I_c for the current hole pixel in the virtual viewpoint color image, used as the pixel value of the current hole pixel in the fused virtual viewpoint color image obtained in step (8):
$$I_c=\frac{\sum_{i=1}^{n} I_i/d_i}{\sum_{i=1}^{n} 1/d_i}\tag{3}$$
In the formula, n is the number of retained non-hole pixels, i is the non-hole pixel index, I_i denotes the pixel value of the non-hole pixel in the virtual viewpoint color image, and d_i is the pixel distance between the non-hole pixel and the current hole pixel; the weight of each non-hole pixel is inversely proportional to its pixel distance. The finally rendered and output virtual viewpoint color image is shown in fig. 9.

Claims (5)

1. A high image quality virtual viewpoint drawing method of a free viewpoint video, characterized in that the method comprises the following steps:
step 1: inputting the color images, depth images and camera parameters corresponding to the two reference viewpoints; the camera parameters of each reference viewpoint comprise a calibrated intrinsic matrix and extrinsic matrix; the virtual viewpoint lies on the line connecting the two reference viewpoints, and its position parameters relative to the two reference viewpoints comprise the Euclidean distance D_L from the left reference viewpoint and the Euclidean distance D_R from the right reference viewpoint;
step 2: based on the camera parameters of the two reference viewpoints and the position parameters of the virtual viewpoint relative to the two reference viewpoints input in step 1, calculating the camera parameter matrix of the virtual viewpoint by a linear interpolation method;
step 3: performing morphological dilation processing with a rectangular template on the depth maps corresponding to the two reference viewpoints input in step 1 to obtain two dilated depth maps; respectively performing Canny edge detection on the two dilated depth maps to obtain the foreground edge marker maps of the two depth maps, each foreground edge marker map being a binary map;
step 4: comprising the following steps:
step 4.1: using the camera parameters of the reference viewpoints input in step 1 and the camera parameter matrix of the virtual viewpoint calculated in step 2, respectively forward-mapping the depth maps corresponding to the two reference viewpoints input in step 1 three-dimensionally onto the imaging plane of the virtual viewpoint with the DIBR method, to obtain a virtual viewpoint depth map mapped from the left viewpoint and one mapped from the right viewpoint; during the mapping of each reference viewpoint, if two pixel values in the same reference viewpoint depth map are mapped to the same pixel position in the corresponding virtual viewpoint depth map, comparing the two pixel values and keeping the smaller one, until the mapping is completed;
step 4.2: respectively forward-mapping the foreground edge marker maps of the two depth maps obtained in step 3 three-dimensionally onto the imaging plane of the virtual viewpoint with the DIBR method, to obtain the foreground edge marker map after left-viewpoint mapping and the one after right-viewpoint mapping;
step 5: performing median filtering on each of the two virtual viewpoint depth maps obtained in step 4 to obtain two filtered virtual viewpoint depth maps;
step 6: respectively backward-mapping the color images of the two reference viewpoints input in step 1 using the two filtered virtual viewpoint depth maps obtained in step 5, to obtain two mapped virtual viewpoint color images and the corresponding hole mask maps;
step 7: performing morphological dilation processing, with a rectangular template and based on a preset condition, on the two mapped hole mask maps obtained in step 6, using the two mapped foreground edge marker maps obtained in step 4, to obtain two dilated hole mask maps; the preset condition is that only positions carrying an edge mark undergo dilation;
step 8: correspondingly fusing the two mapped virtual viewpoint color images obtained in step 6 using the two dilated hole mask maps obtained in step 7, to obtain a fused virtual viewpoint color image and the hole mask map of the fused virtual viewpoint color image;
step 9: interpolating the holes in the fused virtual viewpoint color image obtained in step 8 and outputting the final virtual viewpoint color image.
2. The method for rendering a high image quality virtual viewpoint of a free viewpoint video according to claim 1, characterized in that step 6 comprises the following steps:
step 6.1: using the camera parameters of the reference viewpoints input in step 1 and the camera parameter matrix of the virtual viewpoint calculated in step 2, and using the two filtered virtual viewpoint depth maps obtained in step 5, respectively backward-mapping the color images of the two reference viewpoints input in step 1 three-dimensionally onto the imaging plane of the virtual viewpoint with the DIBR method;
step 6.2: marking the mapped pixel positions, the remaining unmarked pixels becoming holes;
step 6.3: obtaining two mapped virtual viewpoint color images and the two corresponding mapped hole mask maps; each hole mask map is a binary map in which a pixel value of 0 marks the pixel at the current position as a hole pixel.
3. The method for rendering a high image quality virtual viewpoint of a free viewpoint video according to claim 1, characterized in that step 8 comprises the following steps:
step 8.1: if, in the two dilated hole mask maps obtained in step 7, the two pixels at the same position are both non-hole pixels, converting the pixel values at that position in the two mapped virtual viewpoint color images obtained in step 6 from the RGB space format to the HSV space format, and computing the ratio of the brightness components of the left-mapped and right-mapped virtual viewpoint color images at that position; repeating step 8.1 until all positions are traversed, accumulating all the ratios, and obtaining the average value k of all brightness ratios;
step 8.2: converting the pixel value at position (u, v) in the right-mapped virtual viewpoint color image obtained in step 6 from the RGB space format to the HSV space format, multiplying the brightness value at that position by k, and converting the pixel value from the HSV space format back to the RGB space format, to obtain the brightness-corrected right-mapped virtual viewpoint color image pixel value I_R(u, v);
step 8.3: fusing the pixel value I_R(u, v) with the pixel value I_L(u, v) of the left-mapped virtual viewpoint color map at the same position according to equation (1), to obtain the fused color image pixel value I_F(u, v);

$$I_F(u,v)=\begin{cases}\alpha\,I_L(u,v)+(1-\alpha)\,I_R(u,v), & H_L(u,v)\neq 0,\ H_R(u,v)\neq 0\\ I_L(u,v), & H_L(u,v)\neq 0,\ H_R(u,v)=0\\ I_R(u,v), & H_L(u,v)=0,\ H_R(u,v)\neq 0\\ 0, & H_L(u,v)=0,\ H_R(u,v)=0\end{cases}\tag{1}$$

wherein

$$\alpha=\frac{D_R}{D_L+D_R}$$

D_L and D_R are the Euclidean distances between the virtual viewpoint input in step 1 and the left and right reference viewpoints respectively; H_L and H_R respectively denote the dilated left and right hole mask maps obtained in step 7, in which the value is 0 if the current position is a hole pixel and non-zero otherwise;
step 8.4: repeating step 8.2 until the position of every pixel in the left- and right-mapped virtual viewpoint color images has been processed, obtaining the fused virtual viewpoint color image I_F; during image fusion, the regions that are holes in both hole masks H_R and H_L are marked as holes, and after the class marking is finished the hole mask map of the fused virtual viewpoint color image is obtained.
4. The method for rendering a high image quality virtual viewpoint of a free viewpoint video according to claim 1, characterized in that in step 9 the interpolation comprises the following steps:
step 9.1: for each pixel belonging to a hole pixel in the hole mask map of the fused virtual viewpoint color image obtained in step 8, searching the hole mask map in the four directions up, down, left and right for the first non-hole pixel, obtaining 4 non-hole pixels, and obtaining the depth values of the 4 non-hole pixels from the two filtered virtual viewpoint depth maps obtained in step 5; during acquisition, the depth value at the corresponding pixel position in the left depth map is taken preferentially as the depth value of the pixel, and if no depth value for the pixel exists in the left depth map, the depth value is taken at the corresponding pixel position in the right depth map;
step 9.2: sorting the found depth values of the 4 non-hole pixels from large to small, computing the difference between every two adjacent depth values, and eliminating weakly correlated non-hole pixels according to the differences;
step 9.3: using the retained non-hole pixels, computing the weighted average of equation (2) to obtain the interpolation result I_c for the current hole pixel in the virtual viewpoint color image, as the pixel value of the current hole pixel in the fused virtual viewpoint color image obtained in step 8,

$$I_c=\frac{\sum_{i=1}^{n} I_i/d_i}{\sum_{i=1}^{n} 1/d_i}\tag{2}$$

in the formula, n is the number of retained non-hole pixels, i is the non-hole pixel index, I_i denotes the pixel value of the non-hole pixel in the fused virtual viewpoint color image obtained in step 8, and d_i is the pixel distance between the non-hole pixel and the current hole pixel.
5. The method of claim 4, characterized in that in step 9.2, if the difference between the larger and the smaller of two adjacent depth values is greater than the threshold Thd, the non-hole pixels corresponding to the smaller value and the depth values ranked after it are discarded and the non-hole pixels with the larger depth values are retained; the value range of the threshold Thd is [5, 20].
CN202010232831.7A 2020-03-28 2020-03-28 High-image-quality virtual viewpoint drawing method of free viewpoint video Active CN111385554B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010232831.7A CN111385554B (en) 2020-03-28 2020-03-28 High-image-quality virtual viewpoint drawing method of free viewpoint video


Publications (2)

Publication Number Publication Date
CN111385554A CN111385554A (en) 2020-07-07
CN111385554B (en) 2022-07-08

Family

ID=71222740

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010232831.7A Active CN111385554B (en) 2020-03-28 2020-03-28 High-image-quality virtual viewpoint drawing method of free viewpoint video

Country Status (1)

Country Link
CN (1) CN111385554B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112738495B (en) * 2019-10-28 2023-03-28 阿里巴巴集团控股有限公司 Virtual viewpoint image generation method, system, electronic device and storage medium
CN112581389A (en) * 2020-12-04 2021-03-30 北京大学深圳研究生院 Virtual viewpoint depth map processing method, equipment, device and storage medium
CN113179396B (en) * 2021-03-19 2022-11-11 杭州电子科技大学 Double-viewpoint stereo video fusion method based on K-means model
CN113450274B (en) * 2021-06-23 2022-08-05 山东大学 Self-adaptive viewpoint fusion method and system based on deep learning
CN115512038B (en) * 2022-07-22 2023-07-18 北京微视威信息科技有限公司 Real-time drawing method for free viewpoint synthesis, electronic device and readable storage medium
CN116437205B (en) * 2023-06-02 2023-08-11 华中科技大学 Depth of field expansion method and system for multi-view multi-focal length imaging

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104780355B (en) * 2015-03-31 2017-01-04 浙江大学 Empty restorative procedure based on the degree of depth in a kind of View Synthesis
CN107509067B (en) * 2016-12-28 2019-07-30 浙江工业大学 A kind of free view-point image composition method of high-speed high-quality amount
CN107018401B (en) * 2017-05-03 2019-01-22 曲阜师范大学 Virtual view hole-filling method based on inverse mapping

Also Published As

Publication number Publication date
CN111385554A (en) 2020-07-07


Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant