WO2017201751A1 - Hole filling method, device and terminal for virtual viewpoint videos and images - Google Patents

Hole filling method, device and terminal for virtual viewpoint videos and images

Info

Publication number
WO2017201751A1
WO2017201751A1 (PCT/CN2016/083746, CN2016083746W)
Authority
WO
WIPO (PCT)
Prior art keywords
background
video
foreground
depth map
camera
Prior art date
Application number
PCT/CN2016/083746
Other languages
English (en)
French (fr)
Inventor
朱跃生
罗桂波
张立明
Original Assignee
北京大学深圳研究生院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学深圳研究生院
Priority to PCT/CN2016/083746
Publication of WO2017201751A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00: 3D [Three Dimensional] image rendering
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00: Stereoscopic video systems; Multi-view video systems; Details thereof

Definitions

  • The present invention relates to the field of 3D video, and in particular to a hole filling method, device and terminal for virtual viewpoint videos and images.
  • A virtual viewpoint video is a video at a virtual viewpoint, synthesized by 3D-transforming a camera viewpoint video captured by a camera in a 3D scene. Because the foreground occludes part of the scene in the camera viewpoint video, the background pixel information of the occluded portion is missing from each frame of the video. When the viewpoint is transformed to generate the virtual viewpoint video, this missing background information should become visible, but it cannot be obtained from the camera viewpoint video, so background holes appear in the virtual viewpoint video.
  • Traditional hole repair methods for virtual viewpoint video generally obtain the filling information from temporal or spatial correlation in the video. In the time domain, for example, a region occluded by the foreground in the current frame may become visible in other frames because the foreground moves, so the background of the occluded region can be recovered by background modeling. However, since each frame of the virtual viewpoint video contains both a foreground part and a background part, traditional repair methods may fail to distinguish the two and use foreground pixels to fill background holes, so the repaired image is distorted and the video quality is poor.
  • a method for filling the holes of a virtual viewpoint video, comprising the steps set out below;
  • a method for filling the holes of a virtual viewpoint image, comprising:
  • filling the holes of the virtual viewpoint image synthesized from the camera viewpoint image, using the background image and the background depth map.
  • a hole filling device for virtual viewpoint video, comprising:
  • a shooting module configured to acquire a camera viewpoint video and the camera viewpoint depth map corresponding to the camera viewpoint video;
  • a foreground removal module configured to remove the set of pixel points corresponding to the foreground in the camera viewpoint video and the camera viewpoint depth map, forming an intermediate background video and an intermediate background depth map with foreground holes;
  • a background filling module configured to fill the foreground holes with background pixel sets, generating a background video and a background depth map corresponding to the background video;
  • a hole filling module configured to fill the holes of the virtual viewpoint video synthesized from the camera viewpoint video, using the background video and the background depth map.
  • a hole filling device for virtual viewpoint images, comprising:
  • a photographing module configured to acquire a camera viewpoint image and the camera viewpoint depth map corresponding to the camera viewpoint image;
  • a foreground removal module configured to remove the set of pixel points corresponding to the foreground in the camera viewpoint image and the camera viewpoint depth map, forming an intermediate background image and an intermediate background depth map with foreground holes;
  • a background filling module configured to fill the foreground holes with background pixel sets, generating a background image and a background depth map corresponding to the background image;
  • a hole filling module configured to fill the holes of the virtual viewpoint image synthesized from the camera viewpoint image, using the background image and the background depth map.
  • a terminal comprising a memory and a processor, wherein the memory stores instructions that, when executed by the processor, cause the processor to perform the steps of the method described above.
  • The above hole filling method and device for virtual viewpoint video construct a clean background video and background depth map free of foreground artifacts, and fill the holes in the virtual viewpoint video from them. Since the background video carries no foreground texture, background holes are never filled with foreground blocks; the distortion introduced by repair is avoided and the video quality of the virtual viewpoint video is improved.
  • FIG. 1 is a flowchart of a hole filling method for virtual viewpoint video in an embodiment;
  • FIG. 2 is a flowchart of generating an intermediate background video and an intermediate background depth map in an embodiment;
  • FIG. 3 is a flowchart of extracting the set of foreground pixel points in a depth map in an embodiment;
  • FIG. 4 is a flowchart of extracting the set of foreground pixel points in a depth map in another embodiment;
  • FIG. 5 is a flowchart of extracting a foreground boundary and a background boundary in an embodiment;
  • FIG. 6 is a flowchart of generating a background video and a background depth map in an embodiment;
  • FIG. 7 is a flowchart of filling virtual-video holes with a background video in an embodiment;
  • FIG. 8 is a flowchart of extending the background video and the background depth map in an embodiment;
  • FIG. 9 is one frame of a camera viewpoint video in an embodiment;
  • FIG. 10 is the depth map corresponding to the image in FIG. 9;
  • FIG. 11 is the image of FIG. 9 with the foreground pixel set removed;
  • FIG. 12 is the image of FIG. 10 with the foreground pixel set removed;
  • FIG. 13 is a background video image after hole filling with background pixel sets;
  • FIG. 14 is a background depth image after hole filling with background pixel sets;
  • FIG. 15 is one frame of an unfilled virtual viewpoint video;
  • FIG. 16 is a depth map after preprocessing;
  • FIG. 17 illustrates constructing the minimum bounding rectangle of a foreground boundary;
  • FIG. 18 is a diagram of the foreground boundary iteration;
  • FIG. 19 is an intermediate diagram of obtaining the initialization seed points;
  • FIGS. 20 and 21 are a foreground probability map and a background probability map, respectively;
  • FIGS. 22 and 23 are the extracted foreground image and the extracted background image, respectively;
  • FIG. 24 shows a depth map before and after patching by depth-value prediction;
  • FIG. 25 is a schematic diagram of patching the background video;
  • FIG. 26 shows the extended regions of the background video and the background depth map;
  • FIG. 27 shows the extended background video and background depth map;
  • FIG. 28 is a structural block diagram of a hole filling device for virtual viewpoint video in an embodiment;
  • FIG. 29 is a structural block diagram of a foreground removal module in an embodiment;
  • FIG. 30 is a structural block diagram of one module for extracting the depth-map foreground in an embodiment;
  • FIG. 31 is a structural block diagram of another module for extracting the depth-map foreground in an embodiment;
  • FIG. 32 is a structural block diagram of a background filling module in an embodiment;
  • FIG. 33 is a structural block diagram of a hole filling module in an embodiment;
  • FIG. 34 is a flowchart of a hole filling method for virtual viewpoint images in an embodiment;
  • FIG. 35 is a flowchart of extending a background image and a background depth map in an embodiment;
  • FIG. 36 is a structural block diagram of a terminal in an embodiment.
  • FIG. 1 is a schematic flowchart of a hole filling method for virtual viewpoint video in an embodiment. As shown in FIG. 1, the method includes the following steps:
  • Step S100: Acquire a camera viewpoint video and the camera viewpoint depth map corresponding to the camera viewpoint video.
  • The camera viewpoint video is a video recording of an event by a single camera. The camera viewpoint depth map corresponding to the camera viewpoint video is essentially the depth map of each frame in the camera viewpoint video; the value of a pixel in the depth map represents the distance between the camera and the physical point corresponding to that pixel in the scene, i.e. the depth value, in the range 0-255, where the farthest depth value is 0 and the nearest depth value is 255.
  • When a single camera records, the recording may be static, from a single viewpoint, or dynamic, with the viewpoint changed by moving or rotating the camera.
  • In one embodiment, the camera viewpoint video may be acquired by the camera's image sensor, and the camera viewpoint depth map may be acquired by the camera's depth-sensing system.
  • Step S200: Remove the set of pixel points corresponding to the foreground in the camera viewpoint video and the camera viewpoint depth map, forming an intermediate background video and an intermediate background depth map with foreground holes.
  • The camera viewpoint video and the camera viewpoint depth map contain a foreground portion and a background portion. The foreground is closer to the camera than the background. In one embodiment, the foreground is a moving foreground, and it may be one object or several. For example, FIG. 9 is one frame of a camera viewpoint video in which the woman and the man are the foreground, both foreground objects are moving, and the dance studio is the background; FIG. 10 is the depth map of the image in FIG. 9.
  • The set of pixel points corresponding to the foreground can be extracted from the difference in distance to the camera between the foreground and the background. Removing the extracted pixel set forms an intermediate background video and an intermediate background depth map with foreground holes: FIG. 11 is one frame of the intermediate background video, and FIG. 12 is the depth map in the intermediate background depth map corresponding to the image of FIG. 11.
  • Step S300: Fill the foreground holes with background pixel sets, generating a background video and a background depth map corresponding to the background video.
  • After the foreground portion is removed, filling means such as background modeling and image inpainting are applied to the intermediate background video and the intermediate background depth map to fill the foreground holes, generating a complete background video and background depth map without holes: FIG. 13 is one frame of the complete background video, and FIG. 14 is the depth map in the complete background depth map corresponding to the image of FIG. 13.
  • In this embodiment, the foreground hole regions are filled only after the foreground has been removed, so no foreground texture is carried into the reconstructed background; the restored background maintains good quality, and background-video distortion is well avoided.
  • Step S400: Fill the holes of the virtual viewpoint video synthesized from the camera viewpoint video, using the background video and the background depth map.
  • Rendered with the camera viewpoint depth map, the camera viewpoint video generates, after 3D mapping, a virtual video at the virtual viewpoint. Background that was occluded is exposed at the virtual viewpoint because of the viewpoint transformation, but the camera viewpoint video lacks the feature points of that occluded background, so background holes appear in the virtual viewpoint video. FIG. 15 shows one frame of the camera viewpoint video after 3D mapping to the virtual viewpoint; the white areas in the figure are the background holes.
  • In one embodiment, the virtual viewpoint video hole filling method may also be: synthesize the virtual viewpoint video from the camera viewpoint video and the camera viewpoint depth map; remove the set of pixel points corresponding to the foreground in the virtual viewpoint video, forming an intermediate background video and an intermediate background depth map with foreground holes; fill the foreground holes with background pixel sets, generating a background video and its background depth map; and fill the holes of the virtual viewpoint video using the background video and the background depth map.
  • This variant first converts the camera viewpoint video into a virtual viewpoint video, removes the foreground and fills the background on the basis of the virtual viewpoint video, forming a clean background video at the virtual viewpoint, and uses that background video to fill the holes of the virtual viewpoint video.
  • In this embodiment, the background holes in the virtual viewpoint video are filled with a clean background video free of foreground artifacts. The background video may be formed from the original camera viewpoint video or from the virtual viewpoint video; as long as a clean background video is used to fill the holes, the video distortion of traditional techniques that directly inpaint the virtual viewpoint video is overcome, misusing foreground feature points to fill background holes because foreground and background cannot be distinguished accurately is effectively avoided, and the quality of the virtual video is better.
  • In one embodiment, step S200 includes:
  • Step S210: Extract the set of pixel points corresponding to the foreground in the camera viewpoint depth map.
  • Step S220: Determine the set of pixel points corresponding to the foreground in the camera viewpoint video. By mapping the positions of the foreground pixel set in the camera viewpoint depth map, the foreground pixel set in the camera viewpoint video can be determined.
  • Step S230: Remove the extracted pixel set from the camera viewpoint depth map and the corresponding pixel set from the camera viewpoint video.
  • Removing the pixel sets obtained in steps S210 and S220 yields a background video with foreground holes and a depth map with foreground holes, defined respectively as the intermediate background video and the intermediate background depth map.
  • In a depth map, the depth values of foreground and background differ considerably. By first extracting the foreground in the depth map and then accurately removing the foreground pixel sets from the camera viewpoint video and the camera viewpoint depth map according to it, foreground extraction and removal become more accurate and efficient.
  • In one embodiment, step S210 includes:
  • Step S212: Extract the foreground boundary in the camera viewpoint depth map.
  • In one embodiment, step S211 precedes step S212: filter and erode the camera viewpoint depth map.
  • The depth values of a single object in the depth map should be continuous, but noise and other factors may make them discontinuous, producing unreal edges. Such unreal edges interfere with the subsequent foreground extraction and reduce its accuracy. Filtering the camera viewpoint depth map reduces or even eliminates unreal edges.
  • FIG. 16 is the filtered camera viewpoint depth map. Compared with the unfiltered depth map of FIG. 10, each object in FIG. 16 is smoothed while the boundaries are preserved.
  • In one embodiment, a morphological erosion is applied to the camera viewpoint depth map, shrinking the foreground to ensure that the boundary extracted subsequently lies inside the foreground.
  • Extracting the foreground boundary after filtering and erosion not only avoids interference from unreal edges but also, thanks to the erosion, ensures that the extracted foreground boundary lies inside the foreground, so the extracted boundary is more accurate.
  • In one embodiment, the Canny edge detector is used to extract the foreground boundary in the camera viewpoint depth map.
  • Step S213: Starting from the foreground boundary, iteratively generate secondary boundaries, where the depth difference between corresponding points of a secondary boundary and its parent boundary is smaller than a preset range.
  • The pixel sets contained in the foreground boundary and the secondary boundaries constitute the set of pixel points corresponding to the foreground in the camera viewpoint depth map.
  • Within one foreground target, adjacent pixels have similar depth values. Based on the extracted foreground boundary, points whose depth differs from points on the boundary by less than a preset range, and whose distance to the boundary is less than a set value, are queried; the queried points form the secondary boundary of the foreground boundary. That secondary boundary then serves as the parent boundary for the next secondary boundary, and so on, continually expanding the boundary until the entire foreground is obtained.
  • In one embodiment, before generating a secondary boundary from its parent boundary, the method further includes: constructing the minimum bounding rectangle of the parent boundary and computing, over all points inside that rectangle, the threshold given by the maximum between-class variance (Otsu) method.
  • The query then proceeds over the points inside the minimum bounding rectangle: the queried depth values are compared against the Otsu threshold, and a point whose depth value is greater than the threshold and whose distance to the corresponding point on the parent boundary is less than the preset value is a point of the secondary boundary.
  • Constructing the minimum bounding rectangle allows the threshold to be set more accurately, so the query result is more accurate; it also narrows the search range and improves search efficiency.
  • The foreground boundary and the secondary boundaries can be represented as point sets that grow into foreground targets F_i.
  • T denotes the set of unallocated points adjacent to at least one foreground target; it is characterized by T = { x ∉ ∪_i F_i : N(x) ∩ ∪_i F_i ≠ ∅ },
  • where N(x) denotes the set of points directly adjacent to pixel x,
  • and PMBR_i denotes the minimum bounding rectangle (MBR) of F_i.
  • A pixel x whose neighborhood overlaps a foreground target F_j is newly added to F_j when δ(x) < β and Z(x) > Otsu(PMBR_j), where δ(x) is the depth difference between x and its overlapping region, β is a small value, Z(x) is the depth value of x, and the Otsu function is the threshold of the maximum between-class variance method; a depth value greater than this threshold is one condition for judging the pixel to be a foreground target.
  • The new foreground targets are used as input to the next iteration, and the iterative process ends when T = ∅.
  • The iterative process is shown in FIG. 18.
  • In another embodiment, step S210 includes:
  • Step S214: Extract the foreground boundary and the background boundary in the camera viewpoint depth map.
  • As shown in FIG. 19, the inner black edge is the foreground boundary and the outer white edge is the background boundary.
  • Step S215: Using the foreground boundary and the background boundary as seed points, compute the foreground/background probability distribution in the camera viewpoint depth map, and thereby determine the set of pixel points corresponding to the foreground.
  • In one embodiment, a random walk segmentation algorithm performs random walk segmentation on the foreground-boundary and background-boundary seed points, computing the foreground/background probability distribution in the camera viewpoint depth map to determine the foreground pixel set.
  • The algorithm for processing the seed points is not limited to random walk segmentation; any algorithm that can compute a foreground/background probability distribution from seed points can be used.
  • The foreground or background probability distribution of each point can be obtained by solving L_U · x^s = -B^T · m^s,
  • where L_U is the weight (Laplacian) block corresponding to the non-seed nodes,
  • B^T is the transposed off-diagonal block, x_i^s is the probability that node v_i first walks to label s, and for each label s the |V_M|×1 vector m^s takes the value 1 at a seed node carrying label s and 0 otherwise.
  • FIGS. 20 and 21 show the probabilities of the walker reaching the foreground label and the background label, respectively.
  • The higher the value in the grayscale map, the higher the probability; the label with the highest probability value is taken as the label of each non-seed node. The segmentation results for foreground and background are shown in FIGS. 22 and 23.
  • In one embodiment, step S214 includes:
  • Step S2141: Filter the camera viewpoint depth map.
  • The processing of this step is the same as the filtering in step S211.
  • Step S2142: Erode the camera viewpoint depth map and extract the foreground boundary.
  • The processing of this step is the same as the erosion in step S211 and the foreground boundary extraction in step S212.
  • Step S2143: Dilate the camera viewpoint depth map and extract the background boundary.
  • The morphological dilation guarantees that the extracted background boundary falls inside the background region, ensuring the accuracy of the background boundary extraction.
  • In one embodiment, step S300 performs background modeling on the intermediate background video and the intermediate background depth map of step S200, filling the foreground hole regions by letting the background pixel sets of the images in the intermediate background video and the intermediate background depth map complement one another.
  • A video is a time function over many frames, and the intermediate background video includes frames at different moments. In one embodiment the foreground is a moving foreground; as it moves, a background region occluded by the foreground at one moment may appear in the image of another moment.
  • Background modeling exploits this property, filling the foreground hole regions through the mutual complementing of background pixel sets between images and generating a clean background video and background depth map free of foreground artifacts.
  • The background modeling of this embodiment operates on the intermediate background video and intermediate background depth map from which the foreground has already been removed, so foreground image blocks are never misused to fill holes and the generated background video is not distorted.
  • In one embodiment, the camera viewpoint video is a video with a dynamic camera viewpoint.
  • Before the background modeling step, the method further includes:
  • Step S310: Acquire the mapping relationships between video segments at different viewpoints in the camera viewpoint video.
  • A dynamic camera viewpoint video is recorded while the camera is not stationary, so the camera viewpoint is dynamic. When the two moments to be mapped correspond to different camera viewpoints, background mapping cannot be performed directly.
  • To handle a moving camera, the background modeling of this embodiment is an improved background modeling with motion compensation.
  • Specifically, SURF detection and the RANSAC algorithm obtain the mapping relationships between video segments at different viewpoints in the camera viewpoint video.
  • SURF detects and describes the feature points of the current frame and the reference frame.
  • To improve robustness, the RANSAC algorithm optimizes the matching of feature point pairs. Once the feature point pairs are matched, the homography matrix is obtained, and the model parameters of one moment are mapped to another moment by projective transformation.
  • Step S320: When two mutually complementing images are images at different viewpoints, map the model parameters of the two images to the same viewpoint according to the mapping relationship; the background pixel sets of the two mapped images then complement each other to fill the foreground hole regions.
  • Specifically, the first image and the second image are images that can complement each other. When they correspond to video segments at different viewpoints, the model parameters of the first image are mapped to the viewpoint of the second image according to the mapping relationship, and part or all of the foreground hole region of the second image is complemented by the background pixel set of the mapped first image.
  • In one embodiment, the background model is a Gaussian mixture model, applied between two adjacent moments; the mixture of Gaussians is
  • p(I_x,t) = Σ_{i=1..K} w_x,i,t · η(I_x,t; μ_x,i,t, σ_x,i,t)
  • where p(I_x,t) is the probability density of the pixel at coordinate x at time t, η is the Gaussian function, I_x,t is the pixel value at coordinate x at time t, μ_x,i,t and σ²_x,i,t are the mean and variance of that pixel's i-th Gaussian, and w_x,i,t is the weight of the i-th Gaussian, with Σ_{i=1..K} w_x,i,t = 1.
  • B(x_t) is the background mask of the pixel at coordinate x at time t: B(x_t) = 0 when the model is empty, and B(x_t) = 1 when the model is not empty.
  • For each new frame, the background model parameters of time t-1 are all mapped to time t by projective transformation: using the homography H_{t:t-1}, the coordinate x_t at time t is mapped to the coordinate x′_{t-1} at time t-1, and the background model parameters of the pixel at x_t are updated from those of the pixel at coordinate x′_{t-1} at time t-1.
  • If the current pixel is not a foreground pixel, it is matched against the K Gaussian models; for model i, if the pixel value falls within the matching range of that Gaussian, the matching process stops. The matched Gaussian model is updated as
  • μ_x,i,t = (1-ρ)·μ_x,i,t-1 + ρ·I_x,t
  • σ²_x,i,t = (1-ρ)·σ²_x,i,t-1 + ρ·(I_x,t - μ_x,i,t)²
  • The remaining video frames are processed with the same method. Finally, the K Gaussian models are sorted in descending order of their ω/σ values, and the value of the background pixel bp(x_t) at time t is obtained as bp(x_t) = μ_x,1,t, if B(x_t) = 1.
  • In one embodiment, after the foreground hole regions are filled by the dynamic background modeling of steps S310 and S320, the following optimization steps fill the hole portions that remain unfilled after the processing of steps S310 and S320.
  • Step S330: Predict the depth values at the holes from the depth values around the holes of the intermediate background depth map, and patch the holes in the intermediate background depth map according to the predicted depth values.
  • Since there is no foreground interference in the depth map, the hole regions can be assumed to lie in the same plane as the surrounding background, and the depth values at the holes are predicted from the surrounding background depth.
  • The prediction is refined as follows: an energy function is established, and the label assignment f that minimizes it is obtained, as in
  • E(f) = Σ_p D_p(f_p) + Σ_{(p,q)∈N} V(f_p, f_q)
  • where N is the set of mutually adjacent point pairs;
  • V(f_p, f_q) is the cost between the two labels f_p and f_q of adjacent pixels, representing the discontinuity cost;
  • D_p(f_p) is the cost between the assigned label f_p and pixel p, representing the data cost.
  • V(f_p, f_q) and D_p(f_p) are defined as
  • V(f_p, f_q) = min((f_p - f_q)², DISC_K) and D_p(f_p) = λ·min((Z_p - f_p)², DATA_K)
  • Step S340: Patch the filled intermediate background video with an image inpainting algorithm to which a depth-value restriction is added.
  • In one embodiment, the holes in the intermediate background video are repaired with the Criminisi algorithm (the inpainting algorithm based on texture features and structure information proposed by Criminisi et al.). A depth-value restriction is added: if the depth values show that a candidate image block is a foreground block, the candidate is discarded, effectively preventing the foreground from being used to fill the hole regions and flawing the repaired image.
  • The specific repair method is as follows:
  • FIG. 25 shows the principle of the Criminisi algorithm: for an input image I, Ω is the unknown region (the hole region), and the source region Φ is defined as Φ = I - Ω.
  • The boundary of the hole region Ω is denoted δΩ; at a boundary point p ∈ δΩ, the priority of the image block Ψ_p centered at p is computed as P(p) = C(p)·D(p), with confidence term C(p) and data term D(p).
  • The distance d(Ψ_a, Ψ_b) between image blocks Ψ_a and Ψ_b uses the Sum of Squared Differences (SSD) over the known pixels of the two blocks.
  • After a block is filled, C(p) is updated by propagating the confidence of the filled block's center to its newly filled pixels.
  • The search region is restricted by depth. In one embodiment, Y is the region whose depth value is less than ξ3 times the mean depth of the block to be filled, where ξ3 is a scaling factor less than 1; in one embodiment, ξ3 is 0.85, or 0.95, or any value between 0.85 and 0.95.
  • In one embodiment, Y is the region whose depth value is greater than ξ4 times that mean depth, where ξ4 is a scaling factor greater than 1; in one embodiment, ξ4 is 1.05, or 1.15, or any value between 1.05 and 1.15.
  • The mean depth of an image block is the average of its depth values over the known pixels.
  • In one embodiment, the filling of the foreground holes in the intermediate background video and the intermediate background depth map may be performed by steps S330 and S340 alone, without the background modeling of steps S310 and S320.
  • In one embodiment, step S400 includes:
  • Step S410: 3D-warp the background video with a first warping parameter under the rendering of the background depth map, generating a virtual background video.
  • In one embodiment, the first warping parameter is a rotation angle, an offset displacement, or a rotation angle combined with a set displacement.
  • Step S420: 3D-warp the camera viewpoint video with the first warping parameter under the rendering of the camera viewpoint depth map, generating the virtual viewpoint video.
  • The virtual viewpoint video is generated by 3D-warping the camera viewpoint video with the same warping parameter as step S410.
  • Step S430: Fill the holes in the virtual viewpoint video with the virtual background video.
  • After the identical 3D warp, each frame of the virtual background video corresponds one-to-one with a frame of the virtual viewpoint video, and synchronously mapping the image frames of the virtual background video into the corresponding frames of the virtual viewpoint video fills the holes of the virtual viewpoint video.
  • Filling the holes of the virtual viewpoint video with a clean, flawless background video never fills holes with foreground image blocks, so the hole filling effect is better and video distortion is avoided. Moreover, especially when the virtual viewpoint video is frame-paired multi-view video, filling the holes from the background video only requires mapping the background video frames one by one; the hole filling is efficient, solving the repeated-filling problem of traditional methods that directly repair the virtual viewpoint views.
  • In one embodiment, step S500 is further included before step S400; step S500 performs background edge extension on the background video and the background depth map of step S300.
  • Step S400 in this embodiment fills the holes of the virtual viewpoint video synthesized from the camera viewpoint video using the extended background video and background depth map, effectively filling the boundary holes in the virtual viewpoint video.
  • In one embodiment, step S500 includes:
  • Step S510: Reverse-map the virtual viewpoint video to the camera viewpoint to obtain the extension boundary.
  • Specifically, the virtual viewpoint video is first mapped to global coordinates and then projectively transformed to the camera viewpoint, i.e. to the viewpoint of the background video, yielding the extension boundary; as shown in FIG. 26, the upper edge and the left edge are the extended regions.
  • Step S520: Extend the background video and the background depth map according to the extension boundary.
  • In one embodiment, the background video is extended with the method described in step S340, and the background depth map with the method described in step S330; FIG. 27 shows the extended background video and background depth map.
  • In one embodiment, a hole filling device for virtual viewpoint video is provided, including:
  • a shooting module 610 for acquiring a camera viewpoint video and the camera viewpoint depth map corresponding to the camera viewpoint video;
  • a foreground removal module 620 for removing the set of pixel points corresponding to the foreground in the camera viewpoint video and the camera viewpoint depth map, forming an intermediate background video and an intermediate background depth map with foreground holes;
  • a background filling module 630 for filling the foreground holes with background pixel sets, generating a background video and a background depth map corresponding to the background video;
  • a hole filling module 640 for filling the holes of the virtual viewpoint video synthesized from the camera viewpoint video, using the background video and the background depth map.
  • In one embodiment, the foreground removal module 620 includes:
  • an extract-depth-map-foreground module 621 for extracting the set of pixel points corresponding to the foreground in the camera viewpoint depth map;
  • an extract-video-foreground module 622 for determining the set of pixel points corresponding to the foreground in the camera viewpoint video;
  • a removal module 623 for removing the extracted pixel set from the camera viewpoint depth map and the corresponding pixel set from the camera viewpoint video.
  • In one embodiment, the extract-depth-map-foreground module 621 includes:
  • an extract-foreground-boundary module 6211 for extracting the foreground boundary in the camera viewpoint depth map;
  • an iteration module 6212 for iteratively generating secondary boundaries from the foreground boundary, the depth difference between corresponding points of a secondary boundary and its parent boundary being smaller than a preset range;
  • wherein the pixel sets contained in the foreground boundary and the secondary boundaries constitute the set of pixel points corresponding to the foreground.
  • In another embodiment, the extract-depth-map-foreground module 621 includes:
  • a foreground and background boundary extraction module 6213 for extracting the foreground boundary and the background boundary in the camera viewpoint depth map;
  • a probability calculation module 6214 for computing the foreground/background probability distribution in the camera viewpoint depth map with the foreground boundary and the background boundary as seed points, thereby determining the foreground pixel set.
  • In one embodiment, the extract-foreground-boundary module is also used to filter the camera viewpoint depth map and to erode the camera viewpoint depth map.
  • In one embodiment, the foreground and background boundary extraction module is further used to filter the camera viewpoint depth map, to erode the camera viewpoint depth map and extract the foreground boundary, and to dilate the camera viewpoint depth map and extract the background boundary.
  • In one embodiment, the background filling module 630 includes:
  • a background modeling module for performing background modeling on the intermediate background video and the intermediate background depth map, filling the foreground hole regions through the mutual complementing of background pixel sets between the images of the intermediate background video and the intermediate background depth map.
  • In one embodiment, the camera viewpoint video is a video with a dynamic camera viewpoint, and the background filling module further includes:
  • a motion compensation module 631 for acquiring the mapping relationships between video segments at different viewpoints in the camera viewpoint video;
  • the background modeling module 632, further used, when two mutually complementing images are images at different viewpoints, to map the model parameters of the two images to the same viewpoint according to the mapping relationship, the background pixel sets of the two mapped images then complementing each other to fill the foreground hole regions.
  • In one embodiment, the background filling module further includes:
  • a background depth map repair module 633 for predicting the depth values at the holes from the pixel sets in the intermediate background depth map, and patching the filled intermediate background depth map according to the predicted depth values;
  • a background video repair module 634 for patching the filled intermediate background video with an image inpainting algorithm to which a depth-value restriction is added.
  • In one embodiment, the hole filling module 640 includes:
  • a background video warping module 641 for 3D-warping the background video with the first warping parameter under the rendering of the background depth map, generating a virtual background video;
  • a camera viewpoint video warping module 642 for 3D-warping the camera viewpoint video with the first warping parameter under the rendering of the camera viewpoint depth map, generating the virtual viewpoint video;
  • a filling module 643 for filling the holes in the virtual viewpoint video with the virtual background video.
  • In one embodiment, the hole filling module further includes:
  • a background extension module for reverse-mapping the virtual viewpoint video to the camera viewpoint to obtain the extended region, and extending the background video and the background depth map according to the extended region;
  • the filling module, which fills the holes in the virtual viewpoint video with the extended background video and the extended background depth map.
  • A hole filling method for virtual viewpoint images is also provided, including the following steps:
  • Step 710: Acquire a camera viewpoint image and the camera viewpoint depth map corresponding to the camera viewpoint image.
  • The camera viewpoint image is an image taken by a camera.
  • FIG. 9 is a camera viewpoint image in which the woman and the man are the foreground, and
  • FIG. 10 is the depth map of the image in FIG. 9.
  • Step 720: Remove the set of pixel points corresponding to the foreground in the camera viewpoint image and the camera viewpoint depth map, forming an intermediate background image and an intermediate background depth map with foreground holes.
  • The specific method is the same as the foreground removal in step S200 and its sub-steps, applied to an image instead of a video.
  • In one embodiment, the camera viewpoint image may also first be converted into a virtual viewpoint image; the foreground is removed on the basis of the virtual viewpoint image, the background is filled, a clean background image at the virtual viewpoint is formed, and that background image is used to fill the holes of the virtual viewpoint image.
  • Step 730: Fill the foreground holes with background pixel sets, generating a background image and a background depth map corresponding to the background image.
  • In one embodiment, the depth values at the holes are predicted from the depth values surrounding the holes in the intermediate background depth map, and the foreground holes in the intermediate background depth map are filled according to the predicted depth values.
  • The specific depth-value prediction is the method stated in step S330.
  • The foreground holes in the intermediate background image are filled with the image inpainting algorithm to which the depth-value restriction is added.
  • The specific depth-limited inpainting algorithm is the method stated in step S340.
  • Step 740: Fill the holes of the virtual viewpoint image synthesized from the camera viewpoint image, using the background image and the background depth map.
  • A virtual background image is generated by 3D-warping the background image with the first warping parameter under the rendering of the background depth map.
  • The virtual viewpoint image is generated by 3D-warping the camera viewpoint image with the first warping parameter under the rendering of the camera viewpoint depth map; the holes in the virtual viewpoint image are filled with the virtual background image.
  • In one embodiment, step S750 is further included before step 740; step S750 performs background edge extension on the background image and the background depth map of step 730. Filling the holes of the virtual viewpoint image synthesized from the camera viewpoint image with the extended background image and background depth map effectively fills the boundary holes in the virtual viewpoint image.
  • A terminal 800 is also provided, comprising a processor, an image processing unit, a storage medium, a memory, a network interface, a display screen, and an input device connected by a system bus.
  • The storage medium stores an operating system and computer-readable instructions which, when executed by the processor, implement a hole filling method for virtual video.
  • The processor provides the computing and control capabilities that support the operation of the entire terminal 800.
  • The image processing unit in the terminal 800 performs image compression, enhancement and restoration, matching, description and recognition, as well as the morphological erosion and dilation operations on images.
  • The memory provides the runtime environment for the virtual viewpoint video hole filling device in the storage medium.
  • The display screen displays images and videos, and the input device receives commands or data input by the user. It should be understood that the structure shown in FIG. 36 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the terminals to which the solution is applied; a specific terminal may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • All or part of the flows of the above method embodiments can be accomplished by a computer program instructing the relevant hardware. The program can be stored in a non-volatile computer-readable storage medium, for example in the storage medium of a computer system, and executed by at least one processor in the computer system to implement the flows comprising the embodiments of the methods described above.
  • The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Processing (AREA)

Abstract

A hole filling method, device and terminal for virtual viewpoint videos and images. The method includes the steps of: removing the set of pixel points corresponding to the foreground in a camera viewpoint video and its camera viewpoint depth map, forming an intermediate background video and an intermediate background depth map with foreground holes; filling the foreground holes with background pixel sets, generating a background video and a background depth map corresponding to the background video; and filling the holes of the virtual viewpoint video synthesized from the camera viewpoint video using the background video and the background depth map. The device corresponds to the above method. The embodiments of the present invention greatly improve the quality of virtual viewpoint video.

Description

Hole filling method, device and terminal for virtual viewpoint videos and images
TECHNICAL FIELD
The present invention relates to the field of 3D video, and in particular to a hole filling method, device and terminal for virtual viewpoint videos and images.
BACKGROUND
A virtual viewpoint video is a video at a virtual viewpoint, synthesized by 3D-transforming a camera viewpoint video captured by a camera in a 3D scene. Because the foreground occludes part of the scene in the camera viewpoint video, the background pixel information of the occluded portion is missing from each frame of the video. When the viewpoint is transformed to generate the virtual viewpoint video, this missing background information should become visible; since it cannot be obtained from the camera viewpoint video, background holes appear in the virtual viewpoint video.
Traditional hole repair methods for virtual viewpoint video generally obtain the filling information from temporal or spatial correlation in the video. In the time domain, for example, a region occluded by the foreground in the current frame may become visible in other frames because the foreground moves, so the background of the occluded region can be recovered by background modeling. However, since each frame of the virtual viewpoint video contains both a foreground part and a background part, traditional repair methods may fail to distinguish the two and use foreground pixels to fill background holes, so the repaired image is distorted and the video quality is poor.
SUMMARY
On this basis, it is necessary to provide a hole filling method, device and terminal for virtual viewpoint videos and images, addressing the distortion problem after hole filling in virtual viewpoint video.
A hole filling method for virtual viewpoint video includes the steps of:
acquiring a camera viewpoint video and the camera viewpoint depth map corresponding to the camera viewpoint video;
removing the set of pixel points corresponding to the foreground in the camera viewpoint video and the camera viewpoint depth map, forming an intermediate background video and an intermediate background depth map with foreground holes;
filling the foreground holes with background pixel sets, generating a background video and a background depth map corresponding to the background video;
filling the holes of the virtual viewpoint video synthesized from the camera viewpoint video, using the background video and the background depth map.
A hole filling method for virtual viewpoint images includes:
acquiring a camera viewpoint image and the camera viewpoint depth map corresponding to the camera viewpoint image;
removing the set of pixel points corresponding to the foreground in the camera viewpoint image and the camera viewpoint depth map, forming an intermediate background image and an intermediate background depth map with foreground holes;
filling the foreground holes with background pixel sets, generating a background image and a background depth map corresponding to the background image;
filling the holes of the virtual viewpoint image synthesized from the camera viewpoint image, using the background image and the background depth map.
A hole filling device for virtual viewpoint video includes:
a shooting module for acquiring a camera viewpoint video and the camera viewpoint depth map corresponding to the camera viewpoint video;
a foreground removal module for removing the set of pixel points corresponding to the foreground in the camera viewpoint video and the camera viewpoint depth map, forming an intermediate background video and an intermediate background depth map with foreground holes;
a background filling module for filling the foreground holes with background pixel sets, generating a background video and a background depth map corresponding to the background video;
a hole filling module for filling the holes of the virtual viewpoint video synthesized from the camera viewpoint video, using the background video and the background depth map.
A hole filling device for virtual viewpoint images includes:
a photographing module for acquiring a camera viewpoint image and the camera viewpoint depth map corresponding to the camera viewpoint image;
a foreground removal module for removing the set of pixel points corresponding to the foreground in the camera viewpoint image and the camera viewpoint depth map, forming an intermediate background image and an intermediate background depth map with foreground holes;
a background filling module for filling the foreground holes with background pixel sets, generating a background image and a background depth map corresponding to the background image;
a hole filling module for filling the holes of the virtual viewpoint image synthesized from the camera viewpoint image, using the background image and the background depth map.
A terminal includes a memory and a processor, the memory storing instructions which, when executed by the processor, cause the processor to perform the following steps:
acquiring a camera viewpoint video and the camera viewpoint depth map corresponding to the camera viewpoint video;
removing the set of pixel points corresponding to the foreground in the camera viewpoint video and the camera viewpoint depth map, forming an intermediate background video and an intermediate background depth map with foreground holes;
filling the foreground holes with background pixel sets, generating a background video and a background depth map corresponding to the background video;
filling the holes of the virtual viewpoint video synthesized from the camera viewpoint video, using the background video and the background depth map.
The above hole filling method and device for virtual viewpoint video construct a clean background video and background depth map free of foreground artifacts, and fill the holes in the virtual viewpoint video from them. Since the background video carries no foreground texture, background holes are never filled with foreground blocks; the distortion introduced by repair is avoided and the video quality of the virtual viewpoint video is improved.
BRIEF DESCRIPTION OF THE DRAWINGS
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flowchart of a hole filling method for virtual viewpoint video in an embodiment;
FIG. 2 is a flowchart of generating an intermediate background video and an intermediate background depth map in an embodiment;
FIG. 3 is a flowchart of extracting the set of foreground pixel points in a depth map in an embodiment;
FIG. 4 is a flowchart of extracting the set of foreground pixel points in a depth map in another embodiment;
FIG. 5 is a flowchart of extracting a foreground boundary and a background boundary in an embodiment;
FIG. 6 is a flowchart of generating a background video and a background depth map in an embodiment;
FIG. 7 is a flowchart of filling virtual-video holes with a background video in an embodiment;
FIG. 8 is a flowchart of extending the background video and the background depth map in an embodiment;
FIG. 9 is one frame of a camera viewpoint video in an embodiment;
FIG. 10 is the depth map corresponding to the image in FIG. 9;
FIG. 11 is the image of FIG. 9 with the foreground pixel set removed;
FIG. 12 is the image of FIG. 10 with the foreground pixel set removed;
FIG. 13 is a background video image after hole filling with background pixel sets;
FIG. 14 is a background depth image after hole filling with background pixel sets;
FIG. 15 is one frame of an unfilled virtual viewpoint video;
FIG. 16 is a depth map after preprocessing;
FIG. 17 illustrates constructing the minimum bounding rectangle of a foreground boundary;
FIG. 18 is a diagram of the foreground boundary iteration;
FIG. 19 is an intermediate diagram of obtaining the initialization seed points;
FIGS. 20 and 21 are a foreground probability map and a background probability map, respectively;
FIGS. 22 and 23 are the extracted foreground image and the extracted background image, respectively;
FIG. 24 shows a depth map before and after patching by depth-value prediction;
FIG. 25 is a schematic diagram of patching the background video;
FIG. 26 shows the extended regions of the background video and the background depth map;
FIG. 27 shows the extended background video and background depth map;
FIG. 28 is a structural block diagram of a hole filling device for virtual viewpoint video in an embodiment;
FIG. 29 is a structural block diagram of a foreground removal module in an embodiment;
FIG. 30 is a structural block diagram of one module for extracting the depth-map foreground in an embodiment;
FIG. 31 is a structural block diagram of another module for extracting the depth-map foreground in an embodiment;
FIG. 32 is a structural block diagram of a background filling module in an embodiment;
FIG. 33 is a structural block diagram of a hole filling module in an embodiment;
FIG. 34 is a flowchart of a hole filling method for virtual viewpoint images in an embodiment;
FIG. 35 is a flowchart of extending a background image and a background depth map in an embodiment;
FIG. 36 is a structural block diagram of a terminal in an embodiment.
DETAILED DESCRIPTION
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit its scope of protection.
FIG. 1 shows a schematic flowchart of a hole filling method for virtual viewpoint video in an embodiment. As shown in FIG. 1, the method includes the following steps:
Step S100: Acquire a camera viewpoint video and the camera viewpoint depth map corresponding to the camera viewpoint video.
The camera viewpoint video is a video recording of an event by a single camera. The camera viewpoint depth map corresponding to the camera viewpoint video is essentially the depth map of each frame in the camera viewpoint video; the value of a pixel in the depth map represents the distance between the camera and the physical point corresponding to that pixel in the scene, i.e. the depth value, in the range 0-255, where the farthest depth value is 0 and the nearest depth value is 255.
When a single camera records, the recording may be static, from a single viewpoint, or dynamic, with the viewpoint changed by moving or rotating the camera.
In one embodiment, the camera viewpoint video may be acquired by the camera's image sensor, and the camera viewpoint depth map by the camera's depth-sensing system.
Step S200: Remove the set of pixel points corresponding to the foreground in the camera viewpoint video and the camera viewpoint depth map, forming an intermediate background video and an intermediate background depth map with foreground holes.
The camera viewpoint video and the camera viewpoint depth map contain a foreground portion and a background portion. The foreground is closer to the camera than the background. In one embodiment the foreground is a moving foreground, and it may be one object or several. For example, FIG. 9 is one frame of a camera viewpoint video in which the woman and the man are the foreground, both foreground objects are moving, and the dance studio is the background; FIG. 10 is the depth map of the image in FIG. 9.
The set of pixel points corresponding to the foreground can be extracted from the difference in distance to the camera between the foreground and the background. Removing the extracted pixel set forms an intermediate background video and an intermediate background depth map with foreground holes: FIG. 11 is one frame of the intermediate background video, and FIG. 12 is the depth map in the intermediate background depth map corresponding to the image of FIG. 11.
Step S300: Fill the foreground holes with background pixel sets, generating a background video and a background depth map corresponding to the background video.
After the foreground portion is removed, filling means such as background modeling and image inpainting are applied to the intermediate background video and the intermediate background depth map to fill the foreground holes, generating a complete background video and background depth map without holes: FIG. 13 is one frame of the complete background video, and FIG. 14 is the depth map in the complete background depth map corresponding to the image of FIG. 13.
In this embodiment, the foreground hole regions are filled only after the foreground has been removed, so no foreground texture is carried into the reconstructed background; the restored background maintains good quality, and background-video distortion is well avoided.
Step S400: Fill the holes of the virtual viewpoint video synthesized from the camera viewpoint video, using the background video and the background depth map.
Rendered with the camera viewpoint depth map, the camera viewpoint video generates, after 3D mapping, a virtual video at the virtual viewpoint. Background that was occluded is exposed at the virtual viewpoint because of the viewpoint transformation, but the camera viewpoint video lacks the feature points of that occluded background, so background holes appear in the virtual viewpoint video. As shown in FIG. 15, which is one frame of the camera viewpoint video after 3D mapping to the virtual viewpoint, the white areas in the figure are the background holes.
In one embodiment, the above hole filling method for virtual viewpoint video may also be:
Acquire the camera viewpoint video and its corresponding camera viewpoint depth map. Synthesize the virtual viewpoint video from the camera viewpoint video and the camera viewpoint depth map, and remove the set of pixel points corresponding to the foreground in the virtual viewpoint video, forming an intermediate background video and an intermediate background depth map with foreground holes. Fill the foreground holes with background pixel sets, generating a background video and its corresponding background depth map. Fill the holes of the virtual viewpoint video using the background video and the background depth map.
This variant first converts the camera viewpoint video into a virtual viewpoint video, removes the foreground and fills the background on the basis of the virtual viewpoint video, forming a clean background video at the virtual viewpoint, and uses that background video to fill the holes of the virtual viewpoint video.
In this embodiment, the background holes in the virtual viewpoint video are filled with a clean background video free of foreground artifacts. That background video may be formed from the original camera viewpoint video or from the virtual viewpoint video; as long as a clean background video is used to fill the holes, the video distortion of traditional techniques that directly inpaint the virtual viewpoint video is overcome, the problem of misusing foreground feature points to fill background holes because foreground and background cannot be distinguished accurately is effectively avoided, and the quality of the virtual video is better.
In one embodiment, as shown in FIG. 2, step S200 includes:
Step S210: Extract the set of pixel points corresponding to the foreground in the camera viewpoint depth map.
Step S220: Determine the set of pixel points corresponding to the foreground in the camera viewpoint video. By mapping the positions of the foreground pixel set in the camera viewpoint depth map, the foreground pixel set in the camera viewpoint video can be determined.
Step S230: Remove the extracted pixel set from the camera viewpoint depth map and the corresponding pixel set from the camera viewpoint video. Removing the pixel sets obtained in steps S210 and S220 yields a background video with foreground holes and a depth map with foreground holes, defined respectively as the intermediate background video and the intermediate background depth map.
In a depth map, the depth values of foreground and background differ considerably. By first extracting the foreground in the depth map and then accurately removing the foreground pixel sets from the camera viewpoint video and the camera viewpoint depth map according to it, this embodiment makes foreground extraction and removal more accurate and efficient.
In one embodiment, as shown in FIG. 3, step S210 includes:
Step S212: Extract the foreground boundary in the camera viewpoint depth map.
In one embodiment, step S211 precedes step S212: filter and erode the camera viewpoint depth map.
The depth values of a single object in the depth map should be continuous, but noise and other factors may make them discontinuous, producing unreal edges. Such unreal edges interfere with the subsequent foreground extraction and reduce its accuracy. Filtering the camera viewpoint depth map reduces or even eliminates unreal edges. FIG. 16 is the filtered camera viewpoint depth map; compared with the unfiltered depth map of FIG. 10, each object in FIG. 16 is smoothed while the boundaries are preserved.
Because the boundary of a foreground target may fall in the background region, in one embodiment a morphological erosion is applied to the camera viewpoint depth map, shrinking the foreground to guarantee that the boundary extracted subsequently lies inside the foreground.
Extracting the foreground boundary after the above filtering and erosion not only avoids interference from unreal edges but also, thanks to the erosion, ensures that the extracted foreground boundary lies inside the foreground, so the extracted boundary is more accurate.
In one embodiment, after filtering and erosion, the Canny edge detector is used to extract the foreground boundary in the camera viewpoint depth map.
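As an illustration of this preprocessing chain (filtering, erosion, Canny), the following Python/OpenCV sketch mirrors the steps described above. It is a minimal sketch, not the patent's implementation: the median-filter kernel size, the structuring element, and the Canny thresholds are assumed values chosen for illustration.

    import cv2
    import numpy as np

    def extract_foreground_boundary(depth: np.ndarray) -> np.ndarray:
        """depth: single-channel uint8 depth map (0 = far, 255 = near)."""
        # Median filtering suppresses the unreal edges caused by noise
        # while preserving genuine object boundaries (cf. FIG. 16).
        smoothed = cv2.medianBlur(depth, 5)
        # Erosion shrinks the bright (near) foreground so that the
        # boundary extracted next lies inside the foreground region.
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
        eroded = cv2.erode(smoothed, kernel)
        # Canny edge detection on the preprocessed depth map yields
        # the foreground boundary; thresholds are illustrative.
        return cv2.Canny(eroded, 50, 150)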
Step S213: Starting from the foreground boundary, iteratively generate secondary boundaries, where the depth difference between corresponding points of a secondary boundary and its parent boundary is smaller than a preset range.
The pixel sets contained in the foreground boundary and the secondary boundaries constitute the set of pixel points corresponding to the foreground in the camera viewpoint depth map.
Within one foreground target, adjacent pixels have similar depth values. Based on the foreground boundary extracted in step S212, points whose depth differs from points on the boundary by less than a preset range and whose distance to the boundary is less than a set value are queried; the queried points form the secondary boundary of the foreground boundary. That secondary boundary then serves as the parent boundary for generating the next secondary boundary, and so on, continually expanding the boundary until the entire foreground is obtained.
In one embodiment, before generating a secondary boundary from its parent boundary, the method further includes:
constructing the minimum bounding rectangle of the parent boundary and computing, over all points inside that rectangle, the threshold given by the maximum between-class variance (Otsu) method.
The query for points whose depth differs from boundary points by less than the preset range and whose distance is less than the set value then becomes: query the depth values of points inside the minimum bounding rectangle and compare them with the Otsu threshold; a point greater than the threshold whose distance to the corresponding point on the parent boundary is less than the preset value is a point of the secondary boundary. Constructing the minimum bounding rectangle allows the threshold to be set more accurately and makes the query result more accurate; it also narrows the search range and improves search efficiency.
The specific iteration principle is as follows:
The foreground boundary and the secondary boundaries can be represented as point sets that grow into foreground targets. Let T denote the set of unallocated points adjacent to at least one foreground target F_i, characterized by
T = { x ∉ ∪_i F_i : N(x) ∩ ∪_i F_i ≠ ∅ }
where N(x) denotes the set of points directly adjacent to pixel x, and PMBR_i denotes the minimum bounding rectangle (MBR) of F_i; the obtained minimum bounding rectangle is shown in FIG. 17.
If N(x) overlaps one of the foreground targets F_j, the distance between x and its overlapping region is defined as the depth difference between x and the adjacent points of F_j:
δ(x) = min_{y ∈ N(x) ∩ F_j} |Z(x) - Z(y)|
where Z(·) is the depth value. A pixel x is newly added to the foreground target F_j if it satisfies
δ(x) < β, and Z(x) > Otsu(PMBR_j)
where β is a small value and the Otsu function is the threshold of the maximum between-class variance method; a depth value greater than this threshold is one condition for judging the pixel to be a foreground target. These two conditions come from properties of the depth map: adjacent pixels within one foreground target have similar depth values, and a foreground target's depth value is greater than that of the background it covers.
The new foreground targets are used as input to the next iteration, and the iteration ends when T = ∅. The iterative process is shown in FIG. 18.
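A compact sketch of this region-growing iteration follows, assuming the foreground boundary has already been extracted as a boolean mask. The value of β and the use of a dilated region depth to approximate δ(x) are simplifying assumptions of this sketch; the patent fixes neither.

    import cv2
    import numpy as np

    def grow_foreground(depth: np.ndarray, boundary: np.ndarray,
                        beta: int = 8) -> np.ndarray:
        """depth: uint8 depth map; boundary: boolean foreground-boundary mask."""
        fg = boundary.copy()
        kernel = np.ones((3, 3), np.uint8)
        while True:
            # Otsu threshold restricted to the minimum bounding
            # rectangle (PMBR) of the current region.
            x, y, w, h = cv2.boundingRect(fg.astype(np.uint8))
            roi = depth[y:y + h, x:x + w]
            otsu, _ = cv2.threshold(roi, 0, 255,
                                    cv2.THRESH_BINARY + cv2.THRESH_OTSU)
            # Candidate set T: unallocated 8-neighbours of the region.
            dilated = cv2.dilate(fg.astype(np.uint8), kernel) > 0
            candidates = dilated & ~fg
            # delta(x): depth difference to the nearest region pixel,
            # approximated here by dilating the region's depth values.
            region_depth = np.where(fg, depth, 0).astype(np.uint8)
            near_depth = cv2.dilate(region_depth, kernel)
            close = np.abs(depth.astype(int) - near_depth.astype(int)) < beta
            new_pts = candidates & close & (depth > otsu)
            if not new_pts.any():
                return fg          # T is empty: the iteration ends
            fg |= new_pts          # new points seed the next iteration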
In one embodiment, as shown in FIG. 4, step S210 includes:
Step S214: Extract the foreground boundary and the background boundary in the camera viewpoint depth map. As shown in FIG. 19, the inner black edge is the foreground boundary and the outer white edge is the background boundary.
Step S215: Using the foreground boundary and the background boundary as seed points, compute the foreground/background probability distribution in the camera viewpoint depth map, and thereby determine the set of pixel points corresponding to the foreground.
In one embodiment, a random walk segmentation algorithm performs random walk segmentation on the foreground-boundary and background-boundary seed points, computing the foreground/background probability distribution in the camera viewpoint depth map and thereby determining the foreground pixel set. The algorithm for processing the seed points is not limited to random walk segmentation; any algorithm that can compute the foreground/background probability distribution from seed points can be used.
Let the label set of the foreground and background boundary seed points be S = {s1, s2}, where s1 and s2 denote the labels of the foreground target and the background. With the initialized seed points, the foreground or background probability distribution of each point is obtained by solving
L_U · x^s = -B^T · m^s
where L_U is the weight (Laplacian) block corresponding to the non-seed nodes and B^T is the transposed off-diagonal block. Further, let x_i^s be the probability that node v_i first walks to label s, and define the label function of the seed nodes as Q(v_j) = s for each seed node v_j ∈ V_M, with s ∈ S, S = {s1, s2}. For each label s, define the |V_M|×1 vector m^s whose value at seed node v_j ∈ V_M is m_j^s = 1 if Q(v_j) = s, and 0 otherwise.
FIGS. 20 and 21 show the probabilities of the walker reaching the foreground label and the background label, respectively. The higher the value in the grayscale map, the higher the probability; the label with the highest probability value is taken as the label of each non-seed node. The segmentation results for the foreground and the background are shown in FIGS. 22 and 23.
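The seed-based segmentation can be sketched with scikit-image's random walker, which performs this kind of seeded first-arrival probability computation. The beta diffusion parameter is an assumed value; the foreground and background boundary masks are taken as produced by steps S2142 and S2143.

    import numpy as np
    from skimage.segmentation import random_walker

    def segment_foreground(depth, fg_boundary, bg_boundary, beta=130):
        """depth: 2-D depth map; *_boundary: boolean seed masks."""
        data = depth.astype(float)
        labels = np.zeros(depth.shape, dtype=np.int32)
        labels[fg_boundary] = 1   # foreground seeds, label s1
        labels[bg_boundary] = 2   # background seeds, label s2
        # return_full_prob=True yields, per label, the probability that
        # a walker starting at each pixel first reaches that label's
        # seeds (the maps of FIGS. 20 and 21).
        prob = random_walker(data, labels, beta=beta,
                             return_full_prob=True)
        # Each non-seed pixel takes the label of highest probability.
        return prob[0] > prob[1]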
In one embodiment, as shown in FIG. 5, step S214 includes:
Step S2141: Filter the camera viewpoint depth map. The processing of this step is the same as the filtering in step S211.
Step S2142: Erode the camera viewpoint depth map and extract the foreground boundary. The processing of this step is the same as the erosion in step S211 and the foreground boundary extraction in step S212.
Step S2143: Dilate the camera viewpoint depth map and extract the background boundary. The morphological dilation guarantees that the extracted background boundary falls inside the background region, ensuring the accuracy of the background boundary extraction.
In one embodiment, step S300 performs background modeling on the intermediate background video and the intermediate background depth map of step S200, filling the foreground hole regions by letting the background pixel sets of the images in the intermediate background video and the intermediate background depth map complement one another.
A video is a time function over many frames, and the intermediate background video includes frames at different moments. In one embodiment the foreground is a moving foreground; as it moves, a background region occluded by the foreground at one moment may appear in the image of another moment. Background modeling exploits this property, filling the foreground hole regions through the mutual complementing of background pixel sets between images and generating a clean background video and background depth map free of foreground artifacts.
The background modeling of this embodiment operates on the intermediate background video and intermediate background depth map from which the foreground has already been removed, so foreground image blocks are never misused to fill holes and the generated background video is not distorted.
In one embodiment, as shown in FIG. 6, the camera viewpoint video is a video with a dynamic camera viewpoint.
Before the step of background modeling on the intermediate background video and the intermediate background depth map, the method further includes:
Step S310: Acquire the mapping relationships between video segments at different viewpoints in the camera viewpoint video.
A dynamic camera viewpoint video is recorded while the camera is not stationary, so the camera viewpoint is dynamic. When the camera is not stationary and the two moments to be mapped correspond to different camera viewpoints, background mapping cannot be performed directly. To handle a moving camera, the background modeling of this embodiment is an improved background modeling with motion compensation.
Specifically, SURF detection and the RANSAC algorithm are used to obtain the mapping relationships between video segments at different viewpoints in the camera viewpoint video. SURF detects and describes the feature points of the current frame and the reference frame; to improve robustness, the RANSAC algorithm optimizes the matching of the feature point pairs. Once the feature point pairs are matched, the homography matrix is obtained, and the model parameters of one moment are mapped to another moment by projective transformation.
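A sketch of this feature-matching step is given below. SURF is patented and only ships with opencv-contrib builds, so ORB is substituted here as a freely available detector; that substitution, the feature count, and the RANSAC reprojection threshold are assumptions of this sketch, not the patent's choices.

    import cv2
    import numpy as np

    def estimate_homography(cur: np.ndarray, ref: np.ndarray) -> np.ndarray:
        """Estimate the homography mapping current-frame coordinates
        to the reference frame's viewpoint."""
        orb = cv2.ORB_create(2000)
        k1, d1 = orb.detectAndCompute(cur, None)
        k2, d2 = orb.detectAndCompute(ref, None)
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = matcher.match(d1, d2)
        src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        # RANSAC rejects mismatched pairs before the homography is fit.
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
        return H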
Performing background modeling on the intermediate background video and the intermediate background depth map includes:
Step S320: When two mutually complementing images are images at different viewpoints, map the model parameters of the two images to the same viewpoint according to the mapping relationship; the background pixel sets of the two mapped images then complement each other to fill the foreground hole regions.
Specifically, the first image and the second image are images that can complement each other. When they correspond to video segments at different viewpoints, the model parameters of the first image are mapped to the viewpoint of the second image according to the mapping relationship, and part or all of the foreground hole region of the second image is complemented by the background pixel set of the mapped first image.
In one embodiment, the background model is a Gaussian mixture model, specifically background modeling between two adjacent moments. The mixture of Gaussians is
p(I_x,t) = Σ_{i=1..K} w_x,i,t · η(I_x,t; μ_x,i,t, σ_x,i,t)
where p(I_x,t) is the probability density of the pixel at coordinate x at time t, η is the Gaussian function, I_x,t is the pixel value of the pixel at coordinate x at time t, μ_x,i,t and σ²_x,i,t are the mean and variance of that pixel's i-th Gaussian, and w_x,i,t is the weight of its i-th Gaussian, satisfying Σ_{i=1..K} w_x,i,t = 1.
B(x_t) is the background mask of the pixel at coordinate x at time t: B(x_t) = 0 when the model is empty, and B(x_t) = 1 when the model is not empty.
The detailed processing of the background model is as follows:
First, at time t0, all Gaussian models are initialized, where σ0 is a preset large value used for the initial variance. F(x_t) denotes the foreground mask of the pixel at coordinate x at time t: if pixel x_t is detected as a foreground pixel, F(x_t) = 1; otherwise F(x_t) = 0.
Next, for each following frame, the background model parameters of time t-1 are all mapped to time t by projective transformation. Using the homography H_{t:t-1}, the coordinate x_t at time t is mapped to the corresponding coordinate x′_{t-1} at time t-1, and the background model parameters of the pixel at coordinate x_t at time t are updated from the pixel at coordinate x′_{t-1} at time t-1:
μ_x,i,t-1 = μ_x′,i,t-1
σ²_x,i,t-1 = σ²_x′,i,t-1
w_x,i,t-1 = w_x′,i,t-1
B(x_t-1) = B(x′_t-1)
If the current pixel is not a foreground pixel (F(x_t) = 0), the background model is updated. The current pixel is matched against the K Gaussian models; for model i, if the pixel value falls within the matching range of that Gaussian, the matching process stops. The matched Gaussian model is updated as
μ_x,i,t = (1-ρ)·μ_x,i,t-1 + ρ·I_x,t
σ²_x,i,t = (1-ρ)·σ²_x,i,t-1 + ρ·(I_x,t - μ_x,i,t)²
w_x,i,t = (1-α)·w_x,i,t-1 + α
and the other Gaussian models are updated as
μ_x,i,t = μ_x,i,t-1
σ²_x,i,t = σ²_x,i,t-1
w_x,i,t = (1-α)·w_x,i,t-1
where ρ = α·η(I_x,t; μ_x,i,t-1, σ_x,i,t-1) and α is the learning rate.
If none of the Gaussian models matches the current pixel, a new Gaussian model is introduced with σ_x,t = σ0 and ω_x,t = w0, where w0 is a small weight used to weed out Gaussian models with small ω/σ values. The means and variances of the other Gaussian models remain unchanged, and the weights of the K Gaussian models are normalized so that they sum to 1.
The remaining video frames are processed with the same method. Finally, the K Gaussian models are sorted in descending order of their ω/σ values, and the value bp(x_t) of the background pixel at time t is obtained by
bp(x_t) = μ_x,1,t, if B(x_t) = 1.
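As a compact stand-in for these per-pixel update equations, OpenCV's MOG2 background subtractor maintains a mixture of Gaussians per pixel and exposes the estimated background image. Warping each frame to a common reference viewpoint with the homography from the previous sketch approximates the model-parameter mapping H_{t:t-1}; this is a simplification of the patent's motion-compensated update, not a faithful reimplementation.

    import cv2

    def build_background(frames, homographies, size):
        """frames: intermediate-background frames (foreground removed);
        homographies: per-frame mappings to the reference viewpoint;
        size: (width, height) of the reference view."""
        mog = cv2.createBackgroundSubtractorMOG2(history=len(frames),
                                                 detectShadows=False)
        for frame, H in zip(frames, homographies):
            aligned = cv2.warpPerspective(frame, H, size)
            mog.apply(aligned)   # updates the per-pixel Gaussians
        # Per-pixel background estimate, analogous to bp(x_t).
        return mog.getBackgroundImage()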
In one embodiment, after the foreground hole regions are filled by the dynamic background modeling of steps S310 and S320, the following optimization steps fill the hole portions that remain unfilled after the processing of steps S310 and S320.
Step S330: Predict the depth values inside the holes from the depth values surrounding the holes of the intermediate background depth map, and patch the holes in the intermediate background depth map according to the predicted depth values.
In the depth map, since there is no foreground interference, the hole regions of the depth map can be considered to lie in the same plane as the surrounding background; the depth values patching a hole can therefore be taken as consistent with the surrounding background depth values or as varying linearly with them. Based on this property, the depth values inside the holes are predicted.
Because the predicted depth map may contain errors or lack smoothness, in one embodiment an energy function is established and the label assignment f that minimizes it is obtained, as in
E(f) = Σ_p D_p(f_p) + Σ_{(p,q)∈N} V(f_p, f_q)
where N is the set of mutually adjacent point pairs; V(f_p, f_q) is the cost between the two labels f_p and f_q of adjacent pixels, representing the discontinuity cost; and D_p(f_p) is the cost between the assigned label f_p and pixel p, representing the data cost. Here V(f_p, f_q) and D_p(f_p) are defined as
V(f_p, f_q) = min((f_p - f_q)², DISC_K)
D_p(f_p) = λ·min((Z_p - f_p)², DATA_K)
where λ is a weight coefficient, and DISC_K and DATA_K control when the cost penalty stops growing. The effect of patching the depth map by depth-value prediction is compared before and after in FIG. 24.
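As a minimal stand-in for this prediction-plus-smoothing step, OpenCV's generic inpainting can fill the remaining depth holes from their surroundings, which is reasonable under the same assumption that the holes lie in the plane of the surrounding background; it substitutes a generic diffusion-based fill for the energy minimization above.

    import cv2
    import numpy as np

    def fill_depth_holes(depth: np.ndarray, hole_mask: np.ndarray) -> np.ndarray:
        """depth: uint8 depth map; hole_mask: uint8, non-zero at holes."""
        # INPAINT_TELEA propagates surrounding depth values into the holes.
        return cv2.inpaint(depth, hole_mask, inpaintRadius=5,
                           flags=cv2.INPAINT_TELEA)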
Step S340: Patch the filled intermediate background video with an image inpainting algorithm to which a depth-value restriction is added.
In one embodiment, the holes in the intermediate background video are repaired with the Criminisi algorithm (the inpainting algorithm based on texture features and structure information proposed by Criminisi et al.). A depth-value restriction is added to the Criminisi algorithm: if the depth values show that a candidate image block is a foreground block, the candidate block is discarded. This effectively prevents the foreground from being used to fill the hole regions and flawing the repaired image.
The specific repair method is as follows:
FIG. 25 shows the principle of the Criminisi algorithm. For an input image I, Ω is the unknown region (the hole region) and the source region Φ is defined as
Φ = I - Ω
The boundary of the hole region Ω is denoted δΩ. At a boundary point p ∈ δΩ, the priority of the image block Ψ_p centered at p is computed as
P(p) = C(p) · D(p)
The confidence term C(p) and the data term D(p) are defined as
C(p) = ( Σ_{q ∈ Ψ_p ∩ Φ} C(q) ) / |Ψ_p|,    D(p) = |∇I_p^⊥ · n_p| / α
where |Ψ_p| is the area of Ψ_p, α is a normalization factor (for example, α = 255 for a typical grayscale image), n_p is the unit vector orthogonal to the boundary δΩ at point p, and ∇I_p^⊥ indicates the direction of the image structure. C(p) represents the percentage of pixels of the image block Ψ_p belonging to the non-hole region; at initialization, C(q) = 0 for pixels of the hole region and C(q) = 1 elsewhere.
When the priorities of all points on the boundary δΩ have been determined, the point with the highest priority, p̂ = argmax_{p ∈ δΩ} P(p), is found, with corresponding image block Ψ_p̂. The image block most similar to Ψ_p̂ is then found and used for filling, chosen as
Ψ_q̂ = argmin_{Ψ_q ∈ Φ′} d(Ψ_p̂, Ψ_q)
where the distance d(Ψ_a, Ψ_b) between image blocks Ψ_a and Ψ_b uses the Sum of Squared Differences (SSD) over the known pixels of the two blocks.
After the image block Ψ_p̂ is filled, C(p) inside it is updated as
C(p) = C(p̂), for all p ∈ Ψ_p̂ ∩ Ω
Because foreground targets might otherwise be used to fill the hole regions, flawing the repaired image, the above Criminisi algorithm restricts the search for the block most similar to Ψ_p̂: a depth-information constraint is added to the search region, regions whose depth deviates strongly from that of Ψ_p̂ are excluded, and the best-matching block is sought only among regions of similar depth. The search region Φ′ is defined as Φ′ = Φ - Y.
In one embodiment, Y is the region whose depth value is less than ξ3 · d̄(Ψ_p̂), where ξ3 is a scaling factor less than 1; in one embodiment, ξ3 is 0.85, or 0.95, or any value between 0.85 and 0.95. In one embodiment, Y is the region whose depth value is greater than ξ4 · d̄(Ψ_p̂), where ξ4 is a scaling factor greater than 1; in one embodiment, ξ4 is 1.05, or 1.15, or any value between 1.05 and 1.15. Here d̄(Ψ_p̂) is the average of the depth values of the image block Ψ_p̂ over its known pixels:
d̄(Ψ_p̂) = ( Σ_{q ∈ Ψ_p̂ ∩ Φ} Z_q ) / |Ψ_p̂ ∩ Φ|
The block most similar to Ψ_p̂ is then sought within the search region Φ′.
In one embodiment, with ξ3 = 0.9 and ξ4 = 1.1, the background video shown in FIG. 13 is obtained.
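The depth restriction on the exemplar search can be sketched as a filter applied before the SSD comparison, as below. The patch size, scan stride, and ξ3/ξ4 defaults follow the example values above; the exhaustive scan is a simplification of a practical search.

    import numpy as np

    def best_source_patch(img, depth, known, p, half=4, xi3=0.9, xi4=1.1):
        """Top-left corner of the best source patch for the target patch
        centred at p = (row, col). known: boolean mask of source pixels."""
        y, x = p
        tgt = img[y-half:y+half+1, x-half:x+half+1].astype(float)
        msk = known[y-half:y+half+1, x-half:x+half+1]
        d_bar = depth[y-half:y+half+1, x-half:x+half+1][msk].mean()
        best, best_cost = None, np.inf
        rows, cols = img.shape[:2]
        for yy in range(half, rows - half, 2):
            for xx in range(half, cols - half, 2):
                if not known[yy-half:yy+half+1, xx-half:xx+half+1].all():
                    continue   # candidate blocks must lie entirely in the source region
                d_src = depth[yy-half:yy+half+1, xx-half:xx+half+1].mean()
                if not (xi3 * d_bar <= d_src <= xi4 * d_bar):
                    continue   # depth limit: discards likely foreground blocks
                diff = (img[yy-half:yy+half+1, xx-half:xx+half+1]
                        .astype(float) - tgt)[msk]
                cost = (diff ** 2).sum()   # SSD over the known pixels
                if cost < best_cost:
                    best, best_cost = (yy - half, xx - half), cost
        return best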
In one embodiment, the filling of the foreground holes in the intermediate background video and the intermediate background depth map may be accomplished by steps S330 and S340 alone, without the background modeling of steps S310 and S320.
In one embodiment, as shown in FIG. 7, step S400 includes:
Step S410: 3D-warp the background video with a first warping parameter under the rendering of the background depth map, generating a virtual background video. In one embodiment, the first warping parameter is a rotation angle, an offset displacement, or a rotation angle combined with a set displacement.
Step S420: 3D-warp the camera viewpoint video with the first warping parameter under the rendering of the camera viewpoint depth map, generating the virtual viewpoint video. The camera viewpoint video is 3D-warped with the same warping parameter as in step S410.
Step S430: Fill the holes in the virtual viewpoint video with the virtual background video.
After the identical 3D warp, each frame of the virtual background video corresponds one-to-one with a frame of the virtual viewpoint video; synchronously mapping the image frames of the virtual background video into the corresponding frames of the virtual viewpoint video fills the holes of the virtual viewpoint video. Filling the holes of the virtual viewpoint video with a clean, flawless background video never fills holes with foreground image blocks, so the filling effect is better and video distortion is avoided. Moreover, especially when the virtual viewpoint video is frame-paired multi-view video, filling the holes from the background video only requires mapping the background video frames one by one; the hole filling is efficient, solving the repeated-filling problem of traditional methods that directly repair the virtual viewpoint views.
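The filling step itself then reduces to a per-frame masked copy: after both videos have been warped with the same parameters, the co-located pixels of the virtual background frame are written into the holes of the virtual viewpoint frame. A minimal sketch:

    import numpy as np

    def fill_frame(virtual_frame, virtual_background, hole_mask):
        """hole_mask: boolean, True where the warped virtual view has no
        pixel (the white regions of FIG. 15)."""
        out = virtual_frame.copy()
        out[hole_mask] = virtual_background[hole_mask]
        return out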
In one embodiment, step S500 is further included before step S400; step S500 performs background edge extension on the background video and the background depth map of step S300.
Because the shooting range is limited, some boundary regions of the virtual viewpoint video at the new viewing angle after the 3D warp have no corresponding region in the background video obtained from the camera viewpoint video, so after the 3D warp large holes also appear at parts of the boundary of the virtual viewpoint view, as shown in FIG. 15. In this embodiment, background edge extension is performed on the background video and the background depth map, and step S400 fills the holes of the virtual viewpoint video synthesized from the camera viewpoint video using the extended background video and background depth map, effectively filling the boundary holes in the virtual viewpoint video.
In one embodiment, as shown in FIG. 8, step S500 includes:
Step S510: Reverse-map the virtual viewpoint video to the camera viewpoint to obtain the extension boundary. Specifically, the virtual viewpoint video is first mapped to global coordinates and then projectively transformed to the camera viewpoint, i.e. to the viewpoint of the background video, yielding the extension boundary; as shown in FIG. 26, the upper edge and the left edge are the extended regions.
Step S520: Extend the background video and the background depth map according to the extension boundary, until they reach the extension boundary. In one embodiment, the background video is extended with the method described in step S340, and the background depth map with the method described in step S330; FIG. 27 shows the extended background video and background depth map.
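Under the simplifying assumption that the view change can be approximated by a homography H_v (the patent uses a full 3D mapping through global coordinates), the extension boundary can be sketched by mapping the virtual view's corners back to the camera viewpoint and padding the background to the enclosing rectangle:

    import cv2
    import numpy as np

    def extension_bounds(H_v: np.ndarray, w: int, h: int):
        """H_v: mapping from camera view to virtual view; (w, h): frame size.
        Returns the left/top/right/bottom padding needed by the background."""
        corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
        # Reverse mapping: project the virtual view back to the camera view.
        back = cv2.perspectiveTransform(corners, np.linalg.inv(H_v)).reshape(-1, 2)
        x_min, y_min = back.min(axis=0)
        x_max, y_max = back.max(axis=0)
        left = int(max(0, -np.floor(x_min)))
        top = int(max(0, -np.floor(y_min)))
        right = int(max(0, np.ceil(x_max) - w))
        bottom = int(max(0, np.ceil(y_max) - h))
        return left, top, right, bottom   # e.g. FIG. 26: top and left extended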
在一个实施例中,如图28所示,提出了一种虚拟视点视频的空洞填充的装置,包括:
拍摄模块610,用于获取相机视点视频和相机视点视频对应的相机视点深度图。
前景移除模块620,用于移除相机视点视频和相机视点深度图中的前景对应的像素点集,形成带有前景空洞的中间背景视频和中间背景深度图。
背景填充模块630,用于以背景的像素点集填充前景空洞,生成背景视频和背景视频对应的背景深度图。
空洞填充模块640,用于使用背景视频和背景深度图填充相机视点视频 合成的虚拟视点视频的空洞。
在其中一个实施例中,如图29所示,前景移除模块620包括:
提取深度图前景模块621,用于提取相机视点深度图中的前景对应的像素点集。
提取视频前景模块622,用于确定相机视点视频中前景对应的像素点集。
移除模块623,用于移除提取相机视点深度图中的像素点集和相机视点视频中的像素点集。
In one embodiment, as shown in FIG. 30, the depth-map foreground extraction module 621 includes:
a foreground boundary extraction module 6211, configured to extract the foreground boundary in the camera-viewpoint depth map;
an iteration module 6212, configured to iteratively generate secondary boundaries in sequence from the foreground boundary, the difference between the depth values of corresponding points of a secondary boundary and its parent boundary being less than a preset range;
wherein the pixel-point sets contained in the foreground boundary and the secondary boundaries constitute the pixel-point set corresponding to the foreground.
In one embodiment, as shown in FIG. 31, the depth-map foreground extraction module 621 includes:
a foreground-and-background boundary extraction module 6213, configured to extract the foreground boundary and the background boundary in the camera-viewpoint depth map;
a probability calculation module 6214, configured to compute the foreground/background probability distribution in the camera-viewpoint depth map using the foreground boundary and the background boundary as seed points, thereby determining the pixel-point set corresponding to the foreground.
In one embodiment, the foreground boundary extraction module is further configured to filter the camera-viewpoint depth map and to apply erosion to the camera-viewpoint depth map.
In one embodiment, the foreground-and-background boundary extraction module is further configured to filter the camera-viewpoint depth map, to apply erosion to the camera-viewpoint depth map and extract the foreground boundary, and to apply dilation to the camera-viewpoint depth map and extract the background boundary.
In one embodiment, the background filling module 630 includes:
a background modeling module, configured to perform background modeling on the intermediate background video and the intermediate background depth map, filling the foreground hole regions through the mutual complementation of background pixel-point sets between images in the intermediate background video and the intermediate background depth map.
In one embodiment, the camera-viewpoint video is a video from a dynamic camera viewpoint.
As shown in FIG. 32, the background filling module further includes:
a motion compensation module 631, configured to acquire the mapping relationships between video segments at different viewpoints in the camera-viewpoint video;
the background modeling module 632 being further configured to, when two mutually complementary images are images at different viewpoints, map the model parameters corresponding to the two images to the same viewpoint according to the mapping relationship, the background pixel-point sets in the two mapped images complementing each other to fill the foreground hole regions.
In one embodiment, the background filling module further includes:
a background depth map repair module 633, configured to predict the depth values at the holes from the pixel-point sets in the intermediate background depth map and repair the filled intermediate background depth map according to the predicted depth values;
a background video repair module 634, configured to repair the filled intermediate background video using the image inpainting algorithm with the added depth-value constraint.
In one embodiment, as shown in FIG. 33, the hole filling module 640 includes:
a background video warping module 641, configured to 3D-warp the background video with the first warping parameter under the rendering of the background depth map, generating a virtual background video;
a camera-viewpoint video warping module 642, configured to 3D-warp the camera-viewpoint video with the first warping parameter under the rendering of the camera-viewpoint depth map, generating the virtual viewpoint video;
a filling module 643, configured to fill the holes in the virtual viewpoint video using the virtual background video.
In one embodiment, the hole filling module further includes:
a background extension module, configured to inverse-map the virtual viewpoint video to the camera viewpoint to obtain an extension region, and to extend the background video and the background depth map according to the extension region;
the filling module being configured to fill the holes in the virtual viewpoint video using the extended background video and the extended background depth map.
In one embodiment, as shown in FIG. 34, a method for hole filling of a virtual viewpoint image is further provided, including the following steps:
Step S710: acquire a camera-viewpoint image and a camera-viewpoint depth map corresponding to the camera-viewpoint image. The camera-viewpoint image is an image captured by a camera. FIG. 9 shows a camera-viewpoint image in which the woman and the man are the foreground; FIG. 10 is the depth map of the image in FIG. 9.
Step S720: remove the pixel-point sets corresponding to the foreground in the camera-viewpoint image and the camera-viewpoint depth map, forming an intermediate background image and an intermediate background depth map with foreground holes. The specific method is the same as the foreground-removal method for video described in step S200 and its sub-steps.
In one embodiment, the camera-viewpoint image may also be converted to the virtual viewpoint image first; the foreground is then removed and the background filled on the basis of the virtual viewpoint image, forming a clean background image at the virtual viewpoint, which is then used to fill the holes of the virtual viewpoint image.
Step S730: fill the foreground holes with background pixel-point sets, generating a background image and a background depth map corresponding to the background image.
In one embodiment, the depth values at the holes are predicted from the depth values around the holes in the intermediate background depth map, and the foreground holes in the intermediate background depth map are filled according to the predicted depth values; the specific depth-value prediction method is the same as that stated in step S330. The foreground holes in the intermediate background image are filled using the image inpainting algorithm with the added depth-value constraint; the specific algorithm is the same as that stated in step S340.
Step S740: fill the holes of the virtual viewpoint image synthesized from the camera-viewpoint image using the background image and the background depth map.
The background image is 3D-warped with the first warping parameter under the rendering of the background depth map to generate a virtual background image. The camera-viewpoint image is 3D-warped with the first warping parameter under the rendering of the camera-viewpoint depth map to generate the virtual viewpoint image. The holes in the virtual viewpoint image are then filled using the virtual background image.
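The single-image pipeline of steps S720 to S740 can be wired together as in the following Python sketch; `warp`, `inpaint_depth`, and `inpaint_img` are caller-supplied callables implementing the 3D warp and the repairs described above, and their interfaces (including the returned validity masks, which are numpy boolean arrays) are hypothetical, for illustration only.

def fill_virtual_image(cam_img, cam_depth, fg_mask,
                       warp, inpaint_depth, inpaint_img):
    """End-to-end sketch of the single-image method (steps S720-S740)."""
    bg_img = cam_img.copy()
    bg_depth = cam_depth.copy()
    bg_img[fg_mask] = 0                                  # S720: remove the foreground,
    bg_depth[fg_mask] = 0                                # leaving foreground holes
    bg_depth = inpaint_depth(bg_depth, fg_mask)          # S730, as in step S330
    bg_img = inpaint_img(bg_img, bg_depth, fg_mask)      # S730, as in step S340
    virt_bg, bg_valid = warp(bg_img, bg_depth)           # S740: both warps use the
    virt_img, img_valid = warp(cam_img, cam_depth)       # same first warping parameter
    holes = ~img_valid & bg_valid
    virt_img[holes] = virt_bg[holes]                     # fill from the clean background
    return virt_img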
In one embodiment, as shown in FIG. 35, a step S750 is further included before step S740; step S750 performs background edge extension on the background image and the background depth map from step S730. The holes of the virtual viewpoint image synthesized from the camera-viewpoint image are filled using the extended background image and background depth map, effectively filling the boundary holes in the virtual viewpoint image.
As shown in FIG. 36, a terminal 800 is further provided, including a processor, an image processing unit, a storage medium, a memory, a network interface, a display screen, and an input device connected through a system bus. The storage medium stores an operating system and computer-readable instructions which, when executed by the processor, implement a method for hole filling of virtual viewpoint video. The processor provides computing and control capabilities and supports the operation of the entire terminal 800. The image processing unit in the terminal 800 performs image compression, enhancement and restoration, matching, description and recognition, and can also perform morphological erosion and dilation operations on images. The memory provides an environment for running the virtual-viewpoint-video hole-filling apparatus stored in the storage medium. The display screen displays images and videos, and the input device receives commands or data entered by a user. It will be understood that the structure shown in FIG. 36 is merely a block diagram of a portion of the structure related to the solution of the present application and does not limit the terminal to which the solution is applied; a specific terminal may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
Those of ordinary skill in the art will understand that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium; in the embodiments of the present invention, the program may be stored in a storage medium of a computer system and executed by at least one processor in the computer system to implement the processes including the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments have been described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present invention, and their descriptions are relatively specific and detailed, but they should not therefore be construed as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art may make several variations and improvements without departing from the concept of the present invention, all of which fall within the protection scope of the present invention. Therefore, the protection scope of the present invention patent shall be subject to the appended claims.

Claims (21)

  1. A method for hole filling of a virtual viewpoint video, comprising:
    acquiring a camera-viewpoint video and a camera-viewpoint depth map corresponding to the camera-viewpoint video;
    removing pixel-point sets corresponding to a foreground in the camera-viewpoint video and the camera-viewpoint depth map, forming an intermediate background video and an intermediate background depth map with foreground holes;
    filling the foreground holes with background pixel-point sets, generating a background video and a background depth map corresponding to the background video;
    filling holes of a virtual viewpoint video synthesized from the camera-viewpoint video using the background video and the background depth map.
  2. The method according to claim 1, wherein the step of removing the pixel-point sets corresponding to the foreground in the camera-viewpoint video and the camera-viewpoint depth map comprises:
    extracting the pixel-point set corresponding to the foreground in the camera-viewpoint depth map;
    determining the pixel-point set corresponding to the foreground in the camera-viewpoint video;
    removing the extracted pixel-point set in the camera-viewpoint depth map and the pixel-point set in the camera-viewpoint video.
  3. The method according to claim 2, wherein the step of extracting the pixel-point set corresponding to the foreground in the camera-viewpoint depth map comprises:
    extracting a foreground boundary in the camera-viewpoint depth map;
    iteratively generating secondary boundaries in sequence from the foreground boundary, the difference between the depth values of corresponding points of a secondary boundary and its parent boundary being less than a preset range;
    wherein the pixel-point sets contained in the foreground boundary and the secondary boundaries constitute the pixel-point set corresponding to the foreground.
  4. The method according to claim 2, wherein the step of extracting the pixel-point set corresponding to the foreground in the camera-viewpoint depth map comprises:
    extracting a foreground boundary and a background boundary in the camera-viewpoint depth map;
    computing a foreground/background probability distribution in the camera-viewpoint depth map using the foreground boundary and the background boundary as seed points, thereby determining the pixel-point set corresponding to the foreground.
  5. The method according to claim 1, wherein the step of filling the foreground hole regions with background pixel-point sets comprises:
    performing background modeling on the intermediate background video and the intermediate background depth map, and filling the foreground hole regions through the mutual complementation of background pixel-point sets between images in the intermediate background video and the intermediate background depth map.
  6. The method according to claim 5, wherein the camera-viewpoint video is a video from a dynamic camera viewpoint;
    before the step of performing background modeling on the intermediate background video and the intermediate background depth map, the method further comprises:
    acquiring mapping relationships between video segments at different viewpoints in the camera-viewpoint video;
    the step of performing background modeling on the intermediate background video and the intermediate background depth map further comprises:
    when two mutually complementary images are images at different viewpoints, mapping the model parameters corresponding to the two images to the same viewpoint according to the mapping relationship, the background pixel-point sets in the two mapped images complementing each other to fill the foreground holes.
  7. The method according to claim 5, wherein after the step of performing background modeling on the intermediate background video and the intermediate background depth map, the method further comprises:
    predicting depth values at the holes from the depth values around the holes in the intermediate background depth map, and repairing the holes in the filled intermediate background depth map according to the predicted depth values;
    repairing the filled intermediate background video using an image inpainting algorithm with an added depth-value constraint.
  8. The method according to claim 1, wherein the step of filling the holes of the virtual viewpoint video synthesized from the camera-viewpoint video using the background video and the background depth map comprises:
    3D-warping the background video with a first warping parameter under the rendering of the background depth map to generate a virtual background video;
    3D-warping the camera-viewpoint video with the first warping parameter under the rendering of the camera-viewpoint depth map to generate the virtual viewpoint video;
    filling the holes in the virtual viewpoint video using the virtual background video.
  9. The method according to claim 8, wherein before the step of filling the holes in the virtual viewpoint video using the virtual background video, the method further comprises:
    inverse-mapping the virtual viewpoint video to the camera viewpoint to obtain an extension boundary;
    extending the background video and the background depth map according to the extension boundary;
    the step of filling the holes in the virtual viewpoint video using the virtual background video being: filling the holes in the virtual viewpoint video using the extended background video and the extended background depth map.
  10. A method for hole filling of a virtual viewpoint image, comprising:
    acquiring a camera-viewpoint image and a camera-viewpoint depth map corresponding to the camera-viewpoint image;
    removing pixel-point sets corresponding to a foreground in the camera-viewpoint image and the camera-viewpoint depth map, forming an intermediate background image and an intermediate background depth map with foreground holes;
    filling the foreground holes with background pixel-point sets, generating a background image and a background depth map corresponding to the background image;
    filling holes of a virtual viewpoint image synthesized from the camera-viewpoint image using the background image and the background depth map.
  11. An apparatus for hole filling of a virtual viewpoint video, comprising:
    a shooting module, configured to acquire a camera-viewpoint video and a camera-viewpoint depth map corresponding to the camera-viewpoint video;
    a foreground removal module, configured to remove pixel-point sets corresponding to a foreground in the camera-viewpoint video and the camera-viewpoint depth map, forming an intermediate background video and an intermediate background depth map with foreground holes;
    a background filling module, configured to fill the foreground holes with background pixel-point sets, generating a background video and a background depth map corresponding to the background video;
    a hole filling module, configured to fill holes of a virtual viewpoint video synthesized from the camera-viewpoint video using the background video and the background depth map.
  12. The apparatus according to claim 11, wherein the foreground removal module comprises:
    a depth-map foreground extraction module, configured to extract the pixel-point set corresponding to the foreground in the camera-viewpoint depth map;
    a video foreground extraction module, configured to determine the pixel-point set corresponding to the foreground in the camera-viewpoint video;
    a removal module, configured to remove the extracted pixel-point set in the camera-viewpoint depth map and the pixel-point set in the camera-viewpoint video.
  13. The apparatus according to claim 12, wherein the depth-map foreground extraction module comprises:
    a foreground boundary extraction module, configured to extract a foreground boundary in the camera-viewpoint depth map;
    an iteration module, configured to iteratively generate secondary boundaries in sequence from the foreground boundary, the difference between the depth values of corresponding points of a secondary boundary and its parent boundary being less than a preset range;
    wherein the pixel-point sets contained in the foreground boundary and the secondary boundaries constitute the pixel-point set corresponding to the foreground.
  14. The apparatus according to claim 12, wherein the depth-map foreground extraction module comprises:
    a foreground-and-background boundary extraction module, configured to extract a foreground boundary and a background boundary in the camera-viewpoint depth map;
    a probability calculation module, configured to compute a foreground/background probability distribution in the camera-viewpoint depth map using the foreground boundary and the background boundary as seed points, thereby determining the pixel-point set corresponding to the foreground.
  15. The apparatus according to claim 11, wherein the background filling module comprises:
    a background modeling module, configured to perform background modeling on the intermediate background video and the intermediate background depth map, filling the foreground hole regions through the mutual complementation of background pixel-point sets between images in the intermediate background video and the intermediate background depth map.
  16. The apparatus according to claim 15, wherein the camera-viewpoint video is a video from a dynamic camera viewpoint;
    the background filling module further comprises:
    a motion compensation module, configured to acquire mapping relationships between video segments at different viewpoints in the camera-viewpoint video;
    the background modeling module being further configured to, when two mutually complementary images are images at different viewpoints, map the model parameters corresponding to the two images to the same viewpoint according to the mapping relationship, the background pixel-point sets in the two mapped images complementing each other to fill the foreground hole regions.
  17. The apparatus according to claim 15, wherein the background filling module further comprises:
    a background depth map repair module, configured to predict depth values at the holes from the depth values around the holes in the intermediate background depth map and repair the holes in the filled intermediate background depth map according to the predicted depth values;
    a background video repair module, configured to repair the filled intermediate background video using an image inpainting algorithm with an added depth-value constraint.
  18. The apparatus according to claim 11, wherein the hole filling module comprises:
    a background video warping module, configured to 3D-warp the background video with a first warping parameter under the rendering of the background depth map, generating a virtual background video;
    a camera-viewpoint video warping module, configured to 3D-warp the camera-viewpoint video with the first warping parameter under the rendering of the camera-viewpoint depth map, generating the virtual viewpoint video;
    a filling module, configured to fill the holes in the virtual viewpoint video using the virtual background video.
  19. The apparatus according to claim 18, wherein the hole filling module further comprises:
    a background extension module, configured to inverse-map the virtual viewpoint video to the camera viewpoint to obtain an extension region, and to extend the background video and the background depth map according to the extension region;
    the filling module being configured to fill the holes in the virtual viewpoint video using the extended background video and the extended background depth map.
  20. An apparatus for hole filling of a virtual viewpoint image, comprising:
    a shooting module, configured to acquire a camera-viewpoint image and a camera-viewpoint depth map corresponding to the camera-viewpoint image;
    a foreground removal module, configured to remove pixel-point sets corresponding to a foreground in the camera-viewpoint image and the camera-viewpoint depth map, forming an intermediate background image and an intermediate background depth map with foreground holes;
    a background filling module, configured to fill the foreground holes with background pixel-point sets, generating a background image and a background depth map corresponding to the background image;
    a hole filling module, configured to fill holes of a virtual viewpoint image synthesized from the camera-viewpoint image using the background image and the background depth map.
  21. A terminal, comprising a memory and a processor, the memory storing instructions which, when executed by the processor, cause the processor to perform the following steps:
    acquiring a camera-viewpoint video and a camera-viewpoint depth map corresponding to the camera-viewpoint video;
    removing pixel-point sets corresponding to a foreground in the camera-viewpoint video and the camera-viewpoint depth map, forming an intermediate background video and an intermediate background depth map with foreground holes;
    filling the foreground holes with background pixel-point sets, generating a background video and a background depth map corresponding to the background video;
    filling holes of a virtual viewpoint video synthesized from the camera-viewpoint video using the background video and the background depth map.
PCT/CN2016/083746 2016-05-27 2016-05-27 Hole filling method, apparatus and terminal for virtual viewpoint videos and images WO2017201751A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/083746 WO2017201751A1 (zh) 2016-05-27 2016-05-27 Hole filling method, apparatus and terminal for virtual viewpoint videos and images

Publications (1)

Publication Number Publication Date
WO2017201751A1 true WO2017201751A1 (zh) 2017-11-30

Family

ID=60410950

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/083746 WO2017201751A1 (zh) 2016-05-27 2016-05-27 Hole filling method, apparatus and terminal for virtual viewpoint videos and images

Country Status (1)

Country Link
WO (1) WO2017201751A1 (zh)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101742349A (zh) * 2010-01-05 2010-06-16 浙江大学 (Zhejiang University) Method for expressing a three-dimensional scene and television system thereof
CN102592275A (zh) * 2011-12-16 2012-07-18 天津大学 (Tianjin University) Virtual viewpoint rendering method
CN104813658A (zh) * 2012-12-21 2015-07-29 映客实验室有限责任公司 (Imcube Labs GmbH) Method, apparatus and computer program usable in synthesizing a stereoscopic image
CN103905813A (zh) * 2014-04-15 2014-07-02 福州大学 (Fuzhou University) DIBR hole filling method based on background extraction and partitioned inpainting
CN104778673A (zh) * 2015-04-23 2015-07-15 上海师范大学 (Shanghai Normal University) Improved Gaussian mixture model depth image enhancement algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LINGNI, M. ET AL.: "Depth-guided Inpainting Algorithm for Free-viewpoint Video", 2012 19th IEEE International Conference on Image Processing (ICIP), 21 February 2013 (2013-02-21), pages 1721-1724, XP032333520 *

Legal Events

Code Title Description
NENP Non-entry into the national phase (Ref country code: DE)
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 16902740; Country of ref document: EP; Kind code of ref document: A1)
122 Ep: PCT application non-entry in European phase (Ref document number: 16902740; Country of ref document: EP; Kind code of ref document: A1)