KR20140001358A - Method and apparatus of processing image based on occlusion area filtering

Info

Publication number: KR20140001358A
Application number: KR1020120068649A
Authority: KR
Inventors: 엄기문; 이왕로; 고민수; 이현; 정원식; 허남호; 유지상
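Original assignee: Electronics and Telecommunications Research Institute (한국전자통신연구원); Kwangwoon University Industry-Academic Collaboration Foundation (광운대학교 산학협력단)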
Priority date: 2012-06-26
Filing date: 2012-06-26
Publication date: 2014-01-07

Abstract

The present invention relates to a method for processing a stereo image and, more particularly, to an image processing method that performs occlusion area filtering on a stereo image. The method may include: calculating an optical flow based disparity from a stereo image; obtaining a feature point based disparity saliency from the stereo image; generating a disparity saliency map based on the calculated optical flow based disparity and the obtained feature point based disparity saliency; and filtering an occlusion area in the disparity saliency map. The filtering of the occlusion area computes the disparity once using one image of the stereo image as the reference image and once more using the other image as the reference image. When the difference between the two disparities obtained for the same specific pixel is greater than or equal to a predetermined limit, the specific pixel is detected as an occlusion area pixel, and the disparity value for the specific pixel in the disparity saliency map is corrected to the minimum of the disparity values of the pixels surrounding the specific pixel.

Description

Image processing method based on occlusion area filtering

The present invention relates to a method for processing an image and, more particularly, to a method for processing a stereo image based on occlusion area filtering.

With the advent of the digital age, multimedia technologies are developing rapidly, and the digital content market built on them grows remarkably each year. Along with this trend, interest in realistic media is increasing in the digital imaging field. Recently, 3D imaging technologies such as holography, stereoscopic systems, and multi-view images have been actively researched as a form of realistic media. Among these, a multi-view image can be acquired or generated in several ways. First, a multi-view image can be acquired directly using as many cameras as there are viewpoints. Second, a color image and a depth image can be obtained using one color camera and one depth camera, and a multi-view image can be generated from them. Third, disparity information can be obtained through stereo matching from stereo images captured with two color cameras, and a multi-view image can be generated based on the resulting depth information. Here, stereo matching takes one of the images acquired from two or more cameras as the reference image and the other images as search images, and is defined as the process of finding, for a point in three-dimensional space, the difference between its pixel position in the reference image and its pixel position in the search images. The image coordinate difference between such corresponding points is called disparity.

Meanwhile, since most 3D displays currently on the market are binocular, most of the content currently provided consists of stereo images composed of two images. In the future, autostereoscopic 3D displays will need to support stereo images that have already been produced. Depth image based rendering (DIBR), which generates multi-view images from stereo images, is therefore highly useful, and its importance is expected to grow. The DIBR technique renders an image at an arbitrary viewpoint using a texture image and a depth image composed of distance information corresponding to each pixel of the texture image.

However, generating a multi-view image from a stereo image requires extracting a disparity map or depth map, in which the disparity or depth calculated for each pixel of the reference image is represented in the form of an image. Extracting the disparity map or depth map has the problem that the accuracy and reliability obtained are low relative to the time and effort required. For example, when the disparity map or depth map is extracted, there may be an occlusion area in which pixels visible in one image of the stereo image are not visible in the other image. In addition, inaccurate disparity information may cause boundary noise and holes in the generated virtual viewpoint image, and a high-performance hole filling process is required to compensate for this. Accordingly, there is a need for a method of generating a more precise, high quality multi-view image from a stereo image.

An object of the present invention is to provide an image processing method for performing occlusion area filtering in a stereo image.

Another object of the present invention is to provide an image processing method for generating a high quality multi-view image through occlusion area filtering.

According to an aspect of the present invention, a stereo image processing method is provided. The method includes calculating an optical flow based disparity from a stereo image, obtaining a feature point based disparity saliency from the stereo image, generating a disparity saliency map based on the calculated optical flow based disparity and the obtained feature point based disparity saliency, and filtering an occlusion area in the disparity saliency map. In filtering the occlusion area in the disparity saliency map, the disparity may be obtained once using one image of the stereo image as the reference image and once more using the other image as the reference image. If the difference between the two disparities obtained for the same specific pixel is greater than or equal to a predetermined limit, the specific pixel is detected as an occlusion area pixel, and the disparity value for the specific pixel in the disparity saliency map may be corrected to the minimum value among the disparity values of the pixels surrounding the specific pixel.

Occlusion area filtering in a stereo image prevents the shape of an object from appearing distorted and unnatural in an image generated through image processing.

In addition, when a disparity saliency map from which the occlusion area has been filtered, an intensity gradient saliency map, and line segments are obtained and a multi-view image is generated based on them, distortion of objects can be prevented, and the artifacts at occlusion area boundaries and the temporal consistency problems that occur with the conventional DIBR technique can be reduced. Furthermore, when grid-mesh based warping is performed in generating the multi-view image, using a vertical line preservation condition, a warping optimization energy function, and an iterative optimization function prevents image distortion and image quality degradation and reduces the ghosting phenomenon in which images overlap.

FIG. 1 is an example of detecting and correcting an occlusion area of a stereo image according to an image processing method of the present invention.
FIG. 2 shows an example of a result of finding a feature point based disparity saliency in a stereo image using the SIFT technique according to the present invention.
FIG. 3 shows an example of a disparity saliency map generated according to the present invention.
FIG. 4 is an example of a multi-view image generation operation according to an image processing method of the present invention.
FIG. 5 is an example of the intensity gradient saliency map according to the present invention.

Hereinafter, some embodiments of the present invention will be described in detail with reference to exemplary drawings. It should be noted that, in adding reference numerals to the constituent elements of the drawings, the same constituent elements are denoted by the same reference numerals wherever possible, even if they are shown in different drawings. In the following description of the embodiments of the present invention, a detailed description of known configurations or functions will be omitted if it is determined that it may obscure the gist of the present specification.

The present invention proposes a method for processing a stereo image by an image processing apparatus. The image processing apparatus may perform occlusion area filtering (detection and correction) on a stereo image, and may generate high quality multi-view images based on features such as the disparity saliency map from which the occlusion area has been filtered, an intensity gradient saliency map, and line segments. In this case, the image processing apparatus may generate a multi-view image without extracting depth information.

First, occlusion area filtering of a stereo image by an image processing apparatus may be performed as follows.

FIG. 1 is an example of occlusion area filtering of a stereo image according to an image processing method of the present invention.

Referring to FIG. 1, the image processing apparatus receives an image (S100). The image is a stereo image.

The image processing apparatus calculates an optical flow based disparity, which indicates the degree of pixel motion between the received stereo images (S110). The optical flow based disparity calculation may be performed on a block basis.
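The source does not name a specific optical flow algorithm. As a rough illustration of block-wise displacement estimation between a rectified stereo pair, the following sketch substitutes plain sum-of-absolute-differences (SAD) block matching along the horizontal axis. The function name `block_disparity`, the block size, and the search range are assumptions made for illustration only.

```python
def block_disparity(left, right, block=2, max_disp=4):
    """Estimate one horizontal displacement per block of `left`.

    `left` and `right` are equally sized 2D lists of grayscale values.
    NOTE: SAD block matching is a stand-in for the optical flow step,
    which the source does not specify.
    """
    h, w = len(left), len(left[0])
    result = []
    for by in range(0, h - block + 1, block):
        row = []
        for bx in range(0, w - block + 1, block):
            best_d, best_cost = 0, None
            # Assumption: a point at column x in the left image appears
            # at column x - d in the right image (d >= 0).
            for d in range(0, min(max_disp, bx) + 1):
                cost = 0
                for y in range(by, by + block):
                    for x in range(bx, bx + block):
                        cost += abs(left[y][x] - right[y][x - d])
                if best_cost is None or cost < best_cost:
                    best_d, best_cost = d, cost
            row.append(best_d)
        result.append(row)
    return result
```

Blocks near the left border can only be tested against small displacements, so their estimates are less reliable; a real implementation would typically handle borders and sub-pixel motion explicitly.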

The image processing apparatus obtains a feature point based disparity saliency from the received stereo image (S120). To obtain the feature point based disparity saliency, a scale-invariant feature transform (SIFT) technique, which extracts features that do not change with the size and rotation of an object, may be used.

FIG. 2 illustrates an example of a result of finding a feature point based disparity saliency in a stereo image using the SIFT technique according to the present invention.

Referring to FIG. 2, the marks overlaid on the stereo images represent the feature point based disparity saliency extracted using the SIFT technique after comparing images (a) and (b).

Referring back to FIG. 1, the image processing apparatus generates a disparity saliency map based on the calculated optical flow based disparity and the obtained feature point based disparity saliency (S130). The image processing apparatus combines the calculated optical flow based disparity with the obtained feature point based disparity saliency, removes outliers, and generates the disparity saliency map. FIG. 3 shows an example of a disparity saliency map generated according to the present invention.
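The source states that the two disparity estimates are combined and outliers are removed, but gives no rule for either step. The sketch below shows one possible reading, under the assumption that a sparse feature match is an outlier when it disagrees with the dense estimate at the same pixel by more than a tolerance. The function name `fuse_disparity` and the parameter `tol` are hypothetical.

```python
def fuse_disparity(dense, features, tol=2):
    """Overwrite dense disparities with agreeing sparse feature disparities.

    `dense` is a 2D list of disparities; `features` is a list of
    (row, col, disparity) tuples from feature matching.
    ASSUMPTION: a feature is an outlier when it differs from the dense
    estimate at its location by more than `tol`; the source gives no rule.
    """
    fused = [row[:] for row in dense]  # copy so the input is not modified
    rejected = []
    for r, c, d in features:
        if abs(d - dense[r][c]) > tol:
            rejected.append((r, c, d))  # treated as an outlier and discarded
        else:
            fused[r][c] = d  # trust the feature match at this pixel
    return fused, rejected
```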

The image processing apparatus filters an occlusion area in the generated disparity saliency map (S140). That is, the image processing apparatus detects an occlusion area in the generated disparity saliency map and corrects it. In detail, the image processing apparatus obtains the disparity once using one image of the stereo image as the reference image, and once more using the other image as the reference image. When the difference between the two disparities obtained for the same specific pixel is greater than or equal to a predetermined limit, the image processing apparatus detects the specific pixel as an occlusion area pixel. The predetermined limit may be, for example, a threshold value TH1 (a difference of about 1 to 3 pixels), and may be set in advance by the user. Thereafter, the image processing apparatus corrects the disparity value for the specific pixel in the disparity saliency map to the minimum value among the disparity values of the peripheral pixels of the specific pixel. Here, a peripheral pixel of the specific pixel is a pixel that lies within a predetermined block around the specific pixel and is not itself an occlusion area pixel. For example, the peripheral pixels may be the non-occluded pixels within a 3 × 3 block around the specific pixel. As another example, the peripheral pixels may be the non-occluded pixels within a 5 × 5 block around the specific pixel. When the disparity information of the occlusion area is incorrect, the shape of an object may appear distorted and unnatural in an image generated through image processing, for example a multi-view image generated later; by correcting the occlusion area in this way, the image processing apparatus can prevent this.
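The detection and correction steps of S140 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes integer disparities, assumes that a left-image pixel at column x with disparity d corresponds to column x - d in the right image, and additionally flags pixels whose correspondence falls outside the image. The function names and the default values of `th1` and `block` are chosen for illustration.

```python
def detect_occlusions(disp_left, disp_right, th1=1):
    """Flag pixels whose left- and right-referenced disparities disagree.

    ASSUMPTION: pixel (y, x) in the left map corresponds to (y, x - d)
    in the right map. Pixels mapping outside the image are also flagged,
    which the source does not state.
    """
    h, w = len(disp_left), len(disp_left[0])
    occluded = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xr = x - disp_left[y][x]
            if not 0 <= xr < w:
                occluded[y][x] = True
            elif abs(disp_left[y][x] - disp_right[y][xr]) >= th1:
                occluded[y][x] = True
    return occluded


def correct_occlusions(disp, occluded, block=3):
    """Replace each occluded disparity with the minimum non-occluded
    disparity inside the surrounding block x block window."""
    h, w = len(disp), len(disp[0])
    r = block // 2
    out = [row[:] for row in disp]
    for y in range(h):
        for x in range(w):
            if not occluded[y][x]:
                continue
            neighbours = [
                disp[ny][nx]
                for ny in range(max(0, y - r), min(h, y + r + 1))
                for nx in range(max(0, x - r), min(w, x + r + 1))
                if not occluded[ny][nx]
            ]
            if neighbours:  # keep the old value if no valid neighbour exists
                out[y][x] = min(neighbours)
    return out
```

Choosing the minimum neighbouring disparity assigns the occluded pixel to the farthest nearby surface, which matches the usual situation where an occluded region belongs to the background rather than to the foreground object that hides it.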

FIG. 4 is an example of a multi-view image generation operation according to an image processing method of the present invention.

Referring to FIG. 4, S400 to S440 are the same as S100 to S140 described with reference to FIG. 1.

The image processing apparatus obtains an intensity gradient saliency map based on the stereo image input at S400 (S450). The image processing apparatus may generate the intensity gradient saliency map by computing an integral image of the stereo image while varying the filtering coefficient. For example, the image processing apparatus may generate the intensity gradient saliency map by adding, for the current pixel of the stereo image, the pixel immediately above and the pixel immediately to the left, and applying a Gaussian filter twice. In this case, to obtain high quality features, the result of applying the difference between the current pixel and its surrounding pixels to the original grayscale image, while varying the filter size using a single integral image, may be used. FIG. 5 is an example of the intensity gradient saliency map according to the present invention.
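The reason a single integral image supports varying filter sizes can be seen in a short sketch. The code below builds a standard summed-area table from the pixel above and the pixel to the left, then reads the sum of any rectangular window with four lookups. It illustrates only the integral image itself; the Gaussian filtering and the exact saliency formula are not given in the source and are not reproduced. The function names are chosen for illustration.

```python
def integral_image(img):
    """Summed-area table with a zero border:
    ii[y + 1][x + 1] is the sum of img[0..y][0..x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            # current pixel + sum above + sum to the left
            # - the top-left region that was counted twice
            ii[y + 1][x + 1] = (img[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of img[top..bottom][left..right] (inclusive bounds),
    computed in constant time regardless of window size."""
    return (ii[bottom + 1][right + 1] - ii[top][right + 1]
            - ii[bottom + 1][left] + ii[top][left])
```

Because `box_sum` costs the same for a 3 × 3 window as for a 31 × 31 window, the filter size can be changed freely without recomputing the table.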

The image processing apparatus obtains a line segment based on the stereo image (S460).

Here, S450 and S460 may be performed before or after S410 to S440, or may be performed simultaneously with at least one of S410 to S440.

The image processing apparatus generates a multi-view image based on the disparity saliency map from which the occlusion area has been filtered, the intensity gradient saliency map, and the line segment (S470). In this case, the image processing apparatus may generate the multi-view image through grid-mesh based warping. In more detail, the image processing apparatus may perform mesh initialization, perform mesh optimization based on the disparity saliency map from which the occlusion area has been filtered, the intensity gradient saliency map, and the line segment, and then perform view synthesis to generate the multi-view image.

When a disparity saliency map from which the occlusion area has been filtered, an intensity gradient saliency map, and a line segment are obtained from the input stereo image and a multi-view image is generated based on them, distortion of objects can be prevented, and both the artifacts at occlusion area boundaries (defects in the image caused by inadequate image sampling) and the temporal consistency problem can be reduced. In addition, when grid-mesh based warping is performed in generating the multi-view image, the multi-view image is generated using a vertical line preservation condition, a warping optimization energy function, and an iterative optimization function. This prevents image distortion and image quality degradation, and reduces ghosting, in which images overlap.

The foregoing description is merely illustrative of the technical idea of the present invention, and various changes and modifications may be made by those skilled in the art without departing from the essential characteristics of the present invention. Therefore, the embodiments disclosed in the present invention are intended to illustrate rather than limit the scope of the present invention, and the scope of the technical idea of the present invention is not limited by these embodiments. The scope of protection of the present invention should be interpreted by the following claims, and all technical ideas within the scope equivalent thereto should be construed as being included in the scope of the present invention.

Claims

1. A stereo image processing method comprising:
calculating an optical flow based disparity from a stereo image;
obtaining a feature point based disparity saliency from the stereo image;
generating a disparity saliency map based on the calculated optical flow based disparity and the obtained feature point based disparity saliency; and
filtering an occlusion area in the disparity saliency map,
wherein the filtering of the occlusion area in the disparity saliency map comprises obtaining the disparity once using one image of the stereo image as a reference image and once more using the other image as a reference image, detecting a specific pixel as an occlusion area pixel when the difference between the two disparities obtained for that same specific pixel is greater than or equal to a predetermined limit, and correcting the disparity value for the specific pixel in the disparity saliency map to the minimum of the disparity values of the peripheral pixels of the specific pixel.

2. The method of claim 1,
wherein the feature point based disparity saliency is obtained based on a scale-invariant feature transform (SIFT) technique.

3. The method of claim 1,
wherein the peripheral pixel is a pixel that is within a 3x3 block around the specific pixel and is not an occlusion area pixel.

4. The method of claim 1,
wherein the peripheral pixel is a pixel that is within a 5x5 block around the specific pixel and is not an occlusion area pixel.

5. The method of claim 1, further comprising:
obtaining an intensity gradient saliency map from the stereo image;
obtaining a line segment from the stereo image; and
generating a multi-view image based on the disparity saliency map from which the occlusion area has been filtered, the obtained intensity gradient saliency map, and the line segment.

6. The method of claim 5,
wherein the multi-view image is generated using a grid-mesh based warping technique.