JP3769850B2 - Intermediate viewpoint image generation method, parallax estimation method, and image transmission method - Google Patents



Publication number
JP3769850B2
JP3769850B2 (application JP34697296A)
Authority
JP
Japan
Prior art keywords
parallax
image
pixel
pixel value
intermediate viewpoint
Prior art date
Legal status
Expired - Fee Related
Application number
JP34697296A
Other languages
Japanese (ja)
Other versions
JPH10191396A (en)
Inventor
健夫 吾妻
森村  淳
謙也 魚森
Original Assignee
松下電器産業株式会社 (Matsushita Electric Industrial Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 松下電器産業株式会社
Priority to JP34697296A
Priority claimed from US08/825,723 (US6163337A)
Publication of JPH10191396A
Application granted
Publication of JP3769850B2
Application status: Expired - Fee Related

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20228Disparity calculation for image-based rendering

Description

[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an intermediate viewpoint image generation method and a parallax estimation method for stereoscopic video display.
[0002]
[Prior art]
Conventionally, various stereoscopic video systems have been proposed; among them, multi-view stereoscopic video systems are promising because they allow several people to observe a stereoscopic moving image without wearing special glasses. In a multi-view stereoscopic video system, the more cameras and display devices are used, the more natural the motion parallax perceived by the observer, and the easier it becomes for many people to observe at once. In practice, however, the number of cameras is limited by constraints such as the scale of the imaging system and the alignment of the cameras' optical axes. In the transmission and storage processes, it is also desirable to reduce the amount of information, which grows in proportion to the number of cameras.
[0003]
Therefore, if intermediate viewpoint images could be generated from a binocular stereo image so that a multi-view stereoscopic image can be displayed on the display side, the burden on the imaging system and the amount of information during transmission and storage could both be reduced. To generate the intermediate viewpoint image that would be seen from an arbitrary viewpoint between two given viewpoints, it is necessary to estimate depth by establishing pixel correspondences between the images.
[0004]
[Problems to be solved by the invention]
A fundamental problem in establishing correspondence between images is that occlusion occurs along object contour lines, where depth changes discontinuously, making accurate correspondence difficult to obtain. Yet the estimates in the vicinity of an object contour determine the contour position of that object in the generated intermediate viewpoint image, and are therefore crucial when synthesizing it. If an estimation error occurs near an object contour during parallax estimation, pixels of the foreground region stick to the background, pixels of the background region stick to the foreground, the object contour is distorted, or a false contour appears in the background region near the object contour.
[0005]
In view of this, an object of the present invention is to provide a parallax estimation method that accurately estimates the sudden (discontinuous) change of parallax in the vicinity of an object contour, and a method for generating an intermediate viewpoint image using the estimated parallax.
[0006]
It is another object of the present invention to provide an efficient multi-viewpoint image transmission method by generating the intermediate viewpoint image on both the transmission side and the reception side.
[0007]
[Means for Solving the Problems]
The present invention extracts, from the initial parallax calculated with the left and right images as reference, regions where the left and right pixels correspond correctly and regions where they do not. For the corresponding regions, clustering is performed using the pixel values and the initial parallax values. For each pixel in a non-corresponding region, the distance in pixel-value space between that pixel's value and the centroid of each cluster obtained by the clustering is calculated, and the parallax of the pixel is determined from the parallax of correctly corresponding pixels belonging to the nearest cluster, so that the boundary between the foreground object's parallax and the background's parallax coincides with the foreground object's contour. The invention provides a parallax estimation method with this characteristic, and an intermediate viewpoint image generation method using the estimated parallax.
[0008]
Also, from the initial parallax calculated with the left and right images as reference, corresponding and non-corresponding regions are extracted, and clustering is performed on the corresponding regions using the pixel values and the initial parallax values. For each pixel in a non-corresponding region, the distance in pixel-value space between that pixel's value and each cluster centroid is calculated to find the nearest cluster, and the parallax of the pixels in the non-corresponding region is determined taking into account the spatial connectivity of the resulting cluster assignments, so that the boundary between the foreground object's parallax and the background's parallax coincides with the foreground object's contour. The invention also provides a parallax estimation method with this characteristic, and an intermediate viewpoint image generation method using the estimated parallax.
[0009]
DETAILED DESCRIPTION OF THE INVENTION
(First embodiment)
FIG. 1 is a block diagram of the disparity estimation method and the intermediate viewpoint image generation method according to the first embodiment of the present invention. In FIG. 1, 1L and 1R are parallax estimation units, 2 is an intermediate viewpoint image generation unit, 3L and 3R are initial parallax estimation means, 4L and 4R are reliability evaluation means, 5L and 5R are clustering means, 6L and 6R are parallax interpolation means, 7L and 7R are pixel shift means, and 8 is an intermediate viewpoint image integration means.
[0010]
The operation of the above configuration will be described below. The initial parallax estimation unit 3 calculates the sum of squared differences (SSD) shown in (Equation 1). The SSD value of (Equation 1) is small when the distributions of pixel values in the window region set in the base image and in the window region set in the reference image are similar, and large when the pixel-value distributions in the two windows differ. The initial parallax estimation unit 3 takes the inter-image shift amount d that minimizes the SSD value within a predetermined search range as the parallax at the point of interest (x, y).
[0011]
[Expression 1]
SSD(x, y, d) = Σ_{(s,t)∈W} { I_ref(x + s + d, y + t) − I_base(x + s, y + t) }²
[0012]
FIG. 2 illustrates the initial parallax estimation (block matching) performed by the initial parallax estimation unit 3. In FIG. 2, the window region set around the point of interest (x, y) is the summation region W of (Equation 1). The initial parallax of the entire image is obtained by shifting the window region sequentially and repeating the SSD calculation described above. The initial parallax estimation unit 3L calculates the initial parallax using the left image as the base image, and the initial parallax estimation unit 3R calculates it using the right image as the base image.
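The block matching described above can be sketched as follows. This is an illustrative brute-force implementation, not the patent's own code; the function name and the parameters `max_d` (search range) and `half_win` (half window size) are assumptions.

```python
import numpy as np

def estimate_initial_disparity(base, ref, max_d=8, half_win=2):
    """Brute-force block matching: for each pixel of `base`, find the
    horizontal shift d (0..max_d) that minimizes the SSD over a square
    window, as in (Equation 1). Images are 2-D arrays; borders are skipped."""
    h, w = base.shape
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half_win, h - half_win):
        for x in range(half_win, w - half_win):
            best_d, best_ssd = 0, np.inf
            for d in range(max_d + 1):
                if x + d + half_win >= w:
                    break                      # window would leave the image
                wb = base[y-half_win:y+half_win+1, x-half_win:x+half_win+1]
                wr = ref [y-half_win:y+half_win+1, x+d-half_win:x+d+half_win+1]
                ssd = float(np.sum((wb - wr) ** 2))
                if ssd < best_ssd:
                    best_ssd, best_d = ssd, d
            disp[y, x] = best_d
    return disp
```

In practice the same search is run twice, once with the left image as `base` (unit 3L) and once with the right image (unit 3R).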
[0013]
The reliability evaluation unit 4 calculates, for each pixel, the evaluation value shown in (Expression 2) (the difference between the initial parallaxes computed with the left and right images as reference). A pixel whose evaluation value is at or above a certain threshold (a pixel with low reliability) is regarded as a pixel in an occlusion region, and conversely a pixel whose evaluation value is below the threshold (a pixel with high reliability) is regarded as correctly corresponded; the result is output as a binary value (for example, 1 for a pixel judged to be occluded and 0 for a pixel judged to be correctly corresponded).
[0014]
[Expression 2]
E(x, y) = | d_L(x, y) + d_R(x + d_L(x, y), y) |
(d_L and d_R denote the initial parallaxes with the left and right images as reference, respectively.)
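This left/right consistency check can be sketched as follows; a minimal illustration assuming the sign convention of this patent (left-based and right-based parallaxes of a correct correspondence cancel each other). The function name and `thresh` parameter are assumptions.

```python
import numpy as np

def occlusion_mask(disp_l, disp_r, thresh=1):
    """Consistency check in the spirit of (Expression 2): a pixel of the
    left-based map is flagged 1 (occluded / unreliable) when the left-based
    disparity and the right-based disparity at the corresponding position
    do not cancel out within `thresh`."""
    h, w = disp_l.shape
    mask = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        for x in range(w):
            xr = x + int(disp_l[y, x])     # corresponding column in the right image
            if 0 <= xr < w:
                err = abs(disp_l[y, x] + disp_r[y, xr])
            else:
                err = thresh + 1           # correspondence leaves the image
            mask[y, x] = 1 if err >= thresh else 0
    return mask
```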
[0015]
The clustering unit 5 clusters the pixels in regions where the correspondence reliability is high using both the pixel values and the initial parallax, and then clusters the pixels with low correspondence reliability using their pixel values and the result of that clustering. Any image clustering method that classifies the pixel-value and initial-parallax data (four-dimensional data when a color image is used) into a plurality of clusters may be used. In this embodiment, an example using the k-means method will be described below.
[0016]
The k-means method is an algorithm that classifies the entire data set into a predetermined number of clusters. It performs clustering according to the following four steps.
[0017]
(1) Determine the number of clusters and the termination condition.
(2) Place the initial cluster centroids.
[0018]
(3) Assign each data point to the nearest cluster.
(4) Check the termination condition (if it is not satisfied, return to (3)).
[0019]
In this embodiment, the number of clusters is given as a fixed value (for example, 10). The termination condition is that the assignment of data to the clusters has converged, or that the number of cluster-update iterations has reached a certain count (for example, 10).
[0020]
For the placement of the initial clusters, the i-th cluster centroid is obtained by (Equation 3) from the mean vector and standard deviation vector of all the data.
[0021]
[Equation 3]
[0022]
In regions where the initial parallax estimate is highly reliable, clustering is performed on the four-dimensional data consisting of the pixel values (R, G, B) and the initial parallax. Each pixel in a region where the reliability of the initial parallax estimate is low is then assigned to the cluster, among those obtained in the high-reliability regions, whose centroid is closest in the three-dimensional space of the pixel values (R, G, B).
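The two-stage clustering can be sketched as follows. This is a simplified illustration: the initial centroids are spread over the reliable data rather than placed by (Equation 3), and the function name and parameters are assumptions.

```python
import numpy as np

def cluster_pixels(rgb, disp, reliable, k=2, iters=10):
    """k-means on the 4-D vectors (R, G, B, disparity) of the reliable
    pixels; each unreliable pixel then joins the cluster whose centroid
    is nearest in (R, G, B) space only.
    `rgb` is (N, 3), `disp` is (N,), `reliable` is a boolean mask (N,)."""
    data = np.hstack([np.asarray(rgb, float), np.asarray(disp, float)[:, None]])
    rel = data[reliable]
    idx = np.linspace(0, len(rel) - 1, k).astype(int)   # simple deterministic init
    centroids = rel[idx].copy()
    for _ in range(iters):                              # standard k-means loop
        d2 = ((rel[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(1)
        for i in range(k):
            if np.any(assign == i):
                centroids[i] = rel[assign == i].mean(0)
    labels = np.empty(len(data), dtype=int)
    labels[reliable] = assign
    # unreliable pixels: nearest centroid in colour space only
    colour_d2 = ((data[~reliable][:, None, :3] - centroids[None, :, :3]) ** 2).sum(-1)
    labels[~reliable] = colour_d2.argmin(1)
    return labels, centroids
```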
[0023]
FIG. 3 illustrates how the clustering unit 5 performs clustering. In FIG. 3, 203 (the hatched region in the center) is a region with low initial-parallax reliability, 201 is the region of cluster 1, 202 is the region of cluster 2, and 204 is the boundary between cluster 1 and cluster 2. In FIG. 3 the number of clusters is 2 for simplicity. Clustering is performed in the regions other than 203 based on pixel values and initial parallax; the pixels in region 203, where the initial-parallax reliability is low, are then classified into whichever of cluster 1 and cluster 2 is closer in (R, G, B) space.
[0024]
The parallax interpolation unit 6 determines, by interpolation, the parallax of each pixel judged to be occluded by the reliability evaluation unit 4, referring to the parallax of surrounding pixels judged to be correctly corresponded. The interpolation of parallax values in a region with low initial-parallax reliability is performed, for example, by the following equation.
[0025]
[Expression 4]
d(x, y) = ( Σ_{(s,t)∈C} d(s, t) / r² ) / ( Σ_{(s,t)∈C} 1 / r² ),  r² = (s − x)² + (t − y)²
(C is the set of surrounding pixels that belong to the same cluster as the pixel of interest and have reliable parallax.)
[0026]
By computing the parallax of the pixel of interest from the parallax of surrounding pixels of the same cluster using (Equation 4), parallax estimation in which the parallax changes discontinuously at the cluster boundary 204 of FIG. 3 becomes possible; that is, the disparity discontinuity at the boundary can be estimated.
[0027]
The parallax interpolation formula need not be limited to (Equation 4). Any interpolation method will do in which the parallax values of surrounding pixels are referenced according to the cluster to which the pixel of interest belongs, and the weight coefficients applied to the referenced parallaxes sum to 1. (In the interpolation of (Equation 4), the weight of each referenced parallax value is inversely proportional to the square of the distance between the pixel of interest and the reference pixel.)
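The inverse-square-distance interpolation restricted to the pixel's own cluster can be sketched as follows; a plain O(N²) illustration for small images, with assumed function and parameter names.

```python
import numpy as np

def interpolate_disparity(disp, labels, fill_mask):
    """Interpolation in the spirit of (Equation 4): the disparity of each
    pixel flagged in `fill_mask` becomes a weighted mean of the reliable
    disparities of the *same cluster*, with weights 1/r^2 normalized to
    sum to one."""
    h, w = disp.shape
    out = disp.astype(float).copy()
    ys, xs = np.nonzero(~fill_mask)                  # reliable pixels
    for y, x in zip(*np.nonzero(fill_mask)):
        same = labels[ys, xs] == labels[y, x]        # restrict to own cluster
        if not np.any(same):
            continue                                 # no same-cluster support
        r2 = (ys[same] - y) ** 2 + (xs[same] - x) ** 2
        wgt = 1.0 / r2
        out[y, x] = np.sum(wgt * disp[ys[same], xs[same]]) / np.sum(wgt)
    return out
```

Because only same-cluster pixels contribute, the interpolated field stays discontinuous across the cluster boundary, as the text describes.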
[0028]
By the parallax interpolation of the parallax interpolation unit 6 described above, the occlusion holes in the initial parallax can be filled in from the surrounding parallax.
[0029]
The pixel shift unit 7 multiplies the interpolated parallax by a constant determined by the desired viewpoint direction and shifts the pixels of the input image accordingly. That is, the pixel shift unit 7L generates an intermediate viewpoint image from the left image and the parallax calculated with the left image as reference, and the pixel shift unit 7R generates an intermediate viewpoint image from the right image and the parallax calculated with the right image as reference. By setting the sum of the constant applied to the left-based parallax and the constant applied to the right-based parallax appropriately, so that both correspond to the same viewpoint, intermediate viewpoint images in the same viewpoint direction are generated from both.
[0030]
The intermediate viewpoint image integration unit 8 integrates the intermediate viewpoint image generated from the left image with the one generated from the right image. In each of these images, regions in which no pixel is generated (holes) appear, as shown in FIG. 4, because of the discontinuous change in parallax. In FIG. 4, 51 denotes the region where no image is generated in the left-based intermediate viewpoint image, and 52 the corresponding region in the right-based one. Since the two occur on opposite sides of the foreground object, integrating the left-based and right-based intermediate viewpoint images yields an intermediate viewpoint image without gaps.
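The shift-and-integrate step can be sketched as follows. This toy version forward-maps pixels horizontally, marks unfilled positions with −1 (pixel values are assumed nonnegative), and fills the holes of the left-based view from the right-based view; function names and the hole sentinel are assumptions.

```python
import numpy as np

def shift_image(img, disp, alpha):
    """Pixel shift (units 7L/7R): move each pixel horizontally by
    alpha * disparity (rounded). Unfilled positions keep the hole value -1."""
    h, w = img.shape
    out = np.full((h, w), -1.0)
    for y in range(h):
        for x in range(w):
            xs = x + int(round(alpha * disp[y, x]))
            if 0 <= xs < w:
                out[y, xs] = img[y, x]
    return out

def integrate_views(mid_left, mid_right):
    """Integration (unit 8): fill each hole of the left-based view
    with the corresponding pixel of the right-based view."""
    return np.where(mid_left >= 0, mid_left, mid_right)
```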
[0031]
As described above, according to this embodiment, by using the clustering result of the image and determining the parallax in regions of low initial-parallax reliability from the parallax of surrounding pixels of the same cluster, the parallax boundary between foreground and background can be made to coincide with the foreground object's contour, and the parallax near occlusions and object contours is determined from either the surrounding background parallax or the surrounding foreground parallax. As a result, the vicinity of object contours in the intermediate viewpoint image can be reproduced with good image quality.
[0032]
(Second Embodiment)
FIG. 5 is a block diagram of the disparity estimation method and the intermediate viewpoint image generation method according to the second embodiment of the present invention. In the second embodiment, disparity estimation that takes into account the spatial continuity of the clustering result in occlusion regions improves the image quality near the contour of the foreground object when the intermediate viewpoint image is generated.
[0033]
In FIG. 5, 9L and 9R are parallax estimation units, 2 is an intermediate viewpoint image generation unit, 3L and 3R are initial parallax estimation means, 4L and 4R are reliability evaluation means, 5L and 5R are clustering means, 10L and 10R are labeling means, 11L and 11R are parallax interpolation means, 7L and 7R are pixel shift means, and 8 is an intermediate viewpoint image integration means. Components that operate in the same way as in the first embodiment are given the same reference numerals as in FIG. 1 and their description is omitted; the operations of the labeling unit 10 and the parallax interpolation unit 11 are described below.
[0034]
The labeling unit 10 relabels the clustering result in regions where the reliability of the initial parallax is low, taking connectivity into consideration.
[0035]
FIG. 6 illustrates how labeling is performed by the labeling unit 10. In FIG. 6, 401 is cluster 1, 402 is cluster 2, 403 is cluster 3, 404 is cluster 4, 405 is label 1, 406 is label 2, 407 is label 3, 408 is label 4, and 409 and 410 are boundaries between regions of high and low initial-parallax reliability. In FIG. 6 the number of clusters and the number of label regions are both 4 for simplicity. The labeling unit 10 performs labeling in regions of low initial-parallax reliability, taking into account the spatial connectivity of pixels belonging to the same cluster.
[0036]
The labeling method is as follows. For each pixel in a region of low initial-parallax reliability, if an adjacent pixel is also in a region of low initial-parallax reliability and belongs to the same cluster as the pixel of interest, the pixel of interest receives the same label as that adjacent pixel; otherwise, the pixel of interest receives a new label.
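The rule above is connected-component labeling restricted to the unreliable region, with cluster identity as the connectivity criterion. A minimal 4-connected sketch (function names are assumptions):

```python
from collections import deque
import numpy as np

def label_regions(cluster, unreliable):
    """Label 4-connected runs of unreliable pixels: two adjacent unreliable
    pixels share a label only if they belong to the same cluster.
    Returns -1 outside the unreliable region."""
    h, w = cluster.shape
    labels = np.full((h, w), -1, dtype=int)
    next_label = 0
    for y in range(h):
        for x in range(w):
            if unreliable[y, x] and labels[y, x] == -1:
                q = deque([(y, x)])                      # BFS flood fill
                labels[y, x] = next_label
                while q:
                    cy, cx = q.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w and unreliable[ny, nx]
                                and labels[ny, nx] == -1
                                and cluster[ny, nx] == cluster[cy, cx]):
                            labels[ny, nx] = next_label
                            q.append((ny, nx))
                next_label += 1
    return labels
```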
[0037]
The parallax interpolation unit 11 performs parallax interpolation using both the clustering result from the clustering unit 5L and the labeling result from the labeling unit 10, and does so label region by label region. That is, when a label region is in contact with a region of the same cluster within the area of high initial-parallax reliability (labels 1, 2, and 3 in FIG. 6), the parallax just outside the boundary line as seen from the label region is referenced, and the parallax inside the label region is determined by (Equation 5).
[0038]
[Equation 5]
[0039]
The parallax d(s, t) on the integration path of (Equation 5) uses the parallax outside the boundary line seen from the label region (the parallax of the high-reliability region) to determine the parallax within the low-reliability region. When a label region is not in contact with any same-cluster region of high initial-parallax reliability (cluster 4 in FIG. 6), the parallax is determined by referring only to the background-side parallax among the parallaxes of the surrounding high-reliability regions. How the background parallax is extracted depends on the geometric arrangement of the cameras used to capture the left and right images (parallel shooting or convergence shooting). The two cases, parallel shooting (the optical axes of the two cameras are parallel) and convergence shooting (the optical axes intersect at one point), are described below.
[0040]
FIG. 15 is an explanatory diagram of the parallax distribution in the case of parallel shooting. When the left and right cameras are arranged with parallel optical axes as in FIG. 15(a), the parallax is as shown in FIG. 15(b) (the left side of FIG. 15(b) is the parallax with the left image as reference, the right side with the right image as reference). As FIG. 15(b) shows, the parallax takes only negative values when computed with the left image as reference and only positive values with the right image as reference, and the parallax approaches 0 as the depth goes to infinity.
[0041]
From the above, in the case of parallel shooting, parallaxes whose absolute value is at or below the average absolute value of the surrounding parallaxes are extracted as background parallax.
[0042]
FIG. 16 is an explanatory diagram of the parallax distribution in the case of convergence shooting. When the optical axes of the left and right cameras intersect at a certain depth as in FIG. 16(a), the parallax is as shown in FIG. 16(b) (the left side of FIG. 16(b) is the parallax with the left image as reference, the right side with the right image as reference). As FIG. 16(b) shows, the parallax takes both positive and negative values whichever image is used as reference, and the parallax is 0 at the convergence point (the point where the optical axes of the two cameras intersect).
[0043]
From the above, in the case of convergence shooting, the background parallax is extracted as the parallax greater than the average of the surrounding parallaxes for the left-image-based parallax, and conversely as the parallax less than the average for the right-image-based parallax.
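The two background-extraction rules can be sketched in one small helper; an illustration of the selection criteria only (the function name and the `mode`/`base` parameters are assumptions).

```python
import numpy as np

def background_disparities(surround, mode, base="left"):
    """Pick the background subset of the surrounding disparities.
    Parallel shooting: background depth is large, so |d| is small --
    keep values with |d| <= mean(|d|). Convergence shooting, with the
    sign convention of the text: for a left-based map keep d > mean(d),
    for a right-based map keep d < mean(d)."""
    surround = np.asarray(surround, dtype=float)
    if mode == "parallel":
        m = np.abs(surround).mean()
        return surround[np.abs(surround) <= m]
    if mode == "convergence":
        m = surround.mean()
        return surround[surround > m] if base == "left" else surround[surround < m]
    raise ValueError("mode must be 'parallel' or 'convergence'")
```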
[0044]
As described above, according to this embodiment, by further labeling the clustering result and taking spatial connectivity into account, the boundary between foreground parallax and background parallax can be determined stably even when the occlusion region contains pixels whose values resemble the foreground object, and the image quality near the foreground contour of the intermediate viewpoint image can be improved.
[0045]
(Third embodiment)
FIG. 7 is a block diagram of the disparity estimation method and the intermediate viewpoint image generation method according to the third embodiment of the present invention. In the third embodiment, parallax estimation that takes into account the mutual consistency of the initial parallax based on the left image and the initial parallax based on the right image improves the image quality near the contour of the foreground object when the intermediate viewpoint image is generated.
[0046]
In FIG. 7, 12L and 12R are parallax estimation units, 2 is an intermediate viewpoint image generation unit, 13L and 13R are initial parallax estimation means, 14L and 14R are correspondence determination means, 15L and 15R are parallax interpolation means, 7L and 7R are pixel shift means, and 8 is an image integration means. Components that operate in the same way as in the first and second embodiments (FIGS. 1 and 5) are given the same reference numerals and their description is omitted. The operations of the initial parallax estimation unit 13, the correspondence determination unit 14, and the parallax interpolation unit 15 are described below.
[0047]
The initial parallax estimation unit 13 performs parallax estimation by the same operation as the parallax estimation unit 1 in the first embodiment of the present invention or the parallax estimation unit 9 in the second embodiment of the present invention.
[0048]
The correspondence determination unit 14 classifies the correspondence state of the initial parallax, based on the initial parallax with the left image as base and the initial parallax with the right image as base, into three types: (a) correct correspondence, (b) occlusion, and (c) mis-correspondence. FIG. 8 illustrates the three cases. FIG. 8(a) shows the state in which a point A in the base image and a point B in the reference image correspond to each other; FIG. 8(b) shows the state in which the point B in the reference image corresponding to the point A in the base image itself corresponds to a point C other than A; and FIG. 8(c) shows the state in which the point B in the reference image corresponding to the point A in the base image does not correspond back to a point in the base image consistently.
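The classification can be sketched on 1-D disparity rows as follows. This is a toy sketch of the idea only, not the patent's exact decision rule: state names, the mapping of "mismatch" to correspondences that leave the image, and the `tol` parameter are all assumptions.

```python
def correspondence_state(x, disp_l, disp_r, tol=0):
    """Classify column x of the base (left) image: follow the left-based
    disparity to the right image, then the right-based disparity back.
    'correct'   -- the round trip returns to x (FIG. 8(a));
    'occlusion' -- it returns to a different point (FIG. 8(b));
    'mismatch'  -- the correspondence leaves the image (FIG. 8(c), toy rule)."""
    xr = x + int(disp_l[x])                 # corresponding column in the right image
    if not (0 <= xr < len(disp_r)):
        return "mismatch"
    xb = xr + int(disp_r[xr])               # map back into the base image
    if not (0 <= xb < len(disp_l)):
        return "mismatch"
    return "correct" if abs(xb - x) <= tol else "occlusion"
```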
[0049]
The parallax interpolation unit 15 interpolates the parallax of regions in state (c) among the three correspondence states, using either the surrounding foreground parallax or the surrounding background parallax. To separate foreground from background parallax, the average parallax of the surrounding regions in contact with the state-(c) region is used as a threshold: parallaxes whose absolute value is larger than the average are taken as foreground parallax, and smaller ones as background parallax. The parallax interpolation is computed by (Equation 6).
[0050]
[Formula 6]
[0051]
By limiting the integration path C of (Equation 6) to only those surrounding pixels in contact with the state-(c) region whose parallax has an absolute value larger (or smaller) than the average, parallax interpolation is carried out separately with the foreground parallax and with the background parallax. The final parallax value is then chosen from the two interpolation results as the one whose corresponding pixel shows the smaller pixel-value difference.
[0052]
As described above, according to this embodiment, the parallax near object contours and occlusion boundaries, where the parallax estimate tends to be unstable, is estimated with reference to either the surrounding foreground parallax or the surrounding background parallax. The parallax therefore does not take an intermediate value between foreground and background parallax near their boundary, and the image quality near object contours in the intermediate viewpoint image can be improved.
[0053]
(Fourth embodiment)
FIG. 9 is a configuration diagram of the transmission side of a multi-viewpoint image compression transmission system according to the fourth embodiment of the present invention. In FIG. 9, 101a to 101d are cameras that capture images at the respective viewpoint positions, 102 is an image compression encoding unit that compresses and encodes the images of cameras 1 and 4, 103a is a decoded image decompression unit that decodes and decompresses the image data compressed and encoded by the image compression encoding unit 102, 104a is an intermediate viewpoint image generation unit that predicts the images at the viewpoints of cameras 2 and 3 from the images of cameras 1 and 4 decoded and decompressed by the decoded image decompression unit 103a, and 105 is a residual compression encoding unit that compresses and encodes the residuals between the images of cameras 2 and 3 and the images generated by the intermediate viewpoint image generation unit 104a. The operation of this configuration is described below.
[0054]
The image compression encoding unit 102 compresses and encodes some of the multi-view images (in this embodiment, the images of the two outermost viewpoints of the four) by an existing technique that uses block correlation between the images. FIG. 10 shows an example of the configuration of the image compression encoding unit. In FIG. 10, 107a and 107b are DCT means that compute DCT coefficients over 8 × 8 or 16 × 16 pixel blocks, 108a and 108b are quantization means that quantize the DCT coefficients, 109 is an inverse quantization means, 110 is an inverse DCT means that performs the inverse DCT computation, 111 is a parallax detection means, 112 is a parallax compensation means, and 113 is an encoding means that encodes the quantized DCT coefficients and the parallax. The operation of this configuration is described below.
[0055]
The DCT means 107a processes the image of the camera 1 for each block and calculates a DCT coefficient for each block. The quantization means 108a quantizes the DCT coefficient. The inverse quantization unit 109a performs inverse quantization on the quantized DCT coefficient. The inverse DCT unit 110a inversely transforms the inversely quantized DCT coefficient to restore the image of the camera 1 obtained on the receiving side. The parallax detection unit 111 performs block matching between the restored image of the camera 1 and the image of the camera 4 and calculates a parallax with reference to the image of the camera 1 for each block. The parallax compensation unit 112 predicts the image of the camera 4 using the restored image of the camera 1 and the parallax of each block. (That is, a process corresponding to motion compensation of a moving image is performed.) The DCT unit 107b processes a residual between the image of the camera 4 and the predicted image for each block and calculates a DCT coefficient. The quantization means 108b quantizes the residual DCT coefficient. The encoding unit 113 encodes the quantized DCT coefficient of the image of the camera 1, the disparity for each block, and the quantized DCT coefficient of the disparity compensation residual.
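The DCT/quantization path of units 107-110 can be sketched as a round trip on one 8 × 8 block. A minimal illustration with an orthonormal DCT-II matrix and a single uniform quantizer step `q`; real codecs use per-coefficient quantization tables, and all names here are assumptions.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix C, so that C @ block @ C.T
    is the 2-D DCT of an n x n block."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C *= np.sqrt(2.0 / n)
    C[0] /= np.sqrt(2.0)
    return C

def encode_block(block, q=16):
    """DCT (unit 107) followed by uniform quantization (unit 108)."""
    C = dct_matrix(8)
    return np.round(C @ block @ C.T / q).astype(int)

def decode_block(coeff, q=16):
    """Inverse quantization (unit 109) and inverse DCT (unit 110)."""
    C = dct_matrix(8)
    return C.T @ (coeff * float(q)) @ C
```

The reconstruction differs from the input only by the quantization error, which is what the parallax-compensation residual path then has to carry.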
[0056]
The decoded image decompression unit 103 decodes and decompresses the image data compressed and encoded by the image compression encoding unit 102. FIG. 11 is a diagram illustrating an example of the configuration of the decoded image decompression unit 103. In FIG. 11, 114a is decoding means, 109b and 109c are inverse quantization means, 110b and 110c are inverse DCT means, and 112b is parallax compensation means. The operation of the above configuration will be described below.
[0057]
The decoding unit 114a decodes the compression-encoded data and recovers the quantized DCT coefficients of the camera 1 image, the per-block disparity, and the quantized DCT coefficients of the disparity-compensation residual. The quantized DCT coefficients of the camera 1 image are inversely quantized by the inverse quantization means 109b and decompressed into an image by the inverse DCT means 110b. The parallax compensation means 112b generates a predicted image of camera 4 from the decompressed camera 1 image and the decoded parallax, and the camera 4 image is recovered by adding to this predicted image the residual decompressed by the inverse quantization means 109c and the inverse DCT means 110c.
[0058]
The intermediate viewpoint image generation unit 104a calculates the per-pixel parallax from the images of cameras 1 and 4 by the method shown in any of the first to third embodiments, and predicts and generates the images at the viewpoints of cameras 2 and 3.
[0059]
The residual compression encoding unit 105 compresses and encodes the residual between the images of the cameras 2 and 3 and the predicted image. Since the intermediate viewpoint image generation unit 104a calculates the parallax for each pixel, the parallax can be accurately estimated as compared with the parallax calculation for each block by block matching. As a result, the prediction error (that is, the residual) of the intermediate viewpoint image can be reduced, the compression efficiency can be increased, more effective bit allocation can be performed, and the compression maintaining the image quality can be performed. FIG. 12 shows an example of the configuration of the residual compression encoding unit. In FIG. 12, 107c and 107d are DCT means, 108c and 108d are quantization means, and 113b is an encoding means. The residuals of the images of the cameras 2 and 3 are converted into DCT coefficients by the DCT means 107c and 107d, quantized by the quantization means 108c and 108d, and encoded by the encoding means 113b.
[0060]
FIG. 13 is a block diagram of the receiving side of the multi-viewpoint image compression transmission system according to the fourth embodiment of the present invention. In FIG. 13, reference numeral 103b denotes a decoded image decompression unit that decodes and decompresses the image data of the camera 1 and the camera 4 compression-encoded by the transmission-side image compression encoding unit 102, 104b denotes an intermediate viewpoint image generation unit that predicts and generates the images at the viewpoints of the camera 2 and the camera 3 from the camera 1 and camera 4 images decoded and decompressed by the decoded image decompression unit 103b, and 106 denotes a decoding residual decompression unit that decodes and decompresses the prediction errors (residuals) of the predicted images at the viewpoints of the camera 2 and the camera 3. Since the operations of the decoded image decompression unit 103b and the intermediate viewpoint image generation unit 104b are the same as those of the decoded image decompression unit 103a and the intermediate viewpoint image generation unit 104a on the transmission side, their description is omitted; the operation of the decoding residual decompression unit is described below.
[0061]
The decoding residual decompression unit 106 decodes and decompresses the prediction errors (residuals) of the predicted images at the viewpoints of the camera 2 and the camera 3 that were compression-encoded by the transmission-side residual compression encoding unit 105. FIG. 14 shows an example of the configuration of the decoding residual decompression unit 106. In FIG. 14, 114b is a decoding means, 109d and 109e are inverse quantization means, and 110d and 110e are inverse DCT means. The compression-encoded residual data of the camera 2 and camera 3 images are decoded by the decoding means 114b, inversely quantized by the inverse quantization means 109d and 109e, and decompressed by the inverse DCT means 110d and 110e, respectively. The viewpoint images of the camera 2 and the camera 3 are restored by superimposing the decoded residuals of the camera 2 and camera 3 images on the images generated by the intermediate viewpoint image generation unit 104b.
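The receiver-side restoration amounts to adding the transmitted residual back onto the locally generated intermediate viewpoint image. The sketch below is illustrative (variable names are assumptions, and the residual is kept lossless here, whereas the actual system quantizes it):

```python
import numpy as np

rng = np.random.default_rng(1)
true_view2 = rng.uniform(0, 255, size=(4, 4))                  # actual camera 2 image
predicted_view2 = true_view2 + rng.normal(0, 2, size=(4, 4))   # intermediate viewpoint prediction
residual2 = true_view2 - predicted_view2                       # what the encoder transmits
restored_view2 = predicted_view2 + residual2                   # superposition at the receiver
```

Because both sides generate the same intermediate viewpoint image, transmitting only the (small) residual suffices to restore the viewpoint image.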
[0062]
As described above, according to the present embodiment, an intermediate viewpoint image is generated on the transmission side from two non-adjacent images in a multi-viewpoint image, the residuals between the intermediate viewpoint images and the captured images are compressed, encoded, and transmitted, and on the receiving side the residuals are decoded and decompressed, the images at the two intermediate viewpoints are generated, and the decoded and decompressed residuals are superimposed at those viewpoints. The multi-viewpoint image can therefore be efficiently compressed and transmitted while maintaining the image quality.
[0063]
The generation of the intermediate viewpoint images is not limited to the configuration in which the images at the intermediate viewpoints are generated from the images at the two viewpoints (the viewpoints of the camera 1 and the camera 4) at both ends of the multi-viewpoint image. For example, the image at the viewpoint of the camera 2 may be generated from the images of the camera 1 and the camera 3, the image at the viewpoint of the camera 3 may be generated from the images of the camera 2 and the camera 4, and the images at the viewpoints of the camera 1 and the camera 4 may be generated from these images; such configurations are also included in the present invention.
[0064]
Furthermore, the number of viewpoints of the multi-viewpoint image need not be limited to four; each intermediate viewpoint image may clearly be generated from images at two or more viewpoints, and such configurations are included in the present invention.
[0065]
Note that the reliability evaluation measure of the initial parallax used by the reliability evaluation unit 4 in the first embodiment of the present invention need not be limited to that shown in (Equation 2); the same effect can be obtained with the evaluation values shown in (Equation 7), (Equation 8), or (Equation 9).
[0066]
[Equation 7]
[0067]
[Equation 8]
[0068]
[Equation 9]
[0069]
(Equation 7), (Equation 8), and (Equation 9) indicate that the larger the evaluation value, the lower the reliability of the initial parallax, and the smaller the evaluation value, the higher the reliability. In addition, since the values of (Equation 7), (Equation 8), and (Equation 9) become unstable at pixels having a small luminance gradient (the denominator of each equation), the reliability of the initial parallax is regarded as low wherever the luminance gradient falls below a certain threshold.
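The exact expressions of (Equation 7) to (Equation 9) are reproduced as images in this text, so the sketch below illustrates only the stated decision rule, with all names and thresholds as illustrative assumptions: a larger evaluation value means a less reliable initial disparity, and pixels whose luminance gradient (the denominator of each expression) falls below a threshold are treated as unreliable outright.

```python
import numpy as np

def reliability_mask(evaluation, gradient, eval_thresh, grad_thresh):
    """True where the initial disparity is considered reliable."""
    reliable = evaluation <= eval_thresh      # small evaluation value -> reliable
    reliable &= gradient >= grad_thresh       # unstable where the denominator is small
    return reliable

evaluation = np.array([[0.1, 5.0],
                       [0.2, 0.3]])           # per-pixel evaluation values
gradient = np.array([[3.0, 3.0],
                     [0.01, 3.0]])            # per-pixel luminance gradients
mask = reliability_mask(evaluation, gradient, eval_thresh=1.0, grad_thresh=0.5)
```

The resulting mask is what separates the corresponding (reliable) region from the uncorresponding region in the later clustering step.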
[0070]
Further, the reliability evaluation measure of the initial parallax used by the reliability evaluation unit 4 in the first embodiment of the present invention is not limited to one using the residual sum of squares SSD shown in (Equation 1). The same effect can be obtained by using (Equation 11), (Equation 12), or (Equation 13), which use the residual absolute value sum SAD shown in (Equation 10), and these are included in the present invention.
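The two matching costs referenced here, SSD (Equation 1) and SAD (Equation 10), can be compared with a small sketch. Window size, search range, and data are illustrative assumptions; both costs are evaluated over candidate disparities for one window, and both reach their minimum at the true shift.

```python
import numpy as np

def match_costs(left, right, x, y, win, d_max, cost):
    """Return the SSD or SAD matching cost for disparities 0..d_max at pixel (x, y)."""
    half = win // 2
    patch = left[y - half:y + half + 1, x - half:x + half + 1]
    costs = []
    for d in range(d_max + 1):
        cand = right[y - half:y + half + 1, x - d - half:x - d + half + 1]
        diff = patch - cand
        costs.append(np.sum(diff ** 2) if cost == "ssd" else np.sum(np.abs(diff)))
    return np.array(costs)

left = np.tile(np.arange(16, dtype=float), (9, 1))
right = np.roll(left, -3, axis=1)        # true disparity of 3 pixels
ssd = match_costs(left, right, x=8, y=4, win=3, d_max=5, cost="ssd")
sad = match_costs(left, right, x=8, y=4, win=3, d_max=5, cost="sad")
```

SAD trades a slightly different error weighting for cheaper arithmetic, which is why the text treats the two as interchangeable in the reliability measures.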
[0071]
[Equation 10]
[0072]
[Equation 11]
[0073]
[Equation 12]
[0074]
[Equation 13]
[0075]
In (Equation 11), (Equation 12), and (Equation 13), the values likewise become unstable at pixels having a small luminance gradient (the denominator of each equation), so the reliability of the initial parallax is regarded as low wherever the luminance gradient falls below a certain threshold.
[0076]
Furthermore, when the noise between images is small, expressions that ignore the inter-image noise term σn in (Equation 7), (Equation 8), (Equation 9), (Equation 11), (Equation 12), and (Equation 13) may be used as the reliability evaluation measure, or the reliability of the initial parallax may be evaluated using only the numerators of those equations; each of these is included in the present invention.
[0077]
The same effect can also be obtained when the initial parallax estimation means in the second embodiment of the present invention performs the same operation as the parallax estimation unit 1 in the first embodiment of the present invention, and this is included in the present invention.
[0078]
In the first and second embodiments of the present invention, by applying a median filter to the input image before performing the clustering, pixels having values intermediate between the foreground and the background near the object contour can be replaced with the pixel value of either the foreground or the background, the parallax can be prevented from taking an intermediate value at the boundary between the foreground and the background of the parallax map, and the image quality in the vicinity of the object contour of the intermediate viewpoint image can be improved; this is included in the present invention.
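The pre-clustering median filtering can be illustrated with a small sketch (image contents and the 3x3 window are illustrative assumptions): a minute region of mixed pixel values at the tip of a foreground object is replaced by the surrounding background value, while interior foreground pixels are preserved.

```python
import numpy as np

def median3x3(img):
    """3x3 median filter with edge replication at the borders."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.empty_like(img)
    for y in range(h):
        for x in range(w):
            out[y, x] = np.median(p[y:y + 3, x:x + 3])
    return out

img = np.full((4, 6), 10.0)      # background value
img[1:4, 3:5] = 100.0            # foreground object
img[0, 3] = 55.0                 # minute mixed-value region at the object tip
filtered = median3x3(img)
```

After filtering, the mixed pixel takes the background value, so the subsequent clustering no longer forms a spurious intermediate cluster along the contour.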
[0079]
Further, in the first to third embodiments of the present invention, by generating the intermediate viewpoint image after applying a median filter to the parallax map calculated by the parallax estimation unit, the boundary between the foreground and the background of the parallax map can be smoothed. As a result, the object contour in the intermediate viewpoint image can be smoothed and a more natural-looking intermediate viewpoint image can be generated; this is included in the present invention.
[0080]
In the fourth embodiment of the present invention, the method of compressing and encoding the images at the two non-adjacent viewpoints need not be limited to one using the correlation between the images (between the viewpoints); a method using the correlation within each image may be used instead, and this is included in the present invention.
[0081]
【The invention's effect】
As described above, according to the present invention, by using the clustering result of the image and performing parallax estimation that determines the parallax in regions where the reliability of the initial parallax is low from the parallax of surrounding pixels belonging to the same cluster, the boundary between the foreground parallax and the background parallax can be matched to the foreground object contour, and the parallax near occlusions and object contours can be determined from either the surrounding background parallax or the surrounding foreground parallax. As a result, the vicinity of the object contour in the intermediate viewpoint image can be reproduced with good image quality.
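The cluster-based parallax interpolation summarized above can be sketched in one dimension. This is an illustration, not the patent's procedure: the data are made up, unreliable pixels are marked with NaN, and clustering is reduced to two centroids computed directly from the reliable pixels, whereas the actual method clusters on pixel value and initial parallax jointly.

```python
import numpy as np

pixel_values = np.array([10.0, 12.0, 11.0, 95.0, 100.0, 102.0])  # 1-D "image"
initial_disp = np.array([1.0, 1.0, np.nan, np.nan, 5.0, 5.0])    # NaN = unreliable initial parallax
reliable = ~np.isnan(initial_disp)

# Two clusters from the reliable (corresponding-region) pixels: background and foreground.
centroids = np.array([pixel_values[reliable & (initial_disp == 1.0)].mean(),
                      pixel_values[reliable & (initial_disp == 5.0)].mean()])
cluster_disp = np.array([1.0, 5.0])   # parallax carried by each cluster

# Each unreliable pixel takes the parallax of the cluster whose centroid is
# closest in pixel value space, so the parallax boundary follows the contour.
filled = initial_disp.copy()
for i in np.where(~reliable)[0]:
    nearest = np.argmin(np.abs(pixel_values[i] - centroids))
    filled[i] = cluster_disp[nearest]
```

The two unreliable pixels straddling the contour end up with the background and foreground parallax respectively, never an intermediate value.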
[0082]
In addition, by further labeling the clustering result and taking spatial connectivity into account, the boundary between the foreground parallax and the background parallax can be determined stably even when pixels with pixel values similar to the foreground object exist in the occlusion, and the image quality near the foreground contour of the intermediate viewpoint image can be improved.
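The labeling step can be sketched as connected-component labeling of the cluster map: clustering alone groups pixels only by value, so a background-valued pixel inside the occlusion could be confused with a distant region of the same value, while 4-connected labeling keeps spatially separate regions apart. The BFS implementation below is an illustrative sketch, not the patent's algorithm.

```python
import numpy as np
from collections import deque

def label_components(cluster_map):
    """Label 4-connected components of equal cluster index."""
    h, w = cluster_map.shape
    labels = np.full((h, w), -1, dtype=int)
    current = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != -1:
                continue
            queue = deque([(sy, sx)])
            labels[sy, sx] = current
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and labels[ny, nx] == -1
                            and cluster_map[ny, nx] == cluster_map[y, x]):
                        labels[ny, nx] = current
                        queue.append((ny, nx))
            current += 1
    return labels

cluster_map = np.array([[0, 0, 1, 0],
                        [0, 1, 1, 0],
                        [0, 0, 1, 0]])    # cluster 1 splits cluster 0 into two regions
labels = label_components(cluster_map)
```

The two background regions on either side of the foreground receive different labels, which is exactly the spatial distinction the text relies on.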
[0083]
Furthermore, by performing parallax estimation that takes into account the correspondence between the initial parallax based on the left image and the initial parallax based on the right image, the parallax near object contours and occlusion boundaries, where the parallax estimate tends to be unstable, can be estimated with reference to either the surrounding foreground parallax or the surrounding background parallax, intermediate values between the foreground parallax and the background parallax near their boundary can be avoided, and the image quality in the vicinity of the object contour can be improved.
[0084]
In addition, by applying a median filter to the input image before clustering, intermediate pixel values between the foreground and the background near the object contour are replaced with the pixel value of either the foreground or the background region, the parallax can be prevented from taking an intermediate value at the boundary between the foreground and the background of the parallax map, and the image quality in the vicinity of the object contour of the intermediate viewpoint image can be improved.
[0085]
Furthermore, the boundary between the foreground and the background of the parallax map can be smoothed by generating the intermediate viewpoint image after applying a median filter to the parallax map calculated by the parallax estimation unit; as a result, the object contour can be smoothed and a natural-looking intermediate viewpoint image can be generated.
[0086]
In addition, by generating the intermediate viewpoint images on both the transmission side and the reception side of the multi-viewpoint image transmission system, only the residuals of the intermediate viewpoint images need be transmitted, so the transmission amount can be reduced and the multi-viewpoint image can be efficiently compressed and transmitted while maintaining the image quality; the practical effect is great.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of a disparity estimation method and an intermediate viewpoint image generation method according to a first embodiment of the present invention.
FIG. 2 is a diagram showing block matching
FIG. 3 is a diagram showing clustering
FIG. 4 is a diagram showing a blank area of the intermediate viewpoint image
FIG. 5 is a configuration diagram of a disparity estimation method and an intermediate viewpoint image generation method according to a second embodiment of the present invention.
FIG. 6 is a diagram showing labeling of the clustering result
FIG. 7 is a configuration diagram of a disparity estimation method and an intermediate viewpoint image generation method according to a third embodiment of the present invention.
FIGS. 8A to 8C are diagrams for explaining the states of correspondence;
FIG. 9 is a configuration diagram of a transmission unit of a multi-viewpoint image transmission system according to a fourth embodiment of the present invention.
FIG. 10 is a diagram illustrating an example of a configuration of an image compression encoding unit of a multi-view image transmission system according to a fourth embodiment of the present invention.
FIG. 11 is a diagram illustrating an example of a configuration of a decoded image expansion unit of a multi-view image transmission system according to a fourth embodiment of the present invention.
FIG. 12 is a diagram illustrating an example of a configuration of a residual compression encoding unit of a multi-view image transmission system according to a fourth embodiment of the present invention.
FIG. 13 is a configuration diagram of a reception unit of a multi-viewpoint image transmission system according to a fourth embodiment of the present invention.
FIG. 14 is a diagram illustrating an example of a configuration of a decoding residual expansion unit of a multi-view image transmission system according to a fourth embodiment of the present invention.
FIGS. 15A to 15C are explanatory diagrams of parallax distribution during parallel shooting;
FIGS. 16A to 16C are explanatory diagrams of the distribution of parallax at the time of convergent shooting;
[Explanation of symbols]
1L, 1R parallax estimation unit
2 Intermediate viewpoint image generator
3L, 3R initial parallax estimation means
4L, 4R reliability evaluation means
5L, 5R clustering means
6L, 6R parallax interpolation means
7L, 7R pixel shift means
8 Intermediate viewpoint image integration means

Claims (10)

  1. An intermediate viewpoint image generation method wherein a corresponding region and an uncorresponding region between left and right pixels are extracted from the initial parallax calculated on the basis of each of the left and right images; clustering based on the pixel value and the initial parallax value is performed for the corresponding region; for each pixel in the uncorresponding region, the distance in pixel value space between the pixel value of that pixel and the pixel value of each cluster centroid obtained by the clustering is calculated; and the parallax at the pixels in the uncorresponding region is determined using the parallax at the pixels of the corresponding region belonging to the cluster with the closest distance, whereby the boundary between the parallax of the foreground object and the parallax of the background is matched to the contour of the foreground object.
  2. An intermediate viewpoint image generation method wherein a corresponding region and an uncorresponding region between left and right pixels are extracted from the initial parallax calculated on the basis of each of the left and right images; clustering based on the pixel value and the initial parallax value is performed for the corresponding region; for each pixel in the uncorresponding region, the distance in pixel value space between the pixel value of that pixel and the pixel value of each cluster centroid obtained by the clustering is calculated; and the cluster with the closest distance is determined, and the parallax at the pixels in the uncorresponding region is determined taking into account the spatial connectivity of the clustering result, whereby the boundary between the parallax of the foreground object and the parallax of the background is matched to the contour of the foreground object.
  3. The intermediate viewpoint image generation method according to claim 1 or 2, wherein the clustering of the image is performed after a median filter is applied to the image in advance, thereby removing minute regions having pixel values intermediate between the foreground and the background in the vicinity of the object contour.
  4. The intermediate viewpoint image generation method according to any one of claims 1 to 3, wherein the boundary between the parallax of the foreground object and the parallax of the background is smoothed by applying a median filter to the parallax map.
  5. A parallax estimation method wherein a corresponding region and an uncorresponding region between left and right pixels are extracted from the initial parallax calculated on the basis of each of the left and right images; clustering based on the pixel value and the initial parallax value is performed for the corresponding region; for each pixel in the uncorresponding region, the distance in pixel value space between the pixel value of that pixel and the pixel value of each cluster centroid obtained by the clustering is calculated; and the parallax at the pixels in the uncorresponding region is determined using the parallax at the pixels of the corresponding region belonging to the cluster with the closest distance, whereby the boundary between the parallax of the foreground object and the parallax of the background is matched to the contour of the foreground object.
  6. A parallax estimation method wherein a corresponding region and an uncorresponding region between left and right pixels are extracted from the initial parallax calculated on the basis of each of the left and right images; clustering based on the pixel value and the initial parallax value is performed for the corresponding region; for each pixel in the uncorresponding region, the distance in pixel value space between the pixel value of that pixel and the pixel value of each cluster centroid obtained by the clustering is calculated; and the cluster with the closest distance is determined, and the parallax at the pixels in the uncorresponding region is determined taking into account the spatial connectivity of the clustering result, whereby the boundary between the parallax of the foreground object and the parallax of the background is matched to the contour of the foreground object.
  7. The parallax estimation method according to claim 5 or 6, wherein the clustering of the image is performed after a median filter is applied to the image in advance, thereby removing minute regions having pixel values intermediate between the foreground and the background in the vicinity of the object contour.
  8. The parallax estimation method according to any one of claims 5 to 7, wherein the boundary between the parallax of the foreground object and the parallax of the background is smoothed by applying a median filter to the parallax map.
  9. An image transmission method comprising: a decoded image decompression step of decoding and decompressing, for a multi-viewpoint image captured at a plurality of viewpoints, the compression-encoded images at a plurality of viewpoint positions; an intermediate viewpoint image generation step of generating an image at an intermediate viewpoint from the decoded and decompressed plurality of images by the method according to any one of claims 1 to 4; a compression encoding step of compressing and encoding the residual between the generated image at the intermediate viewpoint and the captured image; and a transmission step of transmitting the encoded information encoded by the compression encoding step.
  10. An image transmission method characterized by comprising a decoding residual decompression step of decoding and decompressing, for the encoded information transmitted by the image transmission method according to claim 9, the residual between the generated image at the intermediate viewpoint and the captured image.
JP34697296A 1996-12-26 1996-12-26 Intermediate viewpoint image generation method, parallax estimation method, and image transmission method Expired - Fee Related JP3769850B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP34697296A JP3769850B2 (en) 1996-12-26 1996-12-26 Intermediate viewpoint image generation method, parallax estimation method, and image transmission method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP34697296A JP3769850B2 (en) 1996-12-26 1996-12-26 Intermediate viewpoint image generation method, parallax estimation method, and image transmission method
US08/825,723 US6163337A (en) 1996-04-05 1997-04-02 Multi-view point image transmission method and multi-view point image display method

Publications (2)

Publication Number Publication Date
JPH10191396A JPH10191396A (en) 1998-07-21
JP3769850B2 true JP3769850B2 (en) 2006-04-26

Family

ID=18387064

Family Applications (1)

Application Number Title Priority Date Filing Date
JP34697296A Expired - Fee Related JP3769850B2 (en) 1996-12-26 1996-12-26 Intermediate viewpoint image generation method, parallax estimation method, and image transmission method

Country Status (1)

Country Link
JP (1) JP3769850B2 (en)

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100636785B1 (en) 2005-05-31 2006-10-13 삼성전자주식회사 Multi-view image system and method for compressing and decompressing applied to the same
AT551839T (en) * 2005-08-19 2012-04-15 Koninkl Philips Electronics Nv Stereoscopic display device
JP2007180982A (en) * 2005-12-28 2007-07-12 Victor Co Of Japan Ltd Device, method, and program for decoding image
JP2007180981A (en) * 2005-12-28 2007-07-12 Victor Co Of Japan Ltd Device, method, and program for encoding image
KR100756034B1 (en) 2006-01-26 2007-09-07 삼성전자주식회사 Apparatus and method for compensator of disparity vector
ES2599858T3 (en) * 2006-03-31 2017-02-03 Koninklijke Philips N.V. Effective multi-view coding
EP2235955A1 (en) * 2008-01-29 2010-10-06 Thomson Licensing Method and system for converting 2d image data to stereoscopic image data
KR100950046B1 (en) * 2008-04-10 2010-03-29 포항공과대학교 산학협력단 Apparatus of multiview three-dimensional image synthesis for autostereoscopic 3d-tv displays and method thereof
WO2010024919A1 (en) * 2008-08-29 2010-03-04 Thomson Licensing View synthesis with boundary-splatting
CN102239506B (en) * 2008-10-02 2014-07-09 弗兰霍菲尔运输应用研究公司 Intermediate view synthesis and multi-view data signal extraction
EP2180449A1 (en) * 2008-10-21 2010-04-28 Philips Electronics N.V. Method and device for providing a layered depth model of a scene
JP2011044828A (en) * 2009-08-19 2011-03-03 Fujifilm Corp Stereoscopic image generator, stereoscopic image printing device, and stereoscopic image generation method
JP2011060216A (en) * 2009-09-14 2011-03-24 Fujifilm Corp Device and method of processing image
BR112012016544A2 (en) 2010-01-07 2016-04-19 Thomson Licensing method and apparatus for providing video content
JP5051670B2 (en) * 2010-02-15 2012-10-17 Necシステムテクノロジー株式会社 Image processing apparatus, image processing method, and image processing program
JP5784729B2 (en) * 2010-08-27 2015-09-24 サムスン エレクトロニクス カンパニー リミテッド rendering apparatus and method for multi-view generation
WO2012068724A1 (en) * 2010-11-23 2012-05-31 深圳超多维光电子有限公司 Three-dimensional image acquisition system and method
JP5820985B2 (en) 2011-01-19 2015-11-24 パナソニックIpマネジメント株式会社 Stereoscopic image processing apparatus and stereoscopic image processing method
JP5782761B2 (en) * 2011-03-14 2015-09-24 大日本印刷株式会社 Parallax image generation device, parallax image generation method, program, and storage medium
BR112014007263A2 (en) * 2011-09-29 2017-03-28 Thomson Licensing method and device for filtering a disparity map
JP2013089981A (en) * 2011-10-13 2013-05-13 Sony Corp Image processing apparatus, image processing method and program
KR20140092910A (en) * 2011-11-14 2014-07-24 도쿠리츠 교세이 호진 죠호 츠신 켄큐 키코 Stereoscopic video coding device, stereoscopic video decoding device, stereoscopic video coding method, stereoscopic video decoding method, stereoscopic video coding program, and stereoscopic video decoding program
JP5820716B2 (en) * 2011-12-15 2015-11-24 シャープ株式会社 Image processing apparatus, image processing method, computer program, recording medium, and stereoscopic image display apparatus
WO2013099169A1 (en) * 2011-12-27 2013-07-04 パナソニック株式会社 Stereo photography device
JP5755571B2 (en) * 2012-01-11 2015-07-29 シャープ株式会社 Virtual viewpoint image generation device, virtual viewpoint image generation method, control program, recording medium, and stereoscopic display device
JP5373931B2 (en) * 2012-03-22 2013-12-18 日本電信電話株式会社 Virtual viewpoint image generation method, virtual viewpoint image generation apparatus, and virtual viewpoint image generation program
JP6016061B2 (en) * 2012-04-20 2016-10-26 Nltテクノロジー株式会社 Image generation apparatus, image display apparatus, image generation method, and image generation program
JP5953916B2 (en) * 2012-05-02 2016-07-20 ソニー株式会社 Image processing apparatus and method, and program
WO2014013805A1 (en) * 2012-07-18 2014-01-23 ソニー株式会社 Image processing device, image processing method, and image display device
US9762893B2 (en) * 2015-12-07 2017-09-12 Google Inc. Systems and methods for multiscopic noise reduction and high-dynamic range
WO2018042752A1 (en) * 2016-08-31 2018-03-08 株式会社堀場製作所 Signal analysis device, signal analysis method, computer program, measurement device, and measurement method

Also Published As

Publication number Publication date
JPH10191396A (en) 1998-07-21

Similar Documents

Publication Publication Date Title
Oh et al. H.264-based depth map sequence coding using motion information of corresponding texture video
KR101354387B1 (en) Depth map generation techniques for conversion of 2d video data to 3d video data
Lukacs Predictive coding of multi-viewpoint image sets
US5686973A (en) Method for detecting motion vectors for use in a segmentation-based coding system
US20050084016A1 (en) Video encoding apparatus and video decoding apparatus
JP5268645B2 (en) Method for predicting disparity vector using camera parameter, device for encoding and decoding multi-view video using the method, and recording medium on which program for performing the method is recorded
JP3826236B2 (en) Intermediate image generation method, intermediate image generation device, parallax estimation method, and image transmission display device
KR980011300A (en) Speed control for stereoscopic digital video encoding
US6043838A (en) View offset estimation for stereoscopic video coding
US5832115A (en) Ternary image templates for improved semantic compression
KR100378902B1 (en) A method and an apparatus for processing pixel data and a computer readable medium
KR20080088299A (en) Method for encoding and decoding motion model parameter, and method and apparatus for video encoding and decoding using motion model parameter
US20070147502A1 (en) Method and apparatus for encoding and decoding picture signal, and related computer programs
EP0579319B1 (en) Tracking moving objects
JP3736869B2 (en) Bi-directional motion estimation method and apparatus
Tzovaras et al. Object-based coding of stereo image sequences using joint 3-D motion/disparity compensation
JP4001400B2 (en) Motion vector detection method and motion vector detection device
US6144701A (en) Stereoscopic video coding and decoding apparatus and method
EP0734178A2 (en) Method and apparatus for determining true motion vectors for selected pixels
KR100209793B1 (en) Apparatus for encoding/decoding a video signals by using feature point based motion estimation
US8116557B2 (en) 3D image processing apparatus and method
US7822231B2 (en) Optical flow estimation method
KR101345303B1 (en) Dynamic depth control method or apparatus in stereo-view or multiview sequence images
KR100751422B1 (en) A Method of Coding and Decoding Stereoscopic Video and A Apparatus for Coding and Decoding the Same
US5760846A (en) Apparatus for estimating motion vectors for feature points of a video signal

Legal Events

Date Code Title Description
RD01 Notification of change of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7421

Effective date: 20050623

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20050707

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20050726

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20050908

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20060117

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20060130

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20100217

Year of fee payment: 4

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20110217

Year of fee payment: 5

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20120217

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130217

Year of fee payment: 7

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20140217

Year of fee payment: 8

LAPS Cancellation because of no payment of annual fees