EP1800493A1 - Method and system for encoding/decoding multi-view video based on layered-depth image - Google Patents

Method and system for encoding/decoding multi-view video based on layered-depth image

Info

Publication number
EP1800493A1
EP1800493A1 EP05809005A EP05809005A EP1800493A1 EP 1800493 A1 EP1800493 A1 EP 1800493A1 EP 05809005 A EP05809005 A EP 05809005A EP 05809005 A EP05809005 A EP 05809005A EP 1800493 A1 EP1800493 A1 EP 1800493A1
Authority
EP
European Patent Office
Prior art keywords
ldi
view video
bit stream
depth value
encoded
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP05809005A
Other languages
German (de)
French (fr)
Other versions
EP1800493A4 (en)
Inventor
Kug Jin Yun
Dae Hee Kim
Suk Hee Cho
Chung Hyun Ahn
Soo In Lee
Yo Sung Ho
Seung Uk Yoon
Sung Yeol Kim
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electronics and Telecommunications Research Institute ETRI
Original Assignee
Electronics and Telecommunications Research Institute ETRI
Gwangju Institute of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from KR1020050031715A (KR20060045798A)
Application filed by Electronics and Telecommunications Research Institute ETRI, Gwangju Institute of Science and Technology
Publication of EP1800493A1
Publication of EP1800493A4
Legal status: Withdrawn

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/10 Geometric effects
    • G06T15/20 Perspective computation
    • G06T15/205 Image-based rendering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/128 Adjusting depth or disparity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N2213/00 Details of stereoscopic systems
    • H04N2213/005 Aspects relating to the "3D+depth" image format

Definitions

  • the present invention relates to a linear decorrelation method and apparatus that adjust a probability distribution of a layered depth image to improve coding efficiency in encoding and decoding the layered depth image.
  • a multi-view video has been used in various applications to provide more realistic services, but a great amount of data is required so that an extremely wide bandwidth is needed to transmit the data. Therefore, to solve this problem, a layered depth image ("LDI") method requiring a relatively narrow bandwidth can be utilized.
  • LDM layered depth image
  • LDI represents a 3-D object with an array of pixels seen from a single camera position.
  • Each LDI pixel is represented by its color, depth that is the distance of the pixel to the camera, and some other property information supporting LDI rendering.
  • the LDI is composed of pixels similar to a typical 2-D image, but each pixel has color information as well as depth information and additional information that supports rendering. Therefore, any view image within a certain view angles can be easily rendered by using the LDI, which is constructed from a single view.
  • the LDI contains color information on Y, Cb, Cr, and Alpha, depth information representing a distance between a camera and an object, and a splat table index used to support various pixel sizes upon rendering.
  • Each LDI pixel contains 63-bit information in total to include all the information, so that one sheet of LDI includes data from several megabytes to several tens of megabytes.
  • LDI is divided into multiple layers, each of which contains a mask indicating the existence of pixel in the layer.
  • LDI is characterized in that the distribution of pixels becomes sparser towards the back layer. This phenomenon becomes more noticeable as the number of LDI layers increases.
  • A paper entitled "Compression of the layered depth image" (J. Duan and J. Li, IEEE Transactions on Image Processing, vol. 12, no. 3, March 2003) discloses a data aggregation method as a preprocessing step prior to compression.
  • Data aggregation, which exploits the feature of the LDI that the distribution of pixels becomes sparser towards the back layer, is performed to aggregate the pixels distributed in each layer.
  • Such simple data aggregation does not consider the correlation of LDI data. Therefore, there is a need to improve the coding efficiency of LDI data by removing redundant (duplicated) information from highly correlated LDI data, converting the data into decorrelated data, and then encoding the decorrelated data.
  • the object of the present invention is to propose a linear decorrelation process, which is a new preprocessing process to remove redundant depth information prior to performing data aggregation and a method for encoding multiview video using LDI.
  • Other object of the present invention is to improve coding efficiency in encoding process using the LDI, by making a distribution of depth information of the LDI data highly skewed around a median, through the linear decorrelation.
  • the present invention provides a method and an apparatus for encoding and decoding multi-view video using LDI.
  • a method of encoding multiview video data using LDI includes: (i) generating the LDI including multiple layers by using color and depth information of each viewpoint image of the multi-view video; (ii) performing linear decorrelation in each layer of the LDI; (iii) performing data aggregation in each linearly-decorrelated layer of the LDI; and (iv) encoding the aggregated data in each layer of the LDI to generate an encoded LDI bit stream.
  • step (ii) may include, for each pixel in each layer of the LDI, calculating the minimum distance between a line connecting the two previous pixels and the depth value of the current pixel, and replacing the depth value of the current pixel with that minimum distance. Also, in step (ii), when a depth value of the current pixel does not exist, the average depth value of the two previous pixels may be used as the depth value of the current pixel.
  • the information for compensating information loss occurred in the LDI generation step may be transmitted to a decoding apparatus, together with the encoded LDI bit stream, so that images close to the original ones may be reconstructed.
  • a method of decoding a multi- view video comprising the steps of decoding an encoded LDI bit stream; decoding a bit stream of residual information between an original multi-view video and a multi-view video reconstructed from the encoded LDI bit stream; reconstructing the multi-view video based on the decoded LDI bit stream and residual information is provided.
  • the image at the corresponding viewpoint may be reconstructed only.
  • FIG. 1 shows a typical LDI structure.
  • FIG. 2 is a schematic diagram of a multi-view video LDI-based encoding/decoding apparatus according to an embodiment of the present invention
  • FIG. 3 is a diagram illustrating how a LDI is generated from multi-view video data
  • FIG. 4 shows how to perform linear decorrelation on the LDI layer in which all the pixels have depth values, according to the present invention
  • FIG. 5 shows how to perform linear decorrelation on the LDI layer in which some of the pixels do not have depth values, according to the present invention
  • FIG. 6 is a flowchart showing the linear decorrelation process according to a preferred embodiment of the present invention. Mode for the Invention
  • FIG. 1 shows a typical LDI structure.
  • the LDI includes an array of pixels seen from a single LDI camera position, together with multiple layers based on any viewpoints.
  • the rays intersect with an object at a plurality of points, which are ordered from the front to the back.
  • the first intersection points constitute the first LDI layer; the second intersection points constitute the second layer, and so on.
  • Each LDI layer is separated into individual components: luminance, color, transparency and depth. Further, component image of each layer is compressed separately. In order to increase a compression rate, data aggregation is performed to aggregate data on the same layer, so that data are more compactly distributed.
  • FIG. 2 is a schematic diagram of a multi-view video LDI-based encoding/decoding apparatus according to an embodiment of the present invention.
  • the apparatus 210 includes a LDI generation unit 201, a linear decorrelation unit 202, a data aggregation unit 203, an LDI encoding unit 204, an LDI decoding unit 205, a multi-view video generation unit 206 and residual information encoding unit 207.
  • the LDI generation unit 201 generates a LDI, which is composed of multiple layers, by 3-D warping of multiview video images with depth information, which uses color and depth information of each image.
  • As shown in Fig. 3, while the images with depth information at different camera viewpoints C and C are warped into one with depth information at a common viewpoint C , when the warped pixels are placed in the same pixel location, their depth values are compared. If the difference between the depth values is less than a predefined threshold, they are merged. Otherwise, a new layer having the average depth value of the two pixels is created.
  • the former case is shown as 'c' and 'd' in Fig. 3. Since algorithms for generating an LDI are well known to those skilled in the art, a detailed description thereof is omitted in this specification.
  • the linear decorrelation unit 202, which performs preprocessing before data aggregation to improve the coding efficiency, causes the depth values of the pixels in each layer of the LDI to gather around the median in order to reduce their variance. Specifically, the linear decorrelation is performed on each layer constituting the LDI (hereinafter, "LDI layer"). The details of the linear decorrelation are explained with reference to Figs. 4-6.
  • the data aggregation unit 203 performs the LDI data aggregation in each LDI layer, in order to reduce distribution of depth values. Since data aggregation process is disclosed in the above-article "Compression of the layered depth image" (J. Duan and J. Li, IEEE TRANSACTIONS OF IMAGE PROCESSING, VOL., 12, NO.3, 2003/3), it will be omitted herein.
  • the LDI encoding unit 204 encodes the data aggregated toward a certain direction in a space.
  • the encoded LDI bit stream will be transmitted through a communication channel or a storage medium to a multi-view video decoding apparatus 220.
  • the reconstructed images may have a residual with the original images. This is due to information loss during the LDI generation. Accordingly, it is required to separately transmit information for compensating such information loss to a multi-view video decoding apparatus 220, in order to reconstruct high-quality images close to the original ones.
  • the multi-view video encoding apparatus 210 may additionally include LDI decoding unit 205, multi- view image generation unit 206 and residual information encoding unit 207.
  • LDI decoding unit 205 receives the encoded LDI bit stream from the LDI encoding unit 204 and decodes it.
  • the multi-view image generation unit 206 generates each of multi-view images from the decoded LDI data.
  • the residual information encoding unit 207 calculates residual information between multi-view images generated by the multi- view image generation unit 206 and original multi-view images, encodes and transmits it to the multi-view video decoding apparatus 220.
  • the multi-view video decoding apparatus 220 includes an LDI decoding unit 221, multi-view image generation unit 222 and residual information decoding unit 223.
  • the LDI decoding unit 221 receives the encoded LDI bit stream from the multi-view video encoding apparatus 210 and decodes it.
  • the residual information decoding unit 223 receives the encoded residual information bit stream from the multi-view video encoding apparatus 210 and decodes it.
  • the multi-view image generation unit 222 generates each of the multi-view images close to the original images, using the LDI data decoded by the LDI decoding unit 221 and the residual information decoded by the residual information decoding unit 223.
  • a user can select which viewpoint will be reconstructed and, the multi-view image generation unit 222 can generate the image corresponding to the selected viewpoint, in response to the selection.
  • FIG. 4 shows how to perform linear decorrelation on the LDI layer in which all the pixels have depth information, according to the present invention.
  • the one- dimensional (1-D) depth value of a pixel may be considered as the two-dimensional (2-D) value point.
  • the minimum distance between a line passing through previous two points, which represent the depth values of the previous two pixels, and the depth value of a current pixel is calculated; and then the depth value of the current pixel is replaced with the minimum distance.
  • FIG. 5 shows how to perform linear decorrelation on the LDI layer in which some of the pixels do not have depth values, according to the present invention.
  • the average depth value of the previous two points is inserted into a depth value of the pixel, which does not have a depth value.
  • the minimum distance between a line passing through the previous two points and the depth value of the current pixel is calculated and then the depth value of the current pixel can be replaced with the minimum distance.
  • some of the previous two pixels may not have depth values.
  • when the depth value of a first pixel does not exist, the depth value of the first pixel is filled with '0', and when the depth value of a second pixel does not exist, the depth value of the second pixel is filled with that of the first pixel.
  • the depth value of the third pixel can be filled with the average depth value of the previous two pixels. Then, the minimum distance is calculated using this average value as a depth value of the current pixel. In other words, the depth values of all the pixels on each LDI layer are filled and then the minimum distance is calculated.
  • the minimum distance, d, between the line passing through the previous two points A and B, which represent the depth values of the two previous pixels, and a current point C, which represents the depth value of the current pixel (each point having coordinates (x, z)), can be computed by $d = \left|(A-B)^{\perp} \cdot (C-A)\right| / \left|A-B\right|$.
  • $A^{\perp}$ represents $A(-z, x)$. Since the depth value does not exist in the position of C, as described above, the average value of the previous two depth values is inserted into the z value of C. With this, the variance of the depth values can be reduced.
  • FIG. 6 is a flowchart of the linear decorrelation process according to an embodiment of the present invention.
  • in step 610, it is checked whether all the LDI pixels on the same LDI layer have depth values.
  • in step 620, it is determined whether the pixel having no depth value is the first pixel. If it is the first pixel, its value is filled with '0' in step 630.
  • in step 640, it is determined whether the pixel having no depth value is the second pixel. If it is, the value of the second pixel is filled with the depth value of the first pixel in step 650.
  • the depth value of the corresponding pixel is filled with the average depth value of the previous two points, which represent the depth values of the previous two pixels, in step 660.
  • the steps 620 to 660 are performed to fill the depth values of the corresponding pixels.
  • in step 670, the minimum distance between a line passing through the previous two points and the depth value of the current pixel is calculated, and the depth value of the current pixel is replaced with the minimum distance.
  • the present invention can be provided as one or more computer readable medium implemented on one or more products.
  • the products may be a floppy disk, a hard disk, a CD ROM, a flash memory card, a PROM, a RAM, a ROM, or a magnetic tape.
  • a computer readable program can be implemented in any programming language. Some examples of available languages include C, C++, or JAVA.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Provided are a method and an apparatus for encoding/decoding a multi-view video using LDI. Specifically, provided are a method and an apparatus for encoding/decoding a multi-view video using LDI, which use a linear decorrelation process to improve compression efficiency. The LDI encoding method according to the present invention includes: (i) generating the LDI including multiple layers by using color and depth information of each viewpoint image of the multi-view video; (ii) performing linear decorrelation in each layer of the LDI; (iii) performing data aggregation in each linearly-decorrelated layer of the LDI; and (iv) encoding the aggregated data in each layer of the LDI to generate an encoded LDI bit stream.

Description

METHOD AND SYSTEM FOR ENCODING/DECODING MULTI-VIEW VIDEO BASED ON LAYERED-DEPTH IMAGE
Technical Field
[1] The present invention relates to a linear decorrelation method and apparatus that adjust the probability distribution of a layered depth image to improve coding efficiency in encoding and decoding the layered depth image.
Background Art
[2] A multi-view video has been used in various applications to provide more realistic services, but a great amount of data is required so that an extremely wide bandwidth is needed to transmit the data. Therefore, to solve this problem, a layered depth image ("LDI") method requiring a relatively narrow bandwidth can be utilized.
[3] Unlike a typical 3-D modeling mechanism using a mesh, an LDI represents a 3-D object with an array of pixels seen from a single camera position. Each LDI pixel is represented by its color, its depth (the distance from the pixel to the camera), and other property information that supports LDI rendering. In other words, the LDI is composed of pixels similar to those of a typical 2-D image, but each pixel carries depth information and additional rendering information as well as color information. Therefore, any view image within a certain range of view angles can be easily rendered from the LDI, which is constructed from a single view. Specifically, the LDI contains color information on Y, Cb, Cr, and Alpha, depth information representing the distance between the camera and an object, and a splat table index used to support various pixel sizes upon rendering. Each LDI pixel contains 63 bits of information in total, so that one sheet of LDI holds from several megabytes to several tens of megabytes of data.
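The size claim above can be checked with simple arithmetic. The following is a minimal sketch: only the 63-bit figure comes from the text, while the image resolution, the average number of occupied layers per location, and the function name are hypothetical values chosen for illustration.

```python
BITS_PER_LDI_PIXEL = 63  # stated in the text above


def ldi_size_megabytes(num_pixels: int) -> float:
    """Raw (uncompressed) size, in units of 10**6 bytes, of num_pixels LDI pixels."""
    total_bits = num_pixels * BITS_PER_LDI_PIXEL
    return total_bits / 8 / 1_000_000


# Hypothetical example: a 640 x 480 LDI in which, on average,
# 1.5 layers are occupied at each pixel location.
occupied_pixels = int(640 * 480 * 1.5)
print(round(ldi_size_megabytes(occupied_pixels), 2))  # prints 3.63
```

Under these assumed numbers a single LDI already occupies several megabytes, which is consistent with the range quoted in the text.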
[4] The LDI is divided into multiple layers, each of which contains a mask indicating the existence of pixels in the layer. The LDI is characterized in that the distribution of pixels becomes sparser towards the back layer. This phenomenon becomes more noticeable as the number of LDI layers increases.
[5] A paper entitled "Compression of the layered depth image" (J. Duan and J. Li, IEEE Transactions on Image Processing, vol. 12, no. 3, March 2003) discloses a data aggregation method as a preprocessing step prior to compression. Data aggregation, which exploits the feature of the LDI that the distribution of pixels becomes sparser towards the back layer, is performed to aggregate the pixels distributed in each layer. However, such simple data aggregation does not consider the correlation of LDI data.
[6] Therefore, there is a need to improve the coding efficiency of LDI data by removing redundant (duplicated) information from highly correlated LDI data, converting the data into decorrelated data, and then encoding the decorrelated data.
Disclosure of Invention
Technical Problem
[7] An object of the present invention is to propose a linear decorrelation process, which is a new preprocessing step that removes redundant depth information prior to data aggregation, and a method for encoding multi-view video using LDI. Another object of the present invention is to improve coding efficiency in the encoding process using the LDI by making the distribution of depth information of the LDI data highly skewed around a median through the linear decorrelation.
Technical Solution
[8] In order to achieve the above objects, the present invention provides a method and an apparatus for encoding and decoding multi-view video using LDI.
[9] According to an aspect of the present invention, a method of encoding multiview video data using LDI is provided. The method includes: (i) generating the LDI including multiple layers by using color and depth information of each viewpoint image of the multi-view video; (ii) performing linear decorrelation in each layer of the LDI; (iii) performing data aggregation in each linearly-decorrelated layer of the LDI; and (iv) encoding the aggregated data in each layer of the LDI to generate an encoded LDI bit stream.
[10] Step (ii) may include, for each pixel in each layer of the LDI, calculating the minimum distance between a line connecting the two previous pixels and the depth value of the current pixel, and replacing the depth value of the current pixel with that minimum distance. Also, in step (ii), when a depth value of the current pixel does not exist, the average depth value of the two previous pixels may be used as the depth value of the current pixel.
[11] In addition, information for compensating for the information loss that occurred in the LDI generation step may be transmitted to a decoding apparatus together with the encoded LDI bit stream, so that images close to the original ones may be reconstructed.
[12] According to another aspect of the present invention, a method of decoding a multi-view video is provided. The method comprises the steps of: decoding an encoded LDI bit stream; decoding a bit stream of residual information between an original multi-view video and a multi-view video reconstructed from the encoded LDI bit stream; and reconstructing the multi-view video based on the decoded LDI bit stream and the decoded residual information.
[13] According to an embodiment of the present invention, when an instruction selecting a viewpoint to be reconstructed is received from a user, only the image at the corresponding viewpoint may be reconstructed.
Advantageous Effects
[14] According to the LDI-based multi-view video encoding/decoding methods of the present invention, coding efficiency may be improved and high-quality images at the corresponding viewpoints, which are close to the original ones, can be reconstructed.
Brief Description of the Drawings
[15] FIG. 1 shows a typical LDI structure;
[16] FIG. 2 is a schematic diagram of a multi-view video LDI-based encoding/decoding apparatus according to an embodiment of the present invention;
[17] FIG. 3 is a diagram illustrating how a LDI is generated from multi-view video data;
[18] FIG. 4 shows how to perform linear decorrelation on the LDI layer in which all the pixels have depth values, according to the present invention;
[19] FIG. 5 shows how to perform linear decorrelation on the LDI layer in which some of the pixels do not have depth values, according to the present invention;
[20] FIG. 6 is a flowchart showing the linear decorrelation process according to a preferred embodiment of the present invention.
Mode for the Invention
[21] The present invention will now be described more fully hereinafter with reference to the accompanying drawings, in which preferred embodiments of the invention are shown. This invention may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein.
[22] FIG. 1 shows a typical LDI structure. The LDI includes an array of pixels seen from a single LDI camera position, together with multiple layers based on any viewpoints. As shown in FIG. 1, when rays are shot from an LDI camera position P, the rays intersect an object at a plurality of points, which are ordered from front to back. The first intersection points constitute the first LDI layer, the second intersection points constitute the second layer, and so on. Each LDI layer is separated into individual components: luminance, color, transparency and depth. Further, the component images of each layer are compressed separately. In order to increase the compression rate, data aggregation is performed to aggregate data on the same layer, so that the data are more compactly distributed.
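The layer ordering described above can be illustrated with a small sketch. It assumes that each ray is given as a list of intersection depths and that a smaller depth means a point closer to the camera; the function name `build_layers` and the sample depths are hypothetical, and `None` stands in for the per-layer mask entry of an empty position.

```python
from typing import List, Optional


def build_layers(ray_hits: List[List[float]]) -> List[List[Optional[float]]]:
    """Place the k-th nearest intersection of every ray into layer k."""
    num_layers = max((len(hits) for hits in ray_hits), default=0)
    layers: List[List[Optional[float]]] = [
        [None] * len(ray_hits) for _ in range(num_layers)
    ]
    for ray_index, hits in enumerate(ray_hits):
        # Sort front to back so the first intersection lands in layer 0.
        for layer_index, depth in enumerate(sorted(hits)):
            layers[layer_index][ray_index] = depth
    return layers


# Four hypothetical rays with 3, 1, 2 and 0 intersections.
layers = build_layers([[2.0, 5.0, 9.0], [3.0], [4.0, 7.5], []])
occupancy = [sum(d is not None for d in layer) for layer in layers]
print(occupancy)  # prints [3, 2, 1]
```

The occupancy counts shrink toward the back layers, which is the sparsity that data aggregation relies on.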
[23] FIG. 2 is a schematic diagram of a multi-view video LDI-based encoding/decoding apparatus according to an embodiment of the present invention. As shown in FIG. 2, the multi-view video encoding apparatus 210 includes an LDI generation unit 201, a linear decorrelation unit 202, a data aggregation unit 203, an LDI encoding unit 204, an LDI decoding unit 205, a multi-view image generation unit 206 and a residual information encoding unit 207.
[24] The LDI generation unit 201 generates an LDI composed of multiple layers by 3-D warping of multi-view video images with depth information, using the color and depth information of each image. As an example, as shown in Fig. 3, while the images with depth information at different camera viewpoints C and C are warped into one with depth information at a common viewpoint C , when the warped pixels are placed in the same pixel location, their depth values are compared. If the difference between the depth values is less than a predefined threshold, they are merged. Otherwise, a new layer having the average depth value of the two pixels is created. The former case is shown as 'c' and 'd' in Fig. 3. Since algorithms for generating an LDI are well known to those skilled in the art, a detailed description thereof is omitted in this specification.
[25] The linear decorrelation unit 202, which performs preprocessing before data aggregation to improve the coding efficiency, causes the depth values of the pixels in each layer of the LDI to gather around the median in order to reduce their variance. Specifically, the linear decorrelation is performed on each layer constituting the LDI (hereinafter, "LDI layer"). The details of the linear decorrelation are explained with reference to Figs. 4-6.
[26] Next, the data aggregation unit 203 performs LDI data aggregation in each LDI layer in order to reduce the distribution of depth values. Since the data aggregation process is disclosed in the above-mentioned article "Compression of the layered depth image" (J. Duan and J. Li, IEEE Transactions on Image Processing, vol. 12, no. 3, March 2003), its description is omitted herein.
[27] The LDI encoding unit 204 encodes the data aggregated toward a certain direction in a space. The encoded LDI bit stream is transmitted through a communication channel or a storage medium to a multi-view video decoding apparatus 220.
[28] When the multi-view video is reconstructed from the LDI data generated by the LDI generation unit 201, the reconstructed images may differ from the original images by a residual. This is due to information loss during the LDI generation. Accordingly, information for compensating for this loss must be transmitted separately to the multi-view video decoding apparatus 220 in order to reconstruct high-quality images close to the original ones.
[29] To this end, according to one embodiment of the invention, the multi-view video encoding apparatus 210 may additionally include an LDI decoding unit 205, a multi-view image generation unit 206 and a residual information encoding unit 207. The LDI decoding unit 205 receives the encoded LDI bit stream from the LDI encoding unit 204 and decodes it. The multi-view image generation unit 206 generates each of the multi-view images from the decoded LDI data. The residual information encoding unit 207 calculates residual information between the multi-view images generated by the multi-view image generation unit 206 and the original multi-view images, encodes it, and transmits it to the multi-view video decoding apparatus 220.
[30] The multi-view video decoding apparatus 220 includes an LDI decoding unit 221, a multi-view image generation unit 222 and a residual information decoding unit 223. The LDI decoding unit 221 receives the encoded LDI bit stream from the multi-view video encoding apparatus 210 and decodes it. The residual information decoding unit 223 receives the encoded residual information bit stream from the multi-view video encoding apparatus 210 and decodes it. The multi-view image generation unit 222 generates each of the multi-view images close to the original images, using the LDI data decoded by the LDI decoding unit 221 and the residual information decoded by the residual information decoding unit 223. In another embodiment, a user can select which viewpoint is to be reconstructed, and the multi-view image generation unit 222 generates the image corresponding to the selected viewpoint in response to the selection.
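The residual path described in paragraphs [28] to [30] can be sketched as follows. This is an illustration under assumptions, not the apparatus itself: images are modeled as small 2-D lists of integers, the residual is taken as a per-pixel difference, and the function names `compute_residual` and `apply_residual` are hypothetical. The text does not specify how the residual is encoded, so no entropy coding is shown.

```python
from typing import List

Image = List[List[int]]


def compute_residual(original: Image, reconstructed: Image) -> Image:
    """Encoder side: per-pixel difference between original and LDI-reconstructed view."""
    return [
        [o - r for o, r in zip(orig_row, recon_row)]
        for orig_row, recon_row in zip(original, reconstructed)
    ]


def apply_residual(reconstructed: Image, residual: Image) -> Image:
    """Decoder side: add the decoded residual back onto the LDI-reconstructed view."""
    return [
        [r + e for r, e in zip(recon_row, res_row)]
        for recon_row, res_row in zip(reconstructed, residual)
    ]


# Hypothetical 2 x 3 view: the LDI reconstruction has lost some detail.
original_view = [[100, 102, 98], [97, 101, 99]]
ldi_view = [[100, 100, 100], [98, 100, 99]]

residual = compute_residual(original_view, ldi_view)
restored = apply_residual(ldi_view, residual)
print(residual)  # prints [[0, 2, -2], [-1, 1, 0]]
```

If the residual were transmitted without loss, the restored view would equal the original; with lossy residual coding it would only be close to it, which matches the wording of the text.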
[31] FIG. 4 shows how to perform linear decorrelation on the LDI layer in which all the pixels have depth information, according to the present invention. As shown, the one- dimensional (1-D) depth value of a pixel may be considered as the two-dimensional (2-D) value point. As shown in FIG. 4, in case that all the pixels have depth values on the same LDI layer, the minimum distance between a line passing through previous two points, which represent the depth values of the previous two pixels, and the depth value of a current pixel is calculated; and then the depth value of the current pixel is replaced with the minimum distance.
[32] On the other hand, FIG. 5 shows how to perform linear decorrelation on an LDI layer in which some of the pixels do not have depth values, according to the present invention. As shown in FIG. 5, when a pixel does not have a depth value, the average depth value of the previous two points is used as the depth value of that pixel. Then, in the same manner, the minimum distance between the line passing through the previous two points and the depth value of the current pixel is calculated, and the depth value of the current pixel is replaced with the minimum distance.
[33] However, one or both of the previous two pixels may themselves lack depth values. For example, when the depth value of a first pixel does not exist, the depth value of the first pixel is filled with '0', and when the depth value of a second pixel does not exist, the depth value of the second pixel is filled with that of the first pixel. Accordingly, the depth value of the third pixel can be filled with the average depth value of the previous two pixels. Then, the minimum distance is calculated using this average value as the depth value of the current pixel. In other words, the depth values of all the pixels on each LDI layer are filled first, and then the minimum distance is calculated. The minimum distance, d, between the line passing through the previous two points, which represent the depth values of the two previous pixels (for example, $A(x_{i-2}, z_{i-2})$ and $B(x_{i-1}, z_{i-1})$), and the current point, which represents the depth value of the current pixel (for example, $C(x_i, z_i)$), can be computed by
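[34]
$$d = \frac{\left| (A - B)^{\perp} \cdot (C - A) \right|}{\left| A - B \right|} \qquad (1)$$
[35] where $A^{\perp}$ denotes the vector $A$ rotated by 90 degrees, that is, for $A = (x, z)$, $A^{\perp} = (-z, x)$. When the depth value does not exist at the position of C, as described above, the average value of the previous two depth values is used as $z_i$, the depth value of C. With this, the variance of the depth value distribution can be reduced.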
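The distance in equation (1) is the standard perpendicular distance from a point to a line. A minimal sketch follows, assuming each point is an (x, z) pair of pixel position and depth value; the function name `min_distance` is hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, z): pixel position and depth value (assumed)


def min_distance(a: Point, b: Point, c: Point) -> float:
    """Distance from point c to the line through a and b, as in equation (1)."""
    dx, dz = a[0] - b[0], a[1] - b[1]   # A - B
    perp = (-dz, dx)                    # (A - B) rotated by 90 degrees
    cx, cz = c[0] - a[0], c[1] - a[1]   # C - A
    norm = math.hypot(dx, dz)           # |A - B|
    if norm == 0.0:
        raise ValueError("a and b must be distinct points")
    return abs(perp[0] * cx + perp[1] * cz) / norm


flat = min_distance((0.0, 0.0), (1.0, 0.0), (2.0, 3.0))       # line z = 0
collinear = min_distance((0.0, 0.0), (1.0, 1.0), (2.0, 2.0))  # point lies on the line
```

Here `flat` is 3.0 and `collinear` is 0.0: a current pixel whose depth continues the linear trend of its two predecessors is mapped to zero.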
[36] FIG. 6 is a flowchart of the linear decorrelation process according to an embodiment of the present invention. As shown in FIG. 6, in step 610, it is checked whether all the LDI pixels on the same LDI layer have depth values. When it is determined that there exists a pixel having no depth value, it is determined in step 620 whether that pixel is the first pixel. If it is the first pixel, its value is filled with '0' in step 630. Next, in step 640, it is determined whether the pixel having no depth value is the second pixel. If it is, the value of the second pixel is filled with the depth value of the first pixel in step 650. When any other pixel has no depth value, the depth value of that pixel is filled, in step 660, with the average depth value of the previous two points, which represent the depth values of the previous two pixels. Thus, when some pixels do not have depth values, steps 620 to 660 are performed to fill the depth values of the corresponding pixels.
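The filling rule of steps 620 to 660 can be sketched as below. This is an illustration, not the implementation of the embodiment: a missing depth value is represented by `None`, the function name `fill_missing_depths` is hypothetical, and it is assumed that "previous two" refers to values that have already been filled.

```python
from typing import List, Optional


def fill_missing_depths(depths: List[Optional[float]]) -> List[float]:
    """Fill missing depth values on one LDI layer (None marks a missing value)."""
    filled: List[float] = []
    for i, z in enumerate(depths):
        if z is not None:
            filled.append(z)                  # depth value exists: keep it
        elif i == 0:
            filled.append(0.0)                # step 630: first pixel is filled with 0
        elif i == 1:
            filled.append(filled[0])          # step 650: copy the first pixel
        else:
            # step 660: average of the previous two (already filled) values
            filled.append((filled[i - 2] + filled[i - 1]) / 2.0)
    return filled


example = fill_missing_depths([None, None, 8.0, None, 4.0])
```

For this input the result is `[0.0, 0.0, 8.0, 4.0, 4.0]`: the first two pixels fall back to 0, and the fourth pixel receives the average of 0.0 and 8.0.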
[37] Next, in step 670, the minimum distance between a line passing through the previous two points and a depth value of a current pixel is calculated, and the depth value of the current pixel is replaced with the minimum distance.
[38] The above steps 610 to 670 are performed repeatedly on each layer of the LDI data, so that linear decorrelation is obtained on each layer.
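Taken together, steps 610 to 670 applied to every layer can be sketched as follows. Several points are assumptions that the description leaves open: the pixel index is used as the x coordinate, the first two pixels of a layer (which have no two predecessors) are left unchanged, each line is drawn through the filled original depth values rather than through already replaced values, and the distance is unsigned as in equation (1). The names `decorrelate_layer` and `decorrelate_ldi` are hypothetical.

```python
import math
from typing import List, Optional

Layer = List[Optional[float]]  # depth values of one LDI layer, None = missing


def decorrelate_layer(depths: Layer) -> List[float]:
    # Steps 610-660: fill missing depth values.
    filled: List[float] = []
    for i, z in enumerate(depths):
        if z is not None:
            filled.append(z)
        elif i == 0:
            filled.append(0.0)
        elif i == 1:
            filled.append(filled[0])
        else:
            filled.append((filled[i - 2] + filled[i - 1]) / 2.0)

    # Step 670: replace each depth value with its distance from the line
    # through the previous two points A = (i-2, z) and B = (i-1, z).
    out = list(filled[:2])  # assumption: first two pixels are kept as they are
    for i in range(2, len(filled)):
        ax, az = i - 2, filled[i - 2]
        dx, dz = -1.0, az - filled[i - 1]            # A - B
        num = abs(-dz * (i - ax) + dx * (filled[i] - az))
        out.append(num / math.hypot(dx, dz))
    return out


def decorrelate_ldi(layers: List[Layer]) -> List[List[float]]:
    """Apply the per-layer process to every layer of the LDI data."""
    return [decorrelate_layer(layer) for layer in layers]


result = decorrelate_ldi([[0.0, 1.0, 2.0, 3.0], [2.0, 2.0, None, 5.0]])
```

In the first layer the depth values grow linearly, so every pixel after the second maps to 0.0. In the second layer the filled value 2.0 lies on the flat line through its predecessors, while the value 5.0 lies 3.0 away from it.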
[39] The present invention can be provided as one or more computer readable media embodied in one or more products. The products may be a floppy disk, a hard disk, a CD-ROM, a flash memory card, a PROM, a RAM, a ROM, or a magnetic tape. In general, the computer readable program can be implemented in any programming language. Some examples of available languages include C, C++, and Java.
[40] Although exemplary embodiments of the present invention have been described with reference to the attached drawings, the present invention is not limited to these embodiments, and it should be appreciated by those skilled in the art that a variety of modifications and changes can be made without departing from the spirit and scope of the present invention.

Claims
[1] A method of encoding multi-view video using LDI, the method comprising the steps of:
(i) generating the LDI including multiple layers by using color and depth information of each viewpoint image of the multi-view video;
(ii) performing linear decorrelation in each layer of the LDI;
(iii) performing data aggregation in each linearly-decorrelated layer of the LDI; and
(iv) encoding the aggregated data in each layer of the LDI to generate an encoded LDI bit stream.
[2] The method according to claim 1, wherein the step (ii) includes, for each of all pixels in each layer of the LDI, calculating a minimum distance between a line connecting two previous pixels and a depth value of a current pixel to replace the depth value of the current pixel with the minimum distance.
[3] The method according to claim 2, wherein the step (ii) includes, when a depth value of the current pixel does not exist, using an average depth value of the two previous pixels as the depth value of the current pixel.
[4] The method according to claim 3, further comprising, when the pixel not having the depth value is a first pixel of the LDI, filling a depth value of the first pixel with a value of '0'.
[5] The method according to claim 3, further comprising, when the pixel not having the depth value is a second pixel of the LDI, copying the depth value of the first pixel.
[6] The method according to claim 1, further comprising transmitting information for compensating for information loss that occurred in the step (i), together with the encoded LDI bit stream.
[7] The method according to claim 6, wherein the information for compensating for information loss is residual information between an original multi-view video and a multi-view video reconstructed from the encoded LDI bit stream, and the residual information is encoded and transmitted to a decoding apparatus.
[8] A method of decoding a multi-view video, the method comprising: decoding an encoded LDI bit stream; decoding a bit stream of residual information between an original multi-view video and a multi-view video reconstructed from the encoded LDI bit stream; reconstructing the multi-view video based on the decoded LDI bit stream and residual information.
[9] The method according to claim 8, further comprising: receiving an instruction selecting a viewpoint to be reconstructed from a user; and in response to the received instruction, reconstructing image data at the corresponding viewpoint.
[10] An apparatus of encoding multi-view video using LDI, comprising:
(i) means for generating the LDI including multiple layers by using color and depth information of each viewpoint image of the multi-view video;
(ii) means for performing linear decorrelation in each layer of the LDI;
(iii) means for performing data aggregation in each linearly-decorrelated layer of the LDI; and
(iv) means for encoding the aggregated data in each layer of the LDI to generate an encoded LDI bit stream.
[11] The apparatus according to claim 10, wherein the means for performing linear decorrelation calculates, for each of all pixels in each layer of the LDI, a minimum distance between a line connecting two previous pixels and a depth value of a current pixel to replace the depth value of the current pixel with the minimum distance.
[12] The apparatus according to claim 10, further comprising means for calculating and encoding residual information between an original multi-view video and a multi-view video reconstructed from the encoded LDI bit stream.
[13] The apparatus according to claim 12, comprising: means for decoding the encoded LDI bit stream; means for reconstructing the multi-view video data based on the decoded LDI data; and means for calculating and encoding the residual information between the original multi-view video data and the reconstructed multi-view video data.
[14] An apparatus of decoding a multi-view video, comprising: means for decoding an encoded LDI bit stream; means for decoding a bit stream of residual information between an original multi-view video and a multi-view video reconstructed from the encoded LDI bit stream; means for reconstructing the multi-view video based on the decoded LDI bit stream and residual information.
[15] A computer readable recording medium having a computer program thereon, which performs a method according to any one of claims 1 to 7.
[16] A computer readable recording medium having a computer program thereon, which performs a method for decoding multi-view video according to claim 8 or
EP05809005A 2004-10-16 2005-10-13 Method and system for encoding/decoding multi-view video based on layered-depth image Withdrawn EP1800493A4 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
KR20040082927 2004-10-16
KR1020050031715A KR20060045798A (en) 2004-10-16 2005-04-16 Method and system for coding multi-view video based on layered -depth image
PCT/KR2005/003418 WO2006041261A1 (en) 2004-10-16 2005-10-13 Method and system for encoding/decoding multi-view video based on layered-depth image

Publications (2)

Publication Number Publication Date
EP1800493A1 (en) 2007-06-27
EP1800493A4 EP1800493A4 (en) 2012-10-10

Family

ID=36148536

Family Applications (1)

Application Number Title Priority Date Filing Date
EP05809005A Withdrawn EP1800493A4 (en) 2004-10-16 2005-10-13 Method and system for encoding/decoding multi-view video based on layered-depth image

Country Status (3)

Country Link
EP (1) EP1800493A4 (en)
KR (1) KR100714068B1 (en)
WO (1) WO2006041261A1 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100780840B1 (en) * 2006-06-14 2007-11-29 광주과학기술원 A temporal prediction apparatus and method for coding multi-view video based on layer-depth image
KR101965781B1 (en) 2007-04-12 2019-04-05 돌비 인터네셔널 에이비 Tiling in video encoding and decoding
US8953684B2 (en) 2007-05-16 2015-02-10 Microsoft Corporation Multiview coding with geometry-based disparity prediction
KR100943634B1 (en) * 2008-08-07 2010-02-24 한국전자통신연구원 Method and apparatus for free-viewpoint video contents offering according to scalable depth coding
EP2328337A4 (en) * 2008-09-02 2011-08-10 Huawei Device Co Ltd 3d video communicating means, transmitting apparatus, system and image reconstructing means, system
TWI542190B (en) * 2008-11-04 2016-07-11 皇家飛利浦電子股份有限公司 Method and system for encoding a 3d image signal, encoded 3d image signal, method and system for decoding a 3d image signal
EP2348732A4 (en) 2008-11-10 2012-05-09 Lg Electronics Inc Method and device for processing a video signal using inter-view prediction
US8760495B2 (en) 2008-11-18 2014-06-24 Lg Electronics Inc. Method and apparatus for processing video signal
US9036714B2 (en) * 2009-01-26 2015-05-19 Thomson Licensing Frame packing for video coding
WO2011094019A1 (en) 2010-01-29 2011-08-04 Thomson Licensing Block-based interleaving
US8428372B2 (en) * 2010-04-09 2013-04-23 The Boeing Company Method, apparatus and computer program product for compressing data
JPWO2012114975A1 (en) * 2011-02-24 2014-07-07 ソニー株式会社 Image processing apparatus and image processing method
KR101811637B1 (en) 2011-05-02 2017-12-26 삼성전자주식회사 Apparatus and method for selective binnig pixel
CN106210722B (en) * 2016-07-08 2019-06-25 上海大学 The coding method of depth of seam division video residual error layer data based on HEVC

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6487304B1 (en) * 1999-06-16 2002-11-26 Microsoft Corporation Multi-view approach to motion and stereo
KR20030056997A (en) * 2001-12-28 2003-07-04 엘지전자 주식회사 Method for Compressing Stero-Image
KR100828353B1 (en) * 2003-02-05 2008-05-08 삼성전자주식회사 Method for dividing the image block and Apparatus thereof

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
"Description of Exploration Experiments in 3DAV", 62. MPEG MEETING;21-10-2002 - 25-10-2002; SHANGHAI; (MOTION PICTUREEXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. N5169, 25 October 2002 (2002-10-25), XP030012488, ISSN: 0000-0362 *
ISO/IEC JTC 1/SC 29/WG 11: "Table of the the Registered Contributions to the MPEG Meeting #70, Palma, ES", , 8 May 2005 (2005-05-08), XP002682316, Retrieved from the Internet: URL:http://mpeg.nist.gov/reg/_list70.php [retrieved on 2005-05-08] *
See also references of WO2006041261A1 *
YOON SUNG-YEOL KIM DAEHEE KIM SUKHEE CHO KUGJIN YUN CHUNGHYUN AHN SOOIN LEE: "Coding of Layered Depth Image using Coherency between Point Samples", 70. MPEG MEETING; 18-10-2004 - 22-10-2004; PALMA DE MALLORCA; (MOTIONPICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. M11279, 12 October 2004 (2004-10-12), XP030040053, ISSN: 0000-0253 *
YOON SUNG-YEOL KIM DAEHEE KIM SUKHEE CHO KUGJIN YUN CHUNGHYUN AHN SOOIN LEE: "Multi-view Video Coding using Layered Depth Image", 70. MPEG MEETING; 18-10-2004 - 22-10-2004; PALMA DE MALLORCA; (MOTIONPICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. M11278, 12 October 2004 (2004-10-12), XP030040052, ISSN: 0000-0253 *

Also Published As

Publication number Publication date
EP1800493A4 (en) 2012-10-10
KR20060053268A (en) 2006-05-19
KR100714068B1 (en) 2007-05-02
WO2006041261A1 (en) 2006-04-20

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20070426

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

DAX Request for extension of the european patent (deleted)
RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTIT

A4 Supplementary search report drawn up and despatched

Effective date: 20120912

RIC1 Information provided on ipc code assigned before grant

Ipc: H04N 7/26 20060101AFI20120831BHEP

Ipc: H04N 7/32 20060101ALI20120831BHEP

Ipc: G06T 15/20 20110101ALI20120831BHEP

Ipc: H04N 13/00 20060101ALI20120831BHEP

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Effective date: 20130407