WO2015051498A1 - Methods for view synthesis prediction - Google Patents
- Publication number
- WO2015051498A1 (PCT/CN2013/084849)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- block
- processed
- current
- depth
- partition structure
- Prior art date
Links
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/161—Encoding, multiplexing or demultiplexing different image signal components
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Methods of view synthesis prediction for multi-view video coding and 3D video coding are disclosed. It is proposed to determine a partition size in view synthesis prediction only once for a current prediction unit (PU), and a partition structure of the current PU is pre-determined before the prediction procedure with view synthesis prediction (VSP).
Description
METHODS FOR VIEW SYNTHESIS PREDICTION
TECHNICAL FIELD
[0001] The invention relates generally to Three-Dimensional (3D) video processing. In particular, the present invention relates to optimized methods for view synthesis prediction in 3D video coding.
BACKGROUND
[0002] 3D video coding has been developed for encoding or decoding video data of multiple views simultaneously captured by several cameras. Since all cameras capture the same scene from different viewpoints, multi-view video data contains a large amount of inter-view redundancy. To exploit the inter-view redundancy, additional tools such as view synthesis prediction (VSP) have been integrated into the conventional 3D-HEVC (High Efficiency Video Coding) codec.
[0003] The basic concept of VSP in the current 3DV-HTM is illustrated in Fig. 1(a). VSP first locates the reconstructed depth of the reference view, referred to as the virtual depth, for the current PU using the neighboring block disparity vector (NBDV). Prediction signals are then generated with a disparity vector derived from the virtual depth for each 8x8 partition in the PU.
[0004] An adaptive method proposed in JCVTC-E0207 is adopted to partition the PU in VSP. Each 8x8 block can be independently partitioned into two 8x4 partitions or two 4x8 partitions, as depicted in Fig. 2. To determine the partition orientation, a determination process is conducted for each 8x8 block in the PU. The current design involves two problems.
[0005] Firstly, the determination process is invoked several times for a PU when the PU size is larger than 8x8. For example, if the PU size is 64x64, the determination process will be invoked 64 times.
[0006] Secondly, the memory access pattern is irregular over a PU, which is unfriendly to parallel processing.
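The invocation count in the first problem follows directly from the block geometry; a minimal sketch (the function name is an illustration, not part of any codec API):

```python
def num_partition_decisions(pu_w, pu_h):
    """Number of times the per-8x8 partition decision of JCVTC-E0207 is
    invoked for a PU: one decision per 8x8 block contained in the PU."""
    return (pu_w // 8) * (pu_h // 8)
```

For a 64x64 PU this yields 64 invocations, matching the example above.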
SUMMARY
[0007] It is proposed to determine the partition size only once for a PU in VSP by checking whether the following two cases are satisfied at the same time, as shown in Fig. 1(b).
[0008] (1) The value of the left-top pixel in the virtual depth of the current prediction unit (PU) is larger than that of the right-bottom pixel in the virtual depth of the current PU. (2) The value of the right-top pixel in the virtual depth of the current PU is larger than that of the left-bottom pixel in the virtual depth of the current PU. Fig. 3 presents the positions of the four corner pixels.
[0009] If both cases are satisfied, a variable horSplitFlag is set to 1 and the whole PU is partitioned into 8x4 sub-blocks/partitions. Otherwise, horSplitFlag is set to 0 and the whole PU is partitioned into 4x8 sub-blocks/partitions. Fig. 4 illustrates the relationship between horSplitFlag and the two kinds of partition structures.
[0010] Other aspects and features of the invention will become apparent to those with ordinary skill in the art upon review of the following descriptions of specific embodiments.
BRIEF DESCRIPTION OF DRAWINGS
[0011] The invention can be more fully understood by reading the subsequent detailed description and examples with references made to the accompanying drawings, wherein:
[0012] Fig. 1(a) is a diagram illustrating the basic concept of VSP in current HTM;
[0013] Fig. 1(b) is a diagram illustrating the optimized VSP procedure for HTM.
[0014] Fig. 2 is a diagram illustrating a worst-case partition structure of JCVTC-E0207 according to an embodiment of the invention.
[0015] Fig. 3 is a diagram illustrating the corner positions of the four pixels used to determine the PU partition for VSP.
[0016] Fig. 4 is a diagram illustrating the proposed two kinds of partition structures for the whole PU.
DETAILED DESCRIPTION
[0017] The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
[0018] The proposed method of determining the partitions for VSP of a PU is illustrated in Fig. 4. The procedure of the proposed view synthesis prediction (VSP) is described in the following:
[0019] Firstly, the neighboring block disparity vector (NBDV) is derived for the current PU, as shown in step 1 of Fig. 1(b).
[0020] Secondly, the virtual depth corresponding to the current prediction unit is obtained from the reconstructed depth of the reference view by using the NBDV, as shown in step 2 of Fig. 1(b).
[0021] Thirdly, four corner points are selected from the virtual depth of the current processed PU and marked as refDepPels[LT], refDepPels[RB], refDepPels[RT] and refDepPels[LB], as shown in Fig. 3.
[0022] Fourthly, a variable horSplitFlag is calculated as horSplitFlag = ((refDepPels[LT] > refDepPels[RB]) == (refDepPels[RT] > refDepPels[LB])).
[0023] Fifthly, the current PU is divided into WxH sub-blocks in step 3 of Fig. 1(b), where W equals 8 >> (1 - horSplitFlag) and H equals 8 >> horSplitFlag. If horSplitFlag equals 1, the left side of Fig. 4 shows the 8x4 partition structure for the current PU. Otherwise, the right side of Fig. 4 presents the 4x8 partition structure for the current PU.
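The fourth and fifth steps can be sketched together as follows. This is a minimal illustration assuming the four corner samples are supplied as a dictionary; the helper name is hypothetical:

```python
def vsp_partition_size(ref_dep_pels):
    """Derive horSplitFlag from the four corner samples of the virtual depth
    (keys 'LT', 'RT', 'LB', 'RB'), then the sub-block size:
    W = 8 >> (1 - horSplitFlag), H = 8 >> horSplitFlag."""
    p = ref_dep_pels
    hor_split_flag = 1 if (p['LT'] > p['RB']) == (p['RT'] > p['LB']) else 0
    w = 8 >> (1 - hor_split_flag)
    h = 8 >> hor_split_flag
    return hor_split_flag, w, h
```

With horSplitFlag equal to 1 the PU is covered by 8x4 sub-blocks; otherwise by 4x8 sub-blocks, as in Fig. 4.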
[0024] Finally, for each WxH sub-block: (1) the obtained virtual depth is utilized to derive the corresponding disparity vector; (2) the predicted data is obtained by using the disparity vector, as step 4 of Fig. 1(b) shows; (3) the prediction residual is generated at the encoder and compensated to the predicted data in the decoding process.
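The final per-sub-block loop can be sketched as below. The depth-to-disparity mapping depends on camera parameters that the description does not specify, so it is passed in as a caller-supplied function, and taking the maximum depth sample per sub-block is an illustrative choice rather than a requirement of the text:

```python
def vsp_derive_disparities(pu_w, pu_h, w, h, virtual_depth, depth_to_disparity):
    """For each WxH sub-block of a pu_w x pu_h PU, derive one disparity value
    from the sub-block's virtual-depth samples (here: their maximum)."""
    dvs = {}
    for y in range(0, pu_h, h):
        for x in range(0, pu_w, w):
            samples = [virtual_depth[j][i]
                       for j in range(y, y + h)
                       for i in range(x, x + w)]
            # One disparity vector per sub-block, keyed by its top-left corner.
            dvs[(x, y)] = depth_to_disparity(max(samples))
    return dvs
```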
[0025] The proposed method described above can be used in a video encoder as well as in a video decoder. Embodiments of the method according to the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be a circuit integrated into a video compression chip or program codes integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program codes to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software code, and other means of configuring code to perform the tasks in accordance with the invention, will not depart from the spirit and scope of the invention.
[0026] The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. To the contrary, it is intended to cover various modifications and similar arrangements (as would be apparent to those skilled in the art). Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.
Claims
1. A method to decide the partition for view synthesis prediction (VSP) in multi-view video coding or 3D video coding, comprising:
pre-determining a partition structure of a current prediction unit (PU) before a prediction procedure with VSP.
2. The method as claimed in claim 1, wherein the partition structure is determined by directly using inside pixels of a depth block corresponding to the current prediction unit (PU) or a current coding unit (CU) containing the current PU.
3. The method as claimed in claim 2, wherein the depth block is a corresponding virtual depth block derived from a depth map of another reference view.
4. The method as claimed in claim 2, wherein the depth block corresponding to the current prediction unit is obtained from depth of a reference view by using neighboring block disparity vector (NBDV).
5. The method as claimed in claim 1, wherein the partition structure is determined by using four corner pixels of a virtual depth corresponding to the current PU or a current CU containing the current PU.
6. The method as claimed in claim 1, wherein the partition structure is determined block by block by using inside pixels, and for each processed block, at least one of width or height of each processed block is larger than 8.
7. The method as claimed in claim 6, wherein a total number of the processed blocks is larger than 0 but less than the number of 8x8 blocks.
8. The method as claimed in claim 6, wherein the width and height of each processed block are different or the same, and said processed block possesses the same size as a prediction unit (PU).
9. The method as claimed in claim 6, wherein the partition structure of each processed block is determined by its corner pixels of a depth block corresponding to the processed block.
10. The method as claimed in claim 6, wherein the partition structure of each processed block is determined by four corner pixels of a depth block corresponding to the processed block.
11. The method as claimed in claim 6, wherein the partition structure of each processed block is determined by detecting whether the following two cases are satisfied: (1) the value of a left-top pixel is larger than that of a right-bottom pixel; and (2) the value of a right-top pixel is larger than that of a left-bottom pixel.
12. The method as claimed in claim 11, wherein if the two detected cases are satisfied at the same time, the processed block is equally divided into multiple partitions of the same size, in which the width is larger than the height and the size is selected from 8x4, 16x8, 16x4, 32x16, 32x8 and 32x4; otherwise, the processed block is equally divided into multiple partitions of the same size, in which the height is larger than the width and the size is selected from 4x8, 8x16, 4x16, 16x32, 8x32 and 4x32.
13. The method as claimed in claim 6, wherein the partition structure of each processed block is determined by four centric pixels respectively belonging to an upper part, a below part, a left part and a right part.
14. The method as claimed in claim 6, wherein the partition structure of each processed block is determined by comparing two values, UDD and LRD, wherein UDD is an absolute difference between the centric pixel of an upper part and that of a below part, and LRD is an absolute difference between the centric pixel of a left part and that of a right part.
15. The method as claimed in claim 14, wherein if UDD is larger than LRD, the processed block is equally divided into multiple partitions of the same size, in which the width is larger than the height and the size is selected from 8x4, 16x8, 16x4, 32x16, 32x8 and 32x4; otherwise, the processed block is equally divided into multiple partitions of the same size, in which the height is larger than the width and the size is selected from 4x8, 8x16, 4x16, 16x32, 8x32 and 4x32.
16. The method as claimed in claim 6, wherein the processed block is a region of the current PU being processed.
17. The method as claimed in claim 6, wherein the processed block is a region of a current coding unit (CU), current largest coding unit (LCU), current coding tree block (CTB), or current coding tree unit (CTU).
18. The method as claimed in claim 1, wherein VSP is used to obtain predicted data for each partition after the partition structure is determined.
19. The method as claimed in claim 18, wherein VSP obtains virtual depth from reconstructed depth of a reference view for each partition of the current prediction unit being processed.
20. The method as claimed in claim 18, wherein VSP obtains a disparity vector from virtual depth for each partition of the current prediction unit being processed.
21. The method as claimed in claim 18, wherein VSP obtains predicted data by using a disparity vector for each partition of the current prediction unit being processed.
22. The method as claimed in claim 1, wherein a flag is transmitted in a sequence, view, picture, or slice level to enable or disable the method for partition structure determination of VSP.
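The centric-pixel criterion of claims 13-15 can be sketched as a small comparison; the parameter names and the 'wide'/'tall' labels are illustrative assumptions, not terms used by the claims:

```python
def split_by_centric_pixels(upper, lower, left, right):
    """Compare UDD (absolute difference of the upper/lower centric samples)
    with LRD (absolute difference of the left/right centric samples).
    UDD > LRD selects partitions with width larger than height ('wide');
    otherwise partitions with height larger than width ('tall')."""
    udd = abs(upper - lower)
    lrd = abs(left - right)
    return 'wide' if udd > lrd else 'tall'
```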
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2013/084849 WO2015051498A1 (en) | 2013-10-08 | 2013-10-08 | Methods for view synthesis prediction |
CN201410496046.7A CN104284194B (en) | 2013-10-08 | 2014-09-24 | Utilize View synthesis predictive coding or the method and device of decoding three-dimensional or multi-view video |
US14/503,427 US9906813B2 (en) | 2013-10-08 | 2014-10-01 | Method of view synthesis prediction in 3D video coding |
GB1417716.6A GB2520615B (en) | 2013-10-08 | 2014-10-07 | Method of view synthesis prediction in 3D video coding |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2013/084849 WO2015051498A1 (en) | 2013-10-08 | 2013-10-08 | Methods for view synthesis prediction |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/503,427 Continuation-In-Part US9906813B2 (en) | 2013-10-08 | 2014-10-01 | Method of view synthesis prediction in 3D video coding |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015051498A1 true WO2015051498A1 (en) | 2015-04-16 |
Family
ID=52812424
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2013/084849 WO2015051498A1 (en) | 2013-10-08 | 2013-10-08 | Methods for view synthesis prediction |
Country Status (2)
Country | Link |
---|---|
GB (1) | GB2520615B (en) |
WO (1) | WO2015051498A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111385585A (en) * | 2020-03-18 | 2020-07-07 | 北京工业大学 | 3D-HEVC depth map coding unit division fast decision method based on machine learning |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012128068A1 (en) * | 2011-03-18 | 2012-09-27 | ソニー株式会社 | Image processing device, image processing method, and program |
CA2844593A1 (en) * | 2011-08-09 | 2013-02-14 | Byeong-Doo Choi | Multiview video data encoding method and device, and decoding method and device |
US20130039417A1 (en) * | 2011-08-08 | 2013-02-14 | General Instrument Corporation | Residual tree structure of transform unit partitioning |
Also Published As
Publication number | Publication date |
---|---|
GB2520615A (en) | 2015-05-27 |
GB2520615B (en) | 2017-02-08 |
GB201417716D0 (en) | 2014-11-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11234002B2 (en) | Method and apparatus for encoding and decoding a texture block using depth based block partitioning | |
JP7126329B2 (en) | Efficient Prediction Using Partition Coding | |
JP6501808B2 (en) | Efficient partition coding with high degree of freedom | |
EP2721823B1 (en) | Method and apparatus of texture image compression in 3d video coding | |
WO2015176678A1 (en) | Method of intra block copy with flipping for image and video coding | |
WO2015192781A1 (en) | Method of sub-pu syntax signaling and illumination compensation for 3d and multi-view video coding | |
WO2015003383A1 (en) | Methods for inter-view motion prediction | |
EP2898688B1 (en) | Method and apparatus for deriving virtual depth values in 3d video coding | |
CA2964642A1 (en) | Method of improved directional intra prediction for video coding | |
WO2015062002A1 (en) | Methods for sub-pu level prediction | |
WO2015006951A1 (en) | Methods for fast encoder decision | |
WO2015123806A1 (en) | Methods for depth based block partitioning | |
US9596484B2 (en) | Method of depth intra prediction using depth map modelling | |
CA2966862A1 (en) | Method and apparatus of alternative transform for video coding | |
WO2019091292A1 (en) | Method and apparatus for intra prediction fusion in image and video coding | |
JP5986657B2 (en) | Simplified depth-based block division method | |
WO2015139762A1 (en) | An apparatus and a method for associating a video block partitioning pattern to a video coding block | |
KR20160070815A (en) | A method for determining a corner video part of a partition of a video coding block | |
WO2015051498A1 (en) | Methods for view synthesis prediction | |
CN104284194B (en) | Utilize View synthesis predictive coding or the method and device of decoding three-dimensional or multi-view video | |
WO2016123801A1 (en) | Methods for partition mode coding | |
WO2015103747A1 (en) | Motion parameter hole filling | |
WO2015139206A1 (en) | Methods for 3d video coding | |
WO2015006899A1 (en) | A simplified dv derivation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13895348 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 13895348 Country of ref document: EP Kind code of ref document: A1 |