WO2014106346A1 - Method of signalling additional collocated picture for 3dvc - Google Patents

Method of signalling additional collocated picture for 3dvc

Publication number
WO2014106346A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture
collocated
pictures
collocated picture
additional
Prior art date
Application number
PCT/CN2013/070172
Other languages
French (fr)
Inventor
Kai Zhang
Jicheng An
Original Assignee
Mediatek Singapore Pte. Ltd.
Priority date
Filing date
Publication date
Application filed by Mediatek Singapore Pte. Ltd.
Priority to PCT/CN2013/070172
Publication of WO2014106346A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/58 Motion compensation with long-term prediction, i.e. the reference frame for a current frame not being the temporally closest one
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/573 Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

In the current HTM, the second collocated picture is adopted into the DV derivation process. By checking additional temporal neighboring blocks in the second collocated picture, the DV derivation process becomes more efficient. Currently, the second collocated picture is derived implicitly following the same rule both on the encoder and on the decoder. This implicit signaling method works well in the common test condition. However, it has some drawbacks such as restricting the flexibility of the encoder. In this contribution, it is proposed to signal the second collocated picture explicitly in the slice header, just like the first collocated picture. This explicit signaling method can make the standard more flexible and reliable.

Description

METHOD OF SIGNALLING ADDITIONAL COLLOCATED PICTURE FOR 3DVC
FIELD OF INVENTION
The invention relates generally to three-dimensional (3D) video processing. In particular, the present invention relates to an improvement of the disparity vector (DV) derivation process in 3D video coding.
BACKGROUND OF THE INVENTION
3D video coding is developed for video data of multiple views captured simultaneously by several cameras. Since the cameras capture the same scene from different viewpoints, multi-view video data contains a large amount of inter-view redundancy. To exploit this redundancy, the disparity vector (DV) plays a very important role in the current HTM [1]. Besides being used directly in inter-view prediction, the DV is also used in inter-view motion vector (MV) prediction and inter-view residual prediction. In the latter two cases, the DV is estimated rather than sent from the encoder to the decoder. In the current HTM, the estimated DV is derived from the DVs of spatial and temporal neighboring blocks [2]. DV-MCP [3], which adds DVs used for inter-view MV prediction in neighboring merge blocks to the DV derivation process, has also been adopted into the current HTM.
To make fuller use of temporal neighboring blocks, HTM uses two collocated pictures in the DV derivation process [4]. Temporal neighboring blocks in the two collocated pictures are checked in sequence to find a candidate DV estimate. The first collocated picture is the same as the one used in the Temporal Motion Vector Prediction (TMVP) process. The second collocated picture is another reference picture, different from the first. Unlike the first collocated picture, which is signaled explicitly in the slice header, the second collocated picture is derived implicitly at both the encoder and the decoder by following the same rule.
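The check over the two collocated pictures can be illustrated with a minimal sketch. It assumes that each collocated picture exposes its temporal neighboring blocks already in checking order and that a block carries a DV only when it was coded with one. The names `Block`, `Picture`, `temporal_neighbors` and `find_temporal_dv` are hypothetical and are not taken from HTM.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

DV = Tuple[int, int]  # (horizontal, vertical) disparity; units are an assumption


@dataclass
class Block:
    # Hypothetical: dv is set only when the block was coded with a disparity vector.
    dv: Optional[DV] = None


@dataclass
class Picture:
    poc: int
    # Hypothetical: temporal neighboring blocks of the current block, in checking order.
    temporal_neighbors: List[Block] = field(default_factory=list)


def find_temporal_dv(collocated_pictures: List[Picture]) -> Optional[DV]:
    """Return the first DV found, checking the first collocated picture before the second."""
    for picture in collocated_pictures:
        for block in picture.temporal_neighbors:
            if block.dv is not None:
                return block.dv
    return None  # no temporal candidate; other candidates would be tried elsewhere


first = Picture(poc=8, temporal_neighbors=[Block(), Block()])
second = Picture(poc=0, temporal_neighbors=[Block(), Block(dv=(-4, 0))])
candidate = find_temporal_dv([first, second])
```

In this example the first collocated picture yields no DV, so the candidate comes from the second collocated picture.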
The second collocated picture is derived by scanning the reference picture lists in ascending order of reference picture index, and is added to the candidate list as follows:
1) A random access point (RAP) is searched for in the reference picture lists. If one is found, the RAP is placed into the candidate list as the second collocated picture and the derivation process is complete. If no RAP is available for the current picture, go to step 2).
2) The picture with the lowest temporalID (TID) is searched for and placed into the candidate list of temporal pictures as the second entry.
3) If multiple pictures share the same lowest TID, the one with the smaller picture order count (POC) difference from the current picture is chosen.
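The three steps above can be sketched as follows. This is a minimal illustration, not the HTM implementation: the type `RefPicture`, the function `derive_second_collocated`, the scan order (list 0 before list 1) and the exclusion of the first collocated picture from the candidates are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RefPicture:
    poc: int              # picture order count
    temporal_id: int      # temporalID from the NAL header
    is_rap: bool = False  # True for a random access point


def derive_second_collocated(current_poc: int,
                             ref_lists: List[List[RefPicture]],
                             first_collocated: Optional[RefPicture]) -> Optional[RefPicture]:
    """Implicitly derive the second collocated picture (illustrative sketch)."""
    # Scan each list in ascending reference index; list 0 before list 1 is an assumption.
    candidates = [p for ref_list in ref_lists for p in ref_list if p != first_collocated]
    if not candidates:
        return None
    # Step 1: a RAP, if present, is chosen and the derivation stops.
    for picture in candidates:
        if picture.is_rap:
            return picture
    # Steps 2 and 3: lowest temporalID, ties broken by the smaller POC distance.
    return min(candidates, key=lambda p: (p.temporal_id, abs(p.poc - current_poc)))


a = RefPicture(poc=4, temporal_id=1)
b = RefPicture(poc=8, temporal_id=0)
c = RefPicture(poc=2, temporal_id=0)
second = derive_second_collocated(6, [[a, c], [b]], first_collocated=a)
```

Here no RAP is present, `b` and `c` share the lowest temporalID, and `b` is chosen because its POC is closer to the current picture. The sketch also shows why the decoder has to keep the temporalID of every reference picture in order to run the rule.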
This rule is quite reasonable under the common test condition, since a reference picture that is likely to contain more DVs is given higher priority to be chosen as the second collocated picture. However, such an implicit signalling method has some drawbacks.
First, the implicit signalling method restricts the flexibility of the encoder. The rule is designed for the common test condition, which uses the hierarchical-B coding structure. In real applications, it is hard to guarantee that a fixed rule will always adapt well to a variety of coding structures and coding contents.
Second, the implicit signalling method increases the complexity of the decoder. As overhead, the decoder must execute the algorithm of the rule for each slice and store the temporalIDs of all reference pictures.
Third, the implicit signalling method uses the syntax element temporalID in the NAL header, which is normally used for temporal scalability. If no coding tool depends on temporalID, an extractor or a transcoder can change temporalID directly according to application requirements, without re-encoding the bit-stream. The rule, however, imposes a dependency between temporalID and the coding process. Under such a rule, temporalID cannot be changed without re-encoding the bit-stream.
Finally, the implicit signalling method is not friendly to error concealment. If a reference picture is lost, it is hard to determine whether the lost picture is the second collocated picture, since its temporalID is lost as well. This information may be useful for error concealment of the current picture.
SUMMARY OF THE INVENTION
In light of the problems described above, an explicit signalling method for the second collocated picture is proposed. The second collocated picture is signalled explicitly in the slice header, just like the first collocated picture. This explicit signalling method can make the standard more flexible and reliable.
Other aspects and features of the invention will become apparent to those with ordinary skill in the art upon review of the following descriptions of specific embodiments.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention can be more fully understood by reading the subsequent detailed description and examples with references made to the accompanying drawings, where: Fig. 1 is a diagram illustrating how to signal the second collocated picture explicitly from the encoder to the decoder when the second collocated picture is a reference picture.
DETAILED DESCRIPTION
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
For a slice where DV derivation is required, the second collocated picture is signalled in the slice header just like the first collocated picture. Concretely, a reference index syntax element is signalled. In addition, a reference list element is signalled for a B-slice. If the second collocated picture refers to the same picture as the first collocated picture, the second collocated picture is considered unavailable.
An example of how to signal the second collocated picture in the slice header is depicted in Table 1.
Table 1 (slice header syntax example; reproduced as an image in the original publication)
Fig. 1 shows an example of how to signal the second collocated picture explicitly from the encoder to the decoder when the second collocated picture is a reference picture.
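The decoder-side handling of the explicitly signalled second collocated picture can be sketched as below. The sketch works on already parsed values rather than on a bit-stream, and it identifies pictures by POC for simplicity. The function name `resolve_second_collocated` and the parameter names `second_ref_idx` and `second_from_l0` are hypothetical; they are not the syntax element names of Table 1.

```python
from typing import Optional, Sequence


def resolve_second_collocated(slice_type: str,
                              ref_lists: Sequence[Sequence[int]],
                              first_collocated_poc: int,
                              second_ref_idx: int,
                              second_from_l0: bool = True) -> Optional[int]:
    """Map the signalled list and index to a picture (by POC), or None if unavailable."""
    # Only a B-slice has two reference lists, so the list element matters only there.
    list_idx = 0
    if slice_type == "B":
        list_idx = 0 if second_from_l0 else 1
    poc = ref_lists[list_idx][second_ref_idx]
    # Same picture as the first collocated picture: treated as unavailable.
    if poc == first_collocated_poc:
        return None
    return poc


list0 = [4, 0]
list1 = [8, 16]
picked = resolve_second_collocated("B", [list0, list1], first_collocated_poc=4,
                                   second_ref_idx=1, second_from_l0=False)
```

Compared with the implicit rule, the decoder performs a single table lookup and needs no temporalID of any reference picture.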
The proposed method described above can be used in a video encoder as well as in a video decoder. Embodiments of methods according to the present invention as described above may be implemented in various hardware, software code, or a combination of both. For example, an embodiment of the present invention can be a circuit integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and in different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software code, and other means of configuring code to perform the tasks in accordance with the invention, will not depart from the spirit and scope of the invention.
The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. To the contrary, it is intended to cover various modifications and similar arrangements (as would be apparent to those skilled in the art). Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.

Claims

1. A method of signaling additional collocated pictures which are used in a disparity vector (DV) derivation process.
2. The method as claimed in claim 1, wherein temporal neighboring blocks in the additional collocated pictures signaled are checked orderly to find a valid DV candidate in the DV derivation process.
3. The method as claimed in claim 1, wherein the additional collocated pictures signaled can be the same as or different from a collocated picture used in Temporal Motion Vector Prediction (TMVP).
4. The method as claimed in claim 1, wherein the number of additional collocated pictures signaled which are used in the DV derivation process can be one, two, or more.
5. The method as claimed in claim 1, wherein the additional collocated pictures are signaled in the video coding bit-stream or in other ways.
6. The method as claimed in claim 1, wherein the additional collocated pictures are signaled in a video parameter set, sequence parameter set, picture parameter set, picture header, slice header, macro-block, coding unit, prediction unit, or any other position in the video coding bit-stream.
7. The method as claimed in claim 1, wherein one additional collocated picture signaled is not used in the DV derivation process if it is the same picture as a collocated picture used in TMVP.
8. The method as claimed in claim 1, wherein the additional collocated pictures signaled must be reference pictures for a current picture.
9. The method as claimed in claim 1, wherein one additional collocated picture is signaled by sending the reference list it is in and a reference index which represents its position in the list.
10. The method as claimed in claim 1, wherein a reference list is not sent if there is only one reference list for a current picture; and the additional collocated picture must be in that list.
11. The method as claimed in claim 1, wherein a reference index is not sent if there is only one reference picture in the list where one additional collocated picture is in; and the additional collocated picture must be that reference picture in that list.
12. The method as claimed in claim 1, wherein no additional collocated picture needs to be signaled if there are only two kinds of reference pictures for the current picture, wherein the two kinds of reference pictures include the collocated picture used in TMVP and inter-view reference pictures.
13. The method as claimed in claim 1, wherein a reference list is not sent if there is only one reference list possessing reference pictures which are neither the collocated picture used in TMVP nor inter-view reference pictures.
14. The method as claimed in claim 1, wherein a reference index is not sent if there is only one reference picture satisfying the following condition in the list where one additional collocated picture is in; the condition is that the reference picture is neither the collocated picture used in TMVP nor an inter-view reference picture.
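For illustration only, the conditional signalling of claims 12 to 14 can be sketched as an encoder-side decision about which elements need to be sent. The type `Ref`, the function `elements_to_signal` and the element names `ref_list` and `ref_idx` are hypothetical and are not part of the claims.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Ref:
    poc: int
    is_inter_view: bool = False


def elements_to_signal(ref_lists: List[List[Ref]], tmvp_collocated: Ref,
                       chosen_list: int, chosen_idx: int) -> Dict[str, int]:
    """Return only the elements the decoder cannot infer (illustrative sketch)."""

    def eligible(picture: Ref) -> bool:
        # Neither the TMVP collocated picture nor an inter-view reference picture.
        return not picture.is_inter_view and picture != tmvp_collocated

    lists_with_eligible = [i for i, ref_list in enumerate(ref_lists)
                           if any(eligible(p) for p in ref_list)]
    if not lists_with_eligible:
        return {}  # claim 12: nothing needs to be signalled
    if chosen_list not in lists_with_eligible:
        raise ValueError("chosen list holds no eligible reference picture")
    elements: Dict[str, int] = {}
    if len(lists_with_eligible) > 1:
        elements["ref_list"] = chosen_list  # claim 13: omitted when only one list qualifies
    eligible_count = sum(1 for p in ref_lists[chosen_list] if eligible(p))
    if eligible_count > 1:
        elements["ref_idx"] = chosen_idx    # claim 14: omitted when only one picture qualifies
    return elements


tmvp = Ref(4)
inter_view = Ref(6, is_inter_view=True)
a, b = Ref(0), Ref(8)
sent = elements_to_signal([[tmvp, a], [b, inter_view]], tmvp, chosen_list=1, chosen_idx=0)
```

In this example both lists hold an eligible picture, so the list is sent, while the index is inferred because list 1 holds only one eligible picture.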
PCT/CN2013/070172 2013-01-07 2013-01-07 Method of signalling additional collocated picture for 3dvc WO2014106346A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2013/070172 WO2014106346A1 (en) 2013-01-07 2013-01-07 Method of signalling additional collocated picture for 3dvc

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2013/070172 WO2014106346A1 (en) 2013-01-07 2013-01-07 Method of signalling additional collocated picture for 3dvc

Publications (1)

Publication Number Publication Date
WO2014106346A1 (en) 2014-07-10

Family

ID=51062148

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/070172 WO2014106346A1 (en) 2013-01-07 2013-01-07 Method of signalling additional collocated picture for 3dvc

Country Status (1)

Country Link
WO (1) WO2014106346A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101056398A (en) * 2006-03-29 2007-10-17 清华大学 A method and decoding and encoding method for capturing the video difference vector in the multi-video coding process
CN101999228A (en) * 2007-10-15 2011-03-30 诺基亚公司 Motion skip and single-loop encoding for multi-view video content
WO2012108700A2 (en) * 2011-02-09 2012-08-16 엘지전자 주식회사 Method for encoding and decoding image and device using same
US20120287999A1 (en) * 2011-05-11 2012-11-15 Microsoft Corporation Syntax element prediction in error correction

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016165617A1 (en) * 2015-04-14 2016-10-20 Mediatek Singapore Pte. Ltd. Method and apparatus for deriving temporal motion vector prediction
US10412406B2 (en) 2015-04-14 2019-09-10 Mediatek Singapore Pte. Ltd. Method and apparatus for deriving temporal motion vector prediction
CN112655214A (en) * 2018-04-12 2021-04-13 艾锐势有限责任公司 Motion information storage for video coding and signaling

Similar Documents

Publication Publication Date Title
WO2016165069A1 (en) Advanced temporal motion vector prediction in video coding
WO2015003383A1 (en) Methods for inter-view motion prediction
WO2015109598A1 (en) Methods for motion parameter hole filling
WO2016008157A1 (en) Methods for motion compensation using high order motion model
TW200843513A (en) Methods and apparatus for improved signaling using high level syntax for multi-view video coding and decoding
WO2014166068A1 (en) Refinement of view synthesis prediction for 3-d video coding
WO2016008161A1 (en) Temporal derived bi-directional motion vector predictor
WO2015100710A1 (en) Existence of inter-view reference picture and availability of 3dvc coding tools
WO2015006967A1 (en) Simplified view synthesis prediction for 3d video coding
KR20160132935A (en) Method for depth lookup table signaling in 3d video coding based on high efficiency video coding standard
WO2014005280A1 (en) Method and apparatus to improve and simplify inter-view motion vector prediction and disparity vector prediction
US10499075B2 (en) Method for coding a depth lookup table
KR20160147069A (en) Method and apparatus of spatial motion vector prediction derivation for direct and skip modes in three-dimensional video coding
CA2896132C (en) Method and apparatus of compatible depth dependent coding
WO2015135175A1 (en) Simplified depth based block partitioning method
WO2015006922A1 (en) Methods for residual prediction
WO2014166109A1 (en) Methods for disparity vector derivation
WO2014106346A1 (en) Method of signalling additional collocated picture for 3dvc
WO2014029086A1 (en) Methods to improve motion vector inheritance and inter-view motion prediction for depth map
WO2013159326A1 (en) Inter-view motion prediction in 3d video coding
WO2014023024A1 (en) Methods for disparity vector derivation
CA2921759C (en) Method of motion information prediction and inheritance in multi-view and three-dimensional video coding
WO2014106327A1 (en) Method and apparatus for inter-view residual prediction in multiview video coding
WO2014166096A1 (en) Reference view derivation for inter-view motion prediction and inter-view residual prediction
WO2015103747A1 (en) Motion parameter hole filling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13869918

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13869918

Country of ref document: EP

Kind code of ref document: A1