WO2014106346A1 - Method of signalling additional collocated picture for 3DVC

Info

Publication number: WO2014106346A1
Application number: PCT/CN2013/070172
Authority: WO
Inventors: Kai Zhang; Jicheng An
Original assignee: Mediatek Singapore Pte. Ltd.
Priority date: 2013-01-07
Filing date: 2013-01-07
Publication date: 2014-07-10

Abstract

In the current HTM, the second collocated picture is adopted into the DV derivation process. By checking additional temporal neighboring blocks in the second collocated picture, the DV derivation process becomes more efficient. Currently, the second collocated picture is derived implicitly following the same rule both on the encoder and on the decoder. This implicit signaling method works well in the common test condition. However, it has some drawbacks such as restricting the flexibility of the encoder. In this contribution, it is proposed to signal the second collocated picture explicitly in the slice header, just like the first collocated picture. This explicit signaling method can make the standard more flexible and reliable.

Description

METHOD OF SIGNALLING ADDITIONAL COLLOCATED PICTURE FOR 3DVC

FIELD OF INVENTION

The invention relates generally to Three-Dimensional (3D) video processing. In particular, the present invention relates to improving the disparity vector (DV) derivation process in 3D video coding.

BACKGROUND OF THE INVENTION

3D video coding is developed for video data of multiple views captured simultaneously by several cameras. Since the cameras capture the same scene from different viewpoints, multi-view video data contains a large amount of inter-view redundancy. To exploit this redundancy, the Disparity Vector (DV) plays a very important role in the current HTM [1]. Besides being used directly in inter-view prediction, the DV is also used in inter-view motion vector (MV) prediction and inter-view residual prediction. In the latter two cases, the DV is estimated rather than sent from the encoder to the decoder. In the current HTM, the estimated DV is derived from the DVs of spatial and temporal neighboring blocks [2]. DV-MCP [3], which adds the DVs used for inter-view MV prediction in neighboring merge blocks to the DV derivation process, has also been adopted into the current HTM.

To make fuller use of temporal neighboring blocks, HTM has adopted the use of two collocated pictures in the DV derivation process [4]. Temporal neighboring blocks in the two collocated pictures are checked in order to find a candidate DV in the DV derivation process. The first collocated picture is the same as the one used in the Temporal Motion Vector Prediction (TMVP) process. The second collocated picture is another reference picture, different from the first collocated picture. Unlike the first collocated picture, which is signaled explicitly in the slice header, the second collocated picture is derived implicitly on both the encoder and the decoder following the same rule.
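The ordered check over the two collocated pictures can be sketched as follows. This is a minimal illustration, not the HTM implementation: the function name `first_temporal_dv`, the representation of a DV as an integer pair, and the representation of each collocated picture as an ordered sequence of its temporal neighboring blocks are all assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical representation: a DV is a (horizontal, vertical) pair, and a
# temporal neighboring block without a DV is represented by None.
DV = Tuple[int, int]


def first_temporal_dv(
    collocated_pictures: Sequence[Sequence[Optional[DV]]],
) -> Optional[DV]:
    """Return the first DV found when scanning the collocated pictures in order.

    collocated_pictures[0] holds the temporal neighboring blocks of the first
    collocated picture, collocated_pictures[1] those of the second one.
    The order of blocks inside each picture is assumed to be the checking order.
    """
    for neighboring_blocks in collocated_pictures:
        for dv in neighboring_blocks:
            if dv is not None:
                return dv  # stop at the first available candidate
    return None  # no temporal neighboring block carries a DV


# The first collocated picture has no DV, so the second one supplies it.
candidate = first_temporal_dv([[None, None], [None, (3, 0)]])
```

When the second collocated picture is unavailable, the caller would simply pass a single-element sequence.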

The second collocated picture is derived by searching the reference picture lists in ascending order of reference picture index and is added to the candidate list, as follows:

1) A random access point (RAP) is searched for in the reference picture lists. If found, the RAP is placed into the candidate list as the second picture and the derivation process is completed. If no RAP is available for the current picture, go to step 2).

2) A picture with the lowest temporalID (TID) is searched for and placed into the candidate list of temporal pictures as the second entry.

3) If multiple pictures with the same lowest TID exist, the picture with the smallest POC difference from the current picture is chosen.
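The implicit rule above can be sketched as follows. This is an illustrative reading of the three steps, not the HTM code: the `RefPic` type, its field names, and the function name are hypothetical, and the traversal order across lists (list 0 first, then list 1) is an assumption, since the text only specifies ascending reference index.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RefPic:
    """Hypothetical description of a reference picture."""
    poc: int              # picture order count
    temporal_id: int      # temporalID taken from the NAL header
    is_rap: bool = False  # True if the picture is a random access point


def derive_second_collocated(
    ref_lists: Sequence[Sequence[RefPic]],
    current_poc: int,
    first_collocated: RefPic,
) -> Optional[RefPic]:
    """Implicitly derive the second collocated picture.

    Each list is scanned in ascending reference index; scanning list 0 before
    list 1 is an assumption of this sketch.
    """
    # The second collocated picture must differ from the first one.
    candidates = [p for ref_list in ref_lists for p in ref_list
                  if p != first_collocated]
    if not candidates:
        return None
    # Step 1: a random access point, if present, is chosen immediately.
    for pic in candidates:
        if pic.is_rap:
            return pic
    # Step 2: otherwise keep the pictures with the lowest temporalID.
    lowest_tid = min(p.temporal_id for p in candidates)
    lowest = [p for p in candidates if p.temporal_id == lowest_tid]
    # Step 3: among those, choose the smallest POC distance to the current picture.
    return min(lowest, key=lambda p: abs(p.poc - current_poc))


list0 = [RefPic(poc=8, temporal_id=0), RefPic(poc=4, temporal_id=1)]
list1 = [RefPic(poc=16, temporal_id=0), RefPic(poc=12, temporal_id=2)]
second = derive_second_collocated([list0, list1], current_poc=10,
                                  first_collocated=list0[0])
```

Note that the decoder needs the temporalID of every reference picture to run this search.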

This rule is quite reasonable in the common test condition since the reference picture with a high probability to possess more DVs has a high priority to be chosen as the second collocated picture. However, such an implicit signalling method has some drawbacks.

First, the implicit signalling method restricts the flexibility of the encoder. The rule is designed for the common test condition, which uses the hierarchical-B coding structure. In real applications, it is hard to guarantee that a fixed rule will always adapt well to a variety of coding structures and coding contents.

Second, the implicit signalling method increases the complexity of the decoder. As an overhead, the decoder must execute the algorithm of the rule for each slice and store the temporalIDs of all reference pictures.

Third, the implicit signalling method uses the syntax element temporalID in the NAL header, which is usually used for temporal scalability. If no coding tools depend on temporalID, an extractor or a transcoder can change temporalID directly according to application requirements, without recoding the bit-stream. The rule, however, imposes a dependency between temporalID and the coding process. Under such a rule, temporalID cannot be changed without recoding the bit-stream.

Finally, the implicit signalling method is not well suited to error concealment. If one reference picture is lost, it is hard to determine whether the lost picture is the second collocated picture, since its temporalID is also lost. This information may be useful for performing error concealment on the current picture.

SUMMARY OF THE INVENTION

In light of the previously described problems, an explicit signalling method of the second collocated picture is proposed. It is proposed to signal the second collocated picture explicitly in the slice header, just like the first collocated picture. This explicit signaling method can make the standard more flexible and reliable.

Other aspects and features of the invention will become apparent to those with ordinary skill in the art upon review of the following descriptions of specific embodiments.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention can be more fully understood by reading the subsequent detailed description and examples with references made to the accompanying drawings, where:

Fig. 1 is a diagram illustrating how to signal the second collocated picture explicitly from the encoder to the decoder when the second collocated picture is a reference picture.

DETAILED DESCRIPTION

The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.

For a slice where DV derivation is required, the second collocated picture is signalled in the slice header just like the first collocated picture. Concretely, a reference index syntax element is signaled. Moreover, a reference list element is signaled for the B-slice. If the second collocated picture refers to the same picture as the first collocated picture, the second is considered as unavailable.
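A decoder-side reading of this paragraph can be sketched as follows. It is only an illustration under stated assumptions: the function name and the parameter names `second_ref_idx` and `second_from_l0_flag` are hypothetical stand-ins for the reference index and reference list syntax elements, and reference pictures are identified here by their POC values for simplicity.

```python
from typing import Optional, Sequence


def resolve_second_collocated(
    slice_type: str,
    ref_list0: Sequence[int],
    ref_list1: Sequence[int],
    first_collocated_poc: int,
    second_ref_idx: int,
    second_from_l0_flag: bool = True,
) -> Optional[int]:
    """Map the signalled values to the POC of the second collocated picture.

    second_ref_idx and second_from_l0_flag are hypothetical names for the
    reference index and reference list elements carried in the slice header.
    Returns None when the second collocated picture is unavailable.
    """
    # The reference list element is only signalled for B-slices; other slice
    # types have a single list, so list 0 is used.
    use_list0 = second_from_l0_flag if slice_type == "B" else True
    ref_list = ref_list0 if use_list0 else ref_list1
    second_poc = ref_list[second_ref_idx]
    # Pointing at the first collocated picture means "unavailable".
    if second_poc == first_collocated_poc:
        return None
    return second_poc


# B-slice: the list element selects list 1, and index 1 selects POC 12.
second_poc = resolve_second_collocated("B", [8, 4], [16, 12],
                                       first_collocated_poc=8,
                                       second_ref_idx=1,
                                       second_from_l0_flag=False)
```

In contrast to the implicit rule, nothing here depends on the temporalID of any reference picture.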

An example of how to signal the second collocated picture in the slice header is depicted in Table 1.

Table 1

Fig. 1 shows an example of how to signal the second collocated picture explicitly from the encoder to the decoder when the second collocated picture is a reference picture.

The proposed method described above can be used in a video encoder as well as in a video decoder. Embodiments of methods according to the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be a circuit integrated into a video compression chip or program codes integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program codes to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and in different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software code, and other means of configuring code to perform the tasks in accordance with the invention, will not depart from the spirit and scope of the invention.

The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. To the contrary, it is intended to cover various modifications and similar arrangements (as would be apparent to those skilled in the art). Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.

Claims

1. A method of signaling additional collocated pictures which are used in a disparity vector (DV) derivation process.

2. The method as claimed in claim 1, wherein temporal neighboring blocks in the additional collocated pictures signaled are checked in order to find a valid DV candidate in the DV derivation process.

3. The method as claimed in claim 1, wherein the additional collocated pictures signaled can be the same as or different from a collocated picture used in Temporal Motion Vector Prediction (TMVP).

4. The method as claimed in claim 1, wherein the number of additional collocated pictures signaled which are used in the DV derivation process can be one, two, or more.

5. The method as claimed in claim 1, wherein the additional collocated pictures are signaled in the video coding bit-stream or in other ways.

6. The method as claimed in claim 1, wherein the additional collocated pictures are signaled in a video parameter set, sequence parameter set, picture parameter set, picture header, slice header, macro-block, coding unit, prediction unit, or any other position in the video coding bit-stream.

7. The method as claimed in claim 1, wherein one additional collocated picture signaled is not used in the DV derivation process if it is the same picture as a collocated picture used in TMVP.

8. The method as claimed in claim 1, wherein the additional collocated pictures signaled must be reference pictures for a current picture.

9. The method as claimed in claim 1, wherein one additional collocated picture is signaled by sending the reference list it is in and a reference index which represents its position in the list.

10. The method as claimed in claim 1, wherein a reference list is not sent if there is only one reference list for a current picture; and the additional collocated picture must be in that list.

11. The method as claimed in claim 1, wherein a reference index is not sent if there is only one reference picture in the list containing one additional collocated picture; and the additional collocated picture must be that reference picture in that list.

12. The method as claimed in claim 1, wherein no additional collocated picture needs to be signaled if there are only two kinds of reference pictures for the current picture, wherein the two kinds of reference pictures include the collocated picture used in TMVP and inter-view reference pictures.

13. The method as claimed in claim 1, wherein a reference list is not sent if there is only one reference list possessing reference pictures which are neither the collocated picture used in TMVP nor inter-view reference pictures.

14. The method as claimed in claim 1, wherein a reference index is not sent if there is only one reference picture satisfying the following condition in the list containing one additional collocated picture; the condition is that the reference picture is neither the collocated picture used in TMVP nor an inter-view reference picture.