WO2018122092A1 - Methods, apparatus, and computer programs for decoding media - Google Patents


Info

Publication number
WO2018122092A1
Authority
WO
WIPO (PCT)
Prior art keywords
slice
picture
previously decoded
current picture
slices
Prior art date
Application number
PCT/EP2017/084050
Other languages
French (fr)
Inventor
Rickard Sjöberg
Martin Petterson
Kenneth Andersson
Jacob STRÖM
Jonatan Samuelsson
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ)
Publication of WO2018122092A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/58 Motion compensation with long-term prediction, i.e. the reference frame for a current frame not being the temporally closest one
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N 19/174 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N 19/196 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/46 Embedding additional information in the video signal during the compression process
    • H04N 19/463 Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/573 Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103 Selection of coding mode or of prediction mode
    • H04N 19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction

Definitions

  • Embodiments of the present disclosure relate to media decoding, and particularly to methods, apparatus and computer programs for decoding media such as encoded video streams.
  • High Efficiency Video Coding (HEVC) is a block-based video codec standardized by ITU-T and MPEG that utilizes both temporal and spatial prediction. Spatial prediction is achieved using intra (I) prediction from within the current frame. Temporal prediction is achieved using inter (P) or bi-directional inter (B) prediction at block level from previously decoded reference pictures.
  • The difference between the original pixel data and the predicted pixel data, referred to as the residual, is transformed into the frequency domain, quantized and then entropy coded before being transmitted together with necessary prediction parameters, such as mode selections and motion vectors, which are also entropy coded. By quantizing the transformed residuals, the tradeoff between bitrate and quality of the video may be controlled. The level of quantization is determined by the quantization parameter (QP).
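The QP/step-size relationship can be illustrated with a small sketch. This is only an illustration of the bitrate/quality tradeoff, not the normative HEVC quantizer; the rule of thumb that the step size roughly doubles for every increase of 6 in QP is the assumption.

```python
def qstep(qp):
    """Approximate quantization step size: doubles for every +6 in QP."""
    return 2 ** ((qp - 4) / 6)

def quantize(coeff, qp):
    """Map a transform coefficient to an integer level."""
    return round(coeff / qstep(qp))

def dequantize(level, qp):
    """Reconstruct an approximate coefficient from its level."""
    return level * qstep(qp)

# A higher QP gives a coarser level and a larger reconstruction error,
# i.e. fewer bits spent but lower quality.
errors = {qp: abs(dequantize(quantize(100.0, qp), qp) - 100.0)
          for qp in (22, 27, 32, 37)}
```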
  • QP: quantization parameter
  • The decoder performs entropy decoding, inverse quantization and inverse transformation to obtain the residual, and then adds the residual to an intra or inter prediction to reconstruct a picture.
  • CABAC: Context Adaptive Binary Arithmetic Coding
  • DPB: decoded picture buffer
  • An RPS is signaled in each slice header in HEVC. All pictures in the DPB that are not included in the RPS are marked as "unused for prediction". Once a picture has been marked "unused for prediction" it can no longer be used for prediction, and when it is no longer needed for output it can be removed from the DPB. If a picture in the RPS is set to "used by curr pic", it means the picture may be used as a reference picture for the current picture.
  • HEVC uses at most two reference picture lists, L0 and L1, for each picture. P-pictures use L0, and B-pictures use both L0 and L1.
  • The reference picture lists are constructed from the RPS subsets RefPicSetStCurrBefore, RefPicSetStCurrAfter and RefPicSetLtCurr.
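The RPS bookkeeping described above can be sketched as follows. The dict and list structures, function names and the simple truncation rule are ours for illustration, not the normative HEVC process.

```python
def apply_rps(dpb, rps_pocs):
    """Mark every DPB picture whose POC is not in the RPS as unusable."""
    for pic in dpb:
        if pic["poc"] not in rps_pocs:
            pic["marking"] = "unused for prediction"

def build_lists(st_before, st_after, lt_curr, num_l0, num_l1):
    """Initial L0/L1 construction from the three 'used by current picture'
    RPS subsets; L1 starts from the 'after' subset instead of 'before'."""
    l0 = (st_before + st_after + lt_curr)[:num_l0]
    l1 = (st_after + st_before + lt_curr)[:num_l1]
    return l0, l1
```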
  • MPEG and ITU-T have recently started the development of the successor to HEVC within the Joint Video Exploration Team (JVET).
  • JEM: JVET exploratory model
  • One tool that is part of JEM is Adaptive Loop Filtering (ALF).
  • ALF: Adaptive Loop Filtering
  • ALF was also investigated during the development of HEVC but it was removed prior to the finalization of the standard.
  • In the ALF of JEM 3.1, one among 25 filters is selected for the luma component for each 2x2 block, based on the direction and activity of local gradients. Up to 25 sets of luma filter coefficients may be signaled in the slice header.
  • ALF coefficients may also be predicted from reference pictures.
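The 25-way luma filter selection can be pictured as indexing a 5x5 grid of (direction, activity) classes. The sketch below assumes both quantities have already been quantized to five bins each, which is an illustration rather than the JEM gradient derivation.

```python
def alf_filter_index(direction, activity):
    """One of 25 luma filter classes for a 2x2 block, from quantized
    local-gradient direction (0..4) and activity (0..4)."""
    assert 0 <= direction <= 4 and 0 <= activity <= 4
    return 5 * direction + activity

# every (direction, activity) pair maps to a distinct class in 0..24
classes = {alf_filter_index(d, a) for d in range(5) for a in range(5)}
```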
  • The TMVP technique extends the prediction of motion vectors to also use motion vectors from previously decoded pictures for prediction.
  • The set of possible motion vector predictors for a current block C includes five previously decoded motion vectors from the current picture, which are illustrated by the spatial positions a-e in the figure.
  • The set of possible predictors also includes two possible motion vectors from a previously decoded block P of a reference picture, which are illustrated by the positions A and B in the figure.
  • The picture to select is signaled in the HEVC bitstream using the collocated_ref_idx code word.
  • This code word is used as an index into a reference picture list.
  • The index is used with the final L0 reference picture list.
  • The result of these code words is a single picture that will be used as the co-located picture for the slice. All TMVP motion vectors for the slice will come from this co-located picture.
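The selection of the single co-located picture can be sketched as below. The list contents are illustrative; the flag and index mirror HEVC's collocated_from_l0_flag and collocated_ref_idx syntax elements, while the surrounding structures are ours.

```python
def collocated_picture(l0, l1, collocated_from_l0_flag, collocated_ref_idx):
    """Return the one picture whose motion vectors TMVP will use
    for every block of the slice."""
    ref_list = l0 if collocated_from_l0_flag else l1
    return ref_list[collocated_ref_idx]

# e.g. select the second entry of the final L0 list
pic = collocated_picture(["poc8", "poc4"], ["poc16"], True, 1)
```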
  • A second, more recent example of cross-picture prediction is the technique of predicting adaptive loop filter (ALF) parameter values from one picture to another.
  • In the following, "ALF parameters" is used as short for "ALF parameter values". Since those parameters may need a high number of bits to be expressed without prediction, and there is a correlation between parameter values of consecutive pictures, the JEM 3.1 video codec from the JVET group allows such prediction of ALF parameters.
  • For each picture, there are three basic ALF options in JEM 3.1. The first is to disable ALF for the picture. The second is to use ALF and send the ALF coefficients explicitly in the bitstream. The third is to predict ALF parameter values from a previously decoded picture. The option to use for each picture is signaled in the slice header.
  • The decoding method in JEM includes storing ALF parameter values for the 6 most recently decoded pictures for which ALF parameters were explicitly signaled.
  • The parameters are stored in a FIFO queue. This means that if the queue is full, a new set of ALF parameters overwrites the oldest parameters in the queue.
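The storage rule can be sketched as a bounded FIFO. The queue size matches the six-entry store described above; the dict payload and function name are illustrative.

```python
from collections import deque

ALF_QUEUE_SIZE = 6  # JEM keeps the 6 most recent explicitly signaled sets

def store_alf_params(queue, params):
    """Append a newly signaled parameter set, evicting the oldest if full."""
    if len(queue) == ALF_QUEUE_SIZE:
        queue.popleft()          # new parameters overwrite the oldest entry
    queue.append(params)

queue = deque()
for poc in range(8):             # pictures 0..7 each signal ALF explicitly
    store_alf_params(queue, {"poc": poc})
# queue now holds the sets for pictures 2..7 only
```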
  • A third recent example is cross-prediction of CABAC probability states for context models in JEM. After coding a centrally-located block of a picture, the probability states of context models are stored. These can then optionally be used as the initial CABAC state for later pictures. In JEM 3.1, the set of initial states for each inter-coded slice is copied from the stored states of a previously coded picture that has the same slice type and the same slice-level QP as the current slice.
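That storage-and-lookup rule can be sketched as a map keyed by slice type and slice-level QP. The class name is ours, and the context states themselves are abstracted to an opaque value.

```python
class CabacStateStore:
    """Stores per-picture context-model probability states, keyed by
    (slice_type, slice_qp), and hands them out as initial states."""

    def __init__(self):
        self._states = {}

    def save(self, slice_type, slice_qp, ctx_states):
        # a later picture with the same key overwrites the earlier one
        self._states[(slice_type, slice_qp)] = ctx_states

    def initial_states(self, slice_type, slice_qp, default):
        # an inter slice copies the stored states of a previously coded
        # picture with the same slice type and slice-level QP, if any
        return self._states.get((slice_type, slice_qp), default)
```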
  • During HEVC standardization (contribution JCTVC-F747), two main options were identified to carry data that may change on a picture-by-picture basis: either carry the data in the picture parameter set (PPS), or carry the data in a separate parameter set called the adaptation parameter set (APS).
  • PPS: picture parameter set
  • APS: adaptation parameter set
  • Two slices of the same picture could either be free to point to different APSes, or constrained to point to the same single APS.
  • The APS concept was adopted into the HEVC draft at the F meeting but later removed from the HEVC specification.
  • The JEM signaling in examples two and three above is very rudimentary. This leads to a number of problems.
  • A first problem with existing solutions for predicting across pictures is that the encoder needs to control the prediction to ensure that there is no mismatch between encoder and decoder for cases such as temporal layer pruning or random access operation on the bitstream.
  • The current JEM method for prediction of ALF parameters would force the encoder to disable ALF parameter prediction for some pictures that could have used it, in order to avoid mismatch for temporal pruning and random access.
  • A second problem is that current solutions only support picture-to-picture prediction. There is no defined behavior for the case when there are multiple slices or multiple tiles in a previous picture.
  • A third problem is that current methods for decoding a subset of temporal layers may not work. The reason for this is that the state of the queue will differ depending on whether a high temporal layer is decoded or not. If a high temporal layer is decoded, the ALF parameters from that layer will be stored in the queue, and the queue will hold more parameters compared to the case when the high temporal layer is not decoded.
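This divergence can be made concrete with a toy trace; the picture list, temporal ids and queue size below are invented for illustration.

```python
from collections import deque

def final_alf_queue(pictures, max_tid, size=6):
    """Replay decoding: every decoded picture that signals ALF explicitly
    pushes its parameters; pictures above max_tid are never decoded."""
    queue = deque(maxlen=size)
    for poc, tid, params in pictures:
        if tid <= max_tid:
            queue.append(params)
    return list(queue)

pics = [(0, 0, "p0"), (1, 2, "p1"), (2, 1, "p2"), (3, 2, "p3")]
full = final_alf_queue(pics, max_tid=2)    # all temporal layers decoded
pruned = final_alf_queue(pics, max_tid=1)  # top layer pruned away
# the same signaled queue index now selects different parameter sets
```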
  • A fourth problem is that current methods may not be robust to error.
  • For ALF, a queue with a maximum of 6 sets of ALF parameters is built up from previously encoded/decoded pictures, and an index is signaled indicating which set of ALF parameters to use. If a picture is lost, the current ALF prediction scheme may not be aware of this, so ALF parameters which have not been updated correctly may be used.
  • For CABAC, probability state sets are stored per combination of slice type and slice QP (2 slice types * 7 slice QPs). Again, if a picture is lost, the states of the CABAC storages in the encoder and decoder may differ and the decoder may use an incorrect set.
  • A fifth problem, particular to CABAC, is that current methods of decoding a subset of temporal layers may not work.
  • If the slice QP of a higher temporal layer picture is the same as the slice QP of a current slice, it may well happen that the CABAC state of the higher temporal layer picture is used for the current picture. The problem is then that the higher temporal layer picture may not be received by the decoder if the bitstream has been pruned to contain only a subset of temporal layers.
  • A solution to one or more of these problems is therefore required. It would be possible to use a parameter set such as the picture parameter set (PPS) or the adaptation parameter set (APS) for carrying data to be used across pictures.
  • PPS: picture parameter set
  • APS: adaptation parameter set
  • However, data such as ALF parameters and CABAC probability states are anticipated to change on a picture-by-picture basis, so the encoder would need to change parameter sets on the fly as it encodes pictures.
  • Neither the PPS nor the APS is resilient against packet losses when parameter sets are modified on the fly, so they may not be suitable.
  • A new ordered list of indicators pointing to reference pictures or reference slices is created and used for predicting parameter data from one previous picture or slice to a current picture or slice.
  • A special case is to use a list of only one indicator. In this case, the creation of a list is reduced to identifying one single reference picture or reference slice.
  • The ordered list may, for example, be used for temporal ALF parameter prediction (TAPP), and the single identification may be used for temporal CABAC probability state prediction (TCSP).
  • TAPP: temporal ALF parameter prediction
  • TCSP: temporal CABAC probability state prediction
  • Other types of parameters than the examples of ALF parameters and CABAC probability states may be predicted from a previous slice to a current slice.
  • The proposed solution is applicable to cross-picture prediction of any decoding parameters or data and is not limited to ALF and/or CABAC data.
  • One aspect of the disclosure provides a method, performed by a decoder, for predicting parameter values from a previously decoded reference picture to a current picture or a current slice of a current picture.
  • The method comprises: receiving an encoded representation of the current picture or slice of a video sequence from an encoder; identifying a set of previously decoded reference pictures or slices for the current picture or slice; creating an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; determining, from the list, a previously decoded reference picture or slice to use for prediction; deriving final parameter values by predicting parameter values using the determined reference picture or slice; and decoding the current picture or slice from the encoded representation, using the final parameter values.
  • The decoder is configured to: receive an encoded representation of the current picture or slice of a video sequence from an encoder; identify a set of previously decoded reference pictures or slices for the current picture or slice; create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; determine, from the list, a previously decoded reference picture or slice to use for prediction; derive final parameter values by predicting parameter values using the determined reference picture or slice; and decode the current picture or slice from the encoded representation, using the final parameter values.
  • The decoder comprises: a receiver module configured to receive an encoded representation of the current picture or slice of a video sequence from an encoder; an identifying module configured to identify a set of previously decoded reference pictures or slices for the current picture or slice; a creating module configured to create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; a determining module configured to determine, from the list, a previously decoded reference picture or slice to use for prediction; a deriving module configured to derive final parameter values by predicting parameter values using the determined reference picture or slice; and a decoding module configured to decode the current picture or slice from the encoded representation, using the final parameter values.
  • The decoder could also comprise: a receiving means configured to receive an encoded representation of the current picture or slice of a video sequence from an encoder; an identifying means configured to identify a set of previously decoded reference pictures or slices for the current picture or slice; a creating means configured to create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; a determining means configured to determine, from the list, a previously decoded reference picture or slice to use for prediction; a deriving means configured to derive final parameter values by predicting parameter values using the determined reference picture or slice; and a decoding means configured to decode the current picture or slice from the encoded representation, using the final parameter values.
  • The decoder may be implemented in hardware, in software, or in a combination of hardware and software.
  • The decoder may be implemented in, e.g. comprised in, user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer.
  • A further aspect of the embodiments defines a computer program for a decoder, for predicting parameter values from a previously decoded reference picture to a current picture or current slice of a current picture.
  • The computer program comprises computer program code which, when executed, causes the decoder to: receive an encoded representation of the current picture or slice of a video sequence from an encoder; identify a set of previously decoded reference pictures or slices for the current picture or slice; create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; determine, from the list, a previously decoded reference picture or slice to use for prediction; derive final parameter values by predicting parameter values using the determined reference picture or slice; and decode the current picture or slice from the encoded representation, using the final parameter values.
  • A further aspect of the embodiments defines a computer program product for a decoder, for predicting parameter values from a previously decoded reference picture to a current picture or current slice of a current picture.
  • The computer program product comprises a non-transitory computer-readable medium storing computer program code which, when executed, causes the decoder to: receive an encoded representation of the current picture or slice of a video sequence from an encoder; identify a set of previously decoded reference pictures or slices for the current picture or slice; create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; determine, from the list, a previously decoded reference picture or slice to use for prediction; derive final parameter values by predicting parameter values using the determined reference picture or slice; and decode the current picture or slice from the encoded representation, using the final parameter values.
  • One advantage of embodiments of the present disclosure is that they remove the burden on the encoder to cleverly control the use of parameter prediction in order to avoid mismatches for temporal layer pruning and random access.
  • Another advantage is that full prediction flexibility is enabled, which provides opportunities for improved compression efficiency.
  • A third advantage is that prediction in a multi-slice scenario is supported.
  • A fourth advantage is improved error resilience.
  • The RPS design is robust against packet losses, so by tying prediction data to reference pictures and using the robust RPS mechanisms, error resilience is preserved.
  • By error resilience we here mean the ability to know what has been lost.
  • The RPS provides information about which picture was lost. With the method proposed here, the decoder will, in case of parameter loss, know which picture loss caused it.
  • HEVC: High Efficiency Video Coding
  • Temporal layering is used in this application as a layer example.
  • A person skilled in the art will appreciate that the methods described herein also apply to other types of layers, such as spatial, SNR, and view layers.
  • Figure 1 illustrates a set of possible motion vector predictors for a current block C, including five previously decoded motion vectors from the current picture, which are illustrated by the spatial positions a-e in the figure, and two possible motion vectors from a previously decoded block P of a reference picture, which are illustrated by the positions A and B.
  • Figure 2 is a flowchart of a method for predicting parameter values from a previously decoded reference picture to a current slice according to embodiments of the present disclosure.
  • Figure 3 illustrates one example of picture prediction according to an embodiment of the present disclosure.
  • Figure 4 illustrates a decoder according to embodiments of the present disclosure.

DETAILED DESCRIPTION OF THE PROPOSED SOLUTION
  • Figure 2 is a flowchart of a method according to embodiments of the disclosure. The method may be carried out in a decoder, such as the decoder 400 described below with respect to Figure 4, for example.
  • In step 200, the decoder receives an encoded representation of a current picture, or a current slice of a current picture, of a video sequence.
  • The encoded representation may be received from an encoder, for example.
  • The decoder begins to decode the slice or picture header.
  • In step 202, the decoder identifies a set of previously decoded reference pictures or slices for the current picture or slice.
  • Step 202 may comprise decoding the reference picture set (RPS), and identifying which pictures or slices in the RPS are reference pictures for the current picture or slice.
  • The decoder may determine the reference pictures in one or more (or all) of the lists RefPicSetStCurrBefore, RefPicSetStCurrAfter, and RefPicSetLtCurr.
  • The former two lists are reference picture lists containing short-term reference pictures (i.e. pictures stored as short-term pictures in the DPB); the latter list is a reference picture list containing long-term reference pictures (i.e. pictures stored as long-term pictures in the DPB).
  • The list construction process may include only those pictures or slices that belong to a temporal layer that is equal to or lower than the temporal layer of the current slice. This solves the third problem stated above by ensuring that the reference picture list for parameter prediction is the same for a particular picture regardless of whether higher temporal layers have been removed or not.
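A sketch of that constrained list construction follows; the picture records are illustrative dicts, and the size bound is an assumption.

```python
def parameter_prediction_list(reference_pictures, current_tid, max_size=4):
    """Keep only reference pictures at the current slice's temporal layer
    or below, so the resulting list is identical whether or not higher
    layers were pruned from the bitstream."""
    eligible = [p for p in reference_pictures if p["tid"] <= current_tid]
    return eligible[:max_size]

refs = [{"poc": 0, "tid": 0}, {"poc": 1, "tid": 2}, {"poc": 2, "tid": 1}]
# identical result with or without the tid-2 picture present
with_high = parameter_prediction_list(refs, current_tid=1)
without_high = parameter_prediction_list(
    [p for p in refs if p["tid"] <= 1], current_tid=1)
```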
  • In step 204, the decoder creates an ordered list of indicators pointing to previously decoded reference pictures or slices.
  • The indicators may point to one or more of the reference pictures or slices identified in step 202; alternatively, the indicators may point to any previously decoded reference picture or slice.
  • In some embodiments, step 204 comprises re-using the final reference picture lists L0 and L1.
  • L0 comprises a list of reference pictures used for both P and B slices or pictures, while L1 comprises a list of reference pictures used for B slices or pictures. Since the RPS mechanisms guarantee that no picture that may be unavailable at random access, or when temporal pruning is done, is included in the L0 and L1 lists, this embodiment solves both the error resilience and the temporal pruning problems identified above.
  • Further, step 204 may comprise identifying only those pictures that belong to a temporal layer that is equal to or lower than the temporal layer of the current slice.
  • In step 206, the decoder determines, from the list created in step 204, one or more previously decoded reference pictures or slices to use for prediction in the current slice or picture.
  • The determination of which reference picture to use may be based on an index decoded from the bitstream received in step 200, pointing to one of the pictures or slices in the ordered list. For example, an index value of "0" may point to a first or initial picture or slice in the list, and so on. Further detail regarding this aspect can be found below with respect to Embodiments 8 and 9.
  • In step 208, the decoder utilizes the one or more previously decoded reference pictures to derive final parameter values by predicting the final parameter values based on the one or more previously decoded reference pictures, and particularly based on the parameter values for those one or more previously decoded reference pictures. For example, calculating parameters for a current slice or picture based on parameters from a previous slice or picture may include:
  • 1) Copy method: copy parameter values as-is from a previous slice. For example, assume that a previous slice S used ALF with a set of ALF parameter values. When decoding the slice header of a current slice C, the decoder decodes which previous slice to use for ALF parameter prediction. The decoder then copies or uses the same ALF parameter values for slice C that were used for slice S.
  • 2) Prediction method: use parameter values from a previous slice as a prediction for the current slice, and derive final parameter values by using both values from the previous slice and values signaled for the current slice. For example, when decoding the slice header of a current slice C, the decoder decodes which previous slice S to use for ALF parameter prediction. For at least one ALF parameter in the set of ALF parameters, the decoder then decodes an ALF parameter delta value and combines this value with the corresponding ALF parameter value that was used in slice S. In one embodiment, the combination is done by addition and is done for multiple ALF parameter values.
  • 3) Overwrite method: partially overwrite parameter values from a previous slice by values signaled in the current slice. For example, assume that a previous slice S used ALF with a set of ALF parameter values. When decoding the slice header of a current slice C, the decoder decodes which previous slice to use for ALF parameter prediction. For at least one ALF parameter in the set of ALF parameters, the decoder decodes a parameter value from the data of the current slice C and uses this parameter value as-is. For at least one other ALF parameter, the decoder uses either method 1) or 2) above.
  • Any combination of methods 1, 2, and 3 can be used. For instance, methods 1 and 2 can be combined such that some parameters are copied and some parameters are predicted.
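The three derivation options can be sketched on a dict of ALF parameter values; the parameter names and the additive combination rule are illustrative.

```python
def copy_params(prev):
    """1) Copy method: take the previous slice's values as-is."""
    return dict(prev)

def predict_params(prev, deltas):
    """2) Prediction method: previous value plus a signaled delta."""
    out = dict(prev)
    for name, delta in deltas.items():
        out[name] = prev[name] + delta
    return out

def overwrite_params(prev, explicit):
    """3) Overwrite method: explicitly signaled values replace some
    previous ones; the rest are kept (i.e. copied)."""
    out = dict(prev)
    out.update(explicit)
    return out

prev = {"coeff0": 10, "coeff1": -3}   # parameters used by previous slice S
```

Combining methods 1 and 2, as the text allows, amounts to supplying deltas for only a subset of the parameter names.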
  • In step 210, the final parameter values determined in step 208 are used to decode the current slice or picture from the encoded representation received in step 200.
  • This step thus comprises decoding the picture data (e.g. pixel values, etc.), using the parameter values determined in step 208.
  • The final parameter values are derived before decoding of the picture data begins.
  • The parameters to predict are in some embodiments exemplified with ALF or CABAC. It is to be understood that other types of parameters may be used in place of, or in combination with, ALF or CABAC. Examples of other types of parameters to predict from a previous slice according to embodiments of the disclosure include, but are not limited to: sample adaptive offset (SAO) parameters; coding tree structure parameters; interpolation filter coefficients; scaling matrices; slice_segment_address; slice_type; colour_plane_id; collocated_ref_idx; weighted prediction parameters (e.g. luma and chroma weights); merge candidate parameters (e.g. five_minus_max_merge_candidates); QP modification parameters (e.g. slice_qp_delta, slice_cb_qp_offset, slice_cr_qp_offset); deblocking parameters (e.g. slice_beta_offset_div2, slice_tc_offset_div2); entry point data (e.g. num_entry_point_offsets, offset_len_minus1, entry_point_offset_minus1); and slice header extension data (e.g. slice_segment_header_extension_length and slice_segment_header_extension_data_byte).
  • In one embodiment, the current ALF queue in JEM is kept, but a reference picture list for ALF prediction is introduced.
  • A list construction process is used that includes only the pictures in the queue that belong to a temporal layer equal to or lower than the temporal layer of the current slice.
  • In another embodiment, the current method is changed to store parameters for each combination of slice type and temporal id (and layer id and view id for spatial, SNR and view scalability).
  • The one to select for prediction can then be the most recently received (in decoding order) that has a temporal id (and layer id, and view id) equal to the current picture. If no such picture exists, the most recently received in decoding order that has a temporal id (and layer id, and view id) lower than the current picture is selected. If no such picture exists, CABAC prediction is prohibited. Alternatively, instead of the most recently received picture in decoding order, the picture that is closest in output order is selected. If two pictures are equally close, the one with lower picture order count (alternatively, the one with higher picture order count) shall be selected.
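The first selection rule above can be sketched as follows; the picture records are illustrative, and only temporal id is modeled (layer id and view id would be handled analogously).

```python
def select_cabac_source(stored_pics, current_tid):
    """`stored_pics` is ordered oldest-first in decoding order.
    Prefer the most recent picture at the same temporal id, then the
    most recent one at a lower temporal id, else prohibit prediction."""
    for pic in reversed(stored_pics):
        if pic["tid"] == current_tid:
            return pic
    for pic in reversed(stored_pics):
        if pic["tid"] < current_tid:
            return pic
    return None  # no candidate: CABAC prediction is prohibited
```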
  • cross-picture prediction is allowed only from pictures that are reference pictures for the current slice.
  • the decoder starts decoding a slice or picture header.
  • the decoder decodes information as to which previously decoded pictures are reference pictures for the current slice or picture.
  • the decoder identifies the reference pictures.
  • the decoder creates a list of picture indicators by including indicators to reference pictures. Note that there may be a limit on the size of the list such that not all picture indicators are included but only a subset of them.
• the decoder receives information in the bitstream in the form of an index to the created list for which set of parameters to use for prediction. For example, an index value of 2 may mean that the third indicator in the list (indices are assumed to start from 0) is used to identify the picture.
  • the decoder calculates the parameters to use for the current slice or picture based on the parameters from the indicated picture and decodes the slice or picture using those.
• the decoder decodes the slice or picture, using the calculated parameters.
  • the decoder stores the parameters used together with the current slice or picture to enable using the parameters for prediction in the future.
  • Step 2 above for a decoder that uses reference picture sets (RPS) is preferably done by determining that the pictures that are included in the RPS are reference pictures. Preferably only the pictures that are included in RefPicSetStCurrBefore, RefPicSetStCurrAfter, or RefPicSetLtCurr are used.
• Step 3 above for a decoder that uses reference picture sets (RPS) may be done by reusing the final reference picture lists L0 and L1. Since the RPS mechanisms guarantee that no picture that may be unavailable at random access or when temporal pruning is done is included in the L0 and L1 lists, the method solves both the error resilience and the temporal pruning problems.
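The decoding steps above can be sketched as follows (an illustrative Python sketch; the POC-keyed parameter store and the function names are assumptions, not part of any standard):

```python
def build_param_ref_list(st_curr_before, st_curr_after, lt_curr, max_size=None):
    """Steps 3-4: collect indicators (here, POCs) to the reference pictures,
    optionally keeping only a subset when the list size is limited."""
    refs = st_curr_before + st_curr_after + lt_curr
    return refs if max_size is None else refs[:max_size]

def predict_params(param_ref_list, index, stored_params):
    """Steps 5-7: resolve the signalled index against the list and copy the
    parameters stored with the indicated picture (the copy method)."""
    ref_poc = param_ref_list[index]
    return dict(stored_params[ref_poc])
```

For example, with `stored_params = {0: {"alf": 1}, 8: {"alf": 2}}` and the list `[8, 0]`, a signalled index of 0 selects the parameters of the picture with POC 8.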
  • the encoder starts encoding a slice or picture.
• the encoder selects which previously encoded pictures are to be reference pictures for the current slice or picture.
  • the encoder creates a list of picture indicators by including indicators to reference pictures. Note that there may be a limit on the size of the list such that not all picture indicators are included but only a subset of them.
• the encoder selects which picture to use for parameter prediction and puts information in the bitstream in the form of an index code word that identifies the picture and thereby the parameters.
  • the encoder calculates the parameter values to use for the current slice or picture based on the parameters from the indicated picture as specified above. For case 2 above (prediction method) the encoder first determines the parameter values to use. It thereafter uses the parameter values from the selected picture to form a signal to transmit in the bitstream, for example by subtracting the predicted parameter values from the determined parameter values. For case 1 above (copy method) the encoder uses the predicted parameter values as is.
  • the encoder then encodes the current slice or picture.
  • the encoder stores the parameters used together with the current slice or picture to enable using the parameters for prediction in the future.
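The copy method (case 1) and prediction method (case 2) distinguished above can be sketched as follows (an illustrative Python sketch; names and data shapes are assumptions):

```python
def encode_param_signal(determined, predicted, method):
    """Form the signal to transmit. In the copy method (case 1) the predicted
    values are used as-is, so nothing beyond the index is sent; in the
    prediction method (case 2) the difference between the determined and the
    predicted parameter values is transmitted."""
    if method == "copy":
        return None
    return {k: determined[k] - predicted[k] for k in determined}
```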
  • Intra coded pictures that are not random-access point (RAP) pictures.
• Intra coded pictures may be the most efficient type of picture or slice coding, but there is no need to enable RAP functionality for each Intra picture.
• L0 and L1 would not allow for any cross-picture prediction.
• For an Intra picture that is not a random access picture, it would be advantageous to allow prediction from a previous picture.
• the second problem is that the construction of the L0 and L1 reference picture lists may be optimized for the coding efficiency of motion vectors.
  • the reference picture to use must be signaled.
  • the bit cost for signaling this depends on the number of available reference pictures. If there is only one reference picture, there is no need to signal anything since there is only one choice. If there are many possible reference pictures, the signaling space must have room for many options and this comes with a bit cost.
• the encoder may choose not to include all possible available reference pictures into L0 and/or L1 for the current picture.
  • Embodiment 3 Construct parameter reference picture list depending on temporal layer
• a new list of reference picture indicators, separate from L0 and L1, is used for cross-picture prediction.
  • all reference pictures for which the layer id(s) are equal to or lower than the corresponding layer id(s) of the current picture or slice are added.
  • no sub-layer non-reference picture is included in the new list.
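The list construction of this embodiment can be sketched as follows (an illustrative Python sketch using only the temporal id; layer id and view id would be handled analogously, and the tuple layout is an assumption):

```python
def construct_param_ref_list(current_tid, ref_pics):
    """ref_pics: (poc, temporal_id, is_sub_layer_non_reference) tuples for the
    reference pictures of the current picture. Keep only pictures whose
    temporal layer is equal to or lower than the current one, and exclude
    sub-layer non-reference pictures."""
    return [poc for poc, tid, sub_nonref in ref_pics
            if tid <= current_tid and not sub_nonref]
```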
  • This preferred embodiment is similar to embodiment 3 but the list construction is based on the reference pictures in the reference picture set (RPS) that are available for the current picture.
• this is equivalent to letting the new list include only elements of the three sets RefPicSetStCurrBefore, RefPicSetStCurrAfter and RefPicSetLtCurr.
  • An encoder that wishes to use short L0 and L1 lists must avoid setting used_by_curr_pic_flag to 0 for pictures that are allowed for prediction for the current picture.
• Such an encoder may use the num_ref_idx_l0_default_active_minus1, num_ref_idx_l0_active_minus1, or ref_pic_lists_modification( ) syntax to shorten the lengths of the L0 and L1 lists.
• List construction is for this embodiment based on the three sets RefPicSetStCurrBefore, RefPicSetStCurrAfter and RefPicSetLtCurr. Additionally, list construction can also be based on any layer identity such as temporal layer (temporal_id), spatial or SNR layer (layer_id), or view layer (view_id), for example such that the order of elements in the new list depends on the layer identity and/or such that the presence of a particular reference picture or slice in the list depends on the layer identity. List construction can also, in addition to the three sets, be based on the output order, for example picture order count (POC), such that the order of entries in the new list is based on the output order of reference pictures or slices.
  • list construction can also be based on the decoding order of pictures and/or slices, such that the order of entries in the new list is based on the decoding order.
• List construction can also, in addition to the three sets, be based on matching characteristics between the current and reference picture/slice, such that only reference pictures/slices with matching characteristics are included in the new list or such that the order of entries in the new list depends on matching characteristics. Examples of matching characteristics are whether particular tools are turned on or off, whether the picture or slice type is the same or not, whether the coded picture size is similar and/or whether the configuration of a particular tool is identical or not.
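The matching-characteristics filtering can be sketched as follows (an illustrative Python sketch; slice type and coded picture size are used as example criteria, and the dictionary keys are assumptions):

```python
def filter_by_matching(current, candidates):
    """Keep only reference pictures/slices whose characteristics match those of
    the current slice; here the slice type and the coded picture size are the
    example matching criteria."""
    return [c for c in candidates
            if c["slice_type"] == current["slice_type"]
            and c["pic_size"] == current["pic_size"]]
```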
• the following decoding steps illustrate the decoder operation for this embodiment:
  • the decoder starts decoding a slice or picture header.
  • the decoder decodes the RPS and constructs the sets RefPicSetStCurrBefore, RefPicSetStCurrAfter and RefPicSetLtCurr.
  • the decoder creates a list of picture indicators by including indicators to the reference pictures of the three sets of step 2.
• the decoder receives information in the bitstream in the form of an index to the created list for which set of parameters to use for prediction. For example, an index value of 0 means that the first indicator in the list is used to identify the picture.
  • the decoder calculates the parameters to use for the current slice or picture based on the parameters from the indicated picture and decodes the slice or picture using those.
• After decoding the slice or picture, the decoder optionally stores the parameters used together with the current slice or picture to enable using the parameters for prediction in the future.
  • An alternative sequence of steps can be expressed as follows:
  • the decoder decodes RPS information for example in a slice header, a picture header or a picture parameter set.
  • the decoder creates a list LP containing all or some of the reference pictures in the RPS.
• the decoder decodes an index i related to which picture to predict parameters from.
• the reference picture at position i in LP is used for predicting parameters of the current picture.
• the following encoding steps illustrate the encoder operation for this embodiment:
  • the encoder starts encoding a slice or picture.
• the encoder selects which previously encoded pictures are to be reference pictures for the current slice or picture.
  • the encoder writes to the output bitstream such that the selected reference pictures will be present in RefPicSetStCurrBefore, RefPicSetStCurrAfter and RefPicSetLtCurr when decoded.
  • the encoder creates a list of picture indicators by including indicators to the reference pictures of the three sets in step 3.
• the encoder selects which picture to use for parameter prediction and puts information in the bitstream in the form of an index code word that identifies the picture and thereby the parameters.
• the encoder calculates the parameter values to use for the current slice or picture based on the parameters from the indicated picture, as specified above. For case 2 above (prediction method) the encoder first determines the parameter values to use. It thereafter uses the parameter values from the selected picture to form a signal to transmit in the bitstream, for example by subtracting the predicted parameter values from the determined parameter values. For case 1 above (copy method) the encoder uses the predicted parameter values as-is. The encoder then encodes the current slice or picture.
• After encoding the slice or picture, the encoder stores the parameters used together with the current slice or picture to enable using the parameters for prediction in the future.
• the order of the entries is sorted in output order such that the pictures that are closest in output order are earlier in the list.
  • One preferred method is to sort the list in increasing abs(CurrPOC-RefPOC), such that the entry with the smallest value of abs(CurrPOC-RefPOC) is first in the list.
  • CurrPOC is the picture order count (POC) of the current picture
  • RefPOC is the picture order count of a reference picture or slice.
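The sort described above can be sketched in Python (an illustrative sketch; ties are broken toward the lower POC, one of the options mentioned in the text):

```python
def sort_by_output_distance(curr_poc, ref_pocs):
    """Sort so that the entry with the smallest abs(CurrPOC - RefPOC) comes
    first; equally close pictures are ordered with the lower POC first."""
    return sorted(ref_pocs, key=lambda ref_poc: (abs(curr_poc - ref_poc), ref_poc))
```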
• One method of selecting one element is to select the most recently received (in decoding order) picture that has a slice of the same type as the current slice and a temporal id (and layer id, and view id) equal to that of the current picture. If no such picture exists, the most recently received picture in decoding order that has a slice of the same type and a temporal id (and layer id, and view id) lower than that of the current picture is selected. If no such picture exists, CABAC prediction is prohibited.
  • the parameters of the first slice in decoding order that belongs to the selected picture with the same slice type as the current slice should be used.
• the closest picture in output order that has a slice of the same type as the current slice and a temporal id (and layer id, and view id) equal to the current picture is selected. If no such picture exists, the closest picture in output order that has a slice of the same type and a temporal id (and layer id, and view id) lower than that of the current picture is selected. If no such picture exists, CABAC prediction is prohibited. If two pictures are equally close, the picture that is output before the current picture is always selected. Alternatively, the picture that is output after the current picture is always selected. Alternatively, the picture that is closest to the current picture in decoding order is selected.
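The selection with fallback described above can be sketched as follows (an illustrative Python sketch using only the temporal id; layer id and view id checks would be analogous, and the data shapes are assumptions):

```python
def select_pred_picture(curr, decoded_in_order):
    """Return the most recently decoded picture with the same slice type and an
    equal temporal id; otherwise the most recent one with the same type and a
    lower temporal id; otherwise None, meaning prediction is prohibited."""
    same_type = [p for p in reversed(decoded_in_order)
                 if p["slice_type"] == curr["slice_type"]]
    for p in same_type:
        if p["tid"] == curr["tid"]:
            return p
    for p in same_type:
        if p["tid"] < curr["tid"]:
            return p
    return None
```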
• signal slice_adaptive_loop_filter_flag equal to 1 in the bitstream; store the best ALF parameter set at the end of the storedAlfParams array and tag it with the current POC number
• Decoder side: 1. Start decoding picture
• currPOC storedAlfParams[ i ].currPOC rIdx++
  • storedCabacCtx is an array which is empty before encoding/decoding the first picture.
  • Embodiment 5 Predict from specific slice in picture
  • the previous embodiments assume that there is only one set of parameters for each picture. It may be advantageous to allow for prediction from an individual previous slice in the case multiple slices were used for the previous picture.
  • the list construction can be done using one of the following two methods:
  • the new list consists of indicators to slices instead of pictures, optionally by removing duplicates
  • the new list consists of picture indications as described earlier.
  • the codeword to indicate reference picture is followed by a codeword to indicate which reference slice to predict from.
  • One preferred embodiment is to always use two UVLC code words, one code word to indicate the reference picture and one code word to indicate the reference slice.
• A benefit of always including the slice indication is that the slice header would be parsable without knowing how many slices were used for a particular previous picture. Also, the overhead for the case where only one slice per picture is used is only one bit per slice.
  • Slice indexing can be done by counting slices in decoding order, such that the first slice of a particular picture has index 0, the second slice of a particular picture has index 1 and so on. Alternatively, a slice id is sent for each slice and used as index.
• the codeword to indicate which reference slice to predict from is not signaled. Instead the same slice index as for the current slice is used to select the slice to predict from in the reference picture. In case there are more slices in the current picture than in the referenced picture and the current slice has an index higher than the number of slices in the reference picture minus 1, then the slice with the highest index in the referenced picture is used for predicting the parameters in the current slice. Note that slice boundaries do not need to be the same between frames.
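The index fallback rule above amounts to a clamp, sketched here in Python (names are illustrative):

```python
def ref_slice_index(curr_slice_idx, num_slices_in_ref):
    """Reuse the current slice index; when the current picture has more slices
    than the reference picture, clamp to the reference picture's last slice."""
    return min(curr_slice_idx, num_slices_in_ref - 1)
```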
  • the selected reference slice to predict from is collocated with the current slice, either by selecting the reference slice collocated with the first coding block of the current slice, or by selecting the reference slice that has most coding blocks collocated with the current slice.
• this can be done by including previous slices of the current picture in the new list.
• this can be done by including the current picture in the list of reference pictures in case the current slice is not the first slice of the current picture.
• the temporal parameter prediction as described in any of the previous embodiments is only allowed from a picture that does not itself predict the parameters from another picture, i.e. a reference chain is not allowed. This could be realized at the encoder side by checking whether the picture to predict from is using prediction itself or not.
  • the advantages of disallowing reference chains include
  • a shorter index may be needed to signal the picture since fewer pictures are allowed
• temporal parameter prediction as described in any of the previous embodiments is allowed from pictures that themselves predict parameters from another picture.
  • a picture C with POC 4 at temporal ID (tID) 2 (temporal layer 2) is predicting at least one parameter from a picture B with POC 8 at tID 1 and the picture B is predicting the same parameter(s) from picture A with POC 0 at tID 0.
  • the parameters predicted from picture A are copied to the parameter buffer belonging to picture B, to make them available when decoding picture C.
  • the parameters are kept in a buffer array where one entry is a parameter set belonging to one or more pictures.
• the parameters are put in the buffer array at entry 0 and tagged with POC 0/tID 0.
• the parameter buffer at entry 0 in the buffer array is in addition tagged with POC 8/tID 1, and after decoding picture C the parameter buffer at entry 0 in the buffer array is also tagged with POC 4/tID 2.
• POC 0 is no longer available in the RPS, but POC 4 and POC 8 are, when decoding a picture D at tID 0.
• the tag POC 0/tID 0 is then removed from the parameter buffer at entry 0, and the parameters from entry 0 may not be predicted from since the entry no longer contains any reference to a picture with the same or lower tID than the current picture D.
  • the parameter data of the entry is no longer referenceable and can be removed.
  • the emptied entry in the buffer is replaced by moving down entries with a higher index.
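The tagging and pruning behaviour above can be sketched as follows (an illustrative Python sketch; the ParamBuffer class and its field names are assumptions, not part of any codec):

```python
class ParamBuffer:
    """Each entry holds one parameter set plus the (poc, tid) tags of every
    picture that carries it."""
    def __init__(self):
        self.entries = []  # list of {"params": ..., "tags": set of (poc, tid)}

    def add(self, params, poc, tid):
        self.entries.append({"params": params, "tags": {(poc, tid)}})

    def tag(self, entry_idx, poc, tid):
        """A picture predicting from this entry adds its own (poc, tid) tag."""
        self.entries[entry_idx]["tags"].add((poc, tid))

    def prune(self, rps_pocs):
        """Drop tags whose POC has left the RPS; entries with no tags left are
        no longer referenceable and are removed, moving later entries down."""
        for e in self.entries:
            e["tags"] = {(p, t) for p, t in e["tags"] if p in rps_pocs}
        self.entries = [e for e in self.entries if e["tags"]]
```

Replaying the POC 0 / POC 8 / POC 4 example above: after pruning with an RPS of {4, 8} the entry survives via the tags added by pictures B and C, and only disappears once no tagged POC remains.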
• The signaling of the parameter index can be realized as follows.
• In the table below this is exemplified with the ALF parameters.
• the first column describes the picture order count (POC) of the picture, the second column whether ALF is used by the current picture, and the third column whether ALF is predicted and, if so, from which POC.
• in the fourth column the POCs for the unique ALF parameters are shown, and in the last column the codeword signaled to represent the POC of the picture from which the ALF parameters are predicted.
• the list of unique ALF params is built up on both the encoder and decoder sides.
• when the decoder receives the signaled codeword, it uses the ALF parameters at that index. Since both the encoder and decoder sides update the unique ALF params list equally based on the RPS, it is sufficient to only send the codeword for the list.
  • the method according to embodiment 1 should be used to ensure that the unique ALF params lists are updated equally also when there is temporal pruning, i.e. higher temporal layers are removed before decoding.
  • the corresponding reference picture or slice is not associated with any parameter values suitable for prediction.
• A first example is reference pictures for which no parameters are available because no parameters were signaled. For instance, a particular reference picture may not have used a particular tool and therefore no parameters were signaled for that reference picture.
• A second example is that other characteristics of the current and/or reference picture or slice make the prediction unsuitable. For instance, the parameter values of a particular type may differ a lot depending on the slice type (I, P or B). If the current slice and a reference slice then have different slice types, prediction is unsuitable.
  • the encoder and decoder could keep track of when parameter prediction is used and only store a copy of the parameters when parameter prediction is not used.
  • the encoder and decoder must know when it is ok to remove a set of parameter values from the storage.
• One way of realizing this is to keep a dictionary of the pictures (e.g. in terms of POC) that have been using each of the stored parameter sets. This is illustrated by extending the previous ALF prediction example with a column describing the dictionary after receiving the current picture of each row.
• paramSetIdx = getIdxOfParameterSet( RefPicSetStCurrBefore[ i ] ); numStoredPics[ paramSetIdx ]++
  • the POC for that picture is removed from the list. If a set of stored parameters is no longer connected to any POCs, that set of parameter values is removed from the list of stored parameter values.
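The bookkeeping above can be sketched as follows (an illustrative Python sketch; the dictionary layout and function names are assumptions):

```python
def update_usage(usage, param_set_idx, poc):
    """Record that the picture with this POC uses the stored parameter set."""
    usage.setdefault(param_set_idx, set()).add(poc)

def remove_picture(usage, stored_sets, poc):
    """When a picture is dropped, remove its POC from every entry; a stored
    parameter set no longer connected to any POC is removed from storage."""
    for idx in list(usage):
        usage[idx].discard(poc)
        if not usage[idx]:
            del usage[idx]
            del stored_sets[idx]
```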
  • Embodiment 9 Specification text for signaling of ALF and CABAC parameters
  • This embodiment shows an example realization where ALF parameters are predicted across slices and CABAC parameters are predicted across pictures.
  • the changes are based on H.265 version 3 (04/2015). Additions are in red and deletions are marked by strikethrough.
• adaptive_loop_filter_enabled_flag equal to 1 specifies that the adaptive loop filter process may be applied to the reconstructed picture.
• adaptive_loop_filter_enabled_flag equal to 0 specifies that the adaptive loop filter process is not applied to the reconstructed picture.
• cabac_init_prediction_present_flag equal to 1 specifies that cabac_prediction_ref_idx may be present in the slice header.
• cabac_init_prediction_present_flag equal to 0 specifies that cabac_prediction_ref_idx is not present in the slice header.
• slice_segment_header( ) Descriptor first_slice_segment_in_pic_flag u(1) if( !dependent_slice_segment_flag ) { if( adaptive_loop_filter_enabled_flag ) {
• slice_adaptive_loop_filter_flag equal to 1 specifies that the adaptive loop filter process may be applied to the reconstructed slice after the deblocking filter process.
• slice_adaptive_loop_filter_flag equal to 0 specifies that the adaptive loop filter process is not applied to the reconstructed slice after the deblocking filter process.
• When not present, the value of slice_adaptive_loop_filter_flag is inferred to be equal to 0.
• adaptive_loop_filter_prediction_flag equal to 1 specifies that the adaptive loop filter parameter values will be copied from a previous slice.
• adaptive_loop_filter_prediction_flag equal to 0 specifies that the adaptive loop filter parameter values will be specified in the slice header of the current picture.
• When not present, the value of adaptive_loop_filter_prediction_flag is inferred to be equal to 0.
  • adaptive_loop_filter_prediction_ref_idx specifies the parameter reference index in ParamRefPicList that identifies the picture from which adaptive loop filter parameter values will be copied for the current slice.
  • the value of adaptive_loop_filter_prediction_ref_idx shall be in the range of 0 to NumPicTotalCurr, inclusive.
  • adaptive_loop_filter_prediction_slice_idx specifies the slice from which adaptive loop filter parameter values will be copied for the current slice.
• adaptive_loop_filter_parameter is currently an unspecified adaptive loop filter parameter.
• cabac_prediction_flag equal to 1 specifies that cabac initialization parameters will be copied from a reference picture.
• When not present, the value of cabac_prediction_flag is inferred to be equal to 0.
  • cabac_prediction_ref_idx specifies the parameter reference index in ParamRefPicList that identifies the reference picture from which cabac initialization parameters will be copied for the current slice.
• the value of cabac_prediction_ref_idx shall be in the range of 0 to NumPicTotalCurr - 1, inclusive.
  • the decoding process operates as follows for the current picture CurrPic:
  • the decoding process for reference picture lists construction specified in clause 8.3.4 is invoked for derivation of reference picture list 0 (RefPicListO) and, when decoding a B slice, reference picture list 1 (RefPicListl ), and the decoding process for collocated picture and no backward prediction flag specified in clause 8.3.5 is invoked for derivation of the variables ColPic and NoBackwardPredFlag.
  • the decoding process for parameter reference picture list construction specified in clause 8.3.6 is invoked for derivation of the parameter reference picture list ParamRefPicList.
  • ParamRefPicList is derived as follows:
• This process is invoked at the beginning of the decoding process for each I, P or B slice, after the invocation of clause 8.3.6.
• the variable SliceIdxCurrPic is set equal to the variable SliceIdx of the current picture.
• the variable SliceIdx of the current slice is set equal to the variable SliceIdx of the current picture
• the picture AlfPic is the picture ParamRefPicList[ adaptive_loop_filter_prediction_ref_idx ]. It is a requirement of bitstream conformance that adaptive_loop_filter_prediction_slice_idx shall be smaller than the variable SliceIdx of the picture AlfPic.
• AlfParams[ SliceIdxCurrPic ][ n ] of the current picture is set equal to AlfParams[ adaptive_loop_filter_prediction_slice_idx ][ n ] of the picture specified by AlfPic, with n = 0..nALFParams - 1. It is a requirement of bitstream conformance that no value of AlfParams[ adaptive_loop_filter_prediction_slice_idx ][ n ] of the picture specified by AlfPic shall be equal to "no parameter value".
• if the coding tree unit is the first coding tree unit in a tile, the following applies: - If cabac_prediction_flag is equal to 1, the synchronization process for context variables and Rice parameter initialization states as specified in clause 9.3.2.4 is invoked with TableStateIdxRefPic, TableMpsValRefPic and TableStatCoeffRefPic of the picture indicated by ParamRefPicList[ cabac_prediction_ref_idx + 1 - first_slice_segment_in_pic_flag ] as inputs.
• This clause specifies the application of three in-loop filters.
  • the in-loop filter process is specified as optional in Annex A, the application of either or both of these filters is optional.
• namely the deblocking filter,
• the sample adaptive offset filter,
• and the adaptive loop filter.
• the deblocking filter process as specified in clause 8.7.2 is invoked with the reconstructed picture sample array SL and, when ChromaArrayType is not equal to 0, the arrays Scb and Scr as inputs, and the modified reconstructed picture sample array S'L and, when ChromaArrayType is not equal to 0, the arrays S'cb and S'cr after deblocking as outputs.
  • sample_adaptive_offset_enabled_flag 1
• This process is invoked after the completion of the sample adaptive offset process for the entire decoded picture. This process is invoked for each slice for which slice_adaptive_loop_filter_flag is equal to 1.
• each slice shall use the adaptive loop filter parameter values from AlfParams[ Idx ][ n ] where Idx is the SliceIdx of the slice.
  • Figure 4 shows a decoder 400 according to embodiments of the disclosure.
  • the decoder may be implemented in hardware, in software or a combination of hardware and software.
  • the decoder may be implemented in, e.g. comprised in, user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer.
  • the decoder 400 comprises receiving means or a receiving module 402, identifying means or an identifying module 404, creating means or a creating module 406, determining means or a determining module 408, deriving means or a deriving module 410 and decoding means or a decoding module 412.
  • the receiving means/module 402 is operative to receive an encoded representation of a current picture or slice of a video sequence from an encoder.
  • the identifying means/module 404 is operative to identify a set of previously decoded reference pictures or slices for the current picture or slice.
  • the creating means/module 406 is operative to create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set.
  • the determining means/module 408 is operative to determine, from the list, a previously decoded reference picture or slice to use for prediction.
  • the deriving means/module 410 is operative to derive final parameter values by predicting parameter values using the determined reference picture or slice.
  • the decoding means/module 412 is operative to decode the current picture or slice from the encoded representation, using the final parameter values.
  • each of the modules may be implemented purely in hardware, or purely in software. Alternatively, the modules may be implemented in a combination of hardware and software.
  • the decoder may be implemented in or comprise processing circuitry and a non-transitory machine-readable medium storing instructions which, when executed by the processing circuitry, cause the decoder to: receive an encoded representation of a current picture or slice of a video sequence from an encoder; identify a set of previously decoded reference pictures or slices for the current picture or slice; create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; determine, from the list, a previously decoded reference picture or slice to use for prediction; derive final parameter values by predicting parameter values using the determined reference picture or slice; and decode the current picture or slice from the encoded representation, using the final parameter values.
  • the present disclosure thus provides methods, apparatus and computer programs for decoding video media, and particularly for determining one or more parameters values to be used in decoding video data based on parameter values for one or more previously decoded reference pictures or slices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The disclosure provides methods, apparatus and computer programs for decoding media. One method, performed by a decoder, for predicting parameter values from a previously decoded reference picture to a current picture or a current slice of a current picture, comprises: receiving an encoded representation of the current picture or slice of a video sequence from an encoder; identifying a set of previously decoded reference pictures or slices for the current picture or slice; creating an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; determining, from the list, a previously decoded reference picture or slice to use for prediction; deriving final parameter values by predicting parameter values using the determined reference picture or slice; and decoding the current picture or slice from the encoded representation, using the final parameter values.

Description

METHODS, APPARATUS, AND COMPUTER PROGRAMS FOR DECODING
MEDIA
TECHNICAL FIELD
Embodiments of the present disclosure relate to media decoding, and particularly to methods, apparatus and computer programs for decoding media such as encoded video streams.
BACKGROUND
High Efficiency Video Coding (HEVC) is a block-based video codec standardized by ITU-T and MPEG that utilizes both temporal and spatial prediction. Spatial prediction is achieved using intra (I) prediction from within the current frame. Temporal prediction is achieved using inter (P) or bi-directional inter (B) prediction on block level from previously decoded reference pictures. The difference between the original pixel data and the predicted pixel data, referred to as the residual, is transformed into the frequency domain, quantized and then entropy coded before being transmitted together with necessary prediction parameters, such as mode selections and motion vectors, which are also entropy coded. By quantizing the transformed residuals, the trade-off between bitrate and quality of the video may be controlled. The level of quantization is determined by the quantization parameter (QP). The decoder performs entropy decoding, inverse quantization and inverse transformation to obtain the residual, and then adds the residual to an intra or inter prediction to reconstruct a picture.
Context Adaptive Binary Arithmetic Coding (CABAC) is an entropy coding tool used in HEVC. CABAC encodes binary symbols, which keeps the complexity low and allows modelling of probabilities for the more frequently used bits of a symbol. The probability models are selected adaptively based on local context, because coding modes are usually locally well correlated. Reference Picture Sets (RPS) is a concept in HEVC that defines how previously decoded pictures are managed in a decoded picture buffer (DPB) in order to be used for reference, i.e., sample data prediction and motion vector prediction. In other words, which pictures to store is signaled in HEVC using the RPS. An RPS is a set of indicators to previously decoded pictures. An RPS is signaled in each slice header in HEVC. All pictures in the DPB that are not included in the RPS are marked as "unused for prediction". Once a picture has been marked "unused for prediction" it can no longer be used for prediction, and when it is no longer needed for output it can be removed from the DPB. If a picture in the RPS is set to "used by curr pic", it means the picture may be used as a reference picture for the current picture.
HEVC uses at most two reference picture lists, L0 and L1, for each picture. P-pictures use L0 and B-pictures use both L0 and L1. The reference picture lists are constructed from the RPS subsets RefPicSetStCurrBefore, RefPicSetStCurrAfter and RefPicSetLtCurr. MPEG and ITU-T have recently started the development of the successor to HEVC within the Joint Video Exploration Team (JVET). In the exploration phase, an experimental software codec called the Joint Exploration Model (JEM) is being used, which is based on the HEVC reference codec software HM. One tool that is part of JEM is Adaptive Loop Filtering (ALF). ALF was also investigated during the development of HEVC but it was removed prior to the finalization of the standard. In the current version of ALF (JEM 3.1), one among 25 filters is selected for the luma component for each 2x2 block based on the direction and activity of local gradients. Up to 25 sets of luma filter coefficients can be signaled in the slice header. To reduce the bitrate, ALF coefficients (parameters) may also be predicted from reference pictures.
In video coding, prediction across pictures other than sample value prediction used in motion compensation has historically not been used to a large extent. In more recent video encoders, however, there are a few examples.
One example is the temporal motion vector prediction (TMVP) technique in the HEVC video coding standard. The TMVP technique extends the prediction of motion vectors to also use motion vectors from previously decoded pictures for prediction. Looking at Figure 1, the set of possible motion vector predictors for a current block C includes five previously decoded motion vectors from the same picture as the current block, which are illustrated by the spatial positions a-e in the figure. The set of possible predictors also includes two possible motion vectors from a previously decoded block P of a reference picture, which are illustrated by the positions A and B in the figure. In case there are several reference pictures used for the current picture, the picture to select is signaled in the HEVC bitstream using the collocated_ref_idx code word. The value of this code word is used as an index into a reference picture list. In case of a uni-predicted slice (slice_type equal to P), the index is used with the final L0 reference picture list. In case of a bi-predicted slice (slice_type equal to B), there is an additional code word, collocated_from_l0_flag, that specifies whether collocated_ref_idx applies to the final L0 or the final L1 reference picture list. The result of these code words is a single picture that will be used as the co-located picture for the slice. All TMVP motion vectors for the slice will come from this co-located picture.
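The resolution of the co-located picture from these code words may be sketched as follows; this is a simplified Python illustration, and the function name and list representation are assumptions, not HEVC reference software:

```python
# Minimal sketch of co-located picture selection for TMVP from the
# slice-level code words collocated_ref_idx and collocated_from_l0_flag.

def select_colocated_picture(slice_type, l0, l1,
                             collocated_ref_idx,
                             collocated_from_l0_flag=True):
    """Return the single co-located picture for a slice.

    slice_type: 'P' (uni-predicted) or 'B' (bi-predicted).
    l0, l1: final reference picture lists (lists of picture identifiers).
    """
    if slice_type == 'P':
        # P-slices only use L0; the index always applies to it.
        return l0[collocated_ref_idx]
    # For B-slices, collocated_from_l0_flag chooses between L0 and L1.
    ref_list = l0 if collocated_from_l0_flag else l1
    return ref_list[collocated_ref_idx]

# All TMVP motion vectors for the slice are then taken from this picture.
print(select_colocated_picture('B', [10, 8], [12, 14], 1, False))  # 14
```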
A second more recent example of cross-picture prediction is the technique of predicting adaptive loop filter (ALF) parameter values from one picture to another. In this description we use the term "ALF parameters" as short for "ALF parameter values". Since those parameters may need a high number of bits to be expressed without prediction, and there is a correlation between the parameter values of consecutive pictures, the JEM 3.1 video codec from the JVET group allows such prediction of ALF parameters. For each picture, there are three basic ALF options in JEM 3.1. The first is to disable ALF for the picture. The second is to use ALF and send the ALF coefficients explicitly in the bitstream. The third is to predict ALF parameter values from a previously decoded picture. The option to use for each picture is signaled in the slice header. For ALF prediction, the decoding method in JEM includes storing ALF parameter values for the 6 most recently decoded pictures for which ALF parameters were explicitly signaled. The parameters are stored in a FIFO queue. This means that if the queue is full, a new set of ALF parameters overwrites the oldest parameters in the queue. For each slice there is a flag in the slice header that specifies whether prediction of ALF parameters is done. If so, there is then an index signaled in the slice header that specifies which of the at most 6 previous pictures to use for prediction.
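The FIFO storage described above may be sketched as follows; this is an illustrative Python fragment in which the class and parameter representation are assumptions, not JEM source code:

```python
from collections import deque

# Illustrative sketch of the JEM-style FIFO: ALF parameter sets of the 6
# most recently decoded pictures with explicitly signaled parameters are
# kept, and a slice-header index selects one of them for prediction.

MAX_STORED = 6

class AlfParamStore:
    def __init__(self):
        self.queue = deque(maxlen=MAX_STORED)  # oldest entry drops when full

    def store(self, alf_params):
        # Called after decoding a picture whose ALF parameters were
        # explicitly signaled.
        self.queue.appendleft(alf_params)

    def predict(self, index):
        # index decoded from the slice header: 0 = most recently stored set.
        return self.queue[index]

store = AlfParamStore()
for pic in range(8):                 # store 8 sets; only the last 6 survive
    store.store({'picture': pic})
print(store.predict(0))              # {'picture': 7}
print(store.predict(5))              # {'picture': 2}
```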
A third recent example is cross-picture prediction of CABAC probability states for context models in JEM. After coding a centrally-located block of a picture, the probability states of the context models are stored. These can then optionally be used as the initial CABAC state for later pictures. In JEM 3.1, the set of initial states for each inter-coded slice is copied from the stored states of a previously coded picture that has the same slice type and the same slice-level QP as the current slice.
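The state reuse described above may be illustrated by the following Python sketch. The dictionary-based state representation and function names are assumptions for illustration; real CABAC context models are more elaborate:

```python
# Sketch of cross-picture CABAC state reuse: after coding a
# centrally-located block, context-model states are stored under the key
# (slice_type, slice_qp); a later slice with a matching key starts from the
# stored states instead of the default initialization.

DEFAULT_STATES = {'ctx0': 0.5, 'ctx1': 0.5}   # stand-in for real context models

stored_states = {}

def store_states(slice_type, slice_qp, states):
    stored_states[(slice_type, slice_qp)] = dict(states)

def initial_states(slice_type, slice_qp):
    # Copy stored states if a previously coded picture matches both the
    # slice type and the slice-level QP; otherwise fall back to defaults.
    return dict(stored_states.get((slice_type, slice_qp), DEFAULT_STATES))

store_states('B', 32, {'ctx0': 0.8, 'ctx1': 0.1})
print(initial_states('B', 32))   # reuses the stored states
print(initial_states('P', 32))   # no match -> default initialization
```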
During HEVC standardization (JCTVC-F747), two main options were identified for carrying data that may change on a picture-by-picture basis: either carry the data in the picture parameter set (PPS) or carry the data in a separate parameter set called the adaptation parameter set (APS). In the case of an APS, two slices of the same picture could either be free to point to different APSes or be constrained to point to the same single APS. The APS concept was adopted into the HEVC draft at the F meeting but was later removed from the HEVC specification.
SUMMARY
The JEM signaling in examples two and three above is very rudimentary. This leads to a number of problems. A first problem with existing solutions for predicting across pictures is that the encoder needs to control the prediction to ensure that there is no mismatch between encoder and decoder in cases of, e.g., temporal layer pruning or random access operation on the bitstream. For example, the current JEM method for prediction of ALF parameters would force the encoder to disable ALF parameter prediction for some pictures that could have used it, in order to avoid mismatch for temporal pruning and random access.
A second problem is that current solutions only support picture-to-picture prediction. There is no described behavior of what to do when there are multiple slices or multiple tiles in a previous picture.
A third problem is that current methods for decoding a subset of temporal layers may not work. The reason for this is that the state of the queue will differ depending on whether a high temporal layer is decoded or not. If a high temporal layer is decoded, the ALF parameters from that layer will be stored in the queue and the queue will hold more parameters compared to the case when the high temporal layer is not decoded.
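The divergence described in the third problem can be demonstrated with a short Python sketch; the stream and queue structures are simplified stand-ins for illustration only:

```python
from collections import deque

# Demonstration of the third problem: a JEM-style FIFO diverges when a
# high temporal layer is pruned, so the same signaled index refers to
# different pictures for different receivers.

def build_queue(decoded_pics, maxlen=6):
    q = deque(maxlen=maxlen)
    for pic in decoded_pics:
        q.appendleft(pic)          # most recently decoded first
    return list(q)

full_stream   = [('poc0', 0), ('poc1', 2), ('poc2', 1)]   # (picture, tid)
pruned_stream = [p for p in full_stream if p[1] <= 1]      # tid-2 layer removed

print(build_queue(full_stream))    # [('poc2', 1), ('poc1', 2), ('poc0', 0)]
print(build_queue(pruned_stream))  # [('poc2', 1), ('poc0', 0)]
# Index 1 now refers to different pictures -> encoder/decoder mismatch.
```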
A fourth problem is that current methods may not be robust to errors. For ALF, a queue with at most 6 sets of ALF parameters is built up from previously encoded/decoded pictures and an index is signaled indicating which set of ALF parameters to use. If a picture is lost, the current ALF prediction scheme may not be aware of this, so ALF parameters which have not been updated correctly may be used. For current versions of CABAC, there are at most 14 stored CABAC probability state sets (2 slice types * 7 slice QPs). Again, if a picture is lost, the states of the CABAC storages in the encoder and decoder may differ and the decoder may use an incorrect set.

A fifth problem, particular to CABAC, is that current methods of decoding a subset of temporal layers may not work. If the slice QP of a higher temporal layer is the same as the slice QP of a current slice, it may well happen that the CABAC state of the higher temporal layer picture is used for the current picture. The problem is then that the higher temporal layer picture may not be received by the decoder if the bitstream has been pruned to contain only a subset of temporal layers.
A solution to one or more of these problems is therefore required. It would be possible to use a parameter set such as the picture parameter set (PPS) or the adaptation parameter set (APS) for carrying data to be used across pictures. However, since data such as ALF parameters and CABAC probability states are anticipated to change on a picture-by-picture basis, the encoder will need to change parameter sets on the fly as it encodes pictures. Neither the PPS nor the APS is resilient against packet losses when parameter sets are modified on the fly so they may not be suitable.
According to some embodiments of the present disclosure, a new ordered list of indicators pointing to reference pictures or reference slices is created and used for predicting parameter data from one previous picture or slice to a current picture or slice. A special case is to use a list of only one indicator. In this case, the creation of a list is reduced to identifying one single reference picture or reference slice.
The ordered list may for example be used for temporal ALF parameter prediction (TAPP) and the single identification may be used for temporal CABAC probability state prediction (TCSP). Note that other types of parameters than the examples of ALF parameters and CABAC probability states may be predicted from a previous slice to a current slice. The proposed solution is applicable to cross-picture prediction of any decoding parameters or data and is not limited to ALF and/or CABAC data.
One aspect of the disclosure provides a method, performed by a decoder, for predicting parameter values from a previously decoded reference picture to a current picture or a current slice of a current picture. The method comprises: receiving an encoded representation of the current picture or slice of a video sequence from an encoder; identifying a set of previously decoded reference pictures or slices for the current picture or slice; creating an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; determining, from the list, a previously decoded reference picture or slice to use for prediction; deriving final parameter values by predicting parameter values using the determined reference picture or slice; and decoding the current picture or slice from the encoded representation, using the final parameter values.
Another aspect of the disclosure provides a decoder for predicting parameter values from a previously decoded reference picture to a current picture or current slice of a current picture. The decoder is configured to: receive an encoded representation of the current picture or slice of a video sequence from an encoder; identify a set of previously decoded reference pictures or slices for the current picture or slice; create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; determine, from the list, a previously decoded reference picture or slice to use for prediction; derive final parameter values by predicting parameter values using the determined reference picture or slice; and decode the current picture or slice from the encoded representation, using the final parameter values.
Another aspect of the embodiments defines a decoder for predicting parameter values from a previously decoded reference picture to a current picture or current slice of a current picture. The decoder comprises: a receiver module configured to receive an encoded representation of the current picture or slice of a video sequence from an encoder; an identifying module configured to identify a set of previously decoded reference pictures or slices for the current picture or slice; a creating module configured to create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; a determining module configured to determine, from the list, a previously decoded reference picture or slice to use for prediction; a deriving module configured to derive final parameter values by predicting parameter values using the determined reference picture or slice; and a decoding module configured to decode the current picture or slice from the encoded representation, using the final parameter values.
The decoder could also comprise a receiving means configured to receive an encoded representation of the current picture or slice of a video sequence from an encoder; an identifying means configured to identify a set of previously decoded reference pictures or slices for the current picture or slice; a creating means configured to create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; a determining means configured to determine, from the list, a previously decoded reference picture or slice to use for prediction; a deriving means configured to derive final parameter values by predicting parameter values using the determined reference picture or slice; and a decoding means configured to decode the current picture or slice from the encoded representation, using the final parameter values.
The decoder may be implemented in hardware, in software or a combination of hardware and software. The decoder may be implemented in, e.g. comprised in, user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer.
A further aspect of the embodiments defines a computer program for a decoder, for predicting parameter values from a previously decoded reference picture to a current picture or current slice of a current picture. The computer program comprises computer program code which, when executed, causes the decoder to: receive an encoded representation of the current picture or slice of a video sequence from an encoder; identify a set of previously decoded reference pictures or slices for the current picture or slice; create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; determine, from the list, a previously decoded reference picture or slice to use for prediction; derive final parameter values by predicting parameter values using the determined reference picture or slice; and decode the current picture or slice from the encoded representation, using the final parameter values.
A further aspect of the embodiments defines a computer program product for a decoder, for predicting parameter values from a previously decoded reference picture to a current picture or current slice of a current picture. The computer program product comprises a non-transitory computer-readable medium storing computer program code which, when executed, causes the decoder to: receive an encoded representation of the current picture or slice of a video sequence from an encoder; identify a set of previously decoded reference pictures or slices for the current picture or slice; create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; determine, from the list, a previously decoded reference picture or slice to use for prediction; derive final parameter values by predicting parameter values using the determined reference picture or slice; and decode the current picture or slice from the encoded representation, using the final parameter values.

One advantage of embodiments of the present disclosure is that they remove the burden on the encoder to cleverly control the use of parameter prediction in order to avoid mismatches for temporal layer pruning and random access operations.
Another advantage is that full prediction flexibility is enabled which provides opportunities for improved compression efficiency.
A third advantage is that prediction in a multi-slice scenario is supported.
Compared to using the PPS or APS, a fourth advantage is improved error resilience. The RPS design is robust against packet losses, so by tying prediction data to reference pictures and using the robust RPS mechanisms, error resilience is preserved. By error resilience we here mean the ability to know what has been lost. The RPS provides information about which picture was lost. With the proposed method, the decoder will, in case of parameter loss, know which picture loss caused it.
Some of the embodiments below are described in the context of HEVC. A person skilled in the art would understand that HEVC may be replaced in the text with other existing or future video codecs. The term "slice" herein is taken to mean any partition of a picture. Slice can therefore mean independent slice segment, dependent slice segment, tile, or any other partition of a picture.
Note that temporal layering is used in this application as a layer example. A person skilled in the art should know that the methods described herein also would apply to other types of layers, such as e.g. spatial, SNR, and view layers.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 illustrates a set of possible motion vector predictors for a current block C, including five previously decoded motion vectors from the same picture as the current picture, which are illustrated by the spatial positions a-e in the figure, and two possible motion vectors from a previously decoded block P of a reference picture, which are illustrated by the positions A and B.
Figure 2 is a flowchart of a method for predicting parameter values from a previously decoded reference picture to a current slice according to embodiments of the present disclosure.
Figure 3 illustrates one example of picture prediction according to an embodiment of the present disclosure.
Figure 4 illustrates a decoder according to embodiments of the present disclosure.

DETAILED DESCRIPTION OF THE PROPOSED SOLUTION
Figure 2 is a flowchart of a method according to embodiments of the disclosure. The method may be carried out in a decoder, such as the decoder 400 described below with respect to Figure 4, for example.
In step 200, the decoder receives an encoded representation of a current picture, or a current slice of a current picture, of a video sequence. The encoded representation may be received from an encoder, for example. The decoder begins to decode the slice or picture header.
In step 202, the decoder identifies a set of previously decoded reference pictures or slices for the current picture or slice. For example, step 202 may comprise decoding the reference picture set (RPS), and identifying which pictures or slices in the RPS are reference pictures for the current picture or slice. In one embodiment, the decoder may determine the reference pictures in one or more (or all) of the lists RefPicSetStCurrBefore, RefPicSetStCurrAfter, and RefPicSetLtCurr. The former two lists are reference picture lists containing short-term reference pictures (i.e. pictures stored as short term pictures in the decoded picture buffer, DPB), having picture order count values which are higher and lower, respectively, than the current picture order count value; the latter list is a reference picture list containing long-term reference pictures (i.e. pictures stored as long term pictures in the DPB).
The list construction process may include only those pictures or slices that belong to a temporal layer that is equal to or lower than the temporal layer of the current slice. This solves the third problem stated above by ensuring that the reference picture list for parameter prediction is the same for a particular picture regardless of whether higher temporal layers have been removed or not.

In step 204, the decoder creates an ordered list of indicators pointing to previously decoded reference pictures or slices. The indicators may point to one or more of the reference pictures or slices identified in step 202; alternatively, the indicators may point to any previously decoded reference picture or slice. In one embodiment, step 204 comprises re-using the final reference picture lists L0 and L1. These are temporary lists created for the purposes of decoding picture data: L0 comprises a list of reference pictures used for both P and B slices or pictures; L1 comprises a list of reference pictures used for B slices or pictures. Since the RPS mechanisms guarantee that no picture that may be unavailable at random access or when temporal pruning is done is included in the L0 and L1 lists, this embodiment solves both the error resilience and the temporal pruning problems identified above.
However, as discussed below, this embodiment is also associated with certain drawbacks. In alternative embodiments, rather than re-using L0 or L1 lists, a new list is created for the purposes of predicting parameter values according to embodiments of the disclosure.
In one embodiment, if the reference pictures identified in step 202 have not already been limited to those pictures that belong to a temporal layer that is equal to or lower than the temporal layer of the current slice, step 204 may comprise identifying only those pictures that belong to a temporal layer that is equal to or lower than the temporal layer of the current slice.
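Steps 202 and 204 above may be illustrated by the following Python sketch, in which the RPS subsets are modeled as plain lists and the picture records are hypothetical stand-ins for DPB entries:

```python
# Minimal sketch of identifying reference pictures from the RPS subsets and
# building the ordered indicator list, keeping only pictures whose temporal
# layer does not exceed that of the current slice.

def build_param_ref_list(rps_curr_before, rps_curr_after, rps_lt_curr,
                         current_tid):
    candidates = rps_curr_before + rps_curr_after + rps_lt_curr
    # Exclude higher temporal layers so the list is identical whether or
    # not those layers were pruned from the bitstream.
    return [pic for pic in candidates if pic['tid'] <= current_tid]

refs = build_param_ref_list(
    rps_curr_before=[{'poc': 4, 'tid': 0}, {'poc': 6, 'tid': 2}],
    rps_curr_after=[{'poc': 8, 'tid': 1}],
    rps_lt_curr=[{'poc': 0, 'tid': 0}],
    current_tid=1)
print([pic['poc'] for pic in refs])   # [4, 8, 0]
```

The picture with temporal id 2 is excluded, so the resulting ordered list is the same whether or not that layer was pruned.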
In step 206, the decoder determines, from the list created in step 204, one or more previously decoded reference pictures or slices to use for prediction in the current slice or picture. The determination of which reference picture to use may be based on an index decoded from the bitstream received in step 200, pointing to one of the pictures or slices in the ordered list. For example, an index value of "0" may point to a first or initial picture or slice in the list, and so on. Further detail regarding this aspect can be found below with respect to Embodiments 8 and 9.

In step 208, the decoder utilizes the one or more previously decoded reference pictures to derive final parameter values by predicting the final parameter values based on the one or more previously decoded reference pictures, and particularly based on the parameter values for those one or more previously decoded reference pictures. For example, calculating parameters for a current slice or picture based on parameters from a previous slice or picture may include:
1) Copy method: Copy parameter values as-is from a previous slice.

For example, assume that a previous slice S used ALF with a set of ALF parameter values. When decoding the slice header of a current slice C, the decoder decodes which previous slice to use for ALF parameter prediction. The decoder then copies or uses the same ALF parameter values for slice C that were used for slice S.

2) Prediction method: Use parameter values from a previous slice as a prediction for the current slice and derive final parameter values by using both values from a previous slice and values signaled for the current slice.
For example, assume that a previous slice S used ALF with a set of ALF parameter values. When decoding the slice header of a current slice C, the decoder decodes which previous slice to use for ALF parameter prediction. For at least one ALF parameter in the set of ALF parameters, the decoder then decodes an ALF parameter delta value and combines this value with the corresponding ALF parameter value that was used in slice S. In one embodiment the combination is done by addition and done for multiple ALF parameter values.
3) Overwrite method: Partially overwrite parameter values from a previous slice by values signaled in the current slice.

For example, assume that a previous slice S used ALF with a set of ALF parameter values. When decoding the slice header of a current slice C, the decoder decodes which previous slice to use for ALF parameter prediction. For at least one ALF parameter in the set of ALF parameters, the decoder decodes a parameter value from the data of the current slice C and uses this parameter value as-is. For at least one other ALF parameter, the decoder uses either method 1) or 2) above.

Any combination of methods 1, 2, and 3 can be used. For instance, methods 1 and 2 can be combined such that some parameters are copied and some parameters are predicted.

In step 210, the final parameter values determined in step 208 are used to decode the current slice or picture from the encoded representation received in step 200. This step thus comprises decoding the picture data (e.g. pixel values, etc.), using the parameter values determined in step 208. Thus the final parameter values are derived before decoding of the picture data begins.
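The three derivation methods above may be sketched together in a short Python fragment; the flat parameter dictionary and the way deltas and explicit values are passed in are assumptions for illustration, since which keys carry deltas or explicit values would in practice be signaled per slice:

```python
# Sketch of the copy, prediction, and overwrite methods for deriving final
# parameter values from a previous slice's parameters.

def derive_params(prev_params, deltas=None, explicit=None):
    # 1) Copy method: start from the previous slice's values as-is.
    params = dict(prev_params)
    # 2) Prediction method: combine signaled deltas with predicted values.
    for key, delta in (deltas or {}).items():
        params[key] = params[key] + delta
    # 3) Overwrite method: explicitly signaled values replace predictions.
    for key, value in (explicit or {}).items():
        params[key] = value
    return params

prev = {'coef0': 10, 'coef1': -3, 'coef2': 7}
final = derive_params(prev, deltas={'coef0': 2}, explicit={'coef2': 0})
print(final)   # {'coef0': 12, 'coef1': -3, 'coef2': 0}
```

Here coef0 is predicted (10 + 2), coef1 is copied, and coef2 is overwritten, showing how the methods combine.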
Likewise, the parameters to predict are in some embodiments exemplified with ALF or CABAC. It is to be understood that other types of parameters may be used in place of or in combination with ALF or CABAC. Examples of other types of parameters to predict from a previous slice according to embodiments of the disclosure include, but are not limited to, sample adaptive offset (SAO) parameters, coding tree structure parameters, interpolation filter coefficients, scaling matrices, slice_segment_address, slice_type, color_plane_id, collocated_ref_idx, weighted prediction parameters (e.g. luma and chroma weights), merge candidate parameters (e.g. five_minus_max_merge_candidates), QP modification parameters (e.g. slice_qp_delta, slice_cb_qp_offset, slice_cr_qp_offset), deblocking parameters (e.g. slice_beta_offset_div2, slice_tc_offset_div2), entry point data (e.g. num_entry_point_offsets, offset_len_minus1, entry_point_offset_minus1) and slice header extension (e.g. slice_segment_header_extension_length and slice_segment_header_extension_data_byte).
Other parameters to predict from a previous picture could be processed sample data (e.g. filtered, subsampled, etc.), intermediate sample data not used for output, residual data or statistics drawn from the picture.

1.1 Embodiment 1 - Keep separate parameter queue but create new parameter reference picture
In one embodiment, the current ALF queue in JEM is kept but a reference picture list for ALF prediction is introduced. For each slice, there is a list construction process that includes the pictures in the queue that belong to a temporal layer that is equal to or lower than the temporal layer of the current slice. This solves the third problem stated above by ensuring that the reference picture list for ALF prediction is the same for a particular picture regardless of whether higher temporal layers have been removed or not.
For CABAC, the current method is changed to store parameters for each combination of slice type and temporal id (and layer id and view id for spatial, SNR and view scalability). The entry to select for prediction can then be the most recently received (in decoding order) that has a temporal id (and layer id, and view id) equal to that of the current picture. If no such picture exists, the most recently received picture in decoding order that has a temporal id (and layer id, and view id) lower than that of the current picture is selected. If no such picture exists, CABAC prediction is prohibited. Alternatively, instead of the most recently received picture in decoding order, the picture that is closest in output order is selected. If two pictures are equally close, the one with the lower picture order count (alternatively, the one with the higher picture order count) shall be selected.

1.2 Embodiment 2
In one embodiment, cross-picture prediction is allowed only from pictures that are reference pictures for the current slice.
The following decoding steps illustrate the decoder operation for this embodiment:
1. The decoder starts decoding a slice or picture header.
2. The decoder decodes information as to which previously decoded pictures are reference pictures for the current slice or picture. The decoder identifies the reference pictures.
3. The decoder creates a list of picture indicators by including indicators to reference pictures. Note that there may be a limit on the size of the list such that not all picture indicators are included but only a subset of them.
4. The decoder receives information in the bitstream in the form of an index into the created list indicating which set of parameters to use for prediction. For example, an index value of 2 may mean that the third indicator in the list (since indices are assumed to start from 0) is used to identify the picture.
5. The decoder calculates the parameters to use for the current slice or picture based on the parameters from the indicated picture.
6. The decoder decodes the slice or picture, using the calculated parameters.

7. After decoding the slice or picture, the decoder stores the parameters used together with the current slice or picture to enable using the parameters for prediction in the future.

Step 2 above, for a decoder that uses reference picture sets (RPS), is preferably done by determining that the pictures that are included in the RPS are reference pictures. Preferably only the pictures that are included in RefPicSetStCurrBefore, RefPicSetStCurrAfter, or RefPicSetLtCurr are used. Step 3 above, for a decoder that uses reference picture sets (RPS), may be done by reusing the final reference picture lists L0 and L1. Since the RPS mechanisms guarantee that no picture that may be unavailable at random access or when temporal pruning is done is included in the L0 and L1 lists, the method solves both the error resilience and the temporal pruning problem.
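The index-based selection in the decoder steps above may be sketched as follows; the merged indicator list, the per-picture parameter store, and the plain copy in step 5 are simplifications assumed for this illustration:

```python
# Sketch of the Embodiment 2 decoding steps: build an indicator list from
# the reference pictures, select one by a signaled index, and copy its
# stored parameters (copy method).

param_store = {}   # parameters remembered per decoded picture (step 7)

def decode_slice_params(reference_pocs, signaled_index, max_list_size=None):
    # Step 3: build the indicator list, optionally capped in size.
    indicators = reference_pocs[:max_list_size] if max_list_size else list(reference_pocs)
    # Step 4: the index selects the picture whose parameters are predicted.
    source_poc = indicators[signaled_index]
    # Step 5: here, a plain copy of the stored parameters.
    return dict(param_store[source_poc])

param_store[8] = {'alf': 'params-of-poc-8'}
param_store[4] = {'alf': 'params-of-poc-4'}
print(decode_slice_params([8, 4], signaled_index=1))  # {'alf': 'params-of-poc-4'}
```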
The following steps illustrate the encoder operation for this embodiment:
1. The encoder starts encoding a slice or picture.

2. The encoder selects which previously encoded pictures are to be reference pictures for the current slice or picture.

3. The encoder creates a list of picture indicators by including indicators to reference pictures. Note that there may be a limit on the size of the list such that not all picture indicators are included but only a subset of them.

4. The encoder selects which picture to use for parameter prediction and puts information in the bitstream in the form of an index code word that identifies the picture and thereby the parameters.

5. The encoder calculates the parameter values to use for the current slice or picture based on the parameters from the indicated picture as specified above. For case 2 above (prediction method) the encoder first determines the parameter values to use. It thereafter uses the parameter values from the selected picture to form a signal to transmit in the bitstream, for example by subtracting the predicted parameter values from the determined parameter values. For case 1 above (copy method) the encoder uses the predicted parameter values as-is.

6. The encoder then encodes the current slice or picture.

7. After encoding the slice or picture, the encoder stores the parameters used together with the current slice or picture to enable using the parameters for prediction in the future.

However, there are at least two problems with allowing prediction only from reference pictures that are included in L0 and/or L1.
The first problem is for the case of Intra coded pictures that are not random-access point (RAP) pictures. One reason for using these is when content consisting of very frequent scene cuts is coded. Intra picture coding may be the most efficient type of picture or slice coding, but there is no need to enable RAP functionality for each Intra picture. Now, since an Intra picture does not use any reference pictures, a method using L0 and L1 would not allow for any cross-picture prediction. However, for an Intra picture that is not a random access picture, it would be advantageous to allow prediction from a previous picture.
The reason that this may not work for HEVC is that some syntax related to reference picture lists is not included for I-slices. Also, reference picture list construction (section 8.3.4) is only done for P and B slices, so the L0 and L1 lists are undefined or can be considered empty.
The second problem is that the construction of the L0 and L1 reference picture lists may be optimized for coding efficiency of motion vectors. For each motion vector, the reference picture to use must be signaled. The bit cost for signaling this depends on the number of available reference pictures. If there is only one reference picture, there is no need to signal anything since there is only one choice. If there are many possible reference pictures, the signaling space must have room for many options and this comes with a bit cost. To reduce the bit cost, the encoder may choose not to include all possible available reference pictures in L0 and/or L1 for the current picture.
Let's look at an example based on HEVC: assume that there are 5 available reference pictures, and that the correlation between samples of the current picture and samples of one particular reference picture is very high, while the correlation between the current picture and any other reference picture is very low. The encoder therefore chooses to include only that one picture in lists L0 and L1. Using L0 and L1 to indicate cross-picture prediction would then mean that only one set of parameters is available for prediction, although many more reference options could have been made available. But if the encoder instead included all 5 pictures in lists L0 and L1, there would be unnecessary overhead from repeatedly signaling which reference picture to use for the motion vectors of the current picture.
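As a rough illustration of this trade-off, the per-motion-vector cost of a reference index can be modeled with a simple fixed-length code. The sketch below is ours and illustrative only; real codecs use variable-length codes and CABAC, so the exact numbers differ:

```python
import math

def ref_idx_bits(num_ref_pics: int) -> int:
    """Bits per motion vector to signal a reference index, assuming an
    illustrative fixed-length code (not the actual HEVC entropy coding)."""
    if num_ref_pics <= 1:
        return 0  # only one choice, so nothing needs to be signaled
    return math.ceil(math.log2(num_ref_pics))

# With 1 picture in L0/L1 the reference index is free; with 5 pictures
# every motion vector pays 3 bits, even if one picture dominates.
assert ref_idx_bits(1) == 0
assert ref_idx_bits(5) == 3
```

This is why an encoder may deliberately keep L0/L1 short, which in turn limits the pictures reachable for parameter prediction if L0/L1 were reused for that purpose.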
1.3 Embodiment 3 - Construct parameter reference picture list depending on temporal layer
In this embodiment, a new list of reference picture indicators, separate from L0 and L1, is used for cross-picture prediction.
To construct the new list, all reference pictures for which the layer id(s) are equal to or lower than the corresponding layer id(s) of the current picture or slice are added. Alternatively, for pictures of the same temporal layer as the current picture or slice, no sub-layer non-reference picture is included in the new list.
This solves the problem of temporal pruning and random access, and at the same time does not impose any restriction on the creation of reference picture lists L0 and L1.

1.4 Embodiment 4 - Parameter reference picture list based on RPS
This preferred embodiment is similar to embodiment 3, but the list construction is based on the reference pictures in the reference picture set (RPS) that are available for the current picture. In HEVC, this is equivalent to letting the new list only include elements of the three sets RefPicSetStCurrBefore, RefPicSetStCurrAfter and RefPicSetLtCurr. An encoder that wishes to use short L0 and L1 lists must avoid setting used_by_curr_pic_flag to 0 for pictures that are allowed for prediction for the current picture. Such an encoder may use num_ref_idx_l0_default_active_minus1, num_ref_idx_l0_active_minus1, or ref_pic_lists_modification() syntax to shorten the lengths of the L0 and L1 lists.
List construction is for this embodiment based on the three sets RefPicSetStCurrBefore, RefPicSetStCurrAfter and RefPicSetLtCurr. Additionally, list construction can also be based on any layer identity such as temporal layer (temporal_id), spatial or SNR layer (layer_id), or view layer (view_id), for example such that the order of elements in the new list depends on the layer identity and/or such that the presence of a particular reference picture or slice in the list depends on the layer identity. List construction can, in addition to the three sets, be based on the output order, for example picture order count (POC), such that the order of entries in the new list is based on the output order of reference pictures or slices. List construction can also be based on the decoding order of pictures and/or slices, such that the order of entries in the new list is based on the decoding order. Finally, list construction can, in addition to the three sets, be based on matching characteristics between the current and reference picture/slice, such that only reference pictures/slices with matching characteristics are included in the new list, or such that the order of entries in the new list depends on matching characteristics. Examples of matching characteristics are whether particular tools are turned on or off, whether the picture or slice type is the same, whether the coded picture size is similar, and/or whether the configuration of a particular tool is identical.
The following decoding steps illustrate the decoder operation for this embodiment:
1. The decoder starts decoding a slice or picture header.
2. The decoder decodes the RPS and constructs the sets RefPicSetStCurrBefore, RefPicSetStCurrAfter and RefPicSetLtCurr.
3. The decoder creates a list of picture indicators by including indicators to the reference pictures of the three sets of step 2.
4. The decoder receives information in the bitstream in the form of an index into the created list, indicating which set of parameters to use for prediction. For example, an index value of 0 means that the first indicator in the list is used to identify the picture.
5. The decoder calculates the parameters to use for the current slice or picture based on the parameters from the indicated picture and decodes the slice or picture using those.
6. After decoding the slice or picture, the decoder optionally stores the parameters used together with the current slice or picture to enable using the parameters for prediction in the future.

An alternative sequence of steps can be expressed as follows:
1. The decoder decodes RPS information, for example in a slice header, a picture header or a picture parameter set.
2. The decoder creates a list LP containing all or some of the reference pictures in the RPS.
3. The decoder decodes an index i indicating which picture to predict parameters from.
4. The reference picture at position i in LP is used for predicting parameters of the current picture.
The following encoding steps illustrate the encoder operation for this embodiment:
1. The encoder starts encoding a slice or picture.
2. The encoder selects which previously encoded pictures are to be reference pictures for the current slice or picture.
3. The encoder writes to the output bitstream such that the selected reference pictures will be present in RefPicSetStCurrBefore, RefPicSetStCurrAfter and RefPicSetLtCurr when decoded.
4. The encoder creates a list of picture indicators by including indicators to the reference pictures of the three sets in step 3.
5. The encoder selects which picture to use for parameter prediction and puts information in the bitstream in the form of an index code word that identifies the picture and thereby the parameters.
6. The encoder calculates the parameter values to use for the current slice or picture based on the parameters from the indicated picture, as specified above. For case 2 above (prediction method) the encoder first determines the parameter values to use. It then uses the parameter values from the selected picture to form a signal to transmit in the bitstream, for example by subtracting the predicted parameter values from the determined parameter values. For case 1 above (copy method) the encoder uses the predicted parameter values as-is. The encoder then encodes the current slice or picture.
7. After encoding the slice or picture, the encoder stores the parameters used together with the current slice or picture to enable using the parameters for prediction in the future.
List construction can for example be done by first adding the short-term pictures that have an output order before the current picture. After those, the short-term pictures with an output order after the current picture may be added. Finally, the long-term pictures are added. This can be illustrated by the following pseudo-code, where ParamRefPicList is the list that is constructed and starts out empty:

rIdx = 0
for( i = 0; i < NumPocStCurrBefore; i++ )
    ParamRefPicList[ rIdx++ ] = RefPicSetStCurrBefore[ i ]
for( i = 0; i < NumPocStCurrAfter; i++ )
    ParamRefPicList[ rIdx++ ] = RefPicSetStCurrAfter[ i ]
for( i = 0; i < NumPocLtCurr; i++ )
    ParamRefPicList[ rIdx++ ] = RefPicSetLtCurr[ i ]
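The pseudo-code above can be rendered as the following Python sketch, with pictures represented by their POC values (an illustrative simplification):

```python
def build_param_ref_pic_list(st_curr_before, st_curr_after, lt_curr):
    """Build ParamRefPicList in the order of the pseudo-code above:
    short-term pictures before the current picture first, then short-term
    pictures after it, then long-term pictures."""
    param_ref_pic_list = []
    for pic in st_curr_before:
        param_ref_pic_list.append(pic)
    for pic in st_curr_after:
        param_ref_pic_list.append(pic)
    for pic in lt_curr:
        param_ref_pic_list.append(pic)
    return param_ref_pic_list

# Two short-term pictures before, two after, one long-term picture.
lst = build_param_ref_pic_list([4, 2], [6, 8], [0])
# lst == [4, 2, 6, 8, 0]
```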
In another list construction method, the entries are sorted in output order such that the pictures that are closest in output order come earlier in the list. One preferred method is to sort the list in increasing abs(CurrPOC - RefPOC), such that the entry with the smallest value of abs(CurrPOC - RefPOC) is first in the list. Here CurrPOC is the picture order count (POC) of the current picture and RefPOC is the picture order count of a reference picture or slice.
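A minimal sketch of this ordering, with pictures again represented by their POC values; note that a stable sort resolves ties between equally close pictures in favor of the earlier input entry, whereas the text below discusses explicit tie-breaking rules:

```python
def sort_by_output_distance(ref_pocs, curr_poc):
    """Order entries by increasing abs(CurrPOC - RefPOC), so the picture
    closest to the current picture in output order comes first."""
    return sorted(ref_pocs, key=lambda ref_poc: abs(curr_poc - ref_poc))

# For a current picture with POC 5: POCs 4 and 6 tie at distance 1,
# then 2 and 8 at distance 3, then 0 at distance 5.
order = sort_by_output_distance([0, 2, 4, 6, 8], curr_poc=5)
# order == [4, 6, 2, 8, 0]  (ties keep input order: Python's sort is stable)
```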
In one embodiment, no list with multiple entries is constructed, since only one element is selected for prediction. This is currently the case for CABAC in JEM 3.1. One method of selecting one element is to select the most recently received (in decoding order) picture that has a slice of the same type as the current slice and a temporal id (and layer id, and view id) equal to that of the current picture. If no such picture exists, the most recently received picture in decoding order that has a slice of the same type and a temporal id (and layer id, and view id) equal to or lower than that of the current picture is selected. If no such picture exists, CABAC prediction is prohibited. The parameters of the first slice in decoding order that belongs to the selected picture and has the same slice type as the current slice should be used.
Alternatively, and preferably, the closest picture in output order that has a slice of the same type as the current slice and a temporal id (and layer id, and view id) equal to that of the current picture is selected. If no such picture exists, the closest picture in output order that has a slice of the same type and a temporal id (and layer id, and view id) lower than that of the current picture is selected. If no such picture exists, CABAC prediction is prohibited. If two pictures are equally close, the picture that is output before the current picture is always selected. Alternatively, the picture that is output after the current picture is always selected. Alternatively, the picture that is closest to the current picture in decoding order is selected. Alternatively, there is a codeword in the bitstream that specifies which one to select in this case. The parameters of the first slice in decoding order that belongs to the selected picture and has the same slice type as the current slice should be used.

A realization of the embodiment for temporal prediction of ALF parameters is described in the pseudo-code below. storedAlfParams and storedUniqueAlfParams are arrays that are empty before encoding/decoding the first picture.

Encoder side:
1. Start encoding picture
2. Signal reference pictures in RPS
3. Determine based on the RPS which pictures may be used for parameter prediction and put them in an array paramRefPicList
4. Update the arrays storedAlfParams and storedUniqueAlfParams using paramRefPicList according to the pseudo-function updateStoredAlfParams below
5. Create a new explicit ALF parameter set AlfParamSet
6. Test encode with AlfParamSet (a), all stored ALF parameter sets in storedUniqueAlfParams (b), and without using ALF (c)
7. If AlfParamSet (a) or any ALF parameter set in storedUniqueAlfParams (b) is best:
    - Signal slice_adaptive_loop_filter_flag = 1 in the bitstream
    - Store the best ALF parameter set at the end of the storedAlfParams array and tag it with the current POC number
    - If any ALF parameter set in storedUniqueAlfParams (b) is best:
        - Signal adaptive_loop_filter_prediction_flag = 1 in the bitstream
        - Signal the index of the best ALF parameter set in storedUniqueAlfParams as adaptive_loop_filter_prediction_ref_idx in the bitstream
    - else (if a):
        - Signal adaptive_loop_filter_prediction_flag = 0 in the bitstream
        - Encode explicit AlfParamSet
   else (if c):
    - Signal slice_adaptive_loop_filter_flag = 0 in the bitstream
8. Encode picture using the best option from step 6
9. Go to 1 to encode the next picture
Decoder side:
1. Start decoding picture
2. Decode RPS and build paramRefPicList
3. Update the arrays storedAlfParams and storedUniqueAlfParams using paramRefPicList according to the pseudo-function updateStoredAlfParams below
4. Decode slice_adaptive_loop_filter_flag from the bitstream
5. If slice_adaptive_loop_filter_flag == 1:
    - Decode adaptive_loop_filter_prediction_flag from the bitstream
    - If adaptive_loop_filter_prediction_flag == 1:
        - Decode adaptive_loop_filter_prediction_ref_idx from the bitstream
        - Set currAlfParamSet to the ALF parameter set at index adaptive_loop_filter_prediction_ref_idx in storedUniqueAlfParams
    - else:
        - Decode explicit AlfParamSet
        - Set currAlfParamSet to AlfParamSet
    - Store currAlfParamSet at the end of the storedAlfParams array and tag it with the current POC number
    - Decode picture using currAlfParamSet
   else:
    - Decode picture without using ALF
6. Go to 1 to decode the next picture
updateStoredAlfParams( paramRefPicList, storedAlfParams, storedUniqueAlfParams )
{
    // Copy the storedAlfParams belonging to pictures in paramRefPicList to a temporary
    // array in the order of the paramRefPicList
    numStoredParams = 0
    for( i = 0; i < length( paramRefPicList ); i++ )
    {
        for( j = 0; j < MAX_NUM_REF_PICS + 1; j++ )
        {
            if( storedAlfParams[ j ].POC >= 0 && paramRefPicList[ i ].POC == storedAlfParams[ j ].POC )
            {
                storedAlfParamsTemp[ numStoredParams ] = storedAlfParams[ j ]
                storedAlfParamsTemp[ numStoredParams ].usedByCurr = paramRefPicList[ i ].usedByCurr
                numStoredParams++
            }
        }
    }
    // Copy the temporary array back to storedAlfParams
    for( i = 0; i < numStoredParams; i++ )
    {
        storedAlfParams[ i ] = storedAlfParamsTemp[ i ]
    }
    // Copy unique ALF params into storedUniqueAlfParams
    rIdx = 0
    for( i = 0; i < numStoredParams; i++ )
    {
        if( !storedAlfParams[ i ].usedByCurr ) // Don't add if not used by the current picture
            continue
        unique = 1
        for( j = 0; j < rIdx; j++ )
        {
            if( storedUniqueAlfParams[ j ] == storedAlfParams[ i ] )
            {
                unique = 0
            }
        }
        if( unique == 1 )
        {
            storedUniqueAlfParams[ rIdx ] = storedAlfParams[ i ]
            storedUniqueAlfParams[ rIdx ].currPOC = storedAlfParams[ i ].currPOC
            rIdx++
        }
    }
    numStoredParams = rIdx
}
A realization of the embodiment for temporal prediction of the CABAC probability state, where the picture to predict from is not signaled in the bitstream, is described in the pseudo-code below. storedCabacCtx is an array which is empty before encoding/decoding the first picture.
Encoder side:
1. Start encoding picture
2. Signal reference pictures in RPS
3. Determine based on the RPS which pictures may be used for parameter prediction and put them in an array paramRefPicList
4. Update the array storedCabacCtx using paramRefPicList according to the pseudo-function updateStoredCabacCtx below
5. Get the index bestPicIdx of the picture to predict from using the pseudo-function getBestPicIdxForCtxPred
6. If bestPicIdx does not point to a valid picture:
    - Select the CABAC context by initializing the CABAC context in the conventional way
   else:
    - Select the CABAC context to predict from as the CABAC context at position bestPicIdx in storedCabacCtx
7. Store the selected CABAC context at the end of the storedCabacCtx array and tag it with the current POC number
8. Encode picture using the selected CABAC context
9. Go to 1 to encode the next picture
Decoder side:
1. Start decoding picture
2. Decode RPS and build paramRefPicList
3. Update the array storedCabacCtx using paramRefPicList according to the pseudo-function updateStoredCabacCtx below
4. Get the index bestPicIdx of the picture to predict from using the pseudo-function getBestPicIdxForCtxPred
5. If bestPicIdx does not point to a valid picture:
    - Select the CABAC context by initializing the CABAC context in the conventional way
   else:
    - Select the CABAC context to predict from as the CABAC context at position bestPicIdx in storedCabacCtx
6. Store the selected CABAC context at the end of the storedCabacCtx array and tag it with the current POC number
7. Decode picture using the selected CABAC context
8. Go to 1 to decode the next picture

updateStoredCabacCtx( paramRefPicList, storedCabacCtx )
{
    // Copy the CABAC contexts belonging to pictures in paramRefPicList to a temporary
    // array in the order of the paramRefPicList
    numPicsInStore = 0
    for( i = 0; i < length( paramRefPicList ); i++ )
    {
        for( j = 0; j < MAX_NUM_REF_PICS + 1; j++ )
        {
            if( storedCabacCtx[ j ].POC >= 0 && paramRefPicList[ i ].POC == storedCabacCtx[ j ].POC )
            {
                tempStoredCabacCtx[ numPicsInStore ] = storedCabacCtx[ j ]
                tempStoredCabacCtx[ numPicsInStore ].usedByCurr = paramRefPicList[ i ].usedByCurr
                numPicsInStore++
            }
        }
    }
    // Copy the temporary array back to storedCabacCtx
    for( i = 0; i < numPicsInStore; i++ )
    {
        storedCabacCtx[ i ] = tempStoredCabacCtx[ i ]
    }
}

getBestPicIdxForCtxPred( storedCabacCtx, currPOC, currSliceType, currLayerID )
{
    bestPicIdx = -1
    bestLayerDiff = MAX_INT
    bestPOCDiff = MAX_INT
    for( i = 0; i < numPicsInStore; i++ )
    {
        // Check that the picture may be used and that the slice type is the same
        if( storedCabacCtx[ i ].usedByCurr && storedCabacCtx[ i ].sliceType == currSliceType )
        {
            // The current layer should be equal to or higher than the stored layer
            layerDiff = currLayerID - storedCabacCtx[ i ].layerID
            if( layerDiff >= 0 && layerDiff <= bestLayerDiff )
            {
                bestLayerDiff = layerDiff
                POCDiff = abs( storedCabacCtx[ i ].POC - currPOC )
                if( POCDiff < bestPOCDiff ) // Choose the one with the closest POC
                {
                    bestPOCDiff = POCDiff
                    bestPicIdx = i
                }
            }
        }
    }
    return bestPicIdx
}
1.5 Embodiment 5 - Predict from specific slice in picture
The previous embodiments assume that there is only one set of parameters for each picture. It may be advantageous to allow prediction from an individual previous slice in the case where multiple slices were used for the previous picture. In this case, the list construction can be done using one of the following two methods:
1) The new list consists of indicators to slices instead of pictures, optionally with duplicates removed
2) A preferred method in which the new list consists of picture indications as described earlier. In the case of multiple slices, the codeword indicating the reference picture is followed by a codeword indicating which reference slice to predict from. One preferred embodiment is to always use two UVLC code words: one code word to indicate the reference picture and one code word to indicate the reference slice. The benefit of always including the slice indication is that the slice header is parsable without knowing how many slices were used for a particular previous picture. Also, the overhead for the case where only one slice per picture is used is only one bit per slice. Slice indexing can be done by counting slices in decoding order, such that the first slice of a particular picture has index 0, the second slice has index 1, and so on. Alternatively, a slice id is sent for each slice and used as the index.
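The UVLC code words referred to here can be taken as unsigned Exp-Golomb (ue(v)) codes, as used in HEVC. The following sketch, which is ours rather than part of the embodiment, shows why the overhead is a single bit per slice in the one-slice-per-picture case:

```python
def ue_v(value: int) -> str:
    """Unsigned Exp-Golomb (UVLC) codeword as a bit string:
    (leading zeros) followed by the binary form of value + 1."""
    code = value + 1
    prefix_len = code.bit_length() - 1  # number of leading zero bits
    return "0" * prefix_len + format(code, "b")

# Index 0 costs a single '1' bit, so when every picture has one slice,
# the always-present slice-index codeword adds only one bit per slice.
assert ue_v(0) == "1"
assert ue_v(1) == "010"   # larger indices cost more bits
assert ue_v(2) == "011"
```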
In an alternative version of this embodiment, the codeword indicating which reference slice to predict from is not signaled. Instead, the same slice index as for the current slice is used to select the slice to predict from in the reference picture. If there are more slices in the current picture than in the referenced picture, and the current slice has an index higher than the number of slices in the reference picture minus 1, then the slice with the highest index in the referenced picture is used for predicting the parameters of the current slice. Note that slice boundaries do not need to be the same between frames.

In yet another version of this embodiment, the selected reference slice to predict from is collocated with the current slice, either by selecting the reference slice collocated with the first coding block of the current slice, or by selecting the reference slice that has the most coding blocks collocated with the current slice.

It may also be advantageous to enable prediction within the current picture, from a previous slice of the current picture to the current slice. For method 1, this can be done by including previous slices of the current picture in the new list. For method 2, this can be done by including the current picture in the list of reference pictures in case the current slice is not the first slice of the current picture.
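The index-clamping rule of the alternative version reduces to a one-line selection; the function name below is ours:

```python
def select_ref_slice_idx(curr_slice_idx: int, num_ref_slices: int) -> int:
    """When the slice index is not signaled, reuse the current slice index,
    clamped to the highest slice index available in the reference picture."""
    return min(curr_slice_idx, num_ref_slices - 1)

# Current picture has 4 slices, reference picture only 2: slices 0 and 1
# predict from their co-indexed slices; slices 2 and 3 both fall back to
# the last slice (index 1) of the reference picture.
assert select_ref_slice_idx(0, 2) == 0
assert select_ref_slice_idx(3, 2) == 1
```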
1.6 Embodiment 6 - Disallowing reference chains
In this embodiment, the temporal parameter prediction as described in any of the previous embodiments is only allowed from a picture that does not itself predict its parameters from another picture, i.e. a reference chain is not allowed. This could be realized at the encoder side by checking whether the picture to predict from uses prediction itself. The advantages of disallowing reference chains include:
- Not having to deal with the extra complexity of handling reference chains
- A shorter index may suffice to signal the picture, since fewer pictures are allowed
- It automatically prevents the case where a picture C references picture B, which references picture A, where A has been removed from the list (e.g. the RPS). Unless the parameter data is copied to picture B before picture A is removed, this case would result in a decoding error.
1.7 Embodiment 7 - Allowing reference chains
In this embodiment, the temporal parameter prediction as described in any of the previous embodiments is allowed to reference pictures that themselves predict parameters from another picture.
Assume, as shown in Figure 3, that a picture C with POC 4 at temporal ID (tID) 2 (temporal layer 2) predicts at least one parameter from a picture B with POC 8 at tID 1, and that picture B predicts the same parameter(s) from a picture A with POC 0 at tID 0.
In one realization of the decoding process, after parsing picture B, the parameters predicted from picture A are copied to the parameter buffer belonging to picture B, to make them available when decoding picture C. In another realization of the decoding process, the parameters are kept in a buffer array where one entry is a parameter set belonging to one or more pictures. Thus, after decoding picture A, the parameters are put in the buffer array at entry 0 and tagged with POC 0/tID 0. After decoding picture B, the parameter buffer at entry 0 in the buffer array is in addition tagged with POC 8/tID 1, and after decoding picture C the parameter buffer at entry 0 in the buffer array is also tagged with POC 4/tID 2. Now suppose that POC 0 is no longer available in the RPS, but POC 4 and POC 8 are, when decoding a picture D at tID 0. The tag POC 0/tID 0 is then removed from the parameter buffer at entry 0, and the parameters at entry 0 may not be predicted from, since the entry no longer contains any reference to a picture with the same or lower tID than the current picture D. When all tags have been removed from an entry, the parameter data of the entry is no longer referenceable and can be removed. The emptied entry in the buffer is replaced by moving down entries with a higher index.
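The tag-based buffer can be sketched as follows; the entry representation (a parameter set plus a list of (POC, tID) tags) is an illustrative assumption:

```python
def remove_unavailable_tags(buffer_entries, rps_pocs):
    """Sketch of the tag-based parameter buffer: each entry holds one
    parameter set plus the (poc, tid) tags of the pictures it belongs to.
    Tags whose POC has left the RPS are dropped; entries left with no
    tags are no longer referenceable and are removed entirely."""
    kept = []
    for params, tags in buffer_entries:
        live_tags = [(poc, tid) for (poc, tid) in tags if poc in rps_pocs]
        if live_tags:  # still referenceable via at least one tag
            kept.append((params, live_tags))
    return kept

# Entry 0 was tagged with POC 0/tID 0 and, after pictures B and C used it,
# also with POC 8/tID 1 and POC 4/tID 2. When POC 0 leaves the RPS, only
# the POC 0 tag is removed; the entry survives via POC 8 and POC 4.
buf = [("paramsA", [(0, 0), (8, 1), (4, 2)])]
buf = remove_unavailable_tags(buf, rps_pocs={4, 8})
# buf == [("paramsA", [(8, 1), (4, 2)])]
```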
1.8 Embodiment 8 - Signaling of parameter index
In this embodiment it is described how the signaling of the parameter index can be realized. In the table below, this is exemplified with the ALF parameters. The first column gives the picture order count (POC) of the picture, the second column indicates whether ALF is used by the current picture, and the third column indicates whether ALF is predicted and, if so, from which POC. The fourth column shows the POCs for the unique ALF parameters, and the last column shows the codeword signaled to represent the POC of the picture from which the ALF parameters are predicted.
The list of unique ALF params is built up on both the encoder and decoder side. When the decoder receives the signaled codeword, it uses the ALF parameters at the corresponding index. Since both the encoder and decoder update the unique ALF params list identically based on the RPS, it is sufficient to send only the codeword for the list. The method according to embodiment 1 should be used to ensure that the unique ALF params lists are updated identically also when there is temporal pruning, i.e. when higher temporal layers are removed before decoding.
[Table, shown as an image in the original: for each picture, its POC, whether ALF is used, whether and from which POC ALF parameters are predicted, the POCs of the unique ALF parameters, and the signaled codeword.]
In order to avoid signaling overhead, it may be advantageous to avoid entries in the new list for which the corresponding reference picture or slice is not associated with any parameter values suitable for prediction. A first example is reference pictures for which no parameters are available because no parameters were signaled. For instance, a particular reference picture may not have used a particular tool, and therefore no parameters were signaled for that reference picture. A second example is that other characteristics of the current and/or reference picture or slice make the prediction unsuitable. For instance, the parameter values of a particular type may differ a lot depending on the slice type (I, P or B). If the current slice and a reference slice have different slice types, prediction is then unsuitable. In all these cases, unsuitable entries in the new list should be avoided by simply performing corresponding checks during construction of the new list and not adding entries that are unsuitable. Alternatively, in order to avoid signaling overhead, it may be advantageous to avoid duplicates in the new list. This can be done by identifying duplicates, either by pruning the new list after it has been constructed or by not adding duplicates during list construction. This can be illustrated for embodiment 4 by the following pseudo-code, where ParamRefPicList is the new list that is constructed and starts out empty:
rIdx = 0
for( i = 0; i < NumPocStCurrBefore; i++ )
    unique = 1
    for( j = 0; j < rIdx; j++ )
        if( parameter values of ParamRefPicList[ j ] are identical to parameter values of RefPicSetStCurrBefore[ i ] )
            unique = 0
    if( unique == 1 )
        ParamRefPicList[ rIdx++ ] = RefPicSetStCurrBefore[ i ]
for( i = 0; i < NumPocStCurrAfter; i++ )
    unique = 1
    for( j = 0; j < rIdx; j++ )
        if( parameter values of ParamRefPicList[ j ] are identical to parameter values of RefPicSetStCurrAfter[ i ] )
            unique = 0
    if( unique == 1 )
        ParamRefPicList[ rIdx++ ] = RefPicSetStCurrAfter[ i ]
for( i = 0; i < NumPocLtCurr; i++ )
    unique = 1
    for( j = 0; j < rIdx; j++ )
        if( parameter values of ParamRefPicList[ j ] are identical to parameter values of RefPicSetLtCurr[ i ] )
            unique = 0
    if( unique == 1 )
        ParamRefPicList[ rIdx++ ] = RefPicSetLtCurr[ i ]
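A runnable rendering of this duplicate-avoiding construction, with pictures represented by POC values and parameter sets by opaque values (both illustrative assumptions):

```python
def build_unique_param_list(rps_groups, params_of):
    """Walk the three RPS subsets in order (StCurrBefore, StCurrAfter,
    LtCurr) and skip any picture whose parameter values match an entry
    already in the list, mirroring the pseudo-code above."""
    param_ref_pic_list = []
    for group in rps_groups:
        for pic in group:
            if all(params_of[pic] != params_of[entry]
                   for entry in param_ref_pic_list):
                param_ref_pic_list.append(pic)
    return param_ref_pic_list

# POC 2 carries the same parameter values as POC 4, so only POC 4 is kept.
params_of = {4: "pX", 2: "pX", 6: "pY", 0: "pZ"}
lst = build_unique_param_list([[4, 2], [6], [0]], params_of)
# lst == [4, 6, 0]
```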
Note that a general possibility when cross-picture prediction of multiple types of data is used is to construct and use separate new lists for each type. Using separate lists is strongly preferred when multiple types of data are used and duplicates are avoided.
As an example, assuming two types, Type1 and Type2, the pseudo-code above would be run twice: a first time with ParamRefPicList representing a first new list for Type1 prediction, where the check for identical parameter values is done by checking Type1 values only, followed by a second time with ParamRefPicList representing a second new list for Type2 prediction, where the check for identical parameter values is done by checking Type2 values only.
Alternatively, in a second method, instead of comparing whether two sets of parameter values are identical, the encoder and decoder could keep track of when parameter prediction is used and only store a copy of the parameters when parameter prediction is not used. In addition, the encoder and decoder must know when it is safe to remove a set of parameter values from the storage.
One way of realizing this is to keep a dictionary of the pictures (e.g. in terms of POC) that have been using each of the stored parameter sets. This is illustrated by extending the previous ALF prediction example with a column describing the dictionary after receiving the current picture of each row.
Use ALF | Pred ALF | POCs of unique ALF params | Number of stored ALF params | Signaled codeword/Idx of stored ALF params | Idx of stored ALF params: {POCs using the params}
Yes     | No       | 0       | 1 | - | 0:{0}
Yes     | From 0   | 0       | 1 | 0 | 0:{0, 8}
Yes     | No       | 0, 4    | 2 | - | 0:{8}; 1:{4}
Yes     | No       | 0, 2, 4 | 3 | - | 0:{0, 8}; 1:{4}; 2:{2}
Yes     | From 2   | 0, 2, 4 | 3 | 1 | 0:{0, 8}; 1:{4}; 2:{2, 1}
No      | No       | 0, 2, 4 | 3 | - | 0:{0, 8}; 1:{4}; 2:{2, 1}
Yes     | From 4   | 0, 2, 4 | 3 | 2 | 0:{0, 8}; 1:{4, 6}; 2:{2, 1}
Yes     | From 2   | 0, 2, 4 | 3 | 1 | 0:{0, 8}; 1:{4, 6}; 2:{2, 1, 5}
The following pseudo-code describes how the ParamRefPicList could be determined using the second method. The function getIdxOfParameterSet( iPOC ) returns the index of a stored parameter set given the POC as input.

rIdx = 0
for( i = 0; i < numStoredParameterSets; i++ )
    numStoredPics[ i ] = 0
for( i = 0; i < NumPocStCurrBefore; i++ )
    paramSetIdx = getIdxOfParameterSet( RefPicSetStCurrBefore[ i ] )
    numStoredPics[ paramSetIdx ]++
    if( numStoredPics[ paramSetIdx ] == 1 )
        ParamRefPicList[ rIdx++ ] = RefPicSetStCurrBefore[ i ]
for( i = 0; i < NumPocStCurrAfter; i++ )
    paramSetIdx = getIdxOfParameterSet( RefPicSetStCurrAfter[ i ] )
    numStoredPics[ paramSetIdx ]++
    if( numStoredPics[ paramSetIdx ] == 1 )
        ParamRefPicList[ rIdx++ ] = RefPicSetStCurrAfter[ i ]
for( i = 0; i < NumPocLtCurr; i++ )
    paramSetIdx = getIdxOfParameterSet( RefPicSetLtCurr[ i ] )
    numStoredPics[ paramSetIdx ]++
    if( numStoredPics[ paramSetIdx ] == 1 )
        ParamRefPicList[ rIdx++ ] = RefPicSetLtCurr[ i ]
Whenever a picture is no longer available to the decoder (not part of the RPS, in HEVC terms), the POC for that picture is removed from the list. If a set of stored parameters is no longer connected to any POCs, that set of parameter values is removed from the list of stored parameter values.
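This bookkeeping can be sketched as follows, with each stored parameter set mapped to the set of POCs that have used it (the representation is ours):

```python
def prune_stored_params(stored, rps_pocs):
    """Sketch of the second method's cleanup: POCs no longer in the RPS
    are removed from each stored parameter set's usage set, and a
    parameter set with no remaining POCs is dropped from storage."""
    pruned = {}
    for idx, (params, pocs) in stored.items():
        live = pocs & rps_pocs
        if live:
            pruned[idx] = (params, live)
    return pruned

# Parameter set 0 was used by POCs 0 and 8, set 1 by POC 4. When only
# POC 8 remains available, set 0 survives (via POC 8) and set 1 is removed.
stored = {0: ("pA", {0, 8}), 1: ("pB", {4})}
stored = prune_stored_params(stored, rps_pocs={8})
# stored == {0: ("pA", {8})}
```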
1.9 Embodiment 9 - Specification text for signaling of ALF and CABAC parameters
This embodiment shows an example realization where ALF parameters are predicted across slices and CABAC parameters are predicted across pictures. The changes are based on H.265 version 3 (04/2015). Additions are in red and deletions are marked by strikethrough.
[Syntax table, shown as an image in the original, adding adaptive_loop_filter_enabled_flag and cabac_init_prediction_present_flag; their semantics are given below.]
adaptive_loop_filter_enabled_flag equal to 1 specifies that the adaptive loop filter process may be applied to the reconstructed picture. adaptive_loop_filter_enabled_flag equal to 0 specifies that the adaptive loop filter process is not applied to the reconstructed picture.

cabac_init_prediction_present_flag equal to 1 specifies that cabac_prediction_ref_idx may be present in the slice header. cabac_init_prediction_present_flag equal to 0 specifies that cabac_prediction_ref_idx is not present in the slice header.

slice_segment_header( ) {                                                   Descriptor
    first_slice_segment_in_pic_flag                                         u(1)
    ...
    if( !dependent_slice_segment_flag ) {
        ...
        if( adaptive_loop_filter_enabled_flag ) {
            slice_adaptive_loop_filter_flag                                 u(1)
            if( slice_adaptive_loop_filter_flag && NumPicTotalCurr > 0 ) {
                adaptive_loop_filter_prediction_flag                        u(1)
                if( adaptive_loop_filter_prediction_flag ) {
                    adaptive_loop_filter_prediction_ref_idx                 ue(v)
                    adaptive_loop_filter_prediction_slice_idx               ue(v)
                }
            }
            if( !adaptive_loop_filter_prediction_flag ) {
                for( i = 0; i < nALFParams; i++ )
                    adaptive_loop_filter_parameter[ i ]                     unspecified
            }
        }
        if( cabac_prediction_present_flag && NumPicTotalCurr > 0 ) {
            cabac_prediction_flag                                           u(1)
            if( cabac_prediction_flag )
                cabac_prediction_ref_idx                                    ue(v)
        }
        ...
    }
    ...
}
slice_adaptive_loop_filter_flag equal to 1 specifies that the adaptive loop filter process may be applied to the reconstructed slice after the deblocking filter process. slice_adaptive_loop_filter_flag equal to 0 specifies that the adaptive loop filter process is not applied to the reconstructed slice after the deblocking filter process. When not present, the value of slice_adaptive_loop_filter_flag is inferred to be equal to 0.

adaptive_loop_filter_prediction_flag equal to 1 specifies that the adaptive loop filter parameter values will be copied from a previous slice. adaptive_loop_filter_prediction_flag equal to 0 specifies that the adaptive loop filter parameter values will be specified in the slice header of the current picture. When not present, the value of adaptive_loop_filter_prediction_flag is inferred to be equal to 0.

adaptive_loop_filter_prediction_ref_idx specifies the parameter reference index in ParamRefPicList that identifies the picture from which adaptive loop filter parameter values will be copied for the current slice. The value of adaptive_loop_filter_prediction_ref_idx shall be in the range of 0 to NumPicTotalCurr, inclusive.

adaptive_loop_filter_prediction_slice_idx specifies the slice from which adaptive loop filter parameter values will be copied for the current slice.

adaptive_loop_filter_parameter is currently an unspecified adaptive loop filter parameter.

cabac_prediction_flag equal to 1 specifies that cabac initialization parameters will be copied from a reference picture. When not present, the value of cabac_prediction_flag is inferred to be equal to 0.

cabac_prediction_ref_idx specifies the parameter reference index in ParamRefPicList that identifies the reference picture from which cabac initialization parameters will be copied for the current slice. The value of cabac_prediction_ref_idx shall be in the range of 0 to NumPicTotalCurr - 1, inclusive.
8.1.3 Decoding process for a coded picture with nuh_layer_id equal to 0
The decoding process operates as follows for the current picture CurrPic:
1. The decoding of NAL units is specified in clause 8.2.
2. The processes in clause 8.3 specify the following decoding processes using syntax elements in the slice segment layer and above:
- At the beginning of the decoding process for each P or B slice, the decoding process for reference picture lists construction specified in clause 8.3.4 is invoked for derivation of reference picture list 0 (RefPicList0) and, when decoding a B slice, reference picture list 1 (RefPicList1), and the decoding process for collocated picture and no backward prediction flag specified in clause 8.3.5 is invoked for derivation of the variables ColPic and NoBackwardPredFlag.
- At the beginning of the decoding process for each I, P or B slice, the decoding process for parameter reference picture list construction specified in clause 8.3.6 is invoked for derivation of the parameter reference picture list ParamRefPicList.
- At the beginning of the decoding process for each I, P or B slice, the decoding process for adaptive loop filter parameters specified in clause 8.3.7 is invoked.
3. The processes in clauses 8.4, 8.5, 8.6 and 8.7 specify decoding processes using syntax elements in all syntax structure layers. It is a requirement of bitstream conformance that the coded slices of the picture shall contain slice segment data for every coding tree unit of the picture, such that the division of the picture into slices, the division of the slices into slice segments and the division of the slice segments into coding tree units each forms a partitioning of the picture. When CtbAddrInRs is equal to PicSizeInCtbsY / 2, the storage process for context variables and Rice parameter initialization states as specified in clause 9.3.2.3 is invoked with TableStateIdxRefPic, TableMpsValRefPic and TableStatCoeffRefPic of the current picture as outputs.
4. After all slices of the current picture have been decoded, the decoded picture is marked as "used for short-term reference".
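The storage condition in step 3 (store the CABAC state once the middle coding tree unit of the picture is reached) can be sketched as follows. Integer division for PicSizeInCtbsY / 2 and the `store` callback (standing in for clause 9.3.2.3) are assumptions of this illustration.

```python
def maybe_store_cabac_state(ctb_addr_in_rs, pic_size_in_ctbs_y, ctx_state, store):
    """When CtbAddrInRs equals PicSizeInCtbsY / 2, store the context/Rice
    state so that later pictures can use it as TableStateIdxRefPic,
    TableMpsValRefPic and TableStatCoeffRefPic of this picture."""
    if ctb_addr_in_rs == pic_size_in_ctbs_y // 2:
        store(ctx_state)
        return True
    return False
```

A decoder loop would call this once per coding tree unit; the state is therefore captured exactly once per picture, at its midpoint.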
8.3.6 Decoding process for parameter reference picture list construction
This process is invoked at the beginning of the decoding process for each I, P or B slice. At the beginning of the decoding process for each slice, the parameter reference picture list ParamRefPicList is derived as follows:
rIdx = 0
if( first_slice_segment_in_pic_flag == 0 )
    ParamRefPicList[ rIdx++ ] = CurrPic
for( i = 0; i < NumPocStCurrBefore; i++ )
    ParamRefPicList[ rIdx++ ] = RefPicSetStCurrBefore[ i ]          (8-12)
for( i = 0; i < NumPocStCurrAfter; i++ )
    ParamRefPicList[ rIdx++ ] = RefPicSetStCurrAfter[ i ]
for( i = 0; i < NumPocLtCurr; i++ )
    ParamRefPicList[ rIdx++ ] = RefPicSetLtCurr[ i ]
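A direct transliteration of equation (8-12) into runnable form may look as follows; the picture objects and reference picture set lists are stand-ins for the decoder's actual state.

```python
def build_param_ref_pic_list(curr_pic, first_slice_segment_in_pic_flag,
                             st_curr_before, st_curr_after, lt_curr):
    """Construct ParamRefPicList per (8-12): the current picture itself
    heads the list when this is not the first slice segment of the picture,
    followed by the short-term-before, short-term-after and long-term sets."""
    param_ref_pic_list = []
    if first_slice_segment_in_pic_flag == 0:
        param_ref_pic_list.append(curr_pic)
    param_ref_pic_list.extend(st_curr_before)
    param_ref_pic_list.extend(st_curr_after)
    param_ref_pic_list.extend(lt_curr)
    return param_ref_pic_list
```

Including CurrPic at index 0 is what lets a non-first slice predict parameters from an earlier slice of the same picture.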
8.3.7 Decoding process for adaptive loop filter parameters
This process is invoked at the beginning of the decoding process for each I, P or B slice, after invocation of clause 8.3.6.
The following ordered steps apply:
- If first_slice_segment_in_pic_flag is equal to 1, the variable SliceIdx of the current picture is set equal to 0; otherwise, the variable SliceIdx of the current picture is incremented by 1.
- The variable SliceIdxCurrPic is set equal to the variable SliceIdx of the current picture.
- The variable SliceIdx of the current slice is set equal to the variable SliceIdx of the current picture.
- The variable AlfParams[ SliceIdxCurrPic ][ n ] of the current picture is set equal to "no parameter value" with n = 0..nALFParams - 1.
When adaptive_loop_filter_prediction_flag is equal to 1, the following applies:
- The variable AlfPic is set equal to ParamRefPicList[ adaptive_loop_filter_prediction_ref_idx ]. It is a requirement of bitstream conformance that adaptive_loop_filter_prediction_slice_idx shall be smaller than the variable SliceIdx of the picture AlfPic.
- The variable AlfParams[ SliceIdxCurrPic ][ n ] of the current picture is set equal to AlfParams[ adaptive_loop_filter_prediction_slice_idx ][ n ] of the picture specified by AlfPic with n = 0..nALFParams - 1. It is a requirement of bitstream conformance that no value of AlfParams[ adaptive_loop_filter_prediction_slice_idx ][ n ] of the picture specified by AlfPic shall be equal to "no parameter value".
When slice_adaptive_loop_filter_flag is equal to 1 and adaptive_loop_filter_prediction_flag is equal to 0, the following applies:
- AlfParams[ SliceIdxCurrPic ][ n ] of the current picture is set equal to adaptive_loop_filter_parameter[ n ] with n = 0..nALFParams - 1.
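The derivation in clause 8.3.7 can be sketched end to end. The `Pic` class and the dict-based slice header are illustrative data layouts, not part of the specification; the two `assert` statements model the bitstream conformance requirements stated above.

```python
NO_PARAM = "no parameter value"

class Pic:
    """Minimal picture model for this sketch: a running SliceIdx plus a
    dict mapping slice index -> ALF parameter list (assumed layout)."""
    def __init__(self):
        self.slice_idx = -1
        self.alf_params = {}

def derive_alf_params(pic, slice_hdr, param_ref_pic_list, n_alf_params):
    # Maintain SliceIdx of the current picture (first ordered step).
    if slice_hdr["first_slice_segment_in_pic_flag"]:
        pic.slice_idx = 0
    else:
        pic.slice_idx += 1
    idx = pic.slice_idx
    # Initialize AlfParams[ SliceIdxCurrPic ][ n ] to "no parameter value".
    pic.alf_params[idx] = [NO_PARAM] * n_alf_params
    if slice_hdr.get("adaptive_loop_filter_prediction_flag"):
        # Copy parameters from a slice of the picture AlfPic.
        alf_pic = param_ref_pic_list[
            slice_hdr["adaptive_loop_filter_prediction_ref_idx"]]
        src = slice_hdr["adaptive_loop_filter_prediction_slice_idx"]
        assert src < alf_pic.slice_idx       # bitstream conformance
        params = alf_pic.alf_params[src]
        assert NO_PARAM not in params        # source must be fully specified
        pic.alf_params[idx] = list(params)
    elif slice_hdr.get("slice_adaptive_loop_filter_flag"):
        # Parameters signalled directly in the slice header.
        pic.alf_params[idx] = list(slice_hdr["adaptive_loop_filter_parameter"])
    return idx
```

Note how the per-slice AlfParams table of a previously decoded picture must be retained in the decoded picture buffer for this prediction to work.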
9.3 CABAC parsing process for slice segment data
9.3.2 Initialization process
9.3.2.1 General
The context variables of the arithmetic decoding engine are initialized as follows:
If the coding tree unit is the first coding tree unit in a tile, the following applies:
- If cabac_prediction_flag is equal to 1, the synchronization process for context variables and Rice parameter initialization states as specified in clause 9.3.2.4 is invoked with TableStateIdxRefPic, TableMpsValRefPic and TableStatCoeffRefPic of the picture indicated by ParamRefPicList[ cabac_prediction_ref_idx + 1 - first_slice_segment_in_pic_flag ] as inputs.
- Otherwise, the following applies:
  - The initialization process for context variables is invoked as specified in clause 9.3.2.2.
  - The variables StatCoeff[ k ] are set equal to 0, for k in the range 0 to 3, inclusive.
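The dispatch above can be sketched as a small function. The `default_init` and `sync_from` callables are hypothetical stand-ins for clauses 9.3.2.2 and 9.3.2.4, and the index arithmetic reflects the fact that ParamRefPicList is headed by the current picture when first_slice_segment_in_pic_flag is 0.

```python
def init_cabac_contexts(slice_hdr, param_ref_pic_list, default_init, sync_from):
    """For the first CTU of a tile: either synchronize context state from a
    parameter reference picture (cabac_prediction_flag == 1) or run the
    normal initialization and zero the StatCoeff variables."""
    if slice_hdr.get("cabac_prediction_flag"):
        # The index shifts by one when CurrPic heads ParamRefPicList,
        # i.e. when this is not the first slice segment of the picture.
        idx = (slice_hdr["cabac_prediction_ref_idx"]
               + 1 - slice_hdr["first_slice_segment_in_pic_flag"])
        return sync_from(param_ref_pic_list[idx])
    ctx = default_init()
    stat_coeff = [0, 0, 0, 0]  # StatCoeff[k] for k in 0..3
    return ctx, stat_coeff
```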
8.7 In-loop filter process

8.7.1 General
This clause specifies the application of three in-loop filters. When the in-loop filter process is specified as optional in Annex A, the application of any of these filters is optional.
The three in-loop filters, namely the deblocking filter, the sample adaptive offset filter and the adaptive loop filter, are applied as specified by the following ordered steps:
1. For the deblocking filter, the following applies:
- The deblocking filter process as specified in clause 8.7.2 is invoked with the reconstructed picture sample array SL and, when ChromaArrayType is not equal to 0, the arrays SCb and SCr as inputs, and the modified reconstructed picture sample array S'L and, when ChromaArrayType is not equal to 0, the arrays S'Cb and S'Cr after deblocking as outputs.
- The array S'L and, when ChromaArrayType is not equal to 0, the arrays S'Cb and S'Cr are assigned to the array SL and, when ChromaArrayType is not equal to 0, the arrays SCb and SCr (which represent the decoded picture), respectively.
2. When sample_adaptive_offset_enabled_flag is equal to 1, the following applies:
- The sample adaptive offset process as specified in clause 8.7.3 is invoked with the reconstructed picture sample array SL and, when ChromaArrayType is not equal to 0, the arrays SCb and SCr as inputs, and the modified reconstructed picture sample array S'L and, when ChromaArrayType is not equal to 0, the arrays S'Cb and S'Cr after sample adaptive offset as outputs.
- The array S'L and, when ChromaArrayType is not equal to 0, the arrays S'Cb and S'Cr are assigned to the array SL and, when ChromaArrayType is not equal to 0, the arrays SCb and SCr (which represent the decoded picture), respectively.
3. For the adaptive loop filter, the process in clause 8.7.4 is invoked.

8.7.4 Adaptive loop filter process
This process is invoked after the completion of the sample adaptive offset process for the entire decoded picture. This process is invoked for each slice for which slice_adaptive_loop_filter_flag is equal to 1.
This process is currently unspecified, but each slice shall use the adaptive loop filter parameter values from AlfParams[ Idx ][ n ], where Idx is the SliceIdx of the slice.
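The filter ordering of clause 8.7 can be summarized as a small driver. The filter callables here are hypothetical stand-ins for clauses 8.7.2 to 8.7.4; only the invocation order and the per-slice ALF gating come from the text above.

```python
def apply_in_loop_filters(pic, deblock, sao, alf, sao_enabled, slice_alf_flags):
    """Apply the three in-loop filters in order: deblocking, then SAO over
    the whole picture when enabled, then ALF for each slice whose
    slice_adaptive_loop_filter_flag is 1 (using AlfParams[slice_idx][n])."""
    pic = deblock(pic)
    if sao_enabled:
        pic = sao(pic)
    for slice_idx, flag in enumerate(slice_alf_flags):
        if flag:
            pic = alf(pic, slice_idx)
    return pic
```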
The methods described above may be implemented in any suitable decoder implementation. Thus, in one embodiment, the disclosure provides a decoder which is configured to perform any of the methods described herein. The description below sets out some specific embodiments. Figure 4 shows a decoder 400 according to embodiments of the disclosure. The decoder may be implemented in hardware, in software or a combination of hardware and software. The decoder may be implemented in, e.g. comprised in, user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer.
In the illustrated embodiment, the decoder 400 comprises receiving means or a receiving module 402, identifying means or an identifying module 404, creating means or a creating module 406, determining means or a determining module 408, deriving means or a deriving module 410 and decoding means or a decoding module 412. The receiving means/module 402 is operative to receive an encoded representation of a current picture or slice of a video sequence from an encoder. The identifying means/module 404 is operative to identify a set of previously decoded reference pictures or slices for the current picture or slice. The creating means/module 406 is operative to create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set. The determining means/module 408 is operative to determine, from the list, a previously decoded reference picture or slice to use for prediction. The deriving means/module 410 is operative to derive final parameter values by predicting parameter values using the determined reference picture or slice. The decoding means/module 412 is operative to decode the current picture or slice from the encoded representation, using the final parameter values. As noted above, each of the modules may be implemented purely in hardware, or purely in software. Alternatively, the modules may be implemented in a combination of hardware and software.
In an alternative embodiment, the decoder may be implemented in or comprise processing circuitry and a non-transitory machine-readable medium storing instructions which, when executed by the processing circuitry, cause the decoder to: receive an encoded representation of a current picture or slice of a video sequence from an encoder; identify a set of previously decoded reference pictures or slices for the current picture or slice; create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set; determine, from the list, a previously decoded reference picture or slice to use for prediction; derive final parameter values by predicting parameter values using the determined reference picture or slice; and decode the current picture or slice from the encoded representation, using the final parameter values.
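The six operations recited above can be illustrated as a plain function pipeline. This is a sketch only: each callable stands in for one of the modules 402 to 412 of Figure 4, and the argument shapes are assumptions for illustration.

```python
def decode_picture_or_slice(receive, identify, create_list, determine,
                            derive, decode):
    """Chain the decoder operations: receive the encoded representation,
    identify the set of previously decoded reference pictures/slices,
    create the ordered indicator list, determine the reference to use,
    derive the final parameter values, and decode with them."""
    encoded = receive()
    ref_set = identify(encoded)
    ordered = create_list(ref_set)
    chosen = determine(ordered, encoded)
    params = derive(chosen)
    return decode(encoded, params)
```

The same composition describes both the module-based decoder and the processing-circuitry embodiment; only the realization of each step differs.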
The present disclosure thus provides methods, apparatus and computer programs for decoding video media, and particularly for determining one or more parameter values to be used in decoding video data based on parameter values for one or more previously decoded reference pictures or slices.

Claims

1. A method, performed by a decoder (400), for predicting parameter values from a previously decoded reference picture to a current picture or current slice of a current picture, the method comprising:
receiving (200) an encoded representation of the current picture or slice of a video sequence from an encoder;
identifying (202) a set of previously decoded reference pictures or slices for the current picture or slice;
creating (204) an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set;
determining (206), from the list, a previously decoded reference picture or slice to use for prediction;
deriving (208) final parameter values by predicting parameter values using the determined reference picture or slice; and
decoding (210) the current picture or slice from the encoded representation, using the final parameter values.
2. The method according to claim 1, wherein determining a previously decoded reference picture or slice to use for prediction comprises decoding an indicator from the encoded representation pointing to one of the pictures or slices belonging to the set.
3. The method according to claim 1, wherein determining a previously decoded reference picture or slice to use for prediction comprises selecting a reference picture or slice that is most recently received by the decoder that has a temporal identity equal to or lower than the current picture or slice.
4. The method according to claim 1, wherein determining a previously decoded reference picture or slice to use for prediction comprises selecting a reference picture or slice that is closest in output order to the current picture or slice and has a temporal identity equal to or lower than the current picture or slice.
5. The method according to any one of the preceding claims, wherein the set of previously decoded reference pictures or slices for the current picture or slice comprises: short-term pictures or slices which are available for the current picture or slice and have a higher picture order count value than the current picture or slice; short-term pictures or slices which are available for the current picture or slice and have a lower picture order count value than the current picture or slice; and long-term pictures or slices which are available for the current picture or slice.
6. The method according to any one of the preceding claims, wherein the set of previously decoded reference pictures or slices for the current picture or slice comprises previously decoded reference pictures or slices that belong to a temporal layer that is equal to or lower than the temporal layer of the current picture or slice.
7. The method according to any one of the preceding claims, wherein creating the ordered list comprises identifying reference pictures or slices from the set of previously decoded reference pictures or slices which have one or more characteristics which match characteristics of the current slice or picture.
8. The method according to claim 7, wherein the characteristics comprise one or more of: whether one or more particular decoding tools are turned on or off; the picture or slice type; the coded picture size; and the configuration of one or more particular tools.
9. The method according to any one of the preceding claims, wherein the ordered list of indicators pointing to previously decoded reference pictures or slices belonging to the set is ordered according to picture order count value of the previously decoded reference pictures or slices.
10. The method according to any one of the preceding claims, wherein the ordered list of indicators is separate from L0 and L1 lists used for decoding picture data.
11. The method according to any one of the preceding claims, wherein the parameters comprise one or more of: adaptive loop filter coefficients; Context Adaptive Binary Arithmetic Coding states; sample adaptive offset (SAO) parameters; coding tree structure parameters; interpolation filter coefficients; scaling matrices; slice_segment_address; slice_type; color_plane_id; collocated_ref_idx; weighted prediction parameters, merge candidate parameters, quantization parameter modification parameters, deblocking parameters, entry point data and slice header extension.
12. The method according to any one of the preceding claims, wherein the decoder is implemented in user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer.
13. A decoder (400) for predicting parameter values from a previously decoded reference picture to a current picture or current slice of a current picture, the decoder being configured to:
receive an encoded representation of the current picture or slice of a video sequence from an encoder;
identify a set of previously decoded reference pictures or slices for the current picture or slice;
create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set;
determine, from the list, a previously decoded reference picture or slice to use for prediction;
derive final parameter values by predicting parameter values using the determined reference picture or slice; and
decode the current picture or slice from the encoded representation, using the final parameter values.
14. A decoder (400) for predicting parameter values from a previously decoded reference picture to a current picture or current slice of a current picture, the decoder comprising:
a receiver module (402) configured to receive an encoded representation of the current picture or slice of a video sequence from an encoder;
an identifying module (404) configured to identify a set of previously decoded reference pictures or slices for the current picture or slice;
a creating module (406) configured to create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set;
a determining module (408) configured to determine, from the list, a previously decoded reference picture or slice to use for prediction;
a deriving module (410) configured to derive final parameter values by predicting parameter values using the determined reference picture or slice; and
a decoding module (412) configured to decode the current picture or slice from the encoded representation, using the final parameter values.
15. The decoder according to claim 14, wherein the determining module is configured to determine a previously decoded reference picture or slice to use for prediction by decoding an indicator from the encoded representation pointing to one of the pictures or slices belonging to the set.
16. The decoder according to claim 14, wherein the determining module is configured to determine a previously decoded reference picture or slice to use for prediction by selecting a reference picture or slice that is most recently received by the decoder that has a temporal identity equal to or lower than the current picture or slice.
17. The decoder according to claim 14, wherein the determining module is configured to determine a previously decoded reference picture or slice to use for prediction by selecting a reference picture or slice that is closest in output order to the current picture or slice and has a temporal identity equal to or lower than the current picture or slice.
18. The decoder according to any one of claims 14 to 17, wherein the set of previously decoded reference pictures or slices for the current picture or slice comprises: short-term pictures or slices which are available for the current picture or slice and have a higher picture order count value than the current picture or slice; short-term pictures or slices which are available for the current picture or slice and have a lower picture order count value than the current picture or slice; and long-term pictures or slices which are available for the current picture or slice.
19. The decoder according to any one of claims 14 to 18, wherein the set of previously decoded reference pictures or slices for the current picture or slice comprises previously decoded reference pictures or slices that belong to a temporal layer that is equal to or lower than the temporal layer of the current picture or slice.
20. The decoder according to any one of claims 14 to 19, wherein the creating module is configured to create the ordered list by identifying reference pictures or slices from the set of previously decoded reference pictures or slices which have one or more characteristics which match characteristics of the current slice or picture.
21. The decoder according to claim 20, wherein the characteristics comprise one or more of: whether one or more particular decoding tools are turned on or off; the picture or slice type; the coded picture size; and the configuration of one or more particular tools.
22. The decoder according to any one of claims 14 to 21, wherein the ordered list of indicators pointing to previously decoded reference pictures or slices belonging to the set is ordered according to picture order count value of the previously decoded reference pictures or slices.
23. The decoder according to any one of claims 14 to 22, wherein the ordered list of indicators is separate from L0 and L1 lists used for decoding picture data.
24. The decoder according to any one of claims 14 to 23, wherein the parameters comprise one or more of: adaptive loop filter coefficients; Context Adaptive Binary Arithmetic Coding states; sample adaptive offset (SAO) parameters; coding tree structure parameters; interpolation filter coefficients; scaling matrices; slice_segment_address; slice_type; color_plane_id; collocated_ref_idx; weighted prediction parameters, merge candidate parameters, quantization parameter modification parameters, deblocking parameters, entry point data and slice header extension.
25. The decoder according to any one of claims 14 to 24, wherein the decoder is implemented in user equipment, such as a mobile telephone, tablet, desktop, netbook, multimedia player, video streaming server, set-top box or computer.
26. A computer program for a decoder, for predicting parameter values from a previously decoded reference picture to a current picture or current slice of a current picture, the computer program comprising computer program code which, when executed, causes the decoder to:
receive an encoded representation of the current picture or slice of a video sequence from an encoder;
identify a set of previously decoded reference pictures or slices for the current picture or slice;
create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set;
determine, from the list, a previously decoded reference picture or slice to use for prediction;
derive final parameter values by predicting parameter values using the determined reference picture or slice; and
decode the current picture or slice from the encoded representation, using the final parameter values.
27. A computer program product for a decoder, for predicting parameter values from a previously decoded reference picture to a current picture or current slice of a current picture, the computer program product comprising a non-transitory computer-readable medium storing computer program code which, when executed, causes the decoder to: receive an encoded representation of the current picture or slice of a video sequence from an encoder;
identify a set of previously decoded reference pictures or slices for the current picture or slice;
create an ordered list of indicators pointing to one or more previously decoded reference pictures or slices belonging to the set;
determine, from the list, a previously decoded reference picture or slice to use for prediction;
derive final parameter values by predicting parameter values using the determined reference picture or slice; and
decode the current picture or slice from the encoded representation, using the final parameter values.
PCT/EP2017/084050 2016-12-30 2017-12-21 Methods, apparatus, and computer programs for decoding media WO2018122092A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201662440648P 2016-12-30 2016-12-30
US62/440,648 2016-12-30

Publications (1)

Publication Number Publication Date
WO2018122092A1 true WO2018122092A1 (en) 2018-07-05

Family

ID=60888422

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2017/084050 WO2018122092A1 (en) 2016-12-30 2017-12-21 Methods, apparatus, and computer programs for decoding media

Country Status (1)

Country Link
WO (1) WO2018122092A1 (en)

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"High Efficiency Video Coding (HEVC): Algorithms and Architectures", 1 January 2014, SPRINGER INTERNATIONAL PUBLISHING, article BENJAMIN BROSS ET AL: "Chapter 5 - Inter-Picture Prediction in HEVC", pages: 113 - 140, XP055461237, DOI: 10.1007/978-3-319-06895-4_5 *
COBAN M ET AL: "AHG4: Unification of picture partitioning schemes", 7. JCT-VC MEETING; 98. MPEG MEETING; 21-11-2011 - 30-11-2011; GENEVA; (JOINT COLLABORATIVE TEAM ON VIDEO CODING OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ); URL: HTTP://WFTP3.ITU.INT/AV-ARCH/JCTVC-SITE/,, no. JCTVC-G315, 9 November 2011 (2011-11-09), XP030110299 *
HENDRY ET AL: "AHG 9: Short Slice Header", 11. JCT-VC MEETING; 102. MPEG MEETING; 10-10-2012 - 19-10-2012; SHANGHAI; (JOINT COLLABORATIVE TEAM ON VIDEO CODING OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ); URL: HTTP://WFTP3.ITU.INT/AV-ARCH/JCTVC-SITE/,, no. JCTVC-K0168, 1 October 2012 (2012-10-01), XP030113050 *
MISKA M HANNUKSELA: "3DV-ATM Slice Header Prediction", 99. MPEG MEETING; 6-2-2012 - 10-2-2012; SAN JOSÉ; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. m23697, 1 February 2012 (2012-02-01), XP030052222 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11563938B2 (en) 2016-02-15 2023-01-24 Qualcomm Incorporated Geometric transforms for filters for video coding
US10855985B2 (en) 2017-01-04 2020-12-01 Qualcomm Incorporated Modified adaptive loop filter temporal prediction for temporal scalability support
US11451773B2 (en) 2018-06-01 2022-09-20 Qualcomm Incorporated Block-based adaptive loop filter (ALF) design and signaling
CN110708554A (en) * 2018-07-09 2020-01-17 腾讯美国有限责任公司 Video coding and decoding method and device
CN110708554B (en) * 2018-07-09 2023-08-18 腾讯美国有限责任公司 Video encoding and decoding method and device
US11284075B2 (en) 2018-09-12 2022-03-22 Qualcomm Incorporated Prediction of adaptive loop filter parameters with reduced memory consumption for video coding
WO2020056151A1 (en) * 2018-09-12 2020-03-19 Qualcomm Incorporated Temporal prediction of adaptive loop filter parameters with reduced memory consumption for video coding
US11051017B2 (en) 2018-12-20 2021-06-29 Qualcomm Incorporated Adaptive loop filter (ALF) index signaling
WO2020187222A1 (en) * 2019-03-18 2020-09-24 杭州海康威视数字技术股份有限公司 Encoding and decoding method and apparatus, and devices thereof
WO2021122070A1 (en) * 2019-12-19 2021-06-24 Telefonaktiebolaget Lm Ericsson (Publ) Picture header prediction
CN116233470A (en) * 2020-02-04 2023-06-06 华为技术有限公司 Encoder, decoder and corresponding methods for indicating high level syntax
CN116233470B (en) * 2020-02-04 2024-01-09 华为技术有限公司 Encoder, decoder and corresponding methods for indicating high level syntax
WO2024041249A1 (en) * 2022-08-25 2024-02-29 Mediatek Inc. Method and apparatus of entropy coding for scalable video coding


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17822671

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17822671

Country of ref document: EP

Kind code of ref document: A1