WO2024080916A1 - Inter-predicted reference picture lists - Google Patents

Info

Publication number
WO2024080916A1
Authority
WO
WIPO (PCT)
Prior art keywords
list
value
values
lopred
uniquelist
Prior art date
Application number
PCT/SE2023/051012
Other languages
French (fr)
Inventor
Rickard Sjöberg
Martin Pettersson
Jacob STRÖM
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ)

Abstract

There is provided a method for decoding a video bitstream. The method comprises constructing a list of lists, LOL, before any coded picture that references the LOL is decoded, the LOL comprising an entry comprising i) a first previous reference picture list, the L0pred_prev list, and ii) a second previous reference picture list, the L1pred_prev list, wherein the L0pred_prev list comprises a first set of values and the L1pred_prev list comprises a second set of values, and further wherein each value included in the first set of values and the second set of values represents a reference picture indicator that can be used in the decoding process of a coded picture. The method comprises constructing a first current reference picture list, the L0pred_cur list, and a second current reference picture list, the L1pred_cur list, based on the L0pred_prev list and the L1pred_prev list, wherein constructing the L0pred_cur list and L1pred_cur list comprises: deriving a value, the deltaPOC value, from a set of syntax elements in the bitstream, deriving a set of values, the UniqueList, wherein the UniqueList comprises a) the deltaPOC value and b) a third set of values, wherein each value in the third set of values is equal to either i) the sum of a value from the first set of values and the deltaPOC value or ii) the sum of a value from the second set of values and the deltaPOC value, deriving a value N from the set of syntax elements, deriving a value M from the set of syntax elements, including N values from the UniqueList in the L0pred_cur list and including M values from the UniqueList in the L1pred_cur list.

Description

TITLE
INTER-PREDICTED REFERENCE PICTURE LISTS
TECHNICAL FIELD
[001] Disclosed are embodiments related to video encoding and video decoding.
BACKGROUND
[002] 1. Versatile Video Coding (VVC) and High-Efficiency Video Coding (HEVC)
[003] Versatile Video Coding (VVC) and its predecessor, High Efficiency Video Coding (HEVC), are block-based video codecs standardized and developed jointly by ITU-T and MPEG. The codecs utilize both temporal and spatial prediction. Spatial prediction is achieved using intra (I) prediction from within the current picture. Temporal prediction is achieved using uni-directional (P) or bi-directional inter (B) prediction on the block level from previously decoded reference pictures.
[004] In the encoder, the difference between the original sample data and the predicted sample data, referred to as the residual, is transformed into the frequency domain, quantized, and then entropy coded before being transmitted together with the necessary prediction parameters, such as prediction mode and motion vectors, which are also entropy coded. The decoder performs entropy decoding, inverse quantization, and inverse transformation to obtain the residual, and then adds the residual to an intra or inter prediction to reconstruct a picture.
[005] The VVC version 1 specification was published as Rec. ITU-T H.266 | ISO/IEC 23090-3, “Versatile Video Coding,” in 2020. MPEG and ITU-T are working together within the Joint Video Experts Team (JVET) on updated versions of HEVC and VVC as well as the successor to VVC, i.e., the next generation video codec.
[006] 2. Components
[007] A video sequence consists of a series of pictures where each picture consists of one or more components. A picture in a video sequence is sometimes denoted ‘image’ or ‘frame’. Each component in a picture can be described as a two-dimensional rectangular array of picture sample values (or “sample values” or “samples” for short). It is common that a picture in a video sequence consists of three components: one luma component Y, where the sample values are luma values, and two chroma components Cb and Cr, where the sample values are chroma values. Other common representations include ICtCp, IPT, constant-luminance YCbCr, YCoCg and others. It is also common that the dimensions of the chroma components are smaller than the luma components by a factor of two in each dimension. For example, the size of the luma component of an HD picture would be 1920x1080 and the chroma components would each have the dimension of 960x540. Components are sometimes referred to as ‘color components’, and other times as ‘channels’.
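The chroma dimensions in the HD example follow from dividing each luma dimension by the subsampling factor. A minimal sketch, assuming even luma dimensions; the function and parameter names are hypothetical and chosen only for illustration:

```python
def chroma_dimensions(luma_width, luma_height, horizontal_factor=2, vertical_factor=2):
    """Return (width, height) of each chroma component.

    Assumes the luma dimensions are exact multiples of the factors;
    rounding rules for other sizes are not covered by this sketch.
    """
    return luma_width // horizontal_factor, luma_height // vertical_factor


# HD example from the text: 1920x1080 luma, chroma halved in each dimension.
print(chroma_dimensions(1920, 1080))  # (960, 540)
```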
[008] 3. Coding Units and Coding Blocks
[009] In many video coding standards, such as HEVC and VVC, each component of a picture is split into blocks and the coded video bitstream consists of a series of coded blocks. A block is a two-dimensional array of samples. It is common in video coding that the picture is split into units that cover a specific area of the picture.
[0010] Each unit consists of all blocks from all components that make up that specific area and each block belongs fully to one unit. The macroblock in H.264 and the Coding Unit (CU) in HEVC and VVC are examples of units. In VVC the CUs may be split recursively into smaller CUs. The CU at the top level is referred to as the coding tree unit (CTU). A CU usually contains three coding blocks, i.e., one coding block for luma and two coding blocks for chroma. A block to which a transform used in coding is applied is referred to as a “transform block,” and a block to which a prediction mode is applied is referred to as a “prediction block.”
[0011] 4. Network Abstraction Layer (NAL)
[0012] HEVC and VVC define a Network Abstraction Layer (NAL). A NAL unit is a data structure that contains data. A so-called Video Coding Layer (VCL) NAL unit contains data that represents picture sample values. A non-VCL NAL unit contains additional associated data such as parameter sets and supplemental enhancement information (SEI) messages. The NAL unit in HEVC begins with a 2-byte header that specifies the NAL unit type, which identifies what type of data is carried in the NAL unit, as well as the layer ID and the temporal ID to which the NAL unit belongs. The NAL unit type is transmitted in the nal_unit_type codeword in the NAL unit header, and the type indicates and defines how the NAL unit should be parsed and decoded. The bytes after the 2-byte NAL unit header are the payload of the type indicated by the NAL unit type. A bitstream consists of a series of concatenated NAL units.
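As an illustration of the 2-byte HEVC NAL unit header described above, the following sketch splits the header into its fields. The bit layout used here (1-bit forbidden_zero_bit, 6-bit nal_unit_type, 6-bit nuh_layer_id, 3-bit nuh_temporal_id_plus1) is stated as an assumption about the HEVC header, since the text itself does not give field widths, and the function name is hypothetical:

```python
def parse_hevc_nal_header(nal_unit: bytes) -> dict:
    """Split an HEVC NAL unit into header fields and payload.

    Assumed layout of the 16-bit header (most significant bit first):
    forbidden_zero_bit (1), nal_unit_type (6), nuh_layer_id (6),
    nuh_temporal_id_plus1 (3).
    """
    if len(nal_unit) < 2:
        raise ValueError("an HEVC NAL unit needs at least a 2-byte header")
    header = (nal_unit[0] << 8) | nal_unit[1]
    return {
        "forbidden_zero_bit": header >> 15,
        "nal_unit_type": (header >> 9) & 0x3F,
        "layer_id": (header >> 3) & 0x3F,
        "temporal_id": (header & 0x07) - 1,
        "payload": nal_unit[2:],
    }


# Header bytes 0x40 0x01 followed by two made-up payload bytes.
fields = parse_hevc_nal_header(bytes([0x40, 0x01, 0xAA, 0xBB]))
print(fields["nal_unit_type"], fields["layer_id"], fields["temporal_id"])  # 32 0 0
```

The payload is returned untouched because, as the text notes, how it is parsed depends on the NAL unit type.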
[0013] 5. Slices and Tiles
[0014] The concept of slices in HEVC divides the picture into independently coded slices, where decoding of one slice in a picture is independent of other slices of the same picture. Different coding types could be used for slices of the same picture, i.e. a slice could either be an I-slice, P-slice or B-slice. One purpose of slices is to enable resynchronization in case of data loss. In HEVC, a slice is a set of CTUs.
[0015] The VVC and HEVC video coding standards include a tool called tiles that divides a picture into rectangular spatially independent regions. Tiles in VVC are similar to the tiles used in HEVC. Using tiles, a picture in VVC can be partitioned into rows and columns of CTUs where a tile is an intersection of a row and a column.
[0016] In VVC, a slice is defined as an integer number of complete tiles or an integer number of consecutive complete CTU rows within a tile of a picture that are exclusively contained in a single NAL unit. In VVC, a picture may be partitioned into either raster scan slices or rectangular slices. A raster scan slice consists of a number of complete tiles in raster scan order. A rectangular slice consists of a group of tiles that together occupy a rectangular region in the picture or a consecutive number of CTU rows inside one tile. Each slice has a slice header comprising syntax elements. Decoded slice header values from these syntax elements are used when decoding the slice. Each slice is carried in one VCL NAL unit. In an early draft of the VVC specification, slices were referred to as tile groups.
[0017] 6. Parameter Sets
[0018] HEVC and VVC specify three types of parameter sets: the picture parameter set (PPS), the sequence parameter set (SPS) and the video parameter set (VPS). The PPS contains data that is common for a whole picture, the SPS contains data that is common for a coded video sequence (CVS) and the VPS contains data that is common for multiple CVSs, e.g., data for multiple scalability layers in the bitstream.
[0019] VVC also specifies one additional parameter set, the adaptation parameter set (APS). The APS carries parameters needed for the adaptive loop filter (ALF) tool, the luma mapping and chroma scaling (LMCS) tool and the scaling list tool.
[0020] Both HEVC and VVC allow certain information (e.g., parameter sets) to be provided by external means. “By external means” should be interpreted as the information is not provided in the coded video bitstream but by some other means not specified in the video codec specification, e.g., via metadata possibly provided in a different data channel, as a constant in the decoder, or provided through an API to the decoder.
[0021] 7. Picture Header
[0022] The current VVC draft includes a picture header syntax structure. This syntax structure can either be conveyed in its own NAL unit or be included in a slice header when there is only one slice in the picture. When conveyed in a NAL unit, the NAL unit type is equal to a value that indicates that the NAL unit contains a picture header. The values of the syntax elements in the picture header are used to decode all slices of one picture.
[0023] 8. Decoding Capability Information (DCI)
[0024] In VVC there is a DCI NAL unit. The DCI specifies information that doesn’t change during the decoding session and may be good for the decoder to know about early and upfront, such as profile and level information. The information in the DCI is not necessary for operation of the decoding process. In drafts of the VVC specification the DCI was called decoding parameter set (DPS).
[0025] The decoding capability information may also contain a set of general constraints for the bitstream, that gives the decoder information of what to expect from the bitstream, in terms of coding tools, types of NAL units, etc. In VVC version 1, the general constraint information can be signaled in the DCI, VPS or SPS.
[0026] 9. Decoded Picture Buffer (DPB)
[0027] Decoded pictures are stored by the decoder so that they can be used for temporal prediction when decoding future pictures. Those pictures are commonly stored in a decoded picture buffer (DPB). The DPB conceptually consists of a limited number of picture buffers where each picture buffer holds all sample data and motion vector data that may be needed for decoding of future pictures. In HEVC, sample data is needed for motion compensation and motion vector data is needed for temporal motion vector prediction (TMVP). Each picture in the DPB is marked as either “used for short-term reference”, “used for long-term reference”, or “unused for reference”. A picture is stored in the DPB either because it may be used for prediction during decoding or because it is waiting for output. The DPB has a limited size that limits the amount of memory the decoder needs to allocate as well as the number of reference pictures an encoder may use. The memory size is specified by a bitstream level that can be indicated in the bitstream or signaled by the system. A decoder is typically claiming conformance to a specific level which means that it is capable of decoding all bitstreams conforming to that level and lower levels. The decoder may allocate the maximum number of bytes specified by the level and be certain that all bitstreams of that level and lower are decodable.
[0028] 10. Picture Order Count (POC)
[0029] Pictures in HEVC are identified by their picture order count (POC) values, also known as full POC values. Each slice contains a code word, pic_order_cnt_lsb, that shall be the same for all slices in a picture. The pic_order_cnt_lsb is also known as the least significant bits (lsb) of the full POC since it is a fixed-length code word and only the least significant bits of the full POC are signaled. Both encoder and decoder keep track of POC and assign POC values to each picture that is encoded/decoded. The pic_order_cnt_lsb can be signaled by 4-16 bits. There is a variable MaxPicOrderCntLsb used in HEVC and VVC which is set to the maximum pic_order_cnt_lsb value plus 1. This means that if 8 bits are used to signal pic_order_cnt_lsb, the maximum value is 255 and MaxPicOrderCntLsb is set to 2^8 = 256. The picture order count value of a picture is called PicOrderCntVal in HEVC and VVC. Usually, PicOrderCntVal for the current picture is simply called PicOrderCntVal.
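The relationship between the number of signaled bits, MaxPicOrderCntLsb and the lsb of a full POC value can be sketched as follows. The function names are hypothetical; only the 4-16 bit range and the 8-bit example come from the text:

```python
def max_pic_order_cnt_lsb(num_bits: int) -> int:
    """Return MaxPicOrderCntLsb, i.e. the largest lsb value plus 1."""
    if not 4 <= num_bits <= 16:
        raise ValueError("pic_order_cnt_lsb is signaled with 4 to 16 bits")
    return 1 << num_bits


def pic_order_cnt_lsb(full_poc: int, num_bits: int) -> int:
    """Return the least significant bits of a full POC value."""
    return full_poc % max_pic_order_cnt_lsb(num_bits)


print(max_pic_order_cnt_lsb(8))   # 256
print(pic_order_cnt_lsb(258, 8))  # 2
```

With 8 bits, full POC values 2 and 258 share the same lsb, which is why both encoder and decoder have to keep track of the full POC rather than rely on the signaled code word alone.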
[0030] 11. Reference Picture Set
[0031] Reference Picture Sets (RPSs) are a concept in HEVC that defines how previously decoded pictures are managed in a decoded picture buffer (DPB) in order to be used for reference, i.e., sample data prediction and motion vector prediction. In other words, what pictures to store in memory is in HEVC signaled using an RPS. An RPS is a set of indicators to previously decoded pictures, and in HEVC an RPS is signaled in each slice header. All pictures in the DPB that are not included in the RPS are marked as “unused for reference”. Once a picture has been marked “unused for reference” it can no longer be used for prediction, and when it is no longer needed for output, it will be removed from the DPB. In HEVC, the RPS is signaled as a set of delta POC values relative to the current picture. As an example, the RPS information may contain the values -4, -6, 4.
[0032] This then means that, given that the current picture has a POC value equal to 100, the current picture can use the pictures having POC values equal to 96, 94 and 104 for prediction. In this example, the RPS also indicates that any picture having any other POC value will never be used for prediction in the future. This is a robust way for the decoder to discard pictures.
[0033] Sometimes the encoder may want the decoder to save a picture although it is not going to be used for prediction in the current frame. This is signaled by a flag for each value called used_by_curr_pic_flag. If used_by_curr_pic_flag is equal to 1, this means that the picture can indeed be used for prediction for the current frame. If it is equal to 0, this instead means that the decoder cannot predict from it, but it must keep it in the DPB since future pictures may predict from it. We may use the convention that if a reference picture is marked by a star (*), then used_by_curr_pic_flag equals 0, otherwise used_by_curr_pic_flag equals 1. Hence -4, -6*, 4 means that the decoder may predict from the POC==96 and POC==104 pictures, but not from the POC==94 picture. Still, the decoder cannot throw away the POC==94 picture since it may be used for prediction by future pictures.
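The -4, -6*, 4 example can be traced with a short sketch that turns delta POC values into absolute POC values and separates pictures usable by the current picture from pictures that are only kept. The function names, the tuple representation of an RPS entry and the DPB contents are assumptions made for illustration:

```python
def resolve_rps(current_poc, rps):
    """Map an RPS to absolute POC values.

    rps is a list of (delta_poc, used_by_curr_pic_flag) pairs.
    Returns (usable_now, kept_for_later).
    """
    usable_now = [current_poc + delta for delta, used in rps if used]
    kept_for_later = [current_poc + delta for delta, used in rps if not used]
    return usable_now, kept_for_later


def unused_for_reference(dpb_pocs, current_poc, rps):
    """Return the DPB pictures that the RPS does not list at all."""
    listed = {current_poc + delta for delta, _ in rps}
    return sorted(poc for poc in dpb_pocs if poc not in listed)


rps = [(-4, 1), (-6, 0), (4, 1)]        # -4, -6*, 4 in the star notation
print(resolve_rps(100, rps))            # ([96, 104], [94])

hypothetical_dpb = [90, 94, 96, 104]    # made-up DPB contents
print(unused_for_reference(hypothetical_dpb, 100, rps))  # [90]
```

Only the picture with POC 90 is absent from the RPS, so it is the only one that would be marked “unused for reference” in this made-up buffer.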
[0034] 12. RPS in the SPS
[0035] Sometimes the RPS information can be rather lengthy. As an example, taken from the configuration files used for test purposes during the standardization of HEVC, the following RPS is used: -3, -2, 1, 2, 5, 6.
[0036] Encoding these values may take up 33 bits. Not much perhaps, but if it has to be sent every slice this still amounts to something noticeable, especially at very low bit rates and small image sizes. One key observation however is that these RPSes are typically not completely random. Instead, they can be reused over and over again. As an example, let us go through an example of 18 pictures using a so-called GOP size of 8 pictures. As said above, pictures in HEVC are identified by their POC values, also known as full POC values or picture order count values (PicOrderCntVal). This is the output order (also referred to as the display order); for instance, a picture with POC value == 57 will be displayed after a picture with POC value == 56. However, the images are not always transmitted in the order they are displayed. For instance, the encoder may first transmit the picture with POC = 0, followed by POC = 8, followed by POC = 4 and so forth. The decoder has to keep track of these and display them in the correct order. In our example, the 18 pictures will be transmitted in the order shown in Table 1 below.
TABLE 1 - RPS Example
Figure imgf000009_0001
Figure imgf000010_0002
[0037] As can be seen in the Table, the RPSes are sent several times. For instance, the RPS sent for POC==6, {-2, -4, -6, 2} is the same as the one sent for POC==14. Due to this periodicity, the HEVC standard allows the RPSes to be sent in the Sequence Parameter Set (SPS). The SPS may be sent only once per sequence, or as often as it is desired to have random access possibility. For instance, if the bit stream is broadcast, an SPS might be sent once a second, to support channel switching once a second. For our example, we could in HEVC specify the eight recurring RPSes in the SPS and assign an index to each one of them as shown in Table 2:
TABLE 2 - RPSes stored in SPS
Figure imgf000010_0001
[0038] The information sent in the slice header now only has to refer to an index in the SPS. This is much cheaper than sending the RPSes themselves, as can be guessed from comparing Table 3 and Table 1:
TABLE 3 - RPSes in slice header sent using indices to SPS
Figure imgf000010_0003
Figure imgf000011_0001
[0039] 13. Prediction of RPSs in the SPS
[0040] Coding according to Table 3 reduces the number of bits spent on sending RPSs. The bulk of the data is sent in the SPS instead, which is fine because it is not sent often. Still, it turns out it is possible to compress this information further. If we look at two rows from Table 1, we will see that there is often a correlation between the rows. For instance, if we compare the RPS values for POC=6 (i.e., -2, -4, -6, 2) to the RPS values for POC=1 (i.e., -1, 1, 3, 5, 7), a pattern is revealed as shown in Table 4.
TABLE 4
Figure imgf000012_0001
[0041] That is, every number in the RPS for POC=6 plus the value 5 is equal to a number in the RPS for POC=1. For example, the first value -2 plus 5 is equal to 3, which is a value in POC=1. Similarly, every number in the RPS for POC=1 minus the value 5 is equal to a number in the RPS for POC=6, with the only exception being the second to last number 5 which corresponds to the difference between the POC values of the rows, 6-1=5. As it turns out, every row of RPS data can often be predicted from the previous row. This is the case because the RPS of a picture that immediately follows another picture can only contain indicators to pictures in the RPS of that previous picture plus an indicator to the previous picture itself. This is utilized in the HEVC specification with the syntax shown in Table 5.
[0042] The value to add (or subtract) (5 in our example) corresponds to the syntax elements delta_rps_sign and abs_delta_rps_minus1 in Table 5.
[0043] The HEVC decoder takes the RPS of the previous row in Table 2, adds an element equal to 0 to the set, and then adds the decoded delta number (5 in our example) to every element in the set. The result is a set of candidate pictures for the current RPS.
[0044] In the example in Table 4 above, the decoder would take the set -2, -4, -6, 2, add the element 0 and add 5 to each element. This results in the set 3, 1, -1, 7, 5. Then, what pictures to keep and the value of the flag used_by_curr_pic_flag must be signaled. In our example above, used_by_curr_pic_flag is equal to one everywhere since we do not have any RPS entries marked with stars (*). In HEVC, these two types of information are coded together. If we should keep the candidate and used_by_curr_pic_flag is equal to one, we send the code ‘1’. If we should keep the candidate but used_by_curr_pic_flag is zero, we send the code ‘01’. If we should not keep the candidate, we send the code ‘00’. In our example we would send ‘1’ (we should keep -2+5 = 3, flag = 1), ‘1’ (we should keep -4+5 = 1, flag = 1), ‘1’ (we should keep -6+5 = -1, flag = 1), ‘1’ (we should keep 2+5 = 7, flag = 1) and finally ‘1’ (we should keep 5, flag = 1).
TABLE 5 - Short-term reference picture set syntax of HEVC. Used for RPS in SPS signaling
Figure imgf000013_0001
Figure imgf000014_0001
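The candidate derivation and the ‘1’ / ‘01’ / ‘00’ codes described in paragraphs [0043] and [0044] can be sketched as follows. This is an illustration of the worked example only, not the normative HEVC parsing process; the function names and the representation of codes as strings are assumptions:

```python
def predict_rps_candidates(previous_rps, delta):
    """Append the element 0 to the previous RPS, then add delta to every element."""
    return [value + delta for value in list(previous_rps) + [0]]


def apply_keep_codes(candidates, codes):
    """Combine candidates with per-candidate codes.

    '1'  -> keep, used_by_curr_pic_flag = 1
    '01' -> keep, used_by_curr_pic_flag = 0
    '00' -> drop the candidate
    Returns a list of (delta_poc, used_by_curr_pic_flag) pairs.
    """
    if len(candidates) != len(codes):
        raise ValueError("one code is needed per candidate")
    rps = []
    for candidate, code in zip(candidates, codes):
        if code == "1":
            rps.append((candidate, 1))
        elif code == "01":
            rps.append((candidate, 0))
        elif code != "00":
            raise ValueError(f"unknown code {code!r}")
    return rps


candidates = predict_rps_candidates([-2, -4, -6, 2], 5)
print(candidates)  # [3, 1, -1, 7, 5]
print(apply_keep_codes(candidates, ["1"] * 5))
# [(3, 1), (1, 1), (-1, 1), (7, 1), (5, 1)]
```

Five one-bit codes plus the delta are enough to describe the new set here, which is where the bit savings come from.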
[0045] By sending the data this way, a lot of bits can be saved. The savings obtained in HEVC are about 50% of the RPS bits. Still of course, the RPS data is a very small part of the total video bit stream data, so the overall effect may not be large, but it is of course beneficial to compress the data efficiently.
[0046] 14. Reference picture lists
[0047] When decoding a picture, references to previous pictures are handled by reference picture lists. HEVC uses at most two reference picture lists, L0 and L1, for each picture, and those lists may only contain pictures in the RPS that are set to “used by cur pic”. P-pictures use L0 and B-pictures use L0 and L1. When inter prediction is used for a block, the decoder derives a reference index value for L0, and possibly L1, and uses those reference index values as indices in the L0 and L1 lists to determine which reference picture(s) to use for the block.
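The use of reference index values as indices into the lists can be sketched as follows. The lists hold POC values here purely for illustration, the function name is hypothetical, and the sketch always uses L0 and optionally L1, mirroring the wording of the paragraph above:

```python
def reference_pictures_for_block(l0, l1, ref_idx_l0, ref_idx_l1=None):
    """Look up the reference picture(s) for one inter-predicted block.

    ref_idx_l1 is None when the block uses a single reference from L0.
    """
    references = [l0[ref_idx_l0]]
    if ref_idx_l1 is not None:
        references.append(l1[ref_idx_l1])
    return references


# Made-up lists of POC values.
l0 = [96, 94]
l1 = [104]
print(reference_pictures_for_block(l0, l1, 1))     # [94]
print(reference_pictures_for_block(l0, l1, 0, 0))  # [96, 104]
```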
[0048] 15. Reference picture set in VVC
[0049] VVC uses parts of the reference picture set idea, but instead of signaling the RPS as in HEVC, the VVC specification allows signaling of the L0 and L1 lists in the SPS. For each of the L0 and L1 lists, the number of “active” pictures is signalled in the PPS with an option to override this number in the slice header. Active pictures are reference pictures that are kept in the DPB and can be used for reference by the current picture while inactive pictures must be kept in the DPB but are not used for reference by the current picture. Inactive pictures correspond to pictures with used_by_curr_pic_flag equal to 0 in HEVC.
[0050] When the L0 and L1 lists in VVC are signaled in the SPS, the decoder can be seen as constructing one list of L0 lists and one list of L1 lists. Each entry in those two lists is a reference picture list. FIG. 4 shows an example.
[0051] The SPS syntax in VVC for conveying these lists to the decoder includes an sps_num_ref_pic_lists[0] codeword that specifies the size of the list of L0 lists. For the example shown in FIG. 4, that size is equal to 3. Then, for each of the 3 entries, the codewords in the ref_pic_list_struct() syntax structure as specified in VVC, not shown here, follow. This syntax structure includes a codeword for the size of the L0 list followed by codewords specifying the values of the L0 list. In the FIG. 4 example, the sizes of the three L0 lists are all equal to 5. Thereafter, the sps_num_ref_pic_lists[1] codeword follows, which specifies the size of the list of L1 lists, with its ref_pic_list_struct() syntax following. In the FIG. 4 example, the size of the list of L1 lists is equal to 2 and the sizes of the L1 lists are 2 and 1 respectively.
[0052] A VVC decoder may later, when decoding a picture header or slice header, reference L0 and L1 lists that were decoded from the SPS rather than decoding them from the picture header and slice header themselves. If a particular L0 or L1 list is used by many coded pictures that reference the same SPS, it is more bit-efficient if the lists are conveyed in the SPS. When SPS referencing is not done, the picture header or slice header contains a ref_pic_list_struct() syntax structure, so the syntax for the L0 and L1 lists is very similar regardless of whether it is positioned in the SPS or picture header or slice header in VVC.
[0053] Using FIG. 4 as an example, a VVC decoder may from a ref_pic_lists() syntax structure in a picture header or slice header decode a syntax element flag called rpl_sps_flag as equal to 1 for L0. The rpl_sps_flag indicates whether to use an RPL from the SPS or explicitly decode it from the picture header or slice header. Then the next syntax element is an index value that specifies which entry in the list of L0 lists to use. For example, if that index value is equal to 0, then the decoder will use a reference picture list L0 equal to { -32, -64, -48, -40, -36 } for the picture associated with the picture header or slice header.
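The choice between an SPS entry and an explicitly decoded list can be sketched as follows. Only entry 0 of the list of L0 lists is given in the text, so the table below contains that single entry; the function name and the parameters rpl_idx and explicit_list are hypothetical:

```python
def select_ref_pic_list(rpl_sps_flag, sps_lists, rpl_idx=None, explicit_list=None):
    """Return the reference picture list selected by a header.

    rpl_sps_flag == 1: take entry rpl_idx from the lists decoded from the SPS.
    rpl_sps_flag == 0: use the list decoded from the header itself.
    """
    if rpl_sps_flag == 1:
        return list(sps_lists[rpl_idx])
    return list(explicit_list)


# Entry 0 as stated in the text; the remaining FIG. 4 entries are not
# reproduced here and are therefore left out rather than guessed.
sps_l0_lists = [[-32, -64, -48, -40, -36]]

print(select_ref_pic_list(1, sps_l0_lists, rpl_idx=0))
# [-32, -64, -48, -40, -36]
```

Sending one flag and one index in the header is cheaper than repeating the five values, which is the bit-efficiency argument made for SPS referencing.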
[0054] Thereafter, the VVC decoder may decode the flag rpl_sps_flag as equal to 1 for L1 as well, followed by decoding an index value. This index value may for example be equal to 1, which then means that the decoder will use a reference picture list L1 equal to { 16 }.
SUMMARY
[0055] Certain challenges presently exist. For instance, a problem with the signaling of reference picture lists in VVC is that there is no prediction in the signaling, and this makes the signaling inefficient in terms of compression.
[0056] Accordingly, in one aspect there is provided a method for decoding a video bitstream. The method includes constructing a list of lists, LOL, before any coded picture that references the LOL is decoded. The LOL comprises an entry comprising i) a first previous reference picture list, the L0pred_prev list, and ii) a second previous reference picture list, the L1pred_prev list, wherein the L0pred_prev list comprises a first set of values and the L1pred_prev list comprises a second set of values. Each value included in the first set of values and the second set of values represents a reference picture indicator that can be used in the decoding process of a coded picture. The method also includes constructing a first current reference picture list, the L0pred_cur list, and a second current reference picture list, the L1pred_cur list, based on the L0pred_prev list and the L1pred_prev list. Constructing the L0pred_cur list and the L1pred_cur list includes: deriving a value, the deltaPOC value, from a set of syntax elements in the bitstream; deriving a set of values, the UniqueList, wherein the UniqueList comprises a) the deltaPOC value and b) a third set of values, wherein each value in the third set of values is equal to either i) the sum of a value from the first set of values and the deltaPOC value or ii) the sum of a value from the second set of values and the deltaPOC value; deriving a value N from the set of syntax elements; deriving a value M from the set of syntax elements; including N values from the UniqueList in the L0pred_cur list; and including M values from the UniqueList in the L1pred_cur list.
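One possible reading of these steps is sketched below. The paragraph above does not state the order of the UniqueList or which N and M values are selected from it, so the ordering (deltaPOC first, then shifted values in list order, duplicates skipped) and the rule of taking the first N and first M entries are assumptions made only for illustration, as are the function names and the input values:

```python
def build_unique_list(l0pred_prev, l1pred_prev, delta_poc):
    """Collect deltaPOC and every previous value shifted by deltaPOC, without duplicates."""
    unique_list = [delta_poc]
    for value in list(l0pred_prev) + list(l1pred_prev):
        candidate = value + delta_poc
        if candidate not in unique_list:
            unique_list.append(candidate)
    return unique_list


def build_current_lists(l0pred_prev, l1pred_prev, delta_poc, n, m):
    """Return (L0pred_cur, L1pred_cur) with N and M values from the UniqueList.

    ASSUMPTION: the first N and first M entries are taken. The text only
    says that N and M values from the UniqueList are included.
    """
    unique_list = build_unique_list(l0pred_prev, l1pred_prev, delta_poc)
    if n > len(unique_list) or m > len(unique_list):
        raise ValueError("N and M cannot exceed the UniqueList length")
    return unique_list[:n], unique_list[:m]


# Made-up previous lists and syntax element values.
print(build_unique_list([-8, -16], [8, -8], -4))          # [-4, -12, -20, 4]
print(build_current_lists([-8, -16], [8, -8], -4, 2, 1))  # ([-4, -12], [-4])
```

In this made-up input the value -8 appears in both previous lists, so its shifted value -12 enters the UniqueList only once.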
[0057] In some aspects, there is provided a computer program comprising instructions which, when executed by processing circuitry of an apparatus, cause the apparatus to perform any of the methods disclosed herein. In one embodiment, there is provided a carrier containing the computer program wherein the carrier is one of an electronic signal, an optical signal, a radio signal, and a computer readable storage medium. In another aspect there is provided an apparatus that is configured to perform the methods disclosed herein. The apparatus may include memory and processing circuitry coupled to the memory.
[0058] An advantage of embodiments disclosed herein is that they provide video compression bit-rate savings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0059] The accompanying drawings, which are incorporated herein and form part of the specification, illustrate various embodiments.
[0060] FIG. 1 illustrates a system according to an embodiment.
[0061] FIG. 2 is a schematic block diagram of an encoder according to an embodiment.
[0062] FIG. 3 is a schematic block diagram of a decoder according to an embodiment.
[0063] FIG. 4 illustrates an example of a first list of L0 lists and a second list of L1 lists.
[0064] FIG. 5 illustrates an example list of lists.
[0065] FIG. 6A is a flowchart illustrating a process according to an embodiment.
[0066] FIG. 6B is a flowchart illustrating a process according to an embodiment.
[0067] FIG. 7 is a block diagram of an encoding apparatus according to an embodiment.
DETAILED DESCRIPTION
[0068] FIG. 1 illustrates a system 100 according to an embodiment. System 100 includes an encoder 102 and a decoder 104, wherein encoder 102 is in communication with decoder 104 via a network 110 (e.g., the Internet or other network). Encoder 102 encodes a source video sequence 101 into a bitstream comprising an encoded video sequence and transmits the bitstream to decoder 104 via network 110. In some embodiments, encoder 102 is not in communication with decoder 104, and, in such an embodiment, rather than transmitting the bitstream to decoder 104, the bitstream is stored in a data storage unit. Decoder 104 decodes the coded pictures included in the encoded video sequence to produce video data for display and/or further image processing (e.g., a machine vision task). Accordingly, decoder 104 may be part of a device 103 having an image processor 105 and/or a display 106. The image processor 105 may perform machine vision tasks on the decoded pictures. One such machine vision task may be identifying objects in the picture. The device 103 may be a mobile device, a set-top device, a head-mounted display, or any other device.
[0069] FIG. 2 illustrates functional components of encoder 102 according to some embodiments. It should be noted that encoders may be implemented differently so implementations other than this specific example can be used. Encoder 102 employs a subtractor 241 to produce a residual block which is the difference in sample values between an input block and a prediction block (i.e., the output of a selector 251, which is either an inter prediction block output by an inter predictor 250 (a.k.a., motion compensator) or an intra prediction block output by an intra predictor 249). Then a forward transform 242 is performed on the residual block to produce a transformed block comprising transform coefficients. A quantization unit 243 quantizes the transform coefficients based on a quantization parameter (QP) value (e.g., a QP value obtained based on a picture QP value for the picture of which the input block is a part and a block specific QP offset value for the input block), thereby producing quantized transform coefficients which are then encoded into the bitstream by encoder 244 (e.g., an entropy encoder), and the bitstream with the encoded transform coefficients is output from encoder 102. Next, encoder 102 uses the quantized transform coefficients to produce a reconstructed block. This is done by first applying inverse quantization 245 and inverse transform 246 to the quantized transform coefficients to produce a reconstructed residual block and using an adder 247 to add the prediction block to the reconstructed residual block, thereby producing the reconstructed block, which is stored in the reconstruction picture buffer (RPB) 266. Loop filtering by a loop filter (LF) stage 267 is applied and the final decoded picture is stored in a decoded picture buffer (DPB) 268, where it can then be used by the inter predictor 250 to produce an inter prediction block for the next picture to be processed. LF stage 267 may include three sub-stages: i) a deblocking filter, ii) a sample adaptive offset (SAO) filter, and iii) an Adaptive Loop Filter (ALF).
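The subtract, transform, quantize and reconstruct path of FIG. 2 can be illustrated with a toy one-dimensional example. The 4-point Walsh-Hadamard transform and the uniform quantizer with step size qstep are simple stand-ins chosen for this sketch; they are not the transforms or the QP-based quantization of HEVC or VVC, and entropy coding and loop filtering are left out:

```python
# 4-point Walsh-Hadamard matrix (symmetric, and H4 times H4 equals 4 times identity).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def hadamard(values):
    """Multiply a length-4 vector by H4."""
    return [sum(h * v for h, v in zip(row, values)) for row in H4]


def encode_block(input_block, prediction_block, qstep):
    """Subtract the prediction, transform the residual, quantize the coefficients."""
    residual = [x - p for x, p in zip(input_block, prediction_block)]
    return [round(c / qstep) for c in hadamard(residual)]


def reconstruct_block(levels, prediction_block, qstep):
    """Inverse quantize, inverse transform, add the prediction."""
    coefficients = [level * qstep for level in levels]
    residual = [round(v / 4) for v in hadamard(coefficients)]
    return [p + r for p, r in zip(prediction_block, residual)]


# Made-up sample values.
input_block = [52, 55, 61, 66]
prediction_block = [50, 50, 60, 60]

levels = encode_block(input_block, prediction_block, qstep=1)
print(reconstruct_block(levels, prediction_block, qstep=1))  # [52, 55, 61, 66]
```

Because the encoder runs the same reconstruct_block step on the quantized levels that a decoder would, both sides end up with identical reference samples even when a larger qstep makes the reconstruction lossy.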
[0070] FIG. 3 illustrates functional components of decoder 104 according to some embodiments. It should be noted that decoder 104 may be implemented differently so implementations other than this specific example can be used. Decoder 104 includes a decoder module 361 (e.g., an entropy decoder) that decodes from the bitstream quantized transform coefficient values of a block. Decoder 104 also includes a reconstruction stage 398 in which the quantized transform coefficient values are subject to an inverse quantization process 362 and inverse transform process 363 to produce a residual block. This residual block is input to adder 364 that adds the residual block and a prediction block output from selector 390 to form a reconstructed block. Selector 390 either selects to output an inter prediction block or an intra prediction block. The reconstructed block is stored in a RPB 365. The inter prediction block is generated by the inter prediction module 350 and the intra prediction block is generated by the intra prediction module 369. Following the reconstruction stage 398, a loop filter stage 367 applies loop filtering and the final decoded picture may be stored in a decoded picture buffer (DPB) 368 and output to image processor 105. Pictures are stored in the DPB for two primary reasons: 1) to wait for picture output and 2) to be used for reference when decoding future pictures.
[0071] As described above, a challenge presently exists because there is no prediction in the signaling of reference picture lists in VVC, and this makes the signaling inefficient in terms of compression. Accordingly, this disclosure provides, among other things, methods for predicting reference picture list data to increase compression efficiency. In one embodiment, a method for predicting reference picture list data includes generation of a list of lists of picture references without duplicates (called the “UniqueList”) and using this list to generate reference picture lists L0pred and L1pred that may later be used to generate reference picture lists L0 and L1 for a picture. In some embodiments, the method is used for storing reference picture list data in a parameter set. The following additional embodiments are described herein: i) a method using decoding of slice header syntax elements; ii) generation of the lists L0pred and L1pred from UniqueList without explicit syntax; iii) a method using syntax elements to control generation of lists L0pred and L1pred from UniqueList; iv) delta coding of the L0pred and L1pred list lengths; v) using L0pred and L1pred to decode a picture; and vi) a syntax table and pseudocode combining several of the embodiments.
[0072] Embodiment 1 - General Methodology
[0073] In a first embodiment, predicting reference picture list data is performed using a list of lists (LOL). This is similar to the list of L0 lists and the list of L1 lists in VVC (see FIG. 4), but in this embodiment there is a single list rather than two lists. That is, in this embodiment, each entry in the LOL contains two lists: one L0pred list and one L1pred list.
[0074] In this embodiment, the L0pred list and the L1pred list may be used to derive a reference picture list L0 and a reference picture list L1. In one variant, at least one of the L0 list and the L1 list is derived as identical to the L0pred list and the L1pred list, respectively. In another variant, the L0 list and the L1 list are based on the L0pred and L1pred lists, for example by adding PicOrderCntVal to each of the L0pred and L1pred entries.
[0075] FIG. 5 shows an example LOL (500). In the example shown, each item of the LOL is a 2-tuple comprising an L0pred list and an L1pred list. The LOL could, however, be split into two LOLs, as in VVC: a first LOL where each entry is an L0pred list and a second LOL where each entry is an L1pred list. In contrast to VVC, however, the proposed methodology always uses a single index rather than one index for L0pred and a separate index for L1pred.
[0076] In FIG. 5, the LOL has 3 entries (a.k.a., “items” or “elements”): entry 0, entry 1, and entry 2; hence the LOL has a size of 3. Each entry consists of one L0pred list and one L1pred list. The size of the LOL is not a constant. The size may be signaled in a coded video bitstream such that a video decoder will decode the size and then further construct the content of all entries of the LOL from syntax elements in the bitstream.
[0077] This disclosure provides a method for decoding a video bitstream that includes constructing a LOL based on syntax elements in the bitstream and then using the LOL to decode one or more pictures. The LOL is preferably constructed before any picture that references or uses the LOL is decoded. This means that the syntax elements from which the LOL is decoded precede all coded pictures that reference the LOL.
[0078] As illustrated in FIG. 5, each entry in the LOL may contain two reference picture lists: an L0pred list and an L1pred list. Each value in an L0pred list represents a reference picture indicator that can be used in the decoding process of a coded picture. Likewise, each value in an L1pred list represents a reference picture indicator that can be used in the decoding process of a coded picture.
[0079] Furthermore, in one embodiment, a current L0pred list (a.k.a., the L0pred_cur list) and a current L1pred list (a.k.a., the L1pred_cur list) are constructed by prediction from a previous L0pred list (a.k.a., the L0pred_prev list) and a previous L1pred list (a.k.a., the L1pred_prev list) of an entry in the LOL. The current L0pred list and the current L1pred list may or may not be present in the LOL.
[0080] The proposed mechanism for predicting a current L0pred list and a current L1pred list from the previous L0pred list and the previous L1pred list of an entry i in the LOL is as follows.
[0081] 1) A value, denoted “deltaPOC,” is derived from a set of one or more syntax elements in the bitstream. For example, in one embodiment, deltaPOC is decoded from a single syntax element in the bitstream.
[0082] 2) A list, denoted “UniqueList,” is derived such that it contains all previous L0pred list values and all previous L1pred list values without duplicates (e.g., the UniqueList is an ordered set of values where the set is the union of the set of values contained in the L0pred_prev list with duplicates removed and the set of values contained in the L1pred_prev list with duplicates removed).
[0083] 3) Each entry in UniqueList is then modified by adding the value deltaPOC to the value of each entry, thereby producing a modified UniqueList.
[0084] 4) An additional entry having a value equal to deltaPOC is added to the modified UniqueList, thereby producing a final UniqueList. Alternatively, an additional entry having a value equal to 0 can be added to the UniqueList before step 3, thereby making step 4 unnecessary (in this alternative, the modified UniqueList obtained in step 3 is the final UniqueList).
[0085] 5) The next step comprises deriving the number of entries in the current L0pred list (i.e., deriving the size of the L0pred_cur list) and the number of entries in the current L1pred list (i.e., deriving the size of the L1pred_cur list) from the set of syntax elements. Preferably, the number of entries in each of the current L0pred and L1pred lists is derived by decoding one syntax element each in the set of syntax elements.
[0086] 6) Each entry value in the current L0pred list and the current L1pred list is derived using the final UniqueList. For example, if the size of the L0pred_cur list is N, then N values from the final UniqueList are included in the L0pred_cur list. Likewise, if the size of the L1pred_cur list is M, then M values from the final UniqueList are included in the L1pred_cur list.
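Steps 2 through 4 can be sketched in Python as follows. This is a minimal illustration rather than the normative derivation: the function names final_unique_list and final_unique_list_alt are hypothetical, the input values are made up, and the ordering of UniqueList (first occurrence, previous L0pred values before previous L1pred values) is an assumption, since the steps above only require that duplicates be removed. The second function follows the alternative of step 4, in which a 0 entry is added before the deltaPOC shift.

```python
def final_unique_list(l0_prev, l1_prev, delta_poc):
    """Steps 2-4: remove duplicates, shift by deltaPOC, append deltaPOC."""
    unique = []
    for value in list(l0_prev) + list(l1_prev):  # assumed order: L0 values first
        if value not in unique:
            unique.append(value)
    modified = [value + delta_poc for value in unique]  # step 3
    return modified + [delta_poc]                       # step 4


def final_unique_list_alt(l0_prev, l1_prev, delta_poc):
    """Alternative of step 4: append a 0 entry first, then shift every entry."""
    unique = []
    for value in list(l0_prev) + list(l1_prev) + [0]:
        if value not in unique:
            unique.append(value)
    return [value + delta_poc for value in unique]


# Hypothetical input values, chosen only for illustration.
print(final_unique_list([-8, -16], [-8, 8], 4))      # [-4, -12, 12, 4]
print(final_unique_list_alt([-8, -16], [-8, 8], 4))  # [-4, -12, 12, 4]
```

Both orderings give the same final list as long as 0 is not already one of the previous list values.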
[0087] Accordingly, in one embodiment, a decoder may perform the following two steps.
[0088] 1) Constructing a LOL before any coded picture that references the LOL is decoded, wherein the LOL comprises at least one entry, and wherein each entry in the LOL contains two reference picture lists L0pred and L1pred, and wherein each value in any of the two reference picture lists L0pred and L1pred is, or represents, a reference picture indicator that can be used in the decoding process of a coded picture.
[0089] 2) Constructing a current L0pred list and a current L1pred list by predicting from a previous L0pred list and a previous L1pred list of one entry in the LOL. The predicting step includes:
[0090] 2a) deriving a value deltaPOC from a set of syntax elements in the bitstream;
[0091] 2b) deriving a list that contains all previous L0pred list values and all previous L1pred list values without duplicates;
[0092] 2c) modifying each entry in the list by adding the value deltaPOC to the value of each entry, thereby producing a modified list;
[0093] 2d) adding to the modified list an additional entry having a value equal to deltaPOC, thereby producing a final list (a.k.a., the final UniqueList);
[0094] 2e) deriving the number of entries in the current L0pred list and the number of entries in the current L1pred list from the set of syntax elements; and
[0095] 2f) deriving each entry value in the current L0pred and L1pred lists from the values in the final UniqueList.
[0096] The decoder may further decode a current coded picture from the video bitstream wherein the current L0pred list and the current L1pred list are used in the decoding process of the current coded picture.
[0097] As an example, consider LOL 500 shown in FIG. 5. Let the previous L0pred list and the previous L1pred list be from entry 0 of LOL 500. This means that the L0pred_prev list is equal to { -32, -64, -48, -40, -36 } and the L1pred_prev list is equal to { -32, -48 }. Then the construction of the current L0pred list and the current L1pred list according to steps 2a-2f above may be done as follows:
[0098] a) A UniqueList (i.e., a set of numbers) is derived as equal to { -32, -64, -48, -40, -36 }, which contains all previous L0pred list values without duplicates and all previous L1pred list values without duplicates. For example, the UniqueList (UL) is the union of the set of numbers specified in the previous L0pred list with duplicates removed (denoted L0pred) and the set of numbers specified in the previous L1pred list with duplicates removed (denoted L1pred) (i.e., UL = L0pred ∪ L1pred). In this example, neither the L0pred list nor the L1pred list contains a duplicate, but if one of them had a duplicate value (e.g., if the L1pred list contained the numbers { -32, -48, -32 }) then the UniqueList would not change (i.e., the UniqueList would still be { -32, -64, -48, -40, -36 }).
[0099] b) UL is modified by adding a value deltaPOC to the value of each entry in UL, thereby producing a modified UL (ULm). For example, if deltaPOC is equal to 16, then ULm is equal to: { -16, -48, -32, -24, -20 }.
[00100] c) ULm is modified by appending an entry with the value deltaPOC to ULm, thereby producing a final set of values (ULf) (a.k.a., the final UniqueList), which in this example is: { -16, -48, -32, -24, -20, 16 }.
[00101] d) The number of entries in the current L0pred list is derived as equal to 5 and the number of entries in the current L1pred list is derived as equal to 1. This matches entry 1 in LOL 500.
[00102] e) Finally, each entry value in the current L0pred list and the current L1pred list is derived from the values in the final UniqueList. From FIG. 5 it can be seen that all entries in the current L0pred list and the current L1pred list are present in the final UniqueList. Deriving the entry values of the current L0pred list is then a matter of selecting 5 entries from the final UniqueList in the correct order, and deriving the entry values of the current L1pred list is a matter of selecting 1 entry from the final UniqueList.
[00103] It is much more bit-efficient to signal the values in the current L0pred list and the current L1pred list by using the final UniqueList rather than signaling the values as-is.
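The selection in step e) of the example can be illustrated with a short Python sketch. The helper names indices_into and select are hypothetical, and expressing the selection as positions in the final UniqueList is only one possible way of signaling it. The target lists are those of entry 1 of LOL 500 in FIG. 5 (L0pred = { -16, -32, -48, -24, -20 } and L1pred = { 16 }).

```python
ul_final = [-16, -48, -32, -24, -20, 16]   # final UniqueList from step c)
l0_cur_target = [-16, -32, -48, -24, -20]  # entry 1 of LOL 500, L0pred
l1_cur_target = [16]                       # entry 1 of LOL 500, L1pred


def indices_into(unique_list, target):
    """Encoder-side view: position of each target value in the UniqueList."""
    return [unique_list.index(value) for value in target]


def select(unique_list, indices):
    """Decoder-side view: rebuild a list from positions in the UniqueList."""
    return [unique_list[j] for j in indices]


l0_idx = indices_into(ul_final, l0_cur_target)
l1_idx = indices_into(ul_final, l1_cur_target)
print(l0_idx, l1_idx)  # [0, 2, 1, 3, 4] [5]

# Round trip: the positions alone are enough to rebuild both current lists.
assert select(ul_final, l0_idx) == l0_cur_target
assert select(ul_final, l1_idx) == l1_cur_target
```

With six candidates, each position is a number between 0 and 5, which is a much smaller alphabet than the range of the reference picture indicator values themselves.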
[00104] Embodiment 2: Constructing the entire LOL when decoding a SPS or PPS
[00105] This embodiment may be combined with embodiment 1. As in embodiment 1, the previous L0pred list and the previous L1pred list are in this embodiment present in an entry in the LOL. In this embodiment they are present in an entry i in the LOL. Additionally, the current L0pred list and the current L1pred list are stored in an entry j in the LOL.
[00106] The entry i is an earlier entry than the entry j. In one variant, entry i is the entry immediately preceding entry j such that i is equal to j-1. In this case, the decoder may keep L0pred and L1pred from the previous entry when constructing the contents of entry j. Alternatively, the decoder calculates j-1 and uses that value to find the previous L0pred and L1pred lists in the LOL. In another variant, the entry i is not the entry immediately preceding entry j. In this case, a value representing the delta between i and j may be derived, preferably by decoding a value from the set of syntax elements. This delta value is then used to identify the entry in the LOL containing the previous L0pred and previous L1pred lists to use.
[00107] Referring to FIG. 5, let entry i be the first entry (entry 0) and entry j be the second entry (entry 1). This means that the current L0pred list { -16, -32, -48, -24, -20 } is derived using both the previous L0pred list { -32, -64, -48, -40, -36 } and the previous L1pred list { -32, -48 } for prediction. Additionally, the current L1pred list { 16 } is also derived using both the previous L0pred list and the previous L1pred list for prediction.
[00108] In this embodiment, the set of syntax elements are preferably present in a parameter set syntax structure, for example the DPS (a.k.a. DCI), VPS, SPS, PPS, APS, or any other parameter set.
[00109] All entries in the entire LOL are here derived from the parameter set syntax structure. This means that any current L0pred and L1pred lists that are derived are stored in the LOL.
[00110] In this embodiment, the prediction mechanism for predicting a current L0pred list and a current L1pred list of an entry j in the LOL from a previous L0pred list and a previous L1pred list of an entry i in the LOL is done as follows:
[00111] First a value “deltaPOC” is derived from a set of syntax elements in the bitstream. This value may be decoded from a single syntax element in a bitstream. Then the UniqueList is initially derived such that it contains all values from the previous L0pred list and all values from the previous L1pred list, without duplicates. After this initial derivation of UniqueList, the method includes modifying each entry in UniqueList by adding the value deltaPOC to the value of each entry. Thereafter, a new entry with a value equal to deltaPOC is added to UniqueList.
[00112] For example, if deltaPOC is equal to 16 and the initially derived UniqueList contains the following elements: { -32, -64, -48, -40, -36 }, as it would if entry i was entry 0 in FIG. 5, then the final UniqueList will include the following elements: { -16, -48, -32, -24, -20, 16 }.
[00113] Next, similarly to embodiment 1, the method includes deriving the number of entries in the current L0pred list and the number of entries in the current L1pred list from the set of syntax elements. In FIG. 5, those would be derived as 5 and 1, respectively.
[00114] Finally, each entry value in the current L0pred list and the current L1pred list is derived using UniqueList. Referring again to FIG. 5 with i equal to 0 and j equal to 1, note that all values in the current L0pred and L1pred lists are present in the final UniqueList. Deriving the entry values of the current L0pred list is then a matter of selecting 5 entries from UniqueList in the correct order, and deriving the entry values of the current L1pred list is a matter of selecting 1 entry from UniqueList. Embodiments 4 and 5 below contain details on how this may be done.
[00115] A decoder may perform the following two steps according to this embodiment:
[00116] 1) Constructing a LOL before any coded picture that references it is decoded, wherein each entry in the LOL contains two reference picture lists L0pred and L1pred, and wherein each value in any of the two reference picture lists L0pred and L1pred is, or represents, a reference picture indicator that can be used in the decoding process of a coded picture.
[00117] 2) Constructing a current L0pred list and a current L1pred list by prediction from a previous L0pred list and a previous L1pred list of an entry in the LOL, the prediction comprising:
[00118] 2a) deriving a value deltaPOC from parameter set syntax elements in the bitstream;
[00119] 2b) deriving a list (UniqueList) that contains all previous L0pred list values and all previous L1pred list values without duplicates;
[00120] 2c) modifying each entry in UniqueList by adding the value deltaPOC to the value of each entry;
[00121] 2d) adding an additional entry in UniqueList with the value deltaPOC;
[00122] 2e) deriving the number of entries in the current L0pred list and the number of entries in the current L1pred list from parameter set syntax elements;
[00123] 2f) deriving each entry value in the current L0pred and L1pred lists from the values in UniqueList; and
[00124] 2g) storing the current L0pred and L1pred lists as an entry in the LOL.
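The parameter-set flow of steps 1 and 2a-2g can be sketched as below. This is an illustrative sketch under stated assumptions: PredictedEntryParams and build_lol are hypothetical names, the values are assumed to have been parsed from the parameter set already, the entries are assumed to be selected by index into UniqueList (as in embodiment 4), and entry_delta models the variant in which the previous entry i is found as j minus a decoded delta.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PredictedEntryParams:
    """Hypothetical container for values already decoded from a parameter set."""
    delta_poc: int
    l0_indices: List[int]  # positions in the final UniqueList for L0pred
    l1_indices: List[int]  # positions in the final UniqueList for L1pred
    entry_delta: int = 1   # i = j - entry_delta; 1 selects the preceding entry


def build_lol(first_entry, params):
    """Build the whole LOL; every predicted entry is stored back (step 2g)."""
    lol = [first_entry]
    for p in params:
        j = len(lol)
        l0_prev, l1_prev = lol[j - p.entry_delta]
        unique = list(dict.fromkeys(l0_prev + l1_prev))             # step 2b
        unique = [v + p.delta_poc for v in unique] + [p.delta_poc]  # steps 2c, 2d
        l0_cur = [unique[k] for k in p.l0_indices]  # list sizes (2e) are implied
        l1_cur = [unique[k] for k in p.l1_indices]  # by the number of indices
        lol.append((l0_cur, l1_cur))
    return lol


entry0 = ([-32, -64, -48, -40, -36], [-32, -48])  # entry 0 of FIG. 5
lol = build_lol(entry0, [PredictedEntryParams(16, [0, 2, 1, 3, 4], [5])])
print(lol[1])  # ([-16, -32, -48, -24, -20], [16])
```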
[00125] Embodiment 3: First constructing the LOL from SPS/PPS data, then referencing an entry in the LOL in the slice header and constructing the current L0pred list and the current L1pred list when decoding the slice header.
[00126] In contrast to embodiment 2, in this embodiment the set of syntax elements is not present in any parameter set, but instead is associated with a single coded picture. For example, the set of syntax elements may here be one of, or a combination of, video coding layer (VCL) syntax elements, slice header syntax elements, or picture header syntax elements.
[00127] As in embodiments 1 and 2, the previous L0pred list and the previous L1pred list are present in an entry i in the LOL. But as previously mentioned, in this embodiment, the current L0pred list and the current L1pred list are not present or stored in any entry in the LOL. Instead, the current L0pred list and current L1pred list are constructed and used for a current picture only.
[00128] A decoder may perform the following two steps according to this embodiment:
[00129] 1) Constructing a LOL before any coded picture that references the LOL is decoded, wherein each entry in the LOL contains two reference picture lists: L0pred and L1pred, and wherein each value in any of the two reference picture lists L0pred and L1pred is, or represents, a reference picture indicator that can be used in the decoding process of a coded picture. The LOL may be derived from parameter set syntax elements in the bitstream.
[00130] 2) Constructing a current L0pred list and a current L1pred list by prediction from a previous L0pred list and a previous L1pred list of an entry in the LOL. The prediction includes:
[00131] 2a) deriving a value deltaPOC from a set of syntax elements in the bitstream (e.g., the above mentioned parameter set syntax elements);
[00132] 2b) deriving an entry value i and using the entry value i as an index for the LOL to identify the previous L0pred list and previous L1pred list in the LOL (the entry value i may be decoded from a syntax element in the set of syntax elements);
[00133] 2c) deriving a list UniqueList that contains all previous L0pred list values and all previous L1pred list values without duplicates;
[00134] 2d) modifying each entry in UniqueList by adding the value deltaPOC to the value of each entry;
[00135] 2e) adding an additional entry in UniqueList with the value deltaPOC;
[00136] 2f) deriving the number of entries in the current L0pred list and the number of entries in the current L1pred list from the set of syntax elements; and
[00137] 2g) deriving each entry value in the current L0pred and L1pred lists from the values in UniqueList.
[00138] The set of syntax elements is one of, or a combination of, VCL syntax elements, slice header syntax elements, or picture header syntax elements.
[00139] Embodiment 4: Deriving entry values by decoding indices to UniqueList
[00140] In this embodiment, a value in the current L0pred list is derived from UniqueList by deriving an index value j and then setting the value in the current L0pred list equal to the j’th value in UniqueList. A value in the current L1pred list is derived in the same way: by deriving an index value j and then setting the value in the current L1pred list equal to the j’th value in UniqueList. That is, for each entry in the current L0pred list (L0pred[]), determine an index value (j) and store in said entry of the current L0pred list the j’th value in the UniqueList (i.e., UniqueList[j]). Likewise, for each entry in the current L1pred list (L1pred[]), determine an index value (j) and store in said entry of the current L1pred list the j’th value from the UniqueList (i.e., UniqueList[j]).
[00141] Pseudo-code to implement the above process is shown below in Table 6:
TABLE 6
Figure imgf000028_0002
[00142] The function decode_index_valuej() derives an index value from the set of syntax elements.
[00143] In one variant of this embodiment, the index value j is derived as follows:
[00144] 1) The number of elements, N, in UniqueList is derived.
[00145] 2) A number of bits to decode, denoted M, is determined as equal to ceil(log2(N)). Here ceil() is the ceiling function and log2() is a 2-logarithm function. The value of M for N=2, 3, 4, ..., 8 is shown in Table 7 below. The ceil(log2(N)) operation may be implemented by determining the most significant bit of N or by a bitscan operation as is well- known in the art.
TABLE 7
N: 2 3 4 5 6 7 8
M: 1 2 2 3 3 3 3
[00146] 3) The index value j is derived by decoding a syntax element of M bits and either setting j equal to the value V of the decoded syntax element (j=V) or setting j equal to the value V plus 1 (j=V+1).
[00147] The entry value in the current L0pred list and/or the entry value in the current L1pred list may then be set equal to the value of the j’th entry in UniqueList.
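The bit count M = ceil(log2(N)) from step 2 can be computed without floating-point arithmetic. The sketch below uses Python's int.bit_length() as one possible most-significant-bit method; the function name index_bits is hypothetical.

```python
import math


def index_bits(n):
    """M = ceil(log2(N)) for a UniqueList with N >= 1 entries."""
    # (N - 1).bit_length() is the position of the most significant bit of N - 1.
    return (n - 1).bit_length()


# Cross-check against the floating-point definition for N = 2..8.
for n in range(2, 9):
    assert index_bits(n) == math.ceil(math.log2(n))

print([index_bits(n) for n in range(2, 9)])  # [1, 2, 2, 3, 3, 3, 3]
```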
[00148] Table 8 below illustrates pseudo-code for populating the current L0pred list and the current L1pred list:
TABLE 8
Figure imgf000029_0002
[00149] ReadBits(M) is a function that reads M bits from the bitstream and returns the value of the M bits (e.g., if M=3 and the three bits are 1, 0, 1, then the value returned by ReadBits(3) is 5). C is a constant that may be equal to 0 or 1.
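A possible Python rendering of this index-based population is sketched below. It is not a transcription of Table 8: the BitReader class and the populate function are hypothetical, the bitstream is modeled as a list of 0/1 integers, and the sketch fixes C = 0 so that the decoded value is used directly as a zero-based index.

```python
class BitReader:
    """Toy bitstream reader over a list of 0/1 integers (illustration only)."""

    def __init__(self, bits):
        self.bits = list(bits)
        self.pos = 0

    def read_bits(self, m):
        """Read m bits, most significant bit first, and return their value."""
        value = 0
        for _ in range(m):
            value = (value << 1) | self.bits[self.pos]
            self.pos += 1
        return value


def populate(reader, unique_list, size):
    """Fill a list of `size` entries by reading one M-bit index per entry."""
    m = (len(unique_list) - 1).bit_length()  # M = ceil(log2(N))
    return [unique_list[reader.read_bits(m)] for _ in range(size)]


print(BitReader([1, 0, 1]).read_bits(3))  # 5

unique_list = [-16, -48, -32, -24, -20, 16]
# Indices 0, 2, 1, 3, 4 for L0pred followed by index 5 for L1pred, 3 bits each.
bits = [0,0,0, 0,1,0, 0,0,1, 0,1,1, 1,0,0, 1,0,1]
reader = BitReader(bits)
l0pred = populate(reader, unique_list, 5)
l1pred = populate(reader, unique_list, 1)
print(l0pred, l1pred)  # [-16, -32, -48, -24, -20] [16]
```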
[00150] Embodiment 5: Deriving entry values by constructing guessed lists from UniqueList
[00151] In this embodiment, the current L0pred list and/or the current L1pred list is/are derived from UniqueList by a different method than the one described in embodiment 4. The method here constructs the list directly from the elements of the UniqueList without any guiding syntax such as the index values j in embodiment 4. One can describe this as setting the current L0pred list or the current L1pred list equal to a “guessed” list, where the guessed list is constructed from the UniqueList alone.
[00152] In a preferred variant, there is a flag in the bitstream that specifies whether the guessed list shall be used or not. In one variant, there is a single flag. If this single flag is equal to a predetermined value, then a guessed list is constructed from UniqueList for the current L0pred list and another guessed list is constructed from UniqueList for the current L1pred list. If the single flag is not equal to the predetermined value, the current L0pred and L1pred lists are derived by other means.
[00153] In another preferred variant there are two separate flags, one for the current L0pred list and one for the current L1pred list. If the flag for the current L0pred list is equal to a predetermined value, a guessed list for L0pred is constructed and used as the current L0pred list. Otherwise, the current L0pred list is derived by other means. The same is done for the current L1pred list.
[00154] Table 9 below shows example pseudocode for implementing this embodiment.
TABLE 9
Figure imgf000030_0001
Figure imgf000031_0001
[00155] The functions DeriveListEntryValueL0(ii, UniqueList) and DeriveListEntryValueL1(ii, UniqueList) select a value from UniqueList without reading any bits to guide the selection. In one variant, there are N negative numbers in UniqueList, and the N first entries of the current L0pred list are set equal to the N first entries of a sequence of the N negative numbers sorted in decreasing order, with the highest negative number first. For instance, if UniqueList is equal to { -3, -1, 5, -6, 3 }, then the 3 first entries of the current L0pred list are set equal to { -1, -3, -6, ... }.
[00156] Pseudocode for this variant is shown below in Table 10:
TABLE 10
Figure imgf000031_0002
[00157] In one variant, there are N positive numbers in UniqueList, with N larger than 1, where the N first entries of the current L1pred list are set equal to the N first entries of a sorted sequence of the N positive numbers in increasing order. For instance, if UniqueList is equal to { -3, -1, 5, -6, 3 }, then the 2 first entries of the current L1pred list are set equal to { 3, 5, ... }.
[00158] Pseudocode for this variant is shown below in Table 11:
TABLE 11
Figure imgf000032_0001
[00159] In another variant, the current L0pred list is first filled with the negative numbers from the UniqueList, preferably in decreasing order, followed by the positive numbers from the UniqueList, preferably in increasing order.
[00160] For instance, if we use the same example as above with UniqueList equal to { -3, -1, 5, -6, 3 }, then the entries of the current L0pred list are set equal to { -1, -3, -6, 3, 5 }.
[00161] This could also be done for L1 in a similar way. The current L1pred list is first filled with the positive numbers from the UniqueList, preferably in increasing order, followed by the negative numbers from the UniqueList, preferably in decreasing order. If we use the same example as above, then the entries of the current L1pred list are set equal to { 3, 5, -1, -3, -6 }.
[00162] Similar to VVC, the number of “active” pictures in each of L0pred and L1pred may be derived or signaled in the bitstream. For example, if the number of “active” pictures is set to 4 for both L0pred and L1pred in the example above, we would end up with the following active entries in the current L0pred list: { -1, -3, -6, 3 }, and the following active entries in the current L1pred list: { 3, 5, -1, -3 }.
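The guessed-list construction described above can be sketched in a few lines of Python. The function name guessed_lists and its parameters are hypothetical, and the sketch assumes that UniqueList contains no zero-valued entry (a zero would be silently dropped here).

```python
def guessed_lists(unique_list, num_active_l0=None, num_active_l1=None):
    """Return (L0pred, L1pred) guessed from UniqueList without any index syntax."""
    # Negative values in decreasing order (closest to zero first).
    negatives = sorted((v for v in unique_list if v < 0), reverse=True)
    # Positive values in increasing order (closest to zero first).
    positives = sorted(v for v in unique_list if v > 0)
    l0 = negatives + positives
    l1 = positives + negatives
    # A slice bound of None keeps the whole list.
    return l0[:num_active_l0], l1[:num_active_l1]


unique_list = [-3, -1, 5, -6, 3]
print(guessed_lists(unique_list))
# ([-1, -3, -6, 3, 5], [3, 5, -1, -3, -6])
print(guessed_lists(unique_list, 4, 4))
# ([-1, -3, -6, 3], [3, 5, -1, -3])
```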
[00163] A decoder may perform the following steps according to this embodiment 5 to derive at least one entry in the current L0pred list:
[00164] 1) Decode a flag from the set of syntax elements.
[00165] 2) In response to the flag being equal to a predetermined value, setting the entry value in the current L0pred list to one of the values in UniqueList without decoding any other syntax element than the flag in order to determine which value in UniqueList to set the entry value to.
[00166] 3) Optional additional step: Assuming N negative numbers in UniqueList, with N larger than 1 and the size of the current L0pred list equal to or larger than N, setting the N first entries of the current L0pred list equal to the first N entries of a sorted sequence of the N negative numbers, sorted in decreasing order with the highest negative number first.
[00167] 4) Optional additional step: Assuming M positive numbers in UniqueList, with M larger than 1, setting the M last entries of the current L0pred list equal to the first M entries of a sorted sequence of the M positive numbers, sorted in increasing order with the lowest positive number first.
[00168] A decoder may perform the following steps according to this embodiment 5 to derive at least one entry in the current L1pred list:
[00169] 1) Decode a flag from the set of syntax elements.
[00170] 2) In response to the flag being equal to a predetermined value, setting the entry value in the current L1pred list to one of the values in UniqueList without decoding any other syntax element than the flag in order to determine which value in UniqueList to set the entry value to.
[00171] 3) Optional additional step: Assuming N positive numbers in UniqueList, with N larger than 1 and the size of the current L1pred list equal to or larger than N, setting the N first entries of the current L1pred list equal to the first N entries of a sorted sequence of the N positive numbers, sorted in increasing order with the lowest positive number first.
[00172] 4) Optional additional step: Assuming M negative numbers in UniqueList, with M larger than 1, setting the M last entries of the current L1pred list equal to the first M entries of a sorted sequence of the M negative numbers, sorted in decreasing order with the highest negative number first.
[00173] Embodiment 6: Delta coding of the L0 and L1 lengths
[00174] In this embodiment 6, the number of entries in the current L0pred list is derived by decoding a first delta value from the set of syntax elements and adding the first delta value to a predicted number of L0pred entries. In a preferred version, this embodiment is combined with embodiment 1 and the predicted number of L0pred entries is equal to the number of L0pred entries of the entry i in the LOL, with entry i of the LOL being as described in embodiment 1.
[00175] Similarly, the number of entries in the current L1pred list is derived by decoding a second delta value from the set of syntax elements and adding the second delta value to a predicted number of L1pred entries. In a preferred version the predicted number of L1pred entries is equal to the number of L1pred entries of the entry i in the LOL, with entry i of the LOL being as described in embodiment 1.
[00176] The first and second delta values are preferably decoded from one syntax element each.
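As a small worked illustration of the delta coding, the sketch below derives the current list sizes from the sizes of entry i plus two already-decoded delta values. The function name current_list_sizes is hypothetical, and the delta values 0 and -1 are simply those implied by the sizes of entries 0 and 1 in FIG. 5 (5 and 2 entries, then 5 and 1 entries).

```python
def current_list_sizes(prev_l0pred, prev_l1pred, delta_l0, delta_l1):
    """Current sizes = sizes of the previous lists plus the decoded deltas."""
    n = len(prev_l0pred) + delta_l0
    m = len(prev_l1pred) + delta_l1
    if n < 0 or m < 0:
        raise ValueError("delta values would give a negative list size")
    return n, m


entry0 = ([-32, -64, -48, -40, -36], [-32, -48])  # entry 0 of FIG. 5
print(current_list_sizes(entry0[0], entry0[1], 0, -1))  # (5, 1)
```

When consecutive LOL entries have similar sizes, the deltas stay close to zero, which is the case that short variable-length codewords handle well.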
[00177] Embodiment 7: Use of the LOL for decoding a picture
[00178] This embodiment 7 is an addition to at least embodiment 1 above and describes how the current L0pred list and the current L1pred list may be used to decode a current coded picture. The current L0pred list and current L1pred list may be constructed according to any previously described embodiment, and the process of decoding a current coded picture includes the following steps.
[00179] First, an index value v is decoded from one or more syntax elements of the current coded picture. The one or more syntax elements may be decoded from slice data of the current coded picture. In an implementation in a video codec environment similar to HEVC and VVC, this would mean that the syntax elements are CABAC-coded syntax elements.
[00180] The index value v is then used as an index to the current L0pred list to select an element of the current L0pred list. The value of that element is then used to identify a first reference picture. This first reference picture is then used in a motion compensation process in the process of decoding the current coded picture.
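The selection step can be sketched as follows. Several things here are assumptions made only for illustration: the list entries are treated as POC deltas relative to the current picture (one of the variants mentioned for deriving L0 and L1), the decoded picture buffer is modeled as a dictionary keyed by POC, and the names select_reference_picture and dpb are hypothetical.

```python
def select_reference_picture(dpb, current_poc, pred_list, index):
    """Pick the reference picture addressed by `index` in a predicted list."""
    poc_delta = pred_list[index]         # element selected by the decoded index
    return dpb[current_poc + poc_delta]  # assumed: entries are POC deltas


# Toy DPB holding placeholder strings instead of real pictures.
dpb = {poc: f"picture@{poc}" for poc in (16, 32, 40, 44, 48)}
l0pred_cur = [-16, -32, -48, -24, -20]

print(select_reference_picture(dpb, 64, l0pred_cur, 1))  # picture@32
```

The same helper would apply unchanged to a second index into the current L1pred list when bi-prediction is used.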
[00181] Optionally, an index value w is decoded from one or more syntax elements of the current coded picture. These one or more syntax elements may also be decoded from slice data of the current coded picture. The index value w is then used as an index to the current L1pred list to select an element of the current L1pred list. The value of that element is then used to identify a second reference picture. This second reference picture is then used in combination with the first reference picture in the motion compensation process in the process of decoding the current coded picture.
[00182] Embodiment 8: Syntax table and pseudocode
[00183] An embodiment was implemented on top of the ECM-6.0 experimental video codec. The ECM-6.0 codec is built on top of VVC and uses the VVC handling of reference picture lists. The implementation added a number of syntax elements to the ECM-6.0 sequence parameter set as shown in Table 12 below, where lines 4-28 are added. The syntax table format follows that of the VVC specification, where syntax elements are shown in bold and have an associated type identifier in the “Descriptor” column that shows the syntax element type. u(1) is a 1-bit flag, ue(v) is a UVLC codeword, se(v) is a signed UVLC codeword and u(v) is a codeword of length v, where the value of v is derived by the decoder.
TABLE 12
Figure imgf000035_0001
Figure imgf000036_0001
[00184] Line 3 is a syntax element of ECM-6.0. It specifies whether the L1pred list is a copy of the L0pred list or specified separately. If L1pred is a copy, some L1pred syntax is not present in the bitstream.
[00185] The sps_inter_rpl_flag on line 4 is a 1-bit flag that specifies whether the proposed method is used or not. If it is equal to 0, lines 29-32 directly follow, and those are part of the existing ECM-6.0 syntax. If the flag is equal to 1, the sps_inter_rpl_num_ref_pic_lists_minus2 syntax element follows. It specifies the number of entries in the LOL minus 2. The syntax element is UVLC coded and the constant 2 is added, which means that the lowest expressible size of the LOL is 2. For list sizes of 0 or 1, sps_inter_rpl_flag must be equal to 0. Thereafter, on line 7, there is a loop over the number of entries in the LOL.
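For readers unfamiliar with the descriptors, ue(v) and se(v) in VVC are 0-th order Exp-Golomb codes. The sketch below decodes both from a list of 0/1 integers and applies the "minus2" offset to obtain the LOL size; the function names read_ue and read_se and the list-of-bits representation are illustrative assumptions, not ECM-6.0 code.

```python
def read_ue(bits, pos=0):
    """Decode one ue(v) codeword; return (value, position after the codeword)."""
    leading_zeros = 0
    while bits[pos] == 0:
        leading_zeros += 1
        pos += 1
    pos += 1  # skip the terminating 1 bit
    suffix = 0
    for _ in range(leading_zeros):
        suffix = (suffix << 1) | bits[pos]
        pos += 1
    return (1 << leading_zeros) - 1 + suffix, pos


def read_se(bits, pos=0):
    """Decode one se(v) codeword: odd ue values map to +, even values to -."""
    k, pos = read_ue(bits, pos)
    magnitude = (k + 1) // 2
    return (magnitude if k % 2 == 1 else -magnitude), pos


# ue(v) codeword for the value 30: four leading zeros, then 11111.
minus2, _ = read_ue([0, 0, 0, 0, 1, 1, 1, 1, 1])
print(minus2, minus2 + 2)  # 30 32
```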
[00186] Lines 9-16 are applied for the first entry. These lines are duplicates of existing ECM-6.0 syntax inside the ref_pic_list_struct( i, j ) syntax structure, not shown. These syntax elements are also part of the ref_pic_list_struct( listIdx, rplsIdx ) syntax structure in the VVC standard specification, where num_ref_entries[ i ][ rplIdx ] in the table corresponds to num_ref_entries[ listIdx ][ rplsIdx ] in VVC, inter_rpl_entry_abs_delta_poc[ i ][ rplIdx ][ j ] corresponds to abs_delta_poc_st[ listIdx ][ rplsIdx ][ i ], and inter_rpl_entry_sign_flag[ i ][ rplIdx ][ j ] corresponds to strp_entry_sign_flag[ listIdx ][ rplsIdx ][ i ]. In short, lines 9-16 specify the L0pred and L1pred lists for the first entry in the LOL without use of the proposed prediction method.
[00187] Lines 17-27 are applied for the remaining entries in the LOL. The syntax elements on lines 18 and 19 specify the delta POC value to be used in the method. Line 20 specifies whether there is syntax for L1pred or not, and the delta_num_ref_entries[ i ] syntax element on line 21 carries a delta size value. The size of the current L0pred list is set equal to this delta size value plus the size of the L0pred list for the previous entry in the LOL. The flag on line 22 specifies whether the current L0pred or L1pred list is using its guessed list. If not, the L0pred or L1pred list is constructed using the value of the syntax element on line 25 for each entry. This syntax element provides an index into UniqueList identifying the entry that holds the value to use.
[00188] Table 13 below shows exemplary pseudo-code of the implementation as implemented on top of ECM-6.0:
TABLE 13
Figure imgf000037_0001
Figure imgf000038_0001
Figure imgf000039_0001
Figure imgf000040_0001
[00189] Let’s walk through the code using an example where the first two entries in the LOL are as shown in FIG. 5 as follows:
[00190] First entry (rplIdx==0): L0pred = { -32, -64, -48, -40, -36 }, L1pred = { -32, -48 }.
[00191] Second entry (rplIdx==1): L0pred = { -16, -32, -48, -24, -20 }, L1pred = { 16 }.
[00192] The decoder first decodes sps_inter_rpl_flag equal to 1 followed by sps_inter_rpl_num_ref_pic_lists_minus2. Let’s assume there are 32 entries, so this value is equal to 30, which makes numberOfRPL equal to 32. sps->getRPL1CopyFromRPL0Flag() is equal to 0 throughout since L0pred and L1pred are not identical.
[00193] For the first entry (rplIdx==0), num_ref_entries_lx is decoded as equal to 5, followed by the following decoded syntax elements and values:
[00194] inter_rpl_entry_abs_delta_poc[ Lx ][ rplIdx ][ i ] 31
[00195] inter_rpl_entry_sign_flag[ Lx ][ rplIdx ][ i ] 1
[00196] inter_rpl_entry_abs_delta_poc[ Lx ][ rplIdx ][ i ] 32
[00197] inter_rpl_entry_sign_flag[ Lx ][ rplIdx ][ i ] 1
[00198] inter_rpl_entry_abs_delta_poc[ Lx ][ rplIdx ][ i ] 16
[00199] inter_rpl_entry_sign_flag[ Lx ][ rplIdx ][ i ] 0
[00200] inter_rpl_entry_abs_delta_poc[ Lx ][ rplIdx ][ i ] 8
[00201] inter_rpl_entry_sign_flag[ Lx ][ rplIdx ][ i ] 0
[00202] inter_rpl_entry_abs_delta_poc[ Lx ][ rplIdx ][ i ] 4
[00203] inter_rpl_entry_sign_flag[ Lx ][ rplIdx ][ i ] 0.
[00204] This gives an L0pred equal to { -32, -64, -48, -40, -36 }, where each value is stored in the list of lists by the statement rplLX[lx]->setRefPicIdentifier(ii, deltaValue, 0, false, 0);
[00205] Similarly, L1pred equal to { -32, -48 } is decoded by the following sequence of syntax elements:
[00206] num_ref_entries_lx 2
[00207] inter_rpl_entry_abs_delta_poc[ Lx ][ rplIdx ][ i ] 31
[00208] inter_rpl_entry_sign_flag[ Lx ][ rplIdx ][ i ] 1
[00209] inter_rpl_entry_abs_delta_poc[ Lx ][ rplIdx ][ i ] 16
[00210] inter_rpl_entry_sign_flag[ Lx ][ rplIdx ][ i ] 1
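The mapping from the decoded syntax element values above to the L0pred and L1pred values can be illustrated with a minimal Python sketch. The function name `decode_first_entry` is hypothetical, and three rules are assumptions inferred only from the example values, not taken from Table 13: a sign flag of 1 denotes a negative delta, the first absolute delta is coded minus 1, and each later delta is relative to the previous value in the list.

```python
def decode_first_entry(coded):
    """coded: (abs_delta_poc, sign_flag) pairs in bitstream order."""
    values = []
    prev = 0
    for i, (abs_delta, sign_flag) in enumerate(coded):
        # Assumption: only the first entry is coded "minus 1".
        magnitude = abs_delta + 1 if i == 0 else abs_delta
        # Assumption: sign_flag == 1 means a negative delta.
        delta = -magnitude if sign_flag else magnitude
        # Assumption: each delta is relative to the previous list value.
        prev += delta
        values.append(prev)
    return values


l0pred = decode_first_entry([(31, 1), (32, 1), (16, 0), (8, 0), (4, 0)])
l1pred = decode_first_entry([(31, 1), (16, 1)])
print(l0pred, l1pred)
```

Under these assumptions the sketch reproduces the two lists of the first entry in the example.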
[00211] After this, prevNumEntriesLX[0] and prevNumEntriesLX[1] are set to 5 and 2, respectively, followed by construction of UniqueList. It is first set to the empty list. Thereafter, each element in L0pred and L1pred is added to UniqueList, on the condition that the element is not already present in UniqueList.
[00212] This results in the UniqueList being equal to { -32, -64, -48, -40, -36 }. All elements from L0pred are added, but no element from L1pred is then added since both of the values in L1pred already exist in UniqueList.
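The construction of UniqueList described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the ECM-6.0 code: the function name `build_unique_list` is hypothetical, and the lists are modeled as plain Python lists of delta POC values.

```python
def build_unique_list(l0pred, l1pred):
    """Collect the values of L0pred, then L1pred, skipping duplicates."""
    unique_list = []  # starts as the empty list
    for value in list(l0pred) + list(l1pred):
        if value not in unique_list:  # add only if not already present
            unique_list.append(value)
    return unique_list


unique_list = build_unique_list([-32, -64, -48, -40, -36], [-32, -48])
print(unique_list)
```

Insertion order is preserved here; sorting only takes place later, after the delta POC value has been added.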
[00213] Thereafter the for loop continues with rplIdx equal to 1.
[00214] First, DeltapocRPL is set equal to 16 by decoding these two syntax elements:
[00215] inter_rpl_abs_delta_poc_minus1 15
[00216] inter_rpl_sign_flag 0
[00217] Then 16 is added to each of the elements in UniqueList, followed by adding 16 itself, followed by sorting, which results in UniqueList being equal to { -48, -32, -24, -20, -16, 16 }. SizeUniqueList is set equal to 6.
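As an illustration of this update step, the following Python sketch derives the delta POC value from the two decoded syntax element values and then shifts, extends, and sorts UniqueList. The function names are hypothetical. Two points are assumptions: a sign flag of 0 is taken to denote a positive delta (consistent with the example), and the delta POC value is appended only if it is not already present, so that UniqueList stays free of duplicates.

```python
def derive_delta_poc(abs_delta_poc_minus1, sign_flag):
    # Assumption: sign_flag == 0 means positive, 1 means negative.
    magnitude = abs_delta_poc_minus1 + 1
    return -magnitude if sign_flag else magnitude


def update_unique_list(unique_list, delta_poc):
    """Add delta_poc to every value, add delta_poc itself, then sort."""
    shifted = [value + delta_poc for value in unique_list]
    if delta_poc not in shifted:  # assumption: keep the list duplicate-free
        shifted.append(delta_poc)
    return sorted(shifted)


delta_poc = derive_delta_poc(15, 0)
unique_list = update_unique_list([-32, -64, -48, -40, -36], delta_poc)
size_unique_list = len(unique_list)
print(delta_poc, unique_list, size_unique_list)
```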
[00218] Next, delta_num_ref_entries_lx is decoded as equal to 0, which sets numEntriesLX[0] equal to 5. Without delta coding of the size, the value 5 would have had to be sent, which would have cost more bits.
[00219] inter_rpl_use_guessed_list_flag_lx is next decoded as equal to 0, specifying that the guessed list is not used by L0pred. Since numEntriesLX[0] is equal to 5, the following syntax elements are decoded, each coded by 3 bits:
[00220] inter_rpl_idx_lx 4
[00221] inter_rpl_idx_lx 1
[00222] inter_rpl_idx_lx 0
[00223] inter_rpl_idx_lx 2
[00224] inter_rpl_idx_lx 3.
[00225] L0pred is therefore decoded to be equal to { -16, -32, -48, -24, -20 }.
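The index-based construction of L0pred can be sketched as follows. The function names are hypothetical. The bit width uses the relation ceil(log2(S)) given in the summary of embodiments for a UniqueList of size S; it is computed here with integer arithmetic to avoid floating-point rounding.

```python
def index_bits(unique_list_size):
    """Number of bits per index, equal to ceil(log2(size)) for size >= 1."""
    return (unique_list_size - 1).bit_length()


def build_list_from_indices(unique_list, indices):
    """Each decoded index selects one value of UniqueList."""
    return [unique_list[j] for j in indices]


unique_list = [-48, -32, -24, -20, -16, 16]
bits = index_bits(len(unique_list))
l0pred = build_list_from_indices(unique_list, [4, 1, 0, 2, 3])
print(bits, l0pred)
```

With six values in UniqueList, each index costs 3 bits, which matches the example.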
[00226] For L1pred, delta_num_ref_entries_lx is decoded as equal to -1, which sets numEntriesLX[1] equal to 1. inter_rpl_use_guessed_list_flag_lx is decoded as equal to 1, specifying that the guessed list is used by L1pred.
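The delta coding of the list sizes used in the walkthrough amounts to the following arithmetic, shown as a small Python sketch. The function name `derive_num_entries` is hypothetical, and how the signed delta itself is binarized in the bitstream is not modeled.

```python
def derive_num_entries(prev_num_entries, delta_num_ref_entries):
    """Current size = decoded delta + size of the same list in the previous entry."""
    return [prev + delta
            for prev, delta in zip(prev_num_entries, delta_num_ref_entries)]


# Previous entry: L0pred has 5 values and L1pred has 2; decoded deltas are 0 and -1.
num_entries_lx = derive_num_entries([5, 2], [0, -1])
print(num_entries_lx)
```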
[00227] The for loop sets indexToFirstPositiveEntry equal to 5, and UniqueList[5] is indeed the first positive entry, and therefore the lowest positive value in UniqueList. The variable index is set equal to 5 and since UniqueList[5] is equal to 16, L1pred is set equal to { 16 }.
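A minimal Python sketch of the guessed lists follows, assuming that UniqueList is sorted in increasing order as in the example. The function names are hypothetical. The L1pred guess follows the paragraph above; the L0pred guess (negative values, highest first) follows the ordering described in the summary of embodiments. The behavior when fewer candidate values exist than requested is not specified in the text, so the sketch simply returns what is available.

```python
def guessed_l1pred(unique_list, num_entries):
    """Take values starting at the first positive entry of the sorted list."""
    index_to_first_positive = next(
        (i for i, value in enumerate(unique_list) if value > 0),
        len(unique_list),
    )
    return unique_list[index_to_first_positive:index_to_first_positive + num_entries]


def guessed_l0pred(unique_list, num_entries):
    """Take the negative values, highest (closest to zero) first."""
    negatives = [value for value in unique_list if value < 0]
    return negatives[::-1][:num_entries]


unique_list = [-48, -32, -24, -20, -16, 16]
l1pred = guessed_l1pred(unique_list, 1)
print(l1pred, guessed_l0pred(unique_list, 2))
```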
[00228] Finally, prevNumEntriesLX is updated for the next iteration of rplIdx and UniqueList is constructed. It is first set to the empty list. Thereafter, each element in L0pred and L1pred is added to UniqueList, on the condition that the element is not already present in UniqueList. There are no duplicates this time, so UniqueList is set equal to { -16, -32, -48, -24, -20, 16 }.
[00229] In an additional embodiment, the method is used with pictures in the bitstreams being coded using a hierarchical B-picture prediction structure, where the number of pictures in between anchor pictures is equal to 2^N-1. This means that after an anchor picture is decoded from the bitstream, there are 2^N-1 pictures immediately following the anchor picture in the bitstream that are arranged in a hierarchical B-picture prediction structure and are output before the anchor picture is output. We here say that the sub-GOP size of the B-picture prediction structure is equal to N. In this embodiment, the deltaPOC value is derived as follows. A first value F is decoded from a single syntax element in the set of syntax elements, wherein the first value represents a sub-GOP size equal to 2^F. Then the deltaPOC value is derived from the value F and j, wherein j is the index in the LOL comprising the current L0pred list and the current L1pred list. The value F is decoded once and used for deriving 2^N entries in the LOL, each entry corresponding to one picture position in the sub-GOP. This means that instead of decoding 2^N syntax elements, one for each deltaPOC value, there is only one value F to decode, which defines a sequence of deltaPOC values to use for the 2^N entries. In one variant, a list DELTA is derived from the value F, and the deltaPOC value to use for the current L0pred list and the current L1pred list is equal to DELTA[j], where j is the index in the LOL comprising the current L0pred list and the current L1pred list.
[00230] Additional Disclosure
[00231] This disclosure further proposes to add prediction to the signaling of the RPLs in the SPS. The method exploits the fact that the L0 and L1 picture references of an entry in the SPS list of RPLs are limited to the L0 and L1 picture references of the previous entry plus the previous picture. The slice data, and therefore also all decoded sample values, are asserted to be unaffected by the method.
[00232] The HEVC specification includes reference picture sets (RPSs) that are used both for DPB management and signaling of reference picture lists L0 and L1. An HEVC SPS may contain a list of RPSs, and such an RPS can be referred to by a slice header. The HEVC slice header then includes a syntax element short_term_ref_pic_set_idx that specifies which entry in the SPS list of RPSs to use for the current slice.
[00233] The RPS signaling in HEVC includes a mechanism for predicting an entry in the list of RPSs from another, previously signaled, entry. This prediction is enabled when the inter_ref_pic_set_prediction_flag in HEVC is equal to 1.
[00234] The VVC specification uses reference picture lists (RPLs) rather than RPSs. The RPLs in VVC are similar to RPSs but signal the L0 and L1 lists more directly. As in HEVC, the RPLs can be signaled in the SPS, with a syntax element rpl_idx[ i ] in the picture header or slice header specifying which one to use for a picture. VVC does not include prediction of RPLs as HEVC does for RPSs.
[00235] ECM has inherited the RPL method from VVC and uses a very similar syntax as shown in Table 14 and Table 15.
TABLE 14 - VVC syntax elements and corresponding ECM-6.0 SPS decoding code
Figure imgf000044_0001
TABLE 15 - VVC syntax elements and corresponding ECM-6.0 picture header decoding code
Figure imgf000044_0002
[00236] This disclosure proposes to add prediction to the signaling of the RPLs in the SPS. The method exploits the fact that the L0 and L1 picture references of an entry in the SPS list of RPLs are limited to the L0 and L1 picture references of the previous entry plus the previous picture.
[00237] Let RplList be the list of RPLs in the SPS and let RPL be an entry in RplList containing an L0 and an L1 list.
[00238] This disclosure proposes adding prediction to the signaling of the RplList. For a current RPL, the decoder will maintain a list of picture references used by the previously signaled RPL, add the previous picture to this list, and use decoded index values in this list to construct the L0 and L1 lists for the current RPL. Simplified, the syntax for prediction looks as shown in Table 16.
TABLE 16 - Conceptual RPL prediction syntax
Figure imgf000045_0001
[00239] sps_inter_rpl_num_ref_pic_lists is the number of RPLs in RplList.
[00240] sps_num_ref_entries[ lx ] is the number of entries for LX (L0 or L1).
[00241] sps_inter_rpl_idx[ lx ][ rplIdx ][ j ] is the index to the list of picture references identifying the value for LX[ j ] in the rplIdx’th entry of RplList.
[00242] In addition, it is proposed that: 1) one L0 flag and one L1 flag for each RPL are added to ECM. Each flag specifies whether automatically generated L0 and L1 lists will be used. If so, the corresponding sps_inter_rpl_idx[ lx ][ rplIdx ][ j ] syntax elements are not present; 2) signaling of the log2 of the GOP size is added, which may be used to derive POC delta values between RPL entries that are needed to maintain the list of picture references; and 3) delta coding of sps_num_ref_entries[ lx ] is used.
[00243] FIG. 6A is a flowchart illustrating a process 600 for decoding a video bitstream. Process 600 may begin in step s602. Step s602 comprises constructing an LOL before any coded picture that references the LOL is decoded. The LOL comprises an entry comprising i) a first previous reference picture list, the L0pred_prev list, and ii) a second previous reference picture list, the L1pred_prev list, wherein the L0pred_prev list comprises a first set of values (as used herein a “set of values” may contain a single value and also may contain duplicate values, hence each of { 1 }, { 1, 2, 3 }, and { 1, 2, 3, 1 } is an example of a “set of values”) and the L1pred_prev list comprises a second set of values, and further wherein each value included in the first set of values and the second set of values represents a reference picture indicator that can be used in the decoding process of a coded picture. Step s604 comprises constructing a first current reference picture list, the L0pred_cur list, and a second current reference picture list, the L1pred_cur list, based on the L0pred_prev list and the L1pred_prev list.
[00244] FIG. 6B illustrates the steps that comprise step s604 in some embodiments. As shown in FIG. 6B, step s604 includes steps s606 to s616. Step s606 comprises deriving a value, the deltaPOC value, from a set of syntax elements in the bitstream. Step s608 comprises deriving a set of values, the UniqueList, wherein the UniqueList comprises a) the deltaPOC value and b) a third set of values, wherein each value in the third set of values is equal to either i) the sum of a value from the first set of values and the deltaPOC value or ii) the sum of a value from the second set of values and the deltaPOC value (in one embodiment, other than the deltaPOC value, there is no value within the UniqueList that is not either i) equal to the sum of a value from the first set of values and the deltaPOC value or ii) equal to the sum of a value from the second set of values and the deltaPOC value). Note that the L0pred_prev list and the L1pred_prev list may contain duplicates, but there are no duplicates within UniqueList. Step s610 comprises deriving a value N (e.g. N may be derived from the set of syntax elements). Step s612 comprises deriving a value M (e.g. M may be derived from the set of syntax elements). Step s614 comprises including N values from the UniqueList in the L0pred_cur list. Step s616 comprises including M values from the UniqueList in the L1pred_cur list.
[00245] FIG. 7 is a block diagram of an apparatus 700 for implementing encoder 102 and/or decoder 104, according to some embodiments. When apparatus 700 implements encoder 102, apparatus 700 may be referred to as an encoder apparatus, and when apparatus 700 implements decoder 104, apparatus 700 may be referred to as a decoder apparatus. As shown in FIG. 7, apparatus 700 may comprise: processing circuitry (PC) 702, which may include one or more processors (P) 755 (e.g., one or more general purpose microprocessors and/or one or more other processors, such as an application specific integrated circuit (ASIC), field-programmable gate arrays (FPGAs), and the like), which processors may be co-located in a single housing or in a single data center or may be geographically distributed (i.e., encoder apparatus 700 may be a distributed computing apparatus); at least one network interface 748 (e.g., a physical interface or air interface) comprising a transmitter (Tx) 745 and a receiver (Rx) 747 for enabling apparatus 700 to transmit data to and receive data from other nodes connected to a network 110 (e.g., an Internet Protocol (IP) network) to which network interface 748 is connected (physically or wirelessly) (e.g., network interface 748 may be coupled to an antenna arrangement comprising one or more antennas for enabling encoder apparatus 700 to wirelessly transmit/receive data); and a storage unit (a.k.a., “data storage system”) 708, which may include one or more nonvolatile storage devices and/or one or more volatile storage devices. In embodiments where PC 702 includes a programmable processor, a computer readable storage medium (CRSM) 742 may be provided. CRSM 742 may store a computer program (CP) 743 comprising computer readable instructions (CRI) 744. CRSM 742 may be a non-transitory computer readable medium, such as magnetic media (e.g., a hard disk), optical media, memory devices (e.g., random access memory, flash memory), and the like. In some embodiments, the CRI 744 of computer program 743 is configured such that when executed by PC 702, the CRI causes encoder apparatus 700 to perform steps described herein (e.g., steps described herein with reference to the flow charts). In other embodiments, encoder apparatus 700 may be configured to perform steps described herein without the need for code. That is, for example, PC 702 may consist merely of one or more ASICs. Hence, the features of the embodiments described herein may be implemented in hardware and/or software.
[00246] Summary of Various Embodiments
A1. A method (600) for decoding a video bitstream, the method comprising: constructing a list of lists, LOL, before any coded picture that references the LOL is decoded, the LOL comprising an entry comprising i) a first previous reference picture list, the L0pred_prev list, and ii) a second previous reference picture list, the L1pred_prev list, wherein the L0pred_prev list comprises a first set of values and the L1pred_prev list comprises a second set of values, and further wherein each value included in the first set of values and the second set of values represents a reference picture indicator that can be used in the decoding process of a coded picture; and constructing a first current reference picture list, the L0pred_cur list, and a second current reference picture list, the L1pred_cur list, based on the L0pred_prev list and the L1pred_prev list, wherein constructing the L0pred_cur list and L1pred_cur list comprises: deriving a value, the deltaPOC value, from a set of syntax elements in the bitstream; deriving a set of values, the UniqueList, wherein the UniqueList comprises a) the deltaPOC value and b) a third set of values, wherein each value in the third set of values is equal to either i) the sum of a value from the first set of values and the deltaPOC value or ii) the sum of a value from the second set of values and the deltaPOC value; deriving a value N (e.g. N may be derived from the set of syntax elements); deriving a value M (e.g. M may be derived from the set of syntax elements); including N values from the UniqueList in the L0pred_cur list; and including M values from the UniqueList in the L1pred_cur list.
A2. The method of embodiment A1, further comprising using the L0pred_cur list and the L1pred_cur list in a process for decoding a current coded picture from the video bitstream.
A3. The method of embodiment A1 or A2, wherein the L0pred_cur list is included in an entry of the LOL, and the L1pred_cur list is also included in the entry of the LOL.
A4. The method of any one of embodiments A1-A3, wherein the set of syntax elements are parameter set syntax elements.
A5. The method of any one of embodiments A1-A3, wherein the set of syntax elements is one of, or a combination of, video coding layer, VCL syntax elements, slice header syntax elements, or picture header syntax elements.
A6. The method of any one of embodiments A1-A5, further comprising: decoding an index value from a syntax element in the set of syntax elements, wherein the index value identifies said entry of the LOL that contains the L0pred_prev and L1pred_prev lists; and prior to constructing the L0pred_cur and L1pred_cur lists based on the L0pred_prev list and the L1pred_prev list, using the decoded index value to select said entry of the LOL.
A7. The method of any one of embodiments A1-A6, wherein including the N values from the UniqueList in the L0pred_cur list comprises: deriving an index value, j, from the set of syntax elements; and including in the L0pred_cur list the value from the entry of the UniqueList associated with the index value, j.
A7.5. The method of any one of embodiments A1-A6, wherein including the N values from the UniqueList in the L0pred_cur list comprises: for each entry in the L0pred_cur list, deriving an index value, j, from the set of syntax elements and storing in said entry of the L0pred_cur list the j’th value in the UniqueList (e.g., UniqueList[j] where the UniqueList is implemented as an array of values).
A8. The method of embodiment A7 or A7.5, wherein deriving j comprises: determining the size of the UniqueList; determining a number of bits, M, based on the determined size, wherein M=ceil(log2(S)), where S is the determined size; and decoding an M-bit syntax element, wherein V is the value of the decoded M-bit syntax element; and setting j equal to V or equal to V+1.
A9. The method of any one of embodiments A1-A6, wherein including the N values from the UniqueList in the L0pred_cur list comprises: decoding a flag from the set of syntax elements; and in response to determining that the flag is equal to a predetermined value, including the N values in UniqueList in the L0pred_cur list without decoding any syntax element that specifies which of the N values in UniqueList to add to the L0pred_cur list (and, in some embodiments, without decoding any syntax element specifying in what order to add the N values).
A10. The method of embodiment A9, wherein the UniqueList comprises N negative numbers, where N > 1, and the N values from the UniqueList that are added to the L0pred_cur list are said N negative numbers.
A11. The method of embodiment A10, wherein the N negative numbers comprise a first negative number, N1, and a second negative number, N2, wherein N1 > N2, the L0pred_cur list is a data structure (e.g., an array, a linked list, etc.) comprising a first element followed by a second element, and including the N values from the UniqueList in the L0pred_cur list comprises: i) setting the value of the first element of the data structure to N1; and ii) setting the value of the second element of the data structure to N2.
A12. The method of embodiment A10, wherein including the N negative numbers from the UniqueList in the L0pred_cur list comprises setting the N first entries of the L0pred_cur list equal to the first N entries of a sorted sequence of the N negative numbers, sorted in decreasing order with the highest negative number first.
A13. The method of any one of embodiments A1-A12, wherein the L0pred_prev list consists of P entries; and deriving N comprises: i) deriving a delta value, D, from the set of syntax elements; and ii) setting N equal to D + P.
A14. The method of any one of embodiments A2-A13, further comprising: deriving a second index value, W, from one of, or a combination of, video coding layer, VCL syntax elements, slice header syntax elements, or picture header data; and using the second index value W to select an entry from the LOL, wherein the selected entry comprises L0pred_cur and L1pred_cur.
A15. The method of any one of embodiments A1-A14, further comprising: deriving a first index value, V, from one or more first syntax elements of a coded picture; using V to select a value from the L0pred_cur list; using the selected value to identify a reference picture; and using the identified reference picture in a motion compensation process for decoding the coded picture.
A16. The method of embodiment A15, wherein the one or more first syntax elements are positioned in coded slice data of the coded picture.
A17. The method of any one of embodiments A1-A16, wherein deriving the deltaPOC value from a set of syntax elements in the bitstream comprises: decoding a first value N representing a sub-GOP size from the set of syntax elements, wherein the sub-GOP size is equal to 2^N; and deriving the deltaPOC value from the value N and a value J, wherein J is the index in the LOL comprising the current L0pred list and the current L1pred list.
A18. The method of any one of embodiments A1-A17, wherein including in the L1pred_cur list the N values from the UniqueList comprises: deriving an index value, j, from the set of syntax elements; and including in the L1pred_cur list the value from the entry of the UniqueList associated with the index value, j.
A19. The method of any one of embodiments A1-A18, wherein deriving the UniqueList comprises: obtaining an initial set of values, wherein the initial set of values consists of the union of a third set of values and a fourth set of values, wherein the third set of values is equal to the first set of values with all duplicates removed and the fourth set of values is equal to the second set of values with all duplicates removed; for each value included in the initial set of values, adding the deltaPOC value to the value included in the initial set of values, thereby producing an intermediate set of values; and adding to the intermediate set of values the deltaPOC value, thereby producing the UniqueList (a.k.a., the “final UniqueList”).
A20. The method of any one of embodiments A1-A18, wherein deriving the UniqueList comprises: obtaining an initial set of values, wherein the initial set of values consists of i) the union of a third set of values and a fourth set of values and ii) a value equal to 0, wherein the third set of values is equal to the first set of values with all duplicates removed and the fourth set of values is equal to the second set of values with all duplicates removed; for each value included in the initial set of values, adding the deltaPOC value to the value included in the initial set of values, thereby producing the UniqueList.
A21. The method of any one of embodiments A1-A20, wherein any of the L0pred_cur list, the L1pred_cur list, the L0pred_prev list, and the L1pred_prev list may contain duplicates, but there are no duplicates within UniqueList.
B1. A computer program (743) comprising instructions (744) which when executed by processing circuitry (702) of an apparatus (700) causes the apparatus to perform the method of any one of the above embodiments.
B2. A carrier containing the computer program of embodiment B1, wherein the carrier is one of an electronic signal, an optical signal, a radio signal, and a computer readable storage medium (742).
C1. A decoder apparatus (700) configured to perform the method of any one of embodiments A1-A21.
[00247] Results and Conclusion
[00248] The described embodiments provide video compression bit-rate savings. The pseudo-code disclosed herein was implemented on top of the ECM-6.0 experimental video codec and tested according to the testing procedures specified by MPEG and ITU-T in document JVET-Y2017, jvet-experts(dot)org/doc_end_user/current_document.php?id=11473. On the so-called Class D test set, the method provides bit-rate savings between 0.09% and 1.18% compared to ECM-6.0 as-is, as shown in Table 17 below. The average bit-rate savings for each of the four sequences is between 0.28% and 0.49%.
TABLE 17
Figure imgf000053_0001
[00249] Because the embodiments are implemented on the sequence-level rather than picture-level in a video codec, the computational complexity that the embodiments add is very small. Therefore, the embodiments provide a good compression/complexity trade-off.
[00250] While the terminology in this disclosure is described in terms of VVC, the embodiments of this disclosure also apply to any existing or future codec, which may use a different, but equivalent terminology.
[00251] While various embodiments are described herein, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of this disclosure should not be limited by any of the above-described exemplary embodiments. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the disclosure unless otherwise indicated herein or otherwise clearly contradicted by context.
[00252] Additionally, while the processes described above and illustrated in the drawings are shown as a sequence of steps, this was done solely for the sake of illustration. Accordingly, it is contemplated that some steps may be added, some steps may be omitted, the order of the steps may be re-arranged, and some steps may be performed in parallel.

Claims

1. A method (600) for decoding a video bitstream, the method comprising: constructing a list of lists, LOL, before any coded picture that references the LOL is decoded, the LOL comprising an entry comprising i) a first previous reference picture list, the L0pred_prev list, and ii) a second previous reference picture list, the L1pred_prev list, wherein the L0pred_prev list comprises a first set of values and the L1pred_prev list comprises a second set of values, and further wherein each value included in the first set of values and the second set of values represents a reference picture indicator that can be used in the decoding process of a coded picture; and constructing a first current reference picture list, the L0pred_cur list, and a second current reference picture list, the L1pred_cur list, based on the L0pred_prev list and the L1pred_prev list, wherein constructing the L0pred_cur list and L1pred_cur list comprises: deriving a value, the deltaPOC value, from a set of syntax elements in the bitstream; deriving a set of values, the UniqueList, wherein the UniqueList comprises a) the deltaPOC value and b) a third set of values, wherein each value in the third set of values is equal to either i) the sum of a value from the first set of values and the deltaPOC value or ii) the sum of a value from the second set of values and the deltaPOC value; deriving a value N from the set of syntax elements; deriving a value M from the set of syntax elements; including N values from the UniqueList in the L0pred_cur list; and including M values from the UniqueList in the L1pred_cur list.
2. The method of claim 1, further comprising using the L0pred_cur list and the L1pred_cur list in a process for decoding a current coded picture from the video bitstream.
3. The method of claim 1 or 2, wherein the L0pred_cur list is included in an entry of the LOL, and the L1pred_cur list is also included in the entry of the LOL.
4. The method of any one of claims 1-3, wherein the set of syntax elements are parameter set syntax elements.
5. The method of any one of claims 1-3, wherein the set of syntax elements is one of, or a combination of, video coding layer, VCL syntax elements, slice header syntax elements, or picture header syntax elements.
6. The method of any one of claims 1-5, further comprising: decoding an index value from a syntax element in the set of syntax elements, wherein the index value identifies said entry of the LOL that contains the L0pred_prev and L1pred_prev lists; and prior to constructing the L0pred_cur and L1pred_cur lists based on the L0pred_prev list and the L1pred_prev list, using the decoded index value to select said entry of the LOL.
7. The method of any one of claims 1-6, wherein including the N values from the UniqueList in the L0pred_cur list comprises: deriving an index value, j, from the set of syntax elements; and including in the L0pred_cur list the value from the entry of the UniqueList associated with the index value, j.
8. The method of any one of claims 1-6, wherein including the N values from the UniqueList in the L0pred_cur list comprises: for each entry in the L0pred_cur list, deriving an index value, j, from the set of syntax elements and storing in said entry of the L0pred_cur list the j’th value in the UniqueList (e.g., UniqueList[j] where the UniqueList is implemented as an array of values).
9. The method of claim 7 or 8, wherein deriving j comprises: determining the size of the UniqueList; determining a number of bits, M1, based on the determined size, wherein M1=ceil(log2(S)), where S is the determined size; and decoding an M1-bit syntax element, wherein V is the value of the decoded M1-bit syntax element; and setting j equal to V or equal to V+1.
10. The method of any one of claims 1-6, wherein including the N values from the UniqueList in the L0pred_cur list comprises: decoding a flag from the set of syntax elements; and in response to determining that the flag is equal to a predetermined value, including the N values in UniqueList in the L0pred_cur list without decoding any syntax element that specifies which of the N values in UniqueList to add to the L0pred_cur list.
11. The method of claim 10, wherein the UniqueList comprises N negative numbers, where N > 1, and the N values from the UniqueList that are added to the L0pred_cur list are said N negative numbers.
12. The method of claim 11, wherein the N negative numbers comprise a first negative number, N1, and a second negative number, N2, wherein N1 > N2, the L0pred_cur list is a data structure comprising a first element followed by a second element, and including the N values from the UniqueList in the L0pred_cur list comprises: i) setting the value of the first element of the data structure to N1; and ii) setting the value of the second element of the data structure to N2.
13. The method of claim 11, wherein including the N negative numbers from the UniqueList in the L0pred_cur list comprises setting the N first entries of the L0pred_cur list equal to the first N entries of a sorted sequence of the N negative numbers, sorted in decreasing order with the highest negative number first.
14. The method of any one of claims 1-13, wherein the L0pred_prev list consists of P entries; and deriving N comprises: i) deriving a delta value, D, from the set of syntax elements; and ii) setting N equal to D + P.
15. The method of any one of claims 2-14, further comprising: deriving a second index value, W, from one of, or a combination of, video coding layer, VCL syntax elements, slice header syntax elements, or picture header data; and using the second index value W to select an entry from the LOL, wherein the selected entry comprises L0pred_cur and L1pred_cur.
16. The method of any one of claims 1-15, further comprising: deriving a first index value, V1, from one or more first syntax elements of a coded picture; using V1 to select a value from the L0pred_cur list; using the selected value to identify a reference picture; and using the identified reference picture in a motion compensation process for decoding the coded picture.
17. The method of claim 16, wherein the one or more first syntax elements are positioned in coded slice data of the coded picture.
18. The method of any one of claims 1-17, wherein deriving the deltaPOC value from a set of syntax elements in the bitstream comprises: decoding a first value N3 representing a sub-GOP size from the set of syntax elements, wherein the sub-GOP size is equal to 2^N; and deriving the deltaPOC value from the value N and a value J, wherein J is the index in the LOL comprising the current L0pred list and the current L1pred list.
19. The method of any one of claims 1-18, wherein including in the L1pred_cur list the N values from the UniqueList comprises: deriving an index value, j1, from the set of syntax elements; and including in the L1pred_cur list the value from the entry of the UniqueList associated with the index value, j1.
20. The method of any one of claims 1-19, wherein deriving the UniqueList comprises: obtaining an initial set of values, wherein the initial set of values consists of the union of a third set of values and a fourth set of values, wherein the third set of values is equal to the first set of values with all duplicates removed and the fourth set of values is equal to the second set of values with all duplicates removed; for each value included in the initial set of values, adding the deltaPOC value to the value included in the initial set of values, thereby producing an intermediate set of values; and adding to the intermediate set of values the deltaPOC value, thereby producing the UniqueList.
21. The method of any one of claims 1-19, wherein deriving the UniqueList comprises: obtaining an initial set of values, wherein the initial set of values consists of i) the union of a third set of values and a fourth set of values and ii) a value equal to 0, wherein the third set of values is equal to the first set of values with all duplicates removed and the fourth set of values is equal to the second set of values with all duplicates removed; for each value included in the initial set of values, adding the deltaPOC value to the value included in the initial set of values, thereby producing the UniqueList.
22. The method of any one of claims 1-21, wherein any of the L0pred_cur list, the L1pred_cur list, the L0pred_prev list, and the L1pred_prev list may contain duplicates, but there are no duplicates within the UniqueList.
23. A computer program (743) comprising instructions (744) which when executed by processing circuitry (702) of an apparatus (700) causes the apparatus to perform the method of any one of the above claims.
24. A carrier containing the computer program of claim 23, wherein the carrier is one of an electronic signal, an optical signal, a radio signal, and a computer readable storage medium (742).
25. A decoder apparatus (700) configured to perform the method of any one of claims 1-22.
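
By way of illustration only, the list derivations recited in claims 11-14 and 20-22 might be sketched as follows. This is a minimal, non-limiting sketch and not the claimed decoder: the function names, the use of Python lists, and the first-seen ordering of the UniqueList are assumptions made for the example, and bitstream parsing of deltaPOC, D and the flag of claim 10 is omitted.

```python
def derive_unique_list(l0_prev, l1_prev, delta_poc):
    """Claim 20 style: shift every previous value by deltaPOC, drop
    duplicates, then add deltaPOC itself (ordering is an assumption)."""
    unique = []
    for value in list(l0_prev) + list(l1_prev):
        shifted = value + delta_poc
        if shifted not in unique:
            unique.append(shifted)
    if delta_poc not in unique:
        unique.append(delta_poc)
    return unique


def derive_unique_list_zero_seeded(l0_prev, l1_prev, delta_poc):
    """Claim 21 style: add the value 0 to the initial set first, then
    shift every value by deltaPOC."""
    initial = []
    for value in list(l0_prev) + list(l1_prev) + [0]:
        if value not in initial:
            initial.append(value)
    return [value + delta_poc for value in initial]


def derive_n(delta_d, l0_prev):
    """Claim 14: N equals the decoded delta D plus P, the number of
    entries in the previous L0 list."""
    return delta_d + len(l0_prev)


def build_l0_from_negatives(unique_list, n):
    """Claims 11-13: take the negative values of the UniqueList, sorted
    in decreasing order (value closest to zero first), and keep N."""
    negatives = sorted((v for v in unique_list if v < 0), reverse=True)
    return negatives[:n]


# Hypothetical example values, not taken from the application.
l0_prev = [-4, -8, -8]   # previous lists may contain duplicates (claim 22)
l1_prev = [4, -4]
delta_poc = -2

unique_list = derive_unique_list(l0_prev, l1_prev, delta_poc)
n = derive_n(0, l0_prev)
l0_cur = build_l0_from_negatives(unique_list, n)
```

With these example values, both derivations yield the UniqueList [-6, -10, 2, -2], N equals 3, and the resulting L0pred_cur list is [-2, -6, -10].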
PCT/SE2023/051012 2022-10-13 2023-10-10 Inter-predicted reference picture lists WO2024080916A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263415716P 2022-10-13 2022-10-13
US63/415,716 2022-10-13

Publications (1)

Publication Number Publication Date
WO2024080916A1 (en) 2024-04-18

Family

ID=90669725

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2023/051012 WO2024080916A1 (en) 2022-10-13 2023-10-10 Inter-predicted reference picture lists

Country Status (1)

Country Link
WO (1) WO2024080916A1 (en)
