WO2006040413A1 - Reference picture management in video coding - Google Patents

Info

Publication number
WO2006040413A1
Authority
WO
WIPO (PCT)
Prior art keywords
pictures
parameter
reference picture
rpn
value
Prior art date
Application number
PCT/FI2005/050359
Other languages
French (fr)
Inventor
Ye-Kui Wang
Miska Hannuksela
Original Assignee
Nokia Corporation
Priority date
Filing date
Publication date
Application filed by Nokia Corporation
Priority to EP05799154A (patent EP1800262A4)
Publication of WO2006040413A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/58 Motion compensation with long-term prediction, i.e. the reference frame for a current frame not being the temporally closest one
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H04N19/463 Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/573 Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction

Definitions

  • the invention relates to reference picture management in video coding and decoding.
  • H.264/AVC is the work output of a Joint Video Team (JVT) of ITU-T Video Coding Experts Group (VCEG) and ISO/IEC MPEG.
  • motion compensation i.e. predictive coding
  • one or more previously decoded pictures are used as reference pictures of the current picture being encoded or decoded.
  • a reference block from the reference picture is searched such that the difference signal between the current block and the reference block requires a minimum number of bits to represent.
  • Encoding of the displacement between the current block and the reference block may also be considered in searching the reference block.
  • the distortion of the reconstructed block may also be considered in searching the reference block.
  • some pictures may be used as reference pictures when encoding other pictures, while some may never be used as reference pictures.
  • a picture that is not to be used as a reference picture is called a non-reference picture.
  • the encoder should then signal to the decoder whether a picture is a reference picture, so that the decoder does not need to store non-reference pictures for motion compensation reference. Initially, each reference picture should be stored in the post-decoder buffer or decoded picture buffer and marked as "used for reference". However, when a reference picture is not used for reference anymore, it should be marked as "unused for reference". Marking of a reference picture as "used for reference" or "unused for reference" is done, among other things, by a reference picture management process.
  • the reference picture selected for coding or decoding a block may be a recently decoded picture (typically called short-term reference picture), or a decoded picture that is far preceding the currently coded picture in decoding order (typically called long-term reference picture).
  • short-term reference picture: a recently decoded picture
  • long-term reference picture: a decoded picture that is far preceding the currently coded picture in decoding order
  • the reference picture 101 is assumed to be a short-term reference picture (when encoding pictures 102 and 103) while the reference picture 105 is assumed to be a long-term reference picture (when encoding picture 106).
  • the pictures between the long-term reference picture 105 and the picture 106 which uses the long-term reference picture as a reference picture are not shown in Fig. 1.
  • reference picture list construction (specified in subclause 8.2.4 of the H.264/AVC specification) and reference picture marking (specified in subclause 8.2.5 of the H.264/AVC specification) are separated for short-term reference pictures and long-term reference pictures.
  • the 10-bit temporal reference index TRI or RTR representing temporal reference is used to identify reference pictures.
  • One disadvantage in this solution is that the temporal distance between the reference picture and the current picture is limited to be less than 1024 units. The unit is defined according to the active picture clock frequency. In other words, the so-called long-term reference picture is not enabled.
  • the 10-bit picture number (PN) that is incremented by 1 for each reference picture (called a "stored picture" therein) is used to identify short-term reference pictures.
  • the variable length coded LPIN representing long-term picture index is used to identify long-term reference pictures.
  • PicNum and LongTermPicNum are used, respectively, to identify short-term and long-term reference pictures.
  • PicNum and LongTermPicNum are similar to PN and LPIN, respectively, in the standard H.263 Annex U, but both are extended to cover progressive as well as interlaced coding.
  • PicNum differs from PN in one further respect: its value may be negative, and it decreases as the distance in decoding order between the current picture and the reference picture grows.
  • the PN of a list of reference pictures may be 1022, 1023, 0, 1, 2, while the PicNum of the same list of reference pictures may be -2, -1, 0, 1, 2.
  • patent applications US-09/892977, WO 01/86960 and GB 2382403, and the standard H.263 Annex U and the standard H.264/AVC disclose some prior art solutions to reference picture management in video coding.
  • This invention provides a reference picture management solution for implementation in e.g. video encoders and/or decoders whether or not the usage of long-term reference picture approach is supported.
  • the reference pictures are managed in the same way no matter how far away they are from the current picture being encoded or decoded in decoding order. Therefore the reference pictures need not be separated into short-term and long-term reference pictures.
  • a reference picture is identified by a variable whose value can be unique for a reference picture throughout the coded video sequence. In addition to identifying reference pictures, that variable can also be used in all the management processes of reference pictures.
  • a uniform reference picture management process is disclosed that may enable simplified video decoder and/or encoder implementations when long-term reference picture implementation is supported.
  • the invention can largely be implemented as software, and that software can be simplified to some extent.
  • the proposed reference picture reordering and marking processes may enable efficient signaling of information required for the reference picture management processes.
  • FIG. 1 shows an example of a picture stream which comprises reference pictures and non-reference pictures
  • Fig. 2 shows an example of a picture stream which comprises frame numbers
  • Fig. 3 shows an example of a signal according to the present invention
  • Fig. 4 shows an example of a method according to the present invention as a flow diagram
  • Fig. 5 depicts an advantageous embodiment of the system according to the present invention
  • Fig. 6 depicts an advantageous embodiment of the encoder according to the present invention
  • Fig. 7 depicts an advantageous embodiment of the decoder according to the present invention
  • the pictures to be encoded can be, for example, pictures of a video stream from a video source 3, e.g. a camera, a video recorder, etc.
  • the pictures (frames) of the video stream can be divided into smaller portions such as slices.
  • the slices can further be divided into blocks.
  • the video stream is encoded to reduce the information to be transmitted via a transmission channel 4, or to a storage media (not shown).
  • Pictures of the video stream are input to the encoder 1.
  • the encoder has an encoding buffer 1.1 (Fig. 6) for temporarily storing some of the pictures to be encoded.
  • the encoder 1 also includes a memory 1.3 and a processor 1.2 in which the encoding tasks according to the invention can be applied.
  • the memory 1.3 and the processor 1.2 can be common with the transmitting device 6 or the transmitting device 6 can have another processor and/or memory (not shown) for other functions of the transmitting device 6.
  • the encoder 1 performs motion estimation and/or some other tasks to compress the video stream.
  • the reference picture has to be stored in a buffer (e.g. in the decoded picture buffer 5.2) as long as it is used as a reference picture.
  • the encoder 1 may also insert information on display order of the pictures into the transmission stream.
  • the encoded pictures are moved to a picture interleaving buffer 5.3, if necessary. Furthermore, the encoded reference pictures are decoded and inserted into the decoded picture buffer 5.2 of the encoder.
  • the encoded pictures are transmitted from the encoder 1 by the transmitter 7 to the receiving device 8 via the transmission channel 4.
  • the receiver 9 receives the transmitted information and performs necessary operations to transform signals transmitted by the transmitter 7 into form suitable for the decoder 2 which is known as such.
  • the encoded pictures are decoded to form uncompressed pictures corresponding as much as possible to the encoded pictures.
  • the decoder 2 also includes a memory 2.3 and a processor 2.2 in which the decoding tasks can be applied.
  • the memory 2.3 and the processor 2.2 can be common with the receiving device 8 or the receiving device 8 can have another processor and/or memory (not shown) for other functions of the receiving device 8.
  • Pictures from the video source 3 are entered to the encoder 1 and stored in the encoding buffer 1.1 when necessary.
  • the encoding process is not necessarily started immediately after the first picture is entered to the encoder, but after a certain amount of pictures are available in the encoding buffer 1.1.
  • the encoder 1 tries to find suitable candidates from the pictures to be used as the reference frames for motion estimation.
  • the encoder 1 then performs the encoding to form encoded pictures.
  • the encoded pictures can be, for example, predicted pictures (P), bi-predictive pictures (B), and/or intra-coded pictures (I).
  • the intra-coded pictures can be decoded without using any other pictures, but other types of pictures need at least one reference picture before they can be decoded. Pictures of any of the above mentioned picture types can be used as a reference picture.
  • the encoder 1 attaches for example two time stamps to the pictures: a decoding time stamp (DTS) and output time stamp (OTS).
  • DTS decoding time stamp
  • OTS output time stamp
  • the decoder can use the time stamps to determine the correct decoding time and time to output (display) the pictures.
  • those time stamps are not necessarily transmitted to the decoder or it does not use them.
  • the buffering model is presented next.
  • the pre-encoding buffer 1.0, decoded picture buffer 5.2 and interleaving buffer 5.3 are initially empty. Uncompressed pictures in capturing order are inserted to the pre-encoding buffer. When any temporal scalability scheme is applied, more than one uncompressed picture is buffered in the pre-encoding buffer before encoding. After this initial pre-encoding buffering, the encoding process starts.
  • the encoder 5 performs the encoding process. As a result of the encoding process, the encoder produces decoded reference pictures and encoded pictures and removes the picture that was encoded from the pre-encoding buffer.
  • the decoded reference pictures are inserted in the decoded picture buffer 5.2 and encoded pictures are inserted in the interleaving buffer 5.3.
  • the transmitting device selects data units of encoded pictures from the interleaving buffer to be transmitted. A transmitted data unit of an encoded picture is removed from the interleaving buffer.
  • the transmission and/or storing of the encoded pictures can be started immediately after the first encoded picture is ready. This picture is not necessarily the first one in decoder output order because the decoding order and the output order may not be the same.
  • the transmission can be started.
  • the encoded pictures are optionally stored to the interleaving buffer 5.3.
  • the transmission can also start at a later stage, for example, after a certain part of the video stream is encoded.
  • the receiver 8 collects all data units of received signal(s) belonging to a picture, bringing them into a reasonable order. The strictness of the order depends on the profile employed.
  • the received data units are stored in reception order into the receiving buffer 9.1 (pre-decoding buffer, de-interleaving buffer).
  • the receiver 8 discards anything that is unusable, and passes the rest to the decoder 2.
  • the encoded pictures are decoded by the processor 2.2 and stored into the decoded picture buffer 2.1.
  • the decoded picture buffer 2.1 contains memory places for storing a number of pictures. Those places can also be called frame stores.
  • the decoder 2 decodes the received pictures in the order they are removed from the de-interleaving buffer (i.e. in decoding order).
  • the pictures which are used as reference pictures will be stored in the decoded picture buffer 2.1 as long as they are needed as reference pictures.
  • When a reference picture is marked as "unused for reference" (or alternatively the marking "used for reference" is removed), that reference picture can be removed from the decoded picture buffer 2.1 if its output or display time has elapsed, and/or a newly decoded picture can be stored over that reference picture.
  • the decoder 2 should also output the decoded pictures in correct order, for example by using the ordering of the picture order counts as specified in the standard H.264/AVC, and hence the reordering process needs to be defined clearly and normatively.
  • a variable having unique values for all the reference pictures within a coded video sequence is used to identify reference pictures, regardless of how far a reference picture, within the same coded video sequence, is away from the current picture, in temporal order, decoding order or any other order.
  • This variable is called a reference picture number and is abbreviated as RPN herein.
  • a coded video sequence is essentially the same as the term defined in the standard H.264/AVC.
  • the definition for the coded video sequence is: a sequence of coded pictures that consists, in decoding order, of an instantaneous decoding refresh (IDR) picture followed by zero or more non-IDR pictures including all subsequent pictures up to but not including any subsequent IDR picture.
  • An IDR picture is an intra coded picture after the decoding of which all following coded pictures in decoding order can be decoded without reference from any picture decoded prior to the IDR picture.
  • the first picture of each coded video sequence is an IDR picture.
  • Reference picture number is derived from the signaled information for each picture.
  • the reference picture number can be derived from temporal reference (e.g. TR in H.263 picture header) or frame number (FN) that is incremented by 1 for each reference picture in modulo arithmetic (e.g. frame_num in H.264/AVC slice header and PN as specified in H.263 Annex U).
  • reference picture number RPN is derived from frame number FN.
  • frame number FN counts only reference pictures and second, non-reference pictures are not stored in the post-decoder picture buffer for reference. It is obvious that a similar derivation method can be used to derive reference picture number RPN from other information such as temporal reference.
  • the frame number value of an IDR picture can be set to any integer value between 0 and the maximum frame number value MaxFN, though typically it can be set to 0.
  • the sum of the maximum frame number value MaxFN and 1 is denoted as MaxFNplus1.
  • MaxFNplus1 can be indicated according to the signaled information and/or the codec specification.
  • An IDR picture is naturally a reference picture.
  • the FN value in a picture is equal to the FN value of the previous reference picture in decoding order plus 1 modulo MaxFNplus1 as is shown in the example of Fig. 2, where all the shown pictures are reference pictures and MaxFNplus1 is 256.
  • the reference picture number of a reference picture is derived based on the frame number FN as follows. For a reference picture with frame number equal to FN and stored in the post-decoder buffer 5.2, 2.1 for reference, let the parameter prevFN be equal to the frame number of the previous reference picture in decoding order, and let the parameter prevRPN be equal to the reference picture number of the previous reference picture. The reference picture number of the reference picture is then calculated as follows:
  • RPN = prevRPN + FN - prevFN when the frame number has not wrapped around (FN > prevFN), else
  • RPN = prevRPN + FN - prevFN + MaxFNplus1
  • the initial reference picture list indexes the reference pictures stored in the post-decoder buffer for reference such that the reference pictures are ordered starting with the reference picture with the highest RPN value and proceeding through to the reference picture with the lowest RPN value. For example, if there are four pictures stored to be used for reference, and their RPN values are 255, 502, 1027 and 1029, the initial list order is 1029, 1027, 502, 255. With this default list order, variable length coded (VLC) code 0 can be used to indicate the reference picture with RPN value 1029, code 1 can be used to indicate the reference picture with RPN value 1027, and so on.
  • VLC variable length coded
  • Each predictive picture may have multiple reference pictures. These reference pictures are ordered in two reference picture lists, called RefPicList0 and RefPicList1.
  • Each reference picture list has an initial order, and the order may be changed by the reference picture list reordering process. For example, assume that the initial order of RefPicList0 is r0, r1, r2, ..., rm, which are coded using variable length codes. Code 0 represents r0, code 1 represents r1, and so on. If the encoder knows that r1 is used more frequently than r0, then it can reorder the list by swapping r0 and r1 such that code 1 represents r0 and code 0 represents r1. Since code 0 is shorter than code 1 in code length, improved coding efficiency is achieved.
  • the reference picture reordering process must be signaled in the bit stream so that the decoder can derive the correct reference picture for each reference picture list order.
  • One method for reference picture list reordering is to signal the RPN value to indicate which reference picture is to be reordered. For example, if the list order 1029, 1027, 502, 255 is to be reordered as 255, 1027, 1029, 502, the list reordering information to be signaled is (in the order as they appear):
  • the decoder 2 processes the two VLC codes in the order as they appear. After processing of the first code, the reference picture with RPN value 255 is put first in the order, and the orders of other reference pictures are put after the first reference picture in the order according to the initial order. The list order then becomes 255, 1029, 1027, 502.
  • the reference picture with RPN value 1027 is put second in the order, and the orders of other reference pictures except the one processed above are put after the second reference picture in the order according to the initial order.
  • the list order then becomes 255, 1027, 1029, 502.
  • a problem of the above method is that the number of bits to signal the original RPN value could be very large since in VLC coding larger values typically have a larger code length.
  • RPN values can be utilized.
  • a possible method is similar to that used for short-term reference picture list reordering in the standard H.264/AVC. Instead of directly signaling the RPN value for the to-be-reordered reference picture, the absolute difference between the prediction and the RPN value, minus 1, denoted as AbsDIFFminus1, is signaled, together with an indication of whether the absolute difference is added to or subtracted from the prediction value to derive the RPN value, denoted as ASidc.
  • the prediction value denoted as predRPN
  • predRPN is set equal to the RPN value of the just reordered reference picture.
  • RPN = predRPN + (AbsDIFFminus1 + 1)
  • the present invention provides an efficient coding of reference picture list reordering information. Prediction of the RPN values of the to-be-reordered reference pictures are used. Three pieces of information are signaled for indication of an RPN value:
  • PS scale of the prediction value denoted as PS.
  • the value of PS shall be selected such that AbsDIFFminus1 is in the range of 0 to MaxFNplus1, exclusive.
  • RPN = predRPN + (AbsDIFFminus1 + 1)
  • the three information pieces may be contained in two syntax elements (by combining ASidc and PS in one syntax element) as well as three syntax elements.
  • the prediction scale PS could be based on a value other than MaxFNplus1 provided that the value can be indicated from the codec specification and/or related signaled information.
  • the reference picture marking process is mainly used to mark some reference pictures as "unused for reference" such that they can be removed from the post-decoder buffer 2.1, 5.2 if their output or display times have elapsed.
  • the information needed to derive the RPN of the to-be-marked reference picture is signaled.
  • the information to be signaled is the difference between RPNcurr and the RPN value of the to-be-marked reference picture, minus 1, denoted as diffRPNminus1.
  • the RPN value of the to-be-marked reference picture is derived as
  • RPN = RPNcurr - (diffRPNminus1 + 1)
  • This invention provides a solution for the above problem.
  • further information is additionally signaled to indicate the size of the sliding window, denoted as SSW.
  • SSW the size of the sliding window
  • the additionally signaled information is equal to the difference between the maximum number of stored pictures for reference and SSW.
  • the additionally signaled information is then just a code representing 1 (equal to 3 - 2).
  • This invention also provides an efficient signaling method for the adaptive marking operation. Two pieces of information are signaled to mark one reference picture as "unused for reference":
  • the value of PS shall be selected such that AbsDIFFminus1 is in the range of 0 to MaxFNplus1, exclusive.
  • predRPN = RPNcurr - PS * MaxFNplus1
  • the prediction scale PS could be based on a value other than MaxFNplus1 provided that the value can be indicated from the codec specification and/or related signaled information.
  • the encoder 1 performs the encoding of the picture stream and calculates the values for the parameters.
  • the encoder 1 further initiates a signal transmission for informing the decoder 2 of the receiving device 8 that a reference picture can be removed from the post-decoder buffer 2.1 of the decoder if its display or output time has elapsed.
  • the signal is included with the parameters which indicate the reference picture number, reference picture list reordering information and/or the reference picture marking information.
  • the signal is transmitted by the transmitter 7 of the transmitting device 6.
  • the present invention can be applied in many kinds of systems and devices.
  • the transmitting device 6 can be e.g. a computing device such as a server device, a video transmitter, a wireless communication device, etc.
  • the receiving device 8 can be a computing device such as a workstation, a wireless communication device, a video receiver etc.
  • the transmitting device 6 including the encoder 1 advantageously also includes a transmitter 7 to transmit the encoded pictures to the transmission channel 4.
  • the receiving device 8 includes the receiver 9 to receive the encoded pictures, the decoder 2, and optionally a display 10 on which the decoded pictures can be displayed.
  • the transmission channel can be, for example, a landline communication channel and/or a wireless communication channel.
  • the transmitting device and the receiving device also include one or more processors 1.2, 2.2 which can perform the necessary steps for controlling the encoding/decoding process of video stream according to the invention. Therefore, the method according to the present invention can mainly be implemented as machine executable steps of the processors.
  • the buffering of the pictures can be implemented in the memory 1.3, 2.3 of the devices.
  • the program code 1.4 of the encoder can be stored into the memory 1.3.
  • the program code 2.4 of the decoder can be stored into the memory 2.3.
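
The bullets above on deriving RPN from FN, on the initial reference picture list order, and on list reordering by signaled RPN values can be tied together in a short Python sketch. It is illustrative only: the function names are hypothetical, the condition FN > prevFN on the first branch is inferred from the modulo increment of FN, and seeding the first (IDR) picture's RPN with its own FN is an assumption that the extracted text does not state.

```python
def derive_rpn(prev_rpn, prev_fn, fn, max_fn_plus1):
    """Derive the reference picture number (RPN) from the frame number (FN).

    Assumption: the first branch applies when FN has not wrapped (FN > prevFN).
    """
    if fn > prev_fn:
        return prev_rpn + fn - prev_fn
    # FN wrapped around modulo MaxFNplus1, so add one full period.
    return prev_rpn + fn - prev_fn + max_fn_plus1


def rpn_sequence(frame_numbers, max_fn_plus1):
    """RPNs for consecutive reference pictures, given their FNs in decoding order."""
    rpns = []
    prev_fn = None
    for fn in frame_numbers:
        if not rpns:
            rpns.append(fn)  # assumption: the IDR picture's RPN starts at its FN
        else:
            rpns.append(derive_rpn(rpns[-1], prev_fn, fn, max_fn_plus1))
        prev_fn = fn
    return rpns


def initial_list(rpns):
    """Initial reference picture list: highest RPN first, lowest RPN last."""
    return sorted(rpns, reverse=True)


def reorder(initial, signaled_rpns):
    """Place each signaled RPN at the next front position, in signaling order.

    Pictures that were not signaled follow in their initial relative order.
    """
    head = list(signaled_rpns)
    tail = [rpn for rpn in initial if rpn not in head]
    return head + tail
```

With MaxFNplus1 equal to 256, `rpn_sequence([254, 255, 0, 1], 256)` keeps counting upward across the FN wrap-around, and `reorder(initial_list([255, 502, 1027, 1029]), [255, 1027])` reproduces the list order 255, 1027, 1029, 502 from the reordering example.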

Abstract

A method for encoding a sequence of pictures comprising using one or more pictures as reference pictures, labeling the reference pictures with a first parameter, signaling the first parameter to a decoder, and using a reference picture management, wherein all the reference pictures are identified by a second parameter which is derived on the basis of the first parameter.

Description

Reference Picture Management in Video Coding
Field of the Invention
The invention relates to reference picture management in video coding and decoding.
Background of the Invention
There are a number of video coding standards including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual and ITU-T H.264 or ISO/IEC MPEG-4 AVC. H.264/AVC is the work output of a Joint Video Team (JVT) of ITU-T Video Coding Experts Group (VCEG) and ISO/IEC MPEG.
In addition, there are efforts working towards new video coding standards. One is the development of scalable video coding (SVC) standard in MPEG. This will become MPEG-21 Part 13. The second effort is the development of China video coding standards organized by the China Audio Visual coding Standard Work Group (AVS). AVS finalized its first video coding specification, AVS 1.0 targeted for SDTV and HDTV applications, in February 2004. Since then the focus has moved to mobile video services.
Many of the available video coding standards utilize motion compensation, i.e. predictive coding, to remove temporal redundancy between video signals for high coding efficiency. In motion compensation, one or more previously decoded pictures are used as reference pictures of the current picture being encoded or decoded. When encoding one block of pixels of the current picture (the current block), a reference block from the reference picture is searched such that the difference signal between the current block and the reference block requires a minimum number of bits to represent. Encoding of the displacement between the current block and the reference block may also be considered in searching the reference block. Further, the distortion of the reconstructed block may also be considered in searching the reference block.
In a coded video bit stream, some pictures may be used as reference pictures when encoding other pictures, while some may never be used as reference pictures. A picture that is not to be used as a reference picture is called a non-reference picture. The encoder should then signal to the decoder whether a picture is a reference picture, so that the decoder does not need to store non-reference pictures for motion compensation reference. Initially, each reference picture should be stored in the post-decoder buffer or decoded picture buffer and marked as "used for reference". However, when a reference picture is not used for reference anymore, it should be marked as "unused for reference". Marking of a reference picture as "used for reference" or "unused for reference" is done, among other things, by a reference picture management process.
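The marking behavior described above can be sketched in Python as follows. This is a minimal illustration, not an implementation from any standard: the class and method names are hypothetical, and the picture labels 101 to 103 are borrowed from Fig. 1 purely as sample identifiers.

```python
USED = "used for reference"
UNUSED = "unused for reference"


class DecodedPictureBuffer:
    """Hypothetical buffer that tracks the marking of stored reference pictures."""

    def __init__(self):
        self.marking = {}  # picture id -> USED or UNUSED

    def decode(self, pic_id, is_reference):
        # Only pictures signaled as reference pictures are stored for reference;
        # non-reference pictures need not be kept for motion compensation.
        if is_reference:
            self.marking[pic_id] = USED

    def mark_unused(self, pic_id):
        # The reference picture management process decides when this happens.
        self.marking[pic_id] = UNUSED

    def references(self):
        # Pictures that may still be used for motion compensation.
        return [pid for pid, mark in self.marking.items() if mark == USED]
```

For instance, after decoding pictures 101 (reference), 102 (non-reference) and 103 (reference), `references()` lists 101 and 103; once `mark_unused(101)` is called, only 103 remains available for reference.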
The reference picture selected for coding or decoding a block may be a recently decoded picture (typically called short-term reference picture), or a decoded picture that is far preceding the currently coded picture in decoding order (typically called long-term reference picture). In Fig. 1 there is depicted an example of a picture stream 100 which comprises reference pictures 101, 103, 105, 106, 108, 110 and non-reference pictures 102, 104, 107, 109. The reference picture 101 is assumed to be a short-term reference picture (when encoding pictures 102 and 103) while the reference picture 105 is assumed to be a long-term reference picture (when encoding picture 106). The pictures between the long-term reference picture 105 and the picture 106 which uses the long-term reference picture as a reference picture are not shown in Fig. 1.
In the standards that allow for both short-term and long-term reference pictures, e.g. H.263 and H.264/AVC, reference picture management processes are separated between short-term reference pictures and long-term reference pictures. In addition, a process is specified to mark a short-term reference picture as a long-term reference picture. In H.264/AVC, a short-term reference picture is identified by the variable PicNum, and a long-term reference picture is identified by the variable LongTermPicNum. Both PicNum and LongTermPicNum are specified in subclause 8.2.4.1 of the H.264/AVC specification. Accordingly, all other reference management operations such as reference picture list construction (specified in subclause 8.2.4 of the H.264/AVC specification) and reference picture marking (specified in subclause 8.2.5 of the H.264/AVC specification) are separated for short-term reference pictures and long-term reference pictures.
In the standard H.263 Annex N (reference picture selection mode), the 10-bit temporal reference index TRI or RTR representing temporal reference is used to identify reference pictures. One disadvantage in this solution is that the temporal distance between the reference picture and the current picture is limited to be less than 1024 units. The unit is defined according to the active picture clock frequency. In other words, the so-called long-term reference picture is not enabled.
In the standard H.263 Annex U (enhanced reference picture selection mode), the 10-bit picture number (PN), which is incremented by 1 for each reference picture (called a "stored picture" therein), is used to identify short-term reference pictures. The variable length coded LPIN, representing the long-term picture index, is used to identify long-term reference pictures.
In the standard H.264/AVC, PicNum and LongTermPicNum are used, respectively, to identify short-term and long-term reference pictures. PicNum and LongTermPicNum are similar to PN and LPIN, respectively, in the standard H.263 Annex U, but both are extended to cover both progressive coding and interlace coding. PicNum differs from PN in one further respect: the value of PicNum may be negative, and it decreases as the difference between the decoding order of the current picture and the decoding order of the reference picture increases. For example, the PN of a list of reference pictures may be 1022, 1023, 0, 1, 2, while the PicNum of the same list of reference pictures may be -2, -1, 0, 1, 2.
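As a non-limiting illustration of the PN versus PicNum example above, the following Python sketch applies the frame number wrapping rule of subclause 8.2.4.1 of H.264/AVC for frames. The function name `pic_num`, the current frame number of 3 and the modulus of 1024 are assumptions chosen only to reproduce the listed values.

```python
def pic_num(ref_frame_num, curr_frame_num, max_frame_num):
    """PicNum of a short-term reference frame (frames only, no fields).

    A reference frame whose frame number is larger than that of the
    current picture was decoded before the counter wrapped, so the
    modulus is subtracted and the result becomes negative.
    """
    if ref_frame_num > curr_frame_num:
        return ref_frame_num - max_frame_num
    return ref_frame_num


# Assumed values: the current picture has frame number 3, modulus is 1024.
pn_list = [1022, 1023, 0, 1, 2]
pic_nums = [pic_num(pn, 3, 1024) for pn in pn_list]
print(pic_nums)
```

With these assumed values, the wrapped counter values 1022 and 1023 map to negative numbers, so the resulting identifiers are ordered by decoding order.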
For example, patent applications US-09/892977, WO 01/86960 and GB 2382403, and the standard H.263 Annex U and the standard H.264/AVC disclose some prior art solutions to reference picture management in video coding.
The separated management of short-term and long-term reference pictures results in complex reference picture management operations, hence increased implementation complexity for both hardware and software implementations.
Summary of the Invention
This invention provides a reference picture management solution for implementation in e.g. video encoders and/or decoders whether or not the usage of long-term reference picture approach is supported.
According to an example embodiment of the present invention, the reference pictures are managed in the same way no matter how far away they are, in decoding order, from the current picture being encoded or decoded. Therefore the reference pictures need not be separated into short-term and long-term reference pictures. A reference picture is identified by a variable whose value can be unique for a reference picture throughout the coded video sequence. In addition to identifying reference pictures, that variable can also be used in all the management processes of reference pictures.
In the present invention a uniform reference picture management process is disclosed that may enable simplified video decoder and/or encoder implementations when long-term reference picture implementation is supported.
In the standard H.264/AVC there is a syntax table for reference picture reordering. There are eight syntax elements (i.e. coding points) in the syntax table. Two of the syntax elements are not needed when the present invention is used. In the standard H.264/AVC there is also a syntax table for reference picture remarking. There are eight syntax elements in the syntax table from which four are not needed in the implementations of the present invention.
The invention can largely be implemented in software, and that software can be simplified to some extent.
The proposed reference picture reordering and marking processes may enable efficient signaling of information required for the reference picture management processes.
Description of the Drawings
In the following the present invention will be described in more detail with respect to the appended drawings in which
Fig. 1 shows an example of a picture stream which comprises reference pictures and non-reference pictures,
Fig. 2 shows an example of a picture stream which comprises frame numbers,
Fig. 3 shows an example of a signal according to the present invention,
Fig. 4 shows an example of a method according to the present invention as a flow diagram,
Fig. 5 depicts an advantageous embodiment of the system according to the present invention,
Fig. 6 depicts an advantageous embodiment of the encoder according to the present invention, and
Fig. 7 depicts an advantageous embodiment of the decoder according to the present invention.
Detailed Description of the Invention
The following implementation aspects of the current invention are described in the way for progressive coding only, where a picture is equivalently a frame.
However, it is obvious for them to be extended for use in both progressive coding and interlace coding, where a picture may either be a field or a frame, in the way similarly as in the prior art according to the standard H.264/AVC.
Further, the following aspects of the current invention are described for forward prediction only. It is also obvious that they can be extended to bi-prediction as defined in the standard H.264/AVC.
In the following the invention will be described in more detail with reference to the system of Fig. 5, the encoder 1 of Fig. 6 and decoder 2 of Fig. 7. The pictures to be encoded can be, for example, pictures of a video stream from a video source 3, e.g. a camera, a video recorder, etc. The pictures (frames) of the video stream can be divided into smaller portions such as slices. The slices can further be divided into blocks. In the encoder 1 the video stream is encoded to reduce the information to be transmitted via a transmission channel 4, or to a storage media (not shown). Pictures of the video stream are input to the encoder 1. The encoder has an encoding buffer 1.1 (Fig. 6) for temporarily storing some of the pictures to be encoded. The encoder 1 also includes a memory 1.3 and a processor 1.2 in which the encoding tasks according to the invention can be applied. The memory 1.3 and the processor 1.2 can be common with the transmitting device 6 or the transmitting device 6 can have another processor and/or memory (not shown) for other functions of the transmitting device 6. The encoder 1 performs motion estimation and/or some other tasks to compress the video stream. The reference picture has to be stored in a buffer (e.g. in the decoded picture buffer 5.2) as long as it is used as a reference picture. The encoder 1 may also insert information on display order of the pictures into the transmission stream.
From the encoding process the encoded pictures are moved to a picture interleaving buffer 5.3, if necessary. Furthermore, the encoded reference pictures are decoded and inserted into the decoded picture buffer 5.2 of the encoder. The encoded pictures are transmitted from the encoder 1 by the transmitter 7 to the receiving device 8 via the transmission channel 4. In the receiving device 8 the receiver 9 receives the transmitted information and performs the necessary operations to transform the signals transmitted by the transmitter 7 into a form suitable for the decoder 2, which is known as such. In the decoder 2 the encoded pictures are decoded to form uncompressed pictures corresponding as closely as possible to the encoded pictures.
The decoder 2 also includes a memory 2.3 and a processor 2.2 in which the decoding tasks can be applied. The memory 2.3 and the processor 2.2 can be common with the receiving device 8 or the receiving device 8 can have another processor and/or memory (not shown) for other functions of the receiving device 8.
Encoding
Let us now consider the encoding-decoding process in more detail. Pictures from the video source 3 are entered into the encoder 1 and stored in the encoding buffer 1.1 when necessary. The encoding process is not necessarily started immediately after the first picture is entered into the encoder, but after a certain number of pictures are available in the encoding buffer 1.1. Then the encoder 1 tries to find suitable candidates from the pictures to be used as the reference frames for motion estimation. The encoder 1 then performs the encoding to form encoded pictures. The encoded pictures can be, for example, predicted pictures (P), bi-predictive pictures (B), and/or intra-coded pictures (I). The intra-coded pictures can be decoded without using any other pictures, but the other types of pictures need at least one reference picture before they can be decoded. Pictures of any of the above mentioned picture types can be used as a reference picture.
The encoder 1 attaches for example two time stamps to the pictures: a decoding time stamp (DTS) and output time stamp (OTS). The decoder can use the time stamps to determine the correct decoding time and time to output (display) the pictures. However, those time stamps are not necessarily transmitted to the decoder or it does not use them.
The buffering model is presented next. The pre-encoding buffer 1.0, decoded picture buffer 5.2 and interleaving buffer 5.3 are initially empty. Uncompressed pictures in capturing order are inserted into the pre-encoding buffer. When any temporal scalability scheme is applied, more than one uncompressed picture is buffered in the pre-encoding buffer before encoding. After this initial pre-encoding buffering, the encoding process starts. The encoder 1 performs the encoding process. As a result of the encoding process, the encoder produces decoded reference pictures and encoded pictures and removes the picture that was encoded from the pre-encoding buffer. The decoded reference pictures are inserted into the decoded picture buffer 5.2 and encoded pictures are inserted into the interleaving buffer 5.3. The transmitting device selects data units of encoded pictures from the interleaving buffer to be transmitted. A transmitted data unit of an encoded picture is removed from the interleaving buffer.
Transmission
The transmission and/or storing of the encoded pictures (and the optional virtual decoding) can be started immediately after the first encoded picture is ready. This picture is not necessarily the first one in decoder output order because the decoding order and the output order may not be the same. When the first picture of the video stream is encoded the transmission can be started. The encoded pictures are optionally stored to the interleaving buffer 5.3. The transmission can also start at a later stage, for example, after a certain part of the video stream is encoded.
Decoding
The receiver 8 collects all data units of received signal(s) belonging to a picture, bringing them into a reasonable order. The strictness of the order depends on the profile employed. The received data units are stored in reception order into the receiving buffer 9.1 (pre-decoding buffer, de-interleaving buffer). The receiver 8 discards anything that is unusable, and passes the rest to the decoder 2.
The encoded pictures are decoded by the processor 2.2 and stored into the decoded picture buffer 2.1. The decoded picture buffer 2.1 contains memory places for storing a number of pictures. Those places can also be called frame stores. The decoder 2 decodes the received pictures in the order they are removed from the de-interleaving buffer (i.e. in decoding order). The pictures which are used as reference pictures will be stored in the decoded picture buffer 2.1 as long as they are needed as reference pictures. When a reference picture is marked as "unused for reference" (or alternatively the marking "used for reference" is removed), that reference picture can be removed from the decoded picture buffer 2.1 if its output or display time has elapsed, and/or a newly decoded picture can be stored in its place.
The decoder 2 should also output the decoded pictures in correct order, for example by using the ordering of the picture order counts as specified in the standard H.264/AVC, and hence the reordering process need be defined clearly and normatively.
Identification of reference pictures
In this invention, a variable having unique values for all the reference pictures within a coded video sequence is used to identify reference pictures, regardless of how far a reference picture, within the same coded video sequence, is away from the current picture in temporal order, decoding order or any other order. This variable is called a reference picture number and it is abbreviated as RPN herein.
A coded video sequence is essentially the same as the term defined in the standard H.264/AVC. The definition for the coded video sequence is: a sequence of coded pictures that consists, in decoding order, of an instantaneous decoding refresh (IDR) picture followed by zero or more non-IDR pictures including all subsequent pictures up to but not including any subsequent IDR picture. An IDR picture is an intra coded picture after the decoding of which all following coded pictures in decoding order can be decoded without reference to any picture decoded prior to the IDR picture.
The first picture of each coded video sequence is an IDR picture.
Reference picture number (RPN) is derived from the signaled information for each picture. For example, the reference picture number can be derived from temporal reference (e.g. TR in H.263 picture header) or frame number (FN) that is incremented by 1 for each reference picture in modulo arithmetic (e.g. frame_num in H.264/AVC slice header and PN as specified in H.263 Annex U).
There are some advantages when the reference picture number RPN is derived from frame number FN. First, frame number FN counts only reference pictures and second, non-reference pictures are not stored in the post-decoder picture buffer for reference. It is obvious that a similar derivation method can be used to derive the reference picture number RPN from other information such as temporal reference.
The frame number value of an IDR picture can be set to any integer value between 0 and the maximum frame number value MaxFN, though typically it is set to 0. The sum of the maximum frame number value MaxFN and 1 is denoted as MaxFNplus1. MaxFNplus1 can be indicated according to the signaled information and/or the codec specification. An IDR picture is naturally a reference picture. For later pictures in the same coded video sequence in decoding order, the FN value of a picture, whether it is a reference or a non-reference picture, is equal to the FN value of the previous reference picture in decoding order plus 1 modulo MaxFNplus1, as shown in the example of Fig. 2, where all the shown pictures are reference pictures and MaxFNplus1 is 256. The reference picture number of a reference picture is derived based on the frame number FN as follows. For a reference picture with frame number equal to FN and stored in the post-decoder buffer 5.2, 2.1 for reference, let the parameter prevFN be equal to the frame number of the previous reference picture in decoding order, and let the parameter prevRPN be equal to the reference picture number of the previous reference picture. The reference picture number of the reference picture is then calculated as follows:
if (prevFN <= FN)
    RPN = prevRPN + FN - prevFN
else
    RPN = prevRPN + FN - prevFN + MaxFNplus1
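As a non-limiting illustration, the derivation above can be sketched in Python as follows. The function name `derive_rpn` is hypothetical, and the text does not state how the RPN of the first (IDR) picture is initialized; the sketch assumes it is set equal to that picture's frame number.

```python
def derive_rpn(frame_numbers, max_fn_plus1):
    """Derive RPN values for reference pictures given in decoding order.

    frame_numbers: FN of each reference picture, starting at the IDR picture.
    max_fn_plus1:  the modulus of the frame number counter (MaxFN + 1).
    """
    rpns = []
    prev_fn = prev_rpn = None
    for fn in frame_numbers:
        if prev_fn is None:
            # Assumption: the RPN of the IDR picture equals its FN.
            rpn = fn
        elif prev_fn <= fn:
            rpn = prev_rpn + fn - prev_fn
        else:
            # FN wrapped around, so one full modulus is added back.
            rpn = prev_rpn + fn - prev_fn + max_fn_plus1
        rpns.append(rpn)
        prev_fn, prev_rpn = fn, rpn
    return rpns


# Frame numbers wrap at 256, but the derived RPN keeps increasing.
rpns = derive_rpn([254, 255, 0, 1], 256)
print(rpns)
```

The point of the sketch is that the RPN keeps growing across the wrap-around of the frame number, which is what makes it unique within the coded video sequence.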
Reference picture list initialization
The initial reference picture list indexes the reference pictures stored in the post-decoder buffer for reference such that the reference pictures are ordered starting with the reference picture with the highest RPN value and proceeding through to the reference picture with the lowest RPN value. For example, if there are four pictures stored to be used for reference, and their RPN values are 255, 502, 1027 and 1029, the initial list order is 1029, 1027, 502, 255. With this default list order, variable length coded (VLC) code 0 can be used to indicate the reference picture with RPN value 1029, code 1 can be used to indicate the reference picture with RPN value 1027, and so on.
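As a non-limiting illustration, the initial list order of the above example can be reproduced with a short Python sketch. The names `init_ref_list` and `code_for` are hypothetical.

```python
def init_ref_list(stored_rpns):
    """Order the stored reference pictures from highest to lowest RPN."""
    return sorted(stored_rpns, reverse=True)


initial = init_ref_list([255, 502, 1027, 1029])

# Map each RPN to the index (code number) that selects it in the initial list.
code_for = {rpn: index for index, rpn in enumerate(initial)}
print(initial, code_for)
```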
Reference picture list reordering
Each predictive picture may have multiple reference pictures. These reference pictures are ordered in two reference picture lists, called RefPicList0 and RefPicList1. Each reference picture list has an initial order, and the order may be changed by the reference picture list reordering process. For example, assume that the initial order of RefPicList0 is r0, r1, r2, ..., rm, which are coded using variable length codes. Code 0 represents r0, code 1 represents r1, and so on. If the encoder knows that r1 is used more frequently than r0, then it can reorder the list by swapping r0 and r1 such that code 1 represents r0 and code 0 represents r1. Since code 0 is shorter than code 1 in code length, improved coding efficiency is achieved. The reference picture reordering process must be signaled in the bit stream so that the decoder can derive the correct reference picture for each reference picture list order.
One method for reference picture list reordering is to signal the RPN value to indicate which reference picture is to be reordered. For example, if the list order 1029, 1027, 502, 255 is to be reordered as 255, 1027, 1029, 502, the list reordering information to be signaled is (in the order as they appear):
VLC code for 255
VLC code for 1027
The decoder 2 processes the two VLC codes in the order as they appear. After processing of the first code, the reference picture with RPN value 255 is put first in the order, and the orders of other reference pictures are put after the first reference picture in the order according to the initial order. The list order then becomes 255, 1029, 1027, 502.
After processing of the second code, the reference picture with RPN value 1027 is put second in the order, and the orders of other reference pictures except the one processed above are put after the second reference picture in the order according to the initial order. The list order then becomes 255, 1027, 1029, 502.
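As a non-limiting illustration, the two processing steps above can be sketched in Python. The function name `reorder_list` is hypothetical; the sketch assumes every signaled RPN is present in the list.

```python
def reorder_list(initial_list, signaled_rpns):
    """Apply list reordering commands in the order they are signaled.

    The k-th signaled RPN is moved to position k; the remaining
    pictures keep their relative order from the initial list.
    """
    current = list(initial_list)
    for position, rpn in enumerate(signaled_rpns):
        current.remove(rpn)          # take the picture out of its old place
        current.insert(position, rpn)  # and put it at the signaled position
    return current


after_first = reorder_list([1029, 1027, 502, 255], [255])
after_second = reorder_list([1029, 1027, 502, 255], [255, 1027])
print(after_first, after_second)
```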
A problem of the above method is that the number of bits to signal the original RPN value could be very large since in VLC coding larger values typically have a larger code length.
To save bits for representing the list reordering information, predictive coding of RPN values can be utilized. A possible method is similar to that used for short-term reference picture list reordering in the standard H.264/AVC. Instead of directly signaling the RPN value for the to-be-reordered reference picture, the absolute difference between the prediction and the RPN value minus 1, denoted as AbsDIFFminus1, is signaled, together with an indication of whether the absolute difference is added to or subtracted from the prediction value to derive the RPN value, denoted as ASidc. For the first to-be-reordered reference picture, the prediction value, denoted as predRPN, is equal to RPNcurr. After processing the list reordering information of each to-be-reordered reference picture, predRPN is set equal to the RPN value of the just reordered reference picture.
The RPN value of the to-be-reordered reference picture is derived as follows:
if (ASidc == 0)
    RPN = predRPN - (AbsDIFFminus1 + 1)
else if (ASidc == 1)
    RPN = predRPN + (AbsDIFFminus1 + 1)
For the above example, assuming that RPNcurr is equal to 1030, the list reordering information to be signaled becomes:
AbsDIFFminus1 = 774, ASidc = 0
AbsDIFFminus1 = 771, ASidc = 1
It can be derived that the first to-be-reordered reference picture has RPN value equal to (1030-(774+1)=255), and the second has RPN value equal to (255+(771+1)=1027).
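As a non-limiting illustration, the predictive signaling just described can be sketched in Python as an encoder/decoder pair. The function names are hypothetical, and the sketch assumes that a to-be-reordered RPN never equals the current prediction, so the absolute difference is at least 1.

```python
def encode_simple(rpn_curr, rpns):
    """Return (AbsDIFFminus1, ASidc) pairs for the to-be-reordered RPNs."""
    pred = rpn_curr
    codes = []
    for rpn in rpns:
        diff = rpn - pred            # assumed to be non-zero
        as_idc = 1 if diff > 0 else 0
        codes.append((abs(diff) - 1, as_idc))
        pred = rpn                   # next prediction is the RPN just coded
    return codes


def decode_simple(rpn_curr, codes):
    """Recover the RPN values from (AbsDIFFminus1, ASidc) pairs."""
    pred = rpn_curr
    rpns = []
    for abs_diff_minus1, as_idc in codes:
        if as_idc == 0:
            rpn = pred - (abs_diff_minus1 + 1)
        else:
            rpn = pred + (abs_diff_minus1 + 1)
        rpns.append(rpn)
        pred = rpn
    return rpns


codes = encode_simple(1030, [255, 1027])
print(codes, decode_simple(1030, codes))
```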
However, as can be seen, the above method is not efficient since the signaled value could still be very large.
The present invention provides an efficient coding of reference picture list reordering information. Prediction of the RPN values of the to-be-reordered reference pictures is used. Three pieces of information are signaled for indication of an RPN value:
1) the absolute difference between the prediction and the RPN value minus 1, denoted as AbsDIFFminus1,
2) an indication of whether addition or subtraction is used to derive the prediction value and the RPN value, denoted as ASidc, and
3) scale of the prediction value, denoted as PS. The value of PS shall be selected such that AbsDIFFminus1 is in the range of 0 to MaxFNplus1, exclusive.
For the first to-be-reordered reference picture, the prediction value predRPN is calculated as follows:
predRPN = RPNcurr - PS * MaxFNplus1
After processing the list reordering information of each to-be-reordered reference picture, the prediction value predRPN is first set equal to the RPN value of the just reordered reference picture. Then predRPN is updated as follows:
if (ASidc == 0)
    predRPN = predRPN - PS * MaxFNplus1
else if (ASidc == 1)
    predRPN = predRPN + PS * MaxFNplus1
The RPN value of the to-be-reordered reference picture is derived as follows:
if (ASidc == 0)
    RPN = predRPN - (AbsDIFFminus1 + 1)
else if (ASidc == 1)
    RPN = predRPN + (AbsDIFFminus1 + 1)
For the above example, assuming that RPNcurr is equal to 1030 and MaxFNplus1 is equal to 256, the list reordering information to be signaled in a signal 300 becomes as follows:
AbsDIFFminus1 = 6, ASidc = 0, PS = 3 (this is illustrated with reference 301 in Fig. 3)
AbsDIFFminus1 = 3, ASidc = 1, PS = 3 (this is illustrated with reference 302 in Fig. 3)
It can be derived that the first to-be-reordered reference picture has RPN value equal to 1030-3*256-(6+1)=255, and the second to-be-reordered reference picture has RPN value equal to 255+3*256+(3+1)=1027.
It can be seen that the signaled values are small, hence bits can be saved in representations of the reference picture list reordering process.
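As a non-limiting illustration, the scaled prediction can be sketched in Python. The function names are hypothetical. The encoder below is only one possible way of choosing PS: it splits the absolute difference minus 1 into a quotient (PS) and a remainder (AbsDIFFminus1) with respect to MaxFNplus1, and it assumes that the first to-be-reordered RPN is smaller than RPNcurr and that no difference is zero.

```python
def encode_scaled(rpn_curr, max_fn_plus1, rpns):
    """Return (AbsDIFFminus1, ASidc, PS) triples for the to-be-reordered RPNs."""
    prev = rpn_curr
    codes = []
    for rpn in rpns:
        diff = rpn - prev                      # assumed non-zero
        as_idc = 1 if diff > 0 else 0
        # Quotient becomes PS, remainder stays below MaxFNplus1.
        ps, abs_diff_minus1 = divmod(abs(diff) - 1, max_fn_plus1)
        codes.append((abs_diff_minus1, as_idc, ps))
        prev = rpn
    return codes


def decode_scaled(rpn_curr, max_fn_plus1, codes):
    """Recover the RPN values following the derivation given in the text."""
    rpns = []
    pred = None
    for abs_diff_minus1, as_idc, ps in codes:
        if pred is None:
            # First to-be-reordered picture: prediction from RPNcurr.
            pred = rpn_curr - ps * max_fn_plus1
        elif as_idc == 0:
            pred = pred - ps * max_fn_plus1
        else:
            pred = pred + ps * max_fn_plus1
        if as_idc == 0:
            rpn = pred - (abs_diff_minus1 + 1)
        else:
            rpn = pred + (abs_diff_minus1 + 1)
        rpns.append(rpn)
        pred = rpn   # prediction is reset to the RPN just reordered
    return rpns


codes = encode_scaled(1030, 256, [255, 1027])
print(codes, decode_scaled(1030, 256, codes))
```

With this choice, the large part of each difference is carried by the small integer PS, so every signaled AbsDIFFminus1 stays below MaxFNplus1.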
It should be stated that simple changes of the above method are always possible. For example, the three information pieces may be contained in two syntax elements (by combining ASidc and PS in one syntax element) as well as three syntax elements. The prediction scale PS could be based on a value other than MaxFNplus1 provided that the value can be indicated from the codec specification and/or related signaled information.
Reference picture marking
The reference picture marking process is mainly used to mark some reference pictures as "unused for reference" such that they can be removed from the post-decoder buffer 2.1, 5.2 if their output or display times have elapsed. There are two kinds of reference picture marking mechanisms: the first-in first-out sliding window method and the customized adaptive marking method.
Methods similar to those for both the sliding window marking operation and the adaptive marking operation in H.264/AVC can be applied in the scenario where RPN is used to identify reference pictures.
For the sliding window marking operation, whenever the total number of pictures stored in the post-decoder buffer for reference is equal to the maximum value and a new reference picture is to be stored, the one having the smallest value of RPN is marked as "unused for reference".
For the adaptive marking operation, information needed to derive the RPN of the to-be-marked reference picture is signaled. The information to be signaled is the difference between RPNcurr and the RPN value of the to-be-marked reference picture minus 1, denoted as diffRPNminus1.
The RPN value of the to-be-marked reference picture is derived as
RPN = RPNcurr - (diffRPNminus1 + 1)
For the same example as earlier, if the reference picture with RPN equal to 255 is to be marked as "unused for reference", the information to be signaled is diffRPNminus1 = 774.
It can be derived that the reference picture to be marked has RPN value equal to (1030-(774+1)=255).
A problem with the above described prior-art sliding window marking operation is illustrated through the following example. Assume that RPNcurr is equal to 200, that three pictures are stored in the post-decoder buffer for reference with RPN values equal to 60, 198 and 199, and that the maximum number of stored pictures for reference is 3. For the next to-be-encoded picture, the encoder 1 would still like to keep the reference picture with RPN equal to 60 stored for later use while marking the reference picture with RPN equal to 199 as "unused for reference". In such a case, it would be efficient to use the sliding window marking operation. However, the prior-art sliding window marking operation will mark the reference picture with RPN equal to 60 as "unused for reference".
This invention provides a solution for the above problem. For the sliding window reference picture marking operation, a further piece of information is additionally signaled to indicate the size of the sliding window, denoted as SSW. Only the SSW reference pictures with the largest values of RPN are operated according to the first-in first-out rule. Reference pictures with smaller values are not involved.
For example, the additionally signaled information is equal to the difference between the maximum number of stored pictures for reference and SSW. In the above example, the additionally signaled information is then just a code representing 1 (equal to 3 - 2).
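As a non-limiting illustration, the restricted sliding window can be sketched in Python. The function name `sliding_window_mark` and the buffer contents used below are hypothetical, and the sketch assumes that "first-in first-out" within the window means that the picture with the smallest RPN among the SSW largest RPN values is the one marked.

```python
def sliding_window_mark(stored_rpns, max_stored, ssw=None):
    """Return the RPN to mark "unused for reference", or None.

    stored_rpns: RPNs of the pictures currently stored for reference.
    max_stored:  maximum number of pictures stored for reference.
    ssw:         sliding window size (at least 1); when omitted, the
                 window covers the whole buffer, as in the prior art.
    """
    if len(stored_rpns) < max_stored:
        return None                       # buffer not full, nothing to mark
    if ssw is None:
        ssw = max_stored
    window = sorted(stored_rpns)[-ssw:]   # the SSW largest RPN values
    return min(window)                    # oldest picture inside the window


# Hypothetical buffer: one old picture (RPN 10) that should be kept.
stored = [10, 40, 41, 42]
prior_art = sliding_window_mark(stored, max_stored=4)
restricted = sliding_window_mark(stored, max_stored=4, ssw=3)
signaled = 4 - 3   # difference between the maximum number and SSW
print(prior_art, restricted, signaled)
```

In this hypothetical buffer the unrestricted window would discard the old picture with RPN 10, whereas a window of size 3 leaves it untouched.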
It can also be seen that the prior-art adaptive marking operation is not efficient since the signaled value could be very large. Unfortunately, to directly signal the RPN value of the to-be-marked reference picture is also inefficient.
This invention also provides an efficient signaling method for the adaptive marking operation. Two pieces of information are signaled to mark one reference picture as "unused for reference":
1) the difference between the prediction of the RPN and the RPN value of the to-be-marked reference picture minus 1, denoted as diffRPNminus1, and
2) the prediction scale indicating how the prediction is derived, denoted as PS.
The value of PS shall be selected such that diffRPNminus1 is in the range of 0 to MaxFNplus1, exclusive.
The prediction, denoted as predRPN, is derived as
predRPN = RPNcurr - PS * MaxFNplus1
The RPN value of the to-be-marked reference picture is derived as
RPN = predRPN - (diffRPNminus1 + 1) = RPNcurr - PS * MaxFNplus1 - (diffRPNminus1 + 1)
For the same example as earlier, if the reference picture with RPN equal to 255 is to be marked as "unused for reference", the information to be signaled is diffRPNminus1 = 6, PS = 3 (this is illustrated with reference 303 in Fig. 3).
It can be derived that the reference picture to be marked has RPN value equal to (1030-3*256-(6+1)=255).
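As a non-limiting illustration, the adaptive marking signaling can be sketched in Python. The function names are hypothetical, and the way the encoder picks PS (quotient and remainder with respect to MaxFNplus1) is an assumption; the sketch also assumes that the to-be-marked RPN is smaller than RPNcurr.

```python
def encode_marking(rpn_curr, rpn, max_fn_plus1):
    """Return (diffRPNminus1, PS) for the to-be-marked reference picture."""
    # Quotient becomes PS, remainder stays below MaxFNplus1.
    ps, diff_rpn_minus1 = divmod(rpn_curr - rpn - 1, max_fn_plus1)
    return diff_rpn_minus1, ps


def decode_marking(rpn_curr, diff_rpn_minus1, ps, max_fn_plus1):
    """Recover the RPN of the picture to mark "unused for reference"."""
    pred_rpn = rpn_curr - ps * max_fn_plus1
    return pred_rpn - (diff_rpn_minus1 + 1)


diff_rpn_minus1, ps = encode_marking(1030, 255, 256)
marked = decode_marking(1030, diff_rpn_minus1, ps, 256)
print(diff_rpn_minus1, ps, marked)
```

Compared with signaling the full difference of 774 directly, the two signaled values remain small.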
Again, it should be stated that simple changes of the above method are always possible. For example, the prediction scale PS could be based on a value other than MaxFNplus1 provided that the value can be indicated from the codec specification and/or related signaled information.
In the example system of Fig. 5 the encoder 1 performs the encoding of the picture stream and calculates the values for the parameters. The encoder 1 further initiates a signal transmission for informing the decoder 2 of the receiving device 8 that a reference picture can be removed from the post-decoder buffer 2.1 of the decoder if its display or output time has elapsed. The signal includes the parameters which indicate the reference picture number, reference picture list reordering information and/or the reference picture marking information. The signal is transmitted by the transmitter 7 of the transmitting device 6.
The present invention can be applied in many kinds of systems and devices. The transmitting device 6 can be e.g. a computing device such as a server device, a video transmitter, a wireless communication device, etc. The receiving device 8 can be a computing device such as a workstation, a wireless communication device, a video receiver etc. The transmitting device 6 including the encoder 1 advantageously also includes a transmitter 7 to transmit the encoded pictures to the transmission channel 4. The receiving device 8 includes the receiver 9 to receive the encoded pictures, the decoder 2, and optionally a display 10 on which the decoded pictures can be displayed. The transmission channel can be, for example, a landline communication channel and/or a wireless communication channel. The transmitting device and the receiving device also include one or more processors 1.2, 2.2 which can perform the necessary steps for controlling the encoding/decoding process of the video stream according to the invention. Therefore, the method according to the present invention can mainly be implemented as machine executable steps of the processors. The buffering of the pictures can be implemented in the memory 1.3, 2.3 of the devices. The program code 1.4 of the encoder can be stored into the memory 1.3. Respectively, the program code 2.4 of the decoder can be stored into the memory 2.3.

Claims

What is claimed is:
1. A method for encoding a sequence of pictures comprising: using one or more pictures as reference pictures; labeling the reference pictures with a first parameter; signaling the first parameter to a decoder; and using a reference picture management; wherein all the reference pictures are identified by a second parameter which is derived on the basis of the first parameter.
2. A method according to claim 1 comprising using a frame number FN as said first parameter, and using a reference picture number RPN as said second parameter.
3. A method according to claim 2 comprising defining a decoding order for pictures of said sequence of pictures; defining a parameter prevFN equal to the frame number of the previous reference picture in said decoding order; defining a parameter prevRPN equal to the reference picture number of the previous reference picture; defining a maximum value for the frame number; defining a parameter maxFNplus1 equal to said maximum value for the frame number + 1; and calculating the reference picture number of the reference picture as follows:
if (prevFN <= FN)
    RPN = prevRPN + FN - prevFN
else
    RPN = prevRPN + FN - prevFN + maxFNplus1
4. A method according to claim 1 , the reference picture management comprising reference picture list initialization and reference picture list reordering.
5. A method according to claim 4 comprising signaling a parameter AbsDIFFminus1 indicative of the absolute difference between the prediction of the RPN and the RPN value, wherein the prediction of the RPN is an expected value of the RPN; a parameter ASidc indicative of whether the absolute difference is added to or subtracted from the prediction value of the RPN to derive the RPN value; and a parameter PS indicative of the scale of the prediction value of the RPN.
6. A method according to claim 5 comprising setting a parameter RPNcurr to the value of the RPN of a first to-be-reordered reference picture; calculating the prediction value predRPN for the first to-be-reordered reference picture as follows:
predRPN = RPNcurr - PS * MaxFNplus1
setting the prediction value predRPN first equal to the RPN value of the previous reordered reference picture; and updating the predRPN as follows:
if (ASidc == 0)
    predRPN = predRPN - PS * MaxFNplus1
else if (ASidc == 1)
    predRPN = predRPN + PS * MaxFNplus1
7. A method according to claim 1 , the reference picture management comprising reference picture marking.
8. A method according to claim 7 comprising signaling a parameter diffRPNminus1 indicative of the difference between the prediction of the RPN and the RPN value of the to-be-marked reference picture minus 1; and a parameter PS indicative of the scale of the prediction value.
9. A method according to claim 8 comprising setting a parameter RPNcurr to the value of the RPN of a to-be-marked reference picture; and calculating the reference picture number value RPN for the to-be-marked reference picture as follows:
RPN = predRPN - (diffRPNminus1 + 1) = RPNcurr - PS * MaxFNplus1 - (diffRPNminus1 + 1)
10. A method for decoding a sequence of encoded pictures comprising: using one or more pictures as reference pictures, said reference pictures being labeled with a first parameter; obtaining the first parameter from the encoded pictures; and using a reference picture management; wherein all the reference pictures are identified by a second parameter which is derived on the basis of the first parameter.
11. A method according to claim 10, the reference picture management comprising reference picture list initialization and reference picture list reordering.
12. A method according to claim 10, the reference picture management comprising reference picture marking.
13. A method according to claim 10, the reference picture management comprising reference picture reordering and reference picture marking.
14. A signal comprising a sequence of encoded pictures; said sequence comprising one or more reference pictures, said reference pictures being labeled with a first parameter; said signal being used according to claim 1.
15. A hardware for implementing claim 1.
16. A module for encoding a sequence of pictures comprising: a first element for selecting one or more pictures to be used as reference pictures; a second element for labeling the reference pictures with a first parameter; a third element for including the first parameter in a signal to be transmitted to a decoder; and a fourth element for derivation of a second parameter based on the first parameter; wherein all the reference pictures are identified by the second parameter.
17. A module according to claim 16 wherein the module is included in a wireless device.
18. A module for decoding a sequence of encoded pictures, the pictures comprising one or more pictures as reference pictures, said reference pictures being labeled with a first parameter; the module comprising: a first element for obtaining the first parameter from the encoded pictures; a reference picture manager; and a second element for deriving a second parameter on the basis of the first parameter for identifying all the reference pictures.
19. A module according to claim 18 wherein the module is included in a wireless device.
20. A system comprising: an encoding device for encoding a sequence of pictures comprising: a first element for selecting one or more pictures to be used as reference pictures; a second element for labeling the reference pictures with a first parameter; a third element for including the first parameter in a signal to be transmitted to a decoder; a fourth element for derivation of a second parameter based on the first parameter; wherein all the reference pictures are identified by the second parameter; a decoding device for decoding the signal, the decoding device comprising a fifth element for obtaining the first parameter from the encoded pictures; a reference picture manager; and a sixth element for deriving a second parameter on the basis of the first parameter for identifying all the reference pictures.
21. A computer program product comprising software for encoding a sequence of pictures, the software comprising machine executable code stored on a readable medium for execution by a processor, the machine executable code for: using one or more pictures as reference pictures; labeling the reference pictures with a first parameter; including the first parameter in a signal to be transmitted; and deriving a second parameter based on the first parameter; wherein all the reference pictures are identified by the second parameter.
22. A computer program product comprising software for decoding a sequence of pictures, the software comprising machine executable code stored on a readable medium for execution by a processor, the machine executable code for: using one or more pictures as reference pictures, said reference pictures being labeled with a first parameter; obtaining the first parameter from the encoded pictures; using a reference picture management; and deriving a second parameter on the basis of the first parameter; and identifying all the reference pictures by said second parameter.
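The arithmetic recited in claims 5, 6 and 9 can be sketched in Python as follows. This is only an illustrative reading of the claim language, not the applicants' reference implementation. The function names and snake_case argument names are hypothetical stand-ins for RPNcurr, PS, MaxFNplus1, ASidc, AbsDIFFminus1 and diffRPNminus1. The sketch assumes that ASidc equal to 0 means subtraction and 1 means addition (as in the update step of claim 6), that the same convention applies to the signaled absolute difference, and that both branches of the update test ASidc.

```python
from typing import List, Tuple

# One reordering command: (as_idc, abs_diff_minus1, ps).
# The tuple layout is an assumption made for this sketch only.
Command = Tuple[int, int, int]


def decode_reordered_rpns(rpn_curr: int, max_fn_plus1: int,
                          commands: List[Command]) -> List[int]:
    """Derive the RPN of each to-be-reordered reference picture (claims 5-6)."""
    rpns: List[int] = []
    prev_rpn = None  # RPN of the previously reordered picture, if any
    for as_idc, abs_diff_minus1, ps in commands:
        if as_idc not in (0, 1):
            raise ValueError("as_idc must be 0 (subtract) or 1 (add)")
        sign = -1 if as_idc == 0 else 1
        if prev_rpn is None:
            # First picture: predRPN = RPNcurr - PS * MaxFNplus1
            pred_rpn = rpn_curr - ps * max_fn_plus1
        else:
            # Later pictures: start from the previous RPN, then shift
            # by PS * MaxFNplus1 in the direction given by ASidc.
            pred_rpn = prev_rpn + sign * ps * max_fn_plus1
        # Assumed convention: the absolute difference is applied with
        # the same sign that ASidc selects.
        rpn = pred_rpn + sign * (abs_diff_minus1 + 1)
        rpns.append(rpn)
        prev_rpn = rpn
    return rpns


def decode_marked_rpn(rpn_curr: int, max_fn_plus1: int,
                      diff_rpn_minus1: int, ps: int) -> int:
    """Derive the RPN of a to-be-marked reference picture (claim 9)."""
    pred_rpn = rpn_curr - ps * max_fn_plus1
    return pred_rpn - (diff_rpn_minus1 + 1)
```

Under these assumptions, with RPNcurr = 20 and MaxFNplus1 = 16, the commands (ASidc, AbsDIFFminus1, PS) = (0, 2, 0), (1, 0, 0), (0, 1, 1) yield the RPN values 17, 18 and 0, and a marking command with diffRPNminus1 = 3 and PS = 1 yields RPN = 0.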
PCT/FI2005/050359 2004-10-14 2005-10-13 Reference picture management in video coding WO2006040413A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP05799154A EP1800262A4 (en) 2004-10-14 2005-10-13 Reference picture management in video coding

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US61897404P 2004-10-14 2004-10-14
US60/618,974 2004-10-14
US11/116,109 2005-04-26
US11/116,109 US20060083298A1 (en) 2004-10-14 2005-04-26 Reference picture management in video coding

Publications (1)

Publication Number Publication Date
WO2006040413A1 (en) 2006-04-20

Family

ID=36148077

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/FI2005/050359 WO2006040413A1 (en) 2004-10-14 2005-10-13 Reference picture management in video coding

Country Status (3)

Country Link
US (1) US20060083298A1 (en)
EP (1) EP1800262A4 (en)
WO (1) WO2006040413A1 (en)

Families Citing this family (75)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9325781B2 (en) 2005-01-31 2016-04-26 Invention Science Fund I, Llc Audio sharing
US20060170956A1 (en) 2005-01-31 2006-08-03 Jung Edward K Shared image devices
US9489717B2 (en) 2005-01-31 2016-11-08 Invention Science Fund I, Llc Shared image device
US9082456B2 (en) 2005-01-31 2015-07-14 The Invention Science Fund I Llc Shared image device designation
US9124729B2 (en) 2005-01-31 2015-09-01 The Invention Science Fund I, Llc Shared image device synchronization or designation
US9910341B2 (en) 2005-01-31 2018-03-06 The Invention Science Fund I, Llc Shared image device designation
US8964054B2 (en) 2006-08-18 2015-02-24 The Invention Science Fund I, Llc Capturing selected image objects
US9451200B2 (en) 2005-06-02 2016-09-20 Invention Science Fund I, Llc Storage access technique for captured data
US9967424B2 (en) 2005-06-02 2018-05-08 Invention Science Fund I, Llc Data storage usage protocol
US9001215B2 (en) 2005-06-02 2015-04-07 The Invention Science Fund I, Llc Estimating shared image device operational capabilities or resources
US9191611B2 (en) 2005-06-02 2015-11-17 Invention Science Fund I, Llc Conditional alteration of a saved image
US9076208B2 (en) 2006-02-28 2015-07-07 The Invention Science Fund I, Llc Imagery processing
US20070222865A1 (en) * 2006-03-15 2007-09-27 Searete Llc, A Limited Liability Corporation Of The State Of Delaware Enhanced video/still image correlation
US9621749B2 (en) 2005-06-02 2017-04-11 Invention Science Fund I, Llc Capturing selected image objects
US9819490B2 (en) 2005-05-04 2017-11-14 Invention Science Fund I, Llc Regional proximity for shared image device(s)
US10003762B2 (en) 2005-04-26 2018-06-19 Invention Science Fund I, Llc Shared image devices
KR101335214B1 (en) 2006-03-15 2013-11-29 브리티쉬 텔리커뮤니케이션즈 파블릭 리미티드 캄퍼니 Video coding
US20100091845A1 (en) * 2006-03-30 2010-04-15 Byeong Moon Jeon Method and apparatus for decoding/encoding a video signal
BRPI0710048A2 (en) * 2006-03-30 2011-08-02 Lg Electronics Inc method and apparatus for decoding / encoding a video signal
WO2008023967A1 (en) * 2006-08-25 2008-02-28 Lg Electronics Inc A method and apparatus for decoding/encoding a video signal
US20080050608A1 (en) * 2006-08-25 2008-02-28 Mcfaul Surry D Metal coating process and product
AU2014210664A1 (en) * 2006-10-16 2014-08-28 Nokia Technologies Oy System and method for implementing efficient decoded buffer management in multi-view video coding
CN101523920B (en) * 2006-10-16 2013-12-04 汤姆森许可贸易公司 Method for using a network abstract layer unit to signal an instantaneous decoding refresh during a video operation
EP2080380A2 (en) * 2006-10-24 2009-07-22 Thomson Licensing Picture identification for multi-view video coding
US8875199B2 (en) * 2006-11-13 2014-10-28 Cisco Technology, Inc. Indicating picture usefulness for playback optimization
US20090180546A1 (en) 2008-01-09 2009-07-16 Rodriguez Arturo A Assistance for processing pictures in concatenated video streams
US8873932B2 (en) 2007-12-11 2014-10-28 Cisco Technology, Inc. Inferential processing to ascertain plural levels of picture interdependencies
US8416859B2 (en) * 2006-11-13 2013-04-09 Cisco Technology, Inc. Signalling and extraction in compressed video of pictures belonging to interdependency tiers
BRPI0809210A2 (en) * 2007-04-04 2014-09-02 Thomson Licensing REFERENCE IMAGE LIST MANAGEMENT
US8494049B2 (en) * 2007-04-09 2013-07-23 Cisco Technology, Inc. Long term reference frame management with error video feedback for compressed video communication
US8576918B2 (en) * 2007-07-09 2013-11-05 Broadcom Corporation Method and apparatus for signaling and decoding AVS1-P2 bitstreams of different versions
US8958486B2 (en) * 2007-07-31 2015-02-17 Cisco Technology, Inc. Simultaneous processing of media and redundancy streams for mitigating impairments
US8804845B2 (en) * 2007-07-31 2014-08-12 Cisco Technology, Inc. Non-enhancing media redundancy coding for mitigating transmission impairments
EP2213097A2 (en) * 2007-10-16 2010-08-04 Cisco Technology, Inc. Conveyance of concatenation properties and picture orderness in a video stream
US8416858B2 (en) * 2008-02-29 2013-04-09 Cisco Technology, Inc. Signalling picture encoding schemes and associated picture properties
US8923285B2 (en) * 2008-04-30 2014-12-30 Qualcomm Incorporated Apparatus and methods for transmitting data over a wireless mesh network
US8886022B2 (en) 2008-06-12 2014-11-11 Cisco Technology, Inc. Picture interdependencies signals in context of MMCO to assist stream manipulation
US8705631B2 (en) * 2008-06-17 2014-04-22 Cisco Technology, Inc. Time-shifted transport of multi-latticed video for resiliency from burst-error effects
US8699578B2 (en) 2008-06-17 2014-04-15 Cisco Technology, Inc. Methods and systems for processing multi-latticed video streams
US8971402B2 (en) 2008-06-17 2015-03-03 Cisco Technology, Inc. Processing of impaired and incomplete multi-latticed video streams
EP2297964A4 (en) * 2008-06-25 2017-01-18 Cisco Technology, Inc. Support for blocking trick mode operations
EP2356812B1 (en) 2008-11-12 2015-06-10 Cisco Technology, Inc. Processing of a video program having plural processed representations of a single video signal for reconstruction and output
US8326131B2 (en) * 2009-02-20 2012-12-04 Cisco Technology, Inc. Signalling of decodable sub-sequences
US20100218232A1 (en) * 2009-02-25 2010-08-26 Cisco Technology, Inc. Signalling of auxiliary information that assists processing of video according to various formats
US8782261B1 (en) 2009-04-03 2014-07-15 Cisco Technology, Inc. System and method for authorization of segment boundary notifications
US8949883B2 (en) * 2009-05-12 2015-02-03 Cisco Technology, Inc. Signalling buffer characteristics for splicing operations of video streams
US8279926B2 (en) 2009-06-18 2012-10-02 Cisco Technology, Inc. Dynamic streaming with latticed representations of video
JP2011082683A (en) * 2009-10-05 2011-04-21 Sony Corp Image processing apparatus, image processing method, and program
US9398308B2 (en) 2010-07-28 2016-07-19 Qualcomm Incorporated Coding motion prediction direction in video coding
US9066102B2 (en) * 2010-11-17 2015-06-23 Qualcomm Incorporated Reference picture list construction for generalized P/B frames in video coding
KR101852789B1 (en) 2011-04-26 2018-06-04 엘지전자 주식회사 Method for managing a reference picture list, and apparatus using same
CN103843349B (en) 2011-08-25 2017-03-01 太阳专利托管公司 Method and apparatus video being encoded and being decoded for the description of usage cycles buffer
BR112013020486B1 (en) 2011-09-07 2022-07-19 Sun Patent Trust IMAGE ENCODING METHOD, IMAGE DECODING METHOD, IMAGE ENCODING APPARATUS, IMAGE DECODING APPARATUS AND IMAGE ENCODING AND DECODING APPARATUS
KR102011157B1 (en) 2011-09-19 2019-10-21 선 페이턴트 트러스트 Image encoding method, image decoding method, image encoding device, image decoding device, and image encoding-decoding device
US9131245B2 (en) 2011-09-23 2015-09-08 Qualcomm Incorporated Reference picture list construction for video coding
CN103843340B (en) * 2011-09-29 2018-01-19 瑞典爱立信有限公司 Reference picture list processing
JP5698644B2 (en) * 2011-10-18 2015-04-08 株式会社Nttドコモ Video predictive encoding method, video predictive encoding device, video predictive encoding program, video predictive decoding method, video predictive decoding device, and video predictive decode program
PL3742735T3 (en) * 2011-10-19 2022-11-14 Sun Patent Trust Image decoding method and image decoding apparatus
US9264717B2 (en) 2011-10-31 2016-02-16 Qualcomm Incorporated Random access with advanced decoded picture buffer (DPB) management in video coding
US10003817B2 (en) 2011-11-07 2018-06-19 Microsoft Technology Licensing, Llc Signaling of state information for a decoded picture buffer and reference picture lists
US9918080B2 (en) 2011-11-08 2018-03-13 Nokia Technologies Oy Reference picture handling
US9369710B2 (en) 2012-02-06 2016-06-14 Qualcomm Incorporated Reference picture list modification for video coding
EP3926832B1 (en) * 2012-04-15 2023-11-22 Samsung Electronics Co., Ltd. Video decoding apparatus using parameter update for de-binarization of entropy coded transformation coefficient, and encoding method using same for binarization
WO2013156679A1 (en) * 2012-04-16 2013-10-24 Nokia Corporation Method and apparatus for video coding
MX342497B (en) 2012-06-29 2016-10-03 Sony Corp Coding device, and coding method.
US9313500B2 (en) 2012-09-30 2016-04-12 Microsoft Technology Licensing, Llc Conditional signalling of reference picture list modification information
CN103873872B (en) * 2012-12-13 2017-07-07 联发科技(新加坡)私人有限公司 Reference pictures management method and device
WO2014089805A1 (en) * 2012-12-13 2014-06-19 Mediatek Singapore Pte. Ltd. A new reference management method for video coding
CA2909566C (en) 2013-04-17 2018-07-03 Nokia Technologies Oy An apparatus, a method and a computer program for video coding and decoding
JP6365924B2 (en) * 2013-05-09 2018-08-01 サン パテント トラスト Image decoding method and image decoding apparatus
US9807407B2 (en) * 2013-12-02 2017-10-31 Qualcomm Incorporated Reference picture selection
KR101610725B1 (en) 2014-09-23 2016-04-08 삼성전자주식회사 Method and apparatus for video stream encoding to control reference image data according to reference frequency, method and apparatus for video stream decoding to control reference image data according to reference frequency
CN113597768A (en) * 2019-01-28 2021-11-02 Op方案有限责任公司 Online and offline selection of extended long-term reference picture preservation
US11595652B2 (en) 2019-01-28 2023-02-28 Op Solutions, Llc Explicit signaling of extended long term reference picture retention
CN112532908B (en) * 2019-09-19 2022-07-19 华为技术有限公司 Video image transmission method, sending equipment, video call method and equipment

Citations (5)

Publication number Priority date Publication date Assignee Title
EP0851685A2 (en) * 1996-12-27 1998-07-01 Oki Electric Industry Co., Ltd. Video coder, decoder and transmission system
EP0868086A1 (en) * 1997-03-24 1998-09-30 Oki Electric Industry Co., Ltd. Video decoder
US20010040700A1 (en) * 2000-05-15 2001-11-15 Miska Hannuksela Video coding
WO2003094496A2 (en) * 2002-05-02 2003-11-13 Nokia Corporation Video coding
WO2004015999A1 (en) * 2002-08-08 2004-02-19 Matsushita Electric Industrial Co., Ltd. Moving picture encoding method and decoding method

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
JP3264290B2 (en) * 1992-09-22 2002-03-11 ソニー株式会社 Decoding method and decoding device
US6633673B1 (en) * 1999-06-17 2003-10-14 Hewlett-Packard Development Company, L.P. Fast fade operation on MPEG video or other compressed data
FI114527B (en) * 2002-01-23 2004-10-29 Nokia Corp Grouping of picture frames in video encoding

Non-Patent Citations (1)

Title
See also references of EP1800262A4 *

Cited By (11)

Publication number Priority date Publication date Assignee Title
CN101611633B (en) * 2006-07-06 2012-10-03 汤姆逊许可证公司 Method and apparatus for decoupling frame number and/or picture order count (poc) for multi-view video encoding and decoding
WO2008047316A1 (en) * 2006-10-20 2008-04-24 Nokia Corporation Virtual decoded reference picture marking and reference picture list
AU2007311489B2 (en) * 2006-10-20 2012-05-24 Nokia Technologies Oy Virtual decoded reference picture marking and reference picture list
US9986256B2 (en) 2006-10-20 2018-05-29 Nokia Technologies Oy Virtual decoded reference picture marking and reference picture list
WO2012102973A1 (en) * 2011-01-24 2012-08-02 Qualcomm Incorporated Single reference picture list construction for video coding
CN103339936A (en) * 2011-01-24 2013-10-02 高通股份有限公司 Single reference picture list construction for video coding
US9008181B2 (en) 2011-01-24 2015-04-14 Qualcomm Incorporated Single reference picture list utilization for interprediction video coding
CN103339936B (en) * 2011-01-24 2017-02-08 高通股份有限公司 Single reference picture list construction for video coding
EP3767950B1 (en) 2011-10-13 2022-03-30 Dolby International AB Tracking a reference picture based on a designated picture on an electronic device
US11943466B2 (en) 2011-10-13 2024-03-26 Dolby International Ab Tracking a reference picture on an electronic device
TWI488502B (en) * 2012-12-06 2015-06-11 Acer Inc Video editing method and video editing device

Also Published As

Publication number Publication date
US20060083298A1 (en) 2006-04-20
EP1800262A4 (en) 2009-10-28
EP1800262A1 (en) 2007-06-27

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2005799154

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2808/DELNP/2007

Country of ref document: IN

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 200580040403.8

Country of ref document: CN

WWP Wipo information: published in national office

Ref document number: 2005799154

Country of ref document: EP