CN113615184A - Explicit signaling of extended long-term reference picture retention - Google Patents
Explicit signaling of extended long-term reference picture retention
- Publication number: CN113615184A (application CN202080021301.6A)
- Authority: CN (China)
- Prior art keywords: long-term reference, decoder, frame, bitstream
- Legal status: Granted
Classifications
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
- H04N19/70—Methods or arrangements characterised by syntax aspects related to video coding, e.g. related to compression standards
- H04N19/109—Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
- H04N19/124—Quantisation
- H04N19/13—Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
- H04N19/58—Motion compensation with long-term prediction, i.e. the reference frame for a current frame not being the temporally closest one
- H04N19/593—Predictive coding involving spatial prediction techniques
- H04N19/86—Pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
- H04N19/96—Tree coding, e.g. quad-tree coding

All of the above fall under H04N19/00, methods or arrangements for coding, decoding, compressing or decompressing digital video signals.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
A decoder includes circuitry configured to receive a bitstream, store a plurality of long-term reference frames in a reference list, retain the long-term reference frames in the reference list for a length of time based on a retention time, and decode at least a portion of a video using the long-term reference frames retained in the reference list. Related apparatus, systems, techniques, and articles are also described.
Description
Cross Reference to Related Applications
This application claims priority to U.S. Provisional Patent Application No. 62/797,806, entitled "EXPLICIT SIGNALING OF EXTENDED LONG TERM REFERENCE PICTURE RETENTION," filed on January 28, 2019, which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates generally to the field of video compression. In particular, the present invention relates to explicit signaling of extended long-term reference picture retention.
Background
A video codec may include electronic circuitry or software that compresses or decompresses digital video. It can convert uncompressed video to a compressed format and vice versa. In video compression, the device that compresses the video (and/or performs some function thereof) may generally be referred to as an encoder, and the device that decompresses the video (and/or performs some function thereof) may be referred to as a decoder.
The format of the compressed data may conform to a standard video compression specification. Compression can be lossy in that the compressed video lacks some of the information present in the original video. As a result, the decompressed video may be of lower quality than the original uncompressed video because there is insufficient information to accurately reconstruct the original video.
There may be a complex relationship between video quality, the amount of data used to represent the video (e.g., as determined by bit rate), the complexity of encoding and decoding algorithms, susceptibility to data loss and errors, ease of editing, random access, end-to-end delay (e.g., latency), and so forth.
Motion compensation may include an approach to predict a video frame or a portion thereof, given a reference frame (e.g., a previous and/or future frame), by accounting for motion of the camera and/or of objects in the video. It can be employed in the encoding and decoding of video data for video compression, for example under standards such as MPEG-2 and MPEG-4 Part 10 (also known as Advanced Video Coding (AVC) and H.264). Motion compensation may describe a picture in terms of a transformation of a reference picture to the current picture. The reference picture may be previous in time or from the future when compared to the current picture, or it may include a long-term reference (LTR) frame. Compression efficiency may be improved when images can be accurately synthesized from previously transmitted and/or stored images.
Long-term reference (LTR) frames have been used in video coding standards such as MPEG-2, H.264 (also known as AVC or MPEG-4 Part 10), and H.265 (also known as High Efficiency Video Coding (HEVC)). A frame marked as an LTR frame in the video bitstream may be used as a reference until it is explicitly removed by bitstream signaling. LTR frames improve prediction and compression efficiency in scenes that have a static background over a long period (e.g., the background in video conferencing or in parking-lot surveillance video). Over time, however, the background of a scene may gradually change (e.g., a car parked in an open space becomes part of the background scene). LTR frames are therefore updated to allow better prediction and improved compression performance.
Current standards, such as H.264 and H.265, allow an LTR frame to be updated by signaling that a newly decoded frame is to be saved and made available as a reference frame. The update is signaled by the encoder, and the entire frame is updated; the cost of updating an entire frame can be high. Moreover, when the LTR frame is updated, the previous LTR frame is discarded. If the static background associated with the discarded LTR frame appears again in the video (e.g., when the video switches from a first scene to a second scene and then back to the first scene), the previous LTR frame must be encoded again in the bitstream, which reduces compression efficiency.
Disclosure of Invention
In one aspect, a decoder includes circuitry configured to receive a bitstream, store a plurality of long-term reference frames in a reference list, retain the long-term reference frames in the reference list for a length of time based on a retention time, and decode at least a portion of video using the long-term reference frames retained in the reference list.
In another aspect, a method includes receiving, by a decoder, a bitstream. The method includes storing, by a decoder, a plurality of long-term reference frames in a reference list. The method includes retaining, by a decoder, a long-term reference frame in a reference list for a length of time based on a retention time. The method includes decoding, by a decoder, at least a portion of a video using long-term reference frames retained in a reference list.
The details of one or more variations of the subject matter described herein are set forth in the accompanying drawings and the description below. Other features and advantages of the subject matter described herein will be apparent from the description and drawings, and from the claims.
Drawings
For the purpose of illustrating the invention, the drawings show aspects of one or more embodiments of the invention. It should be understood, however, that the present invention is not limited to the precise arrangements and instrumentalities shown in the attached drawings, wherein:
FIG. 1 shows an example reference list for frame prediction over a long period of time;
FIG. 2 is a process flow diagram illustrating an example process of extended long-term reference (eLTR) frame retention, in which eLTR frames are retained in a reference list;
FIG. 3 is a system block diagram illustrating an example decoder capable of decoding a bitstream using eLTR frames retained in a reference list;
FIG. 4 is a process flow diagram illustrating an example process for encoding video using eLTR frames retained in a reference list that can improve compression efficiency compared to some prior approaches in accordance with some aspects of the present subject matter;
FIG. 5 is a system block diagram illustrating an example video encoder capable of signaling eLTR retention in a reference list; and
FIG. 6 is a block diagram of a computing system that may be used to implement any one or more of the methods disclosed herein and any one or more portions thereof.
The drawings are not necessarily to scale and may be shown with phantom lines, diagrammatic representations and fragmentary views. In certain instances, details that are not necessary for an understanding of the embodiments or that render other details difficult to perceive may have been omitted. Like reference symbols in the various drawings indicate like elements.
Detailed Description
Where certain portions of a frame are occluded and then uncovered repeatedly over time, long-term reference (LTR) pictures may be used to better predict the video frame. Conventionally, an LTR is used for the duration of a scene or a group of pictures and is then replaced or discarded. Some embodiments of the current subject matter extend the utility of the LTR by selecting the best candidate LTR to retain in the reference list. In some embodiments, an explicitly signaled extended long-term reference (eLTR) frame may be retained in the reference list for an explicitly signaled duration. Some embodiments of the present subject matter may provide significant compression efficiency gains compared to some existing approaches.
Some embodiments of the present subject matter may enable selection and retention of eLTR frames in video coding. The eLTR may be retained in the picture reference list, which may be used by the current frame or a group of frames for prediction. The eLTR may remain in the reference list even though all other frames in the list may change over a relatively short period of time. For example, FIG. 1 shows an example reference list for prediction from frames over a long time period. As a non-limiting illustrative example, the video frames shown shaded may be reconstructed using reference frames. The reference list may contain frames that change over time as well as the retained eLTR.
In some embodiments, still referring to FIG. 1, the encoder performs eLTR selection and retention time calculation. The selected frame and retention time may be signaled to the decoder, for example, using a pair (eLTRn, TRn) indicating the index of the eLTR frame (eLTRn) and the retention time of frame n (TRn). The decoder may retain frame eLTRn in the reference list for the time period TRn. After the eLTRn frame has resided in the reference list for at least TRn, it may be marked as unavailable for further use. In some embodiments, the eLTRn frame may remain in memory, but in an unavailable state. In some implementations, the encoder may explicitly signal the decoder to mark the eLTRn frame as available or unavailable. For example, after the retention time TRn has elapsed, an eLTRn frame previously marked as unavailable may be marked as available again. This property may enable an eLTRn to be reused in the future, for example for video containing scenes that switch back and forth. In some embodiments, the encoder may include a signal in the bitstream for the decoder to remove the eLTRn frame from memory. The decoder may remove the eLTRn frame from the reference list and from memory based on such a signal.
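As a non-limiting illustration of the retention behavior just described, the following Python sketch models a decoder-side reference list. It is a minimal sketch under the assumptions that retention time is counted in decoded frames and that expired frames stay in memory in an unavailable state; all names (ELTRReferenceList, retain, and so on) are hypothetical and are not part of any standard API.

```python
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class ELTREntry:
    """A retained extended long-term reference (eLTR) frame."""
    frame: Any            # decoded picture data
    retention_time: int   # TRn, counted here in decoded frames
    age: int = 0          # frames decoded since retention began
    available: bool = True

class ELTRReferenceList:
    """Hypothetical decoder-side model of eLTR retention (illustrative only)."""

    def __init__(self) -> None:
        self.entries: Dict[int, ELTREntry] = {}  # eLTRn -> entry

    def retain(self, eltr_n: int, frame: Any, tr_n: int) -> None:
        """Store an eLTR frame together with its signaled retention time TRn."""
        self.entries[eltr_n] = ELTREntry(frame, tr_n)

    def on_frame_decoded(self) -> None:
        """Advance time by one decoded frame; entries whose retention time has
        elapsed are marked unavailable but stay in memory, so that a later
        explicit signal can make them available again."""
        for entry in self.entries.values():
            entry.age += 1
            if entry.age >= entry.retention_time:
                entry.available = False

    def mark(self, eltr_n: int, available: bool) -> None:
        """Apply an explicit availability signal from the bitstream."""
        self.entries[eltr_n].available = available

    def remove(self, eltr_n: int) -> None:
        """Apply an explicit removal signal: drop the frame from list and memory."""
        self.entries.pop(eltr_n, None)

    def reference(self, eltr_n: int):
        """Return the frame if it is currently usable as a prediction reference."""
        entry = self.entries.get(eltr_n)
        return entry.frame if entry is not None and entry.available else None
```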
FIG. 2 is a process flow diagram illustrating a non-limiting example of a process 200 of eLTR frame retention, in which eLTR frames are retained in a reference list. Such eLTR retention may improve compression efficiency compared to some existing video encoding and decoding approaches.
In step 210, still referring to FIG. 2, the decoder receives a bitstream. The bitstream may include, for example, data found in a stream of bits that is the input to a decoder when data compression is employed, and may include information necessary to decode the video. Receiving may include extracting and/or parsing blocks and associated signaling information from the bitstream. In some implementations, receiving the bitstream may include parsing eLTR frames, the indices of those frames (eLTRn), and the associated retention times (TRn), where a retention time is based on frames decoded and/or time within the video.
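What such parsing could look like is sketched below. The field layout is invented purely for illustration, since no bitstream syntax is specified here: a one-byte pair count followed by big-endian 16-bit (eLTRn, TRn) fields.

```python
import struct
from typing import List, Tuple

def parse_eltr_signals(payload: bytes) -> List[Tuple[int, int]]:
    """Parse hypothetical (eLTRn, TRn) pairs from a header payload.

    Assumed layout (illustrative only): one byte giving the number of pairs,
    then, per pair, two big-endian 16-bit fields: eLTRn followed by TRn."""
    count = payload[0]
    pairs = []
    offset = 1
    for _ in range(count):
        eltr_n, tr_n = struct.unpack_from(">HH", payload, offset)
        pairs.append((eltr_n, tr_n))
        offset += 4
    return pairs

# Example: one pair, eLTRn = 3, retained for TRn = 600 decoded frames.
assert parse_eltr_signals(bytes([1, 0, 3, 2, 88])) == [(3, 600)]
```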
With continued reference to FIG. 2, at step 220, the eLTR frames may be stored in a reference picture list.
At step 230, still referring to FIG. 2, a stored eLTR frame may be retained (e.g., held) in the reference list for a length of time based on its associated retention time (TRn).
At step 240, still referring to FIG. 2, at least a portion of the video may be decoded from the bitstream. Decoding may include decoding a current block. For example, a received current coded block contained in the bitstream may be decoded, for example, by using inter prediction. Decoding via inter prediction may include using a previous frame, a future frame, and/or an eLTR frame as a reference for computing a prediction, which may be combined with a residual contained in the bitstream.
With further reference to FIG. 2, for a subsequent current block, the eLTR frame may be used as a reference frame for inter prediction, as shown in the sketch after this paragraph. For example, a second coded block may be received. Whether inter prediction mode is enabled for the second coded block may be determined; the determination may include receiving an explicit signal from the bitstream indicating whether inter prediction mode is enabled. A second decoded block may be determined using the eLTR frame as a reference frame and according to the inter prediction mode. For example, decoding via inter prediction may include using the eLTR frame as a reference to compute a prediction, which may be combined with a residual contained in the bitstream.
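A minimal sketch of this inter-prediction step is shown below, assuming whole-pixel motion vectors, 8-bit samples, and in-bounds block access; sub-pixel interpolation, weighted prediction, and boundary handling are omitted. A retained eLTR frame obtained from the reference list (e.g., via reference(eLTRn) in the earlier sketch) could be passed as reference_frame.

```python
import numpy as np

def decode_inter_block(reference_frame: np.ndarray, residual: np.ndarray,
                       x: int, y: int, mv_x: int, mv_y: int) -> np.ndarray:
    """Reconstruct one block as motion-compensated prediction plus residual.

    reference_frame may be any frame in the reference list, including a
    retained eLTR frame. Assumes whole-pixel motion and an in-bounds block."""
    h, w = residual.shape
    prediction = reference_frame[y + mv_y : y + mv_y + h,
                                 x + mv_x : x + mv_x + w]
    reconstructed = prediction.astype(np.int32) + residual
    return np.clip(reconstructed, 0, 255).astype(np.uint8)  # 8-bit samples assumed
```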
FIG. 3 is a system block diagram illustrating a non-limiting example of a decoder 300 capable of decoding a bitstream 370 using eLTR frames retained in a reference list. The decoder 300 may include an entropy decoder processor 310, an inverse quantization and inverse transform processor 320, a deblocking filter 330, a frame buffer 340, a motion compensation processor 350, and an intra prediction processor 360. In some embodiments, the bitstream 370 may include parameters (e.g., fields in a bitstream header) that represent an eLTR index (eLTRn) and a retention time (TRn). The motion compensation processor 350 may reconstruct pixel information using eLTR frames and retain each eLTR frame according to its associated retention time (TRn). For example, when an eLTR frame (eLTRn) is received and retained in the reference list, it may be used as a reference for inter prediction mode at least during the associated retention time.
In operation, still referring to FIG. 3, the bitstream 370 may be received by the decoder 300 and input to the entropy decoder processor 310, which may entropy decode the bitstream into quantized coefficients. The quantized coefficients may be provided to the inverse quantization and inverse transform processor 320, which may perform inverse quantization and an inverse transform to create a residual signal; the residual signal may be added to the output of the motion compensation processor 350 or the intra prediction processor 360, according to the processing mode. The outputs of the motion compensation processor 350 and the intra prediction processor 360 may include block predictions based on previously decoded blocks and/or eLTR frames retained in the reference list. The sum of the prediction and the residual may be processed by the deblocking filter 330 and stored in the frame buffer 340.
FIG. 4 is a process flow diagram illustrating a non-limiting example of a process 400 of encoding video with eLTR frames retained in a reference list, which may improve compression efficiency compared to some existing approaches, in accordance with some aspects of the present subject matter. At step 410, a sequence of video frames may be encoded, including determining one or more eLTR frames. At step 420, an eLTR frame retention time (TRn) may be determined, for example, based on the length of time an eLTR frame is used by the encoder/decoder, where, for example, time is measured in frames decoded in the video.
At step 430, still referring to FIG. 4, additional signaling parameters may be determined. For example, it may be determined whether and when each eLTR frame is to be marked as unavailable or available, and whether and when each eLTR frame should be removed from memory.
At step 440, still referring to FIG. 4, the eLTR retention times and additional signaling parameters may be included in the bitstream.
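One plausible encoder-side derivation of these values is sketched below, under the assumption that the encoder already knows (e.g., from look-ahead or a completed analysis pass) when each eLTR frame entered the reference list and at which frames it was used as a reference; the helper and its inputs are illustrative, not part of any standard.

```python
from typing import Dict, List, Tuple

def compute_retention_signals(
    eltr_usage: Dict[int, Tuple[int, List[int]]]
) -> List[Tuple[int, int]]:
    """Derive (eLTRn, TRn) signaling pairs from observed reference usage.

    eltr_usage maps each eLTR index to (frame at which the eLTR entered the
    reference list, frame numbers at which it was used as a reference).
    TRn is expressed in decoded frames, per the description above."""
    signals = []
    for eltr_n, (start_frame, used_at) in eltr_usage.items():
        tr_n = max(used_at) - start_frame + 1  # retain through the last use
        signals.append((eltr_n, tr_n))
    return signals

# Example: eLTR 0 entered the list at frame 10 and was last used at frame 250,
# so it would be retained for 241 decoded frames.
assert compute_retention_signals({0: (10, [12, 40, 250])}) == [(0, 241)]
```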
FIG. 5 is a system block diagram illustrating a non-limiting example of a video encoder 500 capable of signaling eLTR retention in a reference list. The example video encoder 500 receives an input video 505, which may be initially segmented or divided according to a processing scheme, such as a tree-structured macroblock partitioning scheme (e.g., quadtree plus binary tree). An example of a tree-structured macroblock partitioning scheme may include partitioning a picture frame into large block elements, referred to for purposes of this disclosure as coding tree units (CTUs). In some embodiments, each CTU may be further partitioned one or more times into a number of sub-blocks called coding units (CUs). The result of this partitioning may include a group of sub-blocks, referred to for purposes of this disclosure as prediction units (PUs). Transform units (TUs) may also be utilized.
Still referring to FIG. 5, the example video encoder 500 may include an intra prediction processor 515, a motion estimation/compensation processor 520 (also referred to as an inter prediction processor) capable of supporting eLTR frame retention, a transform/quantization processor 525, an inverse quantization/inverse transform processor 530, an in-loop filter 535, a decoded picture buffer 540, and an entropy encoding processor 545. In some embodiments, the motion estimation/compensation processor 520 may determine the eLTR retention times and additional signaling parameters. Bitstream parameters representing eLTR frame retention and the additional parameters may be input to the entropy encoding processor 545 for inclusion in the output bitstream 550.
In operation, and with continued reference to FIG. 5, for each block of a frame of the input video 505, it may be determined whether the block is to be processed via intra-picture prediction or via motion estimation/compensation. A block may be provided to the intra prediction processor 515 or to the motion estimation/compensation processor 520. If the block is to be processed via intra prediction, the intra prediction processor 515 may perform processing to output a predictor. If the block is to be processed via motion estimation/compensation, the motion estimation/compensation processor 520 may perform processing that includes using an eLTR frame as a reference for inter prediction, if applicable.
With continued reference to FIG. 5, a residual may be formed by subtracting the predictor from the input video. The residual may be received by the transform/quantization processor 525, which may perform a transform process (e.g., a discrete cosine transform (DCT)) to produce coefficients that may then be quantized. The quantized coefficients and any associated signaling information may be provided to the entropy encoding processor 545 for entropy encoding and inclusion in the output bitstream 550; the entropy encoding processor 545 may support encoding of signaling information related to eLTR frame retention. In addition, the quantized coefficients may be provided to the inverse quantization/inverse transform processor 530, which may reproduce pixels that may be combined with the predictor and processed by the in-loop filter 535, the output of which may be stored in the decoded picture buffer 540 for use by the motion estimation/compensation processor 520 capable of supporting eLTR frame retention.
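As a toy illustration of this residual path (not the actual transform or quantizer design, which is not specified here), the following sketch applies a 2-D DCT with uniform scalar quantization and mirrors the decoder-side inverse steps:

```python
import numpy as np
from scipy.fft import dctn, idctn

def residual_roundtrip(block: np.ndarray, prediction: np.ndarray,
                       qstep: float = 16.0):
    """Toy residual pipeline: transform, quantize, then reconstruct.

    Returns the quantized coefficients (what would be entropy coded) and
    the reconstructed residual (what the decoder would add back)."""
    residual = block.astype(np.float64) - prediction.astype(np.float64)
    coeffs = dctn(residual, norm="ortho")            # forward 2-D DCT
    quantized = np.round(coeffs / qstep)             # uniform scalar quantizer
    recon_residual = idctn(quantized * qstep, norm="ortho")  # inverse path
    return quantized.astype(np.int32), recon_residual
```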
Still referring to FIG. 5, although some variations have been described in detail above, other modifications or additions are possible. For example, in some implementations, the current block may include any symmetric block (8×8, 16×16, 32×32, 64×64, 128×128, and so on) as well as any asymmetric block (8×4, 16×8, and so on).
In some embodiments, and with continued reference to FIG. 5, a quadtree plus binary decision tree (QTBT) may be implemented. In the QTBT, at the coding tree unit level, the partition parameters of the QTBT may be dynamically derived to adapt to local characteristics without transmitting any overhead. Subsequently, at the coding unit level, a joint-classifier decision tree structure may eliminate unnecessary iterations and control the risk of misprediction.
In some embodiments, the decoder may include an eLTR frame retention processor (not shown) that determines whether and when eLTR frames are marked as unavailable or removed from the reference list.
In some embodiments, the current subject matter may be applied to broadcast (and similar) scenarios in which the decoder tunes in in the middle of a retention period. To support standard playback, the encoder may mark (e)LTR frames as instantaneous decoding refresh (IDR) type frames. In this case, streaming may resume after the next available (e)LTR (IDR) frame. This approach may be similar to some current broadcast standards that designate intra frames as IDR frames.
The subject matter described herein provides a number of technical advantages. For example, some embodiments of the current subject matter may provide for decoding a block using an eLTR frame retained in the reference list. Such an approach may improve compression efficiency.
It should be noted that any one or more of the aspects and embodiments described herein may be conveniently implemented using digital electronic circuitry, integrated circuitry, specially designed application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), computer hardware, firmware, software, and/or combinations thereof, as implemented and/or embodied in one or more machines (e.g., one or more computing devices utilized as a user computing device for an electronic document, one or more server devices such as a document server, etc.). These various aspects or features may include implementation in one or more computer programs and/or software that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device. Appropriate software coding can readily be prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those skilled in the software art. The aspects and embodiments discussed above that employ software and/or software modules may also include suitable hardware for facilitating the implementation of the machine-executable instructions of the software and/or software modules.
Such software may be a computer program product employing a machine-readable storage medium. A machine-readable storage medium may be any medium that can store and/or encode a sequence of instructions for execution by a machine (e.g., a computing device) and that cause the machine to perform any one of the methods and/or embodiments described herein. Examples of a machine-readable storage medium include, but are not limited to, magnetic disks, optical disks (e.g., CD-R, DVD-R, etc.), magneto-optical disks, read-only memory "ROM" devices, random-access memory "RAM" devices, magnetic cards, optical cards, solid-state memory devices, EPROM, EEPROM, Programmable Logic Devices (PLD), and/or any combination thereof. Machine-readable media as used herein is intended to include both a single medium and a collection of physically separate media, such as a collection of optical disks or one or more hard disk drives in combination with computer memory. As used herein, a machine-readable storage medium does not include transitory forms of signal transmission.
Such software may also include information (e.g., data) carried as a data signal on a data carrier (e.g., a carrier wave). For example, machine-executable information may be included as data-bearing signals embodied in data carriers, where the signals encode a sequence of instructions or portions thereof for execution by a machine (e.g., a computing device) and any related information (e.g., data structures and data) that cause the machine to perform any one of the methods and/or embodiments described herein.
Examples of a computing device include, but are not limited to, an electronic book reading device, a computer workstation, a terminal computer, a server computer, a handheld device (e.g., tablet computer, smartphone, etc.), a network appliance, a network router, network switch, network bridge, any machine capable of executing a sequence of instructions that specify actions to be taken by that machine, and any combination thereof. In one example, the computing device may include and/or be included in a kiosk.
FIG. 6 shows a diagram of one embodiment of a computing device in the exemplary form of a computer system 600 within which a set of instructions, for causing a control system to perform any one or more aspects and/or methods of the present disclosure, may be executed. It is also contemplated that multiple computing devices may be utilized to implement a specifically configured set of instructions for causing one or more devices to perform any one or more aspects and/or methods of the present disclosure. The computer system 600 includes a processor 604 and a memory 608, the processor 604 and the memory 608 communicating with each other and with other components via a bus 612. The bus 612 may include any of several types of bus structures including, but not limited to, a memory bus, a memory controller, a peripheral bus, a local bus, and any combinations thereof, using any of a variety of bus architectures.
The computer system 600 may also include a storage device 624. Examples of storage devices (e.g., storage device 624) include, but are not limited to, hard disk drives, magnetic disk drives, optical disk drives in combination with optical media, solid state storage devices, and any combination thereof. The storage device 624 may be connected to the bus 612 by an appropriate interface (not shown). Example interfaces include, but are not limited to, SCSI, Advanced Technology Attachment (ATA), Serial ATA, Universal Serial Bus (USB), IEEE 1394 (firewire), and any combination thereof. In one example, storage 624 (or one or more components thereof) may be removably interfaced with computer system 600 (e.g., via an external port connector (not shown)). In particular, storage 624 and associated machine-readable media 628 may provide non-volatile and/or volatile storage of machine-readable instructions, data structures, program modules, and/or other data for computer system 600. In one example, software 620 may reside, completely or partially, within machine-readable media 628. In another example, the software 620 may reside, completely or partially, within the processor 604.
The computer system 600 may also include an input device 632. In one example, a user of computer system 600 may enter commands and/or other information into computer system 600 via input device 632. Examples of input device 632 include, but are not limited to, an alphanumeric input device (e.g., a keyboard), a pointing device, a joystick, a gamepad, an audio input device (e.g., a microphone, a voice response system, etc.), a cursor control device (e.g., a mouse), a touchpad, an optical scanner, a video capture device (e.g., a still camera, a video camera), a touchscreen, and any combination thereof. Input device 632 may be connected to bus 612 via any of a variety of interfaces (not shown), including, but not limited to, a serial interface, a parallel interface, a game port, a USB interface, a firewire interface, a direct interface to bus 612, and any combination thereof. The input device 632 may comprise a touch screen interface that may be part of the display 636 or separate from the display 636, as will be discussed further below. Input device 632 may serve as a user selection device for selecting one or more graphical representations in a graphical interface as described above.
A user may also enter commands and/or other information into computer system 600 through storage 624 (e.g., a removable disk drive, a flash drive, etc.) and/or network interface device 640. A network interface device, such as network interface device 640, may be used to connect the computer system 600 to one or more of various networks, such as the network 644, and to one or more remote devices 648 connected thereto. Examples of network interface devices include, but are not limited to, a network interface card (e.g., a mobile network interface card, a LAN card), a modem, and any combination thereof. Examples of networks include, but are not limited to, a wide area network (e.g., the internet, an enterprise network), a local area network (e.g., a network associated with an office, a building, a campus, or other relatively small geographic space), a telephone network, a data network associated with a telephone/voice provider (e.g., a mobile communications provider data and/or voice network), a direct connection between two computing devices, and any combination thereof. Networks such as network 644 may employ wired and/or wireless communication modes. In general, any network topology may be used. Information (e.g., data, software 620, etc.) may be transferred to computer system 600 and/or from computer system 600 via network interface device 640.
The foregoing has been a detailed description of illustrative embodiments of the invention. Various modifications and additions can be made without departing from the spirit and scope of the invention. Features of each of the various embodiments described above may be combined with features of other described embodiments as appropriate in order to provide a multiplicity of feature combinations in associated new embodiments. Furthermore, while the foregoing describes a number of separate embodiments, this description is merely illustrative of the application of the principles of the present invention. Additionally, although particular methods herein may be shown and/or described as being performed in a specific order, the ordering is highly variable within ordinary skill to achieve the embodiments disclosed herein. Accordingly, this description is meant to be taken only by way of example, and not to otherwise limit the scope of the invention.
In the description above and in the claims, phrases such as "at least one of" or "one or more of" may occur followed by a conjunctive list of elements or features. The term "and/or" may also occur in a list of two or more elements or features. Unless implicitly or explicitly contradicted by the context in which it is used, such a phrase is intended to mean any of the listed elements or features individually, or any of the recited elements or features in combination with any of the other recited elements or features. For example, the phrases "at least one of A and B," "one or more of A and B," and "A and/or B" are each intended to mean "A alone, B alone, or A and B together." A similar interpretation applies to lists containing three or more items. For example, the phrases "at least one of A, B, and C," "one or more of A, B, and C," and "A, B, and/or C" are each intended to mean "A alone, B alone, C alone, A and B together, A and C together, B and C together, or A and B and C together." Furthermore, the use of the term "based on" above and in the claims is intended to mean "based at least in part on," such that an unrecited feature or element is also permissible.
The subject matter described herein may be embodied in systems, apparatus, methods, and/or articles of manufacture depending on the desired configuration. The embodiments set forth in the foregoing description do not represent all embodiments consistent with the subject matter described herein. Rather, they are merely a few examples consistent with aspects related to the described subject matter. Although some variations have been described in detail above, other modifications or additions are possible. In particular, further features and/or variations may be provided in addition to those set forth herein. For example, the above-described embodiments may be directed to various combinations and subcombinations of the disclosed features and/or combinations and subcombinations of several further features disclosed above. Moreover, the logic flows depicted in the figures and/or described herein do not necessarily require the particular order shown, or sequential order, to achieve desirable results. Other embodiments are possible within the scope of the following claims.
Claims (20)
1. A decoder comprising circuitry configured to:
receiving a bit stream;
storing a plurality of long-term reference frames in a reference list;
retaining long-term reference frames in the reference list for a length of time based on a retention time; and
decoding at least a portion of video using the long-term reference frames retained in the reference list.
2. The decoder of claim 1, wherein each of the stored long-term reference frames includes an associated retention time.
3. The decoder of claim 1, further configured to mark the long-term reference frame as unavailable after the long-term reference frame has resided in the reference list for at least the retention time.
4. The decoder of claim 3, further configured to mark the long-term reference frame as available based on a signal in the bitstream.
5. The decoder of claim 1, wherein the bitstream includes a signal indicating removal of the long-term reference frame from memory.
6. The decoder of claim 5, further configured to remove the long-term reference frame from the reference list based on the signal.
7. The decoder of claim 1, further comprising:
an entropy decoder processor configured to receive the bitstream and decode the bitstream into quantized coefficients;
an inverse quantization and inverse transform processor configured to process the quantized coefficients, including performing inverse discrete cosine processing;
a deblocking filter;
a frame buffer; and
an intra prediction processor.
8. The decoder of claim 1, further configured to:
receiving a coding block;
determining that inter-prediction mode has been enabled for the coding block; and
determining a decoded block using the long-term reference frame as a reference frame and according to the inter prediction mode.
9. The decoder of claim 8 wherein the decoded block forms part of a quad-tree plus binary decision tree.
10. The decoder of claim 8 wherein the decoded block is a non-leaf node of a quadtree plus binary decision tree.
11. A method, comprising:
receiving, by a decoder, a bitstream;
storing, by the decoder, a plurality of long-term reference frames in a reference list;
retaining, by the decoder, the long-term reference frame in the reference list for a length of time based on a retention time; and
decoding, by the decoder, at least a portion of video using the long-term reference frames retained in the reference list.
12. The method of claim 11, wherein each of the stored long-term reference frames includes an associated retention time.
13. The method of claim 11, further comprising marking the long-term reference frame as unavailable after the long-term reference frame has resided in the reference list for at least the retention time.
14. The method of claim 13, further comprising marking the long-term reference frame as available based on a signal in the bitstream.
15. The method of claim 11, wherein the bitstream includes a signal indicating removal of the long-term reference frame from memory.
16. The method of claim 15, further comprising removing the long-term reference frame from the reference list based on the signal.
17. The method of claim 11, wherein the decoder further comprises:
an entropy decoder processor configured to receive the bitstream and decode the bitstream into quantized coefficients;
an inverse quantization and inverse transform processor configured to process the quantized coefficients, including performing inverse discrete cosine processing;
a deblocking filter;
a frame buffer; and
an intra prediction processor.
18. The method of claim 11, further comprising:
receiving a coding block;
determining that inter-prediction mode has been enabled for the coding block; and
determining a decoded block using the long-term reference frame as a reference frame and according to the inter prediction mode.
19. The method of claim 18, wherein the decoded block forms part of a quad-tree plus binary decision tree.
20. The method of claim 18, wherein said decoded block is a non-leaf node of a quadtree plus binary decision tree.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202411033336.8A CN118714324A (en) | 2019-01-28 | 2020-01-28 | Explicit signaling to extend long-term reference picture reservation |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201962797806P | 2019-01-28 | 2019-01-28 | |
US62/797,806 | 2019-01-28 | ||
PCT/US2020/015414 WO2020159993A1 (en) | 2019-01-28 | 2020-01-28 | Explicit signaling of extended long term reference picture retention |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202411033336.8A Division CN118714324A (en) | 2019-01-28 | 2020-01-28 | Explicit signaling to extend long-term reference picture reservation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113615184A true CN113615184A (en) | 2021-11-05 |
CN113615184B CN113615184B (en) | 2024-08-09 |
Family
ID=71841917
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202411033336.8A Pending CN118714324A (en) | 2019-01-28 | 2020-01-28 | Explicit signaling to extend long-term reference picture reservation |
CN202080021301.6A Active CN113615184B (en) | 2019-01-28 | 2020-01-28 | Explicit signaling to extend long-term reference picture reservation |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202411033336.8A Pending CN118714324A (en) | 2019-01-28 | 2020-01-28 | Explicit signaling to extend long-term reference picture reservation |
Country Status (8)
Country | Link |
---|---|
EP (1) | EP3918799A4 (en) |
JP (2) | JP7498502B2 (en) |
KR (1) | KR20210118155A (en) |
CN (2) | CN118714324A (en) |
BR (1) | BR112021014753A2 (en) |
MX (1) | MX2021009024A (en) |
SG (1) | SG11202108105YA (en) |
WO (1) | WO2020159993A1 (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101690202A (en) * | 2007-04-09 | 2010-03-31 | 思科技术公司 | Long term reference frame management with error feedback for compressed video communication |
CN103167283A (en) * | 2011-12-19 | 2013-06-19 | 华为技术有限公司 | Video coding method and device |
US20130279589A1 (en) * | 2012-04-23 | 2013-10-24 | Google Inc. | Managing multi-reference picture buffers for video data coding |
US20130329787A1 (en) * | 2012-06-07 | 2013-12-12 | Qualcomm Incorporated | Signaling data for long term reference pictures for video coding |
US20140003538A1 (en) * | 2012-06-28 | 2014-01-02 | Qualcomm Incorporated | Signaling long-term reference pictures for video coding |
WO2014111222A1 (en) * | 2013-01-16 | 2014-07-24 | Telefonaktiebolaget L M Ericsson (Publ) | Decoder and encoder and methods for coding of a video sequence |
KR101674556B1 (en) * | 2015-07-27 | 2016-11-10 | 인하대학교 산학협력단 | Method and apparatus for estimating motion using multi reference frames |
US9609341B1 (en) * | 2012-04-23 | 2017-03-28 | Google Inc. | Video data encoding and decoding using reference picture lists |
CN106961609A (en) * | 2016-01-08 | 2017-07-18 | 三星电子株式会社 | Application processor and mobile terminal for handling reference picture |
CN108432253A (en) * | 2016-01-21 | 2018-08-21 | 英特尔公司 | Long-term reference picture decoding |
US20180324458A1 (en) * | 2011-09-23 | 2018-11-08 | Velos Media, Llc | Decoded picture buffer management |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1994764B1 (en) * | 2006-03-15 | 2016-01-13 | BRITISH TELECOMMUNICATIONS public limited company | Video coding with reference picture set management |
US10003817B2 (en) | 2011-11-07 | 2018-06-19 | Microsoft Technology Licensing, Llc | Signaling of state information for a decoded picture buffer and reference picture lists |
CN105933800A (en) * | 2016-04-29 | 2016-09-07 | 联发科技(新加坡)私人有限公司 | Video play method and control terminal |
2020
- 2020-01-28 CN CN202411033336.8A patent/CN118714324A/en active Pending
- 2020-01-28 BR BR112021014753-5A patent/BR112021014753A2/en unknown
- 2020-01-28 WO PCT/US2020/015414 patent/WO2020159993A1/en unknown
- 2020-01-28 JP JP2021543479A patent/JP7498502B2/en active Active
- 2020-01-28 SG SG11202108105YA patent/SG11202108105YA/en unknown
- 2020-01-28 KR KR1020217027065A patent/KR20210118155A/en not_active Application Discontinuation
- 2020-01-28 MX MX2021009024A patent/MX2021009024A/en unknown
- 2020-01-28 CN CN202080021301.6A patent/CN113615184B/en active Active
- 2020-01-28 EP EP20748672.1A patent/EP3918799A4/en active Pending
2024
- 2024-05-24 JP JP2024084708A patent/JP2024100973A/en active Pending
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101690202A (en) * | 2007-04-09 | 2010-03-31 | 思科技术公司 | Long term reference frame management with error feedback for compressed video communication |
US20180324458A1 (en) * | 2011-09-23 | 2018-11-08 | Velos Media, Llc | Decoded picture buffer management |
CN103167283A (en) * | 2011-12-19 | 2013-06-19 | 华为技术有限公司 | Video coding method and device |
US20130279589A1 (en) * | 2012-04-23 | 2013-10-24 | Google Inc. | Managing multi-reference picture buffers for video data coding |
US9609341B1 (en) * | 2012-04-23 | 2017-03-28 | Google Inc. | Video data encoding and decoding using reference picture lists |
US20130329787A1 (en) * | 2012-06-07 | 2013-12-12 | Qualcomm Incorporated | Signaling data for long term reference pictures for video coding |
US20140003538A1 (en) * | 2012-06-28 | 2014-01-02 | Qualcomm Incorporated | Signaling long-term reference pictures for video coding |
WO2014111222A1 (en) * | 2013-01-16 | 2014-07-24 | Telefonaktiebolaget L M Ericsson (Publ) | Decoder and encoder and methods for coding of a video sequence |
KR101674556B1 (en) * | 2015-07-27 | 2016-11-10 | 인하대학교 산학협력단 | Method and apparatus for estimating motion using multi reference frames |
CN106961609A (en) * | 2016-01-08 | 2017-07-18 | 三星电子株式会社 | Application processor and mobile terminal for handling reference picture |
CN108432253A (en) * | 2016-01-21 | 2018-08-21 | 英特尔公司 | Long-term reference picture decoding |
Non-Patent Citations (3)
Title |
---|
BENJAMIN BROSS: "High efficiency video coding (HEVC) text specification draft 7(JCTVC-I1003_d9)", 《JOINT COLLABORATIVE TEAM ON VIDEO CODING (JCT-VC) OF ITU-T SG16 WP3 AND ISO/IEC JTC1/SC29/WG11 9TH MEETING: GENEVA, CH, 27 APRIL – 7 MAY 2012》, pages 1 - 278 * |
JIANLE CHEN ET AL: "Algorithm Description of Joint Exploration Test Model 7 (JEM 7)(JVET-G1001-v1)", 《JOINT VIDEO EXPLORATION TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 7TH MEETING: TORINO, IT, 13–21 JULY 2017》, pages 1 - 48 * |
YE-KUI WANG ET AL: "On reference picture management for VVC (JVET-M0128-v1)", 《JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 13TH MEETING: MARRAKECH, MA, 9–18 JAN. 2019》, pages 1 - 9 *
Also Published As
Publication number | Publication date |
---|---|
KR20210118155A (en) | 2021-09-29 |
JP7498502B2 (en) | 2024-06-12 |
BR112021014753A2 (en) | 2021-09-28 |
SG11202108105YA (en) | 2021-08-30 |
CN113615184B (en) | 2024-08-09 |
MX2021009024A (en) | 2021-10-13 |
JP2024100973A (en) | 2024-07-26 |
EP3918799A1 (en) | 2021-12-08 |
WO2020159993A1 (en) | 2020-08-06 |
CN118714324A (en) | 2024-09-27 |
EP3918799A4 (en) | 2022-03-23 |
JP2022524917A (en) | 2022-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114616826A (en) | Implicit identification of adaptive resolution management based on frame type | |
JP7482536B2 (en) | Shape-adaptive discrete cosine transform for geometric partitioning with an adaptive number of regions. | |
JP2024045720A (en) | Global motion constrained motion vector in inter prediction | |
CN114073083A (en) | Global motion for merge mode candidates in inter prediction | |
JP7542278B2 (en) | Adaptive block updating of unavailable reference frames using explicit and implicit signaling | |
CN113647104A (en) | Inter prediction in geometric partitioning with adaptive region number | |
CN114128260A (en) | Efficient coding of global motion vectors | |
CN114009042A (en) | Candidates in frames with global motion | |
CN114128291A (en) | Adaptive motion vector prediction candidates in frames with global motion | |
CN113170175A (en) | Adaptive temporal filter for unavailable reference pictures | |
CN114080811A (en) | Selective motion vector prediction candidates in frames with global motion | |
CN113615184B (en) | Explicit signaling of extended long-term reference picture retention | |
US11985318B2 (en) | Encoding video with extended long term reference picture retention | |
US11595652B2 (en) | Explicit signaling of extended long term reference picture retention | |
CN113597768B (en) | Online and offline selection of extended long-term reference picture retention | |
RU2792865C2 (en) | Method for adaptive update of unavailable reference frame blocks using explicit and implicit signaling | |
KR20210152567A (en) | Signaling of global motion vectors in picture headers | |
JP2024149767A (en) | Adaptive block updating of unavailable reference frames using explicit and implicit signaling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |