TW202344051A - Temporal initialization points for context-based arithmetic coding

Info

Publication number: TW202344051A
Application number: TW112107560A
Authority: TW
Inventors: Vadim Seregin (瓦迪姆賽萊金); Marta Karczewicz (瑪塔卡克基維克茲)
Original assignee: Qualcomm Incorporated (美商高通公司)
Priority date: 2022-03-03
Filing date: 2023-03-02
Publication date: 2023-11-01
Also published as: WO2023167997A1

Abstract

A method includes determining one or more context values for at least one context used for encoding or decoding a current slice or picture, determining that a buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, determining a first set of temporal initialization points associated with a slice or picture, from among the two or more slices or pictures, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture, removing the first set of temporal initialization points that is associated with the slice or picture, and storing a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points are based on the determined one or more context values.

Description

Temporal initialization points for context-based arithmetic coding

This patent application claims priority to U.S. Provisional Application No. 63/268,844, filed March 3, 2022, and U.S. Provisional Application No. 63/362,118, filed March 29, 2022, the entire contents of each of which are incorporated herein by reference.

This disclosure relates to video encoding and video decoding.

Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called "smartphones," video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video coding techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), ITU-T H.265/High Efficiency Video Coding (HEVC), and ITU-T H.266/Versatile Video Coding (VVC), and extensions of such standards, as well as proprietary video codecs/formats such as AOMedia Video 1 (AV1), developed by the Alliance for Open Media. By implementing such video coding techniques, video devices can transmit, receive, encode, decode, and/or store digital video information more efficiently.

Video coding techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (e.g., a video picture or a portion of a video picture) may be partitioned into video blocks, which may also be referred to as coding tree units (CTUs), coding units (CUs), and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture, or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.

In general, this disclosure describes techniques for determining initialization points for one or more contexts used in context-based arithmetic coding (e.g., context-adaptive binary arithmetic coding (CABAC)). An initialization point may be regarded as the starting point of one or more contexts, and may include one or more context states, window or rate adaptation parameters, and other parameters used in the arithmetic coding operation.

In some examples, the initialization points of one or more contexts for video data in a current slice or current picture may be based on the initialization points of one or more contexts for video data in a previous picture. Such initialization points are referred to as temporal initialization points.

This disclosure describes example techniques for determining the temporal initialization points for a current slice or picture. In some examples, a video coder (e.g., a video encoder or a video decoder) may use a temporal identification (ID) value and/or a quantization parameter (QP) value to determine the temporal initialization points. In some examples, the video coder may determine the initialization points for the current slice based on the initialization points of a previous slice located at a corresponding position.

Because of memory size constraints, there may be a limit on how many temporal initialization points can be stored. When a new set of temporal initialization points is to be stored, an already stored set of temporal initialization points may be removed. This disclosure describes example memory management techniques for inserting and removing temporal initialization points based on temporal ID values and/or QP values, in a manner that balances memory size constraints while ensuring that the temporal initialization points needed for timely decoding remain in the buffer.

In one example, this disclosure describes a method of processing video data, the method comprising: determining one or more context values for at least one context used for encoding or decoding a current slice or picture; determining that a buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, wherein each set of temporal initialization points is associated with a slice or picture of the two or more slices or pictures and includes one or more temporal initialization points; determining, from among the two or more slices or pictures, a first set of temporal initialization points associated with a slice or picture, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture; removing, from the buffer, the first set of temporal initialization points associated with the slice or picture; and storing, in the buffer, a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points is based on the determined one or more context values.

In one example, this disclosure describes a device for processing video data, the device comprising: a buffer configured to store sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding; and processing circuitry coupled to the buffer, the processing circuitry configured to: determine one or more context values for at least one context used for encoding or decoding a current slice or picture; determine that the buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, wherein each set of temporal initialization points is associated with a slice or picture of the two or more slices or pictures and includes one or more temporal initialization points; determine, from among the two or more slices or pictures, a first set of temporal initialization points associated with a slice or picture, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture; remove, from the buffer, the first set of temporal initialization points associated with the slice or picture; and store, in the buffer, a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points is based on the determined one or more context values.

In one example, this disclosure describes a computer-readable storage medium having instructions stored thereon that, when executed, cause one or more processors to: determine one or more context values for at least one context used for encoding or decoding a current slice or picture; determine that a buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, wherein each set of temporal initialization points is associated with a slice or picture of the two or more slices or pictures and includes one or more temporal initialization points; determine, from among the two or more slices or pictures, a first set of temporal initialization points associated with a slice or picture, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture; remove, from the buffer, the first set of temporal initialization points associated with the slice or picture; and store, in the buffer, a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points is based on the determined one or more context values.

In one example, this disclosure describes a device for processing video data, the device comprising: means for determining one or more context values for at least one context used for encoding or decoding a current slice or picture; means for determining that a buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, wherein each set of temporal initialization points is associated with a slice or picture of the two or more slices or pictures and includes one or more temporal initialization points; means for determining, from among the two or more slices or pictures, a first set of temporal initialization points associated with a slice or picture, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture; means for removing, from the buffer, the first set of temporal initialization points associated with the slice or picture; and means for storing, in the buffer, a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points is based on the determined one or more context values.

The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.

In video coding (e.g., context-based arithmetic coding), a video coder (e.g., a video encoder or a video decoder) uses initialization points to initialize a current picture or slice. For example, for context-based arithmetic coding, which includes context-adaptive binary arithmetic coding (CABAC), an initialization point may be used for each context. An initialization point may include one or more context states, window or rate adaptation parameters, and other parameters needed for the arithmetic coding operation.

Initialization points may be predefined. However, in some examples, in addition to or instead of using predefined initialization points, the video coder may use temporal initialization points. A temporal initialization point may refer to an initialization point of a previous picture, in coding order, that is used to determine (e.g., select) the initialization point of the current picture. For example, a temporal initialization point may be, or may be derived from, one or more context values of at least one context used for encoding or decoding the current picture; those values are then used to initialize the context values of at least one context of a subsequent picture.

There may be certain issues with using temporal initialization points. For example, the buffer that stores temporal initialization points may have a limited size, and the video coder may remove a set of temporal initialization points (e.g., one or more temporal initialization points) to allow another set of temporal initialization points to be stored. This disclosure describes example techniques for determining which set of temporal initialization points to remove, in a manner that ensures the retained sets of temporal initialization points promote efficient encoding and decoding.

In one or more examples, the video coder stores, in the buffer, a respective set of temporal initialization points for a slice or picture (e.g., after coding the slice or picture). As described above, a set of temporal initialization points may include one or more initialization points, and an initialization point may be a context value of a context, or may be derived from the context values of the contexts used for coding the current slice or picture.

For example, the video coder stores in the buffer a first set of temporal initialization points associated with a first slice or picture (e.g., after coding the first slice or picture), stores in the buffer a second set of temporal initialization points associated with a second slice or picture (e.g., after coding the second slice or picture), and so on. In this way, the buffer stores sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding. Each set of temporal initialization points is associated with a slice or picture of the two or more slices or pictures.

Each slice or picture may have a temporal identification (ID) value and/or a quantization parameter (QP) value. Each slice may also be associated with a slice type. The temporal ID value of a previous picture may indicate whether that previous picture can be used for inter-prediction of the current picture. For example, only pictures with a temporal ID value equal to or lower than that of the current picture can be used for inter-prediction of the current picture. In this way, depending on bandwidth availability or processing capability, pictures with temporal ID values above some threshold can be dropped from the bitstream, or left undecoded, without affecting the ability to decode pictures with temporal ID values at or below the threshold. The QP value indicates the amount of quantization applied in the encoding process.
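The temporal ID rule described above can be illustrated with a minimal sketch. The `Picture` type, its field names, and both helper functions are hypothetical names introduced only for this illustration; the text does not define any such interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Picture:
    # Hypothetical record; the field names are assumptions for illustration.
    order: int         # position in coding order
    temporal_id: int   # temporal identification (ID) value


def may_reference(current: Picture, candidate: Picture) -> bool:
    """A candidate is usable for inter-prediction only if its temporal ID
    is equal to or lower than that of the current picture."""
    return candidate.temporal_id <= current.temporal_id


def drop_above(pictures: List[Picture], threshold: int) -> List[Picture]:
    """Keep only pictures whose temporal ID does not exceed the threshold."""
    return [p for p in pictures if p.temporal_id <= threshold]


pictures = [Picture(0, 0), Picture(1, 2), Picture(2, 1), Picture(3, 2), Picture(4, 0)]
kept = drop_above(pictures, threshold=1)

# Every picture a kept picture is allowed to reference is itself kept,
# so dropping the higher temporal IDs does not break decoding of the rest.
still_decodable = all(
    cand in kept
    for cur in kept
    for cand in pictures
    if may_reference(cur, cand)
)
```

Because any permitted reference has a temporal ID no higher than the referencing picture, and every kept picture is at or below the threshold, the kept pictures never depend on a dropped one.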

In one or more examples, if the buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, the video coder may determine (e.g., identify), from the buffer, a first set of temporal initialization points associated with a slice or picture of the two or more slices or pictures, based on at least one of the slice type, the temporal ID value, or the QP value of the slice or picture. As one example, the video coder may determine (e.g., identify), from the buffer, the first set of temporal initialization points as the set associated with the slice or picture, of the two or more slices or pictures, that has at least one of the smallest temporal ID value or the smallest QP value.

In some examples, the first set of temporal initialization points may be associated with a slice whose slice type differs from the slice type of the current slice to be encoded or decoded. In some examples, multiple sets of temporal initialization points may be stored for one slice type. In such examples, the video coder may determine a group of the sets of temporal initialization points associated with slices having the same slice type as the current slice. The video coder may then determine the first set of temporal initialization points from that group (e.g., the set associated with the slice or picture having the smallest temporal ID value or QP value).
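One way to picture the choice of which set to remove is the sketch below. It is an illustration under stated assumptions, not the claimed method: the `StoredSet` type and `pick_set_to_remove` function are hypothetical, ordering candidates by temporal ID first and QP second is an assumed tie-break (the text only says "at least one of"), and falling back to the whole buffer when no stored set shares the current slice type is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StoredSet:
    # Hypothetical descriptor of one stored set of temporal initialization points.
    slice_type: str    # "I", "P", or "B"
    temporal_id: int
    qp: int


def pick_set_to_remove(stored: List[StoredSet],
                       current_slice_type: str) -> Optional[StoredSet]:
    """Pick the set to remove when the buffer is full."""
    # Group of sets whose slice type matches the current slice.
    group = [s for s in stored if s.slice_type == current_slice_type]
    # Assumption: if the group is empty, consider every stored set.
    candidates = group if group else stored
    if not candidates:
        return None
    # Assumption: smallest temporal ID first, smallest QP as tie-break.
    return min(candidates, key=lambda s: (s.temporal_id, s.qp))


stored = [
    StoredSet("I", 0, 22),
    StoredSet("B", 1, 30),
    StoredSet("B", 2, 34),
    StoredSet("P", 0, 27),
]
victim_for_b = pick_set_to_remove(stored, "B")
```

For a current B slice, only the two B-slice sets are considered, so the set from the temporal ID 1 picture is selected even though other slice types hold sets with lower temporal IDs.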

The video coder may remove, from the buffer, the first set of temporal initialization points associated with the slice or picture. The video coder may then store, in the buffer, a second set of temporal initialization points associated with the current slice or picture, based on the determined one or more context values of the current slice or picture.
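The full-buffer check, removal, and store steps can be put together in a small sketch. The `TemporalInitBuffer` and `InitPointSet` names, the dictionary representation of context values, and the (temporal ID, QP) removal ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InitPointSet:
    # Hypothetical container for one set of temporal initialization points.
    slice_type: str
    temporal_id: int
    qp: int
    context_values: Dict[str, int]   # context name -> state (illustrative)


@dataclass
class TemporalInitBuffer:
    capacity: int
    entries: List[InitPointSet] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.capacity < 1:
            raise ValueError("capacity must be at least 1")

    def is_full(self) -> bool:
        return len(self.entries) >= self.capacity

    def store(self, new_set: InitPointSet) -> Optional[InitPointSet]:
        """Store new_set; if the buffer is full, first remove one set.

        Returns the removed set, or None if nothing was removed.
        """
        removed = None
        if self.is_full():
            # Assumed policy: remove the set with the smallest temporal ID,
            # using the smallest QP as a tie-break.
            removed = min(self.entries, key=lambda s: (s.temporal_id, s.qp))
            self.entries.remove(removed)
        self.entries.append(new_set)
        return removed


buf = TemporalInitBuffer(capacity=2)
buf.store(InitPointSet("B", 0, 27, {"ctx0": 12}))
buf.store(InitPointSet("B", 2, 37, {"ctx0": 40}))
# The buffer is now full, so this call removes one stored set before storing.
removed = buf.store(InitPointSet("B", 1, 32, {"ctx0": 25}))
```

The third call finds the buffer full, removes the set from the temporal ID 0 picture, and appends the new set, so the buffer never grows beyond its capacity.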

In one or more examples, the video coder may determine (e.g., select) a set of temporal initialization points stored in the buffer and, based on the determined set, initialize one or more context values of at least one context used for encoding or decoding a subsequent slice or picture. To determine the set of temporal initialization points, the video coder may determine the slice or picture whose temporal ID value and/or QP value is closest to the temporal ID value and/or QP value of the subsequent slice or picture and which, in some cases, has the same slice type. For example, if the QP value of the subsequent picture is X, the video coder may determine which set of temporal initialization points is associated with a picture whose QP value is X or closest to X. The video coder may select the determined set of temporal initialization points. The video coder may then perform context-based arithmetic encoding or decoding on the subsequent slice or picture.
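The closest-match selection can be sketched as follows. The `StoredSet` type and `select_set` function are hypothetical; requiring a matching slice type, ranking by QP distance before temporal ID distance, and returning `None` so a caller can fall back to predefined initialization points are all assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StoredSet:
    # Hypothetical descriptor of one stored set of temporal initialization points.
    slice_type: str
    temporal_id: int
    qp: int


def select_set(stored: List[StoredSet], slice_type: str,
               temporal_id: int, qp: int) -> Optional[StoredSet]:
    """Select the stored set that best matches the subsequent slice or picture."""
    # Assumption: only sets from slices of the same type are considered.
    same_type = [s for s in stored if s.slice_type == slice_type]
    if not same_type:
        return None   # caller may fall back to predefined initialization points
    # Assumption: closest QP first, closest temporal ID as tie-break.
    return min(
        same_type,
        key=lambda s: (abs(s.qp - qp), abs(s.temporal_id - temporal_id)),
    )


stored = [StoredSet("B", 1, 30), StoredSet("B", 2, 34), StoredSet("P", 0, 32)]
chosen = select_set(stored, slice_type="B", temporal_id=2, qp=33)
```

With a subsequent B slice at QP 33, the P-slice set at QP 32 is skipped because of its slice type, and the B-slice set at QP 34 is chosen as the nearest remaining match.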

There may be a coding gain from removing the set of temporal initialization points associated with the slice or picture having the smallest temporal ID value or QP value. In one or more examples, pictures with smaller temporal ID values and/or QP values tend to have more transform coefficients (e.g., because of less quantization) than pictures with higher temporal ID values and/or QP values, which tend to have fewer transform coefficients. If there is a relatively large number of transform coefficients (e.g., because of a lower QP value), the context values of a context can be updated relatively quickly and then used for the remainder of the slice (e.g., the context can adapt at the beginning of slice coding, and the rest of the slice is then coded efficiently).

However, if there is a relatively small number of transform coefficients (e.g., because of a higher QP value), the context values of a context are updated relatively slowly. That is, a slice with a higher QP value has fewer transform coefficients, so context adaptation is slower and fewer blocks of the slice are coded efficiently.
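The effect of the number of coded bins on context adaptation can be illustrated with a deliberately simplified model. The floating-point exponential update, the `shift` parameter, and the fixed bin pattern below are assumptions chosen for illustration; they are not the state update of any particular codec.

```python
from typing import List


def adapt(p_init: float, bins: List[int], shift: int = 5) -> float:
    """Simplified context model: move the probability estimate toward each
    observed bin by a fraction 1 / 2**shift (an assumed adaptation rate)."""
    p = p_init
    for b in bins:
        p += (b - p) / (1 << shift)
    return p


def bin_stream(n: int) -> List[int]:
    """Deterministic stream in which 80% of the bins are 1."""
    pattern = [1, 1, 1, 1, 0]
    return [pattern[i % len(pattern)] for i in range(n)]


TRUE_P = 0.8

# Many bins (e.g., many coefficients at a low QP): a poor starting point
# is corrected early and the estimate ends up near the true statistics.
err_many_bins = abs(adapt(0.5, bin_stream(200)) - TRUE_P)

# Few bins (e.g., few coefficients at a high QP): the same poor starting
# point is still far from the true statistics when the slice ends.
err_few_bins = abs(adapt(0.5, bin_stream(10)) - TRUE_P)

# Few bins, but starting from a well-matched initialization point.
err_few_bins_good_init = abs(adapt(0.8, bin_stream(10)) - TRUE_P)
```

In this toy model the starting point matters most when few bins are coded, which is consistent with the reasoning that a well-matched initialization point is most valuable for slices with few transform coefficients.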

In one or more examples, the sets of temporal initialization points stored in the buffer may be closer (e.g., after some mapping or scaling) to the context values used for coding a subsequent slice or picture. Therefore, if the sets of temporal initialization points stored in the buffer are associated with pictures having higher temporal ID values or QP values, those sets are more likely to be used for subsequent pictures having higher temporal ID values and QP values. In other words, in some examples, temporal initialization points can yield larger coding efficiency gains for slices or pictures with higher temporal ID values or QP values. Accordingly, by storing the temporal initialization points of pictures with higher temporal ID values or QP values, and removing the temporal initialization points of pictures with lower temporal ID values and QP values (because of buffer size limits), the example techniques can improve coding efficiency without requiring a large buffer.

There may be other issues with using temporal initialization points. As described above, each picture may be associated with a temporal identification (ID) value. However, if the temporal initialization points of a picture with a higher temporal ID value were used to determine the initialization points of the current picture, an error could occur, because the temporal initialization points of the picture with the higher temporal ID value may not be available.

As described in more detail below, this disclosure describes example techniques that use the temporal ID value of a picture to determine whether a subsequent picture can use that picture's temporal initialization points. In this way, the likelihood that the video decoder depends on unavailable temporal initialization points is reduced.
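A minimal sketch of such an availability check follows. The `StoredSet` type and `eligible` function are hypothetical, and the specific rule used (a stored set is usable only if its source picture's temporal ID does not exceed that of the current picture) is an assumption modeled on the inter-prediction constraint, not a rule quoted from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StoredSet:
    # Hypothetical descriptor of one stored set of temporal initialization points.
    source_order: int   # coding-order index of the picture that produced the set
    temporal_id: int


def eligible(stored: List[StoredSet], current_temporal_id: int) -> List[StoredSet]:
    """Assumed rule: only sets from pictures whose temporal ID is equal to or
    lower than the current picture's temporal ID may be used."""
    return [s for s in stored if s.temporal_id <= current_temporal_id]


# The encoder coded every picture, so it holds a set for each of them.
encoder_buffer = [StoredSet(0, 0), StoredSet(1, 2), StoredSet(2, 1)]

# A decoder that skipped temporal ID 2 never produced the set from picture 1.
decoder_buffer = [s for s in encoder_buffer if s.temporal_id <= 1]

current_temporal_id = 1
encoder_choice = eligible(encoder_buffer, current_temporal_id)
decoder_choice = eligible(decoder_buffer, current_temporal_id)
```

Under this rule the encoder and the decoder see the same candidate sets for the current picture, even though the decoder never decoded the higher temporal ID picture.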

In some examples, the temporal initialization points of a picture may differ from slice to slice. In that case, the temporal initialization points of one slice of the current picture may overwrite the temporal initialization points of one slice of a previous picture while the temporal initialization points of that previous-picture slice are still available. This disclosure describes example techniques that minimize the negative impact of overwriting temporal initialization point information and ensure that the correct temporal initialization points are stored.

FIG. 1 is a block diagram illustrating an example video encoding and decoding system 100 that may perform the techniques of this disclosure. The techniques of this disclosure are generally directed to coding (encoding and/or decoding) video data. In general, video data includes any data for processing a video. Thus, video data may include raw, unencoded video, encoded video, decoded (e.g., reconstructed) video, and video metadata, such as signaling data.

As shown in FIG. 1, system 100 includes, in this example, a source device 102 that provides encoded video data to be decoded and displayed by a target device 116. In particular, source device 102 provides the video data to target device 116 via a computer-readable medium 110. Source device 102 and target device 116 may comprise any of a wide range of devices, including desktop computers, notebook (i.e., laptop) computers, mobile devices, tablet computers, set-top boxes, telephone handsets such as smartphones, televisions, cameras, display devices, digital media players, video game consoles, video streaming devices, broadcast receiver devices, and the like. In some cases, source device 102 and target device 116 may be equipped for wireless communication, and thus may be referred to as wireless communication devices.

In the example of FIG. 1, source device 102 includes video source 104, memory 106, video encoder 200, and output interface 108. Target device 116 includes input interface 122, video decoder 300, memory 120, and display device 118. In accordance with this disclosure, video encoder 200 of source device 102 and video decoder 300 of target device 116 may be configured to apply the techniques for temporal initialization points of one or more contexts used in context-based arithmetic coding of video data of a current picture or slice. Thus, source device 102 represents an example of a video encoding device, while target device 116 represents an example of a video decoding device. In other examples, a source device and a target device may include other components or arrangements. For example, source device 102 may receive video data from an external video source, such as an external camera. Likewise, target device 116 may interface with an external display device, rather than including an integrated display device.

System 100 as shown in FIG. 1 is merely one example. In general, any digital video encoding and/or decoding device may perform the techniques for one or more contexts used in context-based arithmetic coding of video data of a current picture or slice. Source device 102 and target device 116 are merely examples of such coding devices, in which source device 102 generates coded video data for transmission to target device 116. This disclosure refers to a "coding" device as a device that performs coding (encoding and/or decoding) of data. Thus, video encoder 200 and video decoder 300 represent examples of coding devices, in particular, a video encoder and a video decoder, respectively. In some examples, source device 102 and target device 116 may operate in a substantially symmetrical manner such that each of source device 102 and target device 116 includes video encoding and decoding components. Hence, system 100 may support one-way or two-way video transmission between source device 102 and target device 116, e.g., for video streaming, video playback, video broadcasting, or video telephony.

In general, video source 104 represents a source of video data (i.e., raw, unencoded video data) and provides a sequential series of pictures (also referred to as "frames") of the video data to video encoder 200, which encodes data for the pictures. Video source 104 of source device 102 may include a video capture device, such as a video camera, a video archive containing previously captured raw video, and/or a video feed interface to receive video from a video content provider. As a further alternative, video source 104 may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer-generated video. In each case, video encoder 200 encodes the captured, pre-captured, or computer-generated video data. Video encoder 200 may rearrange the pictures from the received order (sometimes referred to as "display order") into a coding order for coding. Video encoder 200 may generate a bitstream including encoded video data. Source device 102 may then output the encoded video data via output interface 108 onto computer-readable medium 110 for reception and/or retrieval by, e.g., input interface 122 of target device 116.

Memory 106 of source device 102 and memory 120 of destination device 116 represent general-purpose memories. In some examples, memories 106, 120 may store raw video data, e.g., raw video from video source 104 and raw, decoded video data from video decoder 300. Additionally or alternatively, memories 106, 120 may store software instructions executable by, e.g., video encoder 200 and video decoder 300, respectively. Although memory 106 and memory 120 are shown separately from video encoder 200 and video decoder 300 in this example, it should be understood that video encoder 200 and video decoder 300 may also include internal memories for functionally similar or equivalent purposes. Furthermore, memories 106, 120 may store encoded video data, e.g., output from video encoder 200 and input to video decoder 300. In some examples, portions of memories 106, 120 may be allocated as one or more video buffers, e.g., to store raw, decoded, and/or encoded video data.

Computer-readable medium 110 may represent any type of medium or device capable of transporting the encoded video data from source device 102 to destination device 116. In one example, computer-readable medium 110 represents a communication medium that enables source device 102 to transmit encoded video data directly to destination device 116 in real time, e.g., via a radio frequency network or a computer-based network. Output interface 108 may modulate a transmission signal including the encoded video data, and input interface 122 may demodulate the received transmission signal, according to a communication standard such as a wireless communication protocol. The communication medium may include any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may form part of a packet-based network, such as a local area network, a wide-area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations, or any other equipment that may be useful to facilitate communication from source device 102 to destination device 116.

In some examples, source device 102 may output encoded data from output interface 108 to storage device 112. Similarly, destination device 116 may access encoded data from storage device 112 via input interface 122. Storage device 112 may include any of a variety of distributed or locally accessed data storage media, such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data.

In some examples, source device 102 may output encoded video data to file server 114 or to another intermediate storage device that may store the encoded video data generated by source device 102. Destination device 116 may access stored video data from file server 114 via streaming or download.

File server 114 may be any type of server device capable of storing encoded video data and transmitting that encoded video data to destination device 116. File server 114 may represent a web server (e.g., for a website), a server configured to provide a file transfer protocol service (such as File Transfer Protocol (FTP) or File Delivery over Unidirectional Transport (FLUTE) protocol), a content delivery network (CDN) device, a hypertext transfer protocol (HTTP) server, a Multimedia Broadcast Multicast Service (MBMS) or Enhanced MBMS (eMBMS) server, and/or a network attached storage (NAS) device. File server 114 may, additionally or alternatively, implement one or more HTTP streaming protocols, such as Dynamic Adaptive Streaming over HTTP (DASH), HTTP Live Streaming (HLS), Real Time Streaming Protocol (RTSP), HTTP Dynamic Streaming, or the like.

Destination device 116 may access encoded video data from file server 114 through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., digital subscriber line (DSL), cable modem, etc.), or a combination of both that is suitable for accessing encoded video data stored on file server 114. Input interface 122 may be configured to operate according to any one or more of the various protocols discussed above for retrieving or receiving media data from file server 114, or other such protocols for retrieving media data.

Output interface 108 and input interface 122 may represent wireless transmitters/receivers, modems, wired networking components (e.g., Ethernet cards), wireless communication components that operate according to any of a variety of IEEE 802.11 standards, or other physical components. In examples where output interface 108 and input interface 122 include wireless components, output interface 108 and input interface 122 may be configured to transfer data, such as encoded video data, according to a cellular communication standard, such as 4G, 4G-LTE (Long-Term Evolution), LTE-Advanced, 5G, or the like. In some examples where output interface 108 includes a wireless transmitter, output interface 108 and input interface 122 may be configured to transfer data, such as encoded video data, according to other wireless standards, such as an IEEE 802.11 specification, an IEEE 802.15 specification (e.g., ZigBee™), a Bluetooth™ standard, or the like. In some examples, source device 102 and/or destination device 116 may include respective system-on-a-chip (SoC) devices. For example, source device 102 may include an SoC device to perform the functionality attributed to video encoder 200 and/or output interface 108, and destination device 116 may include an SoC device to perform the functionality attributed to video decoder 300 and/or input interface 122.

The techniques of this disclosure may be applied to video coding in support of any of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, Internet streaming video transmissions such as Dynamic Adaptive Streaming over HTTP (DASH), digital video that is encoded onto a data storage medium, decoding of digital video stored on a data storage medium, or other applications.

Input interface 122 of destination device 116 receives an encoded video bitstream from computer-readable medium 110 (e.g., a communication medium, storage device 112, file server 114, or the like). The encoded video bitstream may include signaling information defined by video encoder 200, which is also used by video decoder 300, such as syntax elements having values that describe characteristics and/or processing of video blocks or other coded units (e.g., slices, pictures, groups of pictures, sequences, or the like). Display device 118 displays decoded pictures of the decoded video data to a user. Display device 118 may represent any of a variety of display devices, such as a cathode ray tube (CRT), a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or another type of display device.

Although not shown in Figure 1, in some examples, video encoder 200 and video decoder 300 may each be integrated with an audio encoder and/or audio decoder, and may include appropriate MUX-DEMUX units, or other hardware and/or software, to handle multiplexed streams including both audio and video in a common data stream.

Video encoder 200 and video decoder 300 each may be implemented as any of a variety of suitable encoder and/or decoder circuitry, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware, or any combinations thereof. When the techniques are implemented partially in software, a device may store instructions for the software in a suitable, non-transitory computer-readable medium and execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of video encoder 200 and video decoder 300 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device. A device including video encoder 200 and/or video decoder 300 may include an integrated circuit, a microprocessor, and/or a wireless communication device, such as a cellular telephone.

Video encoder 200 and video decoder 300 may operate according to a video coding standard, such as ITU-T H.265, also referred to as High Efficiency Video Coding (HEVC), or extensions thereto, such as the multi-view and/or scalable video coding extensions. Alternatively, video encoder 200 and video decoder 300 may operate according to other proprietary or industry standards, such as ITU-T H.266, also referred to as Versatile Video Coding (VVC). In other examples, video encoder 200 and video decoder 300 may operate according to a proprietary video codec/format, such as AOMedia Video 1 (AV1), extensions of AV1, and/or successor versions of AV1 (e.g., AV2). In still other examples, video encoder 200 and video decoder 300 may operate according to other proprietary formats or industry standards. The techniques of this disclosure, however, are not limited to any particular coding standard or format.

In general, video encoder 200 and video decoder 300 may be configured to perform the techniques of this disclosure in conjunction with any video coding technique that uses temporal initialization points for one or more contexts used in context-based arithmetic coding of video data of a current picture or slice. For example, the temporal initialization points are included in the video data of one or more previous slices or pictures that precede the current slice or picture in coding order.

Temporal initialization points may be associated with a slice or picture. Accordingly, there may be a plurality of sets of temporal initialization points, one for each slice or picture. For example, a first set of temporal initialization points may be associated with a first slice or picture, a second set of temporal initialization points may be associated with a second slice or picture, and so forth.

Each set of temporal initialization points may be the context values of at least one context of a slice or picture, or may be values derived from the context values of at least one context of the slice or picture. If temporal initialization is enabled for context-based arithmetic coding, video encoder 200 and video decoder 300 may use a stored set of temporal initialization points (e.g., from a previously encoded or decoded slice or picture) to initialize the context values of at least one context used for encoding or decoding a subsequent slice or picture. As video encoder 200 and video decoder 300 encode or decode blocks of the subsequent slice or picture, video encoder 200 or video decoder 300 may update the context values after the initialization that used the set of initialization points.

As described in more detail, in one or more examples, a buffer stores the sets of temporal initialization points. However, the number of possible sets of temporal initialization points may be relatively large, and the buffer may be limited in size. Accordingly, video encoder 200 and video decoder 300 may be configured to perform buffer management, in which video encoder 200 and video decoder 300 selectively determine which set of initialization points to remove in order to free memory space for a new set of initialization points that is to be inserted into the buffer.

This disclosure describes example techniques for such buffer management. For example, if the buffer is full, video encoder 200 and video decoder 300 may evaluate the temporal identification (ID) values and/or QP values, and possibly also the slice types, of the slices or pictures associated with the temporal initialization points stored in the buffer. For example, in addition to storing the sets of temporal initialization points, the buffer may also store information indicating the temporal ID value and/or QP value of the slice or picture, and possibly the slice type (e.g., I-slice, B-slice, P-slice), associated with each set of temporal initialization points.

Video encoder 200 and video decoder 300 may compare the temporal ID values and/or QP values of the slices or pictures associated with the sets of temporal initialization points stored in the buffer. Based on the comparison, video encoder 200 and video decoder 300 may determine a first set of temporal initialization points, associated with a slice or picture, based on the temporal ID value or QP value of that slice or picture. For example, video encoder 200 and video decoder 300 may determine the set of temporal initialization points associated with the slice or picture having at least one of the smallest temporal ID value or the smallest QP value. Video encoder 200 and video decoder 300 may remove the first set of temporal initialization points to free memory space for a second set of temporal initialization points associated with the current slice or picture (e.g., the slice or picture that was just encoded or decoded). The second set of temporal initialization points may be based on the one or more context values determined for the current slice or picture.

In one or more examples, video encoder 200 and video decoder 300 may also determine the set of temporal initialization points to remove based on slice type. In some examples, if the buffer stores a set of temporal initialization points associated with a slice having the same slice type as the current slice, video encoder 200 and video decoder 300 may remove that set of temporal initialization points. In some cases, video encoder 200 and video decoder 300 may remove that set even if the buffer holds another set of temporal initialization points associated with a slice or picture having a lower temporal ID value and/or QP value.

For example, to remove a set of temporal initialization points, video encoder 200 and video decoder 300 may first determine whether the buffer holds a set of temporal initialization points associated with a slice having the same slice type as the current slice. If so, video encoder 200 and video decoder 300 may remove that set of temporal initialization points. If not, video encoder 200 and video decoder 300 may add the set of temporal initialization points of the current slice to the buffer (assuming the buffer is not full). In the above example, video encoder 200 and video decoder 300 prioritize slice type, but the example techniques are not so limited. In some examples, to remove a set of temporal initialization points, video encoder 200 and video decoder 300 may first determine whether the buffer holds a set of temporal initialization points associated with a slice having the same temporal ID value or QP value as the current slice, regardless of slice type. If so, video encoder 200 and video decoder 300 may remove that set of temporal initialization points.

However, in cases where the buffer is full and a set of temporal initialization points has to be removed, video encoder 200 and video decoder 300 may perform the example techniques described in this disclosure, such as determining the set of temporal initialization points associated with the slice or picture having at least one of the smallest temporal ID value or the smallest QP value, and then removing that set of temporal initialization points.
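The buffer-management order described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `InitSet` record, the `store_init_set` function, the `capacity` parameter, and the tie-break order (temporal ID first, then QP) are all hypothetical names and choices introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InitSet:
    """One stored set of temporal initialization points plus its metadata."""
    slice_type: str                 # "I", "P", or "B"
    temporal_id: int
    qp: int
    context_values: Dict[str, int] = field(default_factory=dict)


def store_init_set(buffer: List[InitSet], new_set: InitSet, capacity: int) -> None:
    """Insert new_set into buffer, removing one stored set when required."""
    # Step 1: prefer replacing a stored set that has the same slice type.
    for i, stored in enumerate(buffer):
        if stored.slice_type == new_set.slice_type:
            del buffer[i]
            buffer.append(new_set)
            return
    # Step 2: no matching slice type and the buffer has room, so just add.
    if len(buffer) < capacity:
        buffer.append(new_set)
        return
    # Step 3: buffer is full. Remove the set whose slice or picture has the
    # smallest temporal ID; ties go to the smallest QP (assumed ordering).
    victim = min(range(len(buffer)),
                 key=lambda i: (buffer[i].temporal_id, buffer[i].qp))
    del buffer[victim]
    buffer.append(new_set)


buf: List[InitSet] = []
store_init_set(buf, InitSet("I", 0, 22), capacity=2)
store_init_set(buf, InitSet("B", 2, 30), capacity=2)
store_init_set(buf, InitSet("P", 1, 27), capacity=2)   # full: the I set is removed
print([s.slice_type for s in buf])
```

Matching on slice type first mirrors the prioritization in the text; swapping the comparison in Step 1 for a temporal ID or QP comparison would give the alternative ordering the text also mentions.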

As described in more detail, video encoder 200 and video decoder 300 may determine (e.g., select) at least one set of temporal initialization points from a plurality of sets of temporal initialization points based on the respective temporal identification (ID) values and/or quantization parameter (QP) values of the slices or pictures associated with the plurality of sets, and on the temporal ID value or QP value of the current picture or slice. In this way, the probability that temporal initialization points are unavailable for the current picture may be reduced (e.g., when a picture having a higher temporal ID value than the current picture is removed from the bitstream or is not processed).
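One way to picture this selection is sketched below. The text only states that the choice depends on temporal ID and QP values; the concrete rule used here (keep sets whose temporal ID does not exceed the current one, then take the closest QP) is an assumption made for illustration, as are the `select_init_set` name and the tuple layout.

```python
from typing import List, Optional, Tuple

# Hypothetical record: (temporal_id, qp, label of the stored set)
Stored = Tuple[int, int, str]


def select_init_set(stored: List[Stored],
                    cur_temporal_id: int,
                    cur_qp: int) -> Optional[Stored]:
    """Pick a stored set for initializing the current slice or picture."""
    # Sets from pictures with a higher temporal ID may be missing when those
    # pictures are dropped from the bitstream, so they are excluded here.
    usable = [s for s in stored if s[0] <= cur_temporal_id]
    if not usable:
        return None  # caller falls back to default initialization
    # Closest QP wins; ties go to the higher temporal ID (assumed rule).
    return min(usable, key=lambda s: (abs(s[1] - cur_qp), -s[0]))


sets = [(0, 22, "a"), (1, 27, "b"), (2, 32, "c")]
print(select_init_set(sets, cur_temporal_id=1, cur_qp=30))
```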

Video encoder 200 and video decoder 300 may perform block-based coding of pictures. The term "block" generally refers to a structure including data to be processed (e.g., encoded, decoded, or otherwise used in the encoding and/or decoding process). For example, a block may include a two-dimensional matrix of samples of luminance and/or chrominance data. In general, video encoder 200 and video decoder 300 may code video data represented in a YUV (e.g., Y, Cb, Cr) format. That is, rather than coding red, green, and blue (RGB) data for samples of a picture, video encoder 200 and video decoder 300 may code luminance and chrominance components, where the chrominance components may include both red hue and blue hue chrominance components. In some examples, video encoder 200 converts received RGB formatted data to a YUV representation prior to encoding, and video decoder 300 converts the YUV representation to the RGB format. Alternatively, pre-processing and post-processing units (not shown) may perform these conversions.
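As an illustration of the RGB to YUV conversion mentioned above, the following sketch converts one 8-bit RGB sample to Y, Cb, Cr. The text does not name a conversion matrix; the full-range BT.601 coefficients used here (the ones common in JFIF) and the `rgb_to_ycbcr` name are assumptions.

```python
from typing import Tuple


def _clip8(value: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, round(value)))


def rgb_to_ycbcr(r: int, g: int, b: int) -> Tuple[int, int, int]:
    """Convert one 8-bit RGB sample to full-range BT.601 Y, Cb, Cr."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b   # blue-difference chroma
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b   # red-difference chroma
    return _clip8(y), _clip8(cb), _clip8(cr)


print(rgb_to_ycbcr(255, 255, 255))  # a neutral sample has both chroma values at 128
```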

This disclosure may generally refer to coding (e.g., encoding and decoding) of pictures to include the process of encoding or decoding data of the picture. Similarly, this disclosure may refer to coding of blocks of a picture to include the process of encoding or decoding data for the blocks, e.g., prediction and/or residual coding. An encoded video bitstream generally includes a series of values for syntax elements representative of coding decisions (e.g., coding modes) and partitioning of pictures into blocks. Thus, references to coding a picture or a block should generally be understood as coding values for the syntax elements forming the picture or block.

HEVC defines various blocks, including coding units (CUs), prediction units (PUs), and transform units (TUs). According to HEVC, a video coder (such as video encoder 200) partitions a coding tree unit (CTU) into CUs according to a quadtree structure. That is, the video coder partitions CTUs and CUs into four equal, non-overlapping squares, and each node of the quadtree has either zero or four child nodes. Nodes without child nodes may be referred to as "leaf nodes," and CUs of such leaf nodes may include one or more PUs and/or one or more TUs. The video coder may further partition PUs and TUs. For example, in HEVC, a residual quadtree (RQT) represents the partitioning of TUs. In HEVC, PUs represent inter-prediction data, while TUs represent residual data. CUs that are intra-predicted include intra-prediction information, such as an intra-mode indication.
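The quadtree partitioning described above can be sketched as a short recursion. This is an illustration only: the `quadtree_split` name, the `(x, y, size)` tuple layout, and the `should_split` callback (standing in for whatever decision the encoder makes, or the split flags the decoder parses) are hypothetical.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square block, in samples


def quadtree_split(x: int, y: int, size: int, min_size: int,
                   should_split: Callable[[int, int, int], bool]) -> List[Block]:
    """Return the leaf blocks (CUs) obtained by recursively splitting a CTU."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves: List[Block] = []
        # Each split node has exactly four equal, non-overlapping children.
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    quadtree_split(x + dx, y + dy, half, min_size, should_split))
        return leaves
    return [(x, y, size)]  # leaf node: zero children


# Split only blocks anchored at the top-left corner, down to 16x16.
cus = quadtree_split(0, 0, 64, 16, lambda x, y, s: x == 0 and y == 0)
print(cus)
```

Whatever the split decisions are, the leaves tile the CTU exactly, so their areas always sum to the CTU area.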

As another example, video encoder 200 and video decoder 300 may be configured to operate according to VVC. According to VVC, a video coder (such as video encoder 200) partitions a picture into a plurality of coding tree units (CTUs). Video encoder 200 may partition a CTU according to a tree structure, such as a quadtree-binary tree (QTBT) structure or a Multi-Type Tree (MTT) structure. The QTBT structure removes the concept of multiple partition types, such as the separation between the CUs, PUs, and TUs of HEVC. A QTBT structure includes two levels: a first level partitioned according to quadtree partitioning, and a second level partitioned according to binary tree partitioning. A root node of the QTBT structure corresponds to a CTU. Leaf nodes of the binary trees correspond to coding units (CUs).

In an MTT partitioning structure, blocks may be partitioned using a quadtree (QT) partition, a binary tree (BT) partition, and one or more types of triple tree (TT) (also called ternary tree (TT)) partitions. A triple or ternary tree partition is a partition in which a block is split into three sub-blocks. In some examples, a triple or ternary tree partition divides a block into three sub-blocks without dividing the original block through the center. The partitioning types in MTT (e.g., QT, BT, and TT) may be symmetrical or asymmetrical.
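A ternary split that avoids cutting through the center can be sketched as below. The 1:2:1 ratio is the one VVC uses for its ternary splits; the text itself does not fix a ratio, so treat the ratio, the `ternary_split` name, and the `(x, y, width, height)` tuple layout as assumptions of this sketch.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in samples


def ternary_split(x: int, y: int, w: int, h: int, vertical: bool) -> List[Rect]:
    """Split a block into three sub-blocks in a 1:2:1 ratio.

    The two cut lines sit at one quarter and three quarters of the split
    dimension, so neither passes through the block center.
    """
    if vertical:                      # cut lines are vertical: widths 1:2:1
        q = w // 4
        return [(x, y, q, h), (x + q, y, w - 2 * q, h), (x + w - q, y, q, h)]
    q = h // 4                        # cut lines are horizontal: heights 1:2:1
    return [(x, y, w, q), (x, y + q, w, h - 2 * q), (x, y + h - q, w, q)]


print(ternary_split(0, 0, 32, 16, vertical=True))
```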

When operating according to the AV1 codec, video encoder 200 and video decoder 300 may be configured to code video data in blocks. In AV1, the largest coding block that can be processed is called a superblock. In AV1, a superblock can be either 128x128 luma samples or 64x64 luma samples. However, in successor video coding formats (e.g., AV2), a superblock may be defined by different (e.g., larger) luma sample sizes. In some examples, a superblock is the top level of a block quadtree. Video encoder 200 may further partition a superblock into smaller coding blocks. Video encoder 200 may partition a superblock and other coding blocks into smaller blocks using square or non-square partitioning. Non-square blocks may include N/2xN, NxN/2, N/4xN, and NxN/4 blocks. Video encoder 200 and video decoder 300 may perform separate prediction and transform processes on each of the coding blocks.

AV1 also defines tiles of video data. A tile is a rectangular array of superblocks that may be coded independently of other tiles. That is, video encoder 200 and video decoder 300 may encode and decode, respectively, coding blocks within a tile without using video data from other tiles. However, video encoder 200 and video decoder 300 may perform filtering across tile boundaries. Tiles may be uniform or non-uniform in size. Tile-based coding may enable parallel processing and/or multi-threading for encoder and decoder implementations.

In some examples, video encoder 200 and video decoder 300 may use a single QTBT or MTT structure to represent each of the luminance and chrominance components, while in other examples, video encoder 200 and video decoder 300 may use two or more QTBT or MTT structures, such as one QTBT/MTT structure for the luminance component and another QTBT/MTT structure for both chrominance components (or two QTBT/MTT structures for respective chrominance components).

Video encoder 200 and video decoder 300 may be configured to use quadtree partitioning, QTBT partitioning, MTT partitioning, superblock partitioning, or other partitioning structures.

In some examples, a CTU includes a coding tree block (CTB) of luma samples and two corresponding CTBs of chroma samples of a picture that has three sample arrays, or a CTB of samples of a monochrome picture or of a picture that is coded using three separate color planes and syntax structures used to code the samples. A CTB may be an NxN block of samples for some value of N such that the division of a component into CTBs is a partitioning. A component is an array or single sample from one of the three arrays (luma and two chroma) that compose a picture in 4:2:0, 4:2:2, or 4:4:4 color format, or the array or a single sample of the array that composes a picture in monochrome format. In some examples, a coding block is an MxN block of samples for some values of M and N such that the division of a CTB into coding blocks is a partitioning.

The blocks (e.g., CTUs or CUs) may be grouped in various ways in a picture. As one example, a brick may refer to a rectangular region of CTU rows within a particular tile in a picture. A tile may be a rectangular region of CTUs within a particular tile column and a particular tile row in a picture. A tile column refers to a rectangular region of CTUs having a height equal to the height of the picture and a width specified by syntax elements (e.g., such as in a picture parameter set). A tile row refers to a rectangular region of CTUs having a height specified by syntax elements (e.g., such as in a picture parameter set) and a width equal to the width of the picture.
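The relationship between tile columns, tile rows, and CTUs can be illustrated by mapping a CTU position to the tile that contains it. This is a sketch under assumptions: the `tile_index` name, expressing column widths and row heights in CTU units, and numbering tiles in raster-scan order are choices made here, not details taken from the text.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import Sequence


def tile_index(ctu_x: int, ctu_y: int,
               col_widths: Sequence[int], row_heights: Sequence[int]) -> int:
    """Return the raster-scan index of the tile containing CTU (ctu_x, ctu_y).

    col_widths and row_heights list the tile column widths and tile row
    heights in CTUs, as syntax elements would specify them.
    """
    col_bounds = list(accumulate(col_widths))   # exclusive right edges
    row_bounds = list(accumulate(row_heights))  # exclusive bottom edges
    if not (0 <= ctu_x < col_bounds[-1] and 0 <= ctu_y < row_bounds[-1]):
        raise ValueError("CTU position lies outside the picture")
    col = bisect_right(col_bounds, ctu_x)       # tile column holding ctu_x
    row = bisect_right(row_bounds, ctu_y)       # tile row holding ctu_y
    return row * len(col_widths) + col


# Picture of 10x6 CTUs with tile columns of 4, 4, 2 CTUs and tile rows of 3, 3.
print(tile_index(4, 3, col_widths=[4, 4, 2], row_heights=[3, 3]))
```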

In some examples, a tile may be partitioned into multiple bricks, each of which may include one or more CTU rows within the tile. A tile that is not partitioned into multiple bricks may also be referred to as a brick. However, a brick that is a true subset of a tile may not be referred to as a tile. The bricks in a picture may also be arranged in a slice. A slice may be an integer number of bricks of a picture that may be exclusively contained in a single network abstraction layer (NAL) unit. In some examples, a slice includes either a number of complete tiles or only a consecutive sequence of complete bricks of one tile.

This disclosure may use "NxN" and "N by N" interchangeably to refer to the sample dimensions of a block (such as a CU or other video block) in terms of vertical and horizontal dimensions, e.g., 16x16 samples or 16 by 16 samples. In general, a 16x16 CU has 16 samples in the vertical direction (y = 16) and 16 samples in the horizontal direction (x = 16). Likewise, an NxN CU generally has N samples in the vertical direction and N samples in the horizontal direction, where N represents a non-negative integer value. The samples in a CU may be arranged in rows and columns. Moreover, a CU need not have the same number of samples in the horizontal direction as in the vertical direction. For example, a CU may include NxM samples, where M is not necessarily equal to N.

Video encoder 200 encodes video data for CUs representing prediction and/or residual information, and other information. The prediction information indicates how the CU is to be predicted in order to form a prediction block for the CU. The residual information generally represents sample-by-sample differences between the samples of the CU prior to encoding and the prediction block.

To predict a CU, video encoder 200 may generally form a prediction block for the CU through inter prediction or intra prediction. Inter prediction generally refers to predicting the CU from data of a previously coded picture, whereas intra prediction generally refers to predicting the CU from previously coded data of the same picture. To perform inter prediction, video encoder 200 may generate the prediction block using one or more motion vectors. Video encoder 200 may generally perform a motion search to identify a reference block that closely matches the CU, e.g., in terms of differences between the CU and the reference block. Video encoder 200 may calculate a difference metric using a sum of absolute differences (SAD), sum of squared differences (SSD), mean absolute difference (MAD), mean squared differences (MSD), or other such difference calculations to determine whether a reference block closely matches the current CU. In some examples, video encoder 200 may predict the current CU using uni-directional prediction or bi-directional prediction.
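
As an illustration only, the difference metrics named above can be sketched as follows. The function names, the representation of a block as a list of rows, and the `best_reference` helper are assumptions made for this sketch; an actual motion search would be considerably more elaborate.

```python
def sad(block, ref):
    """Sum of absolute differences over two equally sized 2-D sample blocks."""
    return sum(abs(a - b) for ra, rb in zip(block, ref) for a, b in zip(ra, rb))


def ssd(block, ref):
    """Sum of squared differences over two equally sized 2-D sample blocks."""
    return sum((a - b) ** 2 for ra, rb in zip(block, ref) for a, b in zip(ra, rb))


def mad(block, ref):
    """Mean absolute difference: SAD divided by the number of samples."""
    return sad(block, ref) / sum(len(row) for row in block)


def msd(block, ref):
    """Mean squared differences: SSD divided by the number of samples."""
    return ssd(block, ref) / sum(len(row) for row in block)


def best_reference(block, candidates, metric=sad):
    """Index of the candidate reference block with the smallest metric value."""
    return min(range(len(candidates)), key=lambda i: metric(block, candidates[i]))
```

Here `best_reference` simply returns the candidate with the smallest metric; ties go to the earliest candidate, which is a choice of this sketch.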

Some examples of VVC also provide an affine motion compensation mode, which may be considered an inter prediction mode. In affine motion compensation mode, video encoder 200 may determine two or more motion vectors that represent non-translational motion, such as zooming in or out, rotation, perspective motion, or other irregular motion types.

To perform intra prediction, video encoder 200 may select an intra prediction mode to generate the prediction block. Some examples of VVC provide sixty-seven intra prediction modes, including various directional modes as well as planar mode and DC mode. In general, video encoder 200 selects an intra prediction mode that describes the neighboring samples of a current block (e.g., a block of a CU) from which the samples of the current block are predicted. Assuming that video encoder 200 codes CTUs and CUs in raster scan order (left to right, top to bottom), such samples may generally be above, above and to the left of, or to the left of the current block in the same picture as the current block.

Video encoder 200 encodes data representing the prediction mode for the current block. For example, for inter prediction modes, video encoder 200 may encode data indicating which of the various available inter prediction modes is used, as well as motion information for the corresponding mode. For uni-directional or bi-directional inter prediction, for example, video encoder 200 may encode motion vectors using advanced motion vector prediction (AMVP) or merge mode. Video encoder 200 may use similar modes to encode motion vectors for affine motion compensation mode.

AV1 includes two general techniques for encoding and decoding a coding block of video data. The two general techniques are intra prediction (e.g., intra-frame prediction or spatial prediction) and inter prediction (e.g., inter-frame prediction or temporal prediction). In the context of AV1, when blocks of a current frame of video data are predicted using an intra prediction mode, video encoder 200 and video decoder 300 do not use video data from other frames of video data. For most intra prediction modes, video encoder 200 encodes blocks of the current frame based on the difference between sample values in the current block and predicted values generated from reference samples in the same frame. Video encoder 200 determines the predicted values generated from the reference samples based on the intra prediction mode.

Following prediction, such as intra prediction or inter prediction of a block, video encoder 200 may calculate residual data for the block. The residual data, such as a residual block, represents sample-by-sample differences between the block and a prediction block for the block formed using the corresponding prediction mode. Video encoder 200 may apply one or more transforms to the residual block to produce transformed data in a transform domain instead of the sample domain. For example, video encoder 200 may apply a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform to the residual video data. Additionally, video encoder 200 may apply a secondary transform following the first transform, such as a mode-dependent non-separable secondary transform (MDNSST), a signal-dependent transform, a Karhunen-Loeve transform (KLT), or the like. Video encoder 200 produces transform coefficients following application of the one or more transforms.

As noted above, following any transforms to produce transform coefficients, video encoder 200 may perform quantization of the transform coefficients. Quantization generally refers to a process in which transform coefficients are quantized to possibly reduce the amount of data used to represent them, providing further compression. By performing the quantization process, video encoder 200 may reduce the bit depth associated with some or all of the transform coefficients. For example, video encoder 200 may round an n-bit value down to an m-bit value during quantization, where n is greater than m. In some examples, to perform quantization, video encoder 200 may perform a bitwise right shift of the value to be quantized.
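
A minimal sketch of quantization by a bitwise right shift, assuming the shift amount is simply n minus m. The function names, the sign handling (shifting the magnitude so that negative coefficients round toward zero), and the matching inverse are assumptions of this sketch, not details taken from the text.

```python
def quantize_by_shift(coeff, n_bits, m_bits):
    """Reduce an n-bit coefficient to an m-bit level with a right shift."""
    if n_bits <= m_bits:
        raise ValueError("n_bits must be greater than m_bits")
    shift = n_bits - m_bits
    sign = -1 if coeff < 0 else 1
    # Shift the magnitude so that positive and negative values are treated alike.
    return sign * (abs(coeff) >> shift)


def dequantize_by_shift(level, n_bits, m_bits):
    """Approximate inverse: scale the level back up by the same power of two."""
    return level * (1 << (n_bits - m_bits))
```

The low-order bits discarded by the shift are not recovered by the inverse, which is where the loss of precision comes from.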

Following quantization, video encoder 200 may scan the transform coefficients, producing a one-dimensional vector from the two-dimensional matrix that includes the quantized transform coefficients. The scan may be designed to place higher-energy (and therefore lower-frequency) transform coefficients at the front of the vector and lower-energy (and therefore higher-frequency) transform coefficients at the back of the vector. In some examples, video encoder 200 may use a predefined scan order to scan the quantized transform coefficients to produce a serialized vector, and then entropy encode the quantized transform coefficients of the vector. In other examples, video encoder 200 may perform an adaptive scan. After scanning the quantized transform coefficients to form the one-dimensional vector, video encoder 200 may entropy encode the one-dimensional vector, e.g., according to context-adaptive binary arithmetic coding (CABAC). Video encoder 200 may also entropy encode values for syntax elements describing metadata associated with the encoded video data, for use by video decoder 300 in decoding the video data.
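
The idea of serializing a two-dimensional coefficient matrix so that low-frequency coefficients come first can be sketched with a simple diagonal scan. This particular order (anti-diagonals starting at the top-left corner, each traversed from top to bottom) is an assumption chosen for illustration and is not the scan order of any specific standard.

```python
def diagonal_scan(matrix):
    """Serialize a 2-D coefficient matrix along its anti-diagonals.

    Positions are visited in order of increasing (row + column), so the
    top-left (low-frequency) coefficients land at the front of the vector.
    """
    rows, cols = len(matrix), len(matrix[0])
    order = sorted(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: (rc[0] + rc[1], rc[0]),
    )
    return [matrix[r][c] for r, c in order]
```

For a typical quantized block, the resulting vector ends in a run of zeros, which an entropy coder can represent compactly.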

To perform CABAC, video encoder 200 may assign one or more context values of a context within a context model to a symbol to be transmitted. The context value may relate to, for example, whether neighboring values of the symbol are zero-valued. The probability determination (e.g., the context value) may be based on the context assigned to the symbol.

Video encoder 200 may further generate syntax data for video decoder 300, such as block-based syntax data, picture-based syntax data, and sequence-based syntax data, e.g., in a picture header, a block header, a slice header, or other syntax data such as a sequence parameter set (SPS), picture parameter set (PPS), or video parameter set (VPS). Video decoder 300 may likewise decode such syntax data to determine how to decode the corresponding video data.

In this manner, video encoder 200 may generate a bitstream that includes encoded video data, e.g., syntax elements describing the partitioning of a picture into blocks (e.g., CUs) and prediction and/or residual information for the blocks. Ultimately, video decoder 300 may receive the bitstream and decode the encoded video data.

In general, video decoder 300 performs a process reciprocal to that performed by video encoder 200 to decode the encoded video data of the bitstream. For example, video decoder 300 may use CABAC to decode values for syntax elements of the bitstream in a manner substantially similar, albeit reciprocal, to the CABAC encoding process of video encoder 200. The syntax elements may define partitioning information for the partitioning of a picture into CTUs, and for the partitioning of each CTU according to a corresponding partition structure, such as a QTBT structure, to define the CUs of the CTU. The syntax elements may further define prediction information and residual information for blocks (e.g., CUs) of video data.

The residual information may be represented by, for example, quantized transform coefficients. Video decoder 300 may inverse quantize and inverse transform the quantized transform coefficients of a block to reproduce a residual block for the block. Video decoder 300 uses the signaled prediction mode (intra prediction or inter prediction) and the related prediction information (e.g., motion information for inter prediction) to form a prediction block for the block. Video decoder 300 may then combine the prediction block and the residual block, on a sample-by-sample basis, to reproduce the original block. Video decoder 300 may perform additional processing, such as a deblocking process to reduce visual artifacts along block boundaries.
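
The sample-by-sample combination of a prediction block and a residual block can be sketched as follows. Clipping the sum to the valid sample range for a given bit depth is an assumption added here, as are the function and parameter names.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Add a residual block to a prediction block, sample by sample.

    Each sum is clipped to [0, 2**bit_depth - 1]; the clipping step is an
    assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]
```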

This disclosure may generally refer to "signaling" certain information, such as syntax elements. The term "signaling" may generally refer to the communication of values for syntax elements and/or other data used to decode encoded video data. That is, video encoder 200 may signal values for syntax elements in the bitstream. In general, signaling refers to generating a value in the bitstream. As noted above, source device 102 may transport the bitstream to target device 116 substantially in real time, or not in real time, such as might occur when storing syntax elements to storage device 112 for later retrieval by target device 116.

A method of using CABAC (context-adaptive binary arithmetic coding) initialization points from a picture or slice that is previous in encoding/decoding order for CABAC initialization of the current picture or slice is described in the following document: JVET-Y0181, "AHG12: CABAC initialization from the previous inter slice," Seregin et al., Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29, 25th Meeting, by teleconference, 12-21 January 2022. Although described with respect to CABAC, the example techniques are also applicable to context-based arithmetic coding in general. For example, CABAC arithmetic coding may require a starting point for each context, and this starting point is referred to as an initialization point. The starting point (e.g., the initialization point) may include one or more context states, window or adaptation-rate parameters, and other parameters needed for the arithmetic coding operation. For example, there may be one or more initialization points for the one or more context values of a context, and the one or more initialization points may be values (e.g., initial values) for the one or more context values.

In video codecs, these initialization points are usually predefined and known to both video encoder 200 and video decoder 300. For example, in HEVC and VVC, the initialization points are defined in initialization tables for each slice type: I slices, P slices, and B slices.

In temporal CABAC initialization, in addition to or instead of using the predefined initialization points, initialization points may be stored at a particular CTU of a picture or slice, and the stored initialization points may be used to initialize the next picture or slice. That is, in some examples, the temporal initialization points come from the video data of one or more previous slices or pictures that precede the current slice or picture in coding order, and are for one or more contexts used in context-based arithmetic coding of the video data of the current slice or picture.

The CTU at which video encoder 200 or video decoder 300 stores the initialization points may vary, and information indicating the CTU may be signaled. As one example, video encoder 200 and video decoder 300 may store the initialization points of a slice in the middle of the picture. That is, when video encoder 200 encodes, or video decoder 300 decodes, a CTU in the middle of the slice, video encoder 200 or video decoder 300 may store the current one or more context values of a context as initialization points, and these initialization points are used for encoding or decoding syntax elements in a subsequent slice or picture. For example, when video encoder 200 or video decoder 300 starts encoding or decoding the subsequent slice or picture, it may set the initial values of the context values of the context equal to, or based on (e.g., through mapping, scaling, weighting, and so on), the stored initialization points. Then, while encoding or decoding the subsequent slice or picture, video encoder 200 or video decoder 300 may update the context values from the initial values based on the most recently encoded or decoded video data.
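
The storing step described above can be sketched as a snapshot of the context values taken at a designated CTU. Everything in this sketch is hypothetical: contexts are modeled as a plain dictionary, `adapt` stands in for whatever per-CTU coding updates the context values, and the default of storing at the middle CTU follows the example in the text.

```python
def code_slice(num_ctus, contexts, adapt, store_ctu=None):
    """Code a slice CTU by CTU and snapshot the contexts at one CTU.

    `adapt(ctu_index, contexts)` is a stand-in for coding one CTU, which
    updates the context values in place. Returns the stored snapshot, or
    None if the designated CTU was never reached.
    """
    if store_ctu is None:
        store_ctu = num_ctus // 2  # assumption: store at the middle CTU
    stored = None
    for ctu in range(num_ctus):
        adapt(ctu, contexts)
        if ctu == store_ctu:
            stored = dict(contexts)  # copy so later updates do not alter it
    return stored


def init_from_stored(stored, defaults):
    """Initial context values for the next slice: stored points if any,
    otherwise the predefined defaults."""
    return dict(stored) if stored is not None else dict(defaults)
```

Setting the initial values equal to the stored points is only one of the options the text mentions; a mapping, scaling, or weighting step could replace the plain copy in `init_from_stored`.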

In some examples, video encoder 200 and video decoder 300 may store, as the initialization points, the CABAC states, windows, and other parameters as they stand after a particular CTU has been coded. That is, for a certain CTU, the CABAC states, windows, and other parameters used for initialization are stored after that CTU is coded. Parameters that have adapted to the coded content of the previous picture may better represent a starting initialization point for the video data of the current picture than the predefined initialization points do.

The storing may be done separately for each slice type and slice quantization parameter (QP). In some examples, a stored initialization with the same slice type and the same QP as the current slice may be used for CABAC initialization of the current slice.
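
One way to picture storage that is separate per slice type and QP is a table keyed by the pair (slice type, QP). The class and method names here are hypothetical, and keeping only the most recent entry per key is an assumption of this sketch.

```python
class InitPointStore:
    """Stores one set of initialization points per (slice_type, qp) pair."""

    def __init__(self):
        self._table = {}

    def store(self, slice_type, qp, contexts):
        # A later store with the same key replaces the earlier one.
        self._table[(slice_type, qp)] = dict(contexts)

    def lookup(self, slice_type, qp):
        """Return the set stored for exactly this slice type and QP, or None."""
        found = self._table.get((slice_type, qp))
        return dict(found) if found is not None else None
```

A caller that receives None from `lookup` would fall back to the predefined initialization points or to one of the search strategies described later.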

Temporal initialization points for context-based arithmetic coding may present certain issues. For example, slices of the current picture may be decoded independently, and pictures with a temporal identification (ID) value above a threshold may be removed as part of temporal scalability. The techniques that use temporal initialization points may therefore be refined to allow temporal scalability while ensuring that usable temporal initialization points remain available.

In some examples, multiple initialization points from previous pictures may be stored in coding order, for example in a first-in, first-out (FIFO) buffer in which the top initialization point comes from the previous picture closest to the current picture. For example, the buffer may store a first set of initialization points for the one or more context values of the contexts of a first slice or picture, a second set of initialization points for the one or more context values of the contexts of a second slice or picture, and so on.

An index may be introduced to indicate which initialization point from the FIFO buffer is used, and this initialization index may be signaled in the bitstream (e.g., in a picture header or slice header). For example, video encoder 200 may signal, and video decoder 300 may receive, an index into the buffer that indicates which set of initialization points is used to initialize the one or more context values of the contexts of the slice or picture currently being decoded. Based on the index, video decoder 300 may retrieve the set of initialization points and initialize the one or more context values of the contexts. For example, video decoder 300 may set the initial values of the one or more context values equal to the retrieved set of initialization points. As another example, video decoder 300 may map, scale, weight, or perform some other operation on the retrieved set of initialization points to determine the initial values of the one or more context values.
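
The FIFO buffer and the signaled index can be sketched with Python's `collections.deque`. The class name, the capacity parameter, and the convention that index 0 is the most recently stored set are assumptions made for illustration.

```python
from collections import deque


class InitPointFifo:
    """FIFO of initialization-point sets; index 0 is the most recent set."""

    def __init__(self, capacity):
        # With maxlen set, appending on the left drops the oldest set
        # from the right once the buffer is full.
        self._sets = deque(maxlen=capacity)

    def push(self, init_set):
        self._sets.appendleft(dict(init_set))

    def get(self, index):
        """Return the set selected by an index such as one signaled in a header."""
        return dict(self._sets[index])

    def __len__(self):
        return len(self._sets)
```

A decoder following this sketch would call `get` with the index parsed from the picture or slice header and then copy, map, scale, or weight the returned values to obtain the initial context values.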

The following describes temporal scalability, and how temporal scalability may be supported when temporal initialization points are used. Each coded picture may have a temporal ID value assigned by video encoder 200. Video encoder 200 may signal the temporal ID value in a network abstraction layer unit (NALU) header. The temporal ID value is used for temporal scalability, in which certain pictures may be ignored (e.g., removed from the bitstream or not processed) while the other pictures can still be decoded without the pictures that were removed or not processed.

In one example, temporal scalability may be achieved by imposing a restriction that a picture with a lower temporal ID value cannot use a picture with a higher temporal ID value for inter prediction. In this case, video decoder 300 can decode pictures with lower temporal ID values without using pictures with higher temporal ID values, because pictures with higher temporal ID values cannot be used for inter prediction of pictures with lower temporal ID values. Therefore, pictures with higher temporal ID values may be removed from the bitstream or not processed (e.g., ignored). When temporal initialization for context-based arithmetic coding is applied, video encoder 200 and video decoder 300 may use the temporal ID value to identify the stored initialization points that are available for initialization of the current slice or picture.
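
The restriction and the resulting ability to drop higher temporal layers can be sketched as follows. Pictures are modeled as dictionaries with hypothetical keys `id`, `tid` (temporal ID), and `refs` (IDs of reference pictures); none of these names come from the text.

```python
def respects_temporal_layers(pictures):
    """True if no picture references a picture with a higher temporal ID."""
    tid_of = {p["id"]: p["tid"] for p in pictures}
    return all(tid_of[r] <= p["tid"] for p in pictures for r in p["refs"])


def extract_sublayers(pictures, max_tid):
    """Keep only pictures whose temporal ID does not exceed max_tid."""
    return [p for p in pictures if p["tid"] <= max_tid]
```

If `respects_temporal_layers` holds for the full sequence, every reference of a kept picture is also kept by `extract_sublayers`, which is why the remaining pictures stay decodable.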

In one or more examples, video encoder 200 and video decoder 300 may store a temporal ID value together with the initialization points. For example, each set of initialization points (where a set includes one or more initialization points) may be associated with a slice or picture. Video encoder 200 and video decoder 300 may store a set of initialization points together with information indicating the temporal ID value of the slice or picture associated with that set. As one example, a first set of initialization points may be associated with a first slice or picture having a first temporal ID value, and a second set of initialization points may be associated with a second slice or picture having a second temporal ID value. Video encoder 200 and video decoder 300 may store the first set of initialization points and the first temporal ID value, along with information indicating that the first temporal ID value is for the first slice or picture associated with the first set (e.g., associating the first temporal ID value with the first set of initialization points). Likewise, video encoder 200 and video decoder 300 may store the second set of initialization points and the second temporal ID value, along with information indicating that the second temporal ID value is for the second slice or picture associated with the second set (e.g., associating the second temporal ID value with the second set of initialization points).

To determine (e.g., select) initialization points based on the temporal ID values of the stored sets of initialization points, video encoder 200 and video decoder 300 may compare those temporal ID values with the temporal ID of the current slice or picture. For example, video encoder 200 and video decoder 300 may select at least one set of temporal initialization points from a plurality of sets of temporal initialization points based on the respective temporal ID values associated with the plurality of sets and the temporal ID value of the current picture or slice.

Sets of initialization points whose temporal ID is equal to or smaller than the current temporal ID are selected as the available points. For example, to determine the at least one set of temporal initialization points, video encoder 200 and video decoder 300 may determine the sets of temporal initialization points, among the plurality of sets, whose respective temporal ID values are less than or equal to the temporal ID value of the current picture or slice. Video encoder 200 and video decoder 300 may then select a set of temporal initialization points from among the sets so determined.

In one example, only sets of initialization points with the same temporal ID as the current slice or picture are selected as available initialization points. For example, because pictures with smaller temporal IDs are generally coded with a lower QP, initialization points from a previous picture with a lower temporal ID value may not properly represent the initialization points for a slice of the current picture. In some examples, to determine the at least one set of temporal initialization points, video encoder 200 and video decoder 300 may determine a group of sets of temporal initialization points, among the plurality of sets, whose respective temporal ID values are equal to the temporal ID value of the current picture or slice. Video encoder 200 and video decoder 300 may select the at least one set of temporal initialization points from that group.
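
The two availability rules described above (temporal ID less than or equal to the current one, or exactly equal to it) can be sketched as a filter. Stored sets are modeled as dictionaries with a hypothetical `tid` key, and the `same_tid_only` flag is an assumption used to switch between the two rules.

```python
def available_sets(stored, current_tid, same_tid_only=False):
    """Filter stored initialization-point sets by temporal ID.

    With same_tid_only=False, sets with tid <= current_tid are available;
    with same_tid_only=True, only sets with tid == current_tid are.
    """
    if same_tid_only:
        return [s for s in stored if s["tid"] == current_tid]
    return [s for s in stored if s["tid"] <= current_tid]
```

Either rule excludes sets from pictures with a higher temporal ID, so the selection does not depend on pictures that temporal scaling may have removed.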

In another example, a search is made for initialization points with the same temporal ID. If none are available, the next smaller temporal ID is checked (e.g., the current temporal ID value minus 1); if that is also unavailable, the current temporal ID value minus 2 is checked, and so on. The set of initialization points that is found is used for initialization.

For example, to determine the at least one set of temporal initialization points, video encoder 200 and video decoder 300 may determine that no set among the plurality of sets of temporal initialization points has a temporal ID value equal to the temporal ID value of the current picture or slice. Based on that determination, video encoder 200 and video decoder 300 may determine whether any set among the plurality of sets has a temporal ID value that is one less than the temporal ID value of the current picture or slice. Based on a determination that one or more sets have a temporal ID value that is one less than the temporal ID value of the current picture or slice, video encoder 200 and video decoder 300 may select the at least one set of temporal initialization points from among those one or more sets.
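
The stepwise search over decreasing temporal IDs can be sketched as follows. The representation of stored sets as dictionaries with a `tid` key, the assumption that the list is ordered from most recent to oldest, and the choice to stop at temporal ID 0 are all assumptions of this sketch.

```python
def find_by_tid_fallback(stored, current_tid):
    """Search for a set with the current temporal ID, then tid - 1, and so on.

    `stored` is assumed to be ordered from most recent to oldest, so the
    most recent set at the first matching temporal ID is returned.
    Returns None if no set with tid <= current_tid exists.
    """
    for tid in range(current_tid, -1, -1):
        for s in stored:
            if s["tid"] == tid:
                return s
    return None
```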

Similarly, instead of using only an initialization with the same QP, when such an initialization is not available, a search is made for QP-1 or QP+1, and the stored initialization is used if found. If not, QP-2 or QP+2 is checked, and so on. The result that is found is used for initialization. The temporal ID search and the QP search may be combined when an initialization point with the same temporal ID and the same QP is not available.
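
A minimal sketch of the widening QP search. The stored sets are modeled as dictionaries with a hypothetical `qp` key; checking the lower QP before the higher one at each distance and bounding the search with `max_delta` are assumptions, since the text does not specify a tie-break or a limit.

```python
def find_by_qp_fallback(stored, current_qp, max_delta=4):
    """Search for a set with the current QP, then QP-1/QP+1, QP-2/QP+2, ...

    `stored` is assumed to be ordered from most recent to oldest; for each
    QP value only the most recent set is considered.
    """
    by_qp = {}
    for s in stored:
        by_qp.setdefault(s["qp"], s)  # keep the first (most recent) per QP
    for delta in range(max_delta + 1):
        for qp in (current_qp - delta, current_qp + delta):
            if qp in by_qp:
                return by_qp[qp]
    return None
```

A combined search, as the text suggests, could apply this function to the subset of stored sets already filtered by temporal ID.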

For example, in one or more examples, video encoder 200 and video decoder 300 may store QP values together with the initialization points. For example, each set of initialization points (where a set includes one or more initialization points) may be associated with a slice or picture. Video encoder 200 and video decoder 300 may store the set of initialization points together with information indicating the QP value of the slice or picture associated with that set. As one example, a first set of initialization points may be associated with a first slice or picture having a first QP value, and a second set of initialization points may be associated with a second slice or picture having a second QP value. Video encoder 200 and video decoder 300 may store the first set of initialization points and the first QP value, along with information that the first QP value is for the first slice or picture associated with the first set (e.g., associating the first QP value with the first set of initialization points). Video encoder 200 and video decoder 300 may likewise store the second set of initialization points and the second QP value, along with information that the second QP value is for the second slice or picture associated with the second set (e.g., associating the second QP value with the second set of initialization points).

In one example, only sets of initialization points having the same QP value as the QP value of the current slice or picture are selected as available initialization points. In some examples, to determine the at least one set of temporal initialization points, video encoder 200 and video decoder 300 may determine a group of temporal initialization points within the plurality of sets of temporal initialization points, where the respective QP value of each temporal initialization point in the group is equal to the QP value of the current picture or slice. Video encoder 200 and video decoder 300 may select the at least one set of temporal initialization points from that group.

In another example, an initialization point with the same QP value is searched for. If that initialization point is not available, the nearest QP values are checked, for example the current QP value plus 1 or minus 1. If those are not available, the current QP value plus 2 or minus 2 is checked, and so on. The set of initialization points that is found is used for initialization.

For example, to determine the at least one set of temporal initialization points, video encoder 200 and video decoder 300 may determine that no temporal initialization point in the plurality of sets of temporal initialization points has a QP value equal to the QP value of the current picture or slice. Based on that determination, video encoder 200 and video decoder 300 may determine whether any temporal initialization point in the plurality of sets has a QP value that is one less than or one greater than the QP value of the current picture or slice. Based on determining that one or more temporal initialization points have a QP value that is one less than or one greater than the QP value of the current picture or slice, video encoder 200 and video decoder 300 may select the at least one set of temporal initialization points from among the one or more sets whose QP value is one less than or one greater than that of the current picture or slice.
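As a non-limiting illustration, the widening QP search described above might be sketched as follows. The `InitEntry` type, the `find_by_nearest_qp` function, the `max_delta` bound, and the choice to try the lower QP before the higher QP at each distance are assumptions made for this sketch; the description does not specify them.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InitEntry:
    """Hypothetical buffer entry: one stored set of initialization points."""
    qp: int
    points: tuple  # stand-in for the stored context states


def find_by_nearest_qp(entries: Sequence[InitEntry], current_qp: int,
                       max_delta: int = 63) -> Optional[InitEntry]:
    """Return the stored entry whose QP is closest to current_qp, or None.

    delta == 0 is the exact-match case; larger deltas check QP-delta and
    then QP+delta (this tie-break order is an assumption of the sketch).
    """
    for delta in range(max_delta + 1):
        # dict.fromkeys removes the duplicate candidate when delta == 0.
        for candidate_qp in dict.fromkeys((current_qp - delta, current_qp + delta)):
            for entry in entries:
                if entry.qp == candidate_qp:
                    return entry
    return None
```

With stored QPs of 30 and 35, a current QP of 32 would select the entry stored for QP 30, and a current QP of 33 would select the entry stored for QP 35.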

Accordingly, in one or more examples, video encoder 200 and video decoder 300 may be configured to determine (e.g., select) a set of temporal initialization points stored in the buffer and, based on the selected set of temporal initialization points, initialize one or more context values of at least one context used for encoding or decoding a subsequent slice or picture. Video encoder 200 and video decoder 300 may perform context-based arithmetic encoding or decoding on the subsequent slice or picture. For example, video encoder 200 and video decoder 300 may set initial values for the context values based on the initialization points and use the initial values to encode or decode one or more syntax values of the subsequent slice or picture. Video encoder 200 and video decoder 300 may then update the context values starting from the initial values.

One example way of selecting the set of temporal initialization points may include video encoder 200 and video decoder 300 determining a temporal identification value for the subsequent slice or picture. Video encoder 200 and video decoder 300 may determine that none of the two or more slices or pictures having associated sets of temporal initialization points stored in the buffer has a temporal identification value equal to the temporal identification value of the subsequent slice or picture. In this case, video encoder 200 and video decoder 300 may determine, from among the two or more slices or pictures, the slice or picture whose temporal identification value is closest to and less than the temporal identification value of the subsequent slice or picture. To select the set of temporal initialization points, video encoder 200 and video decoder 300 may select the set associated with the determined slice or picture, that is, the slice or picture whose temporal identification value is closest to and less than that of the subsequent slice or picture.
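As a non-limiting illustration, the temporal identification fallback described above might be sketched as follows. The `(temporal_id, label)` tuple layout and the `select_by_temporal_id` name are hypothetical and are used only for this sketch.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical entry layout: (temporal_id, label for the stored set).
Entry = Tuple[int, str]


def select_by_temporal_id(entries: Sequence[Entry],
                          current_tid: int) -> Optional[Entry]:
    """Prefer an entry with the same temporal ID; otherwise fall back to the
    closest smaller temporal ID. Entries with a larger temporal ID are never
    selected."""
    for tid in range(current_tid, -1, -1):
        for entry in entries:
            if entry[0] == tid:
                return entry
    return None
```

For stored temporal IDs 0, 2, and 4, a subsequent slice with temporal ID 3 would use the entry stored for temporal ID 2.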

Another example way of selecting the set of temporal initialization points may include video encoder 200 and video decoder 300 determining a QP value for the subsequent slice or picture. Video encoder 200 and video decoder 300 may determine that none of the two or more slices or pictures having associated sets of temporal initialization points stored in the buffer has a QP value equal to the QP value of the subsequent slice or picture. In this case, video encoder 200 and video decoder 300 may determine, from among the two or more slices or pictures, the slice or picture whose QP value is closest to the QP value of the subsequent slice or picture. In this case, the closest QP value may be greater than the QP value of the current slice or picture. To select the set of temporal initialization points, video encoder 200 and video decoder 300 may select the set of initialization points associated with the QP value closest to the QP value of the subsequent slice or picture.

In the examples above, video encoder 200 and video decoder 300 may determine the set of initialization points associated with the slice or picture whose temporal identification value is closest to, and not greater than, the temporal identification value of the subsequent slice or picture, and/or the set of initialization points associated with the slice or picture whose QP value is closest to the QP value of the subsequent slice or picture. In some examples, video encoder 200 and video decoder 300 may also take the slice type into account when determining which set of temporal initialization points to use. For example, the set of temporal initialization points may be associated with the same slice type as the slice type of the subsequent slice.
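As a non-limiting illustration, one possible way of combining the slice type, temporal identification, and QP criteria is sketched below. The priority order (slice type first, then closest temporal ID not greater than the target, then closest QP) is an assumption of this sketch, as are the `InitEntry` and `select_entry` names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InitEntry:
    """Hypothetical buffer entry describing the slice a stored set came from."""
    slice_type: str   # "I", "P" or "B"
    temporal_id: int
    qp: int


def select_entry(entries: Sequence[InitEntry], slice_type: str,
                 temporal_id: int, qp: int) -> Optional[InitEntry]:
    """Pick a stored set for a subsequent slice, or None if nothing qualifies."""
    # Keep only entries of the same slice type whose temporal ID does not
    # exceed the temporal ID of the subsequent slice.
    candidates = [e for e in entries
                  if e.slice_type == slice_type and e.temporal_id <= temporal_id]
    if not candidates:
        return None
    # Closest temporal ID wins; the QP distance breaks ties.
    return min(candidates,
               key=lambda e: (temporal_id - e.temporal_id, abs(qp - e.qp)))
```

A return value of `None` corresponds to the case in which no suitable entry exists and a default (non-temporal) initialization would be used instead.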

The storage of initialization points is described below. As noted above, slices of the same picture may be decoded independently (i.e., a later slice of a picture cannot depend on an earlier slice of the same picture). Therefore, initialization points stored from an earlier slice of a picture cannot be used for any slice of that same picture.

To achieve this, in one example, the initialization points (e.g., a set of initialization points, which may include one or more initialization points) are stored temporarily in a temporary buffer and are added from the temporary buffer to the storage buffer only after all slices of the same picture have been processed (encoded, decoded, parsed). In this way, the initialization points in the temporary buffer may be updated until all slices of the same picture have been processed, after which the storage buffer receives the initialization points from the temporary buffer.

When initialization points are added, previously stored initialization points may be deleted or replaced because the buffer is limited in size. If the update took place while slices of the same picture were still being processed, appropriate initialization points from a previous picture could be replaced by those from an earlier slice of the same picture. That is, without the temporary buffer, initialization points that should be kept in the storage buffer might be overwritten. Using the temporary buffer avoids overwriting initialization points that should be kept, because the storage buffer is overwritten with the initialization points from the temporary buffer only after the slices of the same picture have been processed.

Accordingly, there may be benefits to storing the initialization points separately in a temporary buffer and using the temporary buffer to update the storage buffer only after all slices of the picture have been processed. To detect the end of the picture, it is possible to check whether the CTU address is equal to that of the last CTU, or whether the slice index is that of the last slice of the picture.

For example, video encoder 200 and video decoder 300 may store, in a first buffer, one or more temporal initialization points of one or more contexts used in context-based arithmetic coding of video data of one or more previous pictures. Video encoder 200 and video decoder 300 may store, in a second buffer, temporal initialization points (e.g., a set of temporal initialization points) of one or more contexts used in context-based arithmetic coding of video data of a slice of the current picture, where storing in the second buffer includes storing in the second buffer during coding of the video data of the current picture. After processing the last coding tree unit (CTU) or slice of the current picture, video encoder 200 and video decoder 300 may store the temporal initialization points held in the second buffer into the first buffer.
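As a non-limiting illustration, the two-buffer arrangement described above might be sketched as follows. The `InitPointStore` class, its method names, and the keying of entries by a `(slice type, temporal ID, QP)` tuple are assumptions of this sketch. Capacity limits and entry removal are omitted for brevity.

```python
class InitPointStore:
    """Hypothetical two-buffer store for sets of temporal initialization points."""

    def __init__(self):
        self.storage = {}   # first buffer: sets from previously coded pictures
        self.pending = {}   # second buffer: sets from slices of the current picture

    def lookup(self, key):
        """Read only from the first buffer, so that a slice never sees points
        produced by an earlier slice of the same picture."""
        return self.storage.get(key)

    def record_slice(self, key, points, is_last_slice_of_picture):
        """Store the points of a just-coded slice in the second buffer and
        commit the second buffer to the first at the end of the picture."""
        self.pending[key] = points
        if is_last_slice_of_picture:
            self.storage.update(self.pending)
            self.pending.clear()
```

In this sketch the caller supplies `is_last_slice_of_picture`, which could be derived, for example, from the CTU address or the slice index as described above.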

In another example, the initialization points may be signaled in an adaptation parameter set (APS), for example similar to the adaptive loop filter parameter set that carries filter coefficients. The APS may already have restrictions defined for handling temporal ID values and for not applying an APS derived from the current picture to the coding of that same picture.

In one or more examples, the initialization points may be stored at various picture locations (e.g., at the center or at the end of the picture). Other locations in the picture may also be used.

The selection of which storage location to use may be signaled in the bitstream. In one example, video encoder 200 may signal such an indication, and video decoder 300 may parse such an indication, in the picture or slice header, in any other parameter set, or elsewhere. The picture header may be one example signaling location because the picture header is shared by all slices of the picture, and picture-level adaptation may be achieved without signaling the storage selection in every slice.

There may be multiple syntax elements in the picture or slice header for temporal CABAC, for example whether temporal initialization is applied to the picture or slice, storage location signaling, how entries are removed from the stored initialization buffer, and so on. Such syntax elements signaled in the picture or slice header may be conditioned on a higher-level indication, for example syntax signaled in the SPS or PPS indicating whether temporal CABAC initialization is used.

The signaling of syntax elements in the picture or slice header may also depend on whether the buffer contains initialization entries that can be used for the slice. If the buffer contains no such entries, temporal initialization may not be applied, and the syntax elements associated with temporal initialization may not be signaled in the slice or picture header.

Entry identification may be performed by comparing the temporal ID and/or QP value of the slice to be coded with the values of the entries in the buffer. That is, in one or more examples, video decoder 300 may compare the temporal identification value and/or QP value of the slice or picture being decoded with the temporal identification value and/or QP value associated with each set of temporal initialization points, in order to select the set of initialization points used to initialize the context values of the contexts used for encoding or decoding the slice or picture.

In some examples, for implementation purposes, the initialization storage may be limited to a certain number of entries, because storing initialization data (e.g., initialization points, although other stored information is also possible) for every temporal ID and QP may be expensive, for example requiring a large buffer size. For example, the buffer size for each slice type may be limited to N. In this case, when the buffer is full (i.e., when all N entries have been added to the buffer), one entry should be removed before the next entry is added. In one example, the number N may be set equal to 5, because that represents typical GOP 32 coding.

Entry removal may be performed by video encoder 200 and video decoder 300 according to a specific rule based on the temporal ID values and/or QP values of the buffered entries. For example, the rule may be as follows: remove the entry with the smallest temporal ID, and/or remove the entry with the smallest temporal ID and the smallest QP.

In some examples, if there is an entry for a set of temporal initialization points associated with a slice having the same slice type as the subsequent slice, video encoder 200 and video decoder 300 may remove that entry, even if the buffer contains an entry for a set of initialization points associated with a slice or picture having a smaller temporal ID or a smaller QP. In some examples, the entry with the smallest temporal ID and/or smallest QP may be for a set of temporal initialization points associated with a slice whose slice type differs from the slice type of the subsequent slice.

In some examples, there may be multiple entries for one slice type. For example, the buffer may store up to 5 sets of temporal initialization points for I slices, up to 5 sets for P slices, and up to 5 sets for B slices. In such examples, video encoder 200 and video decoder 300 may first determine the sets of temporal initialization points associated with the same slice type as the slice type of the subsequent slice. Video encoder 200 and video decoder 300 may then determine, within those sets, the set having the smallest temporal identification value and/or QP value, and determine that this set should be removed from among the sets having the same slice type as the subsequent slice.
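As a non-limiting illustration, per-slice-type removal with a limit of N = 5 entries might be sketched as follows. The `InitEntry` type, the `add_entry` function, and the use of the smallest QP as a tie-break among entries with the smallest temporal ID are assumptions of this sketch, which reflects only one of the example rules described above.

```python
from dataclasses import dataclass
from typing import List

MAX_ENTRIES_PER_SLICE_TYPE = 5  # the example value N = 5 from the description


@dataclass(frozen=True)
class InitEntry:
    """Hypothetical buffer entry describing the slice a stored set came from."""
    slice_type: str   # "I", "P" or "B"
    temporal_id: int
    qp: int


def add_entry(buffer: List[InitEntry], new: InitEntry,
              capacity: int = MAX_ENTRIES_PER_SLICE_TYPE) -> None:
    """Append `new`; if its slice type already holds `capacity` entries,
    first remove the entry of that slice type with the smallest temporal ID
    (ties broken by the smallest QP)."""
    same_type = [e for e in buffer if e.slice_type == new.slice_type]
    if len(same_type) >= capacity:
        victim = min(same_type, key=lambda e: (e.temporal_id, e.qp))
        buffer.remove(victim)
    buffer.append(new)
```

Entries of other slice types are left untouched, even if they have a smaller temporal ID or QP than the removed entry.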

The entry with the smallest QP may be removed because the smallest-QP slice has more transform coefficients (less quantization), so the contexts can adapt at the beginning of slice coding and the remainder of the slice will be coded efficiently. A slice with a higher QP has fewer transform coefficients, and context adaptation may be slower, so fewer blocks of the slice will be coded efficiently.

In other words, having temporal initialization points allows the context values to be initialized, after which they may be adapted as part of encoding or decoding. For a slice whose context values can adapt relatively quickly (e.g., a slice with a lower QP value), having temporal initialization points may have some benefit. However, that benefit may be reduced, because the context values of a slice with a lower QP value can adapt relatively quickly even without proper initialization. For a slice whose context values do not adapt relatively quickly (e.g., a slice with a higher QP value), properly initializing the context values may have more benefit.

As one example, assume that the context values of a first slice with a lower QP value tend to adapt quickly, while the context values of a second slice with a higher QP value tend not to adapt quickly. If temporal initialization points are available for the first slice, there may be some benefit in speeding up the adaptation of the context values. However, that benefit may be modest, because the context values of the first slice tend to adapt quickly even without proper initialization.

If temporal initialization points are available for the second slice, there may be a greater benefit in speeding up the adaptation of the context values than for the first slice. For example, by initializing the context values of the second slice, the adaptation of the context values starts from values closer to the final values, so the blocks in the second slice are compressed better than they would be if temporal initialization points were not available.

As noted above, slices with higher QP values tend to be slices whose context values adapt more slowly. Therefore, if the temporal initialization points of slices with higher QP values are stored in the buffer, those temporal initialization points will be available for future slices for which the benefit of having temporal initialization points is more pronounced. As a result, keeping the initialization entries with higher QP may provide efficient coding for future pictures.

Similarly, entries with lower temporal IDs may be removed because slices with lower temporal IDs are typically coded with smaller QPs. For example, as above, having temporal initialization points allows the context values to be initialized, after which they may be adapted as part of encoding or decoding. For a slice whose context values can adapt relatively quickly (e.g., a slice with a lower temporal identification value), having temporal initialization points may have some benefit. However, that benefit may be reduced, because the context values of a slice with a lower temporal identification value can adapt relatively quickly even without proper initialization. For a slice whose context values do not adapt relatively quickly (e.g., a slice with a higher temporal identification value), properly initializing the context values may have more benefit.

As one example, assume that the context values of a first slice with a lower temporal identification value tend to adapt quickly, while the context values of a second slice with a higher temporal identification value tend not to adapt quickly. If temporal initialization points are available for the first slice, there may be some benefit in speeding up the adaptation of the context values. However, that benefit may be modest, because the context values of the first slice tend to adapt quickly even without proper initialization.

If temporal initialization points are available for the second slice, there may be a greater benefit in speeding up the adaptation of the context values than for the first slice. For example, by initializing the context values of the second slice, the adaptation of the context values starts from values closer to the final values, so the blocks in the second slice are compressed better than they would be if temporal initialization points were not available.

As noted above, slices with higher temporal identification values tend to be slices whose context values adapt more slowly. Therefore, if the temporal initialization points of slices with higher temporal identification values are stored in the buffer, those temporal initialization points will be available for future slices for which the benefit of having temporal initialization points is more pronounced. As a result, keeping the initialization entries with higher temporal IDs may provide efficient coding for future pictures.

Other rules based on temporal ID values, QP values, or slice types are possible, and this disclosure is not limited to the example rules for temporal ID values, QP values, or slice types. The selection of the rule may be signaled in the bitstream in the picture or slice header, in any other parameter set, or elsewhere.

Accordingly, in one or more examples, video encoder 200 and video decoder 300 may be configured to process video data. To process the video data, video encoder 200 and video decoder 300 may be configured to determine one or more context values of at least one context used for encoding or decoding the current slice or picture.

Video encoder 200 and video decoder 300 may determine that the buffer used to store sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full (e.g., a buffer that can store N entries holds N entries). As noted above, each set of temporal initialization points is associated with a slice or picture among the two or more slices or pictures and includes one or more temporal initialization points.

Video encoder 200 and video decoder 300 may determine (e.g., identify), from among the two or more slices or pictures, a first set of temporal initialization points associated with a slice or picture, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of that slice or picture. Video encoder 200 and video decoder 300 may remove from the buffer the first set of temporal initialization points associated with that slice or picture, and store in the buffer a second set of temporal initialization points associated with the current slice or picture. The second set of temporal initialization points may be based on the determined one or more context values (e.g., the second set is equal to the one or more context values of the context, or is generated from the one or more context values of the context).

As one example, to determine the first set of temporal initialization points (e.g., the set of temporal initialization points to be removed), video encoder 200 and video decoder 300 may determine, from among the two or more slices or pictures, the first set of temporal initialization points associated with the slice or picture having at least one of the smallest temporal identification value or the smallest quantization parameter (QP) value. For example, video encoder 200 and video decoder 300 may determine the slice or picture having the smallest temporal identification value among the temporal identification values of the two or more slices or pictures, or may determine the slice or picture having the smallest QP value among the QP values of the two or more slices or pictures. In some cases, the slice associated with the first set of temporal initialization points may have a slice type that differs from the slice type of the current slice.

In some examples, if the buffer is full, or possibly even if the buffer is not full, video encoder 200 and video decoder 300 may first remove duplicate entries before removing the set of initialization points associated with the slice or picture having the smallest temporal identification value or QP value. For example, if the buffer is full, or possibly even if the buffer is not full, before storing in the buffer the set of initialization points associated with the current slice or picture, video encoder 200 and video decoder 300 may determine whether the buffer contains any set of initialization points associated with a slice or picture that has the same temporal identification value or QP value as the current slice or picture, and/or has the same slice type.

If the buffer contains a set of initialization points associated with a slice or picture that has the same temporal identification value, QP value, or slice type as the current slice or picture, video encoder 200 and video decoder 300 may remove that set of initialization points, even if there are other sets of initialization points associated with slices or pictures having lower temporal identification values or QP values. If the buffer contains no set of initialization points associated with a slice or picture that has the same temporal identification value, QP value, or slice type as the current slice or picture, video encoder 200 and video decoder 300 may remove the set of initialization points having the smallest temporal identification value or QP value.
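As a non-limiting illustration, the duplicate-first removal described above might be sketched as follows. Treating an entry as a duplicate only when slice type, temporal identification value, and QP value all match is an assumption of this sketch (the description permits other combinations), as are the `InitEntry` and `store_entry` names and the single shared capacity.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InitEntry:
    """Hypothetical buffer entry: slice metadata plus the stored points."""
    slice_type: str
    temporal_id: int
    qp: int
    points: tuple  # stand-in for the stored context states


def store_entry(buffer: List[InitEntry], new: InitEntry, capacity: int = 5) -> None:
    """Store `new`, replacing a duplicate if one exists; otherwise, when the
    buffer is full, remove the entry with the smallest (temporal ID, QP)."""
    key = (new.slice_type, new.temporal_id, new.qp)
    duplicate = next(
        (e for e in buffer if (e.slice_type, e.temporal_id, e.qp) == key), None)
    if duplicate is not None:
        # A duplicate is replaced even if entries with a lower temporal ID
        # or QP are present, and even if the buffer is not full.
        buffer.remove(duplicate)
    elif len(buffer) >= capacity:
        buffer.remove(min(buffer, key=lambda e: (e.temporal_id, e.qp)))
    buffer.append(new)
```

Replacing the duplicate keeps the most recent context statistics for a given combination without discarding entries for other combinations.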

In some examples, the removed set of initialization points having the smallest temporal identification value or QP value may have a slice type that differs from that of the current slice. In some examples, such as where multiple sets of temporal initialization points may be stored per slice type, video encoder 200 and video decoder 300 may determine a group of sets of temporal initialization points associated with slices having the same slice type as the current slice. Video encoder 200 and video decoder 300 may then determine the set of temporal initialization points within that group having the smallest temporal identification value or QP value as the set of temporal initialization points to be removed.

Accordingly, video encoder 200 and video decoder 300 may determine that at least one of the temporal ID value or the QP value of the current slice or picture is different from the temporal ID value or QP value of each of the two or more slices or pictures that have associated sets of temporal initialization points stored in the buffer. In such examples, video encoder 200 and video decoder 300 may remove the first set of temporal initialization points based on that determination. For example, the temporal ID value or QP value of the slice or picture associated with the first set of temporal initialization points is different from the temporal ID value or QP value of the current slice or picture.

For example, assume the current slice or picture is a first slice or picture. In this example, video encoder 200 and video decoder 300 may determine one or more context values for at least one context used for encoding or decoding a second slice or picture. Video encoder 200 and video decoder 300 may determine, from among the two or more slices or pictures, a third slice or picture having at least one of a temporal ID value or a QP value that is the same as the temporal ID value or QP value of the second slice or picture. In this example, video encoder 200 and video decoder 300 may remove, from the buffer, a third set of temporal initialization points associated with the third slice or picture, and store, in the buffer, a fourth set of temporal initialization points associated with the second slice or picture, where the fourth set is based on the one or more context values determined for the at least one context used for encoding or decoding the second slice or picture.

An initialization point may include multiple parameters, such as multiple context states and multiple adaptation rates or adaptation windows (which indicate how quickly a context state can adapt after each bin is coded). The storage needed for the initialization buffer may be reduced by storing quantized values of these parameters, which reduces the dynamic range of possible values and so may require fewer bits per parameter. In one example, video encoder 200 and video decoder 300 may store the sum of the states and the sum of the adaptation rates. Video encoder 200 and video decoder 300 may assign the state value as the sum of the states divided by the number of states (the average state), and may assign the adaptation rate (adaptation window) value as the sum of the adaptation rates divided by the number of adaptation rates (the average adaptation rate).
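The sum-based storage described above may be illustrated with the following sketch. The function names and the dictionary layout are hypothetical, and the use of integer (floor) division for the average is an assumption, since the description does not state a rounding rule.

```python
def pack_init_point(states, rates):
    """Store only the sums (and counts) of a context's states and adaptation rates."""
    return {
        "state_sum": sum(states), "num_states": len(states),
        "rate_sum": sum(rates), "num_rates": len(rates),
    }


def unpack_init_point(packed):
    """Assign every state the average state and every rate the average rate."""
    # Assumption: floor division; a real codec may round differently.
    avg_state = packed["state_sum"] // packed["num_states"]
    avg_rate = packed["rate_sum"] // packed["num_rates"]
    return ([avg_state] * packed["num_states"],
            [avg_rate] * packed["num_rates"])
```

For a context with two states of 100 and 200 and two rates of 4 and 7, this sketch restores both states to 150 and both rates to 5, so the individual values are traded for a smaller stored footprint.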

In another example, video encoder 200 and video decoder 300 may store only some of the parameters (e.g., only some of the initialization points). In such examples, video encoder 200 and video decoder 300 may initialize the other parameters from default values. For example, only one state and one adaptation rate are stored. When initialization is performed, the stored values are assigned to the first state and the first adaptation rate, respectively, and the second state and the second adaptation rate are assigned from default initialization values that may already be stored in the codec (e.g., video encoder 200 and video decoder 300), such as the initialization points stored for each of I, P, and B slices described above. Other value-packing mechanisms may also be applied to the state and adaptation rate values.
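A minimal sketch of this partial storage, assuming a context with two states and two adaptation rates, is given below. The function name init_from_partial and its parameters are hypothetical; the default lists stand in for the per-slice-type default initialization values.

```python
def init_from_partial(stored_state, stored_rate, default_states, default_rates):
    """Initialize a context from one stored state/rate plus default values.

    The stored values fill the first state and first adaptation rate;
    all remaining entries come from the defaults.
    """
    states = [stored_state] + list(default_states[1:])
    rates = [stored_rate] + list(default_rates[1:])
    return states, rates
```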

Multiple initialization points are described below. Multiple initialization points may be stored per picture. That is, for a picture or slice there may be a set of initialization points, where the set includes one or more initialization points. In one example, multiple initialization points may be used when more than one slice is used in a picture. For example, one initialization point may be stored for each slice.

Slices have relative positions within a picture, so the initialization point is stored from the previous picture at a location corresponding to the position of the current slice. In one example, an initialization point is stored at the middle of each slice of the previous picture and is used to initialize the corresponding current slice.

The slice boundaries of the current picture and the previous picture need not be aligned. In one example, the location at which an initialization point is stored is driven by the slice boundaries of the current picture.

In another example, an initialization point is stored at a particular location for each slice of the current picture, and subsequent pictures may decide which initialization point to use according to particular rules. For example, such a rule may be to check the coordinates at which an initialization point was stored and determine whether that location belongs to the current slice; if so, that initialization point may be used. In another example, the initialization points for each slice may be stored in the buffer, and an index is signaled for the current slice to identify which initialization point to use.
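The coordinate-based rule described above may be sketched as follows. The function name select_init_point is hypothetical, and representing a slice as an axis-aligned rectangle is a simplifying assumption; actual slices are defined in terms of CTUs and need not be rectangular.

```python
def select_init_point(stored, slice_rect):
    """Return the first stored initialization point whose location lies in the slice.

    stored:     list of ((x, y), init_point) pairs from a previous picture.
    slice_rect: (x0, y0, x1, y1), with x1 and y1 exclusive (assumed rectangle).
    Returns None when no stored location falls inside the current slice.
    """
    x0, y0, x1, y1 = slice_rect
    for (x, y), init_point in stored:
        if x0 <= x < x1 and y0 <= y < y1:
            return init_point
    return None
```

When the function returns None, a codec could fall back to its default initialization; that fallback is likewise an assumption of this sketch.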

For example, video encoder 200 and video decoder 300 may store a previous temporal initialization point for one or more contexts used in context-based arithmetic coding of video data of a previous slice in a previous picture. For a current slice, video encoder 200 and video decoder 300 may determine that the current slice has a position in the current picture that corresponds to the position of the previous slice in the previous picture. Based on that correspondence, video encoder 200 and video decoder 300 may determine a current temporal initialization point for the current slice based on the previous temporal initialization point.

FIG. 2 is a block diagram illustrating an example video encoder 200 that may perform the techniques of this disclosure. FIG. 2 is provided for purposes of explanation and should not be considered limiting of the techniques as broadly exemplified and described in this disclosure. For purposes of explanation, this disclosure describes video encoder 200 according to the techniques of VVC (ITU-T H.266, under development) and HEVC (ITU-T H.265). However, the techniques of this disclosure may be performed by video encoding devices that are configured for other video coding standards and video coding formats, such as AV1 and successors to the AV1 video coding format.

In the example of FIG. 2, video encoder 200 includes video data memory 230, mode selection unit 202, residual generation unit 204, transform processing unit 206, quantization unit 208, inverse quantization unit 210, inverse transform processing unit 212, reconstruction unit 214, filter unit 216, decoded picture buffer (DPB) 218, and entropy encoding unit 220. Any or all of video data memory 230, mode selection unit 202, residual generation unit 204, transform processing unit 206, quantization unit 208, inverse quantization unit 210, inverse transform processing unit 212, reconstruction unit 214, filter unit 216, DPB 218, and entropy encoding unit 220 may be implemented in one or more processors or in processing circuitry. For example, the units of video encoder 200 may be implemented as one or more circuits or logic elements, as part of hardware circuitry, or as part of a processor, ASIC, or FPGA. Moreover, video encoder 200 may include additional or alternative processors or processing circuitry to perform these and other functions.

Video data memory 230 may store video data to be encoded by the components of video encoder 200. Video encoder 200 may receive the video data stored in video data memory 230 from, for example, video source 104 (FIG. 1). DPB 218 may act as a reference picture memory that stores reference video data for use by video encoder 200 in predicting subsequent video data. Video data memory 230 and DPB 218 may be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. Video data memory 230 and DPB 218 may be provided by the same memory device or by separate memory devices. In various examples, video data memory 230 may be on-chip with other components of video encoder 200, as illustrated, or off-chip relative to those components.

In this disclosure, reference to video data memory 230 should not be interpreted as being limited to memory internal to video encoder 200, unless specifically described as such, or to memory external to video encoder 200, unless specifically described as such. Rather, reference to video data memory 230 should be understood as reference to memory that stores video data that video encoder 200 receives for encoding (e.g., video data for a current block that is to be encoded). Memory 106 of FIG. 1 may also provide temporary storage of outputs from the various units of video encoder 200.

The various units of FIG. 2 are illustrated to assist with understanding the operations performed by video encoder 200. The units may be implemented as fixed-function circuits, programmable circuits, or a combination thereof. Fixed-function circuits refer to circuits that provide particular functionality and are preset in the operations that they can perform. Programmable circuits refer to circuits that can be programmed to perform various tasks and that provide flexible functionality in the operations that they can perform. For instance, programmable circuits may execute software or firmware that causes the programmable circuits to operate in the manner defined by the instructions of the software or firmware. Fixed-function circuits may execute software instructions (e.g., to receive parameters or output parameters), but the types of operations that the fixed-function circuits perform are generally immutable. In some examples, one or more of the units may be distinct circuit blocks (fixed-function or programmable), and in some examples, one or more of the units may be integrated circuits.

Video encoder 200 may include arithmetic logic units (ALUs), elementary function units (EFUs), digital circuits, analog circuits, and/or programmable cores formed from programmable circuits. In examples where the operations of video encoder 200 are performed using software executed by the programmable circuits, memory 106 (FIG. 1) may store the instructions (e.g., object code) of the software that video encoder 200 receives and executes, or another memory within video encoder 200 (not shown) may store such instructions.

Video data memory 230 is configured to store received video data. Video encoder 200 may retrieve a picture of the video data from video data memory 230 and provide the video data to residual generation unit 204 and mode selection unit 202. The video data in video data memory 230 may be raw video data that is to be encoded.

Mode selection unit 202 includes motion estimation unit 222, motion compensation unit 224, and intra-prediction unit 226. Mode selection unit 202 may include additional functional units to perform video prediction in accordance with other prediction modes. As examples, mode selection unit 202 may include a palette unit, an intra-block copy unit (which may be part of motion estimation unit 222 and/or motion compensation unit 224), an affine unit, a linear model (LM) unit, or the like.

Mode selection unit 202 generally coordinates multiple encoding passes to test combinations of encoding parameters and the resulting rate-distortion values for such combinations. The encoding parameters may include partitioning of CTUs into CUs, prediction modes for the CUs, transform types for residual data of the CUs, quantization parameters for residual data of the CUs, and so on. Mode selection unit 202 may ultimately select the combination of encoding parameters having rate-distortion values that are better than those of the other tested combinations.

Video encoder 200 may partition a picture retrieved from video data memory 230 into a series of CTUs and encapsulate one or more CTUs within a slice. Mode selection unit 202 may partition a CTU of the picture in accordance with a tree structure, such as the MTT structure, QTBT structure, superblock structure, or quadtree structure described above. As described above, video encoder 200 may form one or more CUs from partitioning a CTU according to the tree structure. Such a CU may also be referred to generally as a "video block" or "block."

In general, mode selection unit 202 also controls its components (e.g., motion estimation unit 222, motion compensation unit 224, and intra-prediction unit 226) to generate a prediction block for a current block (e.g., a current CU or, in HEVC, the overlapping portion of a PU and a TU). For inter-prediction of a current block, motion estimation unit 222 may perform a motion search to identify one or more closely matching reference blocks in one or more reference pictures (e.g., one or more previously coded pictures stored in DPB 218). In particular, motion estimation unit 222 may calculate a value representing how similar a potential reference block is to the current block, e.g., according to a sum of absolute differences (SAD), a sum of squared differences (SSD), a mean absolute difference (MAD), a mean squared difference (MSD), or the like. Motion estimation unit 222 may generally perform these calculations using sample-by-sample differences between the current block and the reference block being considered. Motion estimation unit 222 may identify the reference block having the lowest value resulting from these calculations, which indicates the reference block that most closely matches the current block.
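By way of illustration only, the SAD-based selection of the closest reference block may be sketched as follows. The function names are hypothetical, blocks are represented as lists of rows of integer sample values, and an exhaustive comparison over a supplied candidate list stands in for whatever search strategy an actual encoder uses.

```python
def sad(block_a, block_b):
    """Sum of absolute sample-by-sample differences between two equal-size blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def best_reference_block(current, candidates):
    """Return the index of the candidate reference block with the lowest SAD."""
    return min(range(len(candidates)),
               key=lambda i: sad(current, candidates[i]))
```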

Motion estimation unit 222 may form one or more motion vectors (MVs) that define the positions of the reference blocks in the reference pictures relative to the position of the current block in the current picture. Motion estimation unit 222 may then provide the motion vectors to motion compensation unit 224. For example, for uni-directional inter-prediction, motion estimation unit 222 may provide a single motion vector, whereas for bi-directional inter-prediction, motion estimation unit 222 may provide two motion vectors. Motion compensation unit 224 may then generate a prediction block using the motion vectors. For example, motion compensation unit 224 may retrieve data of the reference block using the motion vector. As another example, if the motion vector has fractional sample precision, motion compensation unit 224 may interpolate values for the prediction block according to one or more interpolation filters. Moreover, for bi-directional inter-prediction, motion compensation unit 224 may retrieve data for the two reference blocks identified by the respective motion vectors and combine the retrieved data, e.g., through sample-by-sample averaging or weighted averaging.
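The sample-by-sample averaging used to combine two reference blocks may be sketched as follows. The function name bi_predict is hypothetical, and rounding half up by adding 1 before the right shift is an assumption of this sketch; weighted averaging is not shown.

```python
def bi_predict(ref_block_0, ref_block_1):
    """Combine two reference blocks by sample-by-sample averaging.

    Assumption: integer samples, rounded with (a + b + 1) >> 1.
    """
    return [[(a + b + 1) >> 1 for a, b in zip(row0, row1)]
            for row0, row1 in zip(ref_block_0, ref_block_1)]
```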

When operating according to the AV1 video coding format, motion estimation unit 222 and motion compensation unit 224 may be configured to encode coding blocks of video data (e.g., both luma and chroma coding blocks) using translational motion compensation, affine motion compensation, overlapped block motion compensation (OBMC), and/or compound inter-intra prediction.

As another example, for intra-prediction, or intra-prediction coding, intra-prediction unit 226 may generate the prediction block from samples neighboring the current block. For example, for directional modes, intra-prediction unit 226 may generally mathematically combine values of neighboring samples and populate these calculated values in the defined direction across the current block to produce the prediction block. As another example, for DC mode, intra-prediction unit 226 may calculate an average of the neighboring samples to the current block and generate the prediction block to include this resulting average for each sample of the prediction block.
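The DC mode described above may be sketched as follows. The function name dc_predict is hypothetical; using the row above and the column to the left as the neighboring samples, and rounding the average to the nearest integer, are assumptions of this sketch rather than the rules of any particular standard.

```python
def dc_predict(above, left, width, height):
    """Fill a width x height prediction block with the average of the neighbors.

    above: samples in the row directly above the current block.
    left:  samples in the column directly left of the current block.
    """
    neighbors = list(above) + list(left)
    # Rounded integer average of all neighboring samples.
    dc = (sum(neighbors) + len(neighbors) // 2) // len(neighbors)
    return [[dc] * width for _ in range(height)]
```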

When operating according to the AV1 video coding format, intra-prediction unit 226 may be configured to encode coding blocks of video data (e.g., both luma and chroma coding blocks) using directional intra prediction, non-directional intra prediction, recursive filter intra prediction, chroma-from-luma (CFL) prediction, intra block copy (IBC), and/or color palette mode. Mode selection unit 202 may include additional functional units to perform video prediction in accordance with other prediction modes.

Mode selection unit 202 provides the prediction block to residual generation unit 204. Residual generation unit 204 receives a raw, unencoded version of the current block from video data memory 230 and the prediction block from mode selection unit 202. Residual generation unit 204 calculates sample-by-sample differences between the current block and the prediction block. The resulting sample-by-sample differences define a residual block for the current block. In some examples, residual generation unit 204 may also determine differences between sample values in the residual block to generate a residual block using residual differential pulse code modulation (RDPCM). In some examples, residual generation unit 204 may be formed using one or more subtractor circuits that perform binary subtraction.
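The residual computation, followed by an optional RDPCM step, may be sketched as follows. The function names are hypothetical, and applying RDPCM along the horizontal direction (each sample minus its left neighbor) is an assumption chosen for illustration; a vertical variant would difference against the sample above.

```python
def residual_block(current, prediction):
    """Sample-by-sample difference between the current block and the prediction block."""
    return [[c - p for c, p in zip(row_c, row_p)]
            for row_c, row_p in zip(current, prediction)]


def rdpcm_horizontal(residual):
    """Replace each residual sample by its difference from the sample to its left.

    The first sample in each row is kept unchanged (assumed horizontal RDPCM).
    """
    return [[row[i] - (row[i - 1] if i > 0 else 0) for i in range(len(row))]
            for row in residual]
```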

In examples where mode selection unit 202 partitions CUs into PUs, each PU may be associated with a luma prediction unit and corresponding chroma prediction units. Video encoder 200 and video decoder 300 may support PUs having various sizes. As indicated above, the size of a CU may refer to the size of the luma coding block of the CU, and the size of a PU may refer to the size of a luma prediction unit of the PU. Assuming that the size of a particular CU is 2Nx2N, video encoder 200 may support PU sizes of 2Nx2N or NxN for intra-prediction, and symmetric PU sizes of 2Nx2N, 2NxN, Nx2N, NxN, or similar for inter-prediction. Video encoder 200 and video decoder 300 may also support asymmetric partitioning for PU sizes of 2NxnU, 2NxnD, nLx2N, and nRx2N for inter-prediction.

In examples where mode selection unit 202 does not further partition a CU into PUs, each CU may be associated with a luma coding block and corresponding chroma coding blocks. As above, the size of a CU may refer to the size of the luma coding block of the CU. Video encoder 200 and video decoder 300 may support CU sizes of 2Nx2N, 2NxN, or Nx2N.

For other video coding techniques, such as intra-block copy mode coding, affine-mode coding, and linear model (LM) mode coding, as some examples, mode selection unit 202 generates a prediction block for the current block being encoded via the respective units associated with the coding techniques. In some examples, such as palette mode coding, mode selection unit 202 may not generate a prediction block and may instead generate syntax elements that indicate the manner in which to reconstruct the block based on a selected palette. In such modes, mode selection unit 202 may provide these syntax elements to entropy encoding unit 220 to be encoded.

As described above, residual generation unit 204 receives the video data for the current block and the corresponding prediction block. Residual generation unit 204 then generates a residual block for the current block. To generate the residual block, residual generation unit 204 calculates sample-by-sample differences between the prediction block and the current block.

Transform processing unit 206 applies one or more transforms to the residual block to generate a block of transform coefficients (referred to herein as a "transform coefficient block"). Transform processing unit 206 may apply various transforms to a residual block to form the transform coefficient block. For example, transform processing unit 206 may apply a discrete cosine transform (DCT), a directional transform, a Karhunen-Loeve transform (KLT), or a conceptually similar transform to a residual block. In some examples, transform processing unit 206 may perform multiple transforms on a residual block, e.g., a primary transform and a secondary transform, such as a rotational transform. In some examples, transform processing unit 206 does not apply transforms to a residual block.

When operating according to AV1, transform processing unit 206 may apply one or more transforms to the residual block to generate a block of transform coefficients (referred to herein as a "transform coefficient block"). Transform processing unit 206 may apply various transforms to a residual block to form the transform coefficient block. For example, transform processing unit 206 may apply a horizontal/vertical transform combination that may include a discrete cosine transform (DCT), an asymmetric discrete sine transform (ADST), a flipped ADST (e.g., an ADST in reverse order), and an identity transform (IDTX). When using an identity transform, the transform is skipped in one of the vertical or horizontal directions. In some examples, transform processing may be skipped.

Quantization unit 208 may quantize the transform coefficients in a transform coefficient block to produce a quantized transform coefficient block. Quantization unit 208 may quantize transform coefficients of a transform coefficient block according to a quantization parameter (QP) value associated with the current block. Video encoder 200 (e.g., via mode selection unit 202) may adjust the degree of quantization applied to the transform coefficient blocks associated with the current block by adjusting the QP value associated with the CU. Quantization may introduce loss of information, and thus, quantized transform coefficients may have lower precision than the original transform coefficients produced by transform processing unit 206.
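The relationship between the QP value and the loss of precision may be illustrated with the following simplified sketch. The function names are hypothetical. The step size formula 2^((QP - 4) / 6), under which the step doubles every 6 QP values, is the commonly cited approximation for HEVC-style quantizers and is used here as an assumption; practical codecs use integer scaling tables and shifts rather than floating-point arithmetic.

```python
def quant_step(qp):
    """Approximate quantization step size (assumed HEVC-style: doubles every 6 QP)."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeffs, qp):
    """Map transform coefficients to integer levels, rounding magnitudes half up."""
    step = quant_step(qp)
    return [[(-1 if c < 0 else 1) * int(abs(c) / step + 0.5) for c in row]
            for row in coeffs]


def dequantize(levels, qp):
    """Scale integer levels back to (lossy) reconstructed coefficients."""
    step = quant_step(qp)
    return [[level * step for level in row] for row in levels]
```

With QP 22 the step in this sketch is 8, so the coefficients 64, -20, and 3 become the levels 8, -3, and 0 and are reconstructed as 64, -24, and 0, which shows the information lost for the last two coefficients.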

Inverse quantization unit 210 and inverse transform processing unit 212 may apply inverse quantization and inverse transforms, respectively, to a quantized transform coefficient block to reconstruct a residual block from the transform coefficient block. Reconstruction unit 214 may produce a reconstructed block corresponding to the current block (albeit potentially with some degree of distortion) based on the reconstructed residual block and a prediction block generated by mode selection unit 202. For example, reconstruction unit 214 may add samples of the reconstructed residual block to corresponding samples from the prediction block generated by mode selection unit 202 to produce the reconstructed block.
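The sample-wise addition performed during reconstruction may be sketched as follows. The function name reconstruct_block is hypothetical, and clipping the result to the valid sample range for a given bit depth is an assumption of this sketch that the passage above does not itself describe.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add reconstructed residual samples to prediction samples, then clip.

    Assumption: results are clipped to [0, 2**bit_depth - 1].
    """
    max_val = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_val) for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(prediction, residual)]
```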

Filter unit 216 may perform one or more filter operations on reconstructed blocks. For example, filter unit 216 may perform deblocking operations to reduce blockiness artifacts along edges of CUs. Operations of filter unit 216 may be skipped in some examples.

When operating according to AV1, filter unit 216 may perform one or more filter operations on reconstructed blocks. For example, filter unit 216 may perform deblocking operations to reduce blockiness artifacts along edges of CUs. In other examples, filter unit 216 may apply a constrained directional enhancement filter (CDEF), which may be applied after deblocking and may include the application of non-separable, non-linear, low-pass directional filters based on estimated edge directions. Filter unit 216 may also include a loop restoration filter, which is applied after CDEF and may include a separable symmetric normalized Wiener filter or a dual self-guided filter.

Video encoder 200 stores reconstructed blocks in DPB 218. For instance, in examples where operations of filter unit 216 are not performed, reconstruction unit 214 may store reconstructed blocks to DPB 218. In examples where operations of filter unit 216 are performed, filter unit 216 may store the filtered reconstructed blocks to DPB 218. Motion estimation unit 222 and motion compensation unit 224 may retrieve a reference picture from DPB 218, formed from the reconstructed (and potentially filtered) blocks, to inter-predict blocks of subsequently encoded pictures. In addition, intra-prediction unit 226 may use reconstructed blocks of the current picture in DPB 218 to intra-predict other blocks in the current picture.

In general, entropy encoding unit 220 may entropy encode syntax elements received from other functional components of video encoder 200. For example, entropy encoding unit 220 may entropy encode quantized transform coefficient blocks from quantization unit 208. As another example, entropy encoding unit 220 may entropy encode prediction syntax elements (e.g., motion information for inter prediction or intra-mode information for intra prediction) from mode selection unit 202. Entropy encoding unit 220 may perform one or more entropy encoding operations on the syntax elements, which are another example of video data, to generate entropy-encoded data. For example, entropy encoding unit 220 may perform a context-adaptive variable length coding (CAVLC) operation, a CABAC operation, a variable-to-variable (V2V) length coding operation, a syntax-based context-adaptive binary arithmetic coding (SBAC) operation, a probability interval partitioning entropy (PIPE) coding operation, an Exponential-Golomb encoding operation, or another type of entropy encoding operation on the data. In some examples, entropy encoding unit 220 may operate in a bypass mode in which syntax elements are not entropy encoded.

Video encoder 200 may output a bitstream that includes the entropy-encoded syntax elements needed to reconstruct blocks of a slice or picture. In particular, entropy encoding unit 220 may output the bitstream.

In accordance with AV1, entropy encoding unit 220 may be configured as a symbol-to-symbol adaptive multi-symbol arithmetic coder. A syntax element in AV1 includes an alphabet of N elements, and a context (e.g., a probability model) includes a set of N probabilities. Entropy encoding unit 220 may store the probabilities as n-bit (e.g., 15-bit) cumulative distribution functions (CDFs). Entropy encoding unit 220 may update the contexts by performing recursive scaling with an update factor based on the alphabet size.
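As a rough illustration of the recursive scaling described above, the following Python sketch adapts a 15-bit CDF toward the symbol that was just coded. It is a simplified sketch, not the normative AV1 procedure: the names `CDF_ONE`, `adaptation_rate`, and `update_cdf` are hypothetical, and the rate rule (a shift that grows with alphabet size) is an assumption chosen only to show an update factor that depends on the alphabet size.

```python
CDF_ONE = 1 << 15  # 15-bit probability precision


def adaptation_rate(alphabet_size):
    """Hypothetical rule: larger alphabets use a larger shift (slower adaptation)."""
    return 3 + min(alphabet_size.bit_length() - 1, 2)


def update_cdf(cdf, symbol):
    """Return a new CDF with probability mass moved toward `symbol`.

    cdf[i] holds P(X <= i) scaled to CDF_ONE; the last entry stays at CDF_ONE.
    """
    rate = adaptation_rate(len(cdf))
    updated = list(cdf)
    for i in range(len(cdf) - 1):
        if i >= symbol:
            # Entries at or above the coded symbol move up toward CDF_ONE.
            updated[i] += (CDF_ONE - updated[i]) >> rate
        else:
            # Entries below the coded symbol move down toward zero.
            updated[i] -= updated[i] >> rate
    return updated


uniform = [8192, 16384, 24576, 32768]  # four equally likely symbols
adapted = update_cdf(uniform, symbol=1)
print(adapted)
```

Each call scales the distance between every CDF entry and its target by the same factor, so repeated calls with the same symbol steadily raise that symbol's probability while the CDF stays monotonic.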

The operations described above are described with respect to a block. Such description should be understood as applying to a luma coding block and/or chroma coding blocks. As described above, in some examples, the luma coding block and the chroma coding blocks are the luma and chroma components of a CU. In some examples, the luma coding block and the chroma coding blocks are the luma and chroma components of a PU.

In some examples, operations performed for a luma coding block need not be repeated for the chroma coding blocks. As one example, the operations that identify a motion vector (MV) and a reference picture for the luma coding block need not be repeated to identify an MV and a reference picture for the chroma blocks. Rather, the MV for the luma coding block may be scaled to determine the MV for the chroma blocks, and the reference picture may be the same. As another example, the intra-prediction process may be the same for the luma coding block and the chroma coding blocks.

In one or more examples, for entropy encoding such as context-based arithmetic encoding, entropy encoding unit 220 may determine context values for one or more contexts. While encoding a slice or picture, entropy encoding unit 220 may update the context values (e.g., probability values). At the beginning of a slice or picture, however, the context values may be undefined. Rather than starting from undefined context values, entropy encoding unit 220 may initialize the one or more context values using predefined initialization points (e.g., initial values stored in DPB 218, video data memory 230, or some other memory).

In one or more examples, instead of or in addition to using the predefined initialization points, entropy encoding unit 220 may use one or more context values of a previously encoded slice or picture, or a mapped, scaled, or weighted version of those context values, as initialization points. For example, entropy encoding unit 220 may store, in a buffer (e.g., DPB 218, video data memory 230, or some other memory), one or more context values for the contexts of an encoded slice or picture (e.g., after the last CTU of the slice or picture has been encoded) as a set of initialization points. Entropy encoding unit 220 may use the set of initialization points (e.g., the context values, or values based on the context values, of the previously encoded slice or picture) to initialize the context values of the contexts used to encode a subsequent slice or picture.
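The two steps in this paragraph, saving the adapted context values after the last CTU and deriving initial values from them, can be sketched in Python as follows. The function names, the integer weighting scheme, and the sample values are all hypothetical; the text only says that the stored values may be used directly or in a mapped, scaled, or weighted form.

```python
def snapshot_contexts(context_values):
    """Freeze the adapted context values once the last CTU has been coded."""
    return tuple(context_values)


def derive_initial_values(stored, predefined, weight=3, shift=2):
    """Blend stored and predefined values with integer weights.

    Each result is (weight * s + (2**shift - weight) * p) >> shift, so
    weight == 2**shift reuses the stored values unchanged (assumed scheme).
    """
    total = 1 << shift
    return [
        (weight * s + (total - weight) * p) >> shift
        for s, p in zip(stored, predefined)
    ]


adapted = [20000, 10000]       # context values at the end of a coded picture
predefined = [16384, 16384]    # predefined initialization points
init_set = snapshot_contexts(adapted)
print(derive_initial_values(init_set, predefined))
```

Setting `weight` equal to `1 << shift` corresponds to using the stored context values directly, while smaller weights pull the starting point back toward the predefined values.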

Entropy encoding unit 220 may store sets of initialization points for multiple previously encoded slices or pictures. For example, the buffer may store a first set of initialization points associated with a first slice or picture, a second set of initialization points associated with a second slice or picture, and so on. In addition, so that entropy encoding unit 220 can determine (e.g., select) which set of initialization points to use for a subsequent slice or picture, entropy encoding unit 220 may also store information indicating the temporal identification (ID) value and/or QP value of the slice or picture associated with each of the sets of initialization points.

Video encoder 200 represents an example of a device configured to encode video data, the device including a memory configured to store video data and one or more processing units implemented in circuitry and configured to perform the example techniques described in this disclosure. For example, in one or more examples, when the buffer is full, entropy encoding unit 220 may determine which set of initialization points to remove to make room for a newly determined set of initialization points. For instance, entropy encoding unit 220 may determine one or more context values for at least one context used for encoding a current slice or picture. Entropy encoding unit 220 may determine that a buffer, which stores sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding, is full. As described above, each set of temporal initialization points is associated with a slice or picture of the two or more slices or pictures and includes one or more temporal initialization points.

Entropy encoding unit 220 may determine a first set of temporal initialization points, associated with a slice or picture from among the two or more slices or pictures, based on a temporal ID value or a quantization parameter (QP) value of that slice or picture. Entropy encoding unit 220 may remove, from the buffer, the first set of temporal initialization points associated with that slice or picture, and store, in the buffer, a second set of temporal initialization points associated with the current slice or picture. The second set of temporal initialization points is based on the determined one or more context values (e.g., is equal to, or derived from, the one or more context values determined for the at least one context of the current slice or picture).

Accordingly, rather than using the time at which a slice or picture was encoded as the only factor for determining (e.g., selecting) which set of initialization points should be removed from the buffer, entropy encoding unit 220 may use the temporal ID value and/or the QP value (and possibly the slice type) to determine which set of initialization points to remove. For example, to determine the first set of temporal initialization points, entropy encoding unit 220 may determine the set associated with the slice or picture that, among the two or more slices or pictures, has at least one of the smallest temporal ID value or the smallest QP value. That is, entropy encoding unit 220 may determine the slice or picture having the smallest temporal ID value among the temporal ID values of the two or more slices or pictures, or the slice or picture having the smallest QP value among the QP values of the two or more slices or pictures. Entropy encoding unit 220 may then remove, from the buffer, the set of initialization points associated with the determined slice or picture (e.g., the initialization points having the smallest temporal ID value or QP value).
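The removal rule described above can be sketched as a small bounded buffer in Python. The `InitPointSet` record, the `store_with_eviction` function, and the tie-breaking order (smallest temporal ID first, then smallest QP) are assumptions made for illustration; the text allows either value, or both, to drive the decision.

```python
from dataclasses import dataclass


@dataclass
class InitPointSet:
    temporal_id: int   # temporal ID of the associated slice or picture
    qp: int            # QP of the associated slice or picture
    points: tuple      # stored temporal initialization points


def store_with_eviction(buffer, new_set, capacity):
    """Append new_set; when the buffer is full, first drop the set whose
    slice or picture has the smallest temporal ID (ties broken by QP)."""
    if len(buffer) >= capacity:
        victim = min(buffer, key=lambda s: (s.temporal_id, s.qp))
        buffer.remove(victim)
    buffer.append(new_set)
    return buffer


buffer = [
    InitPointSet(2, 30, (100, 200)),
    InitPointSet(0, 27, (110, 210)),
    InitPointSet(1, 32, (120, 220)),
]
store_with_eviction(buffer, InitPointSet(3, 35, (130, 230)), capacity=3)
print([s.temporal_id for s in buffer])
```

Note that the evicted entry here is not the oldest one in the buffer, which is the point of the paragraph: coding order alone does not decide which set is removed.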

In one or more examples, entropy encoding unit 220 may remove the set of initialization points associated with the slice or picture having the smallest temporal ID value or QP value if none of the stored sets of initialization points is associated with a slice or picture having the same temporal ID value or QP value as the current slice or picture. For example, entropy encoding unit 220 may determine that at least one of the temporal ID value or the QP value of the current slice or picture differs from the temporal ID value or QP value of each of the two or more slices or pictures. In this example, entropy encoding unit 220 may remove the first set of temporal initialization points based on that determination. In this case, the temporal ID value or QP value of the slice or picture associated with the first set of temporal initialization points differs from the temporal ID value or QP value of the current slice or picture.

As an example, assume that the current slice or picture is a first slice or picture. In this example, entropy encoding unit 220 may determine one or more context values for at least one context used for encoding a second slice or picture, and determine, from among the two or more slices or pictures, a third slice or picture having at least one of the same temporal ID value or the same QP value as the second slice or picture. In this example, entropy encoding unit 220 may remove, from the buffer, a third set of temporal initialization points associated with the third slice or picture, and store, in the buffer, a fourth set of temporal initialization points associated with the second slice or picture, the fourth set being based on the one or more context values determined for the at least one context used for encoding the second slice or picture.
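Taken together, the two cases above (replace a stored set that shares a temporal ID or QP with the new one, otherwise remove the set with the smallest value) might look like the following Python sketch for a buffer that is already full. The record type, the function name, and the matching and tie-breaking details are hypothetical.

```python
from typing import NamedTuple


class InitPointSet(NamedTuple):
    temporal_id: int
    qp: int
    points: tuple


def store_when_full(buffer, new_set):
    """Replace one entry of a full buffer with new_set and return the victim.

    A stored set whose temporal ID or QP equals that of new_set is replaced
    first; otherwise the set with the smallest (temporal ID, QP) is removed.
    """
    match = next(
        (s for s in buffer
         if s.temporal_id == new_set.temporal_id or s.qp == new_set.qp),
        None,
    )
    if match is not None:
        victim = match
    else:
        victim = min(buffer, key=lambda s: (s.temporal_id, s.qp))
    buffer.remove(victim)
    buffer.append(new_set)
    return victim


buffer = [InitPointSet(0, 27, ()), InitPointSet(1, 32, ()), InitPointSet(2, 37, ())]
print(store_when_full(buffer, InitPointSet(1, 33, ())).temporal_id)  # shared temporal ID
print(store_when_full(buffer, InitPointSet(3, 40, ())).temporal_id)  # no match, smallest goes
```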

Subsequently, to encode a subsequent slice or picture, entropy encoding unit 220 may determine (e.g., select) a set of temporal initialization points stored in the buffer and, based on the selected set, initialize one or more context values for at least one context used for encoding the subsequent slice or picture. For example, entropy encoding unit 220 may assign the initial values of the one or more context values to be equal to the selected set of temporal initialization points, or may derive the initial values from the selected set (e.g., by mapping, scaling, or weighting). Entropy encoding unit 220 may then context-based arithmetic encode the subsequent slice or picture. For example, entropy encoding unit 220 may use the initial values of the context values to encode syntax elements of the subsequent slice or picture, and update the context values while encoding the subsequent slice or picture.
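A minimal Python sketch of this selection step follows. The selection rule used here (closest temporal ID, then closest QP, with a fallback to predefined values when the buffer is empty) is an assumption for illustration only; the text leaves the exact rule open. The dictionary keys and function names are likewise hypothetical.

```python
def select_init_set(buffer, temporal_id, qp):
    """Pick the stored set whose temporal ID, then QP, is closest to the
    values of the slice or picture about to be coded (assumed rule)."""
    if not buffer:
        return None
    return min(
        buffer,
        key=lambda s: (abs(s["temporal_id"] - temporal_id), abs(s["qp"] - qp)),
    )


def initialize_contexts(buffer, temporal_id, qp, predefined):
    """Return starting context values for the next slice or picture."""
    chosen = select_init_set(buffer, temporal_id, qp)
    if chosen is None:
        return list(predefined)   # fall back to predefined initialization points
    return list(chosen["points"])  # start from the stored temporal points


buffer = [
    {"temporal_id": 0, "qp": 27, "points": (100, 200)},
    {"temporal_id": 2, "qp": 37, "points": (300, 400)},
]
print(initialize_contexts(buffer, temporal_id=2, qp=35, predefined=(1, 1)))
```

Because the decoder stores and selects sets by the same rule, both sides start each slice or picture from identical context values without any extra signaling in this sketch.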

In one or more examples, video encoder 200 may also be configured to: store a plurality of temporal initialization points for one or more contexts used in context-based arithmetic coding of video data of a current picture or slice, where the plurality of temporal initialization points are from video data of one or more previous pictures that precede the current picture in coding order; store a respective temporal identification (ID) value associated with each of the plurality of temporal initialization points; select at least one of the plurality of temporal initialization points based on the respective temporal ID values associated with the plurality of temporal initialization points and the temporal ID value of the current picture or slice; and context-based arithmetic encode the video data of the current picture or slice based on the selected at least one temporal initialization point.

Video encoder 200 may be configured to store, in a first buffer, one or more temporal initialization points for one or more contexts used in context-based arithmetic coding of video data of one or more previous pictures, and to store, in a second buffer, temporal initialization points of the one or more contexts used in context-based arithmetic coding of video data of slices of a current picture. Storing in the second buffer includes storing in the second buffer during coding of the video data of the current picture and, after the last coding tree unit (CTU) or slice of the current picture has been processed, storing in the first buffer the temporal initialization points stored in the second buffer.

Video encoder 200 may be configured to store a previous temporal initialization point corresponding to one or more contexts used in context-based arithmetic coding of video data of a previous slice (relative to a current slice) in a previous picture, determine that the current slice in a current picture has a position in the current picture that corresponds to the position of the previous slice in the previous picture, and, based on the current slice having that corresponding position, determine a current temporal initialization point for the current slice based on the previous temporal initialization point.

FIG. 3 is a block diagram illustrating an example video decoder 300 that may perform the techniques of this disclosure. FIG. 3 is provided for purposes of explanation and should not be considered limiting of the techniques as broadly exemplified and described in this disclosure. For purposes of explanation, this disclosure describes video decoder 300 according to the techniques of VVC (ITU-T H.266, under development) and HEVC (ITU-T H.265). However, the techniques of this disclosure may be performed by video coding devices configured to implement other video coding standards.

In the example of FIG. 3, video decoder 300 includes coded picture buffer (CPB) memory 320, entropy decoding unit 302, prediction processing unit 304, inverse quantization unit 306, inverse transform processing unit 308, reconstruction unit 310, filter unit 312, and decoded picture buffer (DPB) 314. Any or all of CPB memory 320, entropy decoding unit 302, prediction processing unit 304, inverse quantization unit 306, inverse transform processing unit 308, reconstruction unit 310, filter unit 312, and DPB 314 may be implemented in one or more processors or in processing circuitry. For example, the units of video decoder 300 may be implemented as one or more circuits or logic elements, as part of hardware circuitry, or as part of a processor, ASIC, or FPGA. Moreover, video decoder 300 may include additional or alternative processors or processing circuitry to perform these and other functions.

Prediction processing unit 304 includes motion compensation unit 316 and intra prediction unit 318. Prediction processing unit 304 may include additional units for performing prediction according to other prediction modes. For example, prediction processing unit 304 may include a palette unit, an intra-block copy unit (which may form part of motion compensation unit 316), an affine unit, a linear model (LM) unit, and the like. In other examples, video decoder 300 may include more, fewer, or different functional components.

When operating according to AV1, motion compensation unit 316 may be configured to decode coding blocks of video data (e.g., both luma and chroma coding blocks) using translational motion compensation, affine motion compensation, OBMC, and/or compound inter-intra prediction, as described above. Intra prediction unit 318 may be configured to decode coding blocks of video data (e.g., both luma and chroma coding blocks) using directional intra prediction, non-directional intra prediction, recursive filter intra prediction, CFL, intra block copy (IBC), and/or palette mode, as described above.

CPB memory 320 may store video data, such as an encoded video bitstream, to be decoded by the components of video decoder 300. The video data stored in CPB memory 320 may be obtained, for example, from computer-readable medium 110 (FIG. 1). CPB memory 320 may include a CPB that stores encoded video data (e.g., syntax elements) from an encoded video bitstream. CPB memory 320 may also store video data other than syntax elements of a coded picture, such as temporary data representing outputs from the various units of video decoder 300. DPB 314 generally stores decoded pictures, which video decoder 300 may output and/or use as reference video data when decoding subsequent data or pictures of the encoded video bitstream. CPB memory 320 and DPB 314 may be formed by any of a variety of memory devices, such as DRAM (including SDRAM), MRAM, RRAM, or other types of memory devices. CPB memory 320 and DPB 314 may be provided by the same memory device or by separate memory devices. In various examples, CPB memory 320 may be on-chip with other components of video decoder 300, or off-chip relative to those components.

Additionally or alternatively, in some examples, video decoder 300 may retrieve coded video data from memory 120 (FIG. 1). That is, memory 120 may store data as discussed above with respect to CPB memory 320. Likewise, memory 120 may store instructions to be executed by video decoder 300 when some or all of the functionality of video decoder 300 is implemented in software executed by the processing circuitry of video decoder 300.

The various units shown in FIG. 3 are illustrated to assist with understanding the operations performed by video decoder 300. The units may be implemented as fixed-function circuits, programmable circuits, or a combination thereof. As in FIG. 2, fixed-function circuits are circuits that provide particular functionality and are preset in the operations they can perform. Programmable circuits are circuits that can be programmed to perform various tasks and that provide flexible functionality in the operations they can perform. For instance, programmable circuits may execute software or firmware that causes the programmable circuits to operate in the manner defined by the instructions of the software or firmware. Fixed-function circuits may execute software instructions (e.g., to receive parameters or output parameters), but the types of operations that fixed-function circuits perform are generally immutable. In some examples, one or more of the units may be distinct circuit blocks (fixed-function or programmable), and in some examples, one or more of the units may be integrated circuits.

Video decoder 300 may include ALUs, EFUs, digital circuits, analog circuits, and/or programmable cores formed from programmable circuits. In examples where the operations of video decoder 300 are performed by software executing on the programmable circuits, on-chip or off-chip memory may store the instructions (e.g., object code) of the software that video decoder 300 receives and executes.

Entropy decoding unit 302 may receive encoded video data from the CPB and entropy decode the video data to reproduce syntax elements. Prediction processing unit 304, inverse quantization unit 306, inverse transform processing unit 308, reconstruction unit 310, and filter unit 312 may generate decoded video data based on the syntax elements extracted from the bitstream.

In general, video decoder 300 reconstructs a picture on a block-by-block basis. Video decoder 300 may perform a reconstruction operation on each block individually, where the block currently being reconstructed (i.e., decoded) may be referred to as the "current block."

Entropy decoding unit 302 may entropy decode syntax elements defining the quantized transform coefficients of a quantized transform coefficient block, as well as transform information such as a quantization parameter (QP) and/or transform mode indication(s). Inverse quantization unit 306 may use the QP associated with the quantized transform coefficient block to determine a degree of quantization and, likewise, a degree of inverse quantization for inverse quantization unit 306 to apply. For example, inverse quantization unit 306 may perform a bitwise left-shift operation to inverse quantize the quantized transform coefficients. Inverse quantization unit 306 may thereby form a transform coefficient block that includes transform coefficients.
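The left-shift form of inverse quantization mentioned above can be shown with a few lines of Python. This is a deliberately simplified sketch: the function name is hypothetical, and the mapping from a QP to a shift amount is not shown because real codecs typically combine QP-dependent scaling factors with shifts.

```python
def inverse_quantize(levels, shift):
    """Scale quantized coefficient levels back up by 2**shift.

    `shift` stands in for the degree of inverse quantization derived
    from the QP; that derivation is outside this sketch.
    """
    return [level << shift for level in levels]


quantized = [3, -1, 0, 7]
print(inverse_quantize(quantized, shift=2))
```

Python's `<<` preserves the sign of negative integers, so negative coefficient levels scale the same way as positive ones.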

After inverse quantization unit 306 forms the transform coefficient block, inverse transform processing unit 308 may apply one or more inverse transforms to the transform coefficient block to generate a residual block associated with the current block. For example, inverse transform processing unit 308 may apply an inverse DCT, an inverse integer transform, an inverse Karhunen-Loeve transform (KLT), an inverse rotational transform, an inverse directional transform, or another inverse transform to the transform coefficient block.

Furthermore, prediction processing unit 304 generates a prediction block according to prediction information syntax elements that were entropy decoded by entropy decoding unit 302. For example, if the prediction information syntax elements indicate that the current block is inter-predicted, motion compensation unit 316 may generate the prediction block. In this case, the prediction information syntax elements may indicate a reference picture in DPB 314 from which to retrieve a reference block, as well as a motion vector identifying the location of the reference block in the reference picture relative to the location of the current block in the current picture. Motion compensation unit 316 may generally perform the inter-prediction process in a manner substantially similar to that described with respect to motion compensation unit 224 (FIG. 2).

As another example, if the prediction information syntax elements indicate that the current block is intra-predicted, intra prediction unit 318 may generate the prediction block according to the intra-prediction mode indicated by the prediction information syntax elements. Again, intra prediction unit 318 may generally perform the intra-prediction process in a manner substantially similar to that described with respect to intra prediction unit 226 (FIG. 2). Intra prediction unit 318 may retrieve data of neighboring samples of the current block from DPB 314.

Reconstruction unit 310 may reconstruct the current block using the prediction block and the residual block. For example, reconstruction unit 310 may add samples of the residual block to corresponding samples of the prediction block to reconstruct the current block.
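As a small illustration, the sample-wise addition can be written in Python as below. The function name is hypothetical, and the clipping of each sum to the valid sample range for an assumed bit depth is an addition that the paragraph itself does not mention.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add residual samples to the co-located prediction samples.

    Sums are clipped to [0, 2**bit_depth - 1]; the clipping and the
    default bit depth are assumptions of this sketch.
    """
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]


prediction = [[100, 250], [3, 128]]
residual = [[5, 10], [-7, 0]]
print(reconstruct_block(prediction, residual))
```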

Filter unit 312 may perform one or more filtering operations on reconstructed blocks. For example, filter unit 312 may perform deblocking operations to reduce blocking artifacts along the edges of the reconstructed blocks. The operations of filter unit 312 are not necessarily performed in all examples.

Video decoder 300 may store the reconstructed blocks in DPB 314. For example, in examples where the operations of filter unit 312 are not performed, reconstruction unit 310 may store the reconstructed blocks in DPB 314. In examples where the operations of filter unit 312 are performed, filter unit 312 may store the filtered reconstructed blocks in DPB 314. As discussed above, DPB 314 may provide reference information to prediction processing unit 304, such as samples of the current picture for intra prediction and previously decoded pictures for subsequent motion compensation. Moreover, video decoder 300 may output decoded pictures (e.g., decoded video) from DPB 314 for subsequent presentation on a display device, such as display device 118 of FIG. 1.

In this manner, video decoder 300 represents an example of a video decoding device including a memory configured to store video data and one or more processing units implemented in circuitry and configured to perform the example techniques described in this disclosure. For example, for entropy decoding such as context-based arithmetic decoding, entropy decoding unit 302 may determine context values for one or more contexts. While decoding a slice or picture, entropy decoding unit 302 may update the context values (e.g., probability values). At the beginning of a slice or picture, however, the context values may be undefined. Rather than starting from undefined context values, entropy decoding unit 302 may initialize the one or more context values using predefined initialization points (e.g., initial values stored in DPB 314, CPB memory 320, or some other memory).

在一或多個實例中，除了或替代使用預定義的初始化點之外，熵解碼單元302亦可以使用先前解碼的切片或圖片的一或多個上下文值，或者一或多個上下文值的映射、縮放、加權等版本作為初始化點。例如，熵解碼單元302可以在緩衝器（例如，DPB 314、CPB記憶體320或某種其他記憶體）中儲存經解碼的切片或圖片的上下文的一或多個上下文值（例如，在對切片或圖片的最後CTU進行解碼之後），作為初始化點集合。熵解碼單元302可以利用該初始化點集合（例如，上下文值或基於先前經解碼的切片或圖片的上下文值的值），來初始化用於對後續切片或圖片進行解碼的上下文的上下文值。In one or more examples, in addition to or instead of using predefined initialization points, entropy decoding unit 302 may use one or more context values of a previously decoded slice or picture, or mapped, scaled, weighted, or similarly derived versions of those context values, as initialization points. For example, entropy decoding unit 302 may store, in a buffer (e.g., DPB 314, CPB memory 320, or some other memory), one or more context values of the contexts of a decoded slice or picture (e.g., after decoding the last CTU of the slice or picture) as a set of initialization points. Entropy decoding unit 302 may use the set of initialization points (e.g., the context values, or values based on the context values, of the previously decoded slice or picture) to initialize the context values of the contexts used to decode a subsequent slice or picture.

熵解碼單元302可以儲存多個先前經解碼的切片或圖片的初始化點集合。例如，緩衝器可以儲存與第一切片或圖片相關聯的第一初始化點集合、與第二切片或圖片相關聯的第二初始化點集合，等等。此外，為了決定（例如，選擇）熵解碼單元302可以將哪個初始化點集合用於後續切片或圖片，熵解碼單元302亦可以儲存與各個初始化點集合之每一者初始化點相關聯的切片或圖片的時間標識值及/或QP值的資訊。Entropy decoding unit 302 may store sets of initialization points for multiple previously decoded slices or pictures. For example, the buffer may store a first set of initialization points associated with a first slice or picture, a second set of initialization points associated with a second slice or picture, and so on. Additionally, to determine (e.g., select) which set of initialization points to use for a subsequent slice or picture, entropy decoding unit 302 may also store information on the temporal identification value and/or QP value of the slice or picture associated with each of the sets of initialization points.

在一或多個實例中，當緩衝器滿時，熵解碼單元302可以決定去除哪個初始化點集合，以為剛剛決定的初始化點集合騰出空間。例如，熵解碼單元302可以決定用於對當前切片或圖片進行解碼的至少一個上下文的一或多個上下文值。熵解碼單元302可以決定用於儲存來自兩個或兩個以上切片或圖片的時間初始化點集合以進行基於上下文的算術譯碼的緩衝器已滿。如前述，每個時間初始化點集合與該兩個或兩個以上切片或圖片中的切片或圖片相關聯，並且包括一或多個時間初始化點。In one or more instances, when the buffer is full, entropy decoding unit 302 may determine which set of initialization points to remove to make room for the just decided set of initialization points. For example, entropy decoding unit 302 may determine one or more context values for at least one context used to decode the current slice or picture. Entropy decoding unit 302 may decide that a buffer used to store sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full. As mentioned above, each set of temporal initialization points is associated with a slice or picture of the two or more slices or pictures, and includes one or more temporal initialization points.

熵解碼單元302可以基於切片或圖片的時間標識值或量化參數（QP）值，決定與該兩個或兩個以上切片或圖片中的切片或圖片相關聯的第一時間初始化點集合。熵解碼單元302可以從緩衝器中，去除與切片或者圖片相關聯的第一時間初始化點集合，並在緩衝器中，儲存與當前切片或者圖片相關聯的第二時間初始化點集合。第二時間初始化點集合是基於所決定的一或多個上下文值的（例如，等於所決定的當前切片或圖片的至少一個上下文的一或多個上下文值，或者根據當前切片或圖片的至少一個上下文的一或多個上下文值中匯出）。Entropy decoding unit 302 may determine a first set of temporal initialization points associated with a slice or picture, from among the two or more slices or pictures, based on a temporal identification value or a quantization parameter (QP) value of that slice or picture. Entropy decoding unit 302 may remove, from the buffer, the first set of temporal initialization points associated with that slice or picture, and store, in the buffer, a second set of temporal initialization points associated with the current slice or picture. The second set of temporal initialization points is based on the determined one or more context values (e.g., equal to the determined one or more context values of the at least one context of the current slice or picture, or derived from those context values).

因此，熵解碼單元302可以利用時間標識值及/或QP值來決定去除哪個初始化點集合，而不是將何時對切片或圖片進行解碼作為用於決定（例如，選擇）應當從緩衝器中去除哪個初始化點集合的唯一因素。例如，為了決定第一時間初始化點集合，熵解碼單元302可以從該兩個或兩個以上切片或圖片中，決定與切片或圖片相關聯的第一時間初始化點集合，其中該切片或圖片在該兩個或兩個以上切片或圖片的時間標識值或量化參數（QP）值中具有最小時間標識值或QP值中的至少一項。例如，熵解碼單元302可以從兩個或兩個以上切片或圖片的時間標識值中決定具有最小時間標識值的切片或圖片，或者從兩個或者更多個切片或圖片的QP值中決定QP值最小的切片或圖片。隨後，熵解碼單元302可以從緩衝器中去除與所決定的切片或圖片相關聯的初始化點集合（例如，具有最小時間標識值或QP值的一個初始化點）。Therefore, entropy decoding unit 302 may use the temporal identification value and/or the QP value to decide which set of initialization points to remove, rather than using the time at which a slice or picture was decoded as the only factor for deciding (e.g., selecting) which set of initialization points should be removed from the buffer. For example, to determine the first set of temporal initialization points, entropy decoding unit 302 may determine, from among the two or more slices or pictures, the set associated with the slice or picture that has at least one of the smallest temporal identification value or the smallest QP value among the temporal identification values or QP values of the two or more slices or pictures. For example, entropy decoding unit 302 may determine the slice or picture having the smallest temporal identification value among the temporal identification values of the two or more slices or pictures, or the slice or picture having the smallest QP value among their QP values. Subsequently, entropy decoding unit 302 may remove from the buffer the set of initialization points associated with the determined slice or picture (e.g., the set having the smallest temporal identification value or QP value).
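The removal rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `InitPointSet` layout, the function names, and the tie-break (smallest temporal identification value first, then smallest QP value) are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InitPointSet:
    """One stored set of temporal initialization points (hypothetical layout)."""
    temporal_id: int           # temporal identification value of the source slice/picture
    qp: int                    # QP value of the source slice/picture
    context_values: List[int]  # one stored value per context


def pick_set_to_remove(buffer: List[InitPointSet]) -> int:
    """Return the index of the set with the smallest temporal ID.

    Ties are broken by the smallest QP value (an assumption of this sketch).
    """
    return min(range(len(buffer)),
               key=lambda i: (buffer[i].temporal_id, buffer[i].qp))


def store_when_full(buffer: List[InitPointSet], new_set: InitPointSet) -> None:
    """Remove the chosen set from a full buffer, then store the new set."""
    victim = pick_set_to_remove(buffer)
    del buffer[victim]
    buffer.append(new_set)
```

Note that the decoding order of the stored sets plays no role in the choice here, which is the point of the paragraph above.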

在一或多個實例中，若初始化點集合中沒有一個初始化點與以下的切片或圖片相關聯，則熵解碼單元302可以去除與具有最小時間標識值或者QP值的切片或圖片相關聯的初始化點集合：該切片或圖片具有與當前切片或圖片的時間標識值或QP值相同的時間標識值或QP值。例如，熵解碼單元302可以決定當前切片或圖片的時間標識值或QP值中的至少一項不同於該兩個或兩個以上切片或圖片之每一者切片或圖片的時間標識值或QP值。在該實例中，熵解碼單元302可以基於決定當前切片或圖片的時間標識值或QP值中的至少一項不同於該兩個或兩個以上切片或圖片之每一者切片或圖片的時間標識值或QP值，來去除第一時間初始化點集合。例如，與第一時間初始化點集合相關聯的切片或圖片的時間標識值或QP值不同於當前切片或圖片的時間標識值或QP值。In one or more examples, entropy decoding unit 302 may remove the set of initialization points associated with the slice or picture having the smallest temporal identification value or QP value if none of the stored sets of initialization points is associated with a slice or picture having the same temporal identification value or QP value as the current slice or picture. For example, entropy decoding unit 302 may determine that at least one of the temporal identification value or the QP value of the current slice or picture is different from the temporal identification value or QP value of each of the two or more slices or pictures. In this example, entropy decoding unit 302 may remove the first set of temporal initialization points based on that determination. For example, the temporal identification value or QP value of the slice or picture associated with the first set of temporal initialization points is different from the temporal identification value or QP value of the current slice or picture.

舉一個實例，假設當前切片或圖片是第一切片或圖片。在該實例中，熵解碼單元302可以決定用於對第二切片或圖片進行解碼的至少一個上下文的一或多個上下文值，並且從該兩個或兩個以上切片或圖片中決定與第二切片或圖片的時間標識值或QP值具有相同的時間標識值或者QP值中的至少一項的第三切片或圖片。在該實例中，熵解碼單元302可以從緩衝器中，去除與第三切片或圖片相關聯的第三時間初始化點集合，並且基於所決定的用於對第二切片或圖片進行解碼的至少一個上下文的一或多個上下文值，在緩衝器中，儲存與第二切片或圖片相關聯的第四時間初始化點集合。As an example, assume that the current slice or picture is a first slice or picture. In this example, entropy decoding unit 302 may determine one or more context values for at least one context used to decode a second slice or picture, and determine, from among the two or more slices or pictures, a third slice or picture that has at least one of the same temporal identification value or the same QP value as the second slice or picture. In this example, entropy decoding unit 302 may remove, from the buffer, a third set of temporal initialization points associated with the third slice or picture, and store, in the buffer, a fourth set of temporal initialization points associated with the second slice or picture, based on the determined one or more context values of the at least one context used to decode the second slice or picture.

隨後，為了對後續圖片進行解碼，熵解碼單元302可以決定（例如，選擇）儲存在緩衝器中的時間初始化點集合，並基於所選擇的時間初始化點集合來初始化用於對後續切片或圖片進行解碼的至少一個上下文的一或多個上下文值。例如，熵解碼單元302可以將一或多個上下文值的初始值指派為等於所選擇的時間初始化點集合，或者基於（例如，經由映射、縮放、加權等）所選擇的時間初始化點集合來匯出一或多個上下文值的初始值。熵解碼單元302可以對後續切片或圖片進行基於上下文的算術解碼。例如，熵解碼單元302可以利用上下文值的初始值來對後續切片或圖片的語法元素進行解碼，並在後續切片或圖片的解碼期間更新上下文值。Subsequently, to decode a subsequent slice or picture, entropy decoding unit 302 may determine (e.g., select) a set of temporal initialization points stored in the buffer, and initialize one or more context values of at least one context used to decode the subsequent slice or picture based on the selected set of temporal initialization points. For example, entropy decoding unit 302 may assign initial values of the one or more context values equal to the selected set of temporal initialization points, or may derive the initial values from the selected set of temporal initialization points (e.g., via mapping, scaling, weighting, etc.). Entropy decoding unit 302 may then perform context-based arithmetic decoding on the subsequent slice or picture. For example, entropy decoding unit 302 may use the initial values of the context values to decode syntax elements of the subsequent slice or picture, and update the context values during decoding of the subsequent slice or picture.
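The two initialization options mentioned above (copying the selected set, or deriving initial values from it) can be illustrated with a short sketch. The text does not specify a particular mapping, scaling, or weighting, so the linear blend with predefined initialization values below, and the names `initialize_contexts` and `weight`, are assumptions chosen only for illustration.

```python
from typing import List


def initialize_contexts(selected: List[int],
                        predefined: List[int],
                        weight: float = 1.0) -> List[int]:
    """Derive initial context values from a selected set of initialization points.

    weight == 1.0 copies the selected set unchanged. Any other weight blends the
    selected values with predefined initial values (a hypothetical weighting).
    """
    if len(selected) != len(predefined):
        raise ValueError("one value per context is expected in both lists")
    return [round(weight * s + (1.0 - weight) * p)
            for s, p in zip(selected, predefined)]
```

With `weight=1.0` the initial values are equal to the selected set; with `weight=0.5` each initial value is the midpoint between the stored value and the predefined one.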

在一或多個實例中，視訊解碼器300亦可以被配置為儲存針對在當前圖片或切片的視訊資料的基於上下文的算術譯碼中使用的一或多個上下文的複數個時間初始化點（其中該複數個時間初始化點包括在按譯碼順序在當前圖片之前的一或多個先前圖片的視訊資料中），儲存與該複數個時間初始化點中的每一個時間初始化點相關聯的相應時間標識（ID）值，基於與該複數個時間初始化點相關聯的各個時間ID值和當前圖片或切片的時間ID值，來選擇該複數個時間初始化點中的至少一個時間初始化點，以及基於所選擇的至少一個時間初始化點，對當前圖片或切片的視訊資料進行基於上下文的算術解碼。In one or more examples, video decoder 300 may also be configured to: store a plurality of temporal initialization points for one or more contexts used in context-based arithmetic coding of video data of a current picture or slice (where the plurality of temporal initialization points are included in the video data of one or more previous pictures that precede the current picture in coding order); store a respective temporal identification (ID) value associated with each of the plurality of temporal initialization points; select at least one of the plurality of temporal initialization points based on the respective temporal ID values associated with the plurality of temporal initialization points and the temporal ID value of the current picture or slice; and perform context-based arithmetic decoding on the video data of the current picture or slice based on the selected at least one temporal initialization point.
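As a sketch of the selection step, the function below picks a stored entry by comparing temporal ID values. The paragraph above only says the selection is based on the stored temporal ID values and the temporal ID of the current picture or slice; the concrete rule used here (prefer an equal temporal ID, otherwise the nearest one, preferring the lower ID on ties) and the dictionary layout are assumptions for illustration.

```python
from typing import Dict, List, Optional


def select_init_points(stored: Dict[int, List[int]],
                       current_tid: int) -> Optional[List[int]]:
    """Select stored initialization points for the current picture or slice.

    `stored` maps a temporal ID to the context values saved for it (hypothetical
    layout). Returns None when nothing is stored, in which case a caller could
    fall back to predefined initialization points.
    """
    if not stored:
        return None
    if current_tid in stored:
        return stored[current_tid]
    # Assumed fallback: nearest temporal ID, preferring the lower ID on ties.
    nearest = min(stored, key=lambda tid: (abs(tid - current_tid), tid))
    return stored[nearest]
```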

視訊解碼器300可以被配置為在第一緩衝器中，儲存針對在一或多個先前圖片的視訊資料的基於上下文的算術譯碼中使用的一或多個上下文的一或多個時間初始化點，並在第二緩衝器中，儲存在當前圖片的切片的視訊資料的基於上下文的算術譯碼中使用的一或多個上下文的時間初始化點，其中在第二緩衝器中儲存包括：在當前圖片的視訊資料的譯碼期間在第二緩衝器中進行儲存，並在處理當前圖片的最後譯碼樹單元（CTU）或切片之後，將儲存在第二緩衝器中的時間初始化點儲存在第一緩衝器中。Video decoder 300 may be configured to store, in a first buffer, one or more temporal initialization points for one or more contexts used in context-based arithmetic coding of video data of one or more previous pictures, and to store, in a second buffer, temporal initialization points for one or more contexts used in context-based arithmetic coding of video data of slices of the current picture. Storing in the second buffer includes storing in the second buffer during coding of the video data of the current picture and, after processing the last coding tree unit (CTU) or slice of the current picture, storing the temporal initialization points held in the second buffer into the first buffer.
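The two-buffer arrangement can be sketched as follows, under the assumption that entries are keyed by a slice index. The class and method names are hypothetical. The sketch only shows the ordering described above: points for the current picture accumulate in the second buffer and are moved to the first buffer once the last CTU or slice has been processed.

```python
from typing import Dict, List


class TwoBufferStore:
    """Hypothetical holder for the first (previous pictures) and second
    (current picture) buffers of temporal initialization points."""

    def __init__(self) -> None:
        self.first: Dict[int, List[int]] = {}   # points from previous pictures
        self.second: Dict[int, List[int]] = {}  # points from the current picture

    def record_slice(self, slice_index: int, context_values: List[int]) -> None:
        """Store points for a slice of the current picture in the second buffer."""
        self.second[slice_index] = list(context_values)

    def finish_picture(self) -> None:
        """After the last CTU or slice, move the second buffer into the first."""
        self.first.update(self.second)
        self.second.clear()
```

Keeping the current picture's points separate until the picture is finished means slices of the current picture still initialize from previous pictures only.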

視訊解碼器300可以被配置為儲存針對在對先前圖片中的先前切片（針對當前切片）的視訊資料的基於上下文的算術譯碼中使用的一或多個上下文的先前時間初始化點，基於當前圖片中具有與先前圖片中的先前切片的位置相對應的位置的當前切片，決定當前圖片中的當前切片在當前圖片中具有與先前圖片中的先前切片的位置相對應的位置，基於先前時間初始化點來決定當前切片的當前時間初始化點。Video decoder 300 may be configured to store previous temporal initialization points for one or more contexts used in context-based arithmetic coding of video data of a previous slice (relative to a current slice) in a previous picture, determine that the current slice in the current picture has a position in the current picture that corresponds to the position of the previous slice in the previous picture, and, based on that determination, determine current temporal initialization points for the current slice based on the previous temporal initialization points.

圖4是示出用於根據本案內容的技術，對當前塊進行編碼的實例方法的流程圖。當前塊可以包括當前CU。儘管關於視訊轉碼器200（圖1和圖2）進行了描述，但是應當理解的是，其他設備亦可以被配置為執行與圖4類似的方法。FIG. 4 is a flowchart illustrating an example method for encoding a current block in accordance with the techniques of this disclosure. The current block may include a current CU. Although described with respect to video encoder 200 (FIGS. 1 and 2), it should be understood that other devices may also be configured to perform a method similar to that of FIG. 4.

在該實例中，視訊轉碼器200在初始時預測當前塊（400）。舉例而言，視訊轉碼器200可以形成針對當前塊的預測塊。隨後，視訊轉碼器200可以針對當前塊來計算殘差塊（402）。為了計算殘差塊，視訊轉碼器200可以計算原始未經編碼的塊與當前塊的預測塊之間的差。隨後，視訊轉碼器200可以對殘差塊進行變換，並量化殘差塊的變換係數（404）。接下來，視訊轉碼器200可以掃瞄殘差塊的經量化的係數（406）。在掃瞄期間或者在掃瞄之後，視訊轉碼器200可以對變換係數進行熵編碼（408）。例如，視訊轉碼器200可以使用CAVLC或CABAC，對變換係數進行編碼。根據一或多個實例，視訊轉碼器200可以使用利用本案內容中描述的技術決定的上下文值，對變換係數進行編碼。隨後，視訊轉碼器200可以輸出塊的經熵編碼的資料（410）。In this example, video encoder 200 initially predicts the current block (400). For example, video encoder 200 may form a prediction block for the current block. Video encoder 200 may then calculate a residual block for the current block (402). To calculate the residual block, video encoder 200 may calculate the difference between the original, uncoded block and the prediction block for the current block. Video encoder 200 may then transform the residual block and quantize the transform coefficients of the residual block (404). Next, video encoder 200 may scan the quantized transform coefficients of the residual block (406). During or after the scan, video encoder 200 may entropy encode the transform coefficients (408). For example, video encoder 200 may encode the transform coefficients using CAVLC or CABAC. According to one or more examples, video encoder 200 may encode the transform coefficients using context values determined with the techniques described in this disclosure. Video encoder 200 may then output the entropy-encoded data of the block (410).

圖5是示出用於根據本案內容的技術，對視訊資料的當前塊進行解碼的實例方法的流程圖。當前塊可以包括當前CU。儘管關於視訊解碼器300（圖1和3）進行了描述，但應當理解，其他設備亦可以被配置為執行與圖5類似的方法。5 is a flowchart illustrating an example method for decoding a current block of video material in accordance with the techniques of this disclosure. The current block may include the current CU. Although described with respect to video decoder 300 (Figures 1 and 3), it should be understood that other devices may be configured to perform methods similar to Figure 5.

視訊解碼器300可以接收當前塊的經熵編碼的資料（例如，經熵編碼的預測資訊和用於對應於當前塊的殘差塊的變換係數的經熵編碼的資料）（500）。視訊解碼器300可以對熵編碼的資料進行熵解碼，以決定當前塊的預測資訊並再現殘差塊的變換係數（502）。根據一或多個實例，視訊解碼器300可以使用利用本案內容中描述的技術決定的上下文值，對經編碼的資料進行解碼。視訊解碼器300可以例如使用如針對當前塊的預測資訊所指示的訊框內或訊框間預測模式來預測當前塊（504），以計算針對當前塊的預測塊。隨後，視訊解碼器300可以對再現的變換係數進行逆掃瞄（506），以建立經量化的變換係數的塊。隨後，視訊解碼器300可以對變換係數進行逆量化，並將逆變換應用於這些變換係數以產生殘差塊（508）。視訊解碼器300可以經由組合預測塊和殘差塊，來最終解碼當前塊（510）。Video decoder 300 may receive entropy-encoded data for the current block (e.g., entropy-encoded prediction information and entropy-encoded data for transform coefficients of a residual block corresponding to the current block) (500). Video decoder 300 may entropy decode the entropy-encoded data to determine prediction information for the current block and to reproduce the transform coefficients of the residual block (502). According to one or more examples, video decoder 300 may decode the encoded data using context values determined with the techniques described in this disclosure. Video decoder 300 may predict the current block (504), for example using an intra-prediction or inter-prediction mode as indicated by the prediction information for the current block, to calculate a prediction block for the current block. Video decoder 300 may then inverse scan the reproduced transform coefficients (506) to create a block of quantized transform coefficients. Video decoder 300 may then inverse quantize the transform coefficients and apply an inverse transform to them to produce a residual block (508). Video decoder 300 may finally decode the current block by combining the prediction block and the residual block (510).

圖6是示出用於對視訊資料進行處理的實例方法的流程圖。為了便於說明起見，關於處理電路和緩衝器來描述圖6的實例，處理電路的實例包括視訊轉碼器200和視訊解碼器300的處理電路，緩衝器的實例包括記憶體106、記憶體120、視訊資料記憶體230、DPB 218、CPB記憶體320、DPB 314、或者用於視訊轉碼器200或視訊解碼器300的任何其他記憶體。FIG. 6 is a flowchart illustrating an example method for processing video data. For ease of explanation, the example of FIG. 6 is described with respect to processing circuitry and a buffer. Examples of the processing circuitry include the processing circuitry of video encoder 200 and video decoder 300. Examples of the buffer include memory 106, memory 120, video data memory 230, DPB 218, CPB memory 320, DPB 314, or any other memory for video encoder 200 or video decoder 300.

處理電路可以被配置為決定用於對當前切片或圖片進行編碼或解碼的至少一個上下文的一或多個上下文值（600）。例如，當視訊轉碼器200和視訊解碼器300正在對當前切片或圖片進行編碼或解碼時，視訊轉碼器200或視訊解碼器300可以更新針對基於上下文的切片或圖片的最近編碼或解碼的視訊資料的上下文值。The processing circuitry may be configured to determine one or more context values for at least one context used to encode or decode the current slice or picture (600). For example, while video encoder 200 or video decoder 300 is encoding or decoding the current slice or picture, video encoder 200 or video decoder 300 may update the context values based on the most recently encoded or decoded video data of the slice or picture.

處理電路可以決定用於儲存來自兩個或兩個以上切片或圖片的時間初始化點集合以進行基於上下文的算術譯碼的緩衝器已滿（602）。如前述，每個時間初始化點集合與該兩個或兩個以上切片或圖片中的切片或圖片相關聯，並且包括一或多個時間初始化點。例如，緩衝區可以儲存的時間初始化點集合的數量可能存在限制，以保持緩衝區的大小實用。舉一個實例，緩衝器可以儲存多達五個時間初始化點集合。The processing circuitry may determine that a buffer used to store sets of temporal initialization points from two or more slices or pictures for context-based arithmetic decoding is full (602). As mentioned above, each set of temporal initialization points is associated with a slice or picture of the two or more slices or pictures, and includes one or more temporal initialization points. For example, there may be a limit on the number of sets of temporal initialization points that a buffer can store to keep the size of the buffer practical. As an example, the buffer can store up to five sets of time initialization points.

處理電路可以基於切片或圖片的切片類型、時間標識值或量化參數（QP）值中的至少一項，從該兩個或兩個以上切片或圖片中決定（例如，辨識）切片或圖片相關聯的第一時間初始化點集合（604）。例如，緩衝器亦可以儲存與儲存在緩衝器中的各個初始化點集合相關聯的切片和圖片的時間標識值、QP值及/或切片類型資訊。The processing circuitry may determine (e.g., identify) a first set of temporal initialization points associated with a slice or picture, from among the two or more slices or pictures, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of that slice or picture (604). For example, the buffer may also store the temporal identification values, QP values, and/or slice type information of the slices and pictures associated with the respective sets of initialization points stored in the buffer.

舉一個實例，為了決定第一時間初始化點集合，處理電路可以被配置為從該兩個或兩個以上切片或圖片中，決定與切片或圖片相關聯的第一時間初始化點集合，其在該兩個或兩個以上切片或圖片的時間標識值或量化參數（QP）值中具有最小時間標識值或QP值中的至少一項。例如，處理電路可以從該兩個或兩個以上切片或圖片的時間標識值中決定具有最小時間標識值的切片或圖片，及/或從該兩個或者更多個切片或者圖片的QP值中決定具有最小QP值的切片或者圖片。As an example, to determine the first set of temporal initialization points, the processing circuitry may be configured to determine, from among the two or more slices or pictures, the set associated with the slice or picture that has at least one of the smallest temporal identification value or the smallest QP value among the temporal identification values or QP values of the two or more slices or pictures. For example, the processing circuitry may determine the slice or picture having the smallest temporal identification value among the temporal identification values of the two or more slices or pictures, and/or the slice or picture having the smallest QP value among their QP values.

在一些實例中，具有最小時間標識值或QP值的第一初始化點集合可以具有與當前切片不同的切片類型。在一些實例中，例如針對切片類型儲存多個時間初始化點集合的情況下，視訊轉碼器200和視訊解碼器300可以決定與具有與當前切片相同的切片類型的切片相關聯的時間初始化點集合的封包。隨後，視訊轉碼器200和視訊解碼器300可以將該封包內具有最小時間標識值或QP值的時間初始化點集合，決定為第一時間初始化點集合。In some examples, the first set of initialization points having the smallest temporal identification value or QP value may have a different slice type than the current slice. In some examples, such as where multiple sets of temporal initialization points are stored per slice type, video encoder 200 and video decoder 300 may determine a group of sets of temporal initialization points associated with slices having the same slice type as the current slice. Video encoder 200 and video decoder 300 may then determine, as the first set of temporal initialization points, the set within that group having the smallest temporal identification value or QP value.
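The slice-type grouping described above can be sketched as a two-step choice: restrict the candidates to stored sets whose slice type matches the current slice, then take the smallest temporal identification value or QP value within that group. The dictionary keys, the function name, the tie-break order, and the fallback to all stored sets when the group is empty are assumptions made for this illustration.

```python
from typing import Dict, List, Union

Entry = Dict[str, Union[str, int]]


def pick_within_slice_type(entries: List[Entry], current_slice_type: str) -> int:
    """Return the index of the set to remove, preferring the current slice type.

    Each entry is a hypothetical record with "slice_type", "temporal_id" and
    "qp" keys. If no stored set shares the current slice type, all stored sets
    are considered (an assumption of this sketch).
    """
    group = [i for i, e in enumerate(entries)
             if e["slice_type"] == current_slice_type]
    candidates = group if group else list(range(len(entries)))
    # Smallest temporal ID first, then smallest QP (assumed tie-break).
    return min(candidates,
               key=lambda i: (entries[i]["temporal_id"], entries[i]["qp"]))
```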

處理電路可以從緩衝器中，去除與切片或圖片相關聯的第一時間初始化點集合（606）。處理電路可以在緩衝器中，儲存與當前切片或圖片相關聯的第二時間初始化點集合（608）。第二時間初始化點集合是基於所決定的一或多個上下文值（例如，等於所決定的用於當前切片或圖片的上下文的一或多個上下文數值，或者根據其來匯出）。在一些實例中，在處理當前切片或圖片的最後譯碼樹單元（CTU）之後，處理電路可以儲存第二時間初始化點集合。The processing circuit may remove the first set of initialization points associated with the slice or picture from the buffer (606). The processing circuit may store a second set of temporal initialization points associated with the current slice or picture in the buffer (608). The second set of temporal initialization points is based on the determined context value(s) (eg, equal to or derived from the determined context value(s) for the current slice or picture). In some examples, the processing circuitry may store a second set of temporal initialization points after processing the last coding tree unit (CTU) of the current slice or picture.

在一或多個實例中，對於按譯碼順序的後續切片或圖片，處理電路可以決定（例如，選擇）儲存在緩衝器中的時間初始化點集合。處理電路可以基於所選擇的時間初始化點集合，來初始化用於對後續切片或圖片進行編碼或解碼的至少一個上下文的一或多個上下文值（例如，將上下文值的初始值設置為等於所選擇的時間初始化點集合，或者基於所選擇的時間初始化點集合來匯出上下文值的初始值）。處理電路可以對後續切片或圖片進行基於上下文的算術編碼或解碼。In one or more examples, for a subsequent slice or picture in coding order, the processing circuitry may determine (e.g., select) a set of temporal initialization points stored in the buffer. The processing circuitry may initialize one or more context values for at least one context used to encode or decode the subsequent slice or picture based on the selected set of temporal initialization points (e.g., set the initial values of the context values equal to the selected set of temporal initialization points, or derive the initial values of the context values from the selected set). The processing circuitry may then perform context-based arithmetic encoding or decoding of the subsequent slice or picture.

圖7是示出用於對視訊資料進行處理的另一實例方法的流程圖。為了便於說明起見，關於處理電路和緩衝器描述了圖7的實例，處理電路的實例包括視訊轉碼器200和視訊解碼器300的處理電路，緩衝器的實例包括記憶體106、記憶體120、視訊資料記憶體230、DPB 218、CPB記憶體320、DPB 314、或者用於視訊轉碼器200或視訊解碼器300中的任何其他記憶體。FIG. 7 is a flowchart illustrating another example method for processing video data. For ease of explanation, the example of FIG. 7 is described with respect to processing circuitry and a buffer. Examples of the processing circuitry include the processing circuitry of video encoder 200 and video decoder 300. Examples of the buffer include memory 106, memory 120, video data memory 230, DPB 218, CPB memory 320, DPB 314, or any other memory used in video encoder 200 or video decoder 300.

如前述，處理電路可以去除與具有最小時間標識值或QP值的切片或圖片相關聯的時間初始化點集合。然而，在一些實例中，若不存在與當前切片或圖片具有相同的時間標識值、QP值或切片類型的切片或圖片相關聯的時間初始化點集合，則處理電路可以執行這種去除程序。As previously described, the processing circuit may remove the set of temporal initialization points associated with the slice or picture having the smallest temporal stamp value or QP value. However, in some examples, the processing circuitry may perform such removal if there is no set of temporal initialization points associated with a slice or picture that has the same time stamp value, QP value, or slice type as the current slice or picture.

例如，假設圖6的當前切片或圖片是第一切片或圖片。在該實例中，處理電路可以決定用於對第二切片或圖片進行編碼或解碼的至少一個上下文的一或多個上下文值（700）。亦即，處理電路可以對第二切片或圖片執行類似的編碼或解碼操作，並如前述地更新上下文值。For example, assume that the current slice or picture of Figure 6 is the first slice or picture. In this example, the processing circuitry may determine one or more context values for at least one context used to encode or decode the second slice or picture (700). That is, the processing circuitry may perform similar encoding or decoding operations on the second slice or picture and update the context value as previously described.

處理電路可以從兩個或兩個以上切片或圖片中決定第三切片或圖片，該第三切片或圖片具有與第二切片或圖片的時間標識值、QP值或切片類型相同的時間標識值、QP值和切片類型中的至少一項（702）。在該實例中，處理電路可以從緩衝器中去除與第三切片或圖片相關聯的第三時間初始化點集合，而不是去除與具有最小時間標識值或QP值的切片或圖片相關聯的初始化點集合（704）。處理電路可以基於所決定的用於對第二切片或圖片進行編碼或解碼的至少一個上下文的一或多個上下文值，在緩衝器中儲存與第二切片或圖片相關聯的第四時間初始化點集合（706）。The processing circuitry may determine, from among the two or more slices or pictures, a third slice or picture that has at least one of the same temporal identification value, the same QP value, or the same slice type as the second slice or picture (702). In this example, the processing circuitry may remove, from the buffer, a third set of temporal initialization points associated with the third slice or picture, rather than removing the set of initialization points associated with the slice or picture having the smallest temporal identification value or QP value (704). The processing circuitry may store, in the buffer, a fourth set of temporal initialization points associated with the second slice or picture, based on the determined one or more context values of the at least one context used to encode or decode the second slice or picture (706).

圖7的實例僅用於說明目的，不應視為限制。在一些實例中，處理電路可以不執行圖7的方法。相反，處理電路可以基於時間標識值或QP值，來去除初始化點集合（例如，與具有最小時間標識值或QP值的切片或圖片相關聯的時間初始化點集合）。此外，可以不將切片類型視作為這些因素之一。亦即，處理電路可以去除與當前切片或圖片具有相同的切片類型及/或相同的時間標識值及/或相同QP值的時間初始化點集合的條目。The example of Figure 7 is for illustrative purposes only and should not be considered limiting. In some examples, the processing circuitry may not perform the method of Figure 7. Instead, the processing circuit may remove the set of initialization points based on the time stamp value or QP value (eg, the set of time initialization points associated with the slice or picture with the smallest time stamp value or QP value). Additionally, slice type may not be considered one of these factors. That is, the processing circuit may remove entries in the temporal initialization point set that have the same slice type and/or the same time identifier value and/or the same QP value as the current slice or picture.

圖8是示出對視訊資料進行處理的另一種實例方法的流程圖。為了便於描述起見，關於處理電路和緩衝器描述了圖8的實例，處理電路的實例包括視訊轉碼器200和視訊解碼器300的處理電路，緩衝器的實例包括記憶體106、記憶體120、視訊資料記憶體230、DPB 218、CPB記憶體320、DPB 314、或者用於視訊轉碼器200或視訊解碼器300中的任何其他記憶體。FIG. 8 is a flowchart illustrating another example method for processing video data. For ease of description, the example of FIG. 8 is described with respect to processing circuitry and a buffer. Examples of the processing circuitry include the processing circuitry of video encoder 200 and video decoder 300. Examples of the buffer include memory 106, memory 120, video data memory 230, DPB 218, CPB memory 320, DPB 314, or any other memory used in video encoder 200 or video decoder 300.

類似於圖6和圖7，處理電路可以決定用於對當前切片或圖片進行編碼或解碼的至少一個上下文的一或多個上下文值（800）。在一或多個實例中，處理電路可以決定緩衝器是否已經針對與當前切片或圖片具有相同的時間標識值或QP值的切片或圖片儲存了時間初始化點集合（802）。Similar to Figures 6 and 7, processing circuitry may determine one or more context values for at least one context used to encode or decode the current slice or picture (800). In one or more examples, the processing circuitry may determine whether the buffer has stored a set of temporal initialization points for a slice or picture that has the same time stamp value or QP value as the current slice or picture (802).

若緩衝器已經針對與當前切片或圖片具有相同的時間標識值或QP值的切片或圖片儲存了時間初始化點集合（802處的「是」），則處理電路可以使用當前切片或圖片的時間初始化點來重寫（overwrite）（804）所儲存的時間初始化點集合（例如，所決定的一或多個上下文值或者根據該一或多個上下文值來匯出）。隨後，處理電路可以將下一個切片或圖片設置為當前切片或圖片，並返回決定用於對當前切片或圖片進行編碼或解碼的至少一個上下文的一或多個上下文值（800）。If the buffer already stores a set of temporal initialization points for a slice or picture that has the same temporal identification value or QP value as the current slice or picture ("Yes" at 802), the processing circuitry may overwrite (804) the stored set of temporal initialization points with the temporal initialization points of the current slice or picture (e.g., the determined one or more context values, or values derived from them). The processing circuitry may then set the next slice or picture as the current slice or picture and return to determining one or more context values for at least one context used to encode or decode the current slice or picture (800).

若緩衝器尚未儲存與當前切片或圖片具有相同的時間標識值或QP值的切片或圖片的時間初始化點集合（802處為「否」），則處理電路可以決定緩衝器是否已滿（806）。若緩衝器未滿（在806處為「否」），則處理電路可以將時間初始化點（例如，當前切片或圖片的上下文值，或者根據當前切片或圖片的上下文值匯出的值）儲存在緩衝器中（808）。隨後，處理電路可以將下一個切片或圖片設置為當前切片或圖片，並返回決定用於對當前切片或圖片進行編碼或解碼的至少一個上下文的一或多個上下文值（800）。If the buffer does not yet store a set of temporal initialization points for a slice or picture that has the same temporal identification value or QP value as the current slice or picture ("No" at 802), the processing circuitry may determine whether the buffer is full (806). If the buffer is not full ("No" at 806), the processing circuitry may store the temporal initialization points (e.g., the context values of the current slice or picture, or values derived from them) in the buffer (808). The processing circuitry may then set the next slice or picture as the current slice or picture and return to determining one or more context values for at least one context used to encode or decode the current slice or picture (800).

若緩衝器已滿（在806處為「是」），則處理電路可以從兩個或兩個以上切片或圖片中，決定（例如，辨識）與切片或圖片相關聯的第一時間初始化點集合，其在該兩個或兩個以上切片或圖片的時間標識值或量化參數（QP）值中具有最小時間標識值或QP值中的至少一項（810）。類似於圖6，處理電路可以從緩衝器中，去除與該切片或圖片相關聯的第一時間初始化點集合（812），並在緩衝器中，儲存與當前切片或圖片相關聯的第二時間初始化點集合，其中第二時間初始化點集合是基於所決定的一或多個上下文值（814）。If the buffer is full ("Yes" at 806), the processing circuitry may determine (e.g., identify), from among the two or more slices or pictures, the first set of temporal initialization points associated with the slice or picture that has at least one of the smallest temporal identification value or the smallest QP value among the temporal identification values or QP values of the two or more slices or pictures (810). Similar to FIG. 6, the processing circuitry may remove, from the buffer, the first set of temporal initialization points associated with that slice or picture (812), and store, in the buffer, a second set of temporal initialization points associated with the current slice or picture, where the second set of temporal initialization points is based on the determined one or more context values (814).
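The decision flow of FIG. 8 (steps 802 through 814) can be summarized in one function. This is a sketch under stated assumptions rather than the disclosed implementation: the `Entry` layout and `update_buffer` name are hypothetical, matching at step 802 is done on the temporal identification value only (matching on QP would be analogous), the removal at step 810 breaks ties by smallest QP, and the default capacity of five follows the example capacity given for FIG. 6.

```python
from dataclasses import dataclass
from typing import List

CAPACITY = 5  # example capacity; the text mentions up to five sets as one example


@dataclass
class Entry:
    """A stored set of temporal initialization points (hypothetical layout)."""
    temporal_id: int
    qp: int
    context_values: List[int]


def update_buffer(buffer: List[Entry], current: Entry,
                  capacity: int = CAPACITY) -> str:
    """Apply the FIG. 8 flow for the current slice or picture."""
    # 802/804: a set with the same temporal ID is already stored -> overwrite it.
    for i, stored in enumerate(buffer):
        if stored.temporal_id == current.temporal_id:
            buffer[i] = current
            return "overwritten"
    # 806/808: no match and the buffer still has room -> store.
    if len(buffer) < capacity:
        buffer.append(current)
        return "stored"
    # 810/812/814: buffer full -> remove the smallest temporal ID (then QP), store.
    victim = min(range(len(buffer)),
                 key=lambda i: (buffer[i].temporal_id, buffer[i].qp))
    del buffer[victim]
    buffer.append(current)
    return "replaced"
```

The returned string only reports which branch was taken, which makes the flow easy to trace when stepping through a sequence of slices or pictures.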

在圖8中示出並由處理電路執行的實例操作順序不應被視為限制性的。例如，處理電路可以首先決定緩衝器是否已滿（806），並且若緩衝器未滿，則處理電路可以儲存初始化點（808），即使緩衝器已經儲存與當前切片或圖片具有相同的時間標識值或QP值的切片或圖片的時間初始化點。再舉一個實例，若緩衝器已滿，則處理電路可以去除與具有最小時間標識值或QP值的切片或圖片相關聯的時間初始化點集合，即使緩衝器已經儲存了與當前切片或圖片具有相同時間標識值和QP值的切片或圖片的時間初始化點。可以對操作順序進行其他修改。The example order of operations shown in FIG. 8 and performed by the processing circuitry should not be considered limiting. For example, the processing circuitry may first determine whether the buffer is full (806) and, if the buffer is not full, may store the initialization points (808) even if the buffer already stores temporal initialization points for a slice or picture that has the same temporal identification value or QP value as the current slice or picture. As another example, if the buffer is full, the processing circuitry may remove the set of temporal initialization points associated with the slice or picture having the smallest temporal identification value or QP value, even if the buffer already stores temporal initialization points for a slice or picture that has the same temporal identification value and QP value as the current slice or picture. Other modifications to the order of operations are possible.

The techniques shown at reference numerals 804 and 812 are examples of removing a set of temporal initialization points. In general, the processing circuitry may first determine whether the buffer stores temporal initialization points having the same temporal ID or QP value as the current slice or picture and, if so, remove that set of temporal initialization points and write the temporal initialization points of the current slice or picture. Otherwise, the processing circuitry may remove the determined (e.g., identified) set of initialization points, where the determined set is associated with the slice or picture having the smallest temporal identification value or QP value.
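By way of illustration only, the buffer update described above may be sketched as follows. The sketch is not taken from the disclosure: the names `InitPointSet` and `store_init_points`, the list-based buffer, and the use of a single comparison key (temporal identification value by default, QP value as an alternative) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class InitPointSet:
    """Hypothetical record for one stored set of temporal initialization points."""
    temporal_id: int
    qp: int
    points: tuple  # context values captured for the slice or picture


def store_init_points(buffer, new_set, capacity, key="temporal_id"):
    """Store new_set in buffer (a list); return the removed set, or None."""
    def value(entry):
        return getattr(entry, key)

    # Same temporal ID (or QP) already stored: replace that set.
    for i, entry in enumerate(buffer):
        if value(entry) == value(new_set):
            buffer[i] = new_set
            return entry

    # Buffer not full: store without removing anything.
    if len(buffer) < capacity:
        buffer.append(new_set)
        return None

    # Buffer full: remove the set with the smallest temporal ID (or QP).
    victim = min(range(len(buffer)), key=lambda i: value(buffer[i]))
    removed = buffer[victim]
    buffer[victim] = new_set
    return removed
```

Passing `key="qp"` applies the same policy to QP values. When several stored sets share the smallest value, this sketch removes the first one found, which is a choice of the example rather than something the description specifies.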

The following describes example techniques that may be performed together or separately.

Clause 1. A method of coding video data, the method comprising: storing a plurality of temporal initialization points for one or more contexts used in context-based arithmetic coding of video data of a current picture or slice, wherein the plurality of temporal initialization points are included in video data of one or more previous pictures that precede the current picture in coding order; storing a respective temporal identification (ID) value associated with each temporal initialization point of the plurality of temporal initialization points; selecting at least one temporal initialization point (e.g., if available) of the plurality of temporal initialization points based on the respective temporal ID values associated with the plurality of temporal initialization points and a temporal ID value of the current picture or slice; and context-based arithmetic coding the video data of the current picture or slice based on the selected at least one temporal initialization point.

Clause 2. The method of clause 1, wherein selecting the at least one temporal initialization point comprises: determining a set of temporal initialization points of the plurality of temporal initialization points, wherein the respective temporal ID values of the temporal initialization points in the set are less than or equal to the temporal ID value of the current picture or slice; and selecting the at least one temporal initialization point from the set of temporal initialization points.

Clause 3. The method of clause 1, wherein selecting the at least one temporal initialization point comprises: determining a set of temporal initialization points of the plurality of temporal initialization points, wherein the respective temporal ID values of the temporal initialization points in the set are equal to the temporal ID value of the current picture or slice; and selecting the at least one temporal initialization point from the set of temporal initialization points.

Clause 4. The method of clause 1, wherein selecting the at least one temporal initialization point comprises: determining that no temporal initialization point of the plurality of temporal initialization points has a temporal ID value equal to the temporal ID value of the current picture or slice; based on the determination that no temporal initialization point of the plurality of temporal initialization points has a temporal ID value equal to the temporal ID value of the current picture or slice, determining whether any temporal initialization point of the plurality of temporal initialization points has a temporal ID value that is one less than the temporal ID value of the current picture or slice; and based on determining that there are one or more temporal initialization points having a temporal ID value that is one less than the temporal ID value of the current picture or slice, selecting the at least one temporal initialization point from the one or more temporal initialization points having the temporal ID value that is one less than the temporal ID value of the current picture or slice.

Clause 5. A method of coding video data, the method comprising: storing, in a first buffer, one or more temporal initialization points for one or more contexts used in context-based arithmetic coding of the video data of one or more previous pictures; storing, in a second buffer, temporal initialization points for one or more contexts used in context-based arithmetic coding of the video data of a slice of a current picture, wherein storing in the second buffer comprises storing in the second buffer during coding of the video data of the current picture; and after processing the last coding tree unit (CTU) or slice of the current picture, storing the temporal initialization points stored in the second buffer in the first buffer.

Clause 6. The method of any of clauses 1-4, wherein storing the plurality of temporal initialization points comprises storing the plurality of temporal initialization points according to the method of clause 5.

Clause 7. A method of coding video data, the method comprising: storing a previous temporal initialization point for one or more contexts used in context-based arithmetic coding of the video data of a previous slice in a previous picture; determining, for a current slice, that the current slice in a current picture has a position in the current picture that corresponds to a position of the previous slice in the previous picture; and based on the current slice having a position in the current picture that corresponds to the position of the previous slice in the previous picture, determining a current temporal initialization point for the current slice based on the previous temporal initialization point.

Clause 8. A method of coding video data, the method comprising any combination of clauses 1-7.

Clause 9. The method of any of clauses 1-8, wherein the context-based arithmetic coding comprises context-adaptive binary arithmetic coding (CABAC).

Clause 10. The method of any of clauses 1-9, wherein the context-based arithmetic coding comprises context-based arithmetic decoding.

Clause 11. The method of any of clauses 1-9, wherein the context-based arithmetic coding comprises context-based arithmetic encoding.

Clause 12. A device for coding video data, the device comprising: memory configured to store video data; and processing circuitry configured to perform the method of any one or combination of clauses 1-11.

Clause 13. The device of clause 12, wherein the device comprises a video decoder.

Clause 14. The device of any of clauses 12 and 13, wherein the device comprises a video encoder.

Clause 15. The device of any of clauses 12-14, further comprising a display configured to display decoded video data.

Clause 16. The device of any of clauses 12-15, wherein the device comprises one or more of a camera, a computer, a mobile device, a broadcast receiver device, or a set-top box.

Clause 17. A computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors to perform the method of any of clauses 1-11.

Clause 18. A device for coding video data, the device comprising means for performing the method of any one or combination of clauses 1-11.
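By way of illustration only, the two-buffer storage of clause 5 and the fallback selection of clause 4 may be sketched together as follows. The class name `TemporalInitStore`, its method names, and the rule of preferring the most recently committed entry among equal candidates are assumptions made for the example and do not appear in the clauses.

```python
class TemporalInitStore:
    """Hypothetical two-buffer store for temporal initialization points."""

    def __init__(self):
        self.committed = []  # first buffer: (temporal_id, init_point) from previous pictures
        self.working = []    # second buffer: pairs captured while coding the current picture

    def record(self, temporal_id, init_point):
        # Called during coding of the current picture; writes the second buffer only.
        self.working.append((temporal_id, init_point))

    def commit(self):
        # Called after the last CTU or slice of the current picture has been processed.
        self.committed.extend(self.working)
        self.working.clear()

    def select(self, current_tid):
        # Prefer an equal temporal ID, then a temporal ID that is one less.
        for wanted in (current_tid, current_tid - 1):
            matches = [p for tid, p in self.committed if tid == wanted]
            if matches:
                return matches[-1]  # assumption: most recently committed candidate
        return None  # nothing suitable stored; the caller would fall back to default initialization
```

Because `select` reads only the first buffer, initialization points recorded for the picture currently being coded do not become selectable until `commit` has run.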

Clause 1A. A method of processing video data, the method comprising: determining one or more context values for at least one context used for encoding or decoding a current slice or picture; determining that a buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, wherein each set of temporal initialization points is associated with a slice or picture of the two or more slices or pictures and includes one or more temporal initialization points; determining a first set of temporal initialization points associated with a slice or picture, from among the two or more slices or pictures, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture; removing, from the buffer, the first set of temporal initialization points associated with the slice or picture; and storing, in the buffer, a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points is based on the determined one or more context values.

Clause 2A. The method of clause 1A, wherein determining the first set of temporal initialization points comprises: determining, from among the two or more slices or pictures, the first set of temporal initialization points associated with the slice or picture that has at least one of the smallest temporal identification value or the smallest QP value among the temporal identification values or quantization parameter (QP) values of the two or more slices or pictures.

Clause 3A. The method of clause 2A, wherein determining, from among the two or more slices or pictures, the slice or picture having at least one of the smallest temporal identification value or QP value comprises: determining the slice or picture having the smallest temporal identification value among the temporal identification values of the two or more slices or pictures.

Clause 4A. The method of clause 2A, wherein determining, from among the two or more slices or pictures, the slice or picture having at least one of the smallest temporal identification value or QP value comprises: determining the slice or picture having the smallest QP value among the QP values of the two or more slices or pictures.

Clause 5A. The method of any of clauses 1A-4A, further comprising: determining that at least one of a temporal identification value or a QP value of the current slice or picture is different from the temporal identification value or QP value of each of the two or more slices or pictures, wherein the temporal identification value or QP value of the slice or picture associated with the first set of temporal initialization points is different from the temporal identification value or QP value of the current slice or picture, and wherein removing the first set of temporal initialization points comprises: removing the first set of temporal initialization points based on the determination that at least one of the temporal identification value or the QP value of the current slice or picture is different from the temporal identification value and QP value of each of the two or more slices or pictures.

Clause 6A. The method of any of clauses 1A-5A, wherein the current slice or picture is a first slice or picture, the method further comprising: determining one or more context values for at least one context used for encoding or decoding a second slice or picture; determining, from among the two or more slices or pictures, a third slice or picture having at least one of the same temporal identification value, QP value, or slice type as the temporal identification value, QP value, or slice type of the second slice or picture; removing, from the buffer, a third set of temporal initialization points associated with the third slice or picture; and storing, in the buffer, a fourth set of temporal initialization points associated with the second slice or picture based on the determined one or more context values for the at least one context used for encoding or decoding the second slice or picture.

Clause 7A. The method of any of clauses 1A-6A, further comprising: determining a set of temporal initialization points stored in the buffer; initializing, based on the determined set of temporal initialization points, one or more context values for at least one context used for encoding or decoding a subsequent slice or picture; and context-based arithmetic encoding or decoding the subsequent slice or picture.

Clause 8A. The method of clause 7A, further comprising: determining a temporal identification value of the subsequent slice or picture; determining that none of two or more slices or pictures having associated sets of temporal initialization points stored in the buffer has a temporal identification value equal to the temporal identification value of the subsequent slice or picture; and determining, from among the two or more slices or pictures, a second slice or picture having a temporal identification value that is closest to and less than the temporal identification value of the subsequent slice or picture, wherein determining the set of temporal initialization points comprises selecting the set of temporal initialization points associated with the second slice or picture.

Clause 9A. The method of clause 7A, further comprising: determining a QP value of the subsequent slice or picture; determining that none of two or more slices or pictures having associated sets of temporal initialization points stored in the buffer has a QP value equal to the QP value of the subsequent slice or picture; and determining, from among the two or more slices or pictures, a second slice or picture having a QP value that is closest to the QP value of the subsequent slice or picture, wherein determining the set of temporal initialization points comprises selecting the set of temporal initialization points associated with the second slice or picture.

Clause 10A. The method of any of clauses 1A-9A, wherein storing the second set of temporal initialization points comprises: storing the second set of temporal initialization points after processing the last coding tree unit (CTU) of the current slice or picture.

Clause 11A. A device for processing video data, the device comprising: a buffer configured to store sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding; and processing circuitry coupled to the buffer, the processing circuitry configured to: determine one or more context values for at least one context used for encoding or decoding a current slice or picture; determine that the buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, wherein each set of temporal initialization points is associated with a slice or picture of the two or more slices or pictures and includes one or more temporal initialization points; determine a first set of temporal initialization points associated with a slice or picture, from among the two or more slices or pictures, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture; remove, from the buffer, the first set of temporal initialization points associated with the slice or picture; and store, in the buffer, a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points is based on the determined one or more context values.

Clause 12A. The device of clause 11A, wherein to determine the first set of temporal initialization points, the processing circuitry is configured to: determine, from among the two or more slices or pictures, the first set of temporal initialization points associated with the slice or picture that has at least one of the smallest temporal identification value or the smallest QP value among the temporal identification values or quantization parameter (QP) values of the two or more slices or pictures.

Clause 13A. The device of clause 12A, wherein to determine, from among the two or more slices or pictures, the slice or picture having at least one of the smallest temporal identification value or QP value, the processing circuitry is configured to: determine the slice or picture having the smallest temporal identification value among the temporal identification values of the two or more slices or pictures.

Clause 14A. The device of clause 12A, wherein to determine, from among the two or more slices or pictures, the slice or picture having at least one of the smallest temporal identification value or QP value, the processing circuitry is configured to: determine the slice or picture having the smallest QP value among the QP values of the two or more slices or pictures.

Clause 15A. The device of any of clauses 11A-14A, wherein the processing circuitry is configured to: determine that at least one of a temporal identification value or a QP value of the current slice or picture is different from the temporal identification value or QP value of each of the two or more slices or pictures, wherein the temporal identification value or QP value of the slice or picture associated with the first set of temporal initialization points is different from the temporal identification value or QP value of the current slice or picture, and wherein to remove the first set of temporal initialization points, the processing circuitry is configured to: remove the first set of temporal initialization points based on the determination that at least one of the temporal identification value or the QP value of the current slice or picture is different from the temporal identification value and QP value of each of the two or more slices or pictures.

Clause 16A. The device of any of clauses 11A-15A, wherein the current slice or picture is a first slice or picture, and wherein the processing circuitry is configured to: determine one or more context values for at least one context used for encoding or decoding a second slice or picture; determine, from among the two or more slices or pictures, a third slice or picture having at least one of the same temporal identification value, QP value, or slice type as the temporal identification value, QP value, or slice type of the second slice or picture; remove, from the buffer, a third set of temporal initialization points associated with the third slice or picture; and store, in the buffer, a fourth set of temporal initialization points associated with the second slice or picture based on the determined one or more context values for the at least one context used for encoding or decoding the second slice or picture.

Clause 17A. The device of any of clauses 11A-16A, wherein the processing circuitry is configured to: determine a set of temporal initialization points stored in the buffer; initialize, based on the determined set of temporal initialization points, one or more context values for at least one context used for encoding or decoding a subsequent slice or picture; and context-based arithmetic encode or decode the subsequent slice or picture.

Clause 18A. The device of clause 17A, wherein the processing circuitry is configured to: determine a temporal identification value of the subsequent slice or picture; determine that none of two or more slices or pictures having associated sets of temporal initialization points stored in the buffer has a temporal identification value equal to the temporal identification value of the subsequent slice or picture; and determine, from among the two or more slices or pictures, a second slice or picture having a temporal identification value that is closest to and less than the temporal identification value of the subsequent slice or picture, wherein to determine the set of temporal initialization points, the processing circuitry is configured to select the set of temporal initialization points associated with the second slice or picture.

Clause 19A. The device of clause 17A, wherein the processing circuitry is configured to: determine a QP value of the subsequent slice or picture; determine that none of two or more slices or pictures having associated sets of temporal initialization points stored in the buffer has a QP value equal to the QP value of the subsequent slice or picture; and determine, from among the two or more slices or pictures, a second slice or picture having a QP value that is closest to the QP value of the subsequent slice or picture, wherein to determine the set of temporal initialization points, the processing circuitry is configured to select the set of temporal initialization points associated with the second slice or picture.

Clause 20A. The device of any of clauses 11A-19A, wherein to store the second set of temporal initialization points, the processing circuitry is configured to: store the second set of temporal initialization points after processing the last coding tree unit (CTU) of the current slice or picture.

Clause 21A. The device of any of clauses 11A-20A, wherein the device comprises one or more of a camera, a computer, a mobile device, a broadcast receiver device, or a set-top box.

Clause 22A. A computer-readable storage medium having stored thereon instructions that, when executed, cause one or more processors to: determine one or more context values for at least one context used for encoding or decoding a current slice or picture; determine that a buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, wherein each set of temporal initialization points is associated with a slice or picture of the two or more slices or pictures and includes one or more temporal initialization points; determine a first set of temporal initialization points associated with a slice or picture, from among the two or more slices or pictures, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture; remove, from the buffer, the first set of temporal initialization points associated with the slice or picture; and store, in the buffer, a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points is based on the determined one or more context values.

Clause 23A. The computer-readable storage medium of clause 22A, further comprising instructions that cause the one or more processors to perform the method of any of clauses 1A-10A.

Clause 24A. A device for processing video data, the device comprising: means for determining one or more context values for at least one context used for encoding or decoding a current slice or picture; means for determining that a buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, wherein each set of temporal initialization points is associated with a slice or picture of the two or more slices or pictures and includes one or more temporal initialization points; means for determining a first set of temporal initialization points associated with a slice or picture, from among the two or more slices or pictures, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture; means for removing, from the buffer, the first set of temporal initialization points associated with the slice or picture; and means for storing, in the buffer, a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points is based on the determined one or more context values.

Clause 25A. The device of clause 24A, further comprising instructions for causing the one or more processors to perform the method of any of clauses 1A-10A.
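By way of illustration only, the selection rules of clauses 8A and 9A may be sketched as follows. The function names, the dictionary-based buffer keyed by temporal identification value or by QP value, and the tie-break toward the lower QP value when two stored QP values are equally close are assumptions made for the example.

```python
def select_by_temporal_id(buffer, temporal_id):
    """buffer maps temporal identification value -> stored set of initialization points."""
    if temporal_id in buffer:
        return buffer[temporal_id]
    # No exact match: take the closest stored value that is smaller.
    lower = [t for t in buffer if t < temporal_id]
    return buffer[max(lower)] if lower else None


def select_by_qp(buffer, qp):
    """buffer maps QP value -> stored set of initialization points."""
    if not buffer:
        return None
    if qp in buffer:
        return buffer[qp]
    # No exact match: take the closest stored QP (assumed tie-break: lower QP).
    nearest = min(buffer, key=lambda q: (abs(q - qp), q))
    return buffer[nearest]
```

A return value of `None` indicates that nothing suitable is stored; in that case a coder would presumably fall back to its default context initialization, which is outside the scope of this sketch.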

It is to be recognized that, depending on the example, certain acts or events of any of the techniques described herein may be performed in a different sequence, and may be added, merged, or left out altogether (e.g., not all described acts or events are necessary for the practice of the techniques). Moreover, in certain examples, acts or events may be performed concurrently, e.g., through multi-threaded processing, interrupt processing, or multiple processors, rather than sequentially.

In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media that are non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.

By way of example, and not limitation, such computer-readable storage media can include RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

Instructions may be executed by one or more processors, such as one or more DSPs, general-purpose microprocessors, ASICs, FPGAs, or other equivalent integrated or discrete logic circuitry. Accordingly, the terms "processor" and "processing circuitry," as used herein, may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.

The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.

Various examples have been described. These and other examples are within the scope of the following claims.

100: Video encoding and decoding system
102: Source device
104: Video source
106: Memory
108: Output interface
110: Computer-readable medium
112: Storage device
114: File server
116: Destination device
118: Display device
120: Memory
122: Input interface
200: Video encoder
202: Mode selection unit
204: Residual generation unit
206: Transform processing unit
208: Quantization unit
210: Inverse quantization unit
212: Inverse transform processing unit
214: Reconstruction unit
216: Filter unit
218: Decoded picture buffer (DPB)
220: Entropy encoding unit
222: Motion estimation unit
224: Motion compensation unit
226: Intra-prediction unit
230: Video data memory
300: Video decoder
302: Entropy decoding unit
304: Prediction processing unit
306: Inverse quantization unit
308: Inverse transform processing unit
310: Reconstruction unit
312: Filter unit
314: Decoded picture buffer (DPB)
316: Motion compensation unit
318: Intra-prediction unit
320: CPB memory
400, 402, 404, 406, 408, 410: Block
500, 502, 504, 506, 508, 510: Block
600, 602, 604, 606, 608: Block
700, 702, 704, 706: Block
800, 802, 804, 806, 808, 810, 812, 814: Block

FIG. 1 is a block diagram illustrating an example video encoding and decoding system that may perform the techniques of this disclosure.

FIG. 2 is a block diagram illustrating an example video encoder that may perform the techniques of this disclosure.

FIG. 3 is a block diagram illustrating an example video decoder that may perform the techniques of this disclosure.

FIG. 4 is a flowchart illustrating an example method for encoding a current block in accordance with the techniques of this disclosure.

FIG. 5 is a flowchart illustrating an example method for decoding a current block in accordance with the techniques of this disclosure.

FIG. 6 is a flowchart illustrating an example method for processing video data.

FIG. 7 is a flowchart illustrating another example method for processing video data.

FIG. 8 is a flowchart illustrating another example method for processing video data.

Domestic deposit information (please note in order of depository institution, date, and number): None
Foreign deposit information (please note in order of depository country, institution, date, and number): None

Claims

A method of processing video data, the method comprising: determining one or more context values for at least one context used for encoding or decoding a current slice or picture; determining that a buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, wherein each set of temporal initialization points is associated with a respective slice or picture of the two or more slices or pictures and includes one or more temporal initialization points; determining a first set of temporal initialization points associated with a slice or picture, from among the two or more slices or pictures, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture; removing, from the buffer, the first set of temporal initialization points associated with the slice or picture; and storing, in the buffer, a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points is based on the determined one or more context values.

The method according to claim 1, wherein determining the first set of temporal initialization points comprises: determining, from among the two or more slices or pictures, the first set of temporal initialization points associated with the slice or picture having at least one of a smallest temporal identification value or a smallest QP value among the temporal identification values or quantization parameter (QP) values of the two or more slices or pictures.

The method according to claim 2, wherein determining, from among the two or more slices or pictures, the slice or picture having at least one of the smallest temporal identification value or the smallest QP value comprises: determining, from among the temporal identification values of the two or more slices or pictures, the slice or picture having the smallest temporal identification value.

The method according to claim 2, wherein determining, from among the two or more slices or pictures, the slice or picture having at least one of the smallest temporal identification value or the smallest QP value comprises: determining, from among the QP values of the two or more slices or pictures, the slice or picture having the smallest QP value.

The method according to claim 1, further comprising: determining that at least one of a temporal identification value or a QP value for the current slice or picture is different from the temporal identification value or QP value for each of the two or more slices or pictures, wherein the temporal identification value or QP value of the slice or picture associated with the first set of temporal initialization points is different from the temporal identification value or QP value of the current slice or picture, and wherein removing the first set of temporal initialization points comprises: removing the first set of temporal initialization points based on determining that at least one of the temporal identification value or the QP value for the current slice or picture is different from the temporal identification value and QP value for each of the two or more slices or pictures.

The method according to claim 1, wherein the current slice or picture is a first slice or picture, the method further comprising: determining one or more context values for at least one context used for encoding or decoding a second slice or picture; determining, from among the two or more slices or pictures, a third slice or picture having at least one of a temporal identification value, a QP value, or a slice type that is the same as a temporal identification value, a QP value, or a slice type of the second slice or picture; removing, from the buffer, a third set of temporal initialization points associated with the third slice or picture; and storing, in the buffer, a fourth set of temporal initialization points associated with the second slice or picture, based on the determined one or more context values for the at least one context used for encoding or decoding the second slice or picture.

The method according to claim 1, further comprising: determining a set of temporal initialization points stored in the buffer; initializing, based on the determined set of temporal initialization points, one or more context values for at least one context used for encoding or decoding a subsequent slice or picture; and performing context-based arithmetic encoding or decoding on the subsequent slice or picture.

The method according to claim 7, further comprising: determining a temporal identification value for the subsequent slice or picture; determining that none of the two or more slices or pictures having associated sets of temporal initialization points stored in the buffer has a temporal identification value equal to the temporal identification value for the subsequent slice or picture; and determining, from among the two or more slices or pictures, a second slice or picture having a temporal identification value that is closest to and less than the temporal identification value of the subsequent slice or picture, wherein determining the set of temporal initialization points comprises: selecting the set of temporal initialization points associated with the second slice or picture.

The method according to claim 7, further comprising: determining a QP value for the subsequent slice or picture; determining that none of the two or more slices or pictures having associated sets of temporal initialization points stored in the buffer has a QP value equal to the QP value for the subsequent slice or picture; and determining, from among the two or more slices or pictures, a second slice or picture having a QP value closest to the QP value of the subsequent slice or picture, wherein determining the set of temporal initialization points comprises: selecting the set of temporal initialization points associated with the second slice or picture.

The method according to claim 1, wherein storing the second set of temporal initialization points comprises: storing the second set of temporal initialization points after processing a last coding tree unit (CTU) of the current slice or picture.

A device for processing video data, the device comprising: a buffer configured to store sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding; and processing circuitry coupled to the buffer, the processing circuitry being configured to: determine one or more context values for at least one context used for encoding or decoding a current slice or picture; determine that the buffer for storing the sets of temporal initialization points from the two or more slices or pictures for context-based arithmetic coding is full, wherein each set of temporal initialization points is associated with a respective slice or picture of the two or more slices or pictures and includes one or more temporal initialization points; determine a first set of temporal initialization points associated with a slice or picture, from among the two or more slices or pictures, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture; remove, from the buffer, the first set of temporal initialization points associated with the slice or picture; and store, in the buffer, a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points is based on the determined one or more context values.

The device according to claim 11, wherein, to determine the first set of temporal initialization points, the processing circuitry is configured to: determine, from among the two or more slices or pictures, the first set of temporal initialization points associated with the slice or picture having at least one of a smallest temporal identification value or a smallest QP value among the temporal identification values or quantization parameter (QP) values of the two or more slices or pictures.

The device according to claim 12, wherein, to determine, from among the two or more slices or pictures, the slice or picture having at least one of the smallest temporal identification value or the smallest QP value, the processing circuitry is configured to: determine, from among the temporal identification values of the two or more slices or pictures, the slice or picture having the smallest temporal identification value.

The device according to claim 12, wherein, to determine, from among the two or more slices or pictures, the slice or picture having at least one of the smallest temporal identification value or the smallest QP value, the processing circuitry is configured to: determine, from among the QP values of the two or more slices or pictures, the slice or picture having the smallest QP value.

The device according to claim 11, wherein the processing circuitry is configured to: determine that at least one of a temporal identification value or a QP value for the current slice or picture is different from the temporal identification value or QP value for each of the two or more slices or pictures, wherein the temporal identification value or QP value of the slice or picture associated with the first set of temporal initialization points is different from the temporal identification value or QP value of the current slice or picture, and wherein, to remove the first set of temporal initialization points, the processing circuitry is configured to: remove the first set of temporal initialization points based on determining that at least one of the temporal identification value or the QP value for the current slice or picture is different from the temporal identification value and QP value for each of the two or more slices or pictures.

The device according to claim 11, wherein the current slice or picture is a first slice or picture, and wherein the processing circuitry is configured to: determine one or more context values for at least one context used for encoding or decoding a second slice or picture; determine, from among the two or more slices or pictures, a third slice or picture having at least one of a temporal identification value, a QP value, or a slice type that is the same as a temporal identification value, a QP value, or a slice type of the second slice or picture; remove, from the buffer, a third set of temporal initialization points associated with the third slice or picture; and store, in the buffer, a fourth set of temporal initialization points associated with the second slice or picture, based on the determined one or more context values for the at least one context used for encoding or decoding the second slice or picture.

The device according to claim 11, wherein the processing circuitry is configured to: determine a set of temporal initialization points stored in the buffer; initialize, based on the determined set of temporal initialization points, one or more context values for at least one context used for encoding or decoding a subsequent slice or picture; and perform context-based arithmetic encoding or decoding on the subsequent slice or picture.

The device according to claim 17, wherein the processing circuitry is configured to: determine a temporal identification value for the subsequent slice or picture; determine that none of the two or more slices or pictures having associated sets of temporal initialization points stored in the buffer has a temporal identification value equal to the temporal identification value for the subsequent slice or picture; and determine, from among the two or more slices or pictures, a second slice or picture having a temporal identification value that is closest to and less than the temporal identification value of the subsequent slice or picture, wherein, to determine the set of temporal initialization points, the processing circuitry is configured to: select the set of temporal initialization points associated with the second slice or picture.

The device according to claim 17, wherein the processing circuitry is configured to: determine a QP value for the subsequent slice or picture; determine that none of the two or more slices or pictures having associated sets of temporal initialization points stored in the buffer has a QP value equal to the QP value for the subsequent slice or picture; and determine, from among the two or more slices or pictures, a second slice or picture having a QP value closest to the QP value of the subsequent slice or picture, wherein, to determine the set of temporal initialization points, the processing circuitry is configured to: select the set of temporal initialization points associated with the second slice or picture.

The device according to claim 11, wherein, to store the second set of temporal initialization points, the processing circuitry is configured to: store the second set of temporal initialization points after processing a last coding tree unit (CTU) of the current slice or picture.

The device according to claim 11, wherein the device includes one or more of a camera, a computer, a mobile device, a broadcast receiver device or a set-top box.

A computer-readable storage medium having instructions stored thereon that, when executed, cause one or more processors to: determine one or more context values for at least one context used for encoding or decoding a current slice or picture; determine that a buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, wherein each set of temporal initialization points is associated with a respective slice or picture of the two or more slices or pictures and includes one or more temporal initialization points; determine a first set of temporal initialization points associated with a slice or picture, from among the two or more slices or pictures, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture; remove, from the buffer, the first set of temporal initialization points associated with the slice or picture; and store, in the buffer, a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points is based on the determined one or more context values.

A device for processing video data, the device comprising: means for determining one or more context values for at least one context used for encoding or decoding a current slice or picture; means for determining that a buffer for storing sets of temporal initialization points from two or more slices or pictures for context-based arithmetic coding is full, wherein each set of temporal initialization points is associated with a respective slice or picture of the two or more slices or pictures and includes one or more temporal initialization points; means for determining a first set of temporal initialization points associated with a slice or picture, from among the two or more slices or pictures, based on at least one of a slice type, a temporal identification value, or a quantization parameter (QP) value of the slice or picture; means for removing, from the buffer, the first set of temporal initialization points associated with the slice or picture; and means for storing, in the buffer, a second set of temporal initialization points associated with the current slice or picture, wherein the second set of temporal initialization points is based on the determined one or more context values.
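
For illustration only, the buffer management recited in the claims above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `InitPointSet`, `InitPointBuffer`, `store`, and `select` are hypothetical, the replacement rule requires slice type, temporal identification value, and QP value to all match (the claims recite "at least one of"), eviction picks the smallest temporal identification value with ties broken by the smallest QP value, and selection combines the temporal-identification and QP rules into one lookup.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InitPointSet:
    """One stored set of temporal initialization points (hypothetical layout)."""
    slice_type: str            # e.g. "B" or "P"
    temporal_id: int           # temporal identification value
    qp: int                    # quantization parameter value
    context_values: List[int]  # context values saved after the last CTU


class InitPointBuffer:
    """Fixed-capacity store of initialization point sets (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: List[InitPointSet] = []

    def store(self, new: InitPointSet) -> Optional[InitPointSet]:
        """Store `new` and return the entry it displaced, if any."""
        # Assumption: an entry with the same slice type, temporal ID, and QP
        # is replaced in place rather than triggering an eviction.
        for i, e in enumerate(self.entries):
            if (e.slice_type, e.temporal_id, e.qp) == (
                new.slice_type, new.temporal_id, new.qp
            ):
                self.entries[i] = new
                return e
        evicted = None
        if len(self.entries) >= self.capacity:
            # Buffer full: evict the smallest temporal ID; the QP tie-break
            # is an assumption of this sketch.
            evicted = min(self.entries, key=lambda e: (e.temporal_id, e.qp))
            self.entries.remove(evicted)
        self.entries.append(new)
        return evicted

    def select(self, temporal_id: int, qp: int) -> Optional[InitPointSet]:
        """Choose a stored set for initializing a subsequent slice or picture."""
        pool = [e for e in self.entries if e.temporal_id == temporal_id]
        if not pool:
            # No exact match: fall back to the closest smaller temporal ID.
            lower = [e for e in self.entries if e.temporal_id < temporal_id]
            if not lower:
                return None
            closest = max(e.temporal_id for e in lower)
            pool = [e for e in lower if e.temporal_id == closest]
        # Among the candidates, take the QP closest to the requested QP.
        return min(pool, key=lambda e: abs(e.qp - qp))
```

In this sketch, a coder would call `store` once the last CTU of the current slice or picture has been processed and call `select` before coding a subsequent slice or picture. Returning `None` from `select` is a hypothetical convention signaling that the coder should fall back to its default context initialization.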