KR20210036402A

KR20210036402A - Reference image management in video coding

Info

Publication number: KR20210036402A
Application number: KR1020217007340A
Authority: KR
Inventors: Ye-Kui Wang; FNU Hendry
Original assignee: Huawei Technologies Co., Ltd.
Priority date: 2018-08-17
Filing date: 2019-08-16
Publication date: 2021-04-02
Also published as: SG11202101407QA; JP2021534676A; JP2023085317A; CN114501018B; EP3831070A4; AU2019322914B2; PH12021550312A1; KR20210036401A; JP2021534677A; JP2024032732A; CN114205590A; JP2023065392A; SG11202101399VA; JP2023095886A; MX2021001743A; CN114697663A; CN113141784A; EP3831054A1; KR102659936B1; US11979553B2

Abstract

A method of decoding a coded video bitstream is provided. The method includes parsing a parameter set represented in the coded video bitstream. The parameter set includes a set of syntax elements that includes a set of reference picture list structures. The method also includes parsing a slice header of a current slice represented in the coded video bitstream. The slice header includes an index of a reference picture list structure within the set of reference picture list structures in the parameter set. The method further includes deriving a reference picture list of the current slice based on the set of syntax elements in the parameter set and the index of the reference picture list structure. The method also includes obtaining one or more reconstructed blocks of the current slice based on the reference picture list.

Description

Reference picture management in video coding

Cross-reference to related applications

This patent application claims the benefit of U.S. Provisional Patent Application No. 62/719,360, filed on August 17, 2018 by Ye-Kui Wang et al. and titled "Reference Picture Management in Video Coding," which is incorporated by reference into this patent application.

This disclosure generally relates to techniques for reference picture management in video coding. More specifically, this disclosure describes techniques for constructing reference picture lists and for reference picture marking.

The amount of video data needed to depict even a relatively short video can be substantial, which may cause difficulties when the data is streamed or otherwise communicated across a communication network with limited bandwidth capacity. Video data is therefore generally compressed before being communicated across modern telecommunication networks. The size of a video can also be an issue when the video is stored on a storage device, because memory resources may be limited. Video compression devices often use software and/or hardware at the source to code the video data prior to transmission or storage, thereby reducing the amount of data needed to represent digital video images. The compressed data is then received at the destination by a video decompression device that decodes the video data. With limited network resources and ever-increasing demands for higher video quality, improved compression and decompression techniques that raise the compression ratio with little to no sacrifice in image quality are desirable.

A first aspect relates to a method of decoding a coded video bitstream. The method includes: parsing a parameter set represented in the coded video bitstream, wherein the parameter set includes a set of syntax elements that includes a set of reference picture list structures; parsing a slice header of a current slice represented in the coded video bitstream, wherein the slice header includes an index of a reference picture list structure within the set of reference picture list structures in the parameter set; deriving a reference picture list of the current slice based on the set of syntax elements in the parameter set and the index of the reference picture list structure; and obtaining one or more reconstructed blocks of the current slice based on the reference picture list.

The method provides techniques that simplify the signaling of reference picture lists and make it more efficient. The overall coding process is thereby improved.
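
The parse-then-derive flow of the first aspect can be illustrated with a minimal sketch. The class names, the field names `ref_pic_list_structs` and `ref_pic_list_idx`, the use of a picture order count (POC) integer as the content of each entry, and the dictionary standing in for the decoded picture buffer are all assumptions made for illustration; they are not the syntax defined by this disclosure or by any standard.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RefPicListStruct:
    """One candidate list structure carried in the parameter set (hypothetical layout)."""
    entries: List[int]  # assumption: each entry identifies a reference picture by POC


@dataclass
class ParameterSet:
    """Stand-in for a parsed parameter set, for example an SPS."""
    ref_pic_list_structs: List[RefPicListStruct]


@dataclass
class SliceHeader:
    """Stand-in for a parsed slice header."""
    ref_pic_list_idx: int  # index into ParameterSet.ref_pic_list_structs


def derive_ref_pic_list(ps: ParameterSet, sh: SliceHeader,
                        dpb: Dict[int, str]) -> List[str]:
    """Resolve the indexed list structure into pictures, preserving entry order."""
    if not 0 <= sh.ref_pic_list_idx < len(ps.ref_pic_list_structs):
        raise ValueError("slice header index is outside the parameter set")
    struct = ps.ref_pic_list_structs[sh.ref_pic_list_idx]
    # The order of entries in the structure becomes the order of the list.
    return [dpb[poc] for poc in struct.entries]


# Usage: two candidate structures in the parameter set; the slice selects the second.
ps = ParameterSet([RefPicListStruct([8, 4]), RefPicListStruct([4, 0, 8])])
dpb = {0: "pic0", 4: "pic4", 8: "pic8"}  # hypothetical decoded picture buffer
ref_list = derive_ref_pic_list(ps, SliceHeader(ref_pic_list_idx=1), dpb)
```

In this sketch the slice header carries only a small index, while the list structures themselves are sent once in the parameter set, which is the signaling saving the paragraph above describes.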

In a first implementation form of the method according to the first aspect, the order of entries in the reference picture list structure is the same as the order of the corresponding reference pictures in the reference picture list.

In a second implementation form of the method according to the first aspect as such or any preceding implementation form of the first aspect, the order of the entries runs from 0 to an indicated value.

In a third implementation form of the method according to the first aspect as such or any preceding implementation form of the first aspect, the indicated value ranges from 0 to the value indicated by sps_max_dec_pic_buffering_minus1.

In a fourth implementation form of the method according to the first aspect as such or any preceding implementation form of the first aspect, the reference picture list is designated RefPictList[0].

In a fifth implementation form of the method according to the first aspect as such or any preceding implementation form of the first aspect, the reference picture list is designated RefPictList[1].

In a sixth implementation form of the method according to the first aspect as such or any preceding implementation form of the first aspect, the one or more reconstructed blocks are used to generate an image displayed on a display of an electronic device.

In a seventh implementation form of the method according to the first aspect as such or any preceding implementation form of the first aspect, the reference picture list comprises a list of reference pictures used for inter prediction.

In an eighth implementation form of the method according to the first aspect as such or any preceding implementation form of the first aspect, the inter prediction is for a P slice or a B slice.

In a ninth implementation form of the method according to the first aspect as such or any preceding implementation form of the first aspect, the parameter set comprises a sequence parameter set (SPS).

In a tenth implementation form of the method according to the first aspect as such or any preceding implementation form of the first aspect, the set of syntax elements from the parameter set is carried in a raw byte sequence payload (RBSP) of a network abstraction layer (NAL) unit.

In an eleventh implementation form of the method according to the first aspect as such or any preceding implementation form of the first aspect, the reference picture list is designated RefPictList[0] or RefPictList[1], and the order of entries in the reference picture list structure is the same as the order of the corresponding reference pictures in the reference picture list.

In a twelfth implementation form of the method according to the first aspect as such or any preceding implementation form of the first aspect, the method further includes: parsing a parameter set represented in the coded video bitstream, wherein the parameter set includes a set of syntax elements that includes a set of reference picture list structures; obtaining a reference picture list structure represented in the coded video bitstream; deriving a first reference picture list of the current slice based on the reference picture list structure, wherein the first reference picture list includes one or more active entries and one or more inactive entries, the one or more inactive entries refer to reference pictures that are not used for inter prediction of the current slice but are referred to by active entries in a second reference picture list, and the second reference picture list is a reference picture list of a slice that follows the current slice in decoding order or of a picture that follows the current picture in decoding order; and obtaining one or more reconstructed blocks of the current slice based on the one or more active entries of the first reference picture list.
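
The distinction between active and inactive entries in the twelfth implementation form can be sketched as follows. This is only an illustration under stated assumptions: entries are represented as POC integers, the active entries are assumed to occupy the front of the list, and `num_active` is a hypothetical count that a real bitstream would signal in some form not described in this passage.

```python
from typing import List, Tuple


def split_entries(ref_pic_list: List[int],
                  num_active: int) -> Tuple[List[int], List[int]]:
    """Split a reference picture list into (active, inactive) entries.

    Assumption: the first num_active entries are the active ones.
    """
    if not 0 <= num_active <= len(ref_pic_list):
        raise ValueError("num_active is outside the list")
    return ref_pic_list[:num_active], ref_pic_list[num_active:]


# Current slice: pictures 8 and 4 are used for inter prediction, while
# picture 0 is listed only so that it remains available for later slices.
active, inactive = split_entries([8, 4, 0], num_active=2)

# A slice that follows in decoding order uses picture 0 as an active entry.
next_active, _ = split_entries([0, 8], num_active=1)

# The inactive entry of the current slice is referred to by an active
# entry of the following slice, as the implementation form describes.
kept_for_later = [poc for poc in inactive if poc in next_active]
```

Only `active` would be consulted when reconstructing blocks of the current slice; `inactive` records which additional pictures a decoder should keep available.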

A second aspect relates to a decoding device. The decoding device includes: a receiver configured to receive a coded video bitstream; a memory coupled to the receiver and storing instructions; and a processor coupled to the memory, wherein the processor is configured to execute the instructions stored in the memory to: parse a parameter set represented in the coded video bitstream, wherein the parameter set includes a set of syntax elements that includes a set of reference picture list structures; parse a slice header of a current slice represented in the coded video bitstream, wherein the slice header includes an index of a reference picture list structure within the set of reference picture list structures in the parameter set; derive a reference picture list of the current slice based on the set of syntax elements in the parameter set and the index of the reference picture list structure; and obtain one or more reconstructed blocks of the current slice based on the reference picture list.

The decoding device provides techniques that simplify the signaling of reference picture lists and make it more efficient. The overall coding process is thereby improved.

In a first implementation form of the decoding device according to the second aspect, the decoding device further includes a display configured to display a current picture generated based on the one or more reconstructed blocks.

A third aspect relates to a coding apparatus. The coding apparatus includes: a receiver configured to receive a bitstream to decode; a transmitter coupled to the receiver and configured to transmit a decoded image to a display; a memory coupled to at least one of the receiver or the transmitter and configured to store instructions; and a processor coupled to the memory and configured to execute the instructions stored in the memory to perform the method of any of the preceding aspects or implementation forms.

A fourth aspect relates to a system that includes an encoder and a decoder in communication with the encoder. The encoder or the decoder includes the decoding device or the coding apparatus of any of the preceding aspects or implementation forms.

The system provides techniques that simplify the signaling of reference picture lists and make it more efficient. The overall coding process is thereby improved.

A fifth aspect relates to a means for coding. The means for coding includes: receiving means configured to receive a bitstream to decode; transmission means coupled to the receiving means and configured to transmit a decoded image to a display means; storage means coupled to at least one of the receiving means or the transmission means and configured to store instructions; and processing means coupled to the storage means and configured to execute the instructions stored in the storage means to perform the method of any of the preceding aspects or implementation forms.

The means for coding provides techniques that simplify the signaling of reference picture lists and make it more efficient. The overall coding process is thereby improved.

For a more complete understanding of this disclosure, reference is made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.
FIG. 1 is a block diagram illustrating an example coding system that can utilize bilateral prediction techniques.
FIG. 2 is a block diagram illustrating an example video encoder that can implement bilateral prediction techniques.
FIG. 3 is a block diagram illustrating an example of a video decoder that can implement bilateral prediction techniques.
FIG. 4 is a schematic diagram illustrating a reference picture set (RPS) having a picture with entries in all subsets of the RPS.
FIG. 5 is an embodiment of a method of decoding a coded video bitstream.
FIG. 6 is a schematic diagram of a video coding device.
FIG. 7 is a schematic diagram of an embodiment of a means for coding.

FIG. 1 is a block diagram illustrating an example coding system 10 that may utilize video coding techniques as described herein. As shown in FIG. 1, the coding system 10 includes a source device 12 that provides encoded video data to be decoded at a later time by a destination device 14. In particular, the source device 12 may provide the video data to the destination device 14 via a computer-readable medium 16. The source device 12 and the destination device 14 may comprise any of a wide range of devices, including desktop computers, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, so-called "smart" pads, televisions, cameras, display devices, digital media players, video gaming consoles, video streaming devices, and the like. In some cases, the source device 12 and the destination device 14 may be equipped for wireless communication.

The destination device 14 may receive the encoded video data to be decoded via the computer-readable medium 16. The computer-readable medium 16 may comprise any type of medium or device capable of moving the encoded video data from the source device 12 to the destination device 14. In one example, the computer-readable medium 16 may comprise a communication medium that enables the source device 12 to transmit encoded video data directly to the destination device 14 in real time. The encoded video data may be modulated according to a communication standard, such as a wireless communication protocol, and transmitted to the destination device 14. The communication medium may comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may also form part of a packet-based network, such as a local area network, a wide area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations, or any other equipment that may be useful for facilitating communication from the source device 12 to the destination device 14.

In some examples, encoded data may be output from the output interface 22 to a storage device. Similarly, encoded data may be accessed from the storage device by an input interface. The storage device may include any of a variety of distributed or locally accessed data storage media, such as a hard drive, Blu-ray discs, digital video discs (DVDs), Compact Disc Read-Only Memories (CD-ROMs), flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data. In a further example, the storage device may correspond to a file server or another intermediate storage device that can store the encoded video generated by the source device 12. The destination device 14 may access stored video data from the storage device via streaming or download. The file server may be any type of server capable of storing encoded video data and transmitting that encoded video data to the destination device 14. Example file servers include a web server (e.g., for a website), a file transfer protocol (FTP) server, network attached storage (NAS) devices, or a local disk drive. The destination device 14 may access the encoded video data through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., digital subscriber line (DSL), cable modem, etc.), or a combination of both that is suitable for accessing encoded video data stored on a file server. The transmission of encoded video data from the storage device may be a streaming transmission, a download transmission, or a combination thereof.

The techniques of this disclosure are not necessarily limited to wireless applications or settings. The techniques may be applied to video coding in support of any of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, Internet streaming video transmissions such as dynamic adaptive streaming over HTTP (DASH), digital video that is encoded onto a data storage medium, decoding of digital video stored on a data storage medium, or other applications. In some examples, the coding system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.

In the example of FIG. 1, the source device 12 includes a video source 18, a video encoder 20, and an output interface 22. The destination device 14 includes an input interface 28, a video decoder 30, and a display device 32. In accordance with this disclosure, the video encoder 20 of the source device 12 and/or the video decoder 30 of the destination device 14 may be configured to apply the techniques for video coding. In other examples, a source device and a destination device may include other components or arrangements. For example, the source device 12 may receive video data from an external video source, such as an external camera. Likewise, the destination device 14 may interface with an external display device rather than including an integrated display device.

The illustrated coding system 10 of FIG. 1 is merely one example. Techniques for video coding may be performed by any digital video encoding and/or decoding device. Although the techniques of this disclosure are generally performed by a video coding device, the techniques may also be performed by a video encoder/decoder, typically referred to as a "CODEC." Moreover, the techniques of this disclosure may also be performed by a video preprocessor. The video encoder and/or the decoder may be a graphics processing unit (GPU) or a similar device.

The source device 12 and the destination device 14 are merely examples of such coding devices, in which the source device 12 generates coded video data for transmission to the destination device 14. In some examples, the source device 12 and the destination device 14 may operate in a substantially symmetrical manner, such that each of them includes video encoding and decoding components. Hence, the coding system 10 may support one-way or two-way video transmission between the video devices 12, 14, e.g., for video streaming, video playback, video broadcasting, or video telephony.

The video source 18 of the source device 12 may include a video capture device, such as a video camera, a video archive containing previously captured video, and/or a video feed interface to receive video from a video content provider. As a further alternative, the video source 18 may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer-generated video.

In some cases, when the video source 18 is a video camera, the source device 12 and the destination device 14 may form so-called camera phones or video phones. As mentioned above, however, the techniques described in this disclosure are applicable to video coding in general and may be applied to wireless and/or wired applications. In each case, the captured, pre-captured, or computer-generated video may be encoded by the video encoder 20. The encoded video information may then be output by the output interface 22 onto the computer-readable medium 16.

The computer-readable medium 16 may include transient media, such as a wireless broadcast or wired network transmission, or storage media (that is, non-transitory storage media), such as a hard disk, flash drive, compact disc, digital video disc, Blu-ray disc, or other computer-readable media. In some examples, a network server (not shown) may receive encoded video data from the source device 12 and provide the encoded video data to the destination device 14, e.g., via network transmission. Similarly, a computing device of a medium production facility, such as a disc stamping facility, may receive encoded video data from the source device 12 and produce a disc containing the encoded video data. Therefore, the computer-readable medium 16 may be understood to include one or more computer-readable media of various forms, in various examples.

The input interface 28 of the destination device 14 receives information from the computer-readable medium 16. The information of the computer-readable medium 16 may include syntax information defined by the video encoder 20, which is also used by the video decoder 30, and which includes syntax elements that describe characteristics and/or processing of blocks and other coded units, e.g., groups of pictures (GOPs). The display device 32 displays the decoded video data to a user and may comprise any of a variety of display devices, such as a cathode ray tube (CRT), a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or another type of display device.

The video encoder 20 and the video decoder 30 may operate according to a video coding standard, such as the High Efficiency Video Coding (HEVC) standard presently under development, and may conform to the HEVC Test Model (HM). Alternatively, the video encoder 20 and the video decoder 30 may operate according to other proprietary or industry standards, such as the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) H.264 standard, alternatively referred to as Moving Picture Experts Group (MPEG)-4, Part 10, Advanced Video Coding (AVC), H.265/HEVC, or extensions of such standards. The techniques of this disclosure, however, are not limited to any particular coding standard. Other examples of video coding standards include MPEG-2 and ITU-T H.263. Although not shown in FIG. 1, in some aspects the video encoder 20 and the video decoder 30 may each be integrated with an audio encoder and decoder, and may include appropriate multiplexer-demultiplexer (MUX-DEMUX) units, or other hardware and software, to handle encoding of both audio and video in a common data stream or in separate data streams. If applicable, the MUX-DEMUX units may conform to the ITU H.223 multiplexer protocol, or other protocols such as the user datagram protocol (UDP).

Video encoder 20 and video decoder 30 each may be implemented as any of a variety of suitable encoder circuitry, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware, or any combination thereof. When the techniques are implemented partially in software, a device may store instructions for the software in a suitable, non-transitory computer-readable medium and execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device. A device including video encoder 20 and/or video decoder 30 may comprise an integrated circuit, a microprocessor, and/or a wireless communication device, such as a cellular telephone.

FIG. 2 is a block diagram illustrating an example of video encoder 20 that may implement video coding techniques. Video encoder 20 may perform intra-coding and inter-coding of video blocks within video slices. Intra-coding relies on spatial prediction to reduce or remove spatial redundancy in video within a given video frame or picture. Inter-coding relies on temporal prediction to reduce or remove temporal redundancy in video within adjacent frames or pictures of a video sequence. Intra mode (I mode) may refer to any of several spatial-based coding modes. Inter modes, such as uni-directional prediction (also known as uni prediction) (P mode) or bi-prediction (also known as bi prediction) (B mode), may refer to any of several temporal-based coding modes.

As shown in FIG. 2, video encoder 20 receives a current video block within a video frame to be encoded. In the example of FIG. 2, video encoder 20 includes mode select unit 40, reference frame memory 64, summer 50, transform processing unit 52, quantization unit 54, and entropy coding unit 56. Mode select unit 40, in turn, includes motion compensation unit 44, motion estimation unit 42, intra-prediction (also known as intra prediction) unit 46, and partition unit 48. For video block reconstruction, video encoder 20 also includes inverse quantization unit 58, inverse transform unit 60, and summer 62. A deblocking filter (not shown in FIG. 2) may also be included to filter block boundaries and remove blockiness artifacts from the reconstructed video. If desired, the deblocking filter would typically filter the output of summer 62. Additional filters (in loop or post loop) may also be used in addition to the deblocking filter. Such filters are not shown for brevity but, if desired, may filter the output of summer 50 (as an in-loop filter).

During the encoding process, video encoder 20 receives a video frame or slice to be coded. The frame or slice may be divided into multiple video blocks. Motion estimation unit 42 and motion compensation unit 44 perform inter-predictive coding of the received video block relative to one or more blocks in one or more reference frames to provide temporal prediction. Intra-prediction unit 46 may alternatively perform intra-predictive coding of the received video block relative to one or more neighboring blocks in the same frame or slice as the block to be coded to provide spatial prediction. Video encoder 20 may perform multiple coding passes, e.g., to select an appropriate coding mode for each block of video data.

Moreover, partition unit 48 may partition blocks of video data into sub-blocks based on an evaluation of previous partitioning schemes in previous coding passes. For example, partition unit 48 may initially partition a frame or slice into largest coding units (LCUs), and partition each of the LCUs into sub-coding units (sub-CUs) based on rate-distortion analysis (e.g., rate-distortion optimization). Mode select unit 40 may further produce a quad-tree data structure indicative of the partitioning of an LCU into sub-CUs. Leaf-node CUs of the quad-tree may include one or more prediction units (PUs) and one or more transform units (TUs).
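As a non-limiting illustration, the quad-tree partitioning of an LCU into sub-CUs described above can be sketched in Python as follows. The function names `split_quadtree` and `demo_decision`, the minimum CU size, and the split rule are hypothetical and chosen only for this sketch; an actual encoder would typically base the split decision on rate-distortion analysis.

```python
def split_quadtree(x, y, size, min_size, should_split):
    """Return the leaf CUs, as (x, y, size) tuples, of a quad-tree rooted at one LCU."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        # Visit the four quadrants in raster order: top-left, top-right,
        # bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    split_quadtree(x + dx, y + dy, half, min_size, should_split)
                )
        return leaves
    return [(x, y, size)]


def demo_decision(x, y, size):
    # Hypothetical stand-in for a rate-distortion decision: always split the
    # 64x64 LCU, and additionally split its top-left 32x32 quadrant.
    return size > 32 or (x == 0 and y == 0 and size == 32)


leaves = split_quadtree(0, 0, 64, 8, demo_decision)
```

In this sketch the leaves tile the LCU without overlap, so the sum of their areas equals the area of the LCU.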

This disclosure uses the term "block" to refer to any of a CU, PU, or TU in the context of HEVC, or to similar data structures in the context of other standards (e.g., macroblocks and sub-blocks thereof in H.264/AVC). A CU includes a coding node, and PUs and TUs associated with the coding node. The size of the CU corresponds to the size of the coding node, and the CU is square in shape. The size of the CU may range from 8×8 pixels up to the size of the tree block, with a maximum of 64×64 pixels or greater. Each CU may contain one or more PUs and one or more TUs. Syntax data associated with a CU may describe, for example, partitioning of the CU into one or more PUs.

Partitioning modes may differ depending on whether the CU is skip or direct mode encoded, intra-prediction mode encoded, or inter-prediction (also known as inter prediction) mode encoded. PUs may be partitioned to be non-square in shape. Syntax data associated with a CU may also describe, for example, partitioning of the CU into one or more TUs according to a quad-tree. A TU can be square or non-square (e.g., rectangular) in shape.

Mode select unit 40 may select one of the intra or inter coding modes, e.g., based on error results, and provides the resulting intra- or inter-coded block to summer 50 to generate residual block data and to summer 62 to reconstruct the encoded block for use as a reference frame. Mode select unit 40 also provides syntax elements, such as motion vectors, intra-mode indicators, partition information, and other such syntax information, to entropy coding unit 56.

Motion estimation unit 42 and motion compensation unit 44 may be highly integrated, but are illustrated separately for conceptual purposes. Motion estimation, performed by motion estimation unit 42, is the process of generating motion vectors, which estimate motion for video blocks. A motion vector, for example, may indicate the displacement of a PU of a video block within a current video frame or picture relative to a predictive block within a reference frame (or other coded unit), relative to the current block being coded within the current frame (or other coded unit). A predictive block is a block that is found to closely match the block to be coded in terms of pixel difference, which may be determined by sum of absolute difference (SAD), sum of square difference (SSD), or other difference metrics. In some examples, video encoder 20 may calculate values for sub-integer pixel positions of reference pictures stored in reference frame memory 64. For example, video encoder 20 may interpolate values of one-quarter pixel positions, one-eighth pixel positions, or other fractional pixel positions of the reference picture. Therefore, motion estimation unit 42 may perform a motion search relative to the full pixel positions and fractional pixel positions and output a motion vector with fractional pixel precision.
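As a non-limiting illustration, a motion search that uses SAD as the difference metric can be sketched in Python as follows. The sketch is restricted to full (integer) pixel positions and an exhaustive search window; the names `sad`, `get_block`, and `full_search` and the toy frames are hypothetical and are not taken from this disclosure.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(frame, x, y, size):
    """Extract the size x size block whose top-left sample is at (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def full_search(cur_frame, ref_frame, x, y, size, search_range):
    """Return ((dx, dy), cost) of the best integer-pel match in the reference frame."""
    cur = get_block(cur_frame, x, y, size)
    height, width = len(ref_frame), len(ref_frame[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, get_block(ref_frame, rx, ry, size))
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1], best[0]


# Toy data: the current frame is the reference frame displaced by (2, 1).
ref_frame = [[y * 8 + x for x in range(8)] for y in range(8)]
cur_frame = [
    [ref_frame[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0 for x in range(8)]
    for y in range(8)
]
mv, cost = full_search(cur_frame, ref_frame, x=2, y=2, size=2, search_range=2)
```

A search with fractional pixel precision would additionally evaluate interpolated sub-integer positions of the reference picture, which this sketch omits.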

Motion estimation unit 42 calculates a motion vector for a PU of a video block in an inter-coded slice by comparing the position of the PU to the position of a predictive block of a reference picture. The reference picture may be selected from a first reference picture list (List 0) or a second reference picture list (List 1), each of which identifies one or more reference pictures stored in reference frame memory 64. Motion estimation unit 42 sends the calculated motion vector to entropy coding unit 56 and motion compensation unit 44.

Motion compensation, performed by motion compensation unit 44, may involve fetching or generating the predictive block based on the motion vector determined by motion estimation unit 42. Again, motion estimation unit 42 and motion compensation unit 44 may be functionally integrated, in some examples. Upon receiving the motion vector for the PU of the current video block, motion compensation unit 44 may locate the predictive block to which the motion vector points in one of the reference picture lists. Summer 50 forms a residual video block by subtracting pixel values of the predictive block from the pixel values of the current video block being coded, forming pixel difference values, as discussed below. In general, motion estimation unit 42 performs motion estimation relative to luma components, and motion compensation unit 44 uses motion vectors calculated based on the luma components for both chroma components and luma components. Mode select unit 40 may also generate syntax elements associated with the video blocks and the video slice for use by video decoder 30 in decoding the video blocks of the video slice.

Intra-prediction unit 46 may intra-predict a current block, as an alternative to the inter-prediction performed by motion estimation unit 42 and motion compensation unit 44, as described above. In particular, intra-prediction unit 46 may determine an intra-prediction mode to use to encode a current block. In some examples, intra-prediction unit 46 may encode a current block using various intra-prediction modes, e.g., during separate encoding passes, and intra-prediction unit 46 (or mode select unit 40, in some examples) may select an appropriate intra-prediction mode to use from the tested modes.

For example, intra-prediction unit 46 may calculate rate-distortion values using a rate-distortion analysis for the various tested intra-prediction modes, and select the intra-prediction mode having the best rate-distortion characteristics among the tested modes. Rate-distortion analysis generally determines an amount of distortion (or error) between an encoded block and the original, unencoded block that was encoded to produce the encoded block, as well as the bitrate (that is, the number of bits) used to produce the encoded block. Intra-prediction unit 46 may calculate ratios from the distortions and rates for the various encoded blocks to determine which intra-prediction mode exhibits the best rate-distortion value for the block.
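As a non-limiting illustration, selecting a mode from measured distortions and rates can be sketched in Python as follows. The sketch assumes the commonly used Lagrangian cost J = D + λ·R as the rate-distortion value; this particular cost function, the name `select_mode`, the mode labels, and the numbers are assumptions made for illustration and are not specified by this disclosure.

```python
def select_mode(candidates, lam):
    """Pick the mode with the lowest Lagrangian cost J = D + lam * R.

    candidates maps a mode name to a (distortion, bits) pair.
    """
    return min(
        candidates,
        key=lambda mode: candidates[mode][0] + lam * candidates[mode][1],
    )


# Hypothetical measurements for three tested intra-prediction modes.
candidates = {
    "DC": (120.0, 10),
    "planar": (100.0, 18),
    "angular_26": (60.0, 40),
}

best_low_lambda = select_mode(candidates, 0.5)   # rate is cheap
best_high_lambda = select_mode(candidates, 4.0)  # rate is expensive
```

With a small λ the low-distortion mode wins despite its higher bit cost, and with a large λ the cheapest mode in bits wins, which is the trade-off the rate-distortion analysis is meant to capture.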

In addition, intra-prediction unit 46 may be configured to code depth blocks of a depth map using a depth modeling mode (DMM). Mode select unit 40 may determine whether an available DMM mode produces better coding results than an intra-prediction mode and the other DMM modes, e.g., using rate-distortion optimization (RDO). Data for a texture image corresponding to a depth map may be stored in reference frame memory 64. Motion estimation unit 42 and motion compensation unit 44 may also be configured to inter-predict depth blocks of a depth map.

After selecting an intra-prediction mode for a block (e.g., a conventional intra-prediction mode or one of the DMM modes), intra-prediction unit 46 may provide information indicative of the selected intra-prediction mode for the block to entropy coding unit 56. Entropy coding unit 56 may encode the information indicating the selected intra-prediction mode. Video encoder 20 may include, in the transmitted bitstream configuration data, which may include a plurality of intra-prediction mode index tables and a plurality of modified intra-prediction mode index tables (also referred to as codeword mapping tables), definitions of encoding contexts for various blocks, and indications of a most probable intra-prediction mode, an intra-prediction mode index table, and a modified intra-prediction mode index table to use for each of the contexts.

Video encoder 20 forms a residual video block by subtracting the prediction data from mode select unit 40 from the original video block being coded. Summer 50 represents the component or components that perform this subtraction operation.

Transform processing unit 52 applies a transform, such as a discrete cosine transform (DCT) or a conceptually similar transform, to the residual block, producing a video block comprising residual transform coefficient values. Transform processing unit 52 may perform other transforms which are conceptually similar to DCT. Wavelet transforms, integer transforms, sub-band transforms, or other types of transforms could also be used.

Transform processing unit 52 applies the transform to the residual block, producing a block of residual transform coefficients. The transform may convert the residual information from a pixel value domain to a transform domain, such as a frequency domain. Transform processing unit 52 may send the resulting transform coefficients to quantization unit 54. Quantization unit 54 quantizes the transform coefficients to further reduce the bit rate. The quantization process may reduce the bit depth associated with some or all of the coefficients. The degree of quantization may be modified by adjusting a quantization parameter. In some examples, quantization unit 54 may then perform a scan of the matrix including the quantized transform coefficients. Alternatively, entropy coding unit 56 may perform the scan.
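As a non-limiting illustration, the effect of the quantization parameter on the degree of quantization can be sketched in Python as follows. The sketch assumes uniform scalar quantization with a step size that doubles for every increase of 6 in the quantization parameter, an approximation of the behavior in AVC and HEVC; the step-size formula and the names `qstep`, `quantize`, and `dequantize` are assumptions for illustration and are not defined by this disclosure.

```python
def qstep(qp):
    """Approximate quantization step size; doubles every 6 QP (assumed model)."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeffs, qp):
    """Map transform coefficients to integer levels, rounding half away from zero."""
    step = qstep(qp)
    return [
        [int(abs(c) / step + 0.5) * (1 if c >= 0 else -1) for c in row]
        for row in coeffs
    ]


def dequantize(levels, qp):
    """Scale integer levels back to approximate coefficient values."""
    step = qstep(qp)
    return [[level * step for level in row] for row in levels]


# Hypothetical 2x2 block of transform coefficients.
coeffs = [[100, -37], [12, 3]]
levels_qp22 = quantize(coeffs, 22)  # step size 8
levels_qp34 = quantize(coeffs, 34)  # step size 32
```

A larger quantization parameter yields smaller levels and more zeros, which lowers the bit rate at the cost of a larger reconstruction error after `dequantize`.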

Following quantization, entropy coding unit 56 entropy codes the quantized transform coefficients. For example, entropy coding unit 56 may perform context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy coding technique. In the case of context-based entropy coding, context may be based on neighboring blocks. Following the entropy coding by entropy coding unit 56, the encoded bitstream may be transmitted to another device (e.g., video decoder 30) or archived for later transmission or retrieval.

Inverse quantization unit 58 and inverse transform unit 60 apply inverse quantization and inverse transformation, respectively, to reconstruct the residual block in the pixel domain, e.g., for later use as a reference block. Motion compensation unit 44 may calculate a reference block by adding the residual block to a predictive block of one of the frames of reference frame memory 64. Motion compensation unit 44 may also apply one or more interpolation filters to the reconstructed residual block to calculate sub-integer pixel values for use in motion estimation. Summer 62 adds the reconstructed residual block to the motion compensated prediction block produced by motion compensation unit 44 to produce a reconstructed video block for storage in reference frame memory 64. The reconstructed video block may be used by motion estimation unit 42 and motion compensation unit 44 as a reference block to inter-code a block in a subsequent video frame.
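As a non-limiting illustration, the subtraction performed by summer 50 and the addition performed by summer 62 can be sketched in Python as follows. The function names, the clipping to an 8-bit sample range, and the sample values are assumptions for illustration; the transform and quantization stages between the two summers are omitted.

```python
def form_residual(original, prediction):
    """Summer 50 in this sketch: residual = original - prediction, per sample."""
    return [
        [o - p for o, p in zip(row_o, row_p)]
        for row_o, row_p in zip(original, prediction)
    ]


def reconstruct(prediction, residual, bit_depth=8):
    """Summer 62 in this sketch: prediction + residual, clipped to the sample range."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(row_p, row_r)]
        for row_p, row_r in zip(prediction, residual)
    ]


original = [[52, 60], [255, 0]]
prediction = [[50, 64], [250, 3]]

residual = form_residual(original, prediction)
lossless = reconstruct(prediction, residual)

# A residual that has been altered by quantization (hypothetical values)
# no longer reproduces the original block exactly.
lossy = reconstruct(prediction, [[0, -8], [8, -8]])
```

Because the encoder reconstructs blocks from the same quantized residual that the decoder will see, both sides keep identical reference pictures.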

FIG. 3 is a block diagram illustrating an example of video decoder 30 that may implement video coding techniques. In the example of FIG. 3, video decoder 30 includes an entropy decoding unit 70, motion compensation unit 72, intra-prediction unit 74, inverse quantization unit 76, inverse transform unit 78, reference frame memory 82, and summer 80. Video decoder 30 may, in some examples, perform a decoding pass generally reciprocal to the encoding pass described with respect to video encoder 20 (FIG. 2). Motion compensation unit 72 may generate prediction data based on motion vectors received from entropy decoding unit 70, while intra-prediction unit 74 may generate prediction data based on intra-prediction mode indicators received from entropy decoding unit 70.

During the decoding process, video decoder 30 receives, from video encoder 20, an encoded video bitstream that represents video blocks of an encoded video slice and associated syntax elements. Entropy decoding unit 70 of video decoder 30 entropy decodes the bitstream to generate quantized coefficients, motion vectors or intra-prediction mode indicators, and other syntax elements. Entropy decoding unit 70 forwards the motion vectors and other syntax elements to motion compensation unit 72. Video decoder 30 may receive the syntax elements at the video slice level and/or the video block level.

When the video slice is coded as an intra-coded (I) slice, intra-prediction unit 74 may generate prediction data for a video block of the current video slice based on a signaled intra-prediction mode and data from previously decoded blocks of the current frame or picture. When the video frame is coded as an inter-coded (e.g., B, P, or GPB) slice, motion compensation unit 72 produces predictive blocks for a video block of the current video slice based on the motion vectors and other syntax elements received from entropy decoding unit 70. The predictive blocks may be produced from one of the reference pictures within one of the reference picture lists. Video decoder 30 may construct the reference frame lists, List 0 and List 1, using default construction techniques based on reference pictures stored in reference frame memory 82.

Motion compensation unit 72 determines prediction information for a video block of the current video slice by parsing the motion vectors and other syntax elements, and uses the prediction information to produce the predictive blocks for the current video block being decoded. For example, motion compensation unit 72 uses some of the received syntax elements to determine a prediction mode (e.g., intra- or inter-prediction) used to code the video blocks of the video slice, an inter-prediction slice type (e.g., B slice, P slice, or GPB slice), construction information for one or more of the reference picture lists for the slice, motion vectors for each inter-encoded video block of the slice, inter-prediction status for each inter-coded video block of the slice, and other information to decode the video blocks in the current video slice.

Motion compensation unit 72 may also perform interpolation based on interpolation filters. Motion compensation unit 72 may use interpolation filters as used by video encoder 20 during encoding of the video blocks to calculate interpolated values for sub-integer pixels of reference blocks. In this case, motion compensation unit 72 may determine the interpolation filters used by video encoder 20 from the received syntax elements and use those interpolation filters to produce predictive blocks.

Data for a texture image corresponding to a depth map may be stored in reference frame memory 82. Motion compensation unit 72 may also be configured to inter-predict depth blocks of a depth map.

Image and video compression has experienced rapid growth, leading to various coding standards. Such video coding standards include ITU-T H.261, International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Moving Picture Experts Group (MPEG)-1 Part 2, ITU-T H.262 or ISO/IEC MPEG-2 Part 2, ITU-T H.263, ISO/IEC MPEG-4 Part 2, Advanced Video Coding (AVC), also known as ITU-T H.264 or ISO/IEC MPEG-4 Part 10, and High Efficiency Video Coding (HEVC), also known as ITU-T H.265 or MPEG-H Part 2. AVC includes extensions such as Scalable Video Coding (SVC), Multiview Video Coding (MVC), Multiview Video Coding plus Depth (MVC+D), and 3D AVC (3D-AVC). HEVC includes extensions such as Scalable HEVC (SHVC), Multiview HEVC (MV-HEVC), and 3D HEVC (3D-HEVC).

Versatile Video Coding (VVC) is a new video coding standard under development by the joint video experts team (JVET) of ITU-T and ISO/IEC. At the time of writing, the latest Working Draft (WD) of VVC is included in JVET-K1001-v1. The JVET document JVET-K0325-v3 includes an update to the high-level syntax of VVC.

In general, the present disclosure describes techniques based on the under-development VVC standard. However, the techniques also apply to other video/media codec specifications.

Video compression techniques perform spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (e.g., a video picture or a portion of a video picture) may be partitioned into video blocks, which may also be referred to as tree blocks, coding tree blocks (CTBs), coding tree units (CTUs), coding units (CUs), and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture, or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.

Spatial or temporal prediction results in a predictive block for a block to be coded. Residual data represents pixel differences between the original block to be coded and the predictive block. An inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, and the residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain, resulting in residual transform coefficients, which may then be quantized. The quantized transform coefficients, initially arranged in a two-dimensional array, may be scanned to produce a one-dimensional vector of transform coefficients, and entropy coding may be applied to achieve even more compression.

In video codec specifications, pictures are identified for multiple purposes, including use as a reference picture in inter prediction, output of pictures from the decoded picture buffer (DPB), scaling of motion vectors, weighted prediction, and so on. In AVC and HEVC, pictures can be identified by picture order count (POC). In AVC and HEVC, pictures in the DPB can be marked as "used for short-term reference", "used for long-term reference", or "not used for reference". Once a picture has been marked "not used for reference", it can no longer be used for prediction. When the picture is also no longer needed for output, it can be removed from the DPB.

In AVC, there are two types of reference pictures: short-term and long-term. A reference picture may be marked as "not used for reference" when it is no longer needed for prediction reference. The transitions among these three statuses (short-term, long-term, and not used for reference) are controlled by the decoded reference picture marking process. There are two alternative decoded reference picture marking mechanisms: the implicit sliding window process and the explicit memory management control operation (MMCO) process. The sliding window process marks a short-term reference picture as "not used for reference" when the number of reference frames is equal to a given maximum number (max_num_ref_frames in the sequence parameter set (SPS)). The short-term reference pictures are stored in a first-in, first-out manner so that the most recently decoded short-term pictures are kept in the DPB.
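The sliding window behavior described above can be sketched roughly as follows. This is a minimal illustration, not the normative AVC process: the function name, the use of plain integers as picture identifiers, and the choice to count long-term pictures toward the limit are assumptions made for the example.

```python
from collections import deque


def sliding_window_mark(short_term, num_long_term, max_num_ref_frames):
    """Evict the oldest short-term pictures until one more reference fits.

    short_term is a deque of (hypothetical) picture ids in decoding order,
    oldest first. Returns the ids that become "not used for reference".
    """
    evicted = []
    # Assumption: long-term pictures count toward max_num_ref_frames.
    while short_term and len(short_term) + num_long_term >= max_num_ref_frames:
        evicted.append(short_term.popleft())  # first in, first out
    return evicted


dpb_short_term = deque([10, 11, 12, 13])
evicted = sliding_window_mark(dpb_short_term, num_long_term=0, max_num_ref_frames=4)
dpb_short_term.append(14)  # the newly decoded picture becomes a short-term reference
print(evicted, list(dpb_short_term))
```

Because the window only ever drops the oldest short-term picture, any other retention pattern needs the explicit MMCO commands.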

The explicit MMCO process may include multiple MMCO commands. An MMCO command may mark one or more short-term or long-term reference pictures as "not used for reference", mark all pictures as "not used for reference", or mark the current reference picture or an existing short-term reference picture as long-term and then assign a long-term reference picture index to that long-term reference picture.

In AVC, the reference picture marking operations, as well as the processes for output and removal of pictures from the DPB, are performed after a picture has been decoded.

HEVC introduces a different approach to reference picture management, referred to as the reference picture set (RPS). The most fundamental difference between the RPS concept and the MMCO/sliding window processes of AVC is that, for each particular slice, a complete set of the reference pictures used by the current picture or any subsequent picture is provided. Thus, a complete set of all pictures that must be kept in the DPB for use by the current or future pictures is signaled. This differs from the AVC scheme, in which only relative changes to the DPB are signaled. With the RPS concept, no information from earlier pictures in decoding order is needed to maintain the correct status of the reference pictures in the DPB.

The order of picture decoding and DPB operations in HEVC is changed compared to AVC in order to exploit the advantages of the RPS and improve error resilience. In AVC, picture marking and buffer operations (both output and removal of decoded pictures from the DPB) are generally applied after the current picture has been decoded. In HEVC, the RPS is first decoded from a slice header of the current picture, and then picture marking and buffer operations are generally applied before decoding the current picture.

Each slice header in HEVC must include parameters for signaling the RPS of the picture containing the slice. The only exception is that no RPS is signaled for instantaneous decoding refresh (IDR) slices; instead, the RPS is inferred to be empty. For I slices that do not belong to an IDR picture, an RPS may be provided even if the slices belong to an I picture, because there may be pictures following the I picture in decoding order that use inter prediction from pictures preceding the I picture in decoding order. The number of pictures in an RPS must not exceed the DPB size limit specified by the sps_max_dec_pic_buffering syntax element in the SPS.

Each picture is associated with a POC value that represents the output order. The slice header contains a fixed-length codeword, pic_order_cnt_lsb, representing the least significant bits (LSBs) of the full POC value, also known as the POC LSB. The length of the codeword is signaled in the SPS and can be, for example, between 4 and 16 bits. The RPS concept uses POC to identify reference pictures. Besides its own POC value, each slice header either directly contains, or inherits from the SPS, a coded representation of the POC values (or the LSBs) of each picture in the RPS.
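Since only the POC LSBs are carried in the slice header, a decoder has to reconstruct the full POC value. The passage above does not spell out how; the sketch below follows the wrap-around rule commonly used in HEVC-style decoders, and the function and parameter names are illustrative assumptions rather than syntax elements.

```python
def derive_poc(poc_lsb, prev_poc, lsb_bits):
    """Reconstruct a full POC from its LSBs and the previous reference POC.

    lsb_bits is the codeword length signaled in the SPS (e.g. 4 to 16).
    """
    max_lsb = 1 << lsb_bits
    prev_lsb = prev_poc % max_lsb   # Python's % keeps this non-negative
    prev_msb = prev_poc - prev_lsb
    if poc_lsb < prev_lsb and prev_lsb - poc_lsb >= max_lsb // 2:
        msb = prev_msb + max_lsb    # LSBs wrapped around going forward
    elif poc_lsb > prev_lsb and poc_lsb - prev_lsb > max_lsb // 2:
        msb = prev_msb - max_lsb    # LSBs wrapped around going backward
    else:
        msb = prev_msb
    return msb + poc_lsb


print(derive_poc(poc_lsb=1, prev_poc=14, lsb_bits=4))
print(derive_poc(poc_lsb=15, prev_poc=17, lsb_bits=4))
```

With 4-bit LSBs, an LSB of 1 following POC 14 is interpreted as POC 17 rather than POC 1, which is why a longer codeword tolerates larger POC gaps.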

The RPS for each picture consists of five different lists of reference pictures, also referred to as the five RPS subsets. RefPicSetStCurrBefore consists of all short-term reference pictures that precede the current picture in both decoding order and output order, and that may be used in inter prediction of the current picture. RefPicSetStCurrAfter consists of all short-term reference pictures that precede the current picture in decoding order, follow the current picture in output order, and may be used in inter prediction of the current picture. RefPicSetStFoll consists of all short-term reference pictures that may be used in inter prediction of one or more pictures following the current picture in decoding order, and that are not used in inter prediction of the current picture. RefPicSetLtCurr consists of all long-term reference pictures that may be used in inter prediction of the current picture. RefPicSetLtFoll consists of all long-term reference pictures that may be used in inter prediction of one or more pictures following the current picture in decoding order, and that are not used in inter prediction of the current picture.

The RPS is signaled using up to three loops that iterate over the different types of reference pictures: short-term reference pictures with a lower POC value than the current picture, short-term reference pictures with a higher POC value than the current picture, and long-term reference pictures.

In addition, a flag (used_by_curr_pic_X_flag) is sent for each reference picture, indicating whether the reference picture is used for reference by the current picture (included in one of the lists RefPicSetStCurrBefore, RefPicSetStCurrAfter, or RefPicSetLtCurr) or not (included in one of the lists RefPicSetStFoll or RefPicSetLtFoll).
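The way the long-term status, the POC relative to the current picture, and the used-by-current flag together select one of the five subsets can be sketched as below. The `RpsEntry` type and `split_rps` function are hypothetical names introduced for illustration, and the POC values in the example are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RpsEntry:
    poc: int
    long_term: bool
    used_by_curr: bool  # plays the role of used_by_curr_pic_X_flag


def split_rps(entries, current_poc):
    """Sort RPS entries into the five subsets named in the text."""
    subsets = {name: [] for name in (
        "RefPicSetStCurrBefore", "RefPicSetStCurrAfter", "RefPicSetStFoll",
        "RefPicSetLtCurr", "RefPicSetLtFoll")}
    for e in entries:
        if e.long_term:
            key = "RefPicSetLtCurr" if e.used_by_curr else "RefPicSetLtFoll"
        elif not e.used_by_curr:
            key = "RefPicSetStFoll"
        elif e.poc < current_poc:
            key = "RefPicSetStCurrBefore"  # earlier in output order
        else:
            key = "RefPicSetStCurrAfter"   # later in output order
        subsets[key].append(e.poc)
    return subsets


example = split_rps(
    [RpsEntry(4, False, True), RpsEntry(12, False, True),
     RpsEntry(6, False, False), RpsEntry(0, True, True),
     RpsEntry(2, True, False)],
    current_poc=8)
print(example)
```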

FIG. 4 illustrates an RPS 400 for a current picture B14, with entries (e.g., pictures) in all subsets 402 of the RPS 400. In the example of FIG. 4, the current picture B14 has exactly one picture in each of the five subsets 402 (also known as RPS subsets). P8 is the picture in the subset 402 referred to as RefPicSetStCurrBefore because it precedes B14 in output order and is used by B14. P12 is the picture in the subset 402 referred to as RefPicSetStCurrAfter because it follows B14 in output order and is used by B14. P13 is the picture in the subset 402 referred to as RefPicSetStFoll because it is a short-term reference picture not used by B14 (but it must be kept in the DPB because it is used by B15). P4 is the picture in the subset 402 referred to as RefPicSetLtCurr because it is a long-term reference picture used by B14. I0 is the picture in the subset 402 referred to as RefPicSetLtFoll because it is a long-term reference picture not used by the current picture (but it must be kept in the DPB because it is used by B15).

The short-term part of the RPS 400 may be included directly in the slice header. Alternatively, the slice header may contain only a syntax element that represents an index referencing a predefined list of RPSs sent in the active SPS. The short-term part of the RPS 400 can be signaled using either of two different schemes: inter RPS, as described below, or intra RPS, as described here. When intra RPS is used, num_negative_pics and num_positive_pics are signaled, representing the lengths of two different lists of reference pictures. These lists contain the reference pictures with a negative POC difference and a positive POC difference, respectively, compared to the current picture. Each element in these lists is encoded with a variable-length code representing the difference in POC value relative to the previous element in the list, minus one. For the first picture in each list, the signaling is relative to the POC value of the current picture, minus one.
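The "difference minus one" coding of the intra RPS can be illustrated with a short decoding sketch. The function and parameter names are hypothetical; the two input lists stand for the already entropy-decoded codeword values, so their lengths correspond to num_negative_pics and num_positive_pics.

```python
def decode_intra_rps(current_poc, negative_minus1, positive_minus1):
    """Turn "POC difference minus one" values back into absolute POCs.

    Each value is relative to the previous element of its list; the first
    value of each list is relative to the current picture.
    """
    before, poc = [], current_poc
    for d in negative_minus1:
        poc -= d + 1          # step further into the past
        before.append(poc)
    after, poc = [], current_poc
    for d in positive_minus1:
        poc += d + 1          # step further into the future
        after.append(poc)
    return before, after


print(decode_intra_rps(16, [0, 1, 3], [1]))
```

Subtracting one before coding works because two entries of the same list never share a POC value, so the smallest possible step is 1 and can be coded as 0.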

When encoding the recurring RPSs in the sequence parameter set, it is possible to encode the elements of one RPS (e.g., RPS 400) with reference to another RPS already encoded in the sequence parameter set. This is referred to as inter RPS. There are no error robustness problems associated with this method, because all the RPSs of the sequence parameter set are in the same network abstraction layer (NAL) unit. The inter RPS syntax exploits the fact that the RPS of the current picture can be predicted from the RPS of a previously decoded picture. This is because all the reference pictures of the current picture must either be reference pictures of the previous picture or be the previously decoded picture itself. It is only necessary to indicate which of these pictures should be reference pictures and be used for prediction of the current picture. Therefore, the syntax may comprise the following: an index pointing to the RPS to use as a predictor, a delta_POC to be added to the delta_POC of the predictor to obtain the delta POC of the current RPS, and a set of indicators indicating which pictures are reference pictures and whether they are only used for prediction of future pictures.

Encoders that want to exploit the use of long-term reference pictures must set the SPS syntax element long_term_ref_pics_present_flag to 1. Long-term reference pictures can then be signaled in the slice header by fixed-length codewords, poc_lsb_lt, representing the least significant bits of the full POC value of each long-term picture. Each poc_lsb_lt is a copy of the pic_order_cnt_lsb codeword that was signaled for a particular long-term picture. It is also possible to signal a set of long-term pictures in the SPS as a list of POC LSB values. The POC LSB for a long-term picture can then be signaled in the slice header as an index into this list.

The delta_poc_msb_cycle_lt_minus1 syntax element can additionally be signaled to allow calculation of the full POC distance of a long-term reference picture relative to the current picture. The codeword delta_poc_msb_cycle_lt_minus1 must be signaled for each long-term reference picture that has the same POC LSB value as any other reference picture in the RPS.
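The condition under which the MSB cycle information is needed can be made concrete with a small check. This is only a sketch of the ambiguity test implied above; the function name and arguments are assumptions, and the actual derivation of the full POC distance from the syntax element is not shown.

```python
def lsb_is_ambiguous(long_term_poc, other_reference_pocs, lsb_bits):
    """Return True if another reference picture shares the same POC LSBs.

    In that case the LSBs alone cannot identify the long-term picture,
    and additional MSB information has to be signaled.
    """
    max_lsb = 1 << lsb_bits
    target = long_term_poc % max_lsb
    return any(p % max_lsb == target for p in other_reference_pocs)


# With 4-bit LSBs, POC 3 and POC 19 both have LSB value 3.
print(lsb_is_ambiguous(3, [19, 20], lsb_bits=4))
print(lsb_is_ambiguous(3, [18, 20], lsb_bits=4))
```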

For reference picture marking in HEVC, there will typically be a number of pictures present in the DPB before a picture is decoded. Some of the pictures are available for prediction and are therefore marked as "used for reference". Other pictures are unavailable for prediction but are waiting for output, and are therefore marked as "not used for reference". Once the slice header has been parsed, a picture marking process is carried out before the slice data is decoded. Pictures that are present in the DPB and marked as "used for reference" but are not included in the RPS are marked as "not used for reference". Pictures that are not present in the DPB but are included in the reference picture set are ignored when used_by_curr_pic_X_flag is equal to 0. However, when used_by_curr_pic_X_flag is equal to 1, the reference picture was intended to be used for prediction in the current picture but is missing. An unintentional picture loss is then inferred, and the decoder should take appropriate action.
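The marking step and the loss detection described above can be sketched as follows. The dictionary-based DPB model and the function name are assumptions made for illustration; a real decoder tracks considerably more state per picture.

```python
USED = "used for reference"
UNUSED = "not used for reference"


def apply_rps_marking(dpb, rps):
    """Apply RPS-based marking before the slice data is decoded.

    dpb maps POC -> marking string and is updated in place.
    rps maps POC -> used-by-current flag for every picture in the RPS.
    Returns the POCs that the current picture needs but the DPB lacks.
    """
    for poc, state in dpb.items():
        if state == USED and poc not in rps:
            dpb[poc] = UNUSED   # no longer needed by current or later pictures
    # Absent pictures with flag 0 are ignored; flag 1 implies a picture loss.
    return [poc for poc, used in rps.items() if used and poc not in dpb]


dpb = {0: USED, 4: USED, 6: UNUSED}
missing = apply_rps_marking(dpb, {4: True, 8: True, 10: False})
print(dpb, missing)
```

Here POC 0 is dropped from reference use, POC 8 is reported as lost, and POC 10 is silently ignored because the current picture does not depend on it.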

After the current picture has been decoded, it is marked as "used for short-term reference".

Next, reference picture list construction in HEVC is discussed. In HEVC, the term inter prediction is used to denote prediction derived from data elements (e.g., sample values or motion vectors) of reference pictures other than the current decoded picture. As in AVC, a picture can be predicted from multiple reference pictures. The reference pictures used for inter prediction are organized in one or more reference picture lists. The reference index identifies which of the reference pictures in the list should be used for creating the prediction signal.

A single reference picture list, List 0, is used for a P slice, and two reference picture lists, List 0 and List 1, are used for B slices. Similar to AVC, reference picture list construction in HEVC includes reference picture list initialization and reference picture list modification.

In AVC, the initialization process for List 0 differs between P slices (for which decoding order is used) and B slices (for which output order is used). In HEVC, output order is used in both cases.

Reference picture list initialization creates the default List 0 and List 1 (if the slice is a B slice) based on three RPS subsets: RefPicSetStCurrBefore, RefPicSetStCurrAfter, and RefPicSetLtCurr. Short-term pictures with earlier (later) output order are first inserted into List 0 (List 1) in ascending order of POC distance to the current picture; then short-term pictures with later (earlier) output order are inserted into List 0 (List 1) in ascending order of POC distance to the current picture; and finally the long-term pictures are inserted at the end. In terms of RPS, for List 0, the entries in RefPicSetStCurrBefore are inserted in the initial list, followed by the entries in RefPicSetStCurrAfter. Afterwards, the entries in RefPicSetLtCurr, if available, are appended.

In HEVC, the above process is repeated (reference pictures that have already been added to the reference picture list are added again) when the number of entries in a list is smaller than the target number of active reference pictures (signaled in the picture parameter set or slice header). When the number of entries is larger than the target number, the list is truncated.
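The initialization order together with the repeat-or-truncate rule can be sketched as below. This is an illustrative approximation under stated assumptions: pictures are represented by POC values only, the subsets are re-sorted by POC distance inside the function, and all names are hypothetical.

```python
def init_ref_lists(st_curr_before, st_curr_after, lt_curr,
                   current_poc, num_active_l0, num_active_l1):
    """Build default List 0 and List 1 from three RPS subsets."""

    def by_distance(pocs):
        # Ascending POC distance to the current picture.
        return sorted(pocs, key=lambda p: abs(p - current_poc))

    def build(candidates, target):
        if not candidates:
            return []
        out = []
        while len(out) < target:   # repeat the candidates if too short
            out.extend(candidates)
        return out[:target]        # truncate if too long

    before, after = by_distance(st_curr_before), by_distance(st_curr_after)
    list0 = build(before + after + list(lt_curr), num_active_l0)
    list1 = build(after + before + list(lt_curr), num_active_l1)
    return list0, list1


print(init_ref_lists([4, 6], [12, 10], [0], current_poc=8,
                     num_active_l0=4, num_active_l1=2))
print(init_ref_lists([4, 6], [12, 10], [0], current_poc=8,
                     num_active_l0=7, num_active_l1=0))
```

The second call shows the repetition case: with five candidates and a target of seven, the first two candidates appear again at the end of List 0.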

After a reference picture list has been initialized, it may be modified such that the reference pictures for the current picture can be arranged in any order, including the case where one particular reference picture appears in more than one position in the list, based on reference picture list modification commands. When the flag indicating the presence of list modifications is set to 1, a fixed number of commands (equal to the target number of entries in the reference picture list) is signaled, and each command inserts one entry into the reference picture list. A reference picture is identified in the command by its index into the list of reference pictures for the current picture derived from the RPS signaling. This differs from reference picture list modification in H.264/AVC, where a picture is identified either by the picture number (derived from the frame_num syntax element) or by the long-term reference picture index, and where fewer commands may be needed, e.g., for swapping the first two entries of an initial list, or for inserting one entry at the beginning of the initial list and shifting the others.
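The index-based modification described above amounts to rebuilding the list entry by entry from a candidate list. A minimal sketch, assuming the candidate list has already been derived from the RPS and using hypothetical names:

```python
def modify_ref_list(candidates, entry_indices):
    """Rebuild a reference picture list from per-entry index commands.

    candidates is the RPS-derived list of reference pictures for the
    current picture; each command selects one candidate by index, so the
    same picture may appear at several positions.
    """
    return [candidates[i] for i in entry_indices]


candidates = [6, 4, 10, 12, 0]
print(modify_ref_list(candidates, [2, 2, 0]))
```

Since one command is sent per target entry, even a simple swap of the first two entries costs as many commands as the list has entries, which is the trade-off against the AVC scheme noted above.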

A reference picture list is not allowed to include any reference picture with a TemporalId greater than that of the current picture. An HEVC bitstream may consist of several temporal sub-layers. Each NAL unit belongs to a specific sub-layer, as indicated by the TemporalId (equal to temporal_id_plus1 − 1).

Another approach bases reference picture management directly on reference picture lists. JCT-VC document JCTVC-G643 includes an approach that directly uses three reference picture lists (reference picture list 0, reference picture list 1, and an idle reference picture list) to manage the reference pictures in the DPB. This avoids the need for signaling and decoding processes that include 1) the sliding window and MMCO processes as well as the reference picture list initialization and modification processes in AVC, or 2) the reference picture set as well as the reference picture list initialization and modification processes in HEVC.

The approaches to reference picture management may have several problems. The AVC approach involves the sliding window, the MMCO process, and the reference picture list initialization and modification processes, which are complex. Furthermore, loss of pictures may lead to loss of the DPB status in terms of which pictures should be in the DPB for further inter prediction reference purposes. The HEVC approach does not have the DPB status loss problem. However, the HEVC approach involves a complex reference picture set signaling and derivation process, as well as complex reference picture list initialization and modification processes.

The approach in JCTVC-G643, which directly uses three reference picture lists (reference picture list 0, reference picture list 1, and the idle reference picture list) to manage the reference pictures in the DPB, involves the following aspects: a third reference picture list, i.e., the idle reference picture list; two-part coding of POC differences as a "short-term" part and a ue(v)-coded "long-term" part; TemporalId-based POC granularity for POC difference coding; use of the two-part coding of POC differences to determine the marking between "used for short-term reference" and "used for long-term reference"; a reference picture list subset description, which enables specifying a reference picture list by removing reference pictures from the tail of a certain earlier reference picture list description; a reference picture list copy mode enabled by the syntax element ref_pic_list_copy_flag; and a reference picture list description process. Each of the preceding aspects makes the approach unnecessarily complex. Furthermore, the decoding process for reference picture lists in JCTVC-G643 is also complex. Signaling of long-term reference pictures may require signaling of the POC cycle in slice headers, which is not efficient.

To solve the problems listed above, the following solutions are disclosed, each of which can be applied individually and some of which can be applied in combination.
1) Reference picture marking is based directly on the two reference picture lists, namely reference picture list 0 and reference picture list 1.
1a) Information for derivation of the two reference picture lists is signaled based on syntax elements and syntax structures in the SPS, PPS, and/or the slice header.
1b) Each of the two reference picture lists for a picture is signaled explicitly in a reference picture list structure.
1b.i) One or more reference picture list structures can be signaled in the SPS, and each of them can be referred to by an index from the slice header.
1b.ii) Each of reference picture lists 0 and 1 can be signaled directly in the slice header.
2) Information for derivation of the two reference picture lists is signaled for all types of slices, i.e., B (bi-predictive), P (uni-predictive), and I (intra) slices. The term slice refers to a collection of coding tree units such as a slice in HEVC or in the latest VVC WD; it may also refer to some other collection of coding tree units, such as a tile in HEVC.
3) The two reference picture lists are generated for all types of slices, i.e., B, P, and I slices.
4) The two reference picture lists are constructed directly, without using a reference picture list initialization process or a reference picture list modification process.
5) In each of the two reference picture lists, reference pictures that may be used for inter prediction of the current picture can be referred to only by a number of entries at the beginning of the list. These entries are referred to as the active entries in the list, and the other entries are referred to as the inactive entries in the list. Both the total number of entries and the number of active entries in the list can be derived.
6) The picture referred to by an inactive entry in a reference picture list cannot be referred to by any other entry in that reference picture list or by any entry in the other reference picture list.
7) Long-term reference pictures are identified only by a certain number of POC LSBs, where this number may be greater than the number of POC LSBs signaled in the slice header for derivation of POC values, and this number is indicated in the SPS.
8) Reference picture list structures are signaled only in slice headers; both short-term and long-term reference pictures are identified by their POC LSBs, which may be represented by a number of bits different from the number of bits used to represent the POC LSBs signaled in slice headers for derivation of POC values; and the numbers of bits used to represent the POC LSBs for identifying short-term reference pictures and long-term reference pictures may differ.
9) Reference picture list structures are signaled only in slice headers; no distinction is made between short-term and long-term reference pictures; all reference pictures are simply named reference pictures; and reference pictures are identified by their POC LSBs, which may be represented by a number of bits different from the number of bits used to represent the POC LSBs signaled in slice headers for derivation of POC values.
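Items 1), 5), and 6) can be illustrated with a rough sketch of marking driven directly by the two lists. It assumes the list entries have already been resolved to POC values, and the function names and marking strings are illustrative rather than taken from any draft text.

```python
def mark_from_lists(dpb_pocs, list0, list1, num_active0, num_active1):
    """Mark DPB pictures directly from reference picture lists 0 and 1.

    A picture stays available for reference if any entry of either list
    refers to it; only the leading (active) entries may be used for inter
    prediction of the current picture.
    """
    referenced = set(list0) | set(list1)
    marking = {poc: ("used for reference" if poc in referenced
                     else "not used for reference")
               for poc in dpb_pocs}
    active = (list0[:num_active0], list1[:num_active1])
    inactive = (list0[num_active0:], list1[num_active1:])
    return marking, active, inactive


def inactive_entries_are_unique(list0, list1, num_active0, num_active1):
    """Check item 6: an inactive entry's picture appears nowhere else."""
    all_entries = list0 + list1
    inactive = list0[num_active0:] + list1[num_active1:]
    return all(all_entries.count(poc) == 1 for poc in inactive)


marking, active, inactive = mark_from_lists(
    [0, 4, 6, 10, 12], list0=[6, 4, 12], list1=[10, 6],
    num_active0=2, num_active1=2)
print(marking, active, inactive)
print(inactive_entries_are_unique([6, 4, 12], [10, 6], 2, 2))
```

In this example POC 12 is kept only for later pictures (inactive entry), POC 0 is no longer referenced by either list, and no separate reference picture set or initialization step is involved.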

A first embodiment of the present disclosure is provided. The description is relative to the latest VVC WD. In this embodiment, two sets of reference picture list structures are signaled in the SPS, one for each of reference picture list 0 and reference picture list 1.

Definitions for some of the terms used herein are provided. Intra random access point (IRAP) picture: a coded picture for which each video coding layer (VCL) NAL unit has nal_unit_type equal to IRAP_NUT. Non-IRAP picture: a coded picture for which each VCL NAL unit has nal_unit_type equal to NON_IRAP_NUT. Reference picture list: a list of reference pictures used for inter prediction of a P or B slice. Two reference picture lists, reference picture list 0 and reference picture list 1, are generated for each slice of a non-IRAP picture. The set of unique pictures referred to by all entries in the two reference picture lists associated with a picture consists of all reference pictures that may be used for inter prediction of the associated picture or of any picture following the associated picture in decoding order. For decoding the slice data of a P slice, only reference picture list 0 is used for inter prediction. For decoding the slice data of a B slice, both reference picture lists are used for inter prediction. For decoding the slice data of an I slice, no reference picture list is used for inter prediction. Long-term reference picture (LTRP): a picture that is marked as "used for long-term reference". Short-term reference picture (STRP): a picture that is marked as "used for short-term reference".

The terms "used for short-term reference", "used for long-term reference", and "not used for reference" are defined in VVC in section 8.3.3, Decoding process for reference picture marking; in HEVC in section 8.3.2, Decoding process for reference picture set; and in AVC in section 7.4.3.3, Decoded reference picture marking semantics. As used herein, the terms have the same meaning.

Related syntax and semantics for the first embodiment are provided below.

NAL unit header syntax.

Sequence parameter set raw byte sequence payload (RBSP) syntax.

Picture parameter set RBSP syntax.

Slice header syntax.

Reference picture list structure syntax.

NAL unit header semantics.

forbidden_zero_bit shall be equal to 0. nal_unit_type specifies the type of RBSP data structure contained in the NAL unit.

[Table 7-1] NAL unit type codes and NAL unit type classes

nuh_temporal_id_plus1 minus 1 specifies a temporal identifier for the NAL unit. The value of nuh_temporal_id_plus1 shall not be equal to 0. The variable TemporalId is specified as follows: TemporalId = nuh_temporal_id_plus1 - 1. When nal_unit_type is equal to IRAP_NUT, the coded slice belongs to an IRAP picture, and TemporalId shall be equal to 0. The value of TemporalId shall be the same for all VCL NAL units of an access unit. The value of TemporalId of a coded picture or an access unit is the value of the TemporalId of the VCL NAL units of the coded picture or the access unit. The value of TemporalId for non-VCL NAL units is constrained as follows: If nal_unit_type is equal to SPS_NUT, TemporalId shall be equal to 0 and the TemporalId of the access unit containing the NAL unit shall be equal to 0. Otherwise, if nal_unit_type is equal to EOS_NUT or EOB_NUT, TemporalId shall be equal to 0. Otherwise, TemporalId shall be greater than or equal to the TemporalId of the access unit containing the NAL unit. When the NAL unit is a non-VCL NAL unit, the value of TemporalId is equal to the minimum value of the TemporalId values of all access units to which the non-VCL NAL unit applies. When nal_unit_type is equal to PPS_NUT, TemporalId may be greater than or equal to the TemporalId of the containing access unit, because all picture parameter sets (PPSs) may be included at the beginning of a bitstream, where the first coded picture has TemporalId equal to 0. When nal_unit_type is equal to PREFIX_SEI_NUT or SUFFIX_SEI_NUT, TemporalId may be greater than or equal to the TemporalId of the containing access unit, because an SEI NAL unit may contain information that applies to a bitstream subset that includes access units whose TemporalId values are greater than the TemporalId of the access unit containing the SEI NAL unit. nuh_reserved_zero_7bits shall be equal to '0000000'. Other values of nuh_reserved_zero_7bits may be specified in the future by ITU-T | ISO/IEC. Decoders shall ignore (that is, remove from the bitstream and discard) NAL units with values of nuh_reserved_zero_7bits not equal to '0000000'.
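
As a non-limiting illustration, the TemporalId derivation and the constraints on non-VCL NAL units described above may be sketched as follows. The function names and the use of string tags for nal_unit_type are assumptions made for readability and are not part of the syntax described herein.

```python
def temporal_id(nuh_temporal_id_plus1: int) -> int:
    """TemporalId = nuh_temporal_id_plus1 - 1; a coded value of 0 is not allowed."""
    if nuh_temporal_id_plus1 == 0:
        raise ValueError("nuh_temporal_id_plus1 shall not be equal to 0")
    return nuh_temporal_id_plus1 - 1


def non_vcl_temporal_id_ok(nal_unit_type: str, tid: int, au_tid: int) -> bool:
    """Check the TemporalId constraint for one non-VCL NAL unit.

    nal_unit_type is a hypothetical string tag such as "SPS_NUT";
    au_tid is the TemporalId of the access unit containing the NAL unit.
    """
    if nal_unit_type == "SPS_NUT":
        return tid == 0 and au_tid == 0
    if nal_unit_type in ("EOS_NUT", "EOB_NUT"):
        return tid == 0
    # Remaining non-VCL types, e.g. PPS_NUT, PREFIX_SEI_NUT, SUFFIX_SEI_NUT.
    return tid >= au_tid
```

For example, a PPS NAL unit with TemporalId 2 carried in an access unit with TemporalId 1 satisfies the constraint, whereas an SPS NAL unit carried in an access unit with TemporalId 1 does not.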

Sequence parameter set RBSP semantics.

log2_max_pic_order_cnt_lsb_minus4 specifies the value of the variable MaxPicOrderCntLsb that is used in the decoding process for picture order count as follows: MaxPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4). The value of log2_max_pic_order_cnt_lsb_minus4 shall be in the range of 0 to 12, inclusive. sps_max_dec_pic_buffering_minus1 plus 1 specifies the maximum required size of the decoded picture buffer for the CVS in units of picture storage buffers. The value of sps_max_dec_pic_buffering_minus1 shall be in the range of 0 to MaxDpbSize - 1, inclusive, where MaxDpbSize is as specified elsewhere. long_term_ref_pics_flag equal to 0 specifies that no LTRP is used for inter prediction of any coded picture in the CVS. long_term_ref_pics_flag equal to 1 specifies that LTRPs may be used for inter prediction of one or more coded pictures in the CVS. additional_lt_poc_lsb specifies the value of the variable MaxLtPicOrderCntLsb that is used in the decoding process for reference picture lists as follows: MaxLtPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4 + additional_lt_poc_lsb). The value of additional_lt_poc_lsb shall be in the range of 0 to 32 - log2_max_pic_order_cnt_lsb_minus4 - 4, inclusive. When not present, the value of additional_lt_poc_lsb is inferred to be equal to 0. num_ref_pic_lists_in_sps[i] specifies the number of ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structures with listIdx equal to i included in the SPS. The value of num_ref_pic_lists_in_sps[i] shall be in the range of 0 to 64, inclusive. For each value of listIdx (equal to 0 or 1), a decoder should allocate memory for a total of num_ref_pic_lists_in_sps[i] + 1 ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structures, because one ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure may be signaled directly in the slice headers of a current picture.
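
As a non-limiting illustration, the derivation of MaxPicOrderCntLsb and MaxLtPicOrderCntLsb from the SPS syntax elements, together with the stated value ranges, may be sketched as follows. The function name is an assumption made for this example only.

```python
def max_poc_lsb_values(log2_max_pic_order_cnt_lsb_minus4: int,
                       additional_lt_poc_lsb: int = 0) -> tuple[int, int]:
    """Return (MaxPicOrderCntLsb, MaxLtPicOrderCntLsb) after range checks.

    additional_lt_poc_lsb defaults to 0, its inferred value when not present.
    """
    if not 0 <= log2_max_pic_order_cnt_lsb_minus4 <= 12:
        raise ValueError("log2_max_pic_order_cnt_lsb_minus4 out of range 0..12")
    log2_max = log2_max_pic_order_cnt_lsb_minus4 + 4
    if not 0 <= additional_lt_poc_lsb <= 32 - log2_max:
        raise ValueError("additional_lt_poc_lsb out of range")
    # Powers of two are computed with shifts: 2^n == 1 << n.
    return 1 << log2_max, 1 << (log2_max + additional_lt_poc_lsb)
```

With log2_max_pic_order_cnt_lsb_minus4 equal to 4 and additional_lt_poc_lsb equal to 3, the POC LSBs signaled in the slice header use 8 bits, whereas an LTRP is identified with 11 bits.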

Picture parameter set RBSP semantics.

num_ref_idx_default_active_minus1[i] plus 1, when i is equal to 0, specifies the inferred value of the variable NumRefIdxActive[0] for P or B slices with num_ref_idx_active_override_flag equal to 0, and, when i is equal to 1, specifies the inferred value of NumRefIdxActive[1] for B slices with num_ref_idx_active_override_flag equal to 0. The value of num_ref_idx_default_active_minus1[i] shall be in the range of 0 to 14, inclusive.

Slice header semantics.

When present, the value of each of the slice header syntax elements slice_pic_parameter_set_id and slice_pic_order_cnt_lsb shall be the same in all slice headers of a coded picture. ... slice_type specifies the coding type of the slice according to Table 7-3.

[Table 7-3] Name association to slice_type

When nal_unit_type is equal to IRAP_NUT, that is, when the picture is an IRAP picture, slice_type shall be equal to 2. ... slice_pic_order_cnt_lsb specifies the picture order count modulo MaxPicOrderCntLsb for the current picture. The length of the slice_pic_order_cnt_lsb syntax element is log2_max_pic_order_cnt_lsb_minus4 + 4 bits. The value of slice_pic_order_cnt_lsb shall be in the range of 0 to MaxPicOrderCntLsb - 1, inclusive. When slice_pic_order_cnt_lsb is not present, it is inferred to be equal to 0. ref_pic_list_sps_flag[i] equal to 1 specifies that reference picture list i of the current picture is derived based on one of the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structures with listIdx equal to i in the active SPS. ref_pic_list_sps_flag[i] equal to 0 specifies that reference picture list i of the current picture is derived based on the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure with listIdx equal to i that is directly included in the slice headers of the current picture. When num_ref_pic_lists_in_sps[i] is equal to 0, the value of ref_pic_list_sps_flag[i] shall be equal to 0. ref_pic_list_idx[i] specifies the index, into the list of the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structures with listIdx equal to i included in the active SPS, of the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure with listIdx equal to i that is used for derivation of reference picture list i of the current picture. The syntax element ref_pic_list_idx[i] is represented by Ceil(Log2(num_ref_pic_lists_in_sps[i])) bits. When not present, the value of ref_pic_list_idx[i] is inferred to be equal to 0. The value of ref_pic_list_idx[i] shall be in the range of 0 to num_ref_pic_lists_in_sps[i] - 1, inclusive. num_ref_idx_active_override_flag equal to 1 specifies that the syntax element num_ref_idx_active_minus1[0] is present for P and B slices and that the syntax element num_ref_idx_active_minus1[1] is present for B slices. num_ref_idx_active_override_flag equal to 0 specifies that the syntax elements num_ref_idx_active_minus1[0] and num_ref_idx_active_minus1[1] are not present. num_ref_idx_active_minus1[i], when present, specifies the value of the variable NumRefIdxActive[i] as follows: NumRefIdxActive[i] = num_ref_idx_active_minus1[i] + 1. The value of num_ref_idx_active_minus1[i] shall be in the range of 0 to 14, inclusive.
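
As a non-limiting illustration, the number of bits used to code ref_pic_list_idx[i], namely Ceil(Log2(num_ref_pic_lists_in_sps[i])), may be computed with integer arithmetic as sketched below. The function name is an assumption made for this example. The case of num_ref_pic_lists_in_sps[i] equal to 0 is excluded because ref_pic_list_sps_flag[i] is then required to be 0 and no index is coded.

```python
def ref_pic_list_idx_bits(num_ref_pic_lists_in_sps: int) -> int:
    """Bit length of ref_pic_list_idx[i]: Ceil(Log2(n)) for 1 <= n <= 64."""
    if not 1 <= num_ref_pic_lists_in_sps <= 64:
        raise ValueError("expected 1..64 candidate structures in the SPS")
    # For n >= 1, Ceil(Log2(n)) equals the bit length of n - 1.
    return (num_ref_pic_lists_in_sps - 1).bit_length()
```

When only one candidate structure is present in the SPS, the index occupies 0 bits and its value is inferred to be 0.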

The value of NumRefIdxActive[i] - 1 specifies the maximum reference index for reference picture list i that may be used to decode the slice. When the value of NumRefIdxActive[i] is equal to 0, no reference index for reference picture list i may be used to decode the slice. For i equal to 0 or 1, when the current slice is a B slice and num_ref_idx_active_override_flag is equal to 0, NumRefIdxActive[i] is inferred to be equal to num_ref_idx_default_active_minus1[i] + 1. When the current slice is a P slice and num_ref_idx_active_override_flag is equal to 0, NumRefIdxActive[0] is inferred to be equal to num_ref_idx_default_active_minus1[0] + 1. When the current slice is a P slice, NumRefIdxActive[1] is inferred to be equal to 0. When the current slice is an I slice, both NumRefIdxActive[0] and NumRefIdxActive[1] are inferred to be equal to 0.

Alternatively, for i equal to 0 or 1, the following applies after the above: Let rplsIdx1 be set equal to ref_pic_list_sps_flag[i] ? ref_pic_list_idx[i] : num_ref_pic_lists_in_sps[i], and let numRpEntries[i] be equal to num_strp_entries[i][rplsIdx1] + num_ltrp_entries[i][rplsIdx1]. When NumRefIdxActive[i] is greater than numRpEntries[i], the value of NumRefIdxActive[i] is set equal to numRpEntries[i].
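
As a non-limiting illustration, the derivation of NumRefIdxActive[0] and NumRefIdxActive[1], including the optional limit to numRpEntries[i] of the alternative, may be sketched as follows. The function name, the string tags for the slice type, and the parameter layout are assumptions made for this example.

```python
def num_ref_idx_active(slice_type, override_flag, active_minus1,
                       default_active_minus1, num_rp_entries=None):
    """Return [NumRefIdxActive[0], NumRefIdxActive[1]] for one slice.

    slice_type: "B", "P" or "I" (hypothetical string tags).
    active_minus1: signaled num_ref_idx_active_minus1[i], used when
        override_flag is 1 (only index 0 is read for a P slice).
    default_active_minus1: num_ref_idx_default_active_minus1[i] from the PPS.
    num_rp_entries: optional numRpEntries[i]; when given, the alternative
        limit is applied.
    """
    lists_in_use = {"B": 2, "P": 1, "I": 0}[slice_type]
    result = [0, 0]  # lists that are not in use keep the inferred value 0
    for i in range(lists_in_use):
        minus1 = active_minus1[i] if override_flag else default_active_minus1[i]
        result[i] = minus1 + 1
    if num_rp_entries is not None:
        result = [min(n, e) for n, e in zip(result, num_rp_entries)]
    return result
```

The limit in the alternative keeps the number of active entries from exceeding the number of entries actually present in the selected reference picture list structure.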

Reference picture list structure semantics.

The ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure may be present in an SPS or in a slice header. Depending on whether the syntax structure is included in a slice header or an SPS, the following applies: If present in a slice header, the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure specifies reference picture list listIdx of the current picture (the picture containing the slice). Otherwise (present in an SPS), the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure specifies a candidate for reference picture list listIdx, and the term "the current picture" in the semantics specified in the remainder of this section refers to each picture that 1) has one or more slices containing ref_pic_list_idx[listIdx] equal to an index into the list of the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structures included in the SPS, and 2) is in a CVS that has the SPS as the active SPS. num_strp_entries[listIdx][rplsIdx] specifies the number of STRP entries in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure. num_ltrp_entries[listIdx][rplsIdx] specifies the number of LTRP entries in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure. When not present, the value of num_ltrp_entries[listIdx][rplsIdx] is inferred to be equal to 0. The variable NumEntriesInList[listIdx][rplsIdx] is derived as follows: NumEntriesInList[listIdx][rplsIdx] = num_strp_entries[listIdx][rplsIdx] + num_ltrp_entries[listIdx][rplsIdx]. The value of NumEntriesInList[listIdx][rplsIdx] shall be in the range of 0 to sps_max_dec_pic_buffering_minus1, inclusive. lt_ref_pic_flag[listIdx][rplsIdx][i] equal to 1 specifies that the i-th entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure is an LTRP entry. lt_ref_pic_flag[listIdx][rplsIdx][i] equal to 0 specifies that the i-th entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure is an STRP entry. When not present, the value of lt_ref_pic_flag[listIdx][rplsIdx][i] is inferred to be equal to 0. It is a requirement of bitstream conformance that the sum of lt_ref_pic_flag[listIdx][rplsIdx][i] for all values of i in the range of 0 to NumEntriesInList[listIdx][rplsIdx] - 1, inclusive, shall be equal to num_ltrp_entries[listIdx][rplsIdx]. delta_poc_st[listIdx][rplsIdx][i], when the i-th entry is the first STRP entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure, specifies the difference between the picture order count values of the current picture and the picture referred to by the i-th entry, or, when the i-th entry is an STRP entry but not the first STRP entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure, specifies the difference between the picture order count values of the pictures referred to by the i-th entry and by the previous STRP entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure. The value of delta_poc_st[listIdx][rplsIdx][i] shall be in the range of -2^15 to 2^15 - 1, inclusive. poc_lsb_lt[listIdx][rplsIdx][i] specifies the value of the picture order count modulo MaxLtPicOrderCntLsb of the picture referred to by the i-th entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure. The length of the poc_lsb_lt[listIdx][rplsIdx][i] syntax element is Log2(MaxLtPicOrderCntLsb) bits.
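
As a non-limiting illustration, the derivation of NumEntriesInList and the two constraints stated above for one ref_pic_list_struct() syntax structure may be checked as sketched below. The function name and the flat parameter layout (one structure at a time, with lt_ref_pic_flag given as a list of 0/1 values) are assumptions made for this example.

```python
def check_ref_pic_list_struct(num_strp_entries, num_ltrp_entries,
                              lt_ref_pic_flag,
                              sps_max_dec_pic_buffering_minus1):
    """Return NumEntriesInList for one structure, or raise ValueError.

    lt_ref_pic_flag holds one 0/1 flag per entry (1 marks an LTRP entry).
    """
    num_entries = num_strp_entries + num_ltrp_entries
    if not 0 <= num_entries <= sps_max_dec_pic_buffering_minus1:
        raise ValueError("NumEntriesInList exceeds the DPB-based limit")
    if len(lt_ref_pic_flag) != num_entries:
        raise ValueError("one lt_ref_pic_flag is expected per entry")
    # Conformance: the number of flags equal to 1 matches num_ltrp_entries.
    if sum(lt_ref_pic_flag) != num_ltrp_entries:
        raise ValueError("sum of lt_ref_pic_flag != num_ltrp_entries")
    return num_entries
```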

The decoding process is discussed. The decoding process operates as follows for the current picture CurrPic: The decoding of NAL units is specified below. The processes below specify the following decoding processes using syntax elements in the slice header layer and above. Variables and functions relating to picture order count are derived. This needs to be invoked only for the first slice of a picture. At the beginning of the decoding process for each slice of a non-IRAP picture, the decoding process for reference picture list construction is invoked for derivation of reference picture list 0 (RefPicList[0]) and reference picture list 1 (RefPicList[1]). The decoding process for reference picture marking is invoked, wherein reference pictures may be marked as "not used for reference" or "used for long-term reference". This needs to be invoked only for the first slice of a picture. The decoding processes for coding tree units, scaling, transform, in-loop filtering, and so on are invoked. After all slices of the current picture have been decoded, the current decoded picture is marked as "used for short-term reference".

The NAL unit decoding process is discussed. Inputs to this process are the NAL units of the current picture and their associated non-VCL NAL units. Outputs of this process are the parsed RBSP syntax structures encapsulated within the NAL units. The decoding process for each NAL unit extracts the RBSP syntax structure from the NAL unit and then parses the RBSP syntax structure. The slice decoding process is discussed, including the decoding process for picture order count. The output of this process is PicOrderCntVal, the picture order count of the current picture. Picture order counts are used to identify pictures, for deriving motion parameters in merge mode and in motion vector prediction, and for decoder conformance checking. Each coded picture is associated with a picture order count variable, denoted as PicOrderCntVal. When the current picture is not an IRAP picture, the variables prevPicOrderCntLsb and prevPicOrderCntMsb are derived as follows: Let prevTid0Pic be the previous picture in decoding order that has TemporalId equal to 0. The variable prevPicOrderCntLsb is set equal to slice_pic_order_cnt_lsb of prevTid0Pic. The variable prevPicOrderCntMsb is set equal to PicOrderCntMsb of prevTid0Pic.

The variable PicOrderCntMsb of the current picture is derived as follows: If the current picture is an IRAP picture, PicOrderCntMsb is set equal to 0. Otherwise, PicOrderCntMsb is derived as follows:

PicOrderCntVal is derived as follows: PicOrderCntVal = PicOrderCntMsb + slice_pic_order_cnt_lsb.
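
As a non-limiting illustration, the picture order count derivation may be sketched as follows. The derivation of PicOrderCntMsb for a non-IRAP picture is not reproduced in the text above; the sketch assumes the wrap-around rule used for the corresponding derivation in HEVC, and that assumption is marked in the code. The function and parameter names are likewise assumptions made for this example.

```python
def pic_order_cnt_val(is_irap, slice_pic_order_cnt_lsb, max_pic_order_cnt_lsb,
                      prev_pic_order_cnt_lsb=0, prev_pic_order_cnt_msb=0):
    """Return PicOrderCntVal = PicOrderCntMsb + slice_pic_order_cnt_lsb."""
    lsb = slice_pic_order_cnt_lsb
    half = max_pic_order_cnt_lsb // 2
    if is_irap:
        msb = 0
    # ASSUMPTION: HEVC-style wrap-around detection relative to prevTid0Pic.
    elif lsb < prev_pic_order_cnt_lsb and prev_pic_order_cnt_lsb - lsb >= half:
        msb = prev_pic_order_cnt_msb + max_pic_order_cnt_lsb  # LSB wrapped up
    elif lsb > prev_pic_order_cnt_lsb and lsb - prev_pic_order_cnt_lsb > half:
        msb = prev_pic_order_cnt_msb - max_pic_order_cnt_lsb  # LSB wrapped down
    else:
        msb = prev_pic_order_cnt_msb
    return msb + lsb
```

Under this assumption, with MaxPicOrderCntLsb equal to 16, a picture with slice_pic_order_cnt_lsb equal to 1 that follows a prevTid0Pic with LSB 14 and MSB 0 is assigned PicOrderCntVal 17 rather than 1.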

All IRAP pictures will have PicOrderCntVal equal to 0, because slice_pic_order_cnt_lsb is inferred to be 0 for IRAP pictures and prevPicOrderCntLsb and prevPicOrderCntMsb are both set equal to 0. The value of PicOrderCntVal shall be in the range of -2^31 to 2^31 - 1, inclusive. In one CVS, the PicOrderCntVal values for any two coded pictures shall not be the same.

At any moment during the decoding process, the values of PicOrderCntVal & (MaxLtPicOrderCntLsb - 1) for any two reference pictures in the DPB shall not be the same. The function PicOrderCnt(picX) is specified as follows: PicOrderCnt(picX) = PicOrderCntVal of the picture picX. The function DiffPicOrderCnt(picA, picB) is specified as follows: DiffPicOrderCnt(picA, picB) = PicOrderCnt(picA) - PicOrderCnt(picB). The bitstream shall not contain data that result in values of DiffPicOrderCnt(picA, picB) used in the decoding process that are not in the range of -2^15 to 2^15 - 1, inclusive. Let X be the current picture and Y and Z be two other pictures in the same coded video sequence (CVS). Y and Z are considered to be in the same output order direction from X when both DiffPicOrderCnt(X, Y) and DiffPicOrderCnt(X, Z) are positive or both are negative.
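
As a non-limiting illustration, DiffPicOrderCnt, the same-direction test, and the uniqueness constraint on PicOrderCntVal & (MaxLtPicOrderCntLsb - 1) may be sketched as follows. Pictures are represented here simply by their PicOrderCntVal values, and the function names are assumptions made for this example.

```python
def diff_pic_order_cnt(poc_a: int, poc_b: int) -> int:
    """DiffPicOrderCnt(picA, picB), with the -2^15 .. 2^15 - 1 range check."""
    diff = poc_a - poc_b
    if not -(1 << 15) <= diff <= (1 << 15) - 1:
        raise ValueError("DiffPicOrderCnt out of range")
    return diff


def same_output_direction(poc_x: int, poc_y: int, poc_z: int) -> bool:
    """True when Y and Z lie in the same output order direction from X."""
    dy = diff_pic_order_cnt(poc_x, poc_y)
    dz = diff_pic_order_cnt(poc_x, poc_z)
    return (dy > 0 and dz > 0) or (dy < 0 and dz < 0)


def lt_lsbs_unique(dpb_pocs, max_lt_pic_order_cnt_lsb: int) -> bool:
    """True when no two reference pictures in the DPB share the same LT LSBs."""
    lsbs = [poc & (max_lt_pic_order_cnt_lsb - 1) for poc in dpb_pocs]
    return len(set(lsbs)) == len(lsbs)
```

For example, reference pictures with PicOrderCntVal 0 and 16 collide when MaxLtPicOrderCntLsb is 16 but are distinguishable when it is 32, which is the situation additional_lt_poc_lsb allows an encoder to avoid.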

The decoding process for reference picture list construction is discussed. This process is invoked at the beginning of the decoding process for each slice of a non-IRAP picture. Reference pictures are addressed through reference indices. A reference index is an index into a reference picture list. When decoding an I slice, no reference picture list is used in decoding of the slice data. When decoding a P slice, only reference picture list 0 (that is, RefPicList[0]) is used in decoding of the slice data. When decoding a B slice, both reference picture list 0 and reference picture list 1 (that is, RefPicList[1]) are used in decoding of the slice data. At the beginning of the decoding process for each slice of a non-IRAP picture, the reference picture lists RefPicList[0] and RefPicList[1] are derived. The reference picture lists are used in marking of reference pictures or in decoding of the slice data. For an I slice of a non-IRAP picture that is not the first slice of the picture, RefPicList[0] and RefPicList[1] may be derived for bitstream conformance checking purposes, but their derivation is not necessary for decoding the current picture or pictures following the current picture in decoding order. For a P slice that is not the first slice of a picture, RefPicList[1] may be derived for bitstream conformance checking purposes, but its derivation is not necessary for decoding the current picture or pictures following the current picture in decoding order. The reference picture lists RefPicList[0] and RefPicList[1] are constructed as follows:

For each i equal to 0 or 1, the following applies: The first NumRefIdxActive[i] entries in RefPicList[i] are referred to as the active entries in RefPicList[i], and the other entries in RefPicList[i] are referred to as the inactive entries in RefPicList[i]. Each entry RefPicList[i][j], for j in the range of 0 to NumEntriesInList[i][RplsIdx[i]] - 1, inclusive, is referred to as an STRP entry if lt_ref_pic_flag[i][RplsIdx[i]][j] is equal to 0, and as an LTRP entry otherwise. It is possible that a particular picture is referred to by both an entry in RefPicList[0] and an entry in RefPicList[1]. It is also possible that a particular picture is referred to by more than one entry in RefPicList[0] or by more than one entry in RefPicList[1]. The active entries in RefPicList[0] and the active entries in RefPicList[1] collectively refer to all reference pictures that may be used for inter prediction of the current picture and of one or more pictures that follow the current picture in decoding order. The inactive entries in RefPicList[0] and the inactive entries in RefPicList[1] collectively refer to all reference pictures that are not used for inter prediction of the current picture but may be used for inter prediction of one or more pictures that follow the current picture in decoding order. There may be one or more entries in RefPicList[0] or RefPicList[1] that are equal to "no reference picture" because the corresponding pictures are not present in the DPB. Each inactive entry in RefPicList[0] or RefPicList[1] that is equal to "no reference picture" should be ignored. An unintentional picture loss should be inferred for each active entry in RefPicList[0] or RefPicList[1] that is equal to "no reference picture".

It is a requirement of bitstream conformance that the following constraints apply: For each i equal to 0 or 1, NumEntriesInList[i][RplsIdx[i]] shall not be less than NumRefIdxActive[i]. The picture referred to by each active entry in RefPicList[0] or RefPicList[1] shall be present in the DPB and shall have a TemporalId less than or equal to that of the current picture. Optionally, the following constraint may be further specified: The entry index of any inactive entry in RefPicList[0] or RefPicList[1] shall not be used as a reference index for decoding of the current picture. Optionally, the following constraint may be further specified: An inactive entry in RefPicList[0] or RefPicList[1] shall not refer to the same picture as any other entry in RefPicList[0] or RefPicList[1]. An STRP entry in RefPicList[0] or RefPicList[1] of a slice of a picture and an LTRP entry in RefPicList[0] or RefPicList[1] of the same slice or a different slice of the same picture shall not refer to the same picture. The current picture itself shall not be referred to by any entry in RefPicList[0] or RefPicList[1]. There shall be no LTRP entry in RefPicList[0] or RefPicList[1] for which the difference between the PicOrderCntVal of the current picture and the PicOrderCntVal of the picture referred to by the entry is greater than or equal to 2^24. Let setOfRefPics be the set of unique pictures referred to by all entries in RefPicList[0] and all entries in RefPicList[1]. The number of pictures in setOfRefPics shall be less than or equal to sps_max_dec_pic_buffering_minus1, and setOfRefPics shall be the same for all slices of a picture.
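
Several of the constraints above lend themselves to a mechanical check. The sketch below is a hedged illustration, not a conformance tool: Pic, Entry and check_ref_pic_lists are hypothetical names, the long-term bound is read as 2^24, taking the absolute POC difference is an assumption, and the per-picture constraints that compare different slices are left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Pic:
    poc: int          # PicOrderCntVal
    temporal_id: int  # TemporalId


@dataclass(frozen=True)
class Entry:
    pic: Optional[Pic]   # None models "no reference picture"
    is_long_term: bool   # True for an LTRP entry


def check_ref_pic_lists(lists: Sequence[Sequence[Entry]],
                        num_ref_idx_active: Sequence[int],
                        current: Pic,
                        sps_max_dec_pic_buffering_minus1: int) -> List[str]:
    """Return a description of every violated constraint (empty if none)."""
    problems: List[str] = []
    for i, entries in enumerate(lists):
        if len(entries) < num_ref_idx_active[i]:
            problems.append(f"RefPicList[{i}]: fewer entries than NumRefIdxActive")
        for j, entry in enumerate(entries):
            is_active = j < num_ref_idx_active[i]
            if entry.pic is None:
                if is_active:
                    problems.append(f"RefPicList[{i}][{j}]: active entry not in DPB")
                continue
            if is_active and entry.pic.temporal_id > current.temporal_id:
                problems.append(f"RefPicList[{i}][{j}]: TemporalId too high")
            if entry.pic == current:
                problems.append(f"RefPicList[{i}][{j}]: refers to current picture")
            # Assumption: the "difference" is taken as an absolute value.
            if entry.is_long_term and abs(current.poc - entry.pic.poc) >= 1 << 24:
                problems.append(f"RefPicList[{i}][{j}]: LTRP POC distance >= 2^24")
    set_of_ref_pics = {e.pic for entries in lists for e in entries
                       if e.pic is not None}
    if len(set_of_ref_pics) > sps_max_dec_pic_buffering_minus1:
        problems.append("setOfRefPics exceeds sps_max_dec_pic_buffering_minus1")
    return problems
```

Returning a list of messages rather than raising on the first failure makes it easier to report every violation found in a slice at once.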

Decoding process for reference picture marking.

This process is invoked once per picture, after decoding of a slice header and the decoding process for reference picture list construction for the slice, but prior to the decoding of the slice data. This process may result in one or more reference pictures in the DPB being marked as "unused for reference" or "used for long-term reference". A decoded picture in the DPB can be marked as "unused for reference", "used for short-term reference" or "used for long-term reference", but only one among these three at any given moment during the operation of the decoding process. Assigning one of these markings to a picture implicitly removes another of these markings when applicable. When a picture is referred to as being marked as "used for reference", this collectively refers to the picture being marked as "used for short-term reference" or "used for long-term reference" (but not both). When the current picture is an IRAP picture, all reference pictures currently in the DPB (if any) are marked as "unused for reference". STRPs are identified by their PicOrderCntVal values. LTRPs are identified by the Log2(MaxLtPicOrderCntLsb) LSBs of their PicOrderCntVal values. The following applies: For each LTRP entry in RefPicList[0] or RefPicList[1], when the referred picture is an STRP, the picture is marked as "used for long-term reference". Each reference picture in the DPB that is not referred to by any entry in RefPicList[0] or RefPicList[1] is marked as "unused for reference".
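
The marking rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: DpbPicture, Entry and mark_reference_pictures are hypothetical names, the reference picture lists are assumed to be already resolved to DPB picture objects, and an unresolved entry is modelled as None.

```python
from dataclasses import dataclass
from typing import List, Optional

UNUSED = "unused for reference"
SHORT_TERM = "used for short-term reference"
LONG_TERM = "used for long-term reference"


@dataclass
class DpbPicture:
    poc: int                   # PicOrderCntVal
    marking: str = SHORT_TERM  # exactly one marking at any time


@dataclass
class Entry:
    pic: Optional[DpbPicture]  # None models "no reference picture"
    is_long_term: bool         # True for an LTRP entry


def mark_reference_pictures(dpb: List[DpbPicture],
                            ref_pic_lists: List[List[Entry]],
                            current_is_irap: bool) -> None:
    """Update the marking of every picture in the DPB in place."""
    if current_is_irap:
        # An IRAP picture releases every reference picture in the DPB.
        for pic in dpb:
            pic.marking = UNUSED
        return
    referenced: List[DpbPicture] = []
    for entries in ref_pic_lists:
        for entry in entries:
            if entry.pic is None:
                continue
            referenced.append(entry.pic)
            # An STRP referred to by an LTRP entry becomes long-term.
            if entry.is_long_term and entry.pic.marking == SHORT_TERM:
                entry.pic.marking = LONG_TERM
    for pic in dpb:
        # Identity comparison: two pictures with equal fields are still distinct.
        if not any(pic is ref for ref in referenced):
            pic.marking = UNUSED
```

Because each picture carries a single marking field, assigning a new marking removes the previous one, which mirrors the implicit-removal rule in the text.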

A detailed description of a second embodiment of the present disclosure is provided. This section documents the second embodiment of the disclosure described above. The description is relative to the latest VVC WD. In this embodiment, one set of reference picture list structures is signaled in the SPS and is shared by reference picture list 0 and reference picture list 1.

Sequence parameter set RBSP syntax.

Picture parameter set RBSP syntax.

Slice header syntax.

Reference picture list structure syntax.

NAL unit header semantics are discussed.

log2_max_pic_order_cnt_lsb_minus4 specifies the value of the variable MaxPicOrderCntLsb that is used in the decoding process for picture order count as follows: MaxPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4). The value of log2_max_pic_order_cnt_lsb_minus4 shall be in the range of 0 to 12, inclusive. sps_max_dec_pic_buffering_minus1 plus 1 specifies the maximum required size of the decoded picture buffer for the CVS in units of picture storage buffers. The value of sps_max_dec_pic_buffering_minus1 shall be in the range of 0 to MaxDpbSize - 1, inclusive, where MaxDpbSize is as specified elsewhere. num_ref_pic_lists_in_sps specifies the number of ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structures included in the SPS. The value of num_ref_pic_lists_in_sps shall be in the range of 0 to 128, inclusive. A decoder should allocate memory for a total number of num_short_term_ref_pic_sets + 2 ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structures, since there may be two ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structures directly signaled in the slice headers of a current picture. long_term_ref_pics_flag equal to 0 specifies that no LTRP is used for inter prediction of any coded picture in the CVS. long_term_ref_pics_flag equal to 1 specifies that LTRPs may be used for inter prediction of one or more coded pictures in the CVS. additional_lt_poc_lsb specifies the value of the variable MaxLtPicOrderCntLsb that is used in the decoding process for reference picture lists as follows: MaxLtPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4 + additional_lt_poc_lsb). The value of additional_lt_poc_lsb shall be in the range of 0 to 32 - log2_max_pic_order_cnt_lsb_minus4 - 4, inclusive. When not present, the value of additional_lt_poc_lsb is inferred to be equal to 0.
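
As a worked illustration of the two power-of-two derivations above, the following Python sketch computes MaxPicOrderCntLsb and MaxLtPicOrderCntLsb and enforces the stated value ranges. The function names are hypothetical; only the formulas and ranges come from the text.

```python
def max_pic_order_cnt_lsb(log2_max_pic_order_cnt_lsb_minus4: int) -> int:
    """MaxPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4)."""
    if not 0 <= log2_max_pic_order_cnt_lsb_minus4 <= 12:
        raise ValueError("log2_max_pic_order_cnt_lsb_minus4 must be in 0..12")
    return 1 << (log2_max_pic_order_cnt_lsb_minus4 + 4)


def max_lt_pic_order_cnt_lsb(log2_max_pic_order_cnt_lsb_minus4: int,
                             additional_lt_poc_lsb: int = 0) -> int:
    """MaxLtPicOrderCntLsb, with additional_lt_poc_lsb inferred as 0 if absent."""
    upper = 32 - log2_max_pic_order_cnt_lsb_minus4 - 4
    if not 0 <= additional_lt_poc_lsb <= upper:
        raise ValueError(f"additional_lt_poc_lsb must be in 0..{upper}")
    return 1 << (log2_max_pic_order_cnt_lsb_minus4 + 4 + additional_lt_poc_lsb)
```

For example, with log2_max_pic_order_cnt_lsb_minus4 equal to 4 and additional_lt_poc_lsb equal to 2, short-term POC LSBs wrap at 256 while long-term POC LSBs wrap at 1024.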

Picture parameter set RBSP semantics are discussed.

Slice header semantics.

When present, the value of each of the slice header syntax elements slice_pic_parameter_set_id and slice_pic_order_cnt_lsb shall be the same in all slice headers of a coded picture. slice_type specifies the coding type of the slice according to Table 7-3.

When nal_unit_type is equal to IRAP_NUT, i.e., the picture is an IRAP picture, slice_type shall be equal to 2. ... slice_pic_order_cnt_lsb specifies the picture order count modulo MaxPicOrderCntLsb for the current picture. The length of the slice_pic_order_cnt_lsb syntax element is log2_max_pic_order_cnt_lsb_minus4 + 4 bits. The value of slice_pic_order_cnt_lsb shall be in the range of 0 to MaxPicOrderCntLsb - 1, inclusive. When slice_pic_order_cnt_lsb is not present, it is inferred to be equal to 0. ref_pic_list_sps_flag[i] equal to 1 specifies that reference picture list i of the current picture is derived based on one of the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structures in the active SPS. ref_pic_list_sps_flag[i] equal to 0 specifies that reference picture list i of the current picture is derived based on the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure that is directly included in the slice headers of the current picture. When num_ref_pic_lists_in_sps is equal to 0, the value of ref_pic_list_sps_flag[i] shall be equal to 0. ref_pic_list_idx[i] specifies the index, into the list of the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structures included in the active SPS, of the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure that is used for derivation of reference picture list i of the current picture. The syntax element ref_pic_list_idx[i] is represented by Ceil(Log2(num_ref_pic_lists_in_sps)) bits. When not present, the value of ref_pic_list_idx[i] is inferred to be equal to 0. The value of ref_pic_list_idx[i] shall be in the range of 0 to num_ref_pic_lists_in_sps - 1, inclusive. num_ref_idx_active_override_flag equal to 1 specifies that the syntax element num_ref_idx_active_minus1[0] is present for P and B slices and that the syntax element num_ref_idx_active_minus1[1] is present for B slices. num_ref_idx_active_override_flag equal to 0 specifies that the syntax elements num_ref_idx_active_minus1[0] and num_ref_idx_active_minus1[1] are not present.
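
The bit length Ceil(Log2(num_ref_pic_lists_in_sps)) of ref_pic_list_idx[i] can be computed with integer arithmetic only. The sketch below is illustrative; the function name is hypothetical, and it rejects a count of 0 because in that case ref_pic_list_sps_flag[i] is 0 and no index is used.

```python
def ref_pic_list_idx_bits(num_ref_pic_lists_in_sps: int) -> int:
    """Number of bits used to code ref_pic_list_idx[i]: Ceil(Log2(n))."""
    if not 1 <= num_ref_pic_lists_in_sps <= 128:
        raise ValueError("an index is only coded for 1..128 list structures")
    # (n - 1).bit_length() equals Ceil(Log2(n)) for every integer n >= 1.
    return (num_ref_pic_lists_in_sps - 1).bit_length()
```

With a single list structure in the SPS the index needs 0 bits, which is consistent with the inference that an absent ref_pic_list_idx[i] is equal to 0.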

num_ref_idx_active_minus1[i], when present, specifies the value of the variable NumRefIdxActive[i] as follows: NumRefIdxActive[i] = num_ref_idx_active_minus1[i] + 1. The value of num_ref_idx_active_minus1[i] shall be in the range of 0 to 14, inclusive. The value of NumRefIdxActive[i] - 1 specifies the maximum reference index for reference picture list i that may be used to decode the slice. When the value of NumRefIdxActive[i] is equal to 0, no reference index for reference picture list i may be used to decode the slice. For each i equal to 0 or 1, when the current slice is a B slice and num_ref_idx_active_override_flag is equal to 0, NumRefIdxActive[i] is inferred to be equal to num_ref_idx_default_active_minus1[i] + 1. When the current slice is a P slice and num_ref_idx_active_override_flag is equal to 0, NumRefIdxActive[0] is inferred to be equal to num_ref_idx_default_active_minus1[0] + 1. When the current slice is a P slice, NumRefIdxActive[1] is inferred to be equal to 0. When the current slice is an I slice, both NumRefIdxActive[0] and NumRefIdxActive[1] are inferred to be equal to 0.
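
The inference rules for NumRefIdxActive[i] can be summarized in a short Python sketch. It is an illustration under stated assumptions: the function name is hypothetical, and slice types are passed as the strings "I", "P" and "B" rather than as the numeric slice_type values of Table 7-3.

```python
from typing import List, Optional, Sequence


def derive_num_ref_idx_active(
        slice_type: str,
        num_ref_idx_active_override_flag: bool,
        num_ref_idx_active_minus1: Optional[Sequence[int]],
        num_ref_idx_default_active_minus1: Sequence[int]) -> List[int]:
    """Return [NumRefIdxActive[0], NumRefIdxActive[1]] for one slice."""
    active = [0, 0]  # I slices use no reference picture list
    if slice_type == "I":
        return active
    # P slices use list 0 only; B slices use both lists.
    used_lists = (0, 1) if slice_type == "B" else (0,)
    for i in used_lists:
        if num_ref_idx_active_override_flag:
            active[i] = num_ref_idx_active_minus1[i] + 1
        else:
            active[i] = num_ref_idx_default_active_minus1[i] + 1
    return active
```

For a P slice the second element always stays 0, matching the rule that NumRefIdxActive[1] is inferred to be 0 for P slices.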

Alternatively, for each i equal to 0 or 1, the following applies after the above: Let rplsIdx1 be set equal to ref_pic_list_sps_flag[i] ? ref_pic_list_idx[i] : num_ref_pic_lists_in_sps[i], and let numRpEntries[i] be equal to num_strp_entries[i][rplsIdx1] + num_ltrp_entries[i][rplsIdx1]. When NumRefIdxActive[i] is greater than numRpEntries[i], the value of NumRefIdxActive[i] is set equal to numRpEntries[i].
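
The alternative rule above amounts to clamping the active count to the number of entries in the selected list structure. A minimal sketch, using a hypothetical function name and taking the STRP and LTRP entry counts of the selected structure as plain integers:

```python
def clamp_num_ref_idx_active(num_ref_idx_active: int,
                             num_strp_entries: int,
                             num_ltrp_entries: int) -> int:
    """Limit NumRefIdxActive[i] to numRpEntries[i] of the selected structure."""
    num_rp_entries = num_strp_entries + num_ltrp_entries
    return min(num_ref_idx_active, num_rp_entries)
```

With this clamp in place, a list can never have more active entries than it has entries in total.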

The ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure may be present in an SPS or in a slice header. Depending on whether the syntax structure is included in a slice header or in an SPS, the following applies: If present in a slice header, the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure specifies a reference picture list of the current picture (the picture containing the slice). Otherwise (present in an SPS), the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure specifies a candidate reference picture list, and the term "the current picture" in the semantics specified in the remainder of this section refers to each picture that 1) has one or more slices containing ref_pic_list_idx[i] equal to an index into the list of the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structures included in the SPS, and 2) is in a CVS that has the SPS as the active SPS. num_strp_entries[rplsIdx] specifies the number of STRP entries in the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure. num_ltrp_entries[rplsIdx] specifies the number of LTRP entries in the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure. When not present, the value of num_ltrp_entries[rplsIdx] is inferred to be equal to 0.

The variable NumEntriesInList[rplsIdx] is derived as follows: NumEntriesInList[rplsIdx] = num_strp_entries[rplsIdx] + num_ltrp_entries[rplsIdx]. The value of NumEntriesInList[rplsIdx] shall be in the range of 0 to sps_max_dec_pic_buffering_minus1, inclusive. lt_ref_pic_flag[rplsIdx][i] equal to 1 specifies that the i-th entry in the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure is an LTRP entry. lt_ref_pic_flag[rplsIdx][i] equal to 0 specifies that the i-th entry in the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure is an STRP entry. When not present, the value of lt_ref_pic_flag[rplsIdx][i] is inferred to be equal to 0. It is a requirement of bitstream conformance that the sum of lt_ref_pic_flag[rplsIdx][i] for all values of i in the range of 0 to NumEntriesInList[rplsIdx] - 1, inclusive, shall be equal to num_ltrp_entries[rplsIdx]. delta_poc_st[rplsIdx][i], when the i-th entry is the first STRP entry in the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure, specifies the difference between the picture order count values of the current picture and the picture referred to by the i-th entry, or, when the i-th entry is an STRP entry but not the first STRP entry in the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure, specifies the difference between the picture order count values of the pictures referred to by the i-th entry and by the previous STRP entry in the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure. The value of delta_poc_st[rplsIdx][i] shall be in the range of 0 to 2^15 - 1, inclusive. poc_lsb_lt[rplsIdx][i] specifies the value of the picture order count modulo MaxLtPicOrderCntLsb of the picture referred to by the i-th entry in the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure. The length of the poc_lsb_lt[rplsIdx][i] syntax element is Log2(MaxLtPicOrderCntLsb) bits.
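
The range and consistency rules for a single ref_pic_list_struct can be checked as sketched below. This is a hedged illustration: the function name is hypothetical, the flags and deltas are passed as plain Python lists indexed by entry, and the delta_poc_st upper bound is read as 2^15 - 1.

```python
from typing import Sequence


def num_entries_in_list(num_strp_entries: int,
                        num_ltrp_entries: int,
                        lt_ref_pic_flag: Sequence[int],
                        delta_poc_st: Sequence[int],
                        sps_max_dec_pic_buffering_minus1: int) -> int:
    """Return NumEntriesInList for one structure, or raise ValueError."""
    total = num_strp_entries + num_ltrp_entries
    if not 0 <= total <= sps_max_dec_pic_buffering_minus1:
        raise ValueError("NumEntriesInList out of range")
    if len(lt_ref_pic_flag) != total or len(delta_poc_st) != total:
        raise ValueError("one flag and one delta slot are expected per entry")
    # The flags must mark exactly num_ltrp_entries entries as long-term.
    if sum(lt_ref_pic_flag) != num_ltrp_entries:
        raise ValueError("sum of lt_ref_pic_flag != num_ltrp_entries")
    for i, flag in enumerate(lt_ref_pic_flag):
        # delta_poc_st is only meaningful for STRP entries (flag equal to 0).
        if not flag and not 0 <= delta_poc_st[i] <= (1 << 15) - 1:
            raise ValueError(f"delta_poc_st[{i}] out of range")
    return total
```

Passing a delta slot for every entry, including LTRP entries where it is ignored, keeps the two lists index-aligned and is purely a convenience of this sketch.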

The general decoding process specified as part of the detailed description of the first embodiment of the present disclosure applies. The NAL unit decoding process is described. The NAL unit decoding process specified as part of the detailed description of the first embodiment of the present disclosure applies.

The slice decoding process is provided.

Decoding process for picture order count.

The decoding process for picture order count specified as part of the detailed description of the first embodiment of the present disclosure applies.

Decoding process for reference picture list construction.

This process is invoked at the beginning of the decoding process for each slice of a non-IRAP picture. Reference pictures are addressed through reference indices. A reference index is an index into a reference picture list. When decoding an I slice, no reference picture list is used in decoding of the slice data. When decoding a P slice, only reference picture list 0 (i.e., RefPicList[0]) is used in decoding of the slice data. When decoding a B slice, both reference picture list 0 and reference picture list 1 (i.e., RefPicList[1]) are used in decoding of the slice data. At the beginning of the decoding process for each slice of a non-IRAP picture, the reference picture lists RefPicList[0] and RefPicList[1] are derived. The reference picture lists are used in marking of reference pictures or in decoding of the slice data. For an I slice of a non-IRAP picture that is not the first slice of the picture, RefPicList[0] and RefPicList[1] may be derived for bitstream conformance checking purposes, but their derivation is not necessary for decoding the current picture or pictures following the current picture in decoding order. For a P slice that is not the first slice of a picture, RefPicList[1] may be derived for bitstream conformance checking purposes, but its derivation is not necessary for decoding the current picture or pictures following the current picture in decoding order. The reference picture lists RefPicList[0] and RefPicList[1] are constructed as follows:

For each i equal to 0 or 1, the following applies: The first NumRefIdxActive[i] entries in RefPicList[i] are referred to as the active entries in RefPicList[i], and the other entries in RefPicList[i] are referred to as the inactive entries in RefPicList[i]. Each entry RefPicList[i][j], for j in the range of 0 to NumEntriesInList[RplsIdx[i]] - 1, inclusive, is referred to as an STRP entry if lt_ref_pic_flag[RplsIdx[i]][j] is equal to 0, and as an LTRP entry otherwise. It is possible that a particular picture is referred to by both an entry in RefPicList[0] and an entry in RefPicList[1]. It is also possible that a particular picture is referred to by more than one entry in RefPicList[0] or by more than one entry in RefPicList[1]. The active entries in RefPicList[0] and the active entries in RefPicList[1] collectively refer to all reference pictures that may be used for inter prediction of the current picture and of one or more pictures that follow the current picture in decoding order. The inactive entries in RefPicList[0] and the inactive entries in RefPicList[1] collectively refer to all reference pictures that are not used for inter prediction of the current picture but may be used for inter prediction of one or more pictures that follow the current picture in decoding order. There may be one or more entries in RefPicList[0] or RefPicList[1] that are equal to "no reference picture" because the corresponding pictures are not present in the DPB. Each inactive entry in RefPicList[0] or RefPicList[1] that is equal to "no reference picture" should be ignored. An unintentional picture loss should be inferred for each active entry in RefPicList[0] or RefPicList[1] that is equal to "no reference picture".

It is a requirement of bitstream conformance that the following constraints apply: For each i equal to 0 or 1, NumEntriesInList[RplsIdx[i]] shall not be less than NumRefIdxActive[i]. The picture referred to by each active entry in RefPicList[0] or RefPicList[1] shall be present in the DPB and shall have a TemporalId less than or equal to that of the current picture. Optionally, the following constraint may be further specified: The entry index of any inactive entry in RefPicList[0] or RefPicList[1] shall not be used as a reference index for decoding of the current picture. Optionally, the following constraint may be further specified: An inactive entry in RefPicList[0] or RefPicList[1] shall not refer to the same picture as any other entry in RefPicList[0] or RefPicList[1]. An STRP entry in RefPicList[0] or RefPicList[1] of a slice of a picture and an LTRP entry in RefPicList[0] or RefPicList[1] of the same slice or a different slice of the same picture shall not refer to the same picture. The current picture itself shall not be referred to by any entry in RefPicList[0] or RefPicList[1]. There shall be no LTRP entry in RefPicList[0] or RefPicList[1] for which the difference between the PicOrderCntVal of the current picture and the PicOrderCntVal of the picture referred to by the entry is greater than or equal to 2^24. Let setOfRefPics be the set of unique pictures referred to by all entries in RefPicList[0] and all entries in RefPicList[1]. The number of pictures in setOfRefPics shall be less than or equal to sps_max_dec_pic_buffering_minus1, and setOfRefPics shall be the same for all slices of a picture.

The decoding process for reference picture marking is discussed.

This process is invoked once per picture, after decoding of a slice header and the decoding process for reference picture list construction for the slice, but prior to the decoding of the slice data. This process may result in one or more reference pictures in the DPB being marked as "unused for reference" or "used for long-term reference". A decoded picture in the DPB can be marked as "unused for reference", "used for short-term reference" or "used for long-term reference", but only one among these three at any given moment during the operation of the decoding process. Assigning one of these markings to a picture implicitly removes another of these markings when applicable. When a picture is referred to as being marked as "used for reference", this collectively refers to the picture being marked as "used for short-term reference" or "used for long-term reference" (but not both). When the current picture is an IRAP picture, all reference pictures currently in the DPB (if any) are marked as "unused for reference". STRPs are identified by their PicOrderCntVal values. LTRPs are identified by the Log2(MaxLtPicOrderCntLsb) LSBs of their PicOrderCntVal values.

The following applies: for each LTRP entry in RefPicList[0] or RefPicList[1], when the referenced picture is an STRP, the picture is marked as "used for long-term reference". Each reference picture in the DPB that is not referred to by any entry in RefPicList[0] or RefPicList[1] is marked as "not used for reference".
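The marking rules above can be sketched as follows. This is a minimal illustration, not the normative process: the `Picture` and `Entry` classes, the marking constants, and the function name `mark_reference_pictures` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical marking labels, mirroring the three states named in the text.
UNUSED = "not used for reference"
SHORT_TERM = "used for short-term reference"
LONG_TERM = "used for long-term reference"


@dataclass
class Picture:
    poc: int
    marking: str = SHORT_TERM


@dataclass
class Entry:
    picture: Picture   # picture referred to by this list entry
    is_ltrp: bool      # True for an LTRP entry, False for an STRP entry


def mark_reference_pictures(dpb: List[Picture],
                            ref_pic_lists: List[List[Entry]],
                            current_is_irap: bool = False) -> None:
    """Apply the marking rules once per picture (illustrative sketch)."""
    if current_is_irap:
        # All reference pictures currently in the DPB become unused.
        for pic in dpb:
            pic.marking = UNUSED

    referenced = set()
    for rpl in ref_pic_lists:
        for entry in rpl:
            referenced.add(id(entry.picture))
            # An STRP referred to by an LTRP entry becomes long-term.
            if entry.is_ltrp and entry.picture.marking == SHORT_TERM:
                entry.picture.marking = LONG_TERM

    # Any reference picture not referred to by either list becomes unused.
    for pic in dpb:
        if pic.marking != UNUSED and id(pic) not in referenced:
            pic.marking = UNUSED
```

Because each picture carries a single `marking` field, assigning a new marking implicitly removes the previous one, which matches the "only one of these three" rule.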

FIG. 5 is an embodiment of a method 500 of decoding a coded video bitstream, implemented by a video decoder (e.g., video decoder 30). The method 500 may be performed after the coded bitstream has been received directly or indirectly from a video encoder (e.g., video encoder 20). The method 500 may be performed to improve the decoding process (e.g., to make the decoding process more efficient and faster than a conventional decoding process). As a practical matter, therefore, the performance of the codec can be improved, which leads to a better user experience.

At block 502, a parameter set represented in the coded video bitstream is parsed. In one embodiment, the parameter set comprises a set of syntax elements that includes a set of reference picture list structures.

At block 504, a slice header of a current slice represented in the coded video bitstream is parsed. In one embodiment, the slice header comprises an index of a reference picture list structure among the set of reference picture list structures in the parameter set.

At block 506, a reference picture list of the current slice is derived. In one embodiment, the reference picture list is derived based on the set of syntax elements in the parameter set and the index of the reference picture list structure. In one embodiment, the order of the entries in the reference picture list structure is the same as the order of the corresponding reference pictures in the reference picture list. In one embodiment, the order is from zero to an indicated value. In one embodiment, the indicated value is from zero to the value indicated by sps_max_dec_pic_buffering_minus1.

At block 508, one or more reconstructed blocks of the current slice are obtained. In one embodiment, the one or more reconstructed blocks of the current slice are reconstructed based on the reference picture list. After the reconstruction process, the video decoder can output a video or an image. In one embodiment, the video or image may be displayed on a display of an electronic device (e.g., a smart phone, tablet, or laptop).
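Blocks 502 to 506 can be illustrated with a short sketch. It assumes, purely for illustration, that each reference picture list structure has already been parsed into a list of POC deltas relative to the current picture and that the DPB is modeled as a dictionary keyed by POC; the function name `derive_ref_pic_list` and both data layouts are hypothetical.

```python
from typing import Dict, List, Optional


def derive_ref_pic_list(sps_rpl_structs: List[List[int]],
                        rpl_idx: int,
                        dpb_by_poc: Dict[int, str],
                        current_poc: int) -> List[Optional[str]]:
    """Select a structure by the slice-header index and map it to pictures.

    sps_rpl_structs: candidate structures parsed from the parameter set,
                     each modeled here as a list of POC deltas (assumption).
    rpl_idx:         index parsed from the slice header.
    """
    struct = sps_rpl_structs[rpl_idx]
    # Entry order in the structure is preserved in the derived list.
    return [dpb_by_poc.get(current_poc - delta) for delta in struct]
```

An entry whose picture is missing from the DPB maps to `None` in this sketch; how a real decoder handles that case is outside the scope of the example.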

In one embodiment, the reference picture list is designated RefPicList[0] or RefPicList[1]. In one embodiment, the reference picture list contains a list of reference pictures used for inter prediction. In one embodiment, the inter prediction is for a P slice or for a B slice. In one embodiment, the set of syntax elements from the parameter set is disposed in a raw byte sequence payload (RBSP) of a network abstraction layer (NAL) unit.

A summary of alternative embodiments based on the first and second embodiments is provided.

This section provides a brief summary of other alternative embodiments of the disclosure. The summary is written relative to the description of the first embodiment. However, the basic concepts of the following alternative embodiments are also applicable to implementations built on top of the second embodiment. Such implementations follow the same spirit as the way the aspects are implemented on top of the first embodiment.

Semantics of the delta POC of short-term reference picture entries.

In one alternative embodiment of the disclosure, the semantics of the syntax element that specifies the delta POC of the i-th entry in the reference picture list structure ref_pic_list_struct() is defined as the POC difference between the current picture and the reference picture associated with the i-th entry. Some of the descriptions used herein are relative to the current draft standard (e.g., the VVC working draft), where only the delta is shown or described. Removed text is indicated by underlining or strikethrough, and added text is highlighted.

The semantics of delta_poc_st[listIdx][rplsIdx][i] are defined as follows: delta_poc_st[listIdx][rplsIdx][i] specifies the difference between the picture order count values of the current picture and the picture referred to by the i-th entry. The value of delta_poc_st[listIdx][rplsIdx][i] shall be in the range of -2¹⁵ to 2¹⁵ - 1, inclusive.
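Under this alternative, every short-term entry is resolved directly against the current picture. The sketch below assumes the sign convention "reference POC = current POC minus delta", which the text does not spell out; the function name `strp_pocs_direct` is hypothetical.

```python
from typing import List

DELTA_POC_ST_MIN = -(1 << 15)      # -2^15
DELTA_POC_ST_MAX = (1 << 15) - 1   # 2^15 - 1


def strp_pocs_direct(current_poc: int, delta_poc_st: List[int]) -> List[int]:
    """Resolve each STRP entry relative to the current picture (assumed sign)."""
    pocs = []
    for delta in delta_poc_st:
        if not DELTA_POC_ST_MIN <= delta <= DELTA_POC_ST_MAX:
            raise ValueError(f"delta_poc_st out of range: {delta}")
        pocs.append(current_poc - delta)
    return pocs
```

With a current POC of 16 and deltas 1, 2, and -4, the entries resolve to POC 15, 14, and 20 independently of one another.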

The equations in the reference picture list construction process need to be updated. The reference picture lists RefPicList[0] and RefPicList[1] are constructed as follows.

Signaling of long-term reference picture entries.

In one alternative embodiment of the disclosure, long-term reference picture entries are not signaled in the same reference picture list structure that contains the short-term reference picture entries. The long-term reference picture entries are signaled in a separate structure, and for each entry in that structure there is a syntax element that describes the intended position of the long-term reference picture entry, from which the corresponding entry index in the final reference picture list is derived.

Slice header syntax.

Reference picture list structure syntax.

Long-term reference picture list structure syntax.

num_ref_pic_lists_lt_in_sps specifies the number of ref_pic_list_lt_struct(ltRplsIdx) syntax structures included in the SPS. The value of num_ref_pic_lists_lt_in_sps shall be in the range of 0 to 64, inclusive. When not present, the value of num_ref_pic_lists_lt_in_sps is inferred to be equal to 0.

Slice header semantics.

ref_pic_list_lt_idx[i] specifies the index, into the list of the ref_pic_list_lt_struct(ltRplsIdx) syntax structures included in the active SPS, of the structure that is used for derivation of reference picture list i of the current picture. The syntax element ref_pic_list_lt_idx[i] is represented by Ceil(Log2(num_ref_pic_lists_lt_in_sps)) bits. The value of ref_pic_list_lt_idx shall be in the range of 0 to num_ref_pic_lists_lt_in_sps - 1, inclusive.
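The bit length Ceil(Log2(num_ref_pic_lists_lt_in_sps)) can be computed with integer arithmetic only. The helper below is an illustrative sketch; the name `lt_idx_bit_length` is hypothetical, and it assumes the index is coded only when at least one structure is present in the SPS.

```python
def lt_idx_bit_length(num_ref_pic_lists_lt_in_sps: int) -> int:
    """Return Ceil(Log2(n)) for 1 <= n <= 64 without floating point."""
    n = num_ref_pic_lists_lt_in_sps
    if not 1 <= n <= 64:
        raise ValueError("expected a count in the range 1 to 64")
    # For n >= 1, (n - 1).bit_length() equals Ceil(Log2(n)).
    return (n - 1).bit_length()
```

A single candidate structure therefore needs zero index bits, and the maximum of 64 structures needs six.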

The ref_pic_list_struct(listIdx, rplsIdx) syntax structure may be present in an SPS or in a slice header. Depending on whether the syntax structure is included in a slice header or an SPS, the following applies: if present in a slice header, the ref_pic_list_struct(listIdx, rplsIdx) syntax structure specifies short-term reference picture list listIdx of the current picture (the picture containing the slice). Otherwise (present in an SPS), the ref_pic_list_struct(listIdx, rplsIdx) syntax structure specifies a candidate for short-term reference picture list listIdx, and the term "the current picture" in the semantics specified in the remainder of this section refers to each picture that 1) has one or more slices containing ref_pic_list_idx[listIdx] equal to an index into the list of the ref_pic_list_struct(listIdx, rplsIdx) syntax structures included in the SPS, and 2) is in a CVS that has the SPS as the active SPS. num_strp_entries[listIdx][rplsIdx] specifies the number of STRP entries in the ref_pic_list_struct(listIdx, rplsIdx) syntax structure.

num_ltrp_entries[listIdx][rplsIdx] specifies the number of LTRP entries in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure. When not present, the value of num_ltrp_entries[listIdx][rplsIdx] is inferred to be equal to 0.

The variable NumEntriesInList[listIdx][rplsIdx] is derived as follows:

NumEntriesInList[listIdx][rplsIdx] = num_strp_entries[listIdx][rplsIdx] + num_ltrp_entries[listIdx][rplsIdx] (7-34)

The value of NumEntriesInList[listIdx][rplsIdx] shall be in the range of 0 to sps_max_dec_pic_buffering_minus1, inclusive.

lt_ref_pic_flag[listIdx][rplsIdx][i] equal to 1 specifies that the i-th entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure is an LTRP entry. lt_ref_pic_flag[listIdx][rplsIdx][i] equal to 0 specifies that the i-th entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure is an STRP entry. When not present, the value of lt_ref_pic_flag[listIdx][rplsIdx][i] is inferred to be equal to 0.

It is a requirement of bitstream conformance that the sum of lt_ref_pic_flag[listIdx][rplsIdx][i] for all values of i in the range of 0 to NumEntriesInList[listIdx][rplsIdx] - 1, inclusive, shall be equal to num_ltrp_entries[listIdx][rplsIdx].
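The entry-count derivation and the conformance requirement on the flags can be checked together. The following is a sketch under the assumption that the parsed values for one (listIdx, rplsIdx) pair are available as plain integers and a list of flags; the function name `check_rpl_struct_counts` is hypothetical.

```python
from typing import List


def check_rpl_struct_counts(num_strp_entries: int,
                            num_ltrp_entries: int,
                            lt_ref_pic_flag: List[int],
                            sps_max_dec_pic_buffering_minus1: int) -> int:
    """Derive the total entry count and verify the stated constraints."""
    num_entries = num_strp_entries + num_ltrp_entries
    if not 0 <= num_entries <= sps_max_dec_pic_buffering_minus1:
        raise ValueError("total entry count out of range")
    if len(lt_ref_pic_flag) != num_entries:
        raise ValueError("one flag is expected per entry")
    # The flags equal to 1 must account for exactly the LTRP entries.
    if sum(lt_ref_pic_flag) != num_ltrp_entries:
        raise ValueError("flag sum does not match num_ltrp_entries")
    return num_entries
```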

delta_poc_st[listIdx][rplsIdx][i], when the i-th entry is the first STRP entry in the ref_pic_list_struct(listIdx, rplsIdx) syntax structure, specifies the difference between the picture order count values of the current picture and the picture referred to by the i-th entry, or, when the i-th entry is an STRP entry but not the first STRP entry in the ref_pic_list_struct(rplsIdx, ltrpFlag) syntax structure, specifies the difference between the picture order count values of the picture referred to by the i-th entry and the picture referred to by the previous STRP entry in the ref_pic_list_struct(listIdx, rplsIdx) syntax structure. The value of delta_poc_st[listIdx][rplsIdx][i] shall be in the range of -2¹⁵ to 2¹⁵ - 1, inclusive.
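With these semantics each delta is chained to the previous STRP entry rather than to the current picture. The sketch below assumes the sign convention "POC = base minus delta", which the text does not state explicitly; the function name `strp_pocs_chained` is hypothetical.

```python
from typing import List


def strp_pocs_chained(current_poc: int, delta_poc_st: List[int]) -> List[int]:
    """Resolve STRP entries where each delta is relative to the previous STRP."""
    pocs = []
    base = current_poc  # the first STRP entry is relative to the current picture
    for delta in delta_poc_st:
        if not -(1 << 15) <= delta <= (1 << 15) - 1:
            raise ValueError(f"delta_poc_st out of range: {delta}")
        base -= delta    # later entries are relative to the previous STRP
        pocs.append(base)
    return pocs
```

For a current POC of 16, the chained deltas 1, 1, and -6 resolve to POC 15, 14, and 20.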

poc_lsb_lt[listIdx][rplsIdx][i] specifies the value of the picture order count modulo MaxLtPicOrderCntLsb of the picture referred to by the i-th entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure. The length of the poc_lsb_lt[listIdx][rplsIdx][i] syntax element is Log2(MaxLtPicOrderCntLsb) bits.

Long-term reference picture list structure semantics.

The ref_pic_list_lt_struct(ltRplsIdx) syntax structure may be present in an SPS or in a slice header. Depending on whether the syntax structure is included in a slice header or an SPS, the following applies: if present in a slice header, the ref_pic_list_lt_struct(ltRplsIdx) syntax structure specifies the long-term reference picture list of the current picture (the picture containing the slice). Otherwise (present in an SPS), the ref_pic_list_lt_struct(ltRplsIdx) syntax structure specifies a candidate for the long-term reference picture list, and the term "the current picture" in the semantics specified in the remainder of this section refers to each picture that 1) has one or more slices containing ref_pic_list_lt_idx[i] equal to an index into the list of the ref_pic_list_lt_struct(ltRplsIdx) syntax structures included in the SPS, and 2) is in a CVS that has the SPS as the active SPS.

num_ltrp_entries[ltRplsIdx] specifies the number of LTRP entries in the ref_pic_list_lt_struct(ltRplsIdx) syntax structure. poc_lsb_lt[rplsIdx][i] specifies the value of the picture order count modulo MaxLtPicOrderCntLsb of the picture referred to by the i-th entry in the ref_pic_list_lt_struct(rplsIdx) syntax structure. The length of the poc_lsb_lt[rplsIdx][i] syntax element is Log2(MaxLtPicOrderCntLsb) bits. lt_pos_idx[rplsIdx][i] specifies the index, in the reference picture list after reference picture list construction, of the i-th entry in the ref_pic_list_lt_struct(rplsIdx) syntax structure. The length of the lt_pos_idx[rplsIdx][i] syntax element is Log2(sps_max_dec_pic_buffering_minus1 + 1) bits. When num_ltrp_entries[ltRplsIdx] is greater than 1, poc_lsb_lt[rplsIdx][i] and lt_pos_idx[rplsIdx][i] shall be in descending order of lt_pos_idx[rplsIdx][i] values.
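One way to read lt_pos_idx is as the final position of each long-term entry, with the short-term entries filling the remaining positions in their signaled order. The sketch below implements that reading only; it is an assumption, since the construction equations are not reproduced in this section, and the function name `place_ltrp_entries` and the tuple layout are hypothetical.

```python
from typing import List, Sequence, Tuple

_EMPTY = object()  # sentinel for a slot that no LTRP entry has claimed


def place_ltrp_entries(strp_entries: Sequence[str],
                       ltrp_entries: Sequence[Tuple[str, int]]) -> List[str]:
    """Merge STRP and LTRP entries; each LTRP carries (picture, lt_pos_idx)."""
    total = len(strp_entries) + len(ltrp_entries)
    final = [_EMPTY] * total
    for picture, lt_pos_idx in ltrp_entries:
        if not 0 <= lt_pos_idx < total or final[lt_pos_idx] is not _EMPTY:
            raise ValueError(f"invalid or duplicate lt_pos_idx: {lt_pos_idx}")
        final[lt_pos_idx] = picture
    # Short-term entries keep their relative order in the free slots.
    remaining = iter(strp_entries)
    return [slot if slot is not _EMPTY else next(remaining) for slot in final]
```

For example, three short-term entries and two long-term entries at positions 3 and 1 (signaled in descending position order) yield a five-entry list with the long-term pictures at indices 1 and 3.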

The decoding process is described.

Signaling of the number of short-term reference picture entries is discussed.

In one alternative embodiment of the disclosure, the syntax element that specifies the number of entries associated with short-term reference pictures in the reference picture list structure ref_pic_list_struct() is defined as num_strp_entries_minus1[listIdx][rplsIdx] instead of num_strp_entries[listIdx][rplsIdx]. The change has two effects on the signaling of reference picture lists. First, it may save bits when signaling the number of entries associated with short-term reference pictures in the reference picture list structure, because the element is coded using ue(v). Second, it implicitly imposes the constraint that each reference picture list contains at least one short-term reference picture. To accommodate this idea, some changes relative to the first embodiment are needed.

For reference picture list signaling in slice headers, only the reference picture lists required by the slice type are signaled, i.e., one reference picture list (reference picture list 0) for I or P slices and two reference picture lists (both reference picture list 0 and reference picture list 1) for B slices. The slice header syntax is changed as follows:

By applying the above change to the slice header (i.e., reference picture list 0 for I or P slices; reference picture lists 0 and 1 for B slices), the scheme avoids the problem that arises when a P slice has only one short-term reference picture. However, a duplicated short-term reference picture cannot then be signaled in both reference picture list 0 and reference picture list 1, where the entry in reference picture list 1 would be an inactive entry because the number of active entries in reference picture list 1 must be equal to 0.

The semantics of num_strp_entries_minus1[listIdx][rplsIdx] are changed as follows: num_strp_entries_minus1[listIdx][rplsIdx] plus 1 specifies the number of STRP entries in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure. The variable NumEntriesInList[listIdx][rplsIdx] is derived as follows: NumEntriesInList[listIdx][rplsIdx] = num_strp_entries_minus1[listIdx][rplsIdx] + 1 + num_ltrp_entries[listIdx][rplsIdx]. The value of NumEntriesInList[listIdx][rplsIdx] shall be in the range of 1 to sps_max_dec_pic_buffering_minus1, inclusive.
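The bit saving comes from the length of the unsigned Exp-Golomb code used by ue(v), which grows with the coded value. The sketch below computes that length and the modified entry count; the function names `ue_bits` and `num_entries_in_list` are hypothetical, and the range check mirrors the lower bound of 1 stated above.

```python
def ue_bits(value: int) -> int:
    """Length in bits of the unsigned Exp-Golomb (ue(v)) code for value."""
    if value < 0:
        raise ValueError("ue(v) codes non-negative integers only")
    # The code has floor(log2(value + 1)) leading zeros, then the binary
    # representation of value + 1.
    return 2 * ((value + 1).bit_length() - 1) + 1


def num_entries_in_list(num_strp_entries_minus1: int,
                        num_ltrp_entries: int,
                        sps_max_dec_pic_buffering_minus1: int) -> int:
    """Total entry count when the STRP count is coded as minus1."""
    total = num_strp_entries_minus1 + 1 + num_ltrp_entries
    if not 1 <= total <= sps_max_dec_pic_buffering_minus1:
        raise ValueError("total entry count out of range")
    return total
```

Signaling a list with a single short-term entry costs `ue_bits(0)`, i.e. one bit, with the minus1 form, compared with `ue_bits(1)`, i.e. three bits, when the count itself is coded.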

Allowing inclusion of the current picture in reference picture lists.

In one alternative embodiment of the disclosure, the current picture is allowed to be included in its reference picture lists. To support this feature, no syntax or semantics changes are needed relative to the descriptions of the first and second embodiments. However, the bitstream conformance constraints described in the decoding process for reference picture list construction need to be modified as follows. It is a requirement of bitstream conformance that the following constraints apply: For each i equal to 0 or 1, NumEntriesInList[i][RplsIdx[i]] shall not be less than NumRefIdxActive[i]. The picture referred to by each active entry in RefPicList[0] or RefPicList[1] shall be present in the DPB and shall have a TemporalId less than or equal to that of the current picture. Optionally, the following constraint may be further specified: the entry index of any inactive entry in RefPicList[0] or RefPicList[1] shall not be used as a reference index for decoding of the current picture. Optionally, the following constraint may be further specified: an inactive entry in RefPicList[0] or RefPicList[1] shall not refer to the same picture as any other entry in RefPicList[0] or RefPicList[1]. An STRP entry in RefPicList[0] or RefPicList[1] of a slice of a picture and an LTRP entry in RefPicList[0] or RefPicList[1] of the same slice or a different slice of the same picture shall not refer to the same picture. The constraint that the current picture itself shall not be referred to by any entry in RefPicList[0] or RefPicList[1] is removed. When the current picture is referred to by an entry in RefPicList[i], for i equal to 0 or 1, the entry index shall be less than NumRefIdxActive[i]. There shall be no LTRP entry in RefPicList[0] or RefPicList[1] for which the difference between the PicOrderCntVal of the current picture and the PicOrderCntVal of the picture referred to by the entry is greater than or equal to 2²⁴. Let setOfRefPics be the set of unique pictures referred to by all entries in RefPicList[0] and all entries in RefPicList[1]. If the current picture is not included in setOfRefPics, the number of pictures in setOfRefPics shall be less than or equal to sps_max_dec_pic_buffering_minus1; otherwise, the number of pictures in setOfRefPics shall be less than or equal to sps_max_dec_pic_buffering_minus1 + 1. setOfRefPics shall be the same for all slices of a picture.
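The size constraint on setOfRefPics, including the extra slot allowed when the current picture refers to itself, can be sketched as follows. Pictures are identified by POC here purely for illustration, and the function name `set_of_ref_pics_ok` is hypothetical.

```python
from typing import Iterable


def set_of_ref_pics_ok(ref_pic_list0: Iterable[int],
                       ref_pic_list1: Iterable[int],
                       current_poc: int,
                       sps_max_dec_pic_buffering_minus1: int) -> bool:
    """Check the setOfRefPics size limit for one slice (pictures keyed by POC)."""
    set_of_ref_pics = set(ref_pic_list0) | set(ref_pic_list1)
    limit = sps_max_dec_pic_buffering_minus1
    if current_poc in set_of_ref_pics:
        # One additional picture is allowed when the current picture is listed.
        limit += 1
    return len(set_of_ref_pics) <= limit
```

The check that setOfRefPics is identical across all slices of a picture would compare the sets built for each slice and is omitted from this sketch.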

Using different POC LSB bits for LTRP entries in reference picture lists.

In one alternative embodiment of the disclosure, the number of bits used to identify long-term reference pictures in the reference picture list structure is allowed to differ between reference picture list 0 and reference picture list 1. To support this feature, the following changes are needed:

additional_lt_poc_lsb[i] specifies the value of the variable MaxLtPicOrderCntLsb[i] that is used in the decoding process for the reference picture list with listIdx equal to i as follows: MaxLtPicOrderCntLsb[i] = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4 + additional_lt_poc_lsb[i]). The value of additional_lt_poc_lsb[i] shall be in the range of 0 to 32 - log2_max_pic_order_cnt_lsb_minus4 - 4, inclusive. When not present, the value of additional_lt_poc_lsb[i] is inferred to be equal to 0.
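The per-list derivation of MaxLtPicOrderCntLsb[i] is a single shift once the range of additional_lt_poc_lsb[i] has been checked. The function name `max_lt_pic_order_cnt_lsb` below is hypothetical; the variable names follow the syntax elements in the text.

```python
def max_lt_pic_order_cnt_lsb(log2_max_pic_order_cnt_lsb_minus4: int,
                             additional_lt_poc_lsb: int = 0) -> int:
    """Derive MaxLtPicOrderCntLsb for one reference picture list."""
    upper = 32 - log2_max_pic_order_cnt_lsb_minus4 - 4
    if not 0 <= additional_lt_poc_lsb <= upper:
        raise ValueError("additional_lt_poc_lsb out of range")
    exponent = log2_max_pic_order_cnt_lsb_minus4 + 4 + additional_lt_poc_lsb
    return 1 << exponent  # 2 raised to the exponent
```

With log2_max_pic_order_cnt_lsb_minus4 equal to 4, list 0 could use 256 (no additional bits) while list 1 uses 1024 (two additional bits), so poc_lsb_lt would occupy 8 and 10 bits respectively.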

poc_lsb_lt[listIdx][rplsIdx][i] specifies the value of the picture order count modulo MaxLtPicOrderCntLsb[listIdx] of the picture referred to by the i-th entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure. The length of the poc_lsb_lt[listIdx][rplsIdx][i] syntax element is Log2(MaxLtPicOrderCntLsb[listIdx]) bits.

The reference picture lists RefPicList[0] and RefPicList[1] are constructed as follows.

Using the same ref_pic_list_sps_flag for reference picture lists 0 and 1.

In one alternative embodiment of the disclosure, instead of using two flags to indicate whether reference picture list 0 and reference picture list 1 are derived based on ref_pic_list_struct() syntax structures in the active SPS, a single flag is used for both reference picture lists. In this alternative, both reference picture lists are either derived based on ref_pic_list_struct() in the active SPS or derived based on ref_pic_list_struct() syntax structures that are directly included in the slice headers of the current picture. To support this feature, the following changes are needed:

ref_pic_list_sps_flag[i] equal to 1 specifies that reference picture list i of the current picture is derived based on one of the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structures with listIdx equal to i in the active SPS. ref_pic_list_sps_flag[i] equal to 0 specifies that reference picture list i of the current picture is derived based on the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure that is directly included in the slice headers of the current picture. When num_ref_pic_lists_in_sps[0] or num_ref_pic_lists_in_sps[1] is equal to 0, the value of ref_pic_list_sps_flag[i] shall be equal to 0.

The reference picture lists RefPicList[0] and RefPicList[1] are constructed as follows.

Signaling of delta POC most significant bits (MSBs) for long-term reference picture entries.

In one alternative embodiment of the disclosure, instead of using additional bits to represent the POC LSBs of long-term reference picture entries in ref_pic_list_struct(), a POC MSB cycle is signaled to differentiate long-term reference pictures. When signaled, the POC MSB cycle information is signaled for each entry in ref_pic_list_struct() that refers to a long-term reference picture. The ref_pic_list_struct() syntax structure is not signaled in the SPS but only in slice headers. To support this feature, the following changes are needed:

The ref_pic_list_struct(listIdx, ltrpFlag) syntax structure may be present in a slice header. When present in a slice header, the ref_pic_list_struct(listIdx, ltrpFlag) syntax structure specifies reference picture list listIdx of the current picture (the picture containing the slice). num_strp_entries[listIdx][rplsIdx] specifies the number of STRP entries in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure. num_ltrp_entries[listIdx][rplsIdx] specifies the number of LTRP entries in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure. When not present, the value of num_ltrp_entries[listIdx][rplsIdx] is inferred to be equal to 0.

The variable NumEntriesInList[listIdx][rplsIdx] is derived as follows:

The value of NumRefPicEntries[listIdx][rplsIdx] shall be in the range of 0 to sps_max_dec_pic_buffering_minus1, inclusive.

lt_ref_pic_flag[listIdx][rplsIdx][i] equal to 1 specifies that the i-th entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure is an LTRP entry. lt_ref_pic_flag[listIdx][rplsIdx][i] equal to 0 specifies that the i-th entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure is an STRP entry. When not present, the value of lt_ref_pic_flag[listIdx][rplsIdx][i] is inferred to be equal to 0. It is a requirement of bitstream conformance that the sum of lt_ref_pic_flag[listIdx][rplsIdx][i] for all values of i in the range of 0 to NumRefPicEntries[listIdx][rplsIdx] - 1, inclusive, shall be equal to num_ltrp_entries[listIdx][rplsIdx].

delta_poc_st[listIdx][rplsIdx][i], when the i-th entry is the first STRP entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure, specifies the difference between the picture order count values of the current picture and the picture referred to by the i-th entry; when the i-th entry is an STRP entry but not the first STRP entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure, it specifies the difference between the picture order count values of the pictures referred to by the i-th entry and by the previous STRP entry in that syntax structure. The value of delta_poc_st[listIdx][rplsIdx][i] shall be in the range of -2^15 to 2^15 - 1, inclusive.

poc_lsb_lt[listIdx][rplsIdx][i] specifies the value of the picture order count modulo MaxLtPicOrderCntLsb of the picture referred to by the i-th entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure. The length of the poc_lsb_lt[listIdx][rplsIdx][i] syntax element is Log2(MaxLtPicOrderCntLsb) bits.

delta_poc_msb_present_flag[listIdx][i] equal to 1 specifies that delta_poc_msb_cycle_lt[listIdx][i] is present. delta_poc_msb_present_flag[listIdx][i] equal to 0 specifies that delta_poc_msb_cycle_lt[listIdx][i] is not present.
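The chained definition of delta_poc_st can be illustrated with a short sketch. It assumes that each difference is taken as "reference point minus referenced picture", so that the first STRP entry is offset from the current picture and each later STRP entry from the previous one; the text does not fix the sign convention, and the function name strp_poc_values is hypothetical.

```python
def strp_poc_values(curr_poc, delta_poc_st):
    """Recover the POC of each STRP entry from chained delta_poc_st values.

    Assumption (not stated in the text): POC(entry) = POC(base) - delta,
    where base is the current picture for the first STRP entry and the
    previous STRP entry for every later one.
    """
    pocs = []
    base = curr_poc
    for delta in delta_poc_st:
        # Range constraint from the semantics: -2^15 .. 2^15 - 1, inclusive.
        if not -(1 << 15) <= delta <= (1 << 15) - 1:
            raise ValueError("delta_poc_st out of range")
        base = base - delta
        pocs.append(base)
    return pocs


print(strp_poc_values(16, [1, 2, -5]))  # prints [15, 13, 18]
```

Coding each entry relative to the previous STRP entry keeps the signalled values small when the referenced pictures are close together in output order.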

When num_ltrp_entries[listIdx] is greater than 0 and, at the time this slice header is decoded, there is more than one reference picture in the DPB for which PicOrderCntVal modulo MaxPicOrderCntLsb is equal to poc_lsb_lt[listIdx][i], delta_poc_msb_present_flag[listIdx][i] shall be equal to 1. delta_poc_msb_cycle_lt[listIdx][i] is used to determine the value of the most significant bits of the picture order count value of the i-th entry in the ref_pic_list_struct(listIdx, ltrpFlag) syntax structure. When delta_poc_msb_cycle_lt[listIdx][i] is not present, it is inferred to be equal to 0.

Change to the decoding process for picture order count: at any time during the decoding process, the values of PicOrderCntVal & (MaxLtPicOrderCntLsb - 1) for any two reference pictures in the DPB shall not be the same.
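The condition under which delta_poc_msb_present_flag must equal 1 can be sketched as a simple ambiguity test. The DPB is modelled here as a plain list of PicOrderCntVal values, and the function name msb_flag_required is hypothetical; this is an illustration of the constraint, not a decoder implementation.

```python
def msb_flag_required(dpb_pocs, poc_lsb_lt, max_pic_order_cnt_lsb):
    """Return True when the LSBs alone do not identify a unique picture.

    dpb_pocs: PicOrderCntVal of every reference picture in the DPB
              (a simplified stand-in for the real DPB).
    """
    matches = [p for p in dpb_pocs if p % max_pic_order_cnt_lsb == poc_lsb_lt]
    # More than one candidate shares the same LSBs, so the MSB cycle
    # has to be signalled to disambiguate them.
    return len(matches) > 1


print(msb_flag_required([3, 19, 40], 3, 16))  # prints True  (3 and 19 collide)
print(msb_flag_required([3, 19, 40], 8, 16))  # prints False (only 40 matches)
```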

The reference picture lists RefPicList[0] and RefPicList[1] are constructed as follows:

Alternatively, the semantics of delta_poc_msb_cycle_lt[listIdx][i] may be expressed as a delta of deltas, so that the reference picture list construction is updated as follows: the reference picture lists RefPicList[0] and RefPicList[1] are constructed as follows.

Each STRP is identified by its PicOrderCntVal value. Each LTRP, if referred to by an entry in RefPicList[0] or RefPicList[1] with delta_poc_msb_present_flag[listIdx][i] equal to 1, is identified by its PicOrderCntVal value; otherwise, it is identified by the Log2(MaxPicOrderCntLsb) LSBs of its PicOrderCntVal value.
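As a sketch of the two identification modes for an LTRP, the lookup below matches on the full PicOrderCntVal when the MSB information is present and on the low-order bits otherwise. The DPB is again modelled as a list of POC values, and find_ltrp is a hypothetical helper rather than part of the described syntax.

```python
def find_ltrp(dpb_pocs, poc_key, msb_present, max_pic_order_cnt_lsb):
    """Locate an LTRP in a simplified DPB.

    poc_key is the full PicOrderCntVal when msb_present is True,
    otherwise only its Log2(MaxPicOrderCntLsb) least significant bits.
    Returns the matching POC, or None if no picture matches.
    """
    mask = max_pic_order_cnt_lsb - 1  # valid because the value is a power of two
    for poc in dpb_pocs:
        if msb_present and poc == poc_key:
            return poc
        if not msb_present and (poc & mask) == poc_key:
            return poc
    return None


print(find_ltrp([5, 37, 70], 37, True, 16))   # prints 37
print(find_ltrp([5, 37, 70], 6, False, 16))   # prints 70 (70 & 15 == 6)
```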

Alternative 1 of signaling of delta POC MSB for long-term reference picture entries.

This embodiment provides an alternative to the embodiment described in the previous section. As in the previous section, instead of using additional bits to represent the POC LSBs of a long-term reference picture in ref_pic_list_struct(), a POC MSB cycle is signaled to distinguish long-term reference pictures. In this alternative, however, the POC MSB cycle information, when signaled, is not carried within ref_pic_list_struct(); instead, it is signaled in the slice header whenever POC MSB cycle information is needed. The ref_pic_list_struct() syntax structure may be signaled in the SPS and in slice headers.

delta_poc_msb_present_flag[i][j] equal to 1 specifies that delta_poc_msb_cycle_lt[i][j] is present. delta_poc_msb_present_flag[i][j] equal to 0 specifies that delta_poc_msb_cycle_lt[i][j] is not present. When NumLtrpEntries[i] is greater than 0 and, for the j-th LTRP entry in the ref_pic_list_struct(i, rplsIdx, 1) syntax structure, there is more than one reference picture in the DPB at the time this slice header is decoded for which PicOrderCntVal modulo MaxPicOrderCntLsb is equal to poc_lsb_lt[i][rplsIdx][jj], where jj is the entry index of the entry in the ref_pic_list_struct(i, rplsIdx, 1) syntax structure that is the j-th LTRP entry in that syntax structure, delta_poc_msb_present_flag[i][j] shall be equal to 1. delta_poc_msb_cycle_lt[i][j] is used to determine the value of the most significant bits of the picture order count value of the j-th LTRP entry in the ref_pic_list_struct(i, rplsIdx, 1) syntax structure. When delta_poc_msb_cycle_lt[i][j] is not present, it is inferred to be equal to 0.

delta_poc_msb_present_flag[i][j] equal to 1 specifies that delta_poc_msb_cycle_lt[i][j] is present. delta_poc_msb_present_flag[i][j] equal to 0 specifies that delta_poc_msb_cycle_lt[i][j] is not present. When NumLtrpEntries[i] is greater than 0 and there is more than one reference picture in the DPB at the time this slice header is decoded for which PicOrderCntVal modulo MaxPicOrderCntLsb is equal to poc_lsb_lt[i][rplsIdx][j], delta_poc_msb_present_flag[i][j] shall be equal to 1. delta_poc_msb_cycle_lt[i][j] is used to determine the value of the most significant bits of the picture order count value of the j-th entry in the ref_pic_list_struct(i, rplsIdx, 1) syntax structure. When delta_poc_msb_cycle_lt[i][j] is not present, it is inferred to be equal to 0.

poc_lsb_lt[listIdx][rplsIdx][i] specifies the value of the picture order count modulo MaxLtPicOrderCntLsb MaxPicOrderCntLsb of the picture referred to by the i-th entry in the ref_pic_list_struct(listIdx, rplsIdx, ltrpFlag) syntax structure. The length of the poc_lsb_lt[listIdx][rplsIdx][i] syntax element is Log2(MaxLtPicOrderCntLsb MaxPicOrderCntLsb) bits.

Change to the decoding process for picture order count: at any time during the decoding process, the values of PicOrderCntVal & (MaxLtPicOrderCntLsb - 1) for any two reference pictures in the DPB shall not be the same.
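The uniqueness constraint on PicOrderCntVal & (MaxLtPicOrderCntLsb - 1) can be checked with a few lines. This is a sketch of a conformance check only; the DPB is represented as a list of POC values and the function name lt_lsbs_unique is hypothetical.

```python
def lt_lsbs_unique(dpb_pocs, max_lt_pic_order_cnt_lsb):
    """Check that no two reference pictures share the same masked POC value.

    max_lt_pic_order_cnt_lsb is assumed to be a power of two, so that
    subtracting 1 yields a contiguous bit mask.
    """
    mask = max_lt_pic_order_cnt_lsb - 1
    lsbs = [poc & mask for poc in dpb_pocs]
    return len(lsbs) == len(set(lsbs))


print(lt_lsbs_unique([1, 17, 33], 16))  # prints False (all reduce to 1)
print(lt_lsbs_unique([1, 17, 33], 64))  # prints True
```

With a larger MaxLtPicOrderCntLsb, more pictures can coexist in the DPB without colliding, which is the motivation for spending extra LSB bits on long-term references.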

For slice header design 1, the reference picture lists RefPicList[0] and RefPicList[1] are constructed as follows:

Alternatively, for slice header design 1, the semantics of delta_poc_msb_cycle_lt[listIdx][i] may be expressed as a delta of deltas, so that the reference picture list construction is updated as follows: the reference picture lists RefPicList[0] and RefPicList[1] are constructed as follows:

For slice header design 2, the reference picture lists RefPicList[0] and RefPicList[1] are constructed as follows:

Alternatively, for slice header design 2, the semantics of delta_poc_msb_cycle_lt[listIdx][i] may be expressed as a delta of deltas, so that the reference picture list construction is updated as follows: the reference picture lists RefPicList[0] and RefPicList[1] are constructed as follows:

Each STRP is identified by its PicOrderCntVal value. Each LTRP, if referred to by an entry in RefPicList[0] or RefPicList[1] with delta_poc_msb_present_flag[i] equal to 1, is identified by its PicOrderCntVal value; otherwise, it is identified by the Log2(MaxPicOrderCntLsb) LSBs of its PicOrderCntVal value.

Alternative 2 of signaling of delta POC MSB for long-term reference picture entries.

In one alternative embodiment of the present disclosure, the disclosure described in the first embodiment or the second embodiment may be combined with the embodiments described above and named "Signaling of delta POC MSB for long-term reference picture entries" and "Alternative 1 of signaling of delta POC MSB for long-term reference picture entries", respectively. The aspects to be combined are the signaling of additional_lt_poc_lsb (i.e., from the first embodiment or the second embodiment) and the signaling of POC MSB cycle information (i.e., from the embodiments described above and named "Signaling of delta POC MSB for long-term reference picture entries" and "Alternative 1 of signaling of delta POC MSB for long-term reference picture entries"). One example of how the first embodiment may be combined with the embodiment named "Alternative 1 of signaling of delta POC MSB for long-term reference picture entries" is described as follows.

delta_poc_msb_present_flag[i][j] equal to 1 specifies that delta_poc_msb_cycle_lt[i][j] is present. delta_poc_msb_present_flag[i][j] equal to 0 specifies that delta_poc_msb_cycle_lt[i][j] is not present. When NumLtrpEntries[i] is greater than 0 and, for the j-th LTRP entry in the ref_pic_list_struct(i, rplsIdx, 1) syntax structure, there is more than one reference picture in the DPB at the time this slice header is decoded for which PicOrderCntVal modulo MaxLtPicOrderCntLsb is equal to poc_lsb_lt[i][rplsIdx][jj], where jj is the entry index of the entry in the ref_pic_list_struct(i, rplsIdx, 1) syntax structure that is the j-th LTRP entry in that syntax structure, delta_poc_msb_present_flag[i][j] shall be equal to 1. delta_poc_msb_cycle_lt[i][j] is used to determine the value of the most significant bits of the picture order count value of the j-th LTRP entry in the ref_pic_list_struct(i, rplsIdx, 1) syntax structure. When delta_poc_msb_cycle_lt[i][j] is not present, it is inferred to be equal to 0.

Change to the decoding process for picture order count: at any time during the decoding process, the values of PicOrderCntVal & (MaxLtPicOrderCntLsb - 1) for any two reference pictures in the DPB shall not be the same.

Alternatively, the semantics of delta_poc_msb_cycle_lt[listIdx][i] may be expressed as a delta of deltas, so that the reference picture list construction is updated as follows: the reference picture lists RefPicList[0] and RefPicList[1] are constructed as follows:

Each STRP is identified by its PicOrderCntVal value. Each LTRP, if referred to by an entry in RefPicList[0] or RefPicList[1] with delta_poc_msb_present_flag[i] equal to 1, is identified by its PicOrderCntVal value; otherwise, it is identified by the Log2(MaxLtPicOrderCntLsb) LSBs of its PicOrderCntVal value.

Always signaling reference picture lists in slice headers, with differentiation between short-term and long-term reference pictures.

This section describes another alternative embodiment of the present disclosure. The description is relative to the latest VVC WD (i.e., only the delta relative to the latest VVC WD in JVET-K1001-v1 is described, while the text of the latest VVC WD not mentioned below applies as is). This alternative embodiment is summarized as follows: reference picture list structures are signaled only in slice headers. Both short-term and long-term reference pictures are identified by their POC LSBs, which may be represented with a number of bits different from the number of bits used to represent the POC LSBs signaled in slice headers for derivation of POC values. Furthermore, the numbers of bits used to represent the POC LSBs for identifying short-term reference pictures and long-term reference pictures may differ.

NAL unit header syntax.

Picture parameter set RBSP syntax.

Slice header syntax.

Reference picture list structure syntax.

NAL unit header semantics.

forbidden_zero_bit shall be equal to 0. nal_unit_type specifies the type of RBSP data structure contained in the NAL unit.

nuh_temporal_id_plus1 minus 1 specifies a temporal identifier for the NAL unit. The value of nuh_temporal_id_plus1 shall not be equal to 0. The variable TemporalId is specified as follows: TemporalId = nuh_temporal_id_plus1 - 1.

When nal_unit_type is equal to IRAP_NUT, the coded slice belongs to an IRAP picture, and TemporalId shall be equal to 0. The value of TemporalId shall be the same for all VCL NAL units of an access unit. The value of TemporalId of a coded picture or an access unit is the value of the TemporalId of the VCL NAL units of the coded picture or the access unit.

The value of TemporalId for non-VCL NAL units is constrained as follows: if nal_unit_type is equal to SPS_NUT, TemporalId shall be equal to 0 and the TemporalId of the access unit containing the NAL unit shall be equal to 0. Otherwise, if nal_unit_type is equal to EOS_NUT or EOB_NUT, TemporalId shall be equal to 0. Otherwise, TemporalId shall be greater than or equal to the TemporalId of the access unit containing the NAL unit.

When the NAL unit is a non-VCL NAL unit, the value of TemporalId is equal to the minimum value of the TemporalId values of all access units to which the non-VCL NAL unit applies. When nal_unit_type is equal to PPS_NUT, TemporalId may be greater than or equal to the TemporalId of the containing access unit, because all picture parameter sets (PPSs) may be included at the beginning of the bitstream, where the first coded picture has TemporalId equal to 0. When nal_unit_type is equal to PREFIX_SEI_NUT or SUFFIX_SEI_NUT, TemporalId may be greater than or equal to the TemporalId of the containing access unit, because a supplemental enhancement information (SEI) NAL unit may contain information that applies to a bitstream subset that includes access units whose TemporalId values are greater than the TemporalId of the access unit containing the SEI NAL unit.

nuh_reserved_zero_7bits shall be equal to '0000000'. Other values of nuh_reserved_zero_7bits may be specified in the future by ITU-T | ISO/IEC. Decoders shall ignore (i.e., remove from the bitstream and discard) NAL units with values of nuh_reserved_zero_7bits not equal to '0000000'.
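The TemporalId derivation and the three-way constraint on non-VCL NAL units can be sketched as follows. NAL unit types are represented as strings for readability, which is an assumption of this sketch (the numeric type codes are not given here), and both function names are hypothetical.

```python
def temporal_id(nuh_temporal_id_plus1):
    """Derive TemporalId; nuh_temporal_id_plus1 equal to 0 is not allowed."""
    if nuh_temporal_id_plus1 == 0:
        raise ValueError("nuh_temporal_id_plus1 shall not be equal to 0")
    return nuh_temporal_id_plus1 - 1


def non_vcl_temporal_id_ok(nal_unit_type, tid, au_tid):
    """Check the TemporalId constraint for a non-VCL NAL unit.

    tid:    TemporalId of the NAL unit
    au_tid: TemporalId of the access unit containing the NAL unit
    """
    if nal_unit_type == "SPS_NUT":
        return tid == 0 and au_tid == 0
    if nal_unit_type in ("EOS_NUT", "EOB_NUT"):
        return tid == 0
    # All remaining non-VCL types (e.g. PPS_NUT, PREFIX_SEI_NUT, SUFFIX_SEI_NUT).
    return tid >= au_tid


print(temporal_id(1))                               # prints 0
print(non_vcl_temporal_id_ok("SPS_NUT", 0, 1))      # prints False
print(non_vcl_temporal_id_ok("PPS_NUT", 2, 0))      # prints True
```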

log2_max_pic_order_cnt_lsb_minus4 specifies the value of the variable MaxPicOrderCntLsb that is used in the decoding process for picture order count as follows:

MaxPicOrderCntLsb = 2^{log2_max_pic_order_cnt_lsb_minus4 + 4}

The value of log2_max_pic_order_cnt_lsb_minus4 shall be in the range of 0 to 12, inclusive. sps_max_dec_pic_buffering_minus1 plus 1 specifies the maximum required size of the decoded picture buffer for the CVS in units of picture storage buffers. The value of sps_max_dec_pic_buffering_minus1 shall be in the range of 0 to MaxDpbSize - 1, inclusive, where MaxDpbSize is as specified elsewhere. additional_st_poc_lsb specifies the value of the variable MaxStPicOrderCntLsb that is used in the decoding process for reference picture lists as follows:

MaxStPicOrderCntLsb = 2^{log2_max_pic_order_cnt_lsb_minus4 + 4 + additional_st_poc_lsb}

The value of additional_st_poc_lsb shall be in the range of 0 to 32 - log2_max_pic_order_cnt_lsb_minus4 - 4, inclusive. long_term_ref_pics_flag equal to 0 specifies that no LTRP is used for inter prediction of any coded picture in the CVS. long_term_ref_pics_flag equal to 1 specifies that LTRPs may be used for inter prediction of one or more coded pictures in the CVS. additional_lt_poc_lsb specifies the value of the variable MaxLtPicOrderCntLsb that is used in the decoding process for reference picture lists as follows:

MaxLtPicOrderCntLsb = 2^{log2_max_pic_order_cnt_lsb_minus4 + 4 + additional_st_poc_lsb + additional_lt_poc_lsb}
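The three POC LSB limits build on one another, which a short sketch makes concrete. The function name poc_lsb_limits is hypothetical; the range checks follow the constraints stated for log2_max_pic_order_cnt_lsb_minus4 and additional_st_poc_lsb, and no range is enforced for additional_lt_poc_lsb because none is given here.

```python
def poc_lsb_limits(log2_max_pic_order_cnt_lsb_minus4,
                   additional_st_poc_lsb=0,
                   additional_lt_poc_lsb=0):
    """Return (MaxPicOrderCntLsb, MaxStPicOrderCntLsb, MaxLtPicOrderCntLsb)."""
    if not 0 <= log2_max_pic_order_cnt_lsb_minus4 <= 12:
        raise ValueError("log2_max_pic_order_cnt_lsb_minus4 out of range")
    base_bits = log2_max_pic_order_cnt_lsb_minus4 + 4
    if not 0 <= additional_st_poc_lsb <= 32 - base_bits:
        raise ValueError("additional_st_poc_lsb out of range")
    st_bits = base_bits + additional_st_poc_lsb
    lt_bits = st_bits + additional_lt_poc_lsb
    # Each limit is a power of two, so a left shift computes 2^bits exactly.
    return 1 << base_bits, 1 << st_bits, 1 << lt_bits


print(poc_lsb_limits(4, 2, 3))  # prints (256, 1024, 8192)
```

In this example the slice header POC LSBs use 8 bits, short-term entries use 10 bits, and long-term entries use 13 bits.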

num_ref_idx_default_active_minus1[i] plus 1, when i is equal to 0, specifies the inferred value of the variable NumRefIdxActive[0] for P or B slices with num_ref_idx_active_override_flag equal to 0, and, when i is equal to 1, specifies the inferred value of NumRefIdxActive[1] for B slices with num_ref_idx_active_override_flag equal to 0. The value of num_ref_idx_default_active_minus1[i] shall be in the range of 0 to 14, inclusive.

Slice header semantics.

When present, the value of each of the slice header syntax elements slice_pic_parameter_set_id and slice_pic_order_cnt_lsb shall be the same in all slice headers of a coded picture. slice_type specifies the coding type of the slice according to Table 7-3.

When nal_unit_type is equal to IRAP_NUT, i.e., the picture is an IRAP picture, slice_type shall be equal to 2.

slice_pic_order_cnt_lsb specifies the picture order count modulo MaxPicOrderCntLsb for the current picture. The length of the slice_pic_order_cnt_lsb syntax element is log2_max_pic_order_cnt_lsb_minus4 + 4 bits. The value of slice_pic_order_cnt_lsb shall be in the range of 0 to MaxPicOrderCntLsb - 1, inclusive. When slice_pic_order_cnt_lsb is not present, it is inferred to be equal to 0.

num_ref_idx_active_override_flag equal to 1 specifies that the syntax element num_ref_idx_active_minus1[0] is present for P and B slices and that the syntax element num_ref_idx_active_minus1[1] is present for B slices.

num_ref_idx_active_override_flag equal to 0 specifies that the syntax elements num_ref_idx_active_minus1[0] and num_ref_idx_active_minus1[1] are not present. num_ref_idx_active_minus1[i], when present, specifies the value of the variable NumRefIdxActive[i] as follows:

NumRefIdxActive[i] = num_ref_idx_active_minus1[i] + 1

The value of num_ref_idx_active_minus1[i] shall be in the range of 0 to 14, inclusive. The value of NumRefIdxActive[i] - 1 specifies the maximum reference index for reference picture list i that may be used to decode the slice. When the value of NumRefIdxActive[i] is equal to 0, no reference index for reference picture list i may be used to decode the slice.

For each i equal to 0 or 1, when the current slice is a B slice and num_ref_idx_active_override_flag is equal to 0, NumRefIdxActive[i] is inferred to be equal to num_ref_idx_default_active_minus1[i] + 1. When the current slice is a P slice and num_ref_idx_active_override_flag is equal to 0, NumRefIdxActive[0] is inferred to be equal to num_ref_idx_default_active_minus1[0] + 1. When the current slice is a P slice, NumRefIdxActive[1] is inferred to be equal to 0. When the current slice is an I slice, both NumRefIdxActive[0] and NumRefIdxActive[1] are inferred to be equal to 0.

Alternatively, for each i equal to 0 or 1, the following applies after the above: let rplsIdx1 be set equal to ref_pic_list_sps_flag[i] ? ref_pic_list_idx[i] : num_ref_pic_lists_in_sps[i], and let numRpEntries[i] be set equal to num_strp_entries[i][rplsIdx1] + num_ltrp_entries[i][rplsIdx1]. When NumRefIdxActive[i] is greater than numRpEntries[i], the value of NumRefIdxActive[i] is set equal to numRpEntries[i].
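The inference rules for NumRefIdxActive, including the optional clamping to the number of entries in the list, can be summarized in one function. Slice types are represented by the letters "I", "P" and "B", and the function name and parameter names are hypothetical choices for this sketch.

```python
def num_ref_idx_active(slice_type, override_flag, active_minus1,
                       default_active_minus1, num_rp_entries=None):
    """Return [NumRefIdxActive[0], NumRefIdxActive[1]] for one slice.

    active_minus1:         num_ref_idx_active_minus1[i], used only when
                           override_flag is 1
    default_active_minus1: num_ref_idx_default_active_minus1[i] from the PPS
    num_rp_entries:        optional numRpEntries[i]; when given, the
                           alternative clamping step is applied
    """
    active = [0, 0]  # I slices keep both values at 0
    used_lists = {"I": (), "P": (0,), "B": (0, 1)}[slice_type]
    for i in used_lists:
        if override_flag:
            active[i] = active_minus1[i] + 1
        else:
            active[i] = default_active_minus1[i] + 1
    if num_rp_entries is not None:
        # Never keep more active entries than the list structure contains.
        active = [min(a, n) for a, n in zip(active, num_rp_entries)]
    return active


print(num_ref_idx_active("B", 0, None, [1, 2]))          # prints [2, 3]
print(num_ref_idx_active("P", 1, [4, 0], [1, 2]))        # prints [5, 0]
print(num_ref_idx_active("B", 0, None, [3, 3], [2, 5]))  # prints [2, 4]
```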

The ref_pic_list_struct(listIdx, ltrpFlag) syntax structure may be present in a slice header. When present in a slice header, the ref_pic_list_struct(listIdx, ltrpFlag) syntax structure specifies reference picture list listIdx of the current picture (the picture containing the slice). num_strp_entries[listIdx] specifies the number of STRP entries in the ref_pic_list_struct(listIdx, ltrpFlag) syntax structure. num_ltrp_entries[listIdx] specifies the number of LTRP entries in the ref_pic_list_struct(listIdx, ltrpFlag) syntax structure. When not present, the value of num_ltrp_entries[listIdx] is inferred to be equal to 0. The variable NumEntriesInList[listIdx] is derived as follows:

The value of NumEntriesInList[listIdx] shall be in the range of 0 to sps_max_dec_pic_buffering_minus1, inclusive. lt_ref_pic_flag[listIdx][i] equal to 1 specifies that the i-th entry in the ref_pic_list_struct(listIdx, ltrpFlag) syntax structure is an LTRP entry.

lt_ref_pic_flag[listIdx][i] equal to 0 specifies that the i-th entry in the ref_pic_list_struct(listIdx, ltrpFlag) syntax structure is an STRP entry. When not present, the value of lt_ref_pic_flag[listIdx][i] is inferred to be equal to 0. It is a requirement of bitstream conformance that the sum of lt_ref_pic_flag[listIdx][i] for all values of i in the range of 0 to NumEntriesInList[listIdx] - 1, inclusive, shall be equal to num_ltrp_entries[listIdx].
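The two conformance conditions on a slice-header reference picture list structure can be checked together. In this sketch the per-entry flags are passed as a list whose length stands in for NumEntriesInList[listIdx]; that modelling choice and the function name check_rpl_struct are assumptions, not part of the described syntax.

```python
def check_rpl_struct(lt_ref_pic_flag, num_ltrp_entries,
                     sps_max_dec_pic_buffering_minus1):
    """Check the entry-count range and the LTRP flag sum for one list.

    lt_ref_pic_flag: one 0/1 flag per entry; its length is used as
                     NumEntriesInList[listIdx] (an assumption of this sketch).
    """
    num_entries = len(lt_ref_pic_flag)
    if not 0 <= num_entries <= sps_max_dec_pic_buffering_minus1:
        return False
    # The number of entries flagged as LTRP must match the signalled count.
    return sum(lt_ref_pic_flag) == num_ltrp_entries


print(check_rpl_struct([0, 1, 0, 1], 2, 5))  # prints True
print(check_rpl_struct([0, 1, 0, 1], 1, 5))  # prints False (flag sum is 2)
print(check_rpl_struct([0, 1, 0, 1], 2, 3))  # prints False (too many entries)
```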

poc_lsb_st[listIdx][i], when lt_ref_pic_flag[listIdx][i] is equal to 0, specifies the value of the picture order count modulo MaxStPicOrderCntLsb of the picture referred to by the i-th entry in the ref_pic_list_struct(listIdx, ltrpFlag) syntax structure. The length of the poc_lsb_st[listIdx][i] syntax element is Log2(MaxStPicOrderCntLsb) bits. poc_lsb_lt[listIdx][i], when lt_ref_pic_flag[listIdx][i] is equal to 1, specifies the value of the picture order count modulo MaxLtPicOrderCntLsb of the picture referred to by the i-th entry in the ref_pic_list_struct(listIdx, ltrpFlag) syntax structure. The length of the poc_lsb_lt[listIdx][i] syntax element is Log2(MaxLtPicOrderCntLsb) bits.

The decoding process is described.

General decoding process.

The decoding process operates as follows for the current picture CurrPic: The decoding of NAL units is specified below. The processes below specify the following decoding processes using syntax elements in the slice header layer and above: Variables and functions relating to picture order count are derived. This needs to be invoked only for the first slice of a picture. At the beginning of the decoding process for each slice of a non-IRAP picture, the decoding process for reference picture list construction is invoked for derivation of reference picture list 0 (RefPicList[0]) and reference picture list 1 (RefPicList[1]). The decoding process for reference picture marking is invoked, wherein reference pictures may be marked as "not used for reference" or "used for long-term reference". This needs to be invoked only for the first slice of a picture. The decoding processes for coding tree units, scaling, transform, in-loop filtering, etc., are invoked. After all slices of the current picture have been decoded, the current decoded picture is marked as "used for short-term reference".

NAL unit decoding process.

The inputs to this process are the NAL units of the current picture and their associated non-VCL NAL units. The outputs of this process are the parsed RBSP syntax structures encapsulated within the NAL units. The decoding process for each NAL unit extracts the RBSP syntax structure from the NAL unit and then parses the RBSP syntax structure.

Slice decoding process.

Decoding process for picture order count.

The output of this process is PicOrderCntVal, the picture order count of the current picture. Picture order counts are used to identify pictures, to derive motion parameters in merge mode and motion vector prediction, and for decoder conformance checking. Each coded picture is associated with a picture order count variable, denoted as PicOrderCntVal. When the current picture is not an IRAP picture, the variables prevPicOrderCntLsb and prevPicOrderCntMsb are derived as follows: Let prevTid0Pic be the previous picture in decoding order that has TemporalId equal to 0. The variable prevPicOrderCntLsb is set equal to slice_pic_order_cnt_lsb of prevTid0Pic. The variable prevPicOrderCntMsb is set equal to PicOrderCntMsb of prevTid0Pic. The variable PicOrderCntMsb of the current picture is derived as follows: If the current picture is an IRAP picture, PicOrderCntMsb is set equal to 0. Otherwise, PicOrderCntMsb is derived as follows.

PicOrderCntVal is derived as follows:

PicOrderCntVal = PicOrderCntMsb + slice_pic_order_cnt_lsb

Because slice_pic_order_cnt_lsb is inferred to be 0 for IRAP pictures and prevPicOrderCntLsb and prevPicOrderCntMsb are both set to 0, all IRAP pictures have PicOrderCntVal equal to 0. The value of PicOrderCntVal shall be in the range of -2^31 to 2^31 - 1, inclusive. In one CVS, the PicOrderCntVal values for any two coded pictures shall not be the same. At any moment during the decoding process, the values of PicOrderCntVal & (MaxStPicOrderCntLsb - 1) for any two short-term reference pictures in the DPB shall not be the same. At any moment during the decoding process, the values of PicOrderCntVal & (MaxLtPicOrderCntLsb - 1) for any two reference pictures in the DPB shall not be the same.
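The picture order count derivation can be sketched in Python as follows. The most-significant-bit step is not reproduced in this text, so the sketch assumes the wrap-around rule used for the same variables in HEVC-style codecs; the function name and argument layout are hypothetical.

```python
def derive_pic_order_cnt(slice_pic_order_cnt_lsb, max_pic_order_cnt_lsb,
                         is_irap, prev_lsb=0, prev_msb=0):
    """Return (PicOrderCntMsb, PicOrderCntVal) for the current picture.

    prev_lsb / prev_msb stand for prevPicOrderCntLsb / prevPicOrderCntMsb,
    taken from the previous picture with TemporalId equal to 0.
    The MSB update below is an ASSUMED HEVC-style wrap-around rule.
    """
    half = max_pic_order_cnt_lsb // 2
    if is_irap:
        msb = 0
    elif (slice_pic_order_cnt_lsb < prev_lsb
          and prev_lsb - slice_pic_order_cnt_lsb >= half):
        # The LSB wrapped forward past the modulus: move to the next MSB cycle.
        msb = prev_msb + max_pic_order_cnt_lsb
    elif (slice_pic_order_cnt_lsb > prev_lsb
          and slice_pic_order_cnt_lsb - prev_lsb > half):
        # The LSB wrapped backward: move to the previous MSB cycle.
        msb = prev_msb - max_pic_order_cnt_lsb
    else:
        msb = prev_msb
    # PicOrderCntVal = PicOrderCntMsb + slice_pic_order_cnt_lsb
    return msb, msb + slice_pic_order_cnt_lsb
```

With a modulus of 16, a previous LSB of 14 and a current LSB of 1 are interpreted as a forward wrap, giving PicOrderCntVal equal to 17 rather than 1.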

The function PicOrderCnt(picX) is specified as follows:

PicOrderCnt(picX) = PicOrderCntVal of the picture picX

The function DiffPicOrderCnt(picA, picB) is specified as follows:

The bitstream shall not contain data that result in values of DiffPicOrderCnt(picA, picB) used in the decoding process that are not in the range of -2^15 to 2^15 - 1, inclusive. Let X be the current picture and let Y and Z be two other pictures in the same CVS. Y and Z are considered to be in the same output order direction from X when DiffPicOrderCnt(X, Y) and DiffPicOrderCnt(X, Z) are both positive or both negative.
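A small Python sketch of these two rules follows. The definition of DiffPicOrderCnt is not reproduced in this text; the sketch assumes it is the difference of the two picture order counts, and it works directly on PicOrderCntVal integers. All function names are hypothetical.

```python
def diff_pic_order_cnt(poc_a, poc_b):
    """ASSUMED definition: PicOrderCnt(picA) - PicOrderCnt(picB)."""
    return poc_a - poc_b


def diff_in_allowed_range(diff):
    """True when the difference lies in -2^15 .. 2^15 - 1, inclusive."""
    return -(1 << 15) <= diff <= (1 << 15) - 1


def same_output_order_direction(poc_x, poc_y, poc_z):
    """True when Y and Z lie on the same side of X in output order."""
    d_y = diff_pic_order_cnt(poc_x, poc_y)
    d_z = diff_pic_order_cnt(poc_x, poc_z)
    return (d_y > 0 and d_z > 0) or (d_y < 0 and d_z < 0)
```

For a current picture with count 10, pictures with counts 4 and 8 both precede it in output order and so share a direction, whereas pictures with counts 4 and 12 do not.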

This process is invoked at the beginning of the decoding process for each slice of a non-IRAP picture. Reference pictures are addressed through reference indices. A reference index is an index into a reference picture list. When decoding an I slice, no reference picture list is used in decoding of the slice data. When decoding a P slice, only reference picture list 0 (i.e., RefPicList[0]) is used in decoding of the slice data. When decoding a B slice, both reference picture list 0 and reference picture list 1 (i.e., RefPicList[1]) are used in decoding of the slice data. At the beginning of the decoding process for each slice of a non-IRAP picture, the reference picture lists RefPicList[0] and RefPicList[1] are derived. The reference picture lists are used in marking of reference pictures or in decoding of the slice data. For an I slice of a non-IRAP picture that is not the first slice of the picture, RefPicList[0] and RefPicList[1] may be derived for bitstream conformance checking purposes, but their derivation is not necessary for decoding the current picture or pictures following the current picture in decoding order. For a P slice that is not the first slice of a picture, RefPicList[1] may be derived for bitstream conformance checking purposes, but its derivation is not necessary for decoding the current picture or pictures following the current picture in decoding order.

For each i equal to 0 or 1, the following applies:

The first NumRefIdxActive[i] entries in RefPicList[i] are referred to as the active entries in RefPicList[i], and the other entries in RefPicList[i] are referred to as the inactive entries in RefPicList[i]. Each entry RefPicList[i][j], for j in the range of 0 to NumEntriesInList[i] - 1, inclusive, is referred to as an STRP entry if lt_ref_pic_flag[i][j] is equal to 0, and as an LTRP entry otherwise. It is possible for a particular picture to be referred to by both an entry in RefPicList[0] and an entry in RefPicList[1]. It is also possible for a particular picture to be referred to by more than one entry in RefPicList[0] or by more than one entry in RefPicList[1]. The active entries in RefPicList[0] and the active entries in RefPicList[1] collectively refer to all reference pictures that may be used for inter prediction of the current picture and of one or more pictures that follow the current picture in decoding order. The inactive entries in RefPicList[0] and the inactive entries in RefPicList[1] collectively refer to all reference pictures that are not used for inter prediction of the current picture but may be used for inter prediction of one or more pictures that follow the current picture in decoding order. There may be one or more entries in RefPicList[0] or RefPicList[1] that are equal to "no reference picture" because the corresponding pictures are not present in the DPB. Each inactive entry in RefPicList[0] or RefPicList[1] that is equal to "no reference picture" should be ignored. An unintentional picture loss should be inferred for each active entry in RefPicList[0] or RefPicList[1] that is equal to "no reference picture".
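The split into active and inactive entries, and the handling of "no reference picture" entries, can be sketched in Python as follows. Representing a list as a plain Python list and a missing picture as None is an assumption made only for illustration, as are the function names.

```python
NO_REFERENCE_PICTURE = None  # hypothetical stand-in for "no reference picture"


def split_entries(ref_pic_list, num_ref_idx_active):
    """Return (active, inactive): the first NumRefIdxActive entries and the rest."""
    return ref_pic_list[:num_ref_idx_active], ref_pic_list[num_ref_idx_active:]


def lost_active_entries(ref_pic_list, num_ref_idx_active):
    """Indices of active entries for which a picture loss should be inferred.

    Inactive entries equal to "no reference picture" are simply ignored.
    """
    active, _ = split_entries(ref_pic_list, num_ref_idx_active)
    return [idx for idx, pic in enumerate(active) if pic is NO_REFERENCE_PICTURE]
```

For a list [8, None, 4, None] with two active entries, only index 1 is reported as a loss; the missing picture at index 3 is inactive and therefore ignored.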

It is a requirement of bitstream conformance that the following constraints apply: For each i equal to 0 or 1, NumEntriesInList[i] shall not be less than NumRefIdxActive[i]. The picture referred to by each active entry in RefPicList[0] or RefPicList[1] shall be present in the DPB and shall have TemporalId less than or equal to that of the current picture. Optionally, the following constraint may be further specified: The entry index of any inactive entry in RefPicList[0] or RefPicList[1] shall not be used as a reference index for decoding of the current picture. Optionally, the following constraint may be further specified: An inactive entry in RefPicList[0] or RefPicList[1] shall not refer to the same picture as any other entry in RefPicList[0] or RefPicList[1]. An STRP entry in RefPicList[0] or RefPicList[1] of a slice of a picture and an LTRP entry in RefPicList[0] or RefPicList[1] of the same slice or a different slice of the same picture shall not refer to the same picture. The current picture itself shall not be referred to by any entry in RefPicList[0] or RefPicList[1]. There shall be no LTRP entry in RefPicList[0] or RefPicList[1] for which the difference between the PicOrderCntVal of the current picture and the PicOrderCntVal of the picture referred to by the entry is greater than or equal to 2^24. Let setOfRefPics be the set of unique pictures referred to by all entries in RefPicList[0] and all entries in RefPicList[1]. The number of pictures in setOfRefPics shall be less than or equal to sps_max_dec_pic_buffering_minus1, and setOfRefPics shall be the same for all slices of a picture.
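Several of these constraints lend themselves to a mechanical check. The Python sketch below covers a subset of them for a single slice; it assumes each entry is a hypothetical (PicOrderCntVal, is_ltrp) pair that has already been resolved to a picture, and it takes the LTRP difference as current minus referenced, which is an assumption about the sign convention.

```python
def check_ref_pic_list_constraints(ref_pic_lists, num_ref_idx_active,
                                   curr_poc, sps_max_dec_pic_buffering_minus1):
    """Return a list of violation messages (empty when all checks pass).

    ref_pic_lists: [RefPicList[0], RefPicList[1]], each a list of
    (poc, is_ltrp) tuples. Only a subset of the constraints is checked.
    """
    violations = []
    strp_pocs, ltrp_pocs = set(), set()
    for i, rpl in enumerate(ref_pic_lists):
        # NumEntriesInList[i] shall not be less than NumRefIdxActive[i].
        if len(rpl) < num_ref_idx_active[i]:
            violations.append(f"list {i}: fewer entries than NumRefIdxActive")
        for poc, is_ltrp in rpl:
            # The current picture shall not be referred to by any entry.
            if poc == curr_poc:
                violations.append(f"list {i}: refers to the current picture")
            # LTRP distance limit (ASSUMED sign: current minus referenced).
            if is_ltrp and curr_poc - poc >= (1 << 24):
                violations.append(f"list {i}: LTRP entry too far away")
            (ltrp_pocs if is_ltrp else strp_pocs).add(poc)
    # An STRP entry and an LTRP entry shall not refer to the same picture.
    if strp_pocs & ltrp_pocs:
        violations.append("STRP and LTRP entries refer to the same picture")
    # setOfRefPics: unique pictures referred to by either list.
    if len(strp_pocs | ltrp_pocs) > sps_max_dec_pic_buffering_minus1:
        violations.append("setOfRefPics too large")
    return violations
```

Checks that span several slices of a picture, such as setOfRefPics being identical across slices, would need the lists of every slice and are left out of this sketch.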

This process is invoked once per picture, after decoding of a slice header and the decoding process for reference picture list construction for the slice, but prior to the decoding of the slice data. This process may result in one or more reference pictures in the DPB being marked as "not used for reference" or "used for long-term reference". A decoded picture in the DPB can be marked as "not used for reference", "used for short-term reference" or "used for long-term reference", but only one among these three at any given moment during the operation of the decoding process. Assigning one of these markings to a picture implicitly removes another of these markings when applicable. When a picture is referred to as being marked as "used for reference", this collectively refers to the picture being marked as "used for short-term reference" or "used for long-term reference" (but not both). When the current picture is an IRAP picture, all reference pictures currently in the DPB (if any) are marked as "not used for reference". STRPs are identified by the Log2(MaxStPicOrderCntLsb) LSBs of their PicOrderCntVal values. LTRPs are identified by the Log2(MaxLtPicOrderCntLsb) LSBs of their PicOrderCntVal values.

The following applies: For each LTRP entry in RefPicList[0] or RefPicList[1], when the referred picture is an STRP, the picture is marked as "used for long-term reference". Each reference picture in the DPB that is not referred to by any entry in RefPicList[0] or RefPicList[1] is marked as "not used for reference".
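The marking rules can be sketched in Python as follows. The DPB is modeled as a hypothetical dictionary from a picture identifier to its current marking, and list entries as (picture identifier, is_ltrp) pairs with None standing for "no reference picture"; these representations are assumptions for illustration only.

```python
UNUSED = "not used for reference"
SHORT_TERM = "used for short-term reference"
LONG_TERM = "used for long-term reference"


def mark_reference_pictures(dpb_marking, ref_pic_lists, current_is_irap):
    """Return the updated marking of every picture in the DPB."""
    marking = dict(dpb_marking)  # work on a copy; the input is left untouched
    if current_is_irap:
        # An IRAP picture releases every reference picture in the DPB.
        return {pic_id: UNUSED for pic_id in marking}
    referenced = set()
    for rpl in ref_pic_lists:
        for pic_id, is_ltrp in rpl:
            if pic_id is None or pic_id not in marking:
                continue  # "no reference picture": nothing to mark
            referenced.add(pic_id)
            # An LTRP entry that refers to an STRP promotes it to long-term.
            if is_ltrp and marking[pic_id] == SHORT_TERM:
                marking[pic_id] = LONG_TERM
    # Pictures not referred to by any entry are no longer used for reference.
    for pic_id in marking:
        if pic_id not in referenced:
            marking[pic_id] = UNUSED
    return marking
```

Because each picture carries exactly one marking at a time, assigning a new value to a dictionary key models the implicit removal of the previous marking.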

Reference picture lists are always signaled in slice headers, without distinction between short-term and long-term reference pictures.

This section describes another alternative embodiment of the present disclosure. The description is relative to the latest VVC WD (i.e., only the delta relative to the latest VVC WD in JVET-K1001-v1 is described, while the text of the latest VVC WD that is not mentioned below applies as is). This alternative embodiment is summarized as follows: Reference picture list structures are signaled only in slice headers. No distinction is made between short-term reference pictures and long-term reference pictures. All reference pictures are just named reference pictures. Reference pictures are identified by their POC LSBs, which may be represented by a number of bits that is different from the number of bits used for representing the POC LSBs signaled in slice headers for derivation of POC values.

Abbreviations. The text in clause 4 of the VVC WD applies.

NAL unit header syntax.

Picture parameter set RBSP syntax.

Slice header syntax.

Reference picture list structure syntax.

NAL unit header semantics.

nuh_temporal_id_plus1 minus 1 specifies a temporal identifier for the NAL unit. The value of nuh_temporal_id_plus1 shall not be equal to 0. The variable TemporalId is specified as follows:

TemporalId = nuh_temporal_id_plus1 - 1

When nal_unit_type is equal to IRAP_NUT, the coded slice belongs to an IRAP picture, and TemporalId shall be equal to 0. The value of TemporalId shall be the same for all VCL NAL units of an access unit. The value of TemporalId of a coded picture or an access unit is the value of the TemporalId of the VCL NAL units of the coded picture or the access unit. The value of TemporalId for non-VCL NAL units is constrained as follows:

If nal_unit_type is equal to SPS_NUT, TemporalId shall be equal to 0 and the TemporalId of the access unit containing the NAL unit shall be equal to 0. Otherwise, if nal_unit_type is equal to EOS_NUT or EOB_NUT, TemporalId shall be equal to 0. Otherwise, TemporalId shall be greater than or equal to the TemporalId of the access unit containing the NAL unit. When the NAL unit is a non-VCL NAL unit, the value of TemporalId is equal to the minimum of the TemporalId values of all access units to which the non-VCL NAL unit applies. When nal_unit_type is equal to PPS_NUT, TemporalId may be greater than or equal to the TemporalId of the containing access unit, because all picture parameter sets (PPSs) may be included at the beginning of a bitstream, where the first coded picture has TemporalId equal to 0. When nal_unit_type is equal to PREFIX_SEI_NUT or SUFFIX_SEI_NUT, TemporalId may be greater than or equal to the TemporalId of the containing access unit, because an SEI NAL unit may contain information that applies to a bitstream subset that includes access units whose TemporalId values are greater than the TemporalId of the access unit containing the SEI NAL unit. nuh_reserved_zero_7bits shall be equal to '0000000'. Other values of nuh_reserved_zero_7bits may be specified in the future by ITU-T | ISO/IEC. Decoders shall ignore (i.e., remove from the bitstream and discard) NAL units with values of nuh_reserved_zero_7bits not equal to '0000000'.
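The TemporalId derivation and the constraints on non-VCL NAL units can be sketched in Python as follows. NAL unit types are represented by their names as strings, which is an assumption made for readability; the function names are hypothetical.

```python
def temporal_id(nuh_temporal_id_plus1):
    """TemporalId = nuh_temporal_id_plus1 - 1; a value of 0 is not allowed."""
    if nuh_temporal_id_plus1 == 0:
        raise ValueError("nuh_temporal_id_plus1 shall not be equal to 0")
    return nuh_temporal_id_plus1 - 1


def non_vcl_temporal_id_ok(nal_unit_type, tid, access_unit_tid):
    """Check the TemporalId constraint for a non-VCL NAL unit.

    tid is the TemporalId of the NAL unit; access_unit_tid is the
    TemporalId of the access unit that contains it.
    """
    if nal_unit_type == "SPS_NUT":
        return tid == 0 and access_unit_tid == 0
    if nal_unit_type in ("EOS_NUT", "EOB_NUT"):
        return tid == 0
    # All other non-VCL types (e.g. PPS_NUT, PREFIX_SEI_NUT, SUFFIX_SEI_NUT).
    return tid >= access_unit_tid
```

A PPS NAL unit with TemporalId 1 carried in an access unit with TemporalId 0 satisfies the constraint, whereas the reverse combination does not.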

log2_max_pic_order_cnt_lsb_minus4 specifies the value of the variable MaxPicOrderCntLsb that is used in the decoding process for picture order count as follows:

The value of log2_max_pic_order_cnt_lsb_minus4 shall be in the range of 0 to 12, inclusive. sps_max_dec_pic_buffering_minus1 plus 1 specifies the maximum required size of the decoded picture buffer for the CVS in units of picture storage buffers. The value of sps_max_dec_pic_buffering_minus1 shall be in the range of 0 to MaxDpbSize - 1, inclusive, where MaxDpbSize is as specified elsewhere. additional_ref_poc_lsb specifies the value of the variable MaxRefPicOrderCntLsb that is used in the decoding process for reference picture lists as follows:

MaxRefPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4 + additional_ref_poc_lsb)

The value of additional_ref_poc_lsb shall be in the range of 0 to 32 - log2_max_pic_order_cnt_lsb_minus4 - 4, inclusive.
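The two moduli can be computed as in the Python sketch below. The formula for MaxPicOrderCntLsb is not reproduced in this text; the sketch assumes it is 2^(log2_max_pic_order_cnt_lsb_minus4 + 4), which matches the stated bit length of slice_pic_order_cnt_lsb. The function names are hypothetical.

```python
def max_pic_order_cnt_lsb(log2_max_pic_order_cnt_lsb_minus4):
    """ASSUMED: MaxPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4)."""
    if not 0 <= log2_max_pic_order_cnt_lsb_minus4 <= 12:
        raise ValueError("log2_max_pic_order_cnt_lsb_minus4 out of range 0..12")
    return 1 << (log2_max_pic_order_cnt_lsb_minus4 + 4)


def max_ref_pic_order_cnt_lsb(log2_max_pic_order_cnt_lsb_minus4,
                              additional_ref_poc_lsb):
    """MaxRefPicOrderCntLsb = 2^(log2_..._minus4 + 4 + additional_ref_poc_lsb)."""
    base_bits = log2_max_pic_order_cnt_lsb_minus4 + 4
    # additional_ref_poc_lsb shall be in 0 .. 32 - base_bits, inclusive.
    if not 0 <= additional_ref_poc_lsb <= 32 - base_bits:
        raise ValueError("additional_ref_poc_lsb out of range")
    return 1 << (base_bits + additional_ref_poc_lsb)
```

With log2_max_pic_order_cnt_lsb_minus4 equal to 4 and additional_ref_poc_lsb equal to 2, the slice header carries 8 POC LSB bits while each reference picture list entry carries 10.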

num_ref_idx_default_active_minus1[i] plus 1, when i is equal to 0, specifies the inferred value of the variable NumRefIdxActive[0] for P or B slices with num_ref_idx_active_override_flag equal to 0, and, when i is equal to 1, specifies the inferred value of NumRefIdxActive[1] for B slices with num_ref_idx_active_override_flag equal to 0. The value of num_ref_idx_default_active_minus1[i] shall be in the range of 0 to 14, inclusive.

Slice header semantics.

When present, the value of each of the slice header syntax elements slice_pic_parameter_set_id and slice_pic_order_cnt_lsb shall be the same in all slice headers of a coded picture. ... slice_type specifies the coding type of the slice according to Table 7-3.

When nal_unit_type is equal to IRAP_NUT, i.e., when the picture is an IRAP picture, slice_type shall be equal to 2. ... slice_pic_order_cnt_lsb specifies the picture order count modulo MaxPicOrderCntLsb for the current picture. The length of the slice_pic_order_cnt_lsb syntax element is log2_max_pic_order_cnt_lsb_minus4 + 4 bits. The value of slice_pic_order_cnt_lsb shall be in the range of 0 to MaxPicOrderCntLsb - 1, inclusive. When slice_pic_order_cnt_lsb is not present, it is inferred to be equal to 0. num_ref_idx_active_override_flag equal to 1 specifies that the syntax element num_ref_idx_active_minus1[0] is present for P and B slices and that the syntax element num_ref_idx_active_minus1[1] is present for B slices. num_ref_idx_active_override_flag equal to 0 specifies that the syntax elements num_ref_idx_active_minus1[0] and num_ref_idx_active_minus1[1] are not present. num_ref_idx_active_minus1[i], when present, specifies the value of the variable NumRefIdxActive[i] as follows:

The value of num_ref_idx_active_minus1[i] shall be in the range of 0 to 14, inclusive. The value of NumRefIdxActive[i] - 1 specifies the maximum reference index for reference picture list i that may be used to decode the slice. When the value of NumRefIdxActive[i] is equal to 0, no reference index for reference picture list i may be used to decode the slice. For each i equal to 0 or 1, when the current slice is a B slice and num_ref_idx_active_override_flag is equal to 0, NumRefIdxActive[i] is inferred to be equal to num_ref_idx_default_active_minus1[i] + 1. When the current slice is a P slice and num_ref_idx_active_override_flag is equal to 0, NumRefIdxActive[0] is inferred to be equal to num_ref_idx_default_active_minus1[0] + 1. When the current slice is a P slice, NumRefIdxActive[1] is inferred to be equal to 0. When the current slice is an I slice, both NumRefIdxActive[0] and NumRefIdxActive[1] are inferred to be equal to 0. Alternatively, for i equal to 0 or 1, the following applies after the above: Let rplsIdx1 be set equal to ref_pic_list_sps_flag[i] ? ref_pic_list_idx[i] : num_ref_pic_lists_in_sps[i], and let numRpEntries[i] be equal to num_strp_entries[i][rplsIdx1] + num_ltrp_entries[i][rplsIdx1]. When NumRefIdxActive[i] is greater than numRpEntries[i], the value of NumRefIdxActive[i] is set equal to numRpEntries[i].
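The inference rules for NumRefIdxActive can be sketched in Python as follows. The sketch assumes that, when signaled, NumRefIdxActive[i] equals num_ref_idx_active_minus1[i] + 1 (the formula itself is not reproduced in this text). Slice types are passed as the letters "I", "P" and "B", and the optional clamp implements the alternative described above; the function name and argument layout are hypothetical.

```python
def infer_num_ref_idx_active(slice_type, override_flag,
                             num_ref_idx_active_minus1,
                             num_ref_idx_default_active_minus1,
                             num_rp_entries=None):
    """Return [NumRefIdxActive[0], NumRefIdxActive[1]] for one slice.

    num_rp_entries, when given, holds numRpEntries[0..1] and enables the
    alternative clamp to the number of entries in each list.
    """
    result = [0, 0]
    for i in (0, 1):
        # I slices use no list; P slices use list 0 only; B slices use both.
        uses_list = slice_type == "B" or (slice_type == "P" and i == 0)
        if not uses_list:
            result[i] = 0
        elif override_flag:
            # ASSUMED: NumRefIdxActive[i] = num_ref_idx_active_minus1[i] + 1
            result[i] = num_ref_idx_active_minus1[i] + 1
        else:
            result[i] = num_ref_idx_default_active_minus1[i] + 1
        if num_rp_entries is not None:
            result[i] = min(result[i], num_rp_entries[i])
    return result
```

For a B slice without override and defaults [2, 3], the result is [3, 4]; with the alternative clamp and list sizes [2, 1], it becomes [2, 1].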

The ref_pic_list_struct(listIdx) syntax structure may be present in a slice header. When present in a slice header, the ref_pic_list_struct(listIdx) syntax structure specifies reference picture list listIdx of the current picture (the picture containing the slice). num_ref_entries[listIdx] specifies the number of entries in the ref_pic_list_struct(listIdx) syntax structure. The variable NumEntriesInList[listIdx] is derived as follows:

NumRefPicEntriesInRpl[listIdx] = num_ref_entries[listIdx]

The value of NumRefPicEntries[listIdx] shall be in the range of 0 to sps_max_dec_pic_buffering_minus1, inclusive. poc_ref_lsb[listIdx][i] specifies the value of the picture order count modulo MaxRefPicOrderCntLsb of the picture referred to by the i-th entry in the ref_pic_list_struct(listIdx) syntax structure. The length of the poc_ref_lsb[listIdx][i] syntax element is Log2(MaxRefPicOrderCntLsb) bits.
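Identifying reference pictures by their POC LSBs can be sketched in Python as follows. The list construction procedure is not reproduced in this text, so this is only one plausible reading: each signaled poc_ref_lsb value is matched against PicOrderCntVal & (MaxRefPicOrderCntLsb - 1) of the pictures in the DPB. The function name, the DPB representation as a list of PicOrderCntVal integers, and the use of None for "no reference picture" are assumptions.

```python
def build_ref_pic_list(poc_ref_lsb, dpb_pocs, max_ref_pic_order_cnt_lsb):
    """Map each signaled POC LSB to the PicOrderCntVal of a DPB picture.

    Entries with no matching picture become None ("no reference picture").
    Assumes the masked POC values of DPB pictures are unique.
    """
    mask = max_ref_pic_order_cnt_lsb - 1  # the modulus is a power of two
    by_lsb = {poc & mask: poc for poc in dpb_pocs}
    return [by_lsb.get(lsb) for lsb in poc_ref_lsb]
```

With MaxRefPicOrderCntLsb equal to 16 and DPB pictures at counts 16, 20 and 24, the signaled LSBs [8, 4, 2] resolve to [24, 20, None]. The bitwise mask also behaves as a modulo for negative counts in Python.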

The decoding process is described.

General decoding process.

The decoding process operates as follows for the current picture CurrPic: The decoding of NAL units is specified below. The processes below specify the following decoding processes using syntax elements in the slice header layer and above: Variables and functions relating to picture order count are derived. This needs to be invoked only for the first slice of a picture. At the beginning of the decoding process for each slice of a non-IRAP picture, the decoding process for reference picture list construction is invoked for derivation of reference picture list 0 (RefPicList[0]) and reference picture list 1 (RefPicList[1]). The decoding process for reference picture marking is invoked, wherein reference pictures may be marked as "not used for reference". This needs to be invoked only for the first slice of a picture. The decoding processes for coding tree units, scaling, transform, in-loop filtering, etc., are invoked. After all slices of the current picture have been decoded, the current decoded picture is marked as "used for reference".

NAL unit decoding process.

Slice decoding process.

PicOrderCntVal is derived as follows:

Because slice_pic_order_cnt_lsb is inferred to be 0 for IRAP pictures and prevPicOrderCntLsb and prevPicOrderCntMsb are both set to 0, all IRAP pictures have PicOrderCntVal equal to 0. The value of PicOrderCntVal shall be in the range of -2^31 to 2^31 - 1, inclusive. Within one CVS, the PicOrderCntVal values for any two coded pictures shall not be the same. At any time during the decoding process, the values of PicOrderCntVal & (MaxRefPicOrderCntLsb - 1) for any two short-term reference pictures in the DPB shall not be the same.
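The derivation equations themselves are not reproduced in this text. The sketch below therefore assumes an HEVC-style derivation (MSB wrap-around detection followed by PicOrderCntVal = MSB + LSB); that assumption, the function names, and the max_poc_lsb parameter are illustrative and are not taken from this section. The second function checks the short-term uniqueness constraint stated above.

```python
def derive_pic_order_cnt_val(slice_pic_order_cnt_lsb, prev_poc_lsb, prev_poc_msb, max_poc_lsb):
    """HEVC-style picture order count derivation (assumed, see lead-in)."""
    half = max_poc_lsb // 2
    if slice_pic_order_cnt_lsb < prev_poc_lsb and prev_poc_lsb - slice_pic_order_cnt_lsb >= half:
        poc_msb = prev_poc_msb + max_poc_lsb   # LSB wrapped around upward
    elif slice_pic_order_cnt_lsb > prev_poc_lsb and slice_pic_order_cnt_lsb - prev_poc_lsb > half:
        poc_msb = prev_poc_msb - max_poc_lsb   # LSB wrapped around downward
    else:
        poc_msb = prev_poc_msb
    poc = poc_msb + slice_pic_order_cnt_lsb
    # Range constraint from the text: -2^31 to 2^31 - 1, inclusive.
    if not -(2 ** 31) <= poc <= 2 ** 31 - 1:
        raise ValueError("PicOrderCntVal out of range")
    return poc


def short_term_lsbs_unique(short_term_pocs, max_ref_pic_order_cnt_lsb):
    """True if PicOrderCntVal & (MaxRefPicOrderCntLsb - 1) differs for all pictures."""
    mask = max_ref_pic_order_cnt_lsb - 1
    lsbs = [poc & mask for poc in short_term_pocs]
    return len(lsbs) == len(set(lsbs))
```

With all three inputs equal to 0, as for an IRAP picture, the first function returns 0, which matches the statement that every IRAP picture has PicOrderCntVal equal to 0.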

For each i equal to 0 or 1, the first NumRefIdxActive[i] entries in RefPicList[i] are referred to as the active entries in RefPicList[i], and the other entries in RefPicList[i] are referred to as the inactive entries in RefPicList[i]. A particular picture may be referred to by both an entry in RefPicList[0] and an entry in RefPicList[1]. A particular picture may also be referred to by more than one entry in RefPicList[0] or by more than one entry in RefPicList[1]. The active entries in RefPicList[0] and the active entries in RefPicList[1] collectively refer to all reference pictures that may be used for inter prediction of the current picture and of one or more pictures that follow the current picture in decoding order. The inactive entries in RefPicList[0] and the inactive entries in RefPicList[1] collectively refer to all reference pictures that are not used for inter prediction of the current picture but may be used for inter prediction of one or more pictures that follow the current picture in decoding order. Each inactive entry in RefPicList[0] or RefPicList[1] that is equal to "no reference picture" shall be ignored. An unintentional picture loss shall be inferred for each active entry in RefPicList[0] or RefPicList[1] that is equal to "no reference picture".
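A minimal sketch of the active/inactive split described above, assuming a reference picture list is a plain Python list and "no reference picture" is represented by None. The constant NO_REFERENCE_PICTURE and the function name are hypothetical.

```python
NO_REFERENCE_PICTURE = None  # hypothetical stand-in for "no reference picture"


def split_entries(ref_pic_list, num_ref_idx_active):
    """Split RefPicList[i] into active and inactive entries.

    Returns (active, inactive, picture_loss_inferred).
    """
    # The first NumRefIdxActive[i] entries are active, the rest inactive.
    active = ref_pic_list[:num_ref_idx_active]
    inactive = ref_pic_list[num_ref_idx_active:]
    # An active "no reference picture" entry implies an unintentional loss.
    picture_loss_inferred = any(e is NO_REFERENCE_PICTURE for e in active)
    # Inactive "no reference picture" entries are simply ignored.
    inactive = [e for e in inactive if e is not NO_REFERENCE_PICTURE]
    return active, inactive, picture_loss_inferred
```

The active entries are returned unfiltered so that a caller can see which index is missing and apply whatever loss handling it chooses.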

It is a requirement of bitstream conformance that the following constraints apply. For each i equal to 0 or 1, NumEntriesInList[i] shall not be less than NumRefIdxActive[i]. The picture referred to by each active entry in RefPicList[0] or RefPicList[1] shall be present in the DPB and shall have a TemporalId less than or equal to that of the current picture. Optionally, the following constraint may further be specified: the entry index of any inactive entry in RefPicList[0] or RefPicList[1] shall not be used as a reference index for decoding of the current picture. Optionally, the following constraint may further be specified: an inactive entry in RefPicList[0] or RefPicList[1] shall not refer to the same picture as any other entry in RefPicList[0] or RefPicList[1]. The current picture itself shall not be referred to by any entry in RefPicList[0] or RefPicList[1]. There shall be no LTRP entry in RefPicList[0] or RefPicList[1] for which the difference between the PicOrderCntVal of the current picture and the PicOrderCntVal of the picture referred to by the entry is greater than or equal to 2^24. Let setOfRefPics be the set of unique pictures referred to by all entries in RefPicList[0] and all entries in RefPicList[1]. The number of pictures in setOfRefPics shall be less than or equal to sps_max_dec_pic_buffering_minus1, and setOfRefPics shall be the same for all slices of a picture.
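The mandatory constraints above could be checked along the following lines. This is a sketch under stated assumptions: the RefEntry record and the function name are hypothetical, pictures are identified by their PicOrderCntVal, the LTRP difference is taken literally as current minus referenced, and the two optional constraints and the per-slice equality of setOfRefPics are not checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefEntry:
    poc: int                    # PicOrderCntVal of the referenced picture
    temporal_id: int
    in_dpb: bool = True
    is_long_term: bool = False  # True for an LTRP entry


def check_ref_pic_lists(ref_pic_lists, num_ref_idx_active, cur_poc,
                        cur_temporal_id, sps_max_dec_pic_buffering_minus1):
    """Return a list of violated constraints (empty if none are violated)."""
    violations = []
    for i in (0, 1):
        entries = ref_pic_lists[i]
        if len(entries) < num_ref_idx_active[i]:
            violations.append(f"RefPicList[{i}]: NumEntriesInList < NumRefIdxActive")
        for e in entries[:num_ref_idx_active[i]]:
            if not e.in_dpb:
                violations.append(f"RefPicList[{i}]: active entry not in DPB")
            if e.temporal_id > cur_temporal_id:
                violations.append(f"RefPicList[{i}]: active entry TemporalId too high")
        for e in entries:
            if e.poc == cur_poc:
                violations.append(f"RefPicList[{i}]: current picture referenced")
            if e.is_long_term and cur_poc - e.poc >= 2 ** 24:
                violations.append(f"RefPicList[{i}]: LTRP POC difference >= 2^24")
    # setOfRefPics: unique pictures referenced by all entries of both lists.
    set_of_ref_pics = {e.poc for entries in ref_pic_lists for e in entries}
    if len(set_of_ref_pics) > sps_max_dec_pic_buffering_minus1:
        violations.append("setOfRefPics larger than sps_max_dec_pic_buffering_minus1")
    return violations
```

Returning every violation rather than stopping at the first makes the sketch more useful for diagnosing a non-conforming bitstream.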

This process is invoked once per picture, after the decoding of a slice header and the decoding process for reference picture list construction for the slice, but prior to the decoding of the slice data. This process may result in one or more reference pictures in the DPB being marked as "not used for reference". A decoded picture in the DPB can be marked as "not used for reference" or "used for reference", but only as one of these two at any given moment during the operation of the decoding process. Assigning one of these markings to a picture implicitly removes the other marking, where applicable. When the current picture is an IRAP picture, all reference pictures currently in the DPB (if any) are marked as "not used for reference". Reference pictures in the DPB are identified by the Log2(MaxRefPicOrderCntLsb) LSBs of their PicOrderCntVal values. Each reference picture in the DPB that is not referred to by any entry in RefPicList[0] or RefPicList[1] is marked as "not used for reference".
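The marking rules above can be sketched as follows, assuming the DPB is modeled as a dictionary from PicOrderCntVal to a marking string and each reference picture list holds the PicOrderCntVal values of the pictures it refers to. The function name and this data model are hypothetical.

```python
USED = "used for reference"
UNUSED = "not used for reference"


def mark_reference_pictures(dpb, ref_pic_lists, is_irap, max_ref_pic_order_cnt_lsb):
    """Return a new DPB marking map after the reference picture marking process.

    `dpb` maps PicOrderCntVal -> USED or UNUSED (hypothetical model).
    """
    if is_irap:
        # An IRAP picture releases every reference picture currently in the DPB.
        return {poc: UNUSED for poc in dpb}
    # Pictures are identified by the Log2(MaxRefPicOrderCntLsb) LSBs of their POC.
    mask = max_ref_pic_order_cnt_lsb - 1
    referenced_lsbs = {
        poc & mask
        for entries in ref_pic_lists
        for poc in entries
        if poc is not None
    }
    result = {}
    for poc, marking in dpb.items():
        if marking == USED and (poc & mask) not in referenced_lsbs:
            result[poc] = UNUSED  # no entry in either list refers to it
        else:
            result[poc] = marking
    return result
```

A picture already marked as "not used for reference" keeps that marking, since each picture carries exactly one of the two markings at any time.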

Another alternative embodiment.

This section describes an alternative embodiment to the approach described above that is named "Always signaling reference picture lists in slice headers with differentiation between short-term and long-term reference pictures." In this alternative embodiment, a POC MSB cycle may be signaled in the slice header for each LTRP entry, similarly to HEVC or to the approach described above, and the following constraint is removed: at any time during the decoding process, the values of PicOrderCntVal & (MaxLtPicOrderCntLsb - 1) for any two reference pictures in the DPB shall not be the same.
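To illustrate why signaling a POC MSB cycle lets the constraint be dropped, the sketch below resolves the full picture order count of an LTRP from its LSBs plus an MSB cycle. The formula follows the HEVC-style long-term derivation, which is an assumption here since this section does not give the syntax; the function and parameter names are hypothetical.

```python
def resolve_ltrp_poc(cur_poc, poc_lsb_lt, delta_poc_msb_cycle_lt, max_lt_pic_order_cnt_lsb):
    """Full PicOrderCntVal of an LTRP from its LSBs and a signaled MSB cycle.

    HEVC-style derivation (assumed): subtract the signaled number of MSB
    cycles from the current picture's MSB part, then add the signaled LSBs.
    """
    cur_lsb = cur_poc & (max_lt_pic_order_cnt_lsb - 1)
    cur_msb = cur_poc - cur_lsb
    return cur_msb - delta_poc_msb_cycle_lt * max_lt_pic_order_cnt_lsb + poc_lsb_lt
```

For example, with MaxLtPicOrderCntLsb equal to 16 and a current picture order count of 37, pictures with picture order counts 5 and 21 share the LSB value 5, yet MSB cycles of 2 and 1 select them unambiguously.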

FIG. 6 is a schematic diagram of a video coding device 600 (e.g., a video encoder 20 or a video decoder 30) according to an embodiment of the present disclosure. The video coding device 600 is suitable for implementing the disclosed embodiments as described herein. The video coding device 600 comprises an ingress port 610 and a receiver unit (Rx) 620 for receiving data; a processor, logic unit, or central processing unit (CPU) 630 for processing the data; a transmitter unit (Tx) 640 and an egress port 650 for transmitting the data; and a memory 660 for storing the data. The video coding device 600 may also comprise optical-to-electrical (OE) components and electrical-to-optical (EO) components coupled to the ingress port 610, the receiver unit 620, the transmitter unit 640, and the egress port 650 for egress or ingress of optical or electrical signals.

The processor 630 is implemented by hardware and software. The processor 630 may be implemented as one or more CPU chips, cores (e.g., as a multi-core processor), field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and digital signal processors (DSPs). The processor 630 is in communication with the ingress port 610, the receiver unit 620, the transmitter unit 640, the egress port 650, and the memory 660. The processor 630 comprises a coding module 670. The coding module 670 implements the disclosed embodiments described above. For instance, the coding module 670 implements, processes, prepares, or provides the various networking functions. The inclusion of the coding module 670 therefore provides a substantial improvement to the functionality of the video coding device 600 and effects a transformation of the video coding device 600 to a different state. Alternatively, the coding module 670 is implemented as instructions stored in the memory 660 and executed by the processor 630.

The video coding device 600 may also include input and/or output (I/O) devices 680 for communicating data to and from a user. The I/O devices 680 may include output devices such as a display for displaying video data, speakers for outputting audio data, and so on. The I/O devices 680 may also include input devices such as a keyboard, mouse, or trackball, and/or corresponding interfaces for interacting with such output devices.

The memory 660 comprises one or more disks, tape drives, and solid-state drives, and may be used as an overflow data storage device, to store programs when such programs are selected for execution, and to store instructions and data that are read during program execution. The memory 660 may be volatile and/or non-volatile and may be read-only memory (ROM), random access memory (RAM), ternary content-addressable memory (TCAM), and/or static random-access memory (SRAM).

FIG. 7 is a schematic diagram of an embodiment of a means for coding 700. In an embodiment, the means for coding 700 is implemented in a video coding device 702 (e.g., a video encoder 20 or a video decoder 30). The video coding device 702 includes receiving means 701. The receiving means 701 is configured to receive a picture to encode or to receive a bitstream to decode. The video coding device 702 includes transmitting means 707 coupled to the receiving means 701. The transmitting means 707 is configured to transmit the bitstream to a decoder or to transmit a decoded image to a display means (e.g., one of the I/O devices 680).

The video coding device 702 includes storage means 703. The storage means 703 is coupled to at least one of the receiving means 701 or the transmitting means 707. The storage means 703 is configured to store instructions. The video coding device 702 also includes processing means 705. The processing means 705 is coupled to the storage means 703. The processing means 705 is configured to execute the instructions stored in the storage means 703 to perform the methods disclosed herein.

While several embodiments have been provided in the present disclosure, it should be understood that the disclosed systems and methods may be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system, or certain features may be omitted or not implemented.

In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled, directly coupled, or communicating with each other may be indirectly coupled or communicating through some interface, device, or intermediate component, whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and may be made without departing from the spirit and scope disclosed herein.

Claims

A method of decoding a coded video bitstream implemented by a video decoder, comprising:
Parsing a parameter set represented by the coded video bitstream, the parameter set comprising a set of syntax elements comprising a set of reference picture list structures;
Parsing a slice header of a current slice represented by the coded video bitstream, the slice header including an index of a reference picture list structure in a set of reference picture list structures in the parameter set;
Deriving a reference picture list of the current slice based on the set of syntax elements in the parameter set and the index of the reference picture list structure; And
Obtaining one or more reconstructed blocks of the current slice based on the reference picture list.

The method of claim 1,
The order of entries in the reference picture list structure is the same as the order of corresponding reference pictures in the reference picture list.

The method according to any one of claims 1 to 2,
The order of the entries is from 0 to the indicated value.

The method of claim 3,
The method, wherein the indicated value is from 0 to the value indicated by sps_max_dec_pic_buffering_minus1.

The method according to any one of claims 1 to 4,
The reference picture list is designated as RefPicList[0].

The method according to any one of claims 1 to 4,
The reference picture list is designated as RefPicList[1].

The method according to any one of claims 1 to 6,
Wherein the one or more reconstructed blocks are used to generate an image displayed on a display of an electronic device.

The method according to any one of claims 1 to 7,
The reference picture list includes a list of reference pictures used for inter prediction.

The method according to any one of claims 1 to 8,
Wherein the inter prediction is for a P slice or a B slice.

The method according to any one of claims 1 to 9,
Wherein the parameter set comprises a sequence parameter set (SPS).

The method according to any one of claims 1 to 10,
The set of syntax elements from the parameter set is placed in a Raw Byte Sequence Payload (RBSP) of a Network Abstraction Layer (NAL) unit.

The method according to any one of claims 7 to 11,
The reference picture list is designated as RefPicList[0] or RefPicList[1], and the order of entries in the reference picture list structure is the same as the order of corresponding reference pictures in the reference picture list.

The method according to any one of claims 1 to 12,
Parsing a parameter set represented by the coded video bitstream, the parameter set comprising a set of syntax elements comprising a set of reference picture list structures;
Obtaining a reference picture list structure represented by the coded video bitstream;
Deriving a first reference picture list of a current slice based on the reference picture list structure, wherein the first reference picture list includes one or more active entries and one or more inactive entries, the one or more inactive entries refer to reference pictures that are not used for inter prediction of the current slice but are referred to by an active entry in a second reference picture list, and the second reference picture list is a reference picture list of a slice that follows the current slice in decoding order or a reference picture list of a picture that follows the current picture in decoding order; And
The method further comprising obtaining one or more reconstructed blocks of the current slice based on one or more active entries of the first reference picture list.

As a decoding device,
A receiver configured to receive a coded video bitstream;
A memory coupled to the receiver and storing instructions; And
A processor coupled to the memory,
Wherein the processor is configured to execute the instructions stored in the memory to:
Parse a parameter set represented by the coded video bitstream, the parameter set comprising a set of syntax elements comprising a set of reference picture list structures;
Parse a slice header of a current slice represented by the coded video bitstream, the slice header including an index of a reference picture list structure in the set of reference picture list structures in the parameter set;
Derive a reference picture list of the current slice based on the set of syntax elements in the parameter set and the index of the reference picture list structure; And
Obtain one or more reconstructed blocks of the current slice based on the reference picture list.

The decoding device of claim 14,
Further comprising a display configured to display an image based on the one or more reconstructed blocks.

As a coding device,
A receiver configured to receive a bitstream to be decoded;
A transmitter coupled to the receiver and configured to transmit a decoded image to a display;
A memory coupled to at least one of the receiver or the transmitter and configured to store instructions; And
A processor coupled to the memory and configured to execute the instructions stored in the memory to perform the method of any one of claims 1 to 13.

As a system,
Encoder; And
A decoder in communication with the encoder,
Wherein the decoder comprises the decoding device or the coding device of any one of claims 14 to 16.

As a means for coding,
Receiving means configured to receive a bitstream to be decoded;
Transmitting means coupled to the receiving means and configured to transmit the decoded image to the display means;
Storage means coupled to at least one of said receiving means or said transmitting means and configured to store an instruction; And
Processing means coupled to said storage means and configured to execute the instructions stored in said storage means to perform the method of any one of claims 1 to 13.