KR20230127982A

KR20230127982A - Cross Random Access Point Sample Group

Info

Publication number: KR20230127982A
Application number: KR1020237017496A
Authority: KR
Inventors: Ye-Kui Wang; Yang Wang; Li Zhang
Original assignee: Beijing Bytedance Network Technology Co., Ltd.; Bytedance Inc.
Priority date: 2020-12-28
Filing date: 2021-12-28
Publication date: 2023-09-01
Also published as: KR20230127983A; EP4252425A1; EP4252424A1; CN116746150A; JP2024500549A; WO2022143616A1; CN116724549A; JP2024501329A; WO2022143615A1; EP4252420A1; US20230345007A1; CN116711314A; WO2022143614A1; KR20230127981A; US20230345053A1; US20230345032A1; JP2024500550A

Abstract

A mechanism for processing video data is disclosed. A description of cross random access point referencing (CRR) samples is signaled in a visual media data file conforming to the International Organization for Standardization (ISO) base media file format (ISOBMFF). A conversion between visual media data and the visual media data file is performed based on the CRR sample group.

Description

Cross Random Access Point Sample Group

CROSS-REFERENCE TO RELATED APPLICATIONS

This patent application claims the priority of International Application No. PCT/CN2020/139893, filed on December 28, 2020 by Ye-Kui Wang et al. and entitled "Signaling of Cross Random Access Point References in Video Bitstreams and Media Files". The entire disclosure of that international application is incorporated by reference as part of the disclosure of this application.

This patent document relates to the generation, storage, and consumption of digital audio video media information in a file format.

Digital video accounts for the largest bandwidth use on the Internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, the bandwidth demand for digital video usage is expected to continue to grow.

A first aspect relates to a method for processing video data, comprising: determining a description of cross random access point referencing (CRR) samples in a visual media data file in the International Organization for Standardization (ISO) base media file format (ISOBMFF); and performing a conversion between the visual media data file and visual media data based on the CRR sample group.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the description of the CRR samples is included in a CRR sample group.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the description of the CRR samples is included in a dependent random access point (DRAP) sample group.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the description of the CRR samples is included in a type 2 DRAP sample group.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the description of the CRR samples is included in an enhanced dependent random access point (EDRAP) sample group.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the description of the CRR samples is included in a sample to group box (SampleToGroupBox).

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the description of the CRR samples is included in a compact sample to group box (CompactSampleToGroupBox).

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the description of the CRR samples is included in a group type parameter (group_type_parameter).
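As an illustration of how a SampleToGroupBox can carry sample-group membership, the following minimal Python sketch serializes the box using the ISOBMFF FullBox layout (version, flags, grouping type, optional grouping type parameter, then runs of sample_count and group_description_index). The four-character code `crr ` and the helper names are hypothetical placeholders chosen for this sketch; they are not taken from this disclosure or from any file format specification.

```python
import struct

# Hypothetical placeholder four-character code; the real grouping type for
# CRR samples is not stated in this part of the document.
CRR_GROUPING_TYPE = b"crr "


def build_sbgp(grouping_type, runs, grouping_type_parameter=None):
    """Serialize a SampleToGroupBox ('sbgp').

    runs is a list of (sample_count, group_description_index) pairs. Index 0
    means the samples in that run belong to no group of this grouping type.
    """
    if len(grouping_type) != 4:
        raise ValueError("grouping_type must be a four-character code")
    version = 0 if grouping_type_parameter is None else 1
    payload = struct.pack(">B3s", version, b"\x00\x00\x00")  # version + flags
    payload += grouping_type
    if version == 1:
        payload += struct.pack(">I", grouping_type_parameter)
    payload += struct.pack(">I", len(runs))  # entry_count
    for sample_count, group_description_index in runs:
        payload += struct.pack(">II", sample_count, group_description_index)
    # Box header: 32-bit size (including the header itself) and box type.
    return struct.pack(">I4s", 8 + len(payload), b"sbgp") + payload


def samples_in_group(runs):
    """Return the 1-based sample numbers mapped to a non-zero group index."""
    members, sample_number = [], 1
    for sample_count, group_description_index in runs:
        if group_description_index != 0:
            members.extend(range(sample_number, sample_number + sample_count))
        sample_number += sample_count
    return members


# Example: in a 17-sample track, samples 9 and 17 are marked as group members.
runs = [(8, 0), (1, 1), (7, 0), (1, 1)]
box = build_sbgp(CRR_GROUPING_TYPE, runs)
members = samples_in_group(runs)
```

The run-length form keeps the box small when group members are sparse, which is the usual situation for random access samples.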

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the CRR samples are denoted as type 2 DRAP samples.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the CRR samples are denoted as enhanced dependent random access point (EDRAP) samples.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that each sample comprises a picture.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the description of the CRR samples includes one or more sample identifiers for identifying the samples that belong to a sample group.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the description of the CRR samples includes identifiers of reference pictures for the CRR samples.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the description of the CRR samples includes the number of samples required as references to decode a current sample.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the description of the CRR samples is included in a sample entry in a sample group.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that a current sample is one of the CRR samples when the current sample references only the nearest preceding initial sample, one or more CRR samples that precede the current sample in decoding order, or a combination thereof.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that a current sample is one of the CRR samples when, with decoding started at the current sample, the current sample and all samples that follow the current sample in both decoding order and output order can be correctly decoded.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the current sample and all samples that follow the current sample are correctly decoded after the nearest preceding initial sample, one or more CRR samples that precede the current sample in decoding order, or a combination thereof have been decoded.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the conversion comprises generating the visual media data file according to the visual media data.

Optionally, in any of the preceding aspects, another implementation of the aspect provides that the conversion comprises parsing the visual media data file to obtain the visual media data.
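The CRR sample condition stated in the aspects above (a CRR sample references only the nearest preceding initial sample and/or earlier CRR samples) can be sketched as a small dependency walk. The following Python sketch is illustrative only: the `Sample` class, its `kind` labels, and the function name are hypothetical, and it assumes that a referenced CRR sample needs its own references decoded as well. It does not check that a referenced initial sample is the nearest preceding one.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    number: int  # 1-based position in decoding order
    kind: str  # "initial", "crr", or "other" (hypothetical labels)
    refs: list = field(default_factory=list)  # referenced sample numbers


def samples_needed_for_random_access(samples, target_number):
    """List, in decoding order, the samples to decode so that decoding can
    start at the CRR sample numbered target_number."""
    by_number = {s.number: s for s in samples}
    target = by_number[target_number]
    if target.kind != "crr":
        raise ValueError("random access target must be a CRR sample")

    needed = set()
    pending = [(target_number, r) for r in target.refs]
    while pending:
        user, ref_number = pending.pop()
        ref = by_number[ref_number]
        # A CRR sample may reference only earlier initial or CRR samples.
        if ref_number >= user or ref.kind not in ("initial", "crr"):
            raise ValueError(f"sample {user} has invalid reference {ref_number}")
        if ref_number not in needed:
            needed.add(ref_number)
            # Assumption: a referenced CRR sample needs its own references too.
            pending.extend((ref_number, r) for r in ref.refs)
    return sorted(needed) + [target_number]


# Example track: sample 1 is the initial sample, samples 9 and 17 are CRR
# samples, and every other sample references its immediate predecessor.
samples = (
    [Sample(1, "initial")]
    + [Sample(n, "other", [n - 1]) for n in range(2, 9)]
    + [Sample(9, "crr", [1])]
    + [Sample(n, "other", [n - 1]) for n in range(10, 17)]
    + [Sample(17, "crr", [1, 9])]
)
needed_for_17 = samples_needed_for_random_access(samples, 17)
```

In this example, random access at sample 17 requires decoding only samples 1 and 9 first, rather than all sixteen preceding samples.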

A second aspect relates to an apparatus for processing video data, comprising: a processor; and a non-transitory memory with instructions thereon, wherein the instructions, upon execution by the processor, cause the processor to perform the method of any of the preceding aspects.

A third aspect relates to a non-transitory computer readable medium comprising a computer program product for use by a video coding device, the computer program product comprising computer executable instructions stored on the non-transitory computer readable medium such that, when executed by a processor, the instructions cause the video coding device to perform the method of any of the preceding aspects.

For the purpose of clarity, any one of the foregoing embodiments may be combined with any one or more of the other foregoing embodiments to create a new embodiment within the scope of the present disclosure.

These and other features will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings and claims.

For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.
FIG. 1 is a schematic diagram of an example mechanism for random access when decoding a bitstream using IRAP pictures.
FIG. 2 is a schematic diagram of an example mechanism for random access when decoding a bitstream using DRAP pictures.
FIG. 3 is a schematic diagram of an example mechanism for random access when decoding a bitstream using CRR pictures.
FIG. 4 is a schematic diagram of an example mechanism for signaling an external bitstream to support CRR-based random access.
FIG. 5 is a schematic diagram showing a potential decoding error when a picture follows a DRAP and/or CRR picture in decoding order and precedes the DRAP and/or CRR picture in output order.
FIG. 6 is a schematic diagram of a media file stored in the International Organization for Standardization (ISO) base media file format (ISOBMFF).
FIG. 7 is a schematic diagram of a bitstream containing encoded visual media data.
FIG. 8 is a schematic diagram showing an example video processing system.
FIG. 9 is a schematic diagram showing an example video processing apparatus.
FIG. 10 is a flowchart for an example method of video processing.
FIG. 11 is a schematic diagram illustrating an example video coding system.
FIG. 12 is a schematic diagram illustrating an example encoder.
FIG. 13 is a schematic diagram illustrating an example decoder.
FIG. 14 is a schematic diagram of an example encoder.

It should be understood at the outset that, although an illustrative implementation of one or more embodiments is provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or yet to be developed. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the example designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.

Versatile Video Coding (VVC) terminology, where VVC is also known as H.266, is used in some descriptions only for ease of understanding and not to limit the scope of the disclosed techniques. As such, the techniques described herein are applicable to other video codec protocols and designs as well. In this document, editing changes relative to the VVC specification or the International Organization for Standardization (ISO) base media file format (ISOBMFF) specification are shown with deleted text in bold italics and added text underlined.

This patent document relates to video coding, video file formats, video signaling, and video applications. Specifically, this document relates to enhanced signaling of cross random access point (RAP) referencing in video coding based on supplemental enhancement information (SEI) messages, and to signaling of cross RAP referencing (CRR) in media files. The disclosed examples may be applied individually or in various combinations to any video coding standard or non-standard video codec, such as VVC, and to media files according to any media file format, such as ISOBMFF.

This disclosure uses the following abbreviations: adaptive color transform (ACT); adaptive loop filter (ALF); adaptive motion vector resolution (AMVR); adaptation parameter set (APS); access unit (AU); access unit delimiter (AUD); advanced video coding (AVC), also known as Rec. ITU-T H.264 | ISO/IEC 14496-10; bi-predictive (B); bi-prediction with coding unit (CU)-level weights (BCW); bi-directional optical flow (BDOF); block-based delta pulse code modulation (BDPCM); buffering period (BP); context-based adaptive binary arithmetic coding (CABAC); coding block (CB); constant bit rate (CBR); cross-component adaptive loop filter (CCALF); coded layer video sequence (CLVS); coded layer video sequence start (CLVSS); coded picture buffer (CPB); clean random access (CRA); cyclic redundancy check (CRC); cross RAP referencing (CRR); coding tree block (CTB); coding tree unit (CTU); coding unit (CU); coded video sequence (CVS); coded video sequence start (CVSS); decoding capability information (DCI); decoding initialization information (DII); decoded picture buffer (DPB); dependent random access point (DRAP); decoding unit (DU); decoding unit information (DUI); exponential-Golomb (EG); k-th order exponential-Golomb (EGk); end of bitstream (EOB); end of sequence (EOS); filler data (FD); first-in, first-out (FIFO); fixed-length (FL); green, blue, and red (GBR); general constraints information (GCI); gradual decoding refresh (GDR); geometric partitioning mode (GPM); high efficiency video coding (HEVC), also known as Rec. ITU-T H.265 | ISO/IEC 23008-2; hypothetical reference decoder (HRD); hypothetical stream scheduler (HSS); intra (I); intra block copy (IBC); instantaneous decoding refresh (IDR); inter layer reference picture (ILRP); intra random access point (IRAP); low frequency non-separable transform (LFNST); least probable symbol (LPS); least significant bit (LSB); long-term reference picture (LTRP); luma mapping with chroma scaling (LMCS); matrix-based intra prediction (MIP); most probable symbol (MPS); most significant bit (MSB); multiple transform selection (MTS); motion vector prediction (MVP); network abstraction layer (NAL); output layer set (OLS); operation point (OP); operating point information (OPI); predictive (P); picture header (PH); picture order count (POC); picture parameter set (PPS); prediction refinement with optical flow (PROF); picture timing (PT); picture unit (PU); quantization parameter (QP); random access decodable leading picture (RADL); random access point (RAP); random access skipped leading picture (RASL); raw byte sequence payload (RBSP); red, green, and blue (RGB); reference picture list (RPL); sample adaptive offset (SAO); sample aspect ratio (SAR); supplemental enhancement information (SEI); slice header (SH); subpicture level information (SLI); string of data bits (SODB); sequence parameter set (SPS); short-term reference picture (STRP); step-wise temporal sublayer access (STSA); truncated Rice (TR); transform unit (TU); variable bit rate (VBR); video coding layer (VCL); video parameter set (VPS); versatile supplemental enhancement information (VSEI), also known as Rec. ITU-T H.274 | ISO/IEC 23002-7; video usability information (VUI); and versatile video coding (VVC), also known as Rec. ITU-T H.266 | ISO/IEC 23090-3.

Video coding standards have evolved primarily through the development of the International Telecommunication Union (ITU) Telecommunication Standardization Sector (ITU-T) and International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) standards. The ITU-T produced H.261 and H.263, ISO/IEC produced Moving Picture Experts Group (MPEG)-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC), and H.265/High Efficiency Video Coding (HEVC) standards. Since H.262, the video coding standards have been based on a hybrid video coding structure in which temporal prediction plus transform coding are utilized. To explore further video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded jointly by the Video Coding Experts Group (VCEG) and MPEG. Many methods were adopted by JVET and put into the reference software named the Joint Exploration Model (JEM). The JVET was later renamed the Joint Video Experts Team (JVET) when the Versatile Video Coding (VVC) project officially started. VVC is a coding standard targeting a 50% bitrate reduction compared to HEVC. VVC has been finalized by the JVET.

The VVC and VSEI standards are designed for use in a broad range of applications, such as television broadcast, video conferencing, playback from storage media, adaptive bit rate streaming, video region extraction, composition and merging of content from multiple coded video bitstreams, multiview video, scalable layered coding, and viewport-adaptive three hundred sixty degree (360°) immersive media.

The Essential Video Coding (EVC) standard (ISO/IEC 23094-1) is another video coding standard developed by MPEG.

File format standards are discussed below. Media streaming applications are typically based on the Internet Protocol (IP), Transmission Control Protocol (TCP), and Hypertext Transfer Protocol (HTTP) transport methods, and typically rely on a file format such as ISOBMFF. One such streaming system is dynamic adaptive streaming over HTTP (DASH). Video can be encoded in a video format, such as AVC and/or HEVC. The encoded video can be encapsulated in ISOBMFF tracks and included in DASH representations and segments. Important information about the video bitstreams, such as the profile, tier, and level, may be exposed as file format level metadata and/or in a DASH media presentation description (MPD) for content selection purposes. For example, such information may be used for selection of appropriate media segments both for initialization at the beginning of a streaming session and for stream adaptation during the streaming session.

Similarly, when using an image format with ISOBMFF, a file format specification specific to the image format, such as the AVC image file format and the HEVC image file format, may be employed. The VVC video file format, which is the file format for storage of VVC video content based on ISOBMFF, is under development by MPEG. The VVC image file format, which is the file format for storage of image content coded using VVC based on ISOBMFF, is also under development by MPEG.

Support for random access in HEVC and VVC is discussed below. Random access refers to starting access and decoding of a bitstream from a picture that is not the first picture of the bitstream in decoding order. To support tuning in and channel switching in broadcast/multicast and multiparty video conferencing, seeking in local playback and streaming, and stream adaptation in streaming, the bitstream should include random access points. Such random access points are typically intra coded pictures, but may also be inter coded pictures (for example, in the case of gradual decoding refresh). Intra coded pictures are pictures coded by reference to blocks within the same picture, and inter coded pictures are pictures coded by reference to blocks in other pictures.
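As a simple illustration of how a player might use random access points when seeking, the following Python sketch finds the closest random access point at or before a requested sample and lists the samples that would then be decoded to reach the target. The function names and the sample numbering are hypothetical and serve only this example.

```python
from bisect import bisect_right


def nearest_random_access_point(rap_sample_numbers, target):
    """Return the last random access point at or before target.

    rap_sample_numbers must be sorted in ascending decoding order.
    """
    index = bisect_right(rap_sample_numbers, target)
    if index == 0:
        raise ValueError("no random access point at or before the target")
    return rap_sample_numbers[index - 1]


def samples_to_decode_for_seek(rap_sample_numbers, target):
    """Decode from the chosen random access point up to the target sample."""
    start = nearest_random_access_point(rap_sample_numbers, target)
    return list(range(start, target + 1))


# Example: random access points every 24 samples; seek to sample 60.
raps = [1, 25, 49, 73]
start = nearest_random_access_point(raps, 60)
to_decode = samples_to_decode_for_seek(raps, 60)
```

The sparser the random access points, the more samples must be decoded before the target can be displayed, which is the trade-off that motivates cheaper random access point types.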

HEVC includes signaling of intra random access point (IRAP) pictures in the NAL unit header through NAL unit types. HEVC supports three types of IRAP pictures, namely instantaneous decoder refresh (IDR), clean random access (CRA), and broken link access (BLA) pictures. IDR pictures constrain the inter-picture prediction structure so that no picture before the current group of pictures (GOP) is referenced, and are also known as closed-GOP random access points. CRA pictures are less restrictive in that they allow certain pictures to reference pictures before the current GOP, all of which are discarded in the case of a random access. CRA pictures are referred to as open-GOP random access points. BLA pictures usually originate from the splicing of two bitstreams, or parts thereof, at a CRA picture, for example during stream switching. To enable better systems usage of IRAP pictures, a total of six different NAL unit types are defined to signal the properties of the IRAP pictures, which can be used to better match the stream access point types defined in ISOBMFF. These stream access point types are also utilized for random access support in dynamic adaptive streaming over HTTP (DASH).
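The NAL unit type signaling described above can be illustrated with a minimal Python sketch that parses the two-byte HEVC NAL unit header and classifies IRAP pictures. It assumes the header layout and NAL unit type values of the published HEVC specification (one forbidden bit, a six-bit type, a six-bit layer identifier, and a three-bit temporal identifier plus one); the function name is hypothetical.

```python
# The six HEVC IRAP NAL unit types, keyed by nal_unit_type value.
HEVC_IRAP_TYPES = {
    16: "BLA_W_LP",
    17: "BLA_W_RADL",
    18: "BLA_N_LP",
    19: "IDR_W_RADL",
    20: "IDR_N_LP",
    21: "CRA_NUT",
}


def parse_hevc_nal_unit_header(header):
    """Parse the two-byte HEVC NAL unit header into its fields."""
    if len(header) < 2:
        raise ValueError("an HEVC NAL unit header is two bytes long")
    value = int.from_bytes(header[:2], "big")
    nal_unit_type = (value >> 9) & 0x3F  # 6 bits
    return {
        "forbidden_zero_bit": value >> 15,
        "nal_unit_type": nal_unit_type,
        "nuh_layer_id": (value >> 3) & 0x3F,  # 6 bits
        "nuh_temporal_id_plus1": value & 0x07,  # 3 bits
        "irap_name": HEVC_IRAP_TYPES.get(nal_unit_type),  # None if not IRAP
    }


# Example: 0x26 0x01 carries nal_unit_type 19 (IDR_W_RADL) in layer 0.
idr = parse_hevc_nal_unit_header(bytes([0x26, 0x01]))
cra = parse_hevc_nal_unit_header(bytes([0x2A, 0x01]))
trail = parse_hevc_nal_unit_header(bytes([0x02, 0x01]))
```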

VVC supports three types of IRAP pictures: two types of IDR pictures (one type with and the other type without associated RADL pictures) and one type of CRA picture. These are used in a similar manner as in HEVC. The BLA picture types in HEVC are not included in VVC. This is because the basic functionality of BLA pictures can be realized by CRA pictures plus the end of sequence (EOS) NAL unit, the presence of which indicates that the subsequent picture starts a new CVS in a single-layer bitstream. Further, there was a desire during the development of VVC to specify fewer NAL unit types than in HEVC, as indicated by the use of five bits instead of six bits for the NAL unit type field in the NAL unit header.
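The five-bit NAL unit type field mentioned above can be illustrated with a minimal Python sketch that parses the two-byte VVC NAL unit header. It assumes the header layout and NAL unit type values of the published VVC specification (one forbidden bit, one reserved bit, a six-bit layer identifier, a five-bit type, and a three-bit temporal identifier plus one); the function name is hypothetical.

```python
# VVC IRAP NAL unit types, keyed by nal_unit_type value.
VVC_IRAP_TYPES = {7: "IDR_W_RADL", 8: "IDR_N_LP", 9: "CRA_NUT"}
VVC_GDR_NUT = 10  # GDR pictures have their own NAL unit type in VVC


def parse_vvc_nal_unit_header(header):
    """Parse the two-byte VVC NAL unit header into its fields."""
    if len(header) < 2:
        raise ValueError("a VVC NAL unit header is two bytes long")
    value = int.from_bytes(header[:2], "big")
    nal_unit_type = (value >> 3) & 0x1F  # 5 bits, versus 6 bits in HEVC
    return {
        "forbidden_zero_bit": value >> 15,
        "nuh_reserved_zero_bit": (value >> 14) & 0x01,
        "nuh_layer_id": (value >> 8) & 0x3F,  # 6 bits
        "nal_unit_type": nal_unit_type,
        "nuh_temporal_id_plus1": value & 0x07,  # 3 bits
        "irap_name": VVC_IRAP_TYPES.get(nal_unit_type),  # None if not IRAP
        "is_gdr": nal_unit_type == VVC_GDR_NUT,
    }


# Example: second byte 0x49 is (9 << 3) | 1, a CRA picture with temporal ID 0.
vvc_cra = parse_vvc_nal_unit_header(bytes([0x00, 0x49]))
vvc_gdr = parse_vvc_nal_unit_header(bytes([0x00, 0x51]))
```

Because the type field is five bits wide, at most 32 NAL unit types can be signaled, compared with 64 for the six-bit HEVC field.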

Another difference in random access support between VVC and HEVC is that VVC supports gradual decoding refresh (GDR) in a more normative manner. In GDR, the decoding of a bitstream can start from an inter-coded picture. At the first picture of the random access point, only part of the picture can be correctly decoded. However, after a number of pictures, the entire picture region can be correctly decoded and displayed. AVC and HEVC also support GDR, using the recovery point SEI message for signaling of GDR random access points and recovery points. In VVC, a NAL unit type is specified for indication of GDR pictures, and the recovery point is signaled in the picture header syntax structure. A CVS and a bitstream are allowed to start with a GDR picture. Hence, an entire bitstream is allowed to contain only inter-coded pictures, without a single intra-coded picture. The main benefit of specifying GDR support this way is to provide conforming behavior for GDR. GDR enables encoders to smooth the bit rate of a bitstream by distributing intra-coded slices or blocks (which are less compressed than inter-coded slices or blocks) over multiple pictures, as opposed to intra coding entire pictures. This allows a significant reduction in end-to-end delay, which is considered more important than before as ultra-low-delay applications such as wireless display, online gaming, and drone-based applications become more popular.

Another GDR-related feature in VVC is virtual boundary signaling. The boundary between the refreshed region (i.e., the region correctly decoded in GDR) and the unrefreshed region in a picture between a GDR picture and its recovery point can be signaled as a virtual boundary. When signaled, in-loop filtering across the boundary is not applied. This prevents a decoding mismatch for some samples at or near the boundary. This can be useful when the application decides to display the correctly decoded regions during the GDR process. IRAP pictures and GDR pictures may be collectively referred to as random access point (RAP) pictures.

Video usability information (VUI) and supplemental enhancement information (SEI) messages are discussed below. VUI is a syntax structure sent as part of the SPS (and possibly also in the VPS in HEVC). VUI carries information that does not affect the normative decoding process but that can be used for proper rendering of the coded video. SEI assists in processes related to decoding, display, or other purposes. Like VUI, SEI does not affect the normative decoding process. SEI is carried in SEI messages. Decoder support of SEI messages is optional. However, SEI messages do affect bitstream conformance. For example, if the syntax of an SEI message in a bitstream does not follow the specification, the bitstream is not conforming. Some SEI messages are used in the HRD specification.

The VUI syntax structure and most SEI messages used with VVC are specified in the VSEI specification rather than in the VVC specification. The SEI messages needed for HRD conformance testing are specified in the VVC specification. VVC defines five SEI messages relevant to HRD conformance testing, and VSEI specifies 20 additional SEI messages. The SEI messages carried in the VSEI specification do not directly impact conforming decoder behavior and are defined so that they can be used in a coding-format-agnostic manner, allowing VSEI to be used in the future with video coding standards other than VVC. Rather than referring specifically to VVC syntax element names, the VSEI specification refers to variables whose values are set within the VVC specification.

Compared to HEVC, the VUI syntax structure of VVC focuses only on information relevant to proper rendering of the pictures and does not contain any timing information or bitstream restriction indications. In VVC, the VUI is signaled within the SPS, which includes a length field before the VUI syntax structure to signal the length of the VUI payload in bytes. This allows a decoder to skip over the information easily, and enables VUI syntax extensions by adding new syntax elements directly at the end of the VUI syntax structure, in a manner similar to SEI message syntax extension.

The VUI syntax structure contains the following information: whether the content is interlaced or progressive; an indication of whether the content contains frame-packed stereoscopic video or projected omnidirectional video; the sample aspect ratio; an indication of whether the content is appropriate for overscan display; a color description, including color primaries, matrix, and transfer characteristics, which supports signaling of ultra high definition (UHD) versus high definition (HD) color space as well as high dynamic range (HDR); and an indication of chroma location relative to luma (for which the signaling is clarified for progressive content compared to HEVC).

When the SPS does not contain any VUI, the information is considered unspecified and is conveyed through external means, or is specified by the application if the content of the bitstream is intended for rendering on a display.

Table 1 lists the SEI messages specified for VVC, as well as the specification containing their syntax and semantics. Of the 20 SEI messages specified in the VSEI specification, many were inherited from HEVC (for example, the filler payload and both user data SEI messages). Some SEI messages are used for correct processing or rendering of the coded video content. This is the case, for example, for the mastering display color volume, content light level information, and/or alternative transfer characteristics SEI messages, which are particularly relevant for HDR content. Other examples include the equirectangular projection, sphere rotation, region-wise packing, and/or omnidirectional viewport SEI messages, which are relevant for signaling and processing of 360° video content.

Table 1. List of SEI messages in VVC v1

Name of SEI message | Purpose of SEI message

SEI messages specified in the VVC specification
Buffering period | Initial CPB removal delays for HRD
Picture timing | CPB removal delays and DPB output delays for HRD
Decoding unit information | CPB removal delays and DPB output delays for DU-based HRD
Scalable nesting | Mechanism to associate SEI messages with specific output layer sets, layers, or sets of subpictures
Subpicture level information | Information about levels for subpicture sequences

SEI messages specified in the VSEI specification
Filler payload | Filler data for adjusting the bit rate
User data registered by Rec. ITU-T T.35; User data unregistered | Convey user data; can be used as a container for data by other organizations
Film grain characteristics | Model for film grain synthesis
Frame packing arrangement | Information about how stereoscopic video is coded in the bitstream, e.g., by packing the two pictures of the two views for each time instance into one picture
Parameter sets inclusion indication | Indication of whether the sequence contains all the NAL units required for decoding
Decoded picture hash | Hash of the decoded pictures, for error detection
Mastering display color volume | Description of the color volume of the display used to author the content
Content light level information | Upper bounds for the nominal target brightness light level of the content
Dependent RAP indication | Indicates a picture that uses only the preceding IRAP picture for inter prediction referencing
Alternative transfer characteristics | Preferred alternative value for the transfer characteristics of the content
Ambient viewing environment | Characteristics of the nominal ambient viewing environment for the display of the content; can be used to assist the receiver in processing content depending on the local viewing environment
Content color volume | Color volume characteristics of the associated picture
Equirectangular projection; Generalized cubemap projection | Indication of the projection format applied, including information needed for remapping of the content onto a sphere for rendering in omnidirectional video applications
Sphere rotation | Information on rotation angles for conversion between the global and local coordinate axes, for use in omnidirectional video applications
Region-wise packing | Information needed for remapping of the cropped decoded pictures, involving region-wise operations such as repositioning, resizing, and rotation, onto projected pictures, for use in omnidirectional video applications
Omnidirectional viewport | Coordinates of one or more regions corresponding to viewports recommended for display, for use in omnidirectional video applications
Frame field information | Indicates how the associated picture should be displayed, its source scan, and whether it is a duplicate of a previous picture
Sample aspect ratio information | Information about the sample aspect ratio of the associated picture

The SEI messages specified for VVC v1 include the frame field information SEI message, the sample aspect ratio information SEI message, and the subpicture level information SEI message. The frame field information SEI message contains information indicating how the associated picture should be displayed (such as field parity or frame repetition period), the source scan type of the associated picture, and whether the associated picture is a duplicate of a previous picture. In previous video coding standards, this information was signaled in the picture timing SEI message, together with the timing information of the associated picture. However, frame field information and timing information are two different kinds of information that do not necessarily need to be signaled together. In a typical example, the timing information is signaled at the systems level while the frame field information is signaled within the bitstream. Therefore, the frame field information is removed from the picture timing SEI message and signaled within a dedicated SEI message instead. This change also supports modifying the syntax of the frame field information to convey additional and clearer instructions to the display, such as the pairing of fields together or more values for frame repetition.

The sample aspect ratio SEI message enables signaling of different sample aspect ratios for different pictures within the same sequence, whereas the corresponding information contained in the VUI applies to the whole sequence. This may be relevant when the reference picture resampling feature is used with scaling factors that cause different pictures of the same sequence to have different sample aspect ratios.

The subpicture level information SEI message provides information about the levels for subpicture sequences.

The DRAP indication SEI message is discussed below. The VSEI specification includes the DRAP indication SEI message, specified as follows.

The picture associated with a dependent random access point (DRAP) indication SEI message is referred to as a DRAP picture. The presence of the DRAP indication SEI message indicates that the constraints on picture order and picture referencing specified in this clause apply. These constraints enable a decoder to properly decode the DRAP picture and the pictures that follow it in both decoding order and output order, without needing to decode any other pictures except the IRAP picture associated with the DRAP picture.

The constraints indicated by the presence of the DRAP indication SEI message, all of which shall apply, are as follows. The DRAP picture is a trailing picture. The DRAP picture has a temporal sublayer identifier equal to 0. The DRAP picture does not include any pictures in the active entries of its reference picture lists except the IRAP picture associated with the DRAP picture. Any picture that follows the DRAP picture in both decoding order and output order does not include, in the active entries of its reference picture lists, any picture that precedes the DRAP picture in decoding order or output order, with the exception of the IRAP picture associated with the DRAP picture.

DRAP signaling in media files is discussed below. ISOBMFF includes a signaling mechanism for DRAP based on sample groups, as follows. The DRAP sample group is defined as follows. A DRAP sample is a sample after which all samples in decoding order can be correctly decoded if the closest initial sample preceding the DRAP sample is available for reference. The initial sample is a stream access point (SAP) sample of SAP type 1, 2, or 3 that is marked as such either by being a sync sample or by the SAP sample group. For example, if the 32nd sample in a file is an initial sample containing an I-picture, the 48th sample may contain a P-picture and be marked as a member of the dependent random access point sample group. This indicates that random access can be performed at the 48th sample by first decoding the 32nd sample (ignoring samples 33 to 47) and then continuing to decode from the 48th sample.
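The 32nd/48th sample example can be expressed as a minimal sketch. The function name and its arguments are hypothetical; the sketch assumes that the file reader has already collected the sample numbers of the initial samples (sync samples or SAP samples of type 1, 2, or 3) in ascending order.

```python
from bisect import bisect_left
from typing import List


def samples_to_decode(drap_sample: int,
                      initial_samples: List[int],
                      last_sample: int) -> List[int]:
    """Return the sample numbers to decode for random access at a DRAP sample.

    initial_samples must be sorted in ascending order. The closest initial
    sample strictly preceding the DRAP sample is decoded first, the samples
    between it and the DRAP sample are skipped, and decoding then continues
    from the DRAP sample onward.
    """
    idx = bisect_left(initial_samples, drap_sample) - 1
    if idx < 0:
        raise ValueError("no initial sample precedes the DRAP sample")
    nearest_initial = initial_samples[idx]
    return [nearest_initial] + list(range(drap_sample, last_sample + 1))
```

For initial samples at 1, 32, and 64 and a DRAP sample at 48, the function selects sample 32 followed by samples 48 onward, which matches the example above.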

A sample can be a member of the dependent random access point sample group (and hence be called a DRAP sample) only if the following conditions are true. The DRAP sample references only the closest preceding initial sample. The DRAP sample and all samples following the DRAP sample in output order can be correctly decoded when decoding starts at the DRAP sample, after the closest preceding SAP sample of type 1, 2, or 3 has been decoded; that SAP sample can be marked as such either by being a sync sample or by the SAP sample group. DRAP samples can only be used in combination with SAP samples of type 1, 2, and 3. This is to enable the functionality of creating a decodable sequence of samples by concatenating the preceding SAP sample with the DRAP sample and the samples following the DRAP sample in output order. An example syntax of the DRAP sample group is as follows.

Example semantics for the DRAP sample group are as follows. DRAP_type is a non-negative integer. When DRAP_type is in the range of 1 to 3, it indicates the SAP_type (as specified in Annex I) that the DRAP sample would have corresponded to, had it not depended on the closest preceding SAP. Other type values are reserved. reserved shall be equal to 0. The semantics of this subclause apply only to sample group description entries with reserved equal to 0. Parsers shall allow and ignore sample group description entries with reserved greater than 0 when parsing this sample group.
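The semantics above can be sketched as a small interpretation step applied to an already parsed sample group description entry. The function name is hypothetical, and the sketch deliberately takes the two field values as integers rather than assuming any particular bit layout for the entry.

```python
from typing import Optional


def equivalent_sap_type(drap_type: int, reserved: int) -> Optional[int]:
    """Interpret a DRAP sample group description entry.

    Returns the SAP type (1, 2, or 3) that the DRAP sample would have
    corresponded to, or None when the entry carries no such indication:
    either reserved is greater than 0 (the entry is accepted but ignored)
    or DRAP_type is a reserved value outside the range 1 to 3.
    """
    if drap_type < 0 or reserved < 0:
        raise ValueError("DRAP_type and reserved are non-negative integers")
    if reserved > 0:
        return None  # allowed, but ignored by this parser
    if 1 <= drap_type <= 3:
        return drap_type
    return None  # reserved DRAP_type value
```

Returning None for both ignored entries and reserved type values is a simplification of this sketch; a real parser might distinguish the two cases.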

A video coding approach based on cross RAP referencing, also referred to as external decoding refresh (EDR) and/or type 2 DRAP, is discussed below. The basic idea of this video coding approach is as follows. Instead of coding random access points as intra-coded IRAP pictures (except for the very first picture in the bitstream), random access points are coded using inter prediction, to circumvent the unavailability of earlier pictures that arises when random access points are coded as IRAP pictures. The mechanism provides a limited number of earlier pictures, typically representing different scenes of the video content, through a separate video bitstream, which may be referred to as an external stream and/or external means. Such earlier pictures are referred to as external pictures. Consequently, each external picture can be used for inter prediction referencing by pictures across the random access points. The coding efficiency gain comes from having random access points coded as inter-predicted pictures and from having more available reference pictures for pictures that follow EDR pictures in decoding order. A bitstream coded with such a video coding approach can be used in applications based on ISOBMFF and DASH, as described below.

DASH content preparation operations are discussed below. Video content is encoded into one or more representations, each of a particular spatial resolution, temporal resolution, and quality. Each representation of the video content is represented by a main stream and possibly also an external stream. The main stream contains coded pictures that may or may not include EDR pictures. When at least one EDR picture is included in the main stream, the external stream is also present and contains external pictures. When no EDR picture is included in the main stream, the external stream is not present. Each main stream is carried in a Main Stream Representation (MSR). Each EDR picture in an MSR is the first picture of a segment.

Each external stream, when present, is carried in an External Stream Representation (ESR). For each segment in an MSR that starts with an EDR picture, there is a segment in the corresponding ESR, with the same segment start time derived from the MPD, that carries the external pictures needed for decoding that EDR picture and the subsequent pictures in decoding order in the bitstream carried in the MSR. The MSRs of the same video content are included in one Adaptation Set (AS). The ESRs of the same video content are included in one AS.

DASH streaming operations are discussed below. A client obtains the MPD of the DASH media presentation, parses the MPD, selects an MSR, and determines the starting presentation time from which the content is to be consumed. The client requests segments of the MSR, starting from the segment containing the picture whose presentation time is equal to (or sufficiently close to) the starting presentation time. If the first picture in the starting segment is an EDR picture, the corresponding segment in the associated ESR (having the same segment start time derived from the MPD) is also requested, preferably before the MSR segments are requested. Otherwise, no segment of the associated ESR is requested.

When switching to a different MSR, the client requests segments of the switch-to MSR, starting from the first segment whose segment start time is greater than that of the last requested segment of the switch-from MSR. If the first picture in the starting segment in the switch-to MSR is an EDR picture, the corresponding segment in the associated ESR is also requested, preferably before the MSR segments are requested. Otherwise, no segment of the associated ESR is requested.

When operating continuously on the same MSR (after decoding of the starting segment following a seek or stream switching operation), no segment of the associated ESR needs to be requested, including when requesting any segment that starts with an EDR picture.
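The client behavior described for start-up, switching, and continuous operation can be summarized in a minimal sketch. The Segment type, the function names, and the ("ESR"/"MSR", start time) request tuples are all hypothetical; the sketch assumes that the client already knows, for each MSR segment, its start time derived from the MPD and whether its first picture is an EDR picture.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    start_time: float      # segment start time derived from the MPD
    starts_with_edr: bool  # True if the first picture is an EDR picture


def first_segment_after(msr: List[Segment], last_requested_start: float) -> int:
    """Index of the first switch-to MSR segment whose start time is greater
    than that of the last requested segment of the switch-from MSR."""
    for i, seg in enumerate(msr):
        if seg.start_time > last_requested_start:
            return i
    raise ValueError("no later segment in the switch-to MSR")


def plan_requests(msr: List[Segment], start_index: int) -> List[Tuple[str, float]]:
    """Ordered requests after a seek or switch that lands on start_index.

    Only the starting segment may trigger an ESR request, issued before the
    MSR request. Later segments are requested from the MSR alone, even if
    they start with an EDR picture.
    """
    requests: List[Tuple[str, float]] = []
    for i, seg in enumerate(msr[start_index:]):
        if i == 0 and seg.starts_with_edr:
            requests.append(("ESR", seg.start_time))
        requests.append(("MSR", seg.start_time))
    return requests
```

During continuous playback the earlier pictures are already available from normal decoding, which is why the ESR segment is only fetched for the starting segment.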

Signaling of cross RAP referencing (CRR) in video is discussed below. CRR can be signaled in an SEI message, called the type 2 DRAP indication SEI message, as follows. The type 2 DRAP indication SEI message syntax is as follows.

The type 2 DRAP indication SEI message semantics are as follows. The picture associated with a type 2 DRAP indication SEI message is referred to as a type 2 DRAP picture. Type 1 DRAP pictures (associated with a DRAP indication SEI message) and type 2 DRAP pictures are collectively referred to as DRAP pictures. The presence of the type 2 DRAP indication SEI message indicates that the constraints on picture order and picture referencing specified in this subclause apply. These constraints can enable a decoder to properly decode the type 2 DRAP picture and the pictures that are in the same layer and follow it in both decoding order and output order. This can be accomplished without needing to decode any other pictures in the same layer except the list of pictures referenceablePictures, which consists of the list of IRAP or DRAP pictures, in decoding order, that are within the same CLVS and are identified by the t2drap_ref_rap_id[ i ] syntax elements.

The constraints indicated by the presence of the type 2 DRAP indication SEI message, all of which shall apply, are as follows. The type 2 DRAP picture is a trailing picture. The type 2 DRAP picture has a temporal sublayer identifier equal to 0. The type 2 DRAP picture does not include any pictures in the same layer in the active entries of its reference picture lists except the referenceablePictures. Any picture that is in the same layer and follows the type 2 DRAP picture in both decoding order and output order does not include, in the active entries of its reference picture lists, any picture that is in the same layer and precedes the type 2 DRAP picture in decoding order or output order, with the exception of the referenceablePictures. Any picture in the list referenceablePictures does not include, in the active entries of its reference picture lists, any picture that is in the same layer and is not a picture at an earlier position in the list referenceablePictures. Consequently, the first picture in referenceablePictures, even when it is a DRAP picture instead of an IRAP picture, does not include any picture from the same layer in the active entries of its reference picture lists.
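Two of the referencing constraints above can be checked with a minimal sketch. The function name, the picture identifiers, and the data model are hypothetical: each picture is represented by an identifier, and its same-layer active reference picture list entries are given as a set of identifiers. The sketch covers only the constraint on the type 2 DRAP picture itself and the ordering constraint within referenceablePictures.

```python
from typing import Dict, List, Set


def check_type2_drap_references(drap_refs: Set[str],
                                referenceable_pictures: List[str],
                                active_refs: Dict[str, Set[str]]) -> bool:
    """Check referencing constraints for a type 2 DRAP picture.

    drap_refs: same-layer pictures in the active entries of the type 2 DRAP
        picture's reference picture lists.
    referenceable_pictures: the list referenceablePictures, in decoding order.
    active_refs: same-layer active references of each picture in
        referenceablePictures.
    """
    # The type 2 DRAP picture may reference only referenceablePictures.
    if not drap_refs <= set(referenceable_pictures):
        return False
    # Each listed picture may reference only pictures earlier in the list.
    earlier: Set[str] = set()
    for pic in referenceable_pictures:
        if not active_refs.get(pic, set()) <= earlier:
            return False
        earlier.add(pic)
    return True
```

Because the set of earlier pictures is empty for the first list entry, the check also enforces that the first picture in referenceablePictures has no same-layer active references.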

t2drap_rap_id_in_clvs specifies the RAP picture identifier, denoted as RapPicId, of the type 2 DRAP picture. Each IRAP or DRAP picture is associated with a RapPicId. The value of RapPicId for an IRAP picture is inferred to be equal to 0. The values of RapPicId shall be different for any two IRAP or DRAP pictures within a CLVS. t2drap_reserved_zero_13bits shall be equal to 0 in bitstreams conforming to this version of this specification. Other values for t2drap_reserved_zero_13bits are reserved. Decoders shall ignore the value of t2drap_reserved_zero_13bits. t2drap_num_ref_rap_pics_minus1 plus 1 indicates the number of IRAP or DRAP pictures that are within the same CLVS as the type 2 DRAP picture and may be included in the active entries of the reference picture lists of the type 2 DRAP picture. t2drap_ref_rap_id[ i ] indicates the RapPicId of the i-th IRAP or DRAP picture that is within the same CLVS as the type 2 DRAP picture and may be included in the active entries of the reference picture lists of the type 2 DRAP picture.

The following are examples of technical problems solved by the disclosed technical solutions. For example, the following problems exist with respect to the signaling of CRR and/or DRAP in video bitstreams and media files. The DRAP indication SEI message lacks signaling that indicates whether pictures that follow a DRAP picture in decoding order but precede the DRAP picture in output order can be correctly decoded when random accessing from the DRAP picture. Such pictures may be incorrectly decoded in this case because they may refer, for inter prediction, to pictures earlier than the DRAP picture in decoding order.

Refer to FIG. 5, which shows an example of a picture that follows an associated DRAP picture in decoding order and precedes the associated DRAP picture in output order. Each box is a picture, shown in decoding order from left to right. The number in a box is the output order, also known as the picture order count, of the picture. An arrow indicates an inter prediction relationship between two pictures, where the picture on the right (at the arrowhead) uses the picture on the left (at the arrow origin) as a reference picture.

In the example presented in FIG. 5, the inter prediction from picture 6 to picture 8 can be turned off (the arrow connecting the two pictures is removed). In that case, picture 8 can be correctly decoded when random accessing from the DRAP picture (picture 10). However, when the inter prediction from picture 6 to picture 8 is employed, picture 8 cannot be correctly decoded when the DRAP picture (picture 10) is used as a random access point. An indication of whether such inter prediction is turned off is useful for systems to know when to start presenting video when random accessing from a DRAP picture. For example, with such an indication, when random accessing from the DRAP picture (picture 10), the application system can know whether the presentation can start from picture 8 or from picture 10.
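The reasoning above can be sketched in Python. FIG. 5 is not reproduced here, so the picture sequence and reference structure used in the usage note are hypothetical; only pictures 6, 8 and 10 and the dependency of picture 8 on picture 6 come from the text. The sketch assumes that the IRAP picture on which the DRAP picture depends is available for reference and, as a simplification, takes the earliest decodable picture in output order as the presentation start.

```python
from typing import Dict, List, Set


def decodable_after_random_access(decoding_order: List[int],
                                  refs: Dict[int, List[int]],
                                  irap: int, drap: int) -> Set[int]:
    """Return the POCs that can be decoded when starting at the DRAP picture.

    decoding_order lists picture order counts (POCs) in decoding order and
    refs maps a POC to the POCs it uses as reference pictures.
    """
    available = {irap, drap}  # assumption: the IRAP picture is obtainable
    start = decoding_order.index(drap)
    for poc in decoding_order[start + 1:]:
        if all(r in available for r in refs.get(poc, [])):
            available.add(poc)
    return available


def first_presented_picture(decoding_order: List[int],
                            refs: Dict[int, List[int]],
                            irap: int, drap: int) -> int:
    """Return the POC from which presentation can start (simplified)."""
    available = decodable_after_random_access(decoding_order, refs, irap, drap)
    start = decoding_order.index(drap)
    return min(p for p in decoding_order[start:] if p in available)
```

With a hypothetical decoding order `[0, 4, 2, 6, 10, 8, 12]`, picture 8 referencing only picture 10 yields a presentation start at picture 8, whereas picture 8 referencing pictures 6 and 10 yields a presentation start at picture 10.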

The type 2 DRAP indication SEI message also lacks a signaling mechanism that indicates whether pictures that follow a type 2 DRAP picture in decoding order but precede the type 2 DRAP picture in output order can be correctly decoded when random accessing from the type 2 DRAP picture. Such a picture may be incorrectly decoded when it refers to pictures earlier than the type 2 DRAP picture in decoding order. Such an indication is useful for systems to determine when to start presenting video when random accessing from a type 2 DRAP picture. A mechanism for signaling CRR in media files is also lacking.

Furthermore, the semantics of the DRAP sample group in ISOBMFF are incomplete. ISOBMFF states that a DRAP sample is a sample after which all samples in decoding order can be correctly decoded if the closest initial sample preceding the DRAP sample is available for reference. However, even when the closest initial sample preceding the DRAP sample is available for reference, there are cases in which samples that follow the DRAP sample in decoding order but precede the DRAP sample in output order refer to pictures earlier than the pictures in that closest initial sample. In such cases, these samples (pictures) cannot be correctly decoded.

Disclosed herein are mechanisms that address one or more of the problems listed above. For example, a DRAP picture is a random access point picture that is coded via inter prediction with reference to an IRAP picture. Furthermore, a CRR picture, also known as a type 2 DRAP picture and/or an enhanced dependent random access point (EDRAP) picture, is a random access point picture that is coded via inter prediction with reference to an IRAP picture and is further allowed to refer to one or more other dependent random access point pictures. For this reason, a CRR/EDRAP/type 2 DRAP picture can be considered a type of DRAP picture. DRAP and CRR are designed on the assumption that pictures are managed in a particular order. However, encoders are allowed to reorder pictures to increase coding efficiency. Accordingly, video pictures may have both an output order and a decoding order. The output order is the order in which pictures are presented/displayed, and the decoding order is the order in which the pictures are coded into the bitstream. Some DRAP and CRR designs do not take this distinction into account, and therefore errors can occur when video is coded using DRAP and/or CRR and the encoder decides to reorder the pictures. Specifically, an error can occur when an inter predicted picture follows a DRAP/CRR picture in decoding order and precedes the DRAP/CRR picture in output order. The error can occur because such a picture may be allowed to be coded with reference to another picture that precedes the DRAP/CRR picture in decoding order. When the DRAP/CRR picture is used by the decoder as a random access point, the picture may or may not be fully decodable, depending on whether inter prediction that references such other pictures is used. Furthermore, various signaling mechanisms may not fully support DRAP and/or CRR.

Accordingly, the present disclosure includes a signaling mechanism to indicate whether an inter predicted picture that follows a DRAP/CRR picture in decoding order and precedes the DRAP/CRR picture in output order is allowed to refer to other pictures that precede the DRAP/CRR picture in decoding order. In one example, the signaling mechanism is an SEI message in the encoded bitstream. When such an inter prediction reference is allowed, the inter predicted picture is not displayed when the DRAP/CRR picture is used as a random access point. When such an inter prediction reference is not allowed, the inter predicted picture is displayed when the DRAP/CRR picture is used as a random access point. In addition, the present disclosure describes sample groups and/or sample entries that can be included in ISOBMFF media files to describe DRAP and/or CRR pictures. This allows the encoder to determine the presence and location of DRAP and/or CRR pictures at the file format level.

To solve the above problems and others, the methods summarized below are disclosed. The items should be regarded as examples that explain the general concepts and should not be interpreted in a narrow way. Furthermore, these items can be applied individually or combined in any manner.

Example 1

In one example, an indication is added to the DRAP indication SEI message syntax to indicate whether pictures that are in the same layer as the DRAP picture, follow the DRAP picture in decoding order, and precede the DRAP picture in output order are allowed to refer, for inter prediction, to a picture that is in the same layer and is earlier than the DRAP picture in decoding order. When such a reference is not allowed, the decoder can correctly decode and display such pictures when the DRAP picture is used as a random access point. When the reference is allowed, decoding may not be possible, and such pictures should not be displayed at the decoder when the DRAP picture is used as a random access point. In one example, the indication is a one-bit flag. In one example, the flag is set equal to X (X being 1 or 0) to indicate that pictures that are in the same layer, follow the DRAP picture in decoding order, and precede the DRAP picture in output order are allowed to refer, for inter prediction, to a picture that is in the same layer and is earlier than the DRAP picture in decoding order. In one example, the flag is set equal to 1-X (X being 1 or 0) to indicate that pictures that are in the same layer, follow the DRAP picture in decoding order, and precede the DRAP picture in output order do not refer, for inter prediction, to a picture that is in the same layer and is earlier than the DRAP picture in decoding order. In one example, the indication is a multi-bit indicator. In one example, a constraint requires that any picture that is in the same layer and follows the DRAP picture in decoding order shall follow, in output order, any picture that is in the same layer and precedes the DRAP picture in decoding order.

Example 2

In one example, an additional SEI message is specified, and the presence of this SEI message indicates that pictures that are in the same layer, follow a DRAP picture in the bitstream in decoding order, and precede the DRAP picture in output order do not refer, for inter prediction, to a picture that is in the same layer and is earlier than the DRAP picture in decoding order. In one example, the presence of this SEI message indicates that pictures that are in the same layer, follow a DRAP picture in the bitstream in decoding order, and precede the DRAP picture in output order are allowed to refer, for inter prediction, to a picture that is in the same layer and is earlier than the DRAP picture in decoding order. In one example, a constraint requires that any picture that is in the same layer and follows the DRAP picture in decoding order shall follow, in output order, any picture that is in the same layer and precedes the DRAP picture in decoding order.

Example 3

In one example, an additional SEI message is specified. The presence of this additional SEI message indicates that pictures that are in the same layer, follow in decoding order the DRAP picture associated with both this SEI message and a DRAP indication SEI message, and precede the DRAP picture in output order do not refer, for inter prediction, to any picture that is in the same layer and is earlier than the DRAP picture in decoding order. In one example, the absence of this additional SEI message indicates that pictures that are in the same layer, follow in decoding order the DRAP picture associated with both the additional SEI message and the DRAP indication SEI message, and precede the DRAP picture in output order are allowed to refer, for inter prediction, to a picture that is in the same layer and is earlier than the DRAP picture in decoding order. In one example, a constraint requires that any picture that is in the same layer and follows the DRAP picture in decoding order shall follow, in output order, any picture that is in the same layer and precedes the DRAP picture in decoding order.

Example 4

In one example, an additional SEI message is specified, and an indication is added to the syntax of the additional SEI message to indicate whether pictures that are in the same layer, follow in decoding order the DRAP picture associated with both the additional SEI message and a DRAP indication SEI message, and precede the DRAP picture in output order are allowed to refer, for inter prediction, to a picture that is in the same layer and is earlier than the DRAP picture in decoding order. In one example, the indication is a one-bit flag. In one example, the flag is set equal to X (X being 1 or 0) to indicate that pictures that are in the same layer, follow the DRAP picture in decoding order, and precede the DRAP picture in output order are allowed to refer, for inter prediction, to a picture that is in the same layer and is earlier than the DRAP picture in decoding order. In one example, furthermore, the flag is set equal to 1-X (X being 1 or 0) to indicate that pictures that are in the same layer, follow the DRAP picture in decoding order, and precede the DRAP picture in output order do not refer, for inter prediction, to a picture that is in the same layer and is earlier than the DRAP picture in decoding order. In one example, the indication is a multi-bit indicator. In one example, a constraint requires that any picture that is in the same layer and follows the DRAP picture in decoding order shall follow, in output order, any picture that is in the same layer and precedes the DRAP picture in decoding order.

Example 5

In one example, an indication is added to the type 2 DRAP indication SEI message syntax. The indication indicates whether pictures that are in the same layer, follow the type 2 DRAP picture in decoding order, and precede the type 2 DRAP picture in output order are allowed to refer, for inter prediction, to a picture that is in the same layer and is earlier than the type 2 DRAP picture in decoding order. In one example, the indication is a one-bit flag. In one example, the flag is set equal to X (X being 1 or 0) to indicate that pictures that are in the same layer, follow the DRAP picture in decoding order, and precede the DRAP picture in output order are allowed to refer, for inter prediction, to a picture that is in the same layer and is earlier than the DRAP picture in decoding order. In one example, furthermore, the flag is set equal to 1-X (X being 1 or 0) to indicate that pictures that are in the same layer, follow the DRAP picture in decoding order, and precede the DRAP picture in output order do not refer, for inter prediction, to a picture that is in the same layer and is earlier than the DRAP picture in decoding order. In one example, the flag is added by reusing one bit from the t2drap_reserved_zero_13bits field in the type 2 DRAP indication SEI message syntax. In one example, the indication is a multi-bit indicator. In one example, a constraint requires that any picture that is in the same layer and follows the DRAP picture in decoding order shall follow, in output order, any picture that is in the same layer and precedes the DRAP picture in decoding order.

Example 6

In another example, the indication is associated with a DRAP or type 2 DRAP picture. In one example, the indication may be signaled for each DRAP or type 2 DRAP picture.

Example 7

In one example, an additional sample group is specified to signal CRR samples (e.g., samples containing type 2 DRAP pictures) in an ISOBMFF file.

Example 8

In one example, the DRAP sample group is extended to signal CRR samples (e.g., samples containing type 2 DRAP pictures) in an ISOBMFF file, for example by using the version field of the sample to group box (e.g., SampleToGroupBox or CompactSampleToGroupBox) or by using the grouping_type_parameter field (or a part thereof) in the sample to group box.

Example 9

In one example, the DRAP sample entry includes a field indicating the number of random access point (RAP) samples required for random accessing from a member of the DRAP sample group. The required RAP samples are either initial samples or DRAP samples. In one example, the DRAP sample entry further includes a field indicating the RAP identifier of the members of the DRAP sample group. In one example, the field indicating the RAP identifier is coded using 16 bits. In one example, the field indicating the RAP identifier is coded using 32 bits. In one example, the DRAP sample entry excludes a field indicating the RAP identifier of the members of the DRAP sample group. The RAP identifier may be signaled in a sub-sample information box, a sample auxiliary information size box, and/or another box. In one example, the RAP identifier is a sample number. In one example, the DRAP sample entry further includes a number of fields indicating the RAP identifiers of the required RAP samples needed for random accessing from a member of the DRAP sample group. In one example, each of the fields indicating the RAP identifiers of the required RAP samples is coded using 16 bits. In one example, each of the fields indicating the RAP identifiers of the required RAP samples is coded using 32 bits. In one example, each of the fields indicating the RAP identifiers of the required RAP samples directly represents the RAP identifier of a required RAP sample. In one example, each of the fields indicating the RAP identifiers of the required RAP samples represents a difference between the RAP identifiers of two RAP samples. In one example, the i-th field (i equal to 0) of the fields indicating the RAP identifiers of the required RAP samples represents the difference between the RAP identifier of the current sample (e.g., a sample of the current DRAP sample group) and the i-th RAP identifier, that is, the RAP identifier of the first required RAP sample. In one example, the i-th field (i greater than 0) of the fields indicating the RAP identifiers of the required RAP samples represents the difference between the RAP identifier of the (i-1)-th required RAP sample and the RAP identifier of the i-th required RAP sample. In one example, the i-th field (i greater than 0) of the fields indicating the RAP identifiers of the required RAP samples represents the difference between the RAP identifier of the i-th required RAP sample and the RAP identifier of the (i-1)-th required RAP sample.
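The difference-based coding of required RAP identifiers can be illustrated with a short Python sketch. It implements one of the variants described above: field 0 carries the current sample's RAP identifier minus that of the first required RAP sample, and field i (i greater than 0) carries the (i-1)-th identifier minus the i-th identifier. The sign convention for field 0 and the function names are assumptions made for illustration only.

```python
from typing import List


def encode_ref_rap_deltas(current_id: int, required_ids: List[int]) -> List[int]:
    """Turn required RAP identifiers into difference fields.

    Assumed convention: each field is the previous identifier minus the
    next one, starting from the RAP identifier of the current sample.
    """
    deltas = []
    previous = current_id
    for rap_id in required_ids:
        deltas.append(previous - rap_id)
        previous = rap_id
    return deltas


def decode_ref_rap_deltas(current_id: int, deltas: List[int]) -> List[int]:
    """Recover the required RAP identifiers from the difference fields."""
    required_ids = []
    previous = current_id
    for delta in deltas:
        previous -= delta
        required_ids.append(previous)
    return required_ids
```

For a current sample with RAP identifier 7 that requires RAP samples 5, 2 and 0, the sketch produces the fields 2, 3 and 2, and decoding those fields recovers the original identifiers. When the required samples are listed from nearest to farthest, this convention keeps every field non-negative.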

Example 10

In one example, a dependent random access point (DRAP) sample is a sample after which all samples in both decoding order and output order can be correctly decoded if the closest initial sample preceding the DRAP sample is available for reference.

Below are some example embodiments for some of the aspects summarized above. The relevant parts that have been added or modified are shown in underlined bold, and the deleted parts are shown in bold italics.

In an example implementation, the syntax of the type 2 DRAP indication SEI message is modified as follows.

Furthermore, the type 2 DRAP indication SEI message semantics are modified as follows. The picture associated with a type 2 DRAP indication SEI message is referred to as a type 2 DRAP picture. Type 1 DRAP pictures (associated with a DRAP indication SEI message) and type 2 DRAP pictures are collectively referred to as DRAP pictures. The presence of the type 2 DRAP indication SEI message indicates that the constraints on picture order and picture referencing specified in this subclause apply. These constraints enable a decoder to properly decode the type 2 DRAP picture and the pictures that are in the same layer and follow it in both decoding order and output order without needing to decode any other pictures in the same layer except the list of pictures referenceablePictures, which consists of the list of IRAP or DRAP pictures, in decoding order, that are within the same CLVS and are identified by the t2drap_ref_rap_id[ i ] syntax elements.

The constraints indicated by the presence of the type 2 DRAP indication SEI message, all of which shall apply, are as follows. The type 2 DRAP picture is a trailing picture. The type 2 DRAP picture has a temporal sublayer identifier equal to 0. The type 2 DRAP picture does not include any picture in the same layer in the active entries of its reference picture lists except the referenceablePictures. Any picture that is in the same layer and follows the type 2 DRAP picture in both decoding order and output order does not include, in the active entries of its reference picture lists, any picture that precedes the type 2 DRAP picture in decoding order or output order, with the exception of the referenceablePictures. No picture in the list referenceablePictures includes, in the active entries of its reference picture lists, any picture that is in the same layer and is not a picture at an earlier position in the list referenceablePictures.

When t2drap_leading_pictures_decodable_flag is equal to 1, the following applies. Any picture that is in the same layer and follows the type 2 DRAP picture in decoding order shall follow, in output order, any picture that is in the same layer and precedes the type 2 DRAP picture in decoding order. Any picture that is in the same layer, follows the type 2 DRAP picture in decoding order, and precedes the type 2 DRAP picture in output order does not include, in the active entries of its reference picture lists, any picture that is in the same layer and precedes the type 2 DRAP picture in decoding order, with the exception of the referenceablePictures.
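The two conditions tied to t2drap_leading_pictures_decodable_flag equal to 1 can be checked with a Python sketch such as the following. The `Picture` class and function name are hypothetical; the sketch assumes a single layer, uses the picture order count (POC) as the output order, and identifies referenceablePictures by POC.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Picture:
    """Hypothetical single-layer picture record."""
    poc: int                                       # output order (POC)
    active_refs: List[int] = field(default_factory=list)  # POCs of active entries


def leading_pictures_decodable(decoding_order: List[Picture],
                               drap_poc: int,
                               referenceable: List[int]) -> bool:
    """Check both conditions that apply when the flag is equal to 1."""
    idx = next(i for i, p in enumerate(decoding_order) if p.poc == drap_poc)
    before_pocs = {p.poc for p in decoding_order[:idx]}
    after = decoding_order[idx + 1:]

    # Condition 1: pictures after the DRAP picture in decoding order follow,
    # in output order, every picture before it in decoding order.
    if before_pocs and after and min(p.poc for p in after) <= max(before_pocs):
        return False

    # Condition 2: leading pictures (after in decoding order, before in output
    # order) reference no earlier-decoded picture except referenceablePictures.
    allowed = set(referenceable)
    for pic in after:
        if pic.poc < drap_poc:
            if any(r in before_pocs and r not in allowed for r in pic.active_refs):
                return False
    return True
```

In a hypothetical sequence with decoding order POC 0 (IRAP), 4, 10 (type 2 DRAP), 8, 12 and referenceablePictures containing only POC 0, the check passes when picture 8 references only picture 10 and fails when picture 8 also references picture 4.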

Any picture in the list referenceablePictures does not include, in the active entries of its reference picture lists, any picture that is in the same layer and is not a picture at an earlier position in the list referenceablePictures. NOTE: Consequently, the first picture in referenceablePictures, even when it is a DRAP picture instead of an IRAP picture, does not include any picture from the same layer in the active entries of its reference picture lists.
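The ordering property of the list referenceablePictures can be sketched in Python as follows. The picture labels and the `active_refs` mapping are hypothetical; the sketch assumes that all listed references are same-layer pictures.

```python
from typing import Dict, List


def referenceable_list_is_self_contained(referenceable: List[str],
                                         active_refs: Dict[str, List[str]]) -> bool:
    """Check that each listed picture references only earlier list positions.

    referenceable is the list referenceablePictures in decoding order and
    active_refs maps a picture label to the same-layer pictures found in the
    active entries of its reference picture lists.
    """
    seen = set()
    for pic in referenceable:
        if any(ref not in seen for ref in active_refs.get(pic, [])):
            return False
        seen.add(pic)
    return True
```

Because nothing precedes the first entry, the check only passes when the first picture in the list has no same-layer active references, which matches the note above.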

t2drap_rap_id_in_clvs specifies the RAP picture identifier, denoted as RapPicId, of the type 2 DRAP picture. Each IRAP or DRAP picture is associated with a RapPicId. The value of RapPicId for an IRAP picture is inferred to be equal to 0. The values of RapPicId shall be different for any two IRAP or DRAP pictures within a CLVS. t2drap_reserved_zero_13bits shall be equal to 0 in bitstreams conforming to this version of this specification. Other values for t2drap_reserved_zero_13bits are reserved for future use by ITU-T | ISO/IEC. Decoders shall ignore the value of t2drap_reserved_zero_13bits. t2drap_num_ref_rap_pics_minus1 plus 1 indicates the number of IRAP or DRAP pictures that are within the same CLVS as the type 2 DRAP picture and may be included in the active entries of the reference picture lists of the type 2 DRAP picture. t2drap_ref_rap_id[ i ] indicates the RapPicId of the i-th IRAP or DRAP picture that is within the same CLVS as the type 2 DRAP picture and may be included in the active entries of the reference picture lists of the type 2 DRAP picture.

In an example implementation, a dependent random access point (DRAP) sample group is defined as follows. When the grouping type of a SampleToGroupBox or CompactSampleToGroupBox is equal to 'drap', the following applies. When the version field of the SampleToGroupBox or CompactSampleToGroupBox is equal to 0, or the field grouping_type_parameter is present and its value is equal to 0, a DRAP sample is a sample after which all samples in both decoding order and output order can be correctly decoded if the closest initial sample preceding the DRAP sample is available for reference. When the field grouping_type_parameter is present and its value is equal to 1, a DRAP sample is a sample after which all samples in both decoding order and output order can be correctly decoded if the closest initial sample preceding the DRAP sample and zero or more other identified DRAP samples earlier in decoding order are available for reference.

The initial sample is a stream access point (SAP) sample of SAP type 1, 2, or 3 that is marked as such either by being a sync sample or by the SAP sample group. For example, if the 32nd sample in a file is an initial sample consisting of an I picture, the 48th sample may consist of a P picture and be marked as a member of the dependent random access point sample group, thereby indicating that random access can be performed at the 48th sample by first decoding the 32nd sample (ignoring samples 33 to 47) and then continuing to decode from the 48th sample. NOTE: DRAP samples can only be used in combination with SAP samples of types 1, 2, and 3. This is in order to enable the creation of a decodable sequence of samples by concatenating the preceding SAP sample, and zero or more identified DRAP samples earlier in decoding order than the DRAP sample, with the DRAP sample and the samples following the DRAP sample in output order.
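The 32nd/48th sample example above can be sketched in a few lines. This is a minimal illustration only, covering the case where grouping_type_parameter is 0 or absent; the function name `samples_to_decode` and its arguments are hypothetical and are not defined by the file format.

```python
def samples_to_decode(drap_sample, initial_samples, last_sample):
    """Return the 1-based sample numbers to decode for random access at a DRAP sample.

    Assumption (illustrative): `initial_samples` lists the sample numbers of
    SAP type 1, 2, or 3 samples, and samples are numbered in decoding order.
    """
    preceding = [s for s in initial_samples if s < drap_sample]
    if not preceding:
        raise ValueError("a DRAP sample requires a preceding initial sample")
    closest_initial = max(preceding)
    # Decode the closest initial sample, skip everything up to the DRAP
    # sample, then decode continuously from the DRAP sample onward.
    return [closest_initial] + list(range(drap_sample, last_sample + 1))


# Initial samples at 1 and 32, DRAP sample at 48, file ends at sample 50.
plan = samples_to_decode(48, [1, 32], 50)
print(plan)
```

Samples 33 to 47 never appear in the returned plan, which matches the behavior described for the example.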

When the version field of the SampleToGroupBox or CompactSampleToGroupBox is equal to 0, or the field grouping_type_parameter is present and its value is equal to 0, a sample can be a member of the dependent random access point sample group (and hence be called a DRAP sample) only if the following conditions are true. The DRAP sample references only the closest preceding initial sample. The DRAP sample and all samples following the DRAP sample in both decoding order and output order can be correctly decoded when decoding starts at the DRAP sample after decoding the closest preceding SAP sample of type 1, 2, or 3, which is marked as such either by being a sync sample or by the SAP sample group.

When the field grouping_type_parameter is present and its value is equal to 1, a sample can be a member of the DRAP sample group (and hence be called a DRAP sample) only if the following conditions are true. The DRAP sample references only the closest preceding initial sample and zero or more other identified DRAP samples earlier in decoding order than the DRAP sample. The DRAP sample and all samples following the DRAP sample in both decoding order and output order can be correctly decoded when decoding starts at the DRAP sample after decoding the closest preceding SAP sample of type 1, 2, or 3, which is marked as such either by being a sync sample or by the SAP sample group, and after decoding the zero or more identified DRAP samples earlier in decoding order than the DRAP sample.

An example syntax for the DRAP sample group entry is as follows.

Example semantics for the DRAP sample group entry are as follows.

DRAP_type is a non-negative integer. When DRAP_type is in the range of 1 to 3, it indicates the SAP_type (as specified in Annex I) that the DRAP sample would have corresponded to had it not depended on the closest preceding SAP or other DRAP samples. Other type values are reserved.

num_ref_rap_pics_minus1 plus 1 indicates the number of initial samples and other DRAP samples that are earlier in decoding order than the DRAP sample and that need to be referenced in order to correctly decode the DRAP sample and all samples following the DRAP sample in both decoding order and output order when decoding starts from the DRAP sample.

reserved shall be equal to 0. The semantics of this subclause apply only to sample group description entries with reserved equal to 0. Parsers shall allow and ignore sample group description entries with reserved greater than 0 when parsing this sample group.

RAP_id specifies the RAP sample identifier of the samples that belong to this sample group. A RAP sample is either an initial sample or a DRAP sample. The value of RAP_id for an initial sample is inferred to be equal to 0.

ref_RAP_id[ i ] indicates the RAP_id of the i-th RAP sample that is earlier in decoding order than the DRAP sample and that needs to be referenced in order to correctly decode the DRAP sample and all samples following the DRAP sample in both decoding order and output order when decoding starts from the DRAP sample.
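The role of RAP_id and ref_RAP_id[ i ] can be illustrated with a small sketch that resolves the referenced RAP samples of a DRAP sample into sample numbers. The `DrapEntry` class, the `rap_id_to_sample` mapping, and `resolve_references` are hypothetical helpers introduced for illustration; the sketch does not model field bit widths or the box syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DrapEntry:
    """Illustrative in-memory view of a DRAP sample group entry (not the box syntax)."""
    drap_type: int
    rap_id: int
    ref_rap_ids: List[int] = field(default_factory=list)

    @property
    def num_ref_rap_pics_minus1(self) -> int:
        # num_ref_rap_pics_minus1 plus 1 is the number of referenced RAP samples.
        return len(self.ref_rap_ids) - 1


def resolve_references(entry: DrapEntry,
                       rap_id_to_sample: Dict[int, int],
                       drap_sample: int) -> List[int]:
    """Map each ref_RAP_id to a sample number, returned in decoding order.

    Assumption: `rap_id_to_sample` covers the RAP samples since the closest
    initial sample, whose RAP_id is inferred to be 0.
    """
    samples = []
    for rap_id in entry.ref_rap_ids:
        sample = rap_id_to_sample[rap_id]  # KeyError if the identifier is unknown
        if sample >= drap_sample:
            raise ValueError("referenced RAP samples must precede the DRAP sample")
        samples.append(sample)
    return sorted(samples)


# Initial sample 32 (RAP_id 0), DRAP samples 40 (RAP_id 2) and 48 (RAP_id 3).
entry = DrapEntry(drap_type=1, rap_id=3, ref_rap_ids=[0, 2])
refs = resolve_references(entry, {0: 32, 2: 40, 3: 48}, drap_sample=48)
print(refs)
```

A reader performing random access at sample 48 would decode the resolved samples first and then continue from sample 48.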

In another example implementation, the RAP_id field is not signaled in the VisualDRAPEntry() syntax, in which case the VisualDRAPEntry() syntax is as follows.

Furthermore, the RAP_id of each DRAP sample is signaled in the sub-sample information box, the sample auxiliary information sizes box, or an additional box.

In another example implementation, the ref_RAP_id[ i ] field is changed to ref_RAP_id_delta[ i ], and the semantics of the field are changed as follows. ref_RAP_id_delta[ i ] indicates the delta of the RAP_id of the i-th RAP sample that is earlier in decoding order than the DRAP sample and that needs to be referenced in order to correctly decode the DRAP sample and all samples following the DRAP sample in both decoding order and output order when decoding starts from the DRAP sample. The variable RefRapId[ i ], representing the RAP_id of the i-th RAP sample, is derived as follows, where RAP_id is the RAP_id of the current sample (i.e., a sample of the current DRAP sample group).

In another example implementation, the semantics of the ref_RAP_id_delta[ i ] field are changed as follows. ref_RAP_id_delta[ i ] indicates the delta of the RAP_id of the i-th RAP sample that is earlier in decoding order than the DRAP sample and that needs to be referenced in order to correctly decode the DRAP sample and all samples following the DRAP sample in both decoding order and output order when decoding starts from the DRAP sample. The variable RefRapId[ i ], representing the RAP_id of the i-th RAP sample, is derived as follows, where RAP_id is the RAP_id of the current sample (i.e., a sample of the current DRAP sample group).

In another example implementation, the RAP sample identifier of a RAP sample is specified to be equal to the sample number of the RAP sample, the RAP_id of the current sample is the sample number of the current sample, and the variable RefRapId[ i ] represents the sample number of the i-th RAP sample.

In another example implementation, the RAP_id field, when present in the sample group description, and the ref_RAP_id[ i ] field are coded using 32 bits.

FIG. 1 is a schematic diagram of an example mechanism for random access when decoding a bitstream using IRAP pictures. Specifically, FIG. 1 shows a bitstream 100 that includes IRAP pictures 101 and non-IRAP pictures 103. An IRAP picture 101 is a picture that is coded according to intra prediction and can be used as an access point into the bitstream 100. Intra prediction is a process of coding blocks of a picture with reference to other blocks in the same picture. A picture coded according to intra prediction can be coded without reference to other pictures. In contrast, a non-IRAP picture 103 is a picture that cannot be used as an access point and can be decoded after the associated IRAP picture 101 has been decoded. For example, non-IRAP pictures 103 are generally coded according to inter prediction. Inter prediction is a process of coding blocks of a picture with reference to blocks of other pictures, which are designated as reference pictures. A picture coded based on inter prediction can be correctly decoded only if all of its reference pictures are also decoded. Both IRAP pictures 101 and non-IRAP pictures 103 may be designated as reference pictures for other non-IRAP pictures 103.

Depending on the coding technology, various types of IRAP pictures 101 may be used. In the present example, the IRAP pictures 101 include IDR pictures and CRA pictures. An IDR picture is an intra-coded picture that can be used as the first picture in a coded video sequence. A CRA picture is an intra-coded picture that allows the use of associated leading pictures. A leading picture is a picture that precedes the associated IRAP picture 101 in output order but follows the IRAP picture 101 in decoding order. A decoder may begin decoding at the start of the bitstream 100. However, users often wish to jump to a particular point in the bitstream and begin viewing from the selected point. Any point that can be selected by a user as a starting point for decoding is known as a random access point.

In general, any IRAP picture 101 can be used as a random access point. Once an IRAP picture 101 is selected as a random access point, all associated non-IRAP pictures 103 (e.g., those following the selected IRAP picture 101) can also be decoded. In the example shown, a user has selected CRA4 for random access. The decoder can begin decoding at CRA4 without decoding any pictures prior to CRA4. This is possible because pictures that follow an IRAP picture are generally prevented from referencing pictures that precede that IRAP picture. Accordingly, once CRA4 is selected as the random access point, the decoder can decode CRA4 for display and then decode the non-IRAP pictures 103 following CRA4 based on CRA4. This allows the decoder to begin presenting the bitstream from the random access point (e.g., CRA4) without decoding pictures prior to the random access point.

FIG. 2 is a schematic diagram of an example mechanism for random access when decoding a bitstream using DRAP pictures. Specifically, FIG. 2 shows a bitstream 200 that includes an IRAP picture 201, non-IRAP pictures 203, and DRAP pictures 205. The IRAP picture 201 and the non-IRAP pictures 203 may be substantially similar to the IRAP pictures 101 and the non-IRAP pictures 103, respectively. In the present example, an IDR picture is used as the IRAP picture 201.

DRAP pictures 205 are also included. A DRAP picture 205 is a picture that is coded according to inter prediction and can be used as an access point into the bitstream 200. For example, each DRAP picture 205 may be coded with reference to the IRAP picture 201. FIG. 2 includes arrows that point to a picture coded according to inter prediction and originate from the associated reference picture. As shown, each DRAP picture 205 is coded with reference to IDR0. As such, any DRAP picture 205 can be used as a random access point as long as the decoder can decode the associated IRAP picture 201. In the example shown, DRAP4 has been selected as the random access point. The decoder should be made aware, for example through signaling, that DRAP pictures 205 are used in the bitstream 200, and should be made aware of the IRAP picture(s) 201 that are used as reference pictures for the DRAP pictures 205. The decoder can then decode IDR0 for use in random access and decode DRAP4 based on IDR0. The decoder can then decode the non-IRAP pictures 203 following DRAP4 based on DRAP4. The decoder can begin presenting the decoded video at DRAP4.

Pictures coded according to inter prediction are more compressed than pictures coded according to intra prediction. Accordingly, the DRAP pictures 205 are more compressed than the IRAP pictures 101 in the bitstream 100. The use of DRAP pictures 205 therefore reduces the amount of data signaled over time (e.g., the bitrate) for the bitstream 200 compared to the bitstream 100, at the cost of a more complex signaling mechanism and decoding procedure.

FIG. 3 is a schematic diagram of an example mechanism for random access when decoding a bitstream using CRR pictures. Specifically, FIG. 3 shows a bitstream 300 that includes an IRAP picture 301, non-IRAP pictures 303, and CRR pictures 305. The IRAP picture 301 and the non-IRAP pictures 303 may be substantially similar to the IRAP pictures 101 and the non-IRAP pictures 103, respectively. A CRR picture 305 is a picture that is coded according to inter prediction and can be used as an access point into the bitstream 300. A CRR picture 305 can be considered a type of DRAP picture. Whereas a DRAP picture is coded with reference to an IRAP picture, a CRR picture 305 can be coded with reference to both the IRAP picture 301 and any other CRR picture 305. Because a CRR picture 305 is a type of DRAP picture, CRR pictures 305 may also be known as EDRAP pictures and/or type 2 DRAP pictures, and these terms may be used interchangeably. FIG. 3 includes arrows that point to a picture coded according to inter prediction and originate from the associated reference picture.

In the example shown, all of the CRR pictures 305 are coded with reference to the IRAP picture 301, which is denoted IDR0. Furthermore, CRR3, CRR4, and CRR5 are also coded with reference to CRR2. Accordingly, any CRR picture 305 can be used as a random access point as long as the decoder can decode the associated IRAP picture 301 and any associated CRR picture 305 used as a reference picture. In the example shown, CRR4 has been selected as the random access point. The decoder should be made aware, for example through signaling, that CRR pictures 305 are used in the bitstream 300, and should be made aware of the IRAP picture(s) 301 and the CRR pictures 305 that are used as reference pictures for other CRR pictures 305. The decoder can then decode IDR0 and CRR2 for use in random access and decode CRR4 based on IDR0 and CRR2. The decoder can then decode the non-IRAP pictures 303 following CRR4 based on CRR4. The decoder can begin presenting the decoded video at CRR4.
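The decoding steps for random access at CRR4 can be sketched as a small dependency walk. This is an illustrative sketch only: the dictionary `rap_refs` encodes just the references stated for FIG. 3, and the function name `decode_plan` is hypothetical.

```python
def decode_plan(access_point, rap_refs, decoding_order):
    """Return the RAP pictures to decode, in decoding order, to start at `access_point`.

    `rap_refs` maps each RAP picture to the RAP pictures it references.
    References are followed transitively so that every picture needed by a
    reference picture is also decoded.
    """
    needed = set()
    pending = [access_point]
    while pending:
        picture = pending.pop()
        if picture in needed:
            continue
        needed.add(picture)
        pending.extend(rap_refs.get(picture, []))
    return [p for p in decoding_order if p in needed]


# References as described for FIG. 3: every CRR picture references IDR0,
# and CRR3, CRR4, and CRR5 additionally reference CRR2.
decoding_order = ["IDR0", "CRR2", "CRR3", "CRR4", "CRR5"]
rap_refs = {
    "IDR0": [],
    "CRR2": ["IDR0"],
    "CRR3": ["IDR0", "CRR2"],
    "CRR4": ["IDR0", "CRR2"],
    "CRR5": ["IDR0", "CRR2"],
}
plan = decode_plan("CRR4", rap_refs, decoding_order)
print(plan)
```

CRR3 is skipped in the resulting plan because CRR4 does not depend on it.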

Inter prediction operates by matching blocks in a picture with similar reference blocks in the reference picture(s). The encoder can then encode a motion vector that points to the reference block instead of encoding the current block. Any difference between the current block and the reference block is encoded as a residual. The more closely the current block matches the reference block, the smaller the encoded residual. As such, a better match between the current block and the reference block results in less coded data and better compression. The benefit of CRR over DRAP is that more pictures are available for reference, which results in better matching and better compression. The cost of CRR over DRAP is increased complexity in signaling and decoding.
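The relationship between match quality and residual size can be seen in a toy numeric sketch. The sample values below are made up for illustration, and the sum of absolute differences is used here only as a simple stand-in for the cost of coding a residual.

```python
def residual(current_block, reference_block):
    """Element-wise difference between the current block and a reference block."""
    return [c - r for c, r in zip(current_block, reference_block)]


def sum_abs(values):
    """Sum of absolute values, used as a rough proxy for residual size."""
    return sum(abs(v) for v in values)


current = [10, 12, 14, 16]          # hypothetical sample values
close_reference = [10, 12, 13, 16]  # a well-matched reference block
poor_reference = [0, 0, 0, 0]       # a badly matched reference block

close_cost = sum_abs(residual(current, close_reference))
poor_cost = sum_abs(residual(current, poor_reference))
print(close_cost, poor_cost)
```

Having more candidate reference pictures, as with CRR, raises the chance that a well-matched reference block exists.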

FIG. 4 is a schematic diagram of an example mechanism for signaling an external bitstream 401 to support CRR-based random access. As noted above, managing reference pictures for CRR is more complex than managing reference pictures for DRAP. FIG. 4 shows a main bitstream 400 that contains encoded video for decoding by a decoder. The main bitstream 400 is substantially similar to the bitstream 300, with the references omitted for simplicity. The external bitstream 401 is included to support random access. Specifically, the external bitstream 401 includes a set of reference pictures corresponding to each CRR picture. When random access occurs, the encoder and/or video server can transmit the main bitstream 400 starting at the access point, together with the portion of the external bitstream 401 that corresponds to the access point. For example, a user may select CRR3 for random access. The decoder then requests the main bitstream 400 starting at CRR3. The encoder/video server can then begin transmitting the main bitstream 400 at CRR3. The encoder/video server can also transmit the portion of the external bitstream 401 that corresponds to the random access point. In the present example, the encoder/video server would transmit IDR0 and CRR2.

In this way, the decoder receives both the CRR picture at the random access point and all reference pictures needed to decode that CRR picture. The decoder can then decode CRR3 and begin displaying the video from that point. To reduce data transmission, the encoder/video server may send only the portion of the external bitstream 401 needed to decode the random access point, and may refrain from sending further data unless random access occurs again and/or subsequent CRR pictures employ reference pictures that were not provided at the current random access point.
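One way a server might decide which part of the external bitstream to send is sketched below. This is a hypothetical illustration under stated assumptions: the per-CRR reference sets are taken from the FIG. 3 description, and `external_pictures_to_send` along with its bookkeeping set is an invented helper, not part of any delivery protocol.

```python
def external_pictures_to_send(access_point, reference_sets, already_sent):
    """Return external-bitstream pictures still needed for `access_point`.

    `reference_sets` maps each CRR picture to the reference pictures held in
    the external bitstream for it. `already_sent` is updated in place so that
    pictures are not retransmitted on a later access.
    """
    to_send = [p for p in reference_sets[access_point] if p not in already_sent]
    already_sent.update(to_send)
    return to_send


reference_sets = {
    "CRR3": ["IDR0", "CRR2"],
    "CRR4": ["IDR0", "CRR2"],
}
sent = set()
first = external_pictures_to_send("CRR3", reference_sets, sent)
second = external_pictures_to_send("CRR4", reference_sets, sent)
print(first, second)
```

The second call returns nothing to send because the references for CRR4 were already provided at the first random access point.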

FIG. 5 is a schematic diagram 500 showing a potential decoding error when a picture follows a DRAP and/or CRR picture in decoding order and precedes the DRAP and/or CRR picture in output order. As in the previous figures, the arrows represent inter prediction, with each arrow pointing to an inter-predicted picture and originating from the associated reference picture.

Encoders are permitted to reorder pictures to increase compression. As such, the order in which pictures should be presented to a user is known as the output order. The order in which pictures are coded into the bitstream is known as the decoding order. Pictures can be identified by a picture order count. A picture order count can be any ascending value that uniquely identifies a picture. In diagram 500, the pictures are shown in decoding order. However, the pictures are numbered based on their picture order counts, which increase in output order. As the picture order counts show, picture 8 is out of output order and follows picture 10, which is a random access point. Accordingly, picture 8 is an inter-predicted picture 503 that precedes the random access point in output order and follows the random access point in decoding order. In the present example, picture 10 is a DRAP/CRR picture 505, which may be a DRAP picture or a CRR/EDRAP/type 2 DRAP picture, depending on the example. In the present example, the inter-predicted picture 503 is coded via inter prediction by reference 507 to picture 6. Accordingly, picture 6 is the reference picture 502 for the inter-predicted picture 503.

Diagram 500 illustrates a potential coding error that arises because the inter-predicted picture 503 references 507 the reference picture 502 via inter prediction. Specifically, the inter-predicted picture 503 follows the DRAP/CRR picture 505 in decoding order, precedes the DRAP/CRR picture 505 in output order, and references 507 the reference picture 502, which is positioned prior to the DRAP/CRR picture 505 in decoding order. When the bitstream is decoded starting from picture 4, which is an IRAP picture of type IDR, the reference picture 502 is decoded and stored in the reference picture buffer, and hence the inter-predicted picture 503 can be properly decoded. However, when the DRAP/CRR picture 505 is used for random access, the reference picture 502 is skipped and not decoded. Accordingly, the inter-predicted picture 503 cannot be correctly decoded when the inter-predicted picture 503 references the reference picture 502.

The encoder has the option of disallowing the reference 507. For example, the encoder can restrict all inter-predicted pictures 503 to reference only the picture at the associated random access point and pictures that follow the associated access point in decoding order. If the reference 507 is disallowed, the inter-predicted picture 503 can always be decoded, because the inter-predicted picture 503 is not permitted to reference any picture prior to the DRAP/CRR picture 505. If the reference 507 is allowed, however, the inter-predicted picture 503 cannot be decoded after random access whenever the encoder decides to encode the inter-predicted picture 503 with reference to the reference picture 502. Note that allowing the reference 507 does not always cause an error, because the encoder is not required to use the reference 507. However, if the reference 507 is allowed, an error occurs whenever the reference 507 is selected and the DRAP/CRR picture 505 is used for random access. The result may appear to the user as random errors, which degrades the user experience.
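The failure mode described for diagram 500 can be reproduced with a small simulation. This is a sketch under assumptions: only the facts that picture 4 is an IDR picture, picture 10 is the DRAP/CRR picture, and picture 8 references picture 6 come from the description; the remaining pictures, references, and the function name `undecodable_after_random_access` are hypothetical.

```python
def undecodable_after_random_access(decoding_order, refs, rap, pre_decoded):
    """Return pictures at or after `rap` (in decoding order) that cannot be decoded.

    `pre_decoded` holds the pictures decoded before jumping to `rap`
    (for example, the IRAP picture that the DRAP/CRR picture depends on).
    A picture is decodable only if all of its reference pictures were decoded.
    """
    decoded = set(pre_decoded)
    failed = []
    start = decoding_order.index(rap)
    for picture in decoding_order[start:]:
        if all(ref in decoded for ref in refs.get(picture, [])):
            decoded.add(picture)
        else:
            failed.append(picture)
    return failed


# Pictures are identified by picture order count and listed in decoding order.
decoding_order = [4, 6, 10, 8, 12]
refs = {4: [], 6: [4], 10: [4], 8: [6], 12: [10]}

# Random access at picture 10 after decoding only the IDR picture 4.
failed = undecodable_after_random_access(decoding_order, refs, rap=10, pre_decoded={4})
print(failed)
```

Picture 8 fails because its reference, picture 6, was skipped; decoding from picture 4 instead would decode every picture.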

The present disclosure includes several mechanisms to address this issue. For example, the encoder can signal to the decoder whether the reference 507 is allowed. When the reference 507 is allowed and the DRAP/CRR picture 505 is used for random access, the decoder should not display inter-predicted pictures 503 that precede the DRAP/CRR picture 505 in output order and follow the DRAP/CRR picture 505 in decoding order, because such inter-predicted pictures 503 may or may not be decodable (depending on whether the encoder chose to use the reference 507). When the reference 507 is not allowed and the DRAP/CRR picture 505 is used for random access, the decoder should display the inter-predicted pictures 503 associated with the DRAP/CRR picture 505. Moreover, DRAP and CRR signaling mechanisms are not fully specified. Accordingly, the present disclosure includes mechanisms for signaling descriptions of DRAP and CRR usage in media files so that a decoder can decode the DRAP/CRR pictures 505 and/or associated pictures more efficiently after random access.

In another example, the coding process can be constrained so that the reference 507 does not occur. For example, pictures can be separated into layers, and each layer can be associated with a different frame rate. This allows the decoder to select a layer with a frame rate that the decoder can support. The decoder then displays all pictures in the selected layer and all pictures in the layers below the selected layer to achieve the desired frame rate. The error shown in diagram 500 can be prevented when the encoder requires that any picture (e.g., the inter-predicted picture 503) that is in the same layer as the DRAP/CRR picture 505 and follows the DRAP/CRR picture 505 in decoding order shall follow, in output order, any picture that is in the same layer and precedes the DRAP/CRR picture 505 in decoding order.
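The output-order constraint described above can be checked mechanically. The following is a minimal sketch, not a normative conformance check; the `Picture` record and the function name are hypothetical and are introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    decode_order: int  # position in decoding order
    output_order: int  # position in output (display) order
    layer: int         # layer identifier (hypothetical field)


def satisfies_output_order_constraint(pictures, rap):
    """Return True if every same-layer picture that follows `rap` in decoding
    order also follows, in output order, every same-layer picture that
    precedes `rap` in decoding order."""
    same_layer = [p for p in pictures if p.layer == rap.layer and p != rap]
    before = [p for p in same_layer if p.decode_order < rap.decode_order]
    after = [p for p in same_layer if p.decode_order > rap.decode_order]
    if not before or not after:
        return True  # nothing to compare, so the constraint holds trivially
    latest_output_before = max(p.output_order for p in before)
    return all(p.output_order > latest_output_before for p in after)
```

A picture in a different layer is ignored by this check, which mirrors the "same layer" wording of the constraint.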

FIG. 6 is a schematic diagram of a media file 600 stored in ISOBMFF. For example, the media file 600 can be stored in ISOBMFF and used as a DASH representation. The ISOBMFF media file 600 is stored in a plurality of boxes that carry objects and/or data associated with media content or a media presentation. For example, the media file 600 may include a file type box 630 (e.g., ftyp), a movie box 610 (e.g., moov), and a media data box 620 (e.g., mdat).

The file type box 630 can carry data describing the entire file, and hence can carry file-level data. Accordingly, a file-level box is any box that contains data relevant to the entire media file 600. For example, the file type box 630 may include a file type that indicates a version number of the ISO specification and/or compatibility information for the media file 600. The movie box 610 can carry data describing a movie contained in the media file, and hence can carry movie-level data. A movie-level box is any box that contains data describing an entire movie contained in the media file 600. The movie box 610 can contain a wide range of sub-boxes that hold data for various uses. For example, the movie box 610 contains track boxes (trak) that carry metadata describing the tracks of a media presentation. Note that a track may be referred to as a timed sequence of related samples. For example, a media track may contain a sequence of pictures or sampled audio, whereas a metadata track may contain a sequence of metadata corresponding to the pictures and/or audio. Data that describes a track is track-level data, and so any box that describes a track is a track-level box.

The media data box 620 contains the interleaved and time-ordered media data (e.g., coded video pictures and/or audio) of the media presentation. For example, the media data box 620 may contain a bitstream of video data coded according to VVC, AVC, HEVC, or the like. The media data box 620 may contain video pictures, audio, text, or other media data for display to a user. In ISOBMFF, pictures, audio, and text are collectively referred to as samples. This contrasts with the terminology of video coding standards, in which the pixels to be encoded/decoded are called samples. As such, the term sample can refer to an entire picture (at the file format level) or to a group of pixels (at the bitstream level), depending on the context.
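The box layout described above can be illustrated with a short sketch that walks the top-level boxes of an ISOBMFF byte string. It relies only on the generic box header (a 32-bit size followed by a four-character type, with size 1 signaling a 64-bit size and size 0 signaling a box that runs to the end of the file). The function names are hypothetical, and the sketch does not descend into nested boxes.

```python
import struct


def iter_top_level_boxes(data: bytes):
    """Yield (box_type, payload_offset, payload_size) for each top-level box."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # the box extends to the end of the file
            size = len(data) - offset
        if size < header or offset + size > len(data):
            break  # malformed or truncated box: stop rather than guess
        yield box_type.decode("ascii", "replace"), offset + header, size - header
        offset += size


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a 32-bit size header (helper for the example only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload
```

For a file built as `make_box(b"ftyp", ...) + make_box(b"moov", ...) + make_box(b"mdat", ...)`, the iterator yields the three box types in file order, so a reader can locate the moov metadata before touching the mdat payload.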

As mentioned above, this disclosure provides additional mechanisms for signaling DRAP and/or CRR usage at the file format level. This allows the decoder to learn about DRAP and/or CRR usage by loading the parameters in the moov box 610 before actually decoding the bitstream(s) of samples contained in the mdat box 620. For example, the moov box 610 may include a DRAP sample group box 625 and/or an EDRAP sample group box 621. A sample group box can describe which samples are of the type that corresponds to that sample group box. In one example, both DRAP and CRR samples are described in the DRAP sample group box 625, for example by treating CRR as a subtype of DRAP. In another example, CRR samples are described by the EDRAP sample group box 621 and DRAP samples are described by the DRAP sample group box 625. In one example, the DRAP sample group 625 may contain DRAP sample entries 627. Each DRAP sample entry 627 can then describe an associated sample coded according to DRAP. In one example, the EDRAP sample group 621 may contain EDRAP sample entries 623. Each EDRAP sample entry 623 can then describe an associated sample coded according to CRR/EDRAP/type 2 DRAP. The description of each DRAP/CRR sample may include a sample identifier of the picture, identifiers of the samples containing the associated reference picture(s), an indication of the number of samples and/or RAP samples needed to perform random access from the picture, and/or additional information that helps the decoder select and perform random access at a DRAP/CRR picture.

The moov box 610 may also contain a wide range of other boxes 629. In some examples, the descriptions of the DRAP/CRR samples may be included in one or more of the other boxes 629. For example, the other boxes 629 may include a sample to group box (SampleToGroupBox), and the DRAP and/or CRR samples may be described in the SampleToGroupBox. In another example, the other boxes 629 may include a compact sample to group box (CompactSampleToGroupBox), and the DRAP and/or CRR samples may be described in the CompactSampleToGroupBox. As a specific example, the DRAP and/or CRR samples may be described in a group type parameter (group_type_parameter) field in the SampleToGroupBox and/or the CompactSampleToGroupBox. In another example, the other boxes 629 may include a sub-sample information box, and the DRAP and/or CRR samples may be described in the sub-sample information box. In another example, the other boxes 629 may include a sample auxiliary information sizes box, and the DRAP and/or CRR samples may be described in the sample auxiliary information sizes box. Further, any other box described herein may also be included in the other boxes 629 and may contain a description of DRAP and/or CRR samples.
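To illustrate how a SampleToGroupBox associates samples with a sample group, the sketch below expands its run-length entries, each a pair of sample_count and group_description_index, into one index per sample. In ISOBMFF an index of 0 means the sample is not a member of any group of that grouping type. The function names are hypothetical, and the sketch assumes the entries have already been parsed from the box.

```python
def expand_sample_to_group(entries, total_samples):
    """Expand (sample_count, group_description_index) runs into a per-sample list.

    Samples not covered by any run are treated as belonging to no group (index 0).
    """
    per_sample = []
    for sample_count, group_description_index in entries:
        per_sample.extend([group_description_index] * sample_count)
    per_sample.extend([0] * (total_samples - len(per_sample)))
    return per_sample[:total_samples]


def samples_in_group(entries, total_samples):
    """Return the 1-based sample numbers that are mapped to some group entry."""
    expanded = expand_sample_to_group(entries, total_samples)
    return [number for number, index in enumerate(expanded, start=1) if index != 0]
```

With entries `[(3, 0), (1, 1), (4, 0), (1, 2)]` and ten samples, samples 4 and 9 are the group members; a reader could treat those as candidate DRAP or CRR random access points and look up their sample entries by index.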

FIG. 7 is a schematic diagram of a bitstream 700 containing encoded visual media data. The bitstream 700 contains media data that has been coded/compressed by an encoder for decoding/decompression by a decoder. For example, the bitstream 700 may be included in the media data box 620 of the ISOBMFF media file 600. Further, the bitstream 700 may be included in a representation in DASH. The bitstream 700 may be coded according to various coding formats, such as VVC, AVC, EVC, and HEVC. In some coding formats, the bitstream 700 is expressed as a series of NAL units. A NAL unit is a unit of data sized to be placed in a data packet. For example, VVC includes several types of NAL units. The bitstream 700 may include video coding layer (VCL) NAL units, which contain video data, and non-VCL NAL units, which contain data that describes the VCL NAL units, the coding tools employed, coding constraints, and the like. In one example, the bitstream 700 may include pictures 710 coded in VCL NAL units. The pictures 710 may be IRAP pictures, inter-predicted pictures, DRAP pictures, CRR pictures, etc. The non-VCL NAL units may contain various messages and parameter sets that describe the mechanisms used to code the pictures 710. While many VCL NAL units are included in VVC, this disclosure focuses on SEI NAL units. For example, an SEI NAL unit may contain an SEI message. An SEI message contains data that assists processes related to decoding, display, or other purposes, but that is not required by the decoding process to determine the sample values in decoded pictures. In one example, the SEI messages may include a DRAP indication SEI message 716 and/or a type 2 DRAP indication SEI message 717. The DRAP indication SEI message 716 is an SEI message that contains data describing the use of DRAP pictures. The type 2 DRAP indication SEI message 717 is an SEI message that contains data describing the use of CRR/EDRAP/type 2 DRAP pictures. The DRAP indication SEI message 716 and/or the type 2 DRAP indication SEI message 717 may be associated with a DRAP and/or CRR/EDRAP/type 2 DRAP picture and may indicate how such pictures should be treated during decoding.

In one example, the DRAP indication SEI message 716 may contain an indication of whether a picture that follows a DRAP picture in decoding order and precedes the DRAP picture in output order is allowed to refer, for inter prediction, to a reference picture positioned before the DRAP picture in decoding order. In one example, the DRAP indication SEI message 716 may contain an indication of whether a picture that follows a CRR/EDRAP/type 2 DRAP picture in decoding order and precedes that picture in output order is allowed to refer, for inter prediction, to a reference picture positioned before that picture in decoding order. In one example, the type 2 DRAP indication SEI message 717 may contain an indication of whether a picture that follows a CRR/EDRAP/type 2 DRAP picture in decoding order and precedes that picture in output order is allowed to refer, for inter prediction, to a reference picture positioned before that picture in decoding order. Accordingly, depending on the example, the decoder can read the DRAP indication SEI message 716 and/or the type 2 DRAP indication SEI message 717 and determine whether pictures that follow the DRAP/CRR picture in decoding order and precede the DRAP/CRR picture in output order should be presented when the DRAP/CRR picture is used as a random access point.
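The presentation decision described above can be sketched as follows. This is an illustrative sketch only. It assumes that pictures are given as (decode_order, output_order) pairs, that the indication has already been read into a boolean, and that pictures may be presented when the references in question are disallowed; the function and parameter names are hypothetical.

```python
def pictures_to_present(pictures, rap, leading_refs_allowed):
    """Select pictures to present after random access at `rap`.

    pictures: list of (decode_order, output_order) tuples.
    rap: the (decode_order, output_order) tuple of the DRAP/CRR picture.
    leading_refs_allowed: True if pictures that follow `rap` in decoding order
        and precede it in output order may reference pictures before `rap`.
    """
    presented = []
    for decode_order, output_order in pictures:
        if decode_order < rap[0]:
            continue  # never decoded when decoding starts at the RAP
        is_leading = decode_order > rap[0] and output_order < rap[1]
        if is_leading and leading_refs_allowed:
            continue  # may depend on an unavailable picture, so suppress it
        presented.append((decode_order, output_order))
    return sorted(presented, key=lambda p: p[1])  # return in output order
```

When the indication says such references are allowed, the two pictures that precede the random access point in output order are withheld; otherwise they are presented ahead of it.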

In a specific example, the DRAP indication SEI message 716 can be associated with a DRAP picture, and the type 2 DRAP indication SEI message 717 can be associated with a CRR/EDRAP/type 2 DRAP picture. In a further example, the type 2 DRAP indication SEI message 717 may contain a T2drap_reserved_zero_13bits field 701, and a bit from the T2drap_reserved_zero_13bits field 701 may be used to indicate whether a picture that follows a CRR/EDRAP/type 2 DRAP picture in decoding order and precedes that picture in output order is allowed to refer, for inter prediction, to a reference picture positioned before that picture in decoding order. In another example, a field in the DRAP indication SEI message 716 may contain a similar indication for a DRAP picture. In other examples, a multi-bit indicator in the DRAP indication SEI message 716 and/or the type 2 DRAP indication SEI message 717 may be used for this purpose.
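As a rough illustration of carrying the indication in the 13-bit reserved field, the sketch below tests a single bit of the field value. The text does not say which bit is used, so the least significant bit chosen here is purely an assumption, as are the constant and function names.

```python
RESERVED_BITS = 13
LEADING_REF_FLAG_BIT = 0  # hypothetical position: least significant bit


def leading_refs_allowed(reserved_field_value: int) -> bool:
    """Return True if the (assumed) flag bit of the 13-bit field is set."""
    if not 0 <= reserved_field_value < (1 << RESERVED_BITS):
        raise ValueError("value does not fit in 13 bits")
    return bool((reserved_field_value >> LEADING_REF_FLAG_BIT) & 1)
```

A decoder that predates the flag would still see a 13-bit field of the same width, which is the usual motivation for repurposing a reserved bit rather than adding a new syntax element.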

FIG. 8 is a schematic diagram showing an example video processing system 800 in which the various techniques disclosed herein may be implemented. Various implementations may include some or all of the components of the system 800. The system 800 may include an input 802 for receiving video content. The video content may be received in a raw or uncompressed format, such as 8-bit or 10-bit multi-component pixel values, or may be received in a compressed or encoded format. The input 802 may represent a network interface, a peripheral bus interface, or a storage interface. Examples of network interfaces include wired interfaces such as Ethernet and passive optical network (PON), and wireless interfaces such as Wi-Fi or cellular interfaces.

The system 800 may include a coding component 804 that may implement the various coding or encoding methods described in this document. The coding component 804 may reduce the average bitrate of the video from the input 802 to the output of the coding component 804 to produce a coded representation of the video. The coding techniques are therefore sometimes called video compression or video transcoding techniques. The output of the coding component 804 may be either stored or transmitted via a connected communication link, as represented by the component 806. The stored or communicated bitstream (or coded) representation of the video received at the input 802 may be used by the component 808 to generate pixel values or displayable video that is sent to a display interface 810. The process of generating user-viewable video from the bitstream representation is sometimes called video decompression. Furthermore, while certain video processing operations are referred to as "coding" operations or tools, it will be understood that the coding tools or operations are used at an encoder and that corresponding decoding tools or operations, which reverse the results of the coding, will be performed by a decoder.

Examples of a peripheral bus interface or a display interface include universal serial bus (USB), high definition multimedia interface (HDMI), DisplayPort, and so on. Examples of storage interfaces include SATA (Serial Advanced Technology Attachment), PCI, IDE interfaces, and the like. The techniques described in this document may be embodied in various electronic devices, such as mobile phones, laptops, smartphones, or other devices capable of performing digital data processing and/or video display.

FIG. 9 is a schematic diagram of an example video processing apparatus 900. The apparatus 900 may be used to implement one or more of the methods disclosed herein. The apparatus 900 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on. The apparatus 900 may include one or more processors 902, one or more memories 904, and video processing hardware 906. The processor(s) 902 may be configured to implement one or more of the methods described in this document. The memory (memories) 904 may be used for storing data and code used for implementing the methods and techniques described herein. The video processing hardware 906 may implement, in hardware circuitry, some of the techniques described in this document. In some embodiments, the video processing hardware 906 may be at least partly included in the processor 902, for example as a graphics co-processor.

FIG. 10 is a flowchart of an example method 1000 of video processing. The method 1000 includes, at step 1002, determining (e.g., signaling) a description of CRR samples in a visual media data file in ISOBMFF. At step 1004, a conversion between the visual media data file and the visual media data is performed based on the CRR sample group. The description of the CRR samples may be included at various locations in the ISOBMFF media file. For example, the description of the CRR samples may be included in a CRR sample group, a type 2 DRAP sample group, an EDRAP sample group, and/or a DRAP sample group. In some examples, the description of the CRR samples may be included in a SampleToGroupBox and/or a CompactSampleToGroupBox, for example in a group_type_parameter. In some examples, the CRR samples may be denoted as type 2 DRAP samples and/or EDRAP samples. In this context, each sample contains an encoded picture.

The description of the CRR samples may include one or more sample identifiers that identify the samples belonging to (e.g., included in) a sample group. In another example, the description of the CRR samples may include identifiers of reference pictures for the CRR samples. In another example, the description of the CRR samples may include the number of samples required as references to decode the current sample. As an example, the description of the CRR samples may be included in a sample entry in a sample group. In some examples, the visual media data file may be constrained to support proper decoding. For example, the visual media data file may be constrained such that a current sample is one of the CRR samples only when the current sample refers only to the closest preceding initial sample, to one or more CRR samples that precede the current sample in decoding order, or to a combination thereof. In another example, the visual media data file may be constrained such that a current sample is one of the CRR samples only when the current sample and all samples that follow the current sample in both decoding order and output order can be correctly decoded when decoding starts at the current sample. In another example, the visual media data file may be constrained such that a current sample is one of the CRR samples only when the current sample and all samples following the current sample can be correctly decoded after decoding the closest preceding initial sample, one or more CRR samples that precede the current sample in decoding order, or a combination thereof.
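The first constraint above, on which samples a CRR sample may reference, can be expressed as a simple set check. The sketch below is illustrative only. It assumes samples are identified by integer sample numbers and that the referenced samples are already known; all names are hypothetical.

```python
def may_be_crr_sample(referenced, nearest_initial, earlier_crr):
    """Return True if every referenced sample is either the closest preceding
    initial sample or a CRR sample that precedes the current sample in
    decoding order."""
    allowed = {nearest_initial} | set(earlier_crr)
    return set(referenced) <= allowed


def samples_needed_for_random_access(current, referenced):
    """List, in ascending sample-number order, the samples to decode in order
    to start playback at `current` (its references first, then the sample)."""
    return sorted(set(referenced)) + [current]
```

For a sample 25 that references samples 1 and 9, where sample 1 is the closest preceding initial sample and sample 9 is an earlier CRR sample, the check passes and a reader performing random access would fetch samples 1, 9, and 25.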

FIG. 11 is a schematic diagram illustrating an example video coding system 1100 that may utilize the techniques of this disclosure. As shown in FIG. 11, the video coding system 1100 may include a source device 1110 and a destination device 1120. The source device 1110 generates encoded video data and may be referred to as a video encoding device. The destination device 1120 may decode the encoded video data generated by the source device 1110 and may be referred to as a video decoding device.

The source device 1110 may include a video source 1112, a video encoder 1114, and an input/output (I/O) interface 1116. The video source 1112 may include a source such as a video capture device, an interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources. The video data may comprise one or more pictures. The video encoder 1114 encodes the video data from the video source 1112 to generate a bitstream. The bitstream may include a sequence of bits that form a coded representation of the video data. The bitstream may include coded pictures and associated data. A coded picture is a coded representation of a picture. The associated data may include sequence parameter sets, picture parameter sets, and other syntax structures. The I/O interface 1116 may include a modulator/demodulator (modem) and/or a transmitter. The encoded video data may be transmitted directly to the destination device 1120 via the I/O interface 1116 through a network 1130. The encoded video data may also be stored on a storage medium/server 1140 for access by the destination device 1120.

The destination device 1120 may include an I/O interface 1126, a video decoder 1124, and a display device 1122. The I/O interface 1126 may include a receiver and/or a modem. The I/O interface 1126 may acquire encoded video data from the source device 1110 or the storage medium/server 1140. The video decoder 1124 may decode the encoded video data. The display device 1122 may display the decoded video data to a user. The display device 1122 may be integrated with the destination device 1120, or may be external to the destination device 1120, in which case the destination device 1120 is configured to interface with an external display device.

The video encoder 1114 and the video decoder 1124 may operate according to a video compression standard, such as the High Efficiency Video Coding (HEVC) standard, the Versatile Video Coding (VVC) standard, and other current and/or further standards.

FIG. 12 is a block diagram illustrating an example of a video encoder 1200, which may be the video encoder 1114 in the system 1100 illustrated in FIG. 11. The video encoder 1200 may be configured to perform any or all of the techniques of this disclosure. In the example of FIG. 12, the video encoder 1200 includes a plurality of functional components. The techniques described in this disclosure may be shared among the various components of the video encoder 1200. In some examples, a processor may be configured to perform any or all of the techniques described in this disclosure.

The functional components of the video encoder 1200 may include a partition unit 1201; a prediction unit 1202, which may include a mode selection unit 1203, a motion estimation unit 1204, a motion compensation unit 1205, and an intra prediction unit 1206; a residual generation unit 1207; a transform processing unit 1208; a quantization unit 1209; an inverse quantization unit 1210; an inverse transform unit 1211; a reconstruction unit 1212; a buffer 1213; and an entropy encoding unit 1214.

In other examples, the video encoder 1200 may include more, fewer, or different functional components. In one example, the prediction unit 1202 includes an intra block copy (IBC) unit. The IBC unit may perform prediction in an IBC mode, in which at least one reference picture is the picture in which the current video block is located.

Furthermore, some components, such as the motion estimation unit 1204 and the motion compensation unit 1205, may be highly integrated, but are shown separately in the example of FIG. 12 for purposes of explanation.

The partition unit 1201 may partition a picture into one or more video blocks. The video encoder 1200 and the video decoder 1300 may support various video block sizes.

The mode selection unit 1203 may select one of the coding modes, intra or inter, for example based on error results, and may provide the resulting intra-coded or inter-coded block to the residual generation unit 1207 to generate residual block data and to the reconstruction unit 1212 to reconstruct the encoded block for use as a reference picture. In some examples, the mode selection unit 1203 may select a combination of intra and inter prediction (CIIP) mode, in which the prediction is based on an inter prediction signal and an intra prediction signal. In the case of inter prediction, the mode selection unit 1203 may also select a resolution for a motion vector for the block (e.g., sub-pixel or integer-pixel precision).

To perform inter prediction on a current video block, the motion estimation unit 1204 may generate motion information for the current video block by comparing one or more reference frames from the buffer 1213 to the current video block. The motion compensation unit 1205 may determine a predicted video block for the current video block based on the motion information and on decoded samples of pictures from the buffer 1213 other than the picture associated with the current video block.

The motion estimation unit 1204 and the motion compensation unit 1205 may perform different operations for a current video block depending on, for example, whether the current video block is in an I slice, a P slice, or a B slice.

In some examples, the motion estimation unit 1204 may perform uni-directional prediction for the current video block, and may search the reference pictures of list 0 or list 1 for a reference video block for the current video block. The motion estimation unit 1204 may then generate a reference index that indicates the reference picture in list 0 or list 1 that contains the reference video block, and a motion vector that indicates a spatial displacement between the current video block and the reference video block. The motion estimation unit 1204 may output the reference index, a prediction direction indicator, and the motion vector as the motion information of the current video block. The motion compensation unit 1205 may generate the predicted video block of the current video block based on the reference video block indicated by the motion information of the current video block.

In other examples, the motion estimation unit 1204 may perform bi-directional prediction for the current video block. The motion estimation unit 1204 may search the reference pictures in list 0 for a reference video block for the current video block, and may also search the reference pictures in list 1 for another reference video block for the current video block. The motion estimation unit 1204 may then generate reference indexes that indicate the reference pictures in list 0 and list 1 containing the reference video blocks, and motion vectors that indicate spatial displacements between the reference video blocks and the current video block. The motion estimation unit 1204 may output the reference indexes and the motion vectors of the current video block as the motion information of the current video block. The motion compensation unit 1205 may generate the predicted video block of the current video block based on the reference video blocks indicated by the motion information of the current video block.
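The search for a reference video block described above can be illustrated with a minimal block-matching sketch. This is not the method of this disclosure: it assumes an exhaustive integer-pixel search using the sum of absolute differences (SAD) as the matching cost, and the names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, rx, ry):
    """Sum of absolute differences between block `cur` and the same-sized
    window of `ref` whose top-left corner is at column rx, row ry."""
    return sum(
        abs(cur[y][x] - ref[ry + y][rx + x])
        for y in range(len(cur))
        for x in range(len(cur[0]))
    )


def full_search(cur, ref, bx, by, search_range):
    """Return (motion_vector, cost) with the lowest SAD within
    +/- search_range pixels of the block position (bx, by).
    The motion vector is (dx, dy), a spatial displacement in pixels."""
    h, w = len(cur), len(cur[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or ry + h > len(ref) or rx + w > len(ref[0]):
                continue
            cost = sad(cur, ref, rx, ry)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

For bi-directional prediction, the same search would simply be run once per reference picture list, yielding one motion vector and one reference index for each list.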

In some examples, the motion estimation unit 1204 may output a full set of motion information for the decoding process of a decoder. In some examples, the motion estimation unit 1204 may not output a full set of motion information for the current video block. Rather, the motion estimation unit 1204 may signal the motion information of the current video block with reference to the motion information of another video block. For example, the motion estimation unit 1204 may determine that the motion information of the current video block is sufficiently similar to the motion information of a neighboring video block.

In one example, the motion estimation unit 1204 may indicate, in a syntax structure associated with the current video block, a value that indicates to the video decoder 1300 of FIG. 13 that the current video block has the same motion information as another video block.

In another example, the motion estimation unit 1204 may identify, in a syntax structure associated with the current video block, another video block and a motion vector difference (MVD). The motion vector difference indicates the difference between the motion vector of the current video block and the motion vector of the indicated video block. The video decoder 1300 may use the motion vector of the indicated video block and the motion vector difference to determine the motion vector of the current video block.
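The MVD relationship in the preceding paragraph amounts to a subtraction at the encoder and an addition at the decoder. The sketch below shows only that arithmetic; the function names are hypothetical, and motion vectors are assumed to be simple (x, y) integer pairs.

```python
def encode_mvd(mv, predictor_mv):
    """Encoder side: difference between the current block's motion vector
    and the motion vector of the indicated (predictor) block."""
    return (mv[0] - predictor_mv[0], mv[1] - predictor_mv[1])


def decode_mv(predictor_mv, mvd):
    """Decoder side: recover the current block's motion vector from the
    indicated block's motion vector plus the signaled difference."""
    return (predictor_mv[0] + mvd[0], predictor_mv[1] + mvd[1])
```

When the current motion vector is close to the predictor, the difference is small, which is what makes this form of signaling cheaper than sending the full vector.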

As discussed above, the video encoder 1200 may predictively signal the motion vector. Two examples of predictive signaling techniques that may be implemented by the video encoder 1200 are advanced motion vector prediction (AMVP) and merge mode signaling.

The intra prediction unit 1206 may perform intra prediction on the current video block. When the intra prediction unit 1206 performs intra prediction on the current video block, it may generate prediction data for the current video block based on decoded samples of other video blocks in the same picture. The prediction data for the current video block may include a predicted video block and various syntax elements.

The residual generation unit 1207 may generate residual data for the current video block by subtracting the predicted video block(s) of the current video block from the current video block. The residual data of the current video block may include residual video blocks that correspond to different sample components of the samples in the current video block.

In other examples, for example in a skip mode, there may be no residual data for the current video block, and the residual generation unit 1207 may not perform the subtraction operation.
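As an illustration of the subtraction described above, the following sketch computes a residual block sample by sample and returns nothing in skip mode. The function name and the `skip` parameter are hypothetical, and blocks are assumed to be lists of rows of integer samples.

```python
def make_residual(current, predicted, skip=False):
    """Return current - predicted, sample by sample.
    In skip mode no residual is produced, so None is returned."""
    if skip:
        return None
    return [
        [c - p for c, p in zip(cur_row, pred_row)]
        for cur_row, pred_row in zip(current, predicted)
    ]
```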

The transform processing unit 1208 may generate one or more transform coefficient video blocks for the current video block by applying one or more transforms to a residual video block associated with the current video block.

After the transform processing unit 1208 generates a transform coefficient video block associated with the current video block, the quantization unit 1209 may quantize the transform coefficient video block based on one or more quantization parameter (QP) values associated with the current video block.

The inverse quantization unit 1210 and the inverse transform unit 1211 may apply inverse quantization and an inverse transform, respectively, to the transform coefficient video block to reconstruct a residual video block from the transform coefficient video block. The reconstruction unit 1212 may add the reconstructed residual video block to corresponding samples from one or more predicted video blocks generated by the prediction unit 1202 to produce a reconstructed video block associated with the current video block for storage in the buffer 1213.
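The quantization and inverse quantization steps above can be sketched with a uniform scalar quantizer. The mapping from QP to step size used here (the step doubles every 6 QP values and equals 1 at QP 4) is an assumption borrowed from common HEVC-style descriptions, not something this document specifies, and real codecs use integer scaling tables rather than floating point.

```python
import math


def qstep(qp):
    """Assumed QP-to-step-size mapping: doubles every 6 QP, 1.0 at QP 4."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeffs, qp):
    """Map transform coefficients to integer levels (round half away from zero)."""
    step = qstep(qp)
    levels = []
    for c in coeffs:
        magnitude = int(math.floor(abs(c) / step + 0.5))
        levels.append(magnitude if c >= 0 else -magnitude)
    return levels


def dequantize(levels, qp):
    """Inverse quantization: scale the levels back by the step size."""
    step = qstep(qp)
    return [level * step for level in levels]
```

Quantization is lossy: dequantizing the levels returns values on the step-size grid, not the original coefficients, which is why the encoder reconstructs from the dequantized data so that its reference pictures match the decoder's.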

After the reconstruction unit 1212 reconstructs the video block, a loop filtering operation may be performed to reduce video blocking artifacts in the video block.

The entropy encoding unit 1214 may receive data from other functional components of the video encoder 1200. When the entropy encoding unit 1214 receives the data, it may perform one or more entropy encoding operations to generate entropy-encoded data and output a bitstream that includes the entropy-encoded data.

FIG. 13 is a block diagram illustrating an example of a video decoder 1300, which may be the video decoder 1124 in the system 1100 shown in FIG. 11.

The video decoder 1300 may be configured to perform any or all of the techniques of this disclosure. In the example of FIG. 13, the video decoder 1300 includes a plurality of functional components. The techniques described in this disclosure may be shared among the various components of the video decoder 1300. In some examples, a processor may be configured to perform any or all of the techniques described in this disclosure.

In the example of FIG. 13, the video decoder 1300 includes an entropy decoding unit 1301, a motion compensation unit 1302, an intra prediction unit 1303, an inverse quantization unit 1304, an inverse transform unit 1305, a reconstruction unit 1306, and a buffer 1307. In some examples, the video decoder 1300 may perform a decoding pass generally reciprocal to the encoding pass described with respect to the video encoder 1200 (FIG. 12).

The entropy decoding unit 1301 may retrieve an encoded bitstream. The encoded bitstream may include entropy-coded video data (e.g., encoded blocks of video data). The entropy decoding unit 1301 may decode the entropy-coded video data, and from the entropy-decoded video data, the motion compensation unit 1302 may determine motion information including motion vectors, motion vector precision, reference picture list indexes, and other motion information. The motion compensation unit 1302 may, for example, determine such information by performing AMVP and merge mode.

The motion compensation unit 1302 may produce motion-compensated blocks, possibly performing interpolation based on interpolation filters. Identifiers for the interpolation filters to be used with sub-pixel precision may be included in the syntax elements.

The motion compensation unit 1302 may use the interpolation filters used by the video encoder 1200 during encoding of the video block to calculate interpolated values for sub-integer pixels of a reference block. The motion compensation unit 1302 may determine the interpolation filters used by the video encoder 1200 according to received syntax information and use those interpolation filters to produce predictive blocks.
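To make the idea of interpolated values at sub-integer pixel positions concrete, the sketch below samples a reference block at a fractional position. It uses plain bilinear interpolation as a stand-in; actual codecs signal and apply longer separable filters, so this is only an assumption-laden illustration with a hypothetical function name.

```python
def sample_bilinear(ref, x, y):
    """Interpolate `ref` (a list of rows) at fractional position (x, y).
    Assumes 0 <= x <= width - 1 and 0 <= y <= height - 1."""
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0
    # Clamp the right/bottom neighbors at the block border.
    x1 = min(x0 + 1, len(ref[0]) - 1)
    y1 = min(y0 + 1, len(ref) - 1)
    top = ref[y0][x0] * (1 - fx) + ref[y0][x1] * fx
    bottom = ref[y1][x0] * (1 - fx) + ref[y1][x1] * fx
    return top * (1 - fy) + bottom * fy
```

The key point carried over from the text is that encoder and decoder must apply the same filter, since any mismatch would make the decoder's prediction drift from the encoder's.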

The motion compensation unit 1302 may use some of the syntax information to determine the sizes of blocks used to encode the frame(s) and/or slice(s) of the encoded video sequence, partition information that describes how each macroblock of a picture of the encoded video sequence is partitioned, modes indicating how each partition is encoded, one or more reference frames (and reference frame lists) for each inter-coded block, and other information for decoding the encoded video sequence.

The intra prediction unit 1303 may use intra prediction modes, received for example in the bitstream, to form a prediction block from spatially adjacent blocks. The inverse quantization unit 1304 inverse quantizes (i.e., de-quantizes) the quantized video block coefficients provided in the bitstream and decoded by the entropy decoding unit 1301. The inverse transform unit 1305 applies an inverse transform.

The reconstruction unit 1306 may sum the residual blocks with the corresponding prediction blocks generated by the motion compensation unit 1302 or the intra prediction unit 1303 to form decoded blocks. If desired, a deblocking filter may also be applied to filter the decoded blocks in order to remove blockiness artifacts. The decoded video blocks are then stored in the buffer 1307, which provides reference blocks for subsequent motion compensation/intra prediction and also produces decoded video for presentation on a display device.
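The summation performed by the reconstruction unit can be sketched as follows. Clipping the result to the valid sample range is an assumption added here for completeness (the text above does not mention it), and the function name and `bit_depth` parameter are hypothetical.

```python
def reconstruct(predicted, residual, bit_depth=8):
    """Add the residual to the prediction and clip to [0, 2**bit_depth - 1]."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(predicted, residual)
    ]
```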

FIG. 14 is a schematic diagram of an example encoder 1400. The encoder 1400 is suitable for implementing the techniques of VVC. The encoder 1400 includes three in-loop filters, namely a deblocking filter (DF) 1402, a sample adaptive offset (SAO) 1404, and an adaptive loop filter (ALF) 1406. Unlike the DF 1402, which uses predefined filters, the SAO 1404 and the ALF 1406 utilize the original samples of the current picture to reduce the mean square errors between the original samples and the reconstructed samples by adding an offset and by applying a finite impulse response (FIR) filter, respectively, with coded side information signaling the offsets and filter coefficients. The ALF 1406 is located at the last processing stage of each picture and can be regarded as a tool that tries to catch and fix artifacts created by the previous stages.
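The "adding an offset" step attributed to SAO can be illustrated with a band-offset style sketch. The split of the sample range into 32 equal bands is an assumption taken from common descriptions of HEVC-style SAO, not from this document, and the function name and the dictionary form of the signaled offsets are hypothetical.

```python
def apply_band_offsets(samples, offsets, bit_depth=8):
    """Add a signaled offset to each reconstructed sample according to the
    intensity band it falls in, then clip to the valid sample range.
    `offsets` maps a band index (0..31) to an integer offset."""
    shift = bit_depth - 5  # 32 equal-width bands
    max_val = (1 << bit_depth) - 1
    out = []
    for s in samples:
        band = s >> shift
        s = s + offsets.get(band, 0)  # bands without an entry are unchanged
        out.append(min(max(s, 0), max_val))
    return out
```

An encoder would choose the offsets by comparing reconstructed samples with the original samples, which is why they have to be sent to the decoder as side information.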

The encoder 1400 may further include an intra prediction component 1408 and a motion estimation/compensation (ME/MC) component 1410 configured to receive input video. The intra prediction component 1408 is configured to perform intra prediction, while the ME/MC component 1410 is configured to utilize reference pictures obtained from a reference picture buffer 1412 to perform inter prediction. Residual blocks from inter prediction or intra prediction are fed into a transform (T) component 1414 and a quantization (Q) component 1416 to generate quantized residual transform coefficients, which are fed into an entropy coding component 1418. The entropy coding component 1418 entropy codes the prediction results and the quantized transform coefficients and transmits them toward a video decoder (not shown). The quantized coefficients output from the quantization component 1416 may be fed into an inverse quantization (IQ) component 1420, an inverse transform component 1422, and a reconstruction (REC) component 1424. The REC component 1424 may output images to the DF 1402, the SAO 1404, and the ALF 1406 for filtering before those images are stored in the reference picture buffer 1412.

A listing of solutions preferred by some examples is provided next.

The following solutions show examples of the techniques discussed herein.

1. A visual media processing method (e.g., method 1000 depicted in FIG. 10), comprising: performing (1004) a conversion between a video comprising a picture and a bitstream of the video, wherein the picture is coded in the bitstream as a dependent random access point (DRAP) picture; wherein the bitstream conforms to a format rule; and wherein the format rule specifies whether a syntax element is included in a supplemental enhancement information (SEI) message, the syntax element indicating whether one or more pictures that are in the same layer as the DRAP picture, follow the DRAP picture in decoding order, and precede the DRAP picture in output order refer, for inter prediction, to a picture in the same layer that is earlier than the DRAP picture in decoding order.

2. The method of solution 1, wherein the SEI message is a DRAP indication SEI message.

3. The method of solution 1, wherein the SEI message is different from a DRAP indication SEI message included in the bitstream.

4. The method of any of solutions 2 to 3, wherein the format rule specifies that the presence of the SEI message indicates that one or more pictures that are in the same layer as the DRAP picture, follow the DRAP picture in decoding order, and precede the DRAP picture in output order are allowed to refer, for inter prediction, to a picture in the same layer that is earlier than the DRAP picture in decoding order.

5. The method of any of solutions 2 to 3, wherein the format rule specifies that the presence of the SEI message indicates that one or more pictures that are in the same layer as the DRAP picture, follow the DRAP picture in decoding order, and precede the DRAP picture in output order are not allowed to refer, for inter prediction, to a picture in the same layer that is earlier than the DRAP picture in decoding order.

6. The method of any of solutions 1 to 5, wherein the syntax element comprises a one-bit flag.
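Solutions 1 to 6 all concern the same set of pictures: those that follow the DRAP picture in decoding order but precede it in output order. The sketch below identifies that set for a single layer. The `Picture` type and function name are hypothetical, and decoding and output positions are assumed to be plain integers.

```python
from typing import List, NamedTuple


class Picture(NamedTuple):
    decode_order: int  # position in decoding order
    output_order: int  # position in output (display) order


def pictures_after_drap_in_decoding_before_in_output(
    drap: Picture, pictures: List[Picture]
) -> List[Picture]:
    """Return the pictures (same layer assumed) that follow `drap` in
    decoding order and precede it in output order."""
    return [
        p
        for p in pictures
        if p.decode_order > drap.decode_order and p.output_order < drap.output_order
    ]
```

The signaled flag then says whether pictures in this set refer, for inter prediction, to pictures decoded before the DRAP picture, which determines whether they can be decoded correctly when random access starts at the DRAP picture.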

The following solutions show example embodiments of the techniques discussed in the previous section.

7. A method of video processing, comprising: performing a conversion between a video comprising one or more pictures and a bitstream of the video, wherein the bitstream comprises a type 2 dependent random access point (DRAP) picture; wherein the bitstream conforms to a format rule; and wherein the format rule specifies that a particular type of DRAP indication syntax message is included in the bitstream to indicate whether pictures in a layer that follow the type 2 DRAP picture in decoding order but precede the type 2 DRAP picture in output order are allowed to refer, for inter prediction, to a picture in the layer that is earlier than the type 2 DRAP picture in decoding order.

8. The method of solution 7, wherein the particular type of DRAP indication syntax message corresponds to a type 2 DRAP indication syntax message.

9. The method of solution 7, wherein the particular type of DRAP indication syntax message corresponds to a DRAP indication syntax message.

10. The method of any of solutions 7 to 9, wherein the syntax element comprises a one-bit flag.

11. A method of video processing, comprising: performing a conversion between a video and a bitstream of the video, wherein the bitstream conforms to a format rule that specifies whether and how a cross random access point reference (CRR) is signaled in a file format that stores the bitstream.

12. The method of solution 11, wherein the format rule defines a sample group that indicates the CRR.

13. The method of solution 11, wherein the format rule defines that a dependent random access point (DRAP) sample group includes the CRR.

14. The method of solution 13, wherein the DRAP sample group that signals the CRR includes a version field or a grouping_type_parameter field for signaling the CRR.

15. A method of video processing, comprising: performing a conversion between a video and a bitstream of the video, wherein the bitstream conforms to a format rule that specifies that, when the bitstream includes a dependent random access point (DRAP) picture, a field is included in a DRAP sample entry indicating the number of random access point (RAP) samples required for random access from a member of a DRAP sample group.

16. The method of solution 15, wherein the format rule further specifies including another field indicating RAP identifiers for the members of the DRAP sample group.

17. The method of any of solutions 1 to 16, wherein a dependent random access point (DRAP) sample is a sample after which all samples in both decoding and output order can be correctly decoded if the closest initial sample preceding the DRAP sample is available for reference.

18. The method of any of solutions 1 to 17, further comprising storing the bitstream in a file conforming to a file format.

19. The method of any of solutions 1 to 17, wherein the bitstream is read from a file conforming to a file format.

20. The method of any of solutions 18 to 19, wherein the file format is the International Organization for Standardization (ISO) base media file format (ISOBMFF).

21. A video decoding apparatus comprising a processor configured to implement a method recited in one or more of solutions 1 to 20.

22. A video encoding apparatus comprising a processor configured to implement a method recited in one or more of solutions 1 to 20.

23. A computer program product having computer code stored thereon, the code, when executed by a processor, causing the processor to implement a method recited in any of solutions 1 to 20.

24. A computer-readable medium storing a bitstream that conforms to a bitstream format generated according to any of solutions 1 to 20.

25. A method comprising generating a bitstream according to a method recited in any of solutions 1 to 20 and writing the bitstream to a computer-readable medium.

26. A method, an apparatus, or a system described in the present document.
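Solutions 15 to 17 describe a DRAP sample entry that carries the number of required RAP samples and their identifiers, and a DRAP sample that depends on the closest preceding initial sample. The sketch below is a toy model of both ideas. The byte layout (a one-byte count followed by 16-bit identifiers) and all names are hypothetical and are not the syntax defined by this document or by ISOBMFF.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class ToyDrapEntry:
    """Hypothetical entry: identifiers of the RAP samples needed for
    random access from a member of the DRAP sample group."""
    rap_ids: List[int]

    def to_bytes(self) -> bytes:
        # One-byte count, then one 16-bit big-endian identifier per RAP sample.
        return struct.pack(">B", len(self.rap_ids)) + b"".join(
            struct.pack(">H", rap_id) for rap_id in self.rap_ids
        )

    @classmethod
    def from_bytes(cls, data: bytes) -> "ToyDrapEntry":
        (count,) = struct.unpack_from(">B", data, 0)
        return cls(
            [struct.unpack_from(">H", data, 1 + 2 * i)[0] for i in range(count)]
        )


def closest_preceding_initial_sample(is_initial: List[bool], drap_index: int) -> int:
    """Index of the nearest initial sample before the DRAP sample.
    `is_initial[i]` is an assumed per-sample flag in decoding order."""
    for i in range(drap_index - 1, -1, -1):
        if is_initial[i]:
            return i
    raise ValueError("no initial sample precedes the DRAP sample")
```

A reader seeking to a DRAP sample would, under these assumptions, fetch the sample returned by `closest_preceding_initial_sample`, decode it, and then continue decoding from the DRAP sample onward.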

In the solutions described herein, an encoder may conform to the format rule by producing a coded representation according to the format rule. In the solutions described herein, a decoder may use the format rule to parse syntax elements in the coded representation, with knowledge of the presence and absence of syntax elements according to the format rule, to produce decoded video.
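Parsing "with knowledge of the presence and absence of syntax elements" can be sketched with a small bit reader. The payload layout below (a presence bit, an optional one-bit flag, then a 4-bit identifier) is entirely hypothetical and serves only to show a rule that makes one field conditional on another.

```python
class BitReader:
    """Reads unsigned values from a byte string, most significant bit first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit position

    def read(self, num_bits: int) -> int:
        value = 0
        for _ in range(num_bits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_toy_payload(data: bytes) -> dict:
    """Parse the hypothetical layout: presence bit, optional flag, 4-bit id."""
    reader = BitReader(data)
    fields = {"flag_present": reader.read(1)}
    # The rule: the one-bit flag is in the payload only when announced.
    if fields["flag_present"]:
        fields["flag"] = reader.read(1)
    fields["id"] = reader.read(4)
    return fields
```

An encoder following the same rule would write the flag only when it sets the presence bit, so both sides stay aligned on the bit positions of every later field.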

In the present document, the term "video processing" may refer to video encoding, video decoding, video compression, or video decompression. For example, video compression algorithms may be applied during conversion from a pixel representation of a video to a corresponding bitstream representation, or vice versa. The bitstream representation of a current video block may, for example, correspond to bits that are either co-located or spread in different places within the bitstream, as defined by the syntax. For example, a macroblock may be encoded in terms of transformed and coded error residual values and also using bits in headers and other fields in the bitstream. Furthermore, during conversion, a decoder may parse a bitstream with the knowledge that some fields may be present or absent, based on the determination, as described in the above solutions. Similarly, an encoder may determine that certain syntax fields are or are not to be included and generate the coded representation accordingly by including or excluding the syntax fields from the coded representation.

The disclosed and other solutions, examples, embodiments, modules, and functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer-readable medium for execution by, or to control the operation of, a data processing apparatus. The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term "data processing apparatus" encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to a suitable receiver apparatus.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., a field programmable gate array (FPGA) or an application specific integrated circuit (ASIC).

Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random-access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

While this patent document contains many specifics, these should not be construed as limitations on the scope of any subject matter or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular techniques. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a sub-combination or a variation of a sub-combination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.

Only a few implementations and examples are described, and other implementations, enhancements, and variations can be made based on what is described and illustrated in this patent document.

A first component is directly coupled to a second component when there are no intervening components, except for a line, a trace, or another medium, between the first component and the second component. The first component is indirectly coupled to the second component when there are intervening components other than a line, a trace, or another medium between the first component and the second component. The term "coupled" and its variants include both directly coupled and indirectly coupled. The use of the term "about" means a range including ±10% of the subsequent number unless otherwise stated.

While several embodiments have been provided in the present disclosure, it should be understood that the disclosed systems and methods might be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system, or certain features may be omitted or not implemented.

In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled may be directly connected, or may be indirectly coupled or communicating through some interface, device, or intermediate component, whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and could be made without departing from the spirit and scope disclosed herein.

Claims

1. A method for processing video data, comprising:
determining a description of cross random access point reference (CRR) samples in a visual media data file in an International Organization for Standardization (ISO) base media file format (ISOBMFF); and
performing a conversion between the visual media data file and visual media data based on the CRR sample group.

2. The method of claim 1,
wherein the description of the CRR samples is included within a CRR sample group.

3. The method of claim 1,
wherein the description of the CRR samples is included within a dependent random access point (DRAP) sample group.

4. The method of claim 1,
wherein the description of the CRR samples is included within a type 2 DRAP sample group.

5. The method of claim 1,
wherein the description of the CRR samples is included within an enhanced dependent random access point (EDRAP) sample group.

6. The method of any one of claims 1 to 5,
wherein the description of the CRR samples is included in a SampleToGroupBox.

7. The method of any one of claims 1 to 6,
wherein the description of the CRR samples is included in a CompactSampleToGroupBox.

8. The method of any one of claims 1 to 6,
wherein the description of the CRR samples is included in a group type parameter (group_type_parameter).

9. The method of any one of claims 1 to 8,
wherein the CRR samples are denoted as type 2 DRAP samples.

10. The method of any one of claims 1 to 8,
wherein the CRR samples are denoted as enhanced dependent random access point (EDRAP) samples.

11. The method of any one of claims 1 to 10,
wherein each sample comprises a picture.

12. The method of any one of claims 1 to 6,
wherein the description of the CRR samples includes one or more sample identifiers for identifying samples belonging to a certain sample group.

13. The method of any one of claims 1 to 12,
wherein the description of the CRR samples includes identifiers of reference pictures for the CRR samples.

14. The method of any one of claims 1 to 13,
wherein the description of the CRR samples includes a number of samples needed as a reference to decode a current sample.

15. The method of any one of claims 1 to 14,
wherein the description of the CRR samples is included within a sample entry within a sample group.

16. The method of any one of claims 1 to 15,
wherein, when a current sample refers only to the nearest preceding initial sample, one or more CRR samples preceding the current sample in decoding order, or a combination thereof, the current sample is one of the CRR samples.

17. The method of any one of claims 1 to 16,
wherein, when decoding starts from a current sample, if the current sample and all samples after the current sample in decoding order and output order can be correctly decoded, the current sample is one of the CRR samples.

18. The method of any one of claims 1 to 17,
wherein, after decoding the nearest preceding initial sample, one or more CRR samples preceding the current sample in decoding order, or a combination thereof, the current sample and all samples after the current sample are correctly decoded.

19. The method of any one of claims 1 to 18,
wherein the conversion comprises generating the visual media data file according to the visual media data.

20. The method of any one of claims 1 to 18,
wherein the conversion comprises parsing the visual media data file to obtain the visual media data.

21. An apparatus for processing video data, comprising:
a processor and a non-transitory memory that stores instructions,
wherein the instructions, when executed by the processor, cause the processor to carry out the method of any one of claims 1 to 20.

22. A non-transitory computer readable medium comprising
a computer program product for use by a video coding device,
wherein the computer program product comprises computer-executable instructions stored on the non-transitory computer readable medium that, when executed by a processor, cause the video coding device to perform the method of any one of claims 1 to 20.